Active preference-based optimization for human-in-the-loop feature selection

Federico Bianchi; Luigi Piroddi; Alberto Bemporad
2022-01-01

Abstract

In various classification problems characterized by a large number of features, feature selection (FS) is essential to guarantee generalization capabilities. The FS problem is often ill-posed due to significant correlations among features, which may lead to several different feature subsets with comparable scores in terms of classification performance. However, not all these subsets are equivalent from a domain-oriented point of view, because of known relationships among features and the different costs of acquiring them in production once the trained classifier is deployed. In this paper, we consider the potential benefits of including the domain expert's preferences in the FS task, thus integrating both objective elements (e.g., classification accuracy) and subjective (often not quantifiable) considerations in the selection process. This helps increase the interpretability and trustworthiness of the machine learning model, properties that are often desired in application domains such as medicine. The proposed method consists of an iterative procedure. At each iteration, the expert is asked to express a "human" preference between pairs of classifiers, each trained on a different subset of features. The expressed preferences are used algorithmically to update a suitable surrogate function that mimics the expert's latent subjective objective function, and then to propose a new classifier for testing and comparison. The proposed method has been tested on academic and experimental FS problems and, notably, on a dataset of COVID-19 patient records. The preliminary experimental results are promising, in that a parsimonious and accurate solution is obtained after a relatively small number of iterations. © 2022 European Control Association. Published by Elsevier Ltd. All rights reserved.
Feature selection
Preference-based learning
Randomized algorithms
Human-in-the-loop
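
The iterative procedure summarized in the abstract (pairwise preference query, surrogate update, new proposal) can be illustrated with a toy sketch. This is not the authors' algorithm: the linear surrogate, the perceptron-style update, the random candidate sampling, and every name below (`hidden_utility`, `expert_prefers`, `preference_loop`, the `relevance` and `cost` values) are assumptions made purely for illustration. A simulated expert stands in for the human, and a binary mask stands in for a classifier trained on the corresponding feature subset.

```python
import random


def hidden_utility(mask, relevance, cost):
    """Hypothetical stand-in for the expert's latent objective (hidden from the loop)."""
    return sum(m * (r - c) for m, r, c in zip(mask, relevance, cost))


def expert_prefers(a, b, relevance, cost):
    """Simulated expert: True if subset a is at least as good as subset b."""
    return hidden_utility(a, relevance, cost) >= hidden_utility(b, relevance, cost)


def surrogate(mask, w):
    """Linear surrogate score of a feature mask (an assumed model form)."""
    return sum(m * wi for m, wi in zip(mask, w))


def preference_loop(n_features, query, n_iter=30, n_candidates=50, seed=0):
    """Toy preference-based search over feature masks.

    query(a, b) must return True when the expert prefers a over b.
    Returns the final incumbent, the surrogate weights, and the incumbent history.
    """
    rng = random.Random(seed)
    w = [0.0] * n_features
    best = tuple(rng.randint(0, 1) for _ in range(n_features))
    history = [best]
    for _ in range(n_iter):
        # Randomized proposal: sample masks and keep the one the surrogate ranks highest.
        candidates = [
            tuple(rng.randint(0, 1) for _ in range(n_features))
            for _ in range(n_candidates)
        ]
        challenger = max(candidates, key=lambda m: surrogate(m, w))
        if challenger == best:
            continue
        # Ask the expert to compare the challenger with the current incumbent.
        if query(challenger, best):
            winner, loser = challenger, best
        else:
            winner, loser = best, challenger
        # Perceptron-style correction when the surrogate does not rank the winner higher.
        if surrogate(winner, w) <= surrogate(loser, w):
            w = [wi + (a - b) for wi, a, b in zip(w, winner, loser)]
        best = winner
        history.append(best)
    return best, w, history


# Illustrative values only: per-feature usefulness and acquisition cost.
relevance = [0.9, 0.8, 0.1, 0.05, 0.7, 0.0]
cost = [0.1, 0.5, 0.3, 0.2, 0.1, 0.4]
best, w, history = preference_loop(
    6, lambda a, b: expert_prefers(a, b, relevance, cost)
)
```

In a real setting, `query` would show the expert two trained classifiers (for example, their selected features and accuracies) and record the answer. Because the incumbent is replaced only when the expert prefers the challenger, the incumbent never gets worse with respect to the expert's judgment in this sketch.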
Files in this item:
File: 2022 - EJC - BianchiPiroddiBemporadHalaszVillaniPiga (post-print version).pdf
Open Access from 16/04/2024
Description: Main article (post-review version): Post-Print (DRAFT or Author's Accepted Manuscript, AAM)
Size: 584.5 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1233058