RE.PUBLIC@POLIMI Research publications of the Politecnico di Milano

Active preference-based optimization for human-in-the-loop feature selection

Federico Bianchi; Luigi Piroddi; Alberto Bemporad; Geza Halasz; Matteo Villani; Dario Piga

2022-01-01

Abstract

In many classification problems characterized by a large number of features, feature selection (FS) is essential to guarantee generalization capabilities. The FS problem is often ill-posed because of significant correlations among features, which may yield several different feature subsets with comparable classification performance. However, not all of these subsets are equivalent from a domain-oriented point of view, because of known relationships among features and their different acquisition costs when the trained classifier is deployed in production. In this paper, we consider the potential benefits of including the domain expert's preferences in the FS task, thus integrating both objective elements (e.g., classification accuracy) and subjective (often not quantifiable) considerations in the selection process. This helps increase the interpretability and trustworthiness of the machine learning model, properties that are often desired in application domains such as medicine. The proposed method consists of an iterative procedure. At each iteration, the expert is asked to express a "human" preference between a pair of classifiers, each trained on a different subset of features. The expressed preferences are used algorithmically to update a suitable surrogate function that mimics the expert's latent subjective objective function, and then to propose a new classifier for testing and comparison. The proposed method has been tested on academic and experimental FS problems and, notably, on a dataset of COVID-19 patient records. The preliminary experimental results are promising, in that a parsimonious and accurate solution is obtained after a relatively small number of iterations. © 2022 European Control Association. Published by Elsevier Ltd. All rights reserved.
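The iterative procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the surrogate here is a simple linear utility updated by a logistic (Bradley-Terry style) step, swapped in for whatever surrogate the paper actually uses, and the expert is simulated by a hypothetical scoring function `latent_expert_score` that also stands in for training and evaluating a classifier on each feature subset. All function names, parameters, and the `relevance` and `cost` vectors are assumptions made for illustration.

```python
import math
import random


def latent_expert_score(mask, relevance, cost):
    """Hypothetical stand-in for the expert's hidden objective:
    rewards relevant features and penalizes costly ones."""
    return sum(m * (r - c) for m, r, c in zip(mask, relevance, cost))


def surrogate(mask, w):
    """Assumed linear surrogate of the expert's latent objective."""
    return sum(m * wi for m, wi in zip(mask, w))


def update_surrogate(w, preferred, other, lr=0.5):
    """One logistic gradient step so that `preferred` scores above `other`."""
    diff = [p - o for p, o in zip(preferred, other)]
    margin = sum(wi * d for wi, d in zip(w, diff))
    grad_scale = 1.0 / (1.0 + math.exp(margin))  # equals 1 - sigmoid(margin)
    return [wi + lr * grad_scale * d for wi, d in zip(w, diff)]


def propose(w, best, rng, n_candidates=50, flip_prob=0.2):
    """Randomly perturb the incumbent subset and return the candidate
    with the highest surrogate value."""
    candidates = []
    for _ in range(n_candidates):
        cand = tuple(1 - b if rng.random() < flip_prob else b for b in best)
        if cand != best:
            candidates.append(cand)
    if not candidates:  # fallback: flip a single random feature
        i = rng.randrange(len(best))
        candidates.append(tuple(1 - b if j == i else b for j, b in enumerate(best)))
    return max(candidates, key=lambda m: surrogate(m, w))


def preference_loop(relevance, cost, n_iter=30, seed=0):
    """Query the (simulated) expert on pairs of feature subsets, update the
    surrogate after each answer, and keep the preferred subset as incumbent."""
    rng = random.Random(seed)
    n = len(relevance)
    w = [0.0] * n
    best = tuple(rng.randint(0, 1) for _ in range(n))
    for _ in range(n_iter):
        challenger = propose(w, best, rng)
        # Simulated pairwise query: in practice a human would answer here.
        if latent_expert_score(challenger, relevance, cost) > latent_expert_score(best, relevance, cost):
            w = update_surrogate(w, challenger, best)
            best = challenger
        else:
            w = update_surrogate(w, best, challenger)
    return best, w
```

In this sketch, calling `preference_loop(relevance, cost)` with two equal-length lists of made-up per-feature values returns the last preferred feature mask and the learned surrogate weights. Because the incumbent is replaced only when the simulated expert prefers the challenger, the returned subset is never worse than the starting one under the hidden score.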

Year of publication: 2022
Journal title: EUROPEAN JOURNAL OF CONTROL
Keywords: Feature selection; Preference-based learning; Randomized algorithms; Human-in-the-loop
Publication type: 01.1 Articolo in Rivista (journal article)

Files in this record:

File: 2022 - EJC - BianchiPiroddiBemporadHalaszVillaniPiga (post-print version).pdf
Access: Open Access from 16/04/2024
Description: Main article (post-refereeing version): Post-Print (DRAFT or Author's Accepted Manuscript, AAM)
Size: 584.5 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1233058
