Experience in Engineering Complex Systems: Active Preference Learning With Multiple Outcomes and Certainty Levels
Veerappan, Palaniappan; Mantovani, Lorenzo; Formentin, Simone; Roveda, Loris
2025-01-01
Abstract
Black-box optimization involves solving optimization problems where the objective function and/or constraints are unknown, inaccessible, or do not explicitly exist. In many applications, particularly those involving human interaction, the optimization problem can only be accessed through physical experiments, with the available outcomes based on the preference of one candidate over one or more others. Accordingly, algorithms for active preference learning have been developed to exploit this specific information in constructing a surrogate of the objective function. This surrogate is then used to define an acquisition function that iteratively suggests new decision vectors in the search for the optimal solution. Based on this idea, our approach aims to extend active preference learning algorithms to effectively leverage further information that can realistically be obtained, such as a five-point Likert-type scale for the outcomes of the preference query (i.e., the preference can be described not only as “this is better than that” but also as “this is much better than that”), or multiple outcomes for a single preference query with possible additional information on how certain the outcomes are. The proposed algorithm is validated on standard benchmark functions and, in practice, by tuning parameters for robot sealing and human–robot collaboration experiments, showing a promising improvement with respect to the state-of-the-art algorithm in the same context.
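The abstract describes fitting a surrogate of an unknown objective from graded (Likert-style) preference outcomes with certainty information, then using it to propose new decision vectors. The sketch below is a minimal, illustrative Python example of one way such graded outcomes and certainty weights could enter a surrogate fit; it is not the authors' algorithm, and the RBF surrogate, margin values, and weighting scheme are assumptions chosen only for illustration.

```python
# Illustrative sketch: preference-based surrogate fitting with graded
# (Likert-style) outcomes and certainty weights. NOT the authors' method;
# the RBF surrogate, margins, and weights are assumptions for illustration.
import numpy as np

def rbf(X, centers, eps=1.0):
    """Gaussian RBF features of query points X w.r.t. sampled centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-eps * d2)

def fit_surrogate(X, prefs, margins, weights, lam=1e-3, lr=0.1, iters=2000):
    """Fit RBF coefficients so preferred points score lower (minimization).

    prefs   : list of (i_better, i_worse) index pairs from preference queries
    margins : per-query margin, larger for 'much better' Likert outcomes
    weights : per-query certainty in [0, 1]
    """
    Phi = rbf(X, X)              # surrogate value at sample k is Phi[k] @ beta
    beta = np.zeros(X.shape[0])
    for _ in range(iters):
        f = Phi @ beta
        grad = 2 * lam * beta    # gradient of the ridge regularizer
        for (ib, iw), m, w in zip(prefs, margins, weights):
            # squared hinge: penalize when f(better) is not at least m below f(worse)
            viol = f[ib] - f[iw] + m
            if viol > 0:
                grad += 2 * w * viol * (Phi[ib] - Phi[iw])
        beta -= lr * grad
    return beta, Phi

# Toy usage on a 1-D decision variable: the true objective (x - 0.3)^2 is
# accessed only through graded comparisons of already-sampled points.
X = np.array([[0.0], [0.25], [0.55], [0.9]])
truth = (X[:, 0] - 0.3) ** 2
prefs, margins, weights = [], [], []
for i in range(len(X)):
    for j in range(i + 1, len(X)):
        better, worse = (i, j) if truth[i] < truth[j] else (j, i)
        gap = abs(truth[i] - truth[j])
        prefs.append((better, worse))
        margins.append(1.0 if gap > 0.1 else 0.2)   # "much better" vs "slightly better"
        weights.append(1.0 if gap > 0.05 else 0.5)  # less certain when nearly tied
beta, Phi = fit_surrogate(X, prefs, margins, weights)
print("surrogate scores:", Phi @ beta)  # x = 0.25 (index 1) should score lowest
```

In a full active preference learning loop, the fitted surrogate would feed an acquisition function that trades off exploiting its current minimizer against exploring poorly sampled regions; the sketch stops at the surrogate fit, which is where the graded outcomes and certainty weights appear.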
| File | Size | Format |
|---|---|---|
| Experience_in_Engineering_Complex_Systems_Active_Preference_Learning_With_Multiple_Outcomes_and_Certainty_Levels.pdf (Publisher's version, restricted access) | 1.66 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


