Exploiting History Data for Nonstationary Multi-armed Bandit

Gerlando, Re; Chiusano, F.; Trovo', Francesco; Carrera, D.; Boracchi, G.; Restelli, M.

doi:10.1007/978-3-030-86486-6_4

The Multi-armed Bandit (MAB) framework has been applied successfully in many application fields. In the last years, the use of active approaches to tackle the nonstationary MAB setting, i.e., algorithms capable of detecting changes in the environment and re-configuring automatically to the change, has been widening the areas of application of MAB techniques. However, such approaches have the drawback of not reusing information in those settings where the same environment conditions recur over time. This paper presents a framework to integrate past information in the abruptly changing nonstationary setting, which allows the active MAB approaches to recover from changes quickly. The proposed framework is based on well-known break-point prediction methods to correctly identify the instant the environment changed in the past, and on the definition of recurring concepts specifically for the MAB setting to reuse information from recurring MAB states, when necessary. We show that this framework does not change the order of the regret suffered by the active approaches commonly used in the bandit field. Finally, we provide an extensive experimental analysis on both synthetic and real-world data, showing the improvement provided by our framework.

Exploiting History Data for Nonstationary Multi-armed Bandit

Re Gerlando;Chiusano F.;Trovo' Francesco;Carrera D.;Boracchi G.;Restelli M.

2021-01-01

Abstract

The Multi-armed Bandit (MAB) framework has been applied successfully in many application fields. In the last years, the use of active approaches to tackle the nonstationary MAB setting, i.e., algorithms capable of detecting changes in the environment and re-configuring automatically to the change, has been widening the areas of application of MAB techniques. However, such approaches have the drawback of not reusing information in those settings where the same environment conditions recur over time. This paper presents a framework to integrate past information in the abruptly changing nonstationary setting, which allows the active MAB approaches to recover from changes quickly. The proposed framework is based on well-known break-point prediction methods to correctly identify the instant the environment changed in the past, and on the definition of recurring concepts specifically for the MAB setting to reuse information from recurring MAB states, when necessary. We show that this framework does not change the order of the regret suffered by the active approaches commonly used in the bandit field. Finally, we provide an extensive experimental analysis on both synthetic and real-world data, showing the improvement provided by our framework.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2021
			
	Titolo del libro
	
				Proceedings of ECML 2021
			
	Parole chiave
	
				Break-point prediction
Multi-armed bandit
Non-stationary MAB
Recurring concepts
			
	Appare nelle tipologie:
	
				04.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
2021_ECML_Nonstationary_MAB.pdf accesso aperto : Pre-Print (o Pre-Refereeing) Dimensione 499.69 kB Formato Adobe PDF Visualizza/Apri	499.69 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1203790

Citazioni

ND

2

4

Nome	Dominio	Durata	Descrizione
s_.*	plu.mx	sessione	recupero grafico citazioni sociali da plumx
A_.*	core.ac.uk	7 giorni	recupero pubblicazioni consigliate per il pannello core-recommander
GS_.*	gstatic.com	richiesta http	visualizza grafico citazioni
CC_.*	creativecommons.org	richiesta http	visualizza licenza bitstream

RE.PUBLIC@POLIMI pubblicazioni di ricerca del Politecnico di Milano

Exploiting History Data for Nonstationary Multi-armed Bandit

Re Gerlando;Chiusano F.;Trovo' Francesco;Carrera D.;Boracchi G.;Restelli M.

2021-01-01

Abstract

Scheda breve Scheda completa Scheda completa (DC)

Informazioni

Citazioni

social impact

Conferma cancellazione

Scheda breve

Scheda completa

Scheda completa (DC)