RE.PUBLIC@POLIMI pubblicazioni di ricerca del Politecnico di Milano

In Bayesian persuasion, an informed sender has to design a signaling scheme that discloses the right amount of information so as to influence the behavior of a self-interested receiver. This kind of strategic interaction is ubiquitous in real-world economic scenarios. However, the seminal model by Kamenica and Gentzkow makes some stringent assumptions that limit its applicability in practice. One of the most limiting assumptions is, arguably, that the sender is required to know the receiver’s utility function to compute an optimal signaling scheme. We relax this assumption through an online learning framework in which the sender repeatedly faces a receiver whose type is unknown and chosen adversarially at each round from a finite set of possible types. We are interested in no-regret algorithms prescribing a signaling scheme at each round of the repeated interaction with performances close to that of a best-in-hindsight signaling scheme. First, we prove a hardness result on the per-round running time required to achieve no-a-regret for any a < 1. Then, we provide algorithms for the full and partial feedback models with regret bounds sublinear in the number of rounds and polynomial in the size of the instance.

Online Bayesian Persuasion

Matteo Castiglioni;Andrea Celli;Alberto Marchesi;Nicola Gatti

2020-01-01

Abstract

In Bayesian persuasion, an informed sender has to design a signaling scheme that discloses the right amount of information so as to influence the behavior of a self-interested receiver. This kind of strategic interaction is ubiquitous in real-world economic scenarios. However, the seminal model by Kamenica and Gentzkow makes some stringent assumptions that limit its applicability in practice. One of the most limiting assumptions is, arguably, that the sender is required to know the receiver’s utility function to compute an optimal signaling scheme. We relax this assumption through an online learning framework in which the sender repeatedly faces a receiver whose type is unknown and chosen adversarially at each round from a finite set of possible types. We are interested in no-regret algorithms prescribing a signaling scheme at each round of the repeated interaction with performances close to that of a best-in-hindsight signaling scheme. First, we prove a hardness result on the per-round running time required to achieve no-a-regret for any a < 1. Then, we provide algorithms for the full and partial feedback models with regret bounds sublinear in the number of rounds and polynomial in the size of the instance.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2020
			
	Titolo del libro
	
				34th Conference on Neural Information Processing Systems, NeurIPS 2020
			
	Titolo della collana
	
				ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS
			
	ISBN (International Standard Book Number)
	
				978-1-7138-2954-6
			
	Parole chiave
	
				INF
			
	Appare nelle tipologie:
	
				04.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
NeurIPS-2020-online-bayesian-persuasion-Paper.pdf accesso aperto : Post-Print (DRAFT o Author’s Accepted Manuscript-AAM) Dimensione 465.35 kB Formato Adobe PDF Visualizza/Apri	465.35 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1167021

Citazioni

ND

36

ND

social impact