In Bayesian persuasion, an informed sender has to design a signaling scheme that discloses the right amount of information so as to influence the behavior of a self-interested receiver. This kind of strategic interaction is ubiquitous in real-world economic scenarios. However, the seminal model by Kamenica and Gentzkow makes some stringent assumptions that limit its applicability in practice. One of the most limiting assumptions is, arguably, that the sender is required to know the receiver’s utility function to compute an optimal signaling scheme. We relax this assumption through an online learning framework in which the sender repeatedly faces a receiver whose type is unknown and chosen adversarially at each round from a finite set of possible types. We are interested in no-regret algorithms prescribing a signaling scheme at each round of the repeated interaction with performances close to that of a best-in-hindsight signaling scheme. First, we prove a hardness result on the per-round running time required to achieve no-a-regret for any a < 1. Then, we provide algorithms for the full and partial feedback models with regret bounds sublinear in the number of rounds and polynomial in the size of the instance.

Online Bayesian Persuasion

Matteo Castiglioni;Andrea Celli;Alberto Marchesi;Nicola Gatti
2020-01-01

Abstract

In Bayesian persuasion, an informed sender has to design a signaling scheme that discloses the right amount of information so as to influence the behavior of a self-interested receiver. This kind of strategic interaction is ubiquitous in real-world economic scenarios. However, the seminal model by Kamenica and Gentzkow makes some stringent assumptions that limit its applicability in practice. One of the most limiting assumptions is, arguably, that the sender is required to know the receiver’s utility function to compute an optimal signaling scheme. We relax this assumption through an online learning framework in which the sender repeatedly faces a receiver whose type is unknown and chosen adversarially at each round from a finite set of possible types. We are interested in no-regret algorithms prescribing a signaling scheme at each round of the repeated interaction with performances close to that of a best-in-hindsight signaling scheme. First, we prove a hardness result on the per-round running time required to achieve no-a-regret for any a < 1. Then, we provide algorithms for the full and partial feedback models with regret bounds sublinear in the number of rounds and polynomial in the size of the instance.
2020
34th Conference on Neural Information Processing Systems, NeurIPS 2020
978-1-7138-2954-6
INF
File in questo prodotto:
File Dimensione Formato  
NeurIPS-2020-online-bayesian-persuasion-Paper.pdf

accesso aperto

: Post-Print (DRAFT o Author’s Accepted Manuscript-AAM)
Dimensione 465.35 kB
Formato Adobe PDF
465.35 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1167021
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 35
  • ???jsp.display-item.citation.isi??? ND
social impact