Many learning problems involve multiple agents that optimize different interactive functions. In these problems, standard policy gradient algorithms fail due to the non-stationarity of the setting and the different interests of each agent. In fact, the learning algorithms must consider the complex dynamics of these systems to guarantee rapid convergence towards a (local) Nash equilibrium. In this paper, we propose NOHD (Newton Optimization on Helmholtz Decomposition), a Newton-like algorithm for multi-agent learning problems based on the decomposition of the system dynamics into its irrotational (Potential) and solenoidal (Hamiltonian) components. This method ensures quadratic convergence in purely irrotational systems and pure solenoidal systems. Furthermore, we show that NOHD is attracted to stable fixed points in general multi-agent systems and repelled by strict saddle ones. Finally, we empirically compare the NOHD's performance with state-of-the-art algorithms on some bimatrix games and in a continuous Gridworld environment.

Newton Optimization on Helmholtz Decomposition for Continuous Games

G. Ramponi;M. Restelli
2021-01-01

Abstract

Many learning problems involve multiple agents that optimize different interactive functions. In these problems, standard policy gradient algorithms fail due to the non-stationarity of the setting and the different interests of each agent. In fact, the learning algorithms must consider the complex dynamics of these systems to guarantee rapid convergence towards a (local) Nash equilibrium. In this paper, we propose NOHD (Newton Optimization on Helmholtz Decomposition), a Newton-like algorithm for multi-agent learning problems based on the decomposition of the system dynamics into its irrotational (Potential) and solenoidal (Hamiltonian) components. This method ensures quadratic convergence in purely irrotational systems and pure solenoidal systems. Furthermore, we show that NOHD is attracted to stable fixed points in general multi-agent systems and repelled by strict saddle ones. Finally, we empirically compare the NOHD's performance with state-of-the-art algorithms on some bimatrix games and in a continuous Gridworld environment.
2021
Thirty-Fifth {AAAI} Conference on Artificial Intelligence, {AAAI}2021, Thirty-Third Conference on Innovative Applications of ArtificialIntelligence, {IAAI} 2021, The Eleventh Symposium on Educational Advancesin Artificial Intelligence, {EAAI} 2021, Virtual Event, February 2-9,2021
File in questo prodotto:
File Dimensione Formato  
17350-Article Text-20844-1-2-20210518.pdf

accesso aperto

: Publisher’s version
Dimensione 801.66 kB
Formato Adobe PDF
801.66 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1207965
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 3
  • ???jsp.display-item.citation.isi??? 1
social impact