Goal-Directed Planning via Hindsight Experience Replay

Lorenzo Moro;Amarildo Likmeta;Enrico Prati;Marcello Restelli
2022-01-01

Abstract

We consider the problem of goal-directed planning under a deterministic transition model. Monte Carlo Tree Search (MCTS) has shown remarkable performance in solving deterministic control problems. By using function approximators to bias the search of the tree, MCTS has been extended to complex continuous domains, resulting in the AlphaZero family of algorithms. Nonetheless, these algorithms still struggle with sparse-reward control problems, such as goal-directed domains, where a positive reward is awarded only upon reaching a goal state. In this work, we extend AlphaZero with Hindsight Experience Replay to tackle complex goal-directed planning tasks. We demonstrate the effectiveness of the proposed approach through an extensive empirical evaluation in several simulated domains, including a novel application to a quantum compiling domain.
2022
10th International Conference on Learning Representations, ICLR 2022
Files in this item:
goal_directed_planning_via_hin.pdf (open access, publisher's version; Adobe PDF, 959.43 kB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1219889
Citations
  • PMC: not available
  • Scopus: 6
  • ISI: not available