Short-Term Trajectory Planning in TORCS using Deep Reinforcement Learning

Capo E.; Loiacono D.
2020-01-01

Abstract

Applications of deep reinforcement learning to racing games have so far struggled to reach performance competitive with the state of the art in this field. Previous work, mainly focused on low-level input design, shows that artificial agents can learn to stay on track starting from no driving knowledge; however, their final performance is still far from that of competitive drivers. The scope of this work is to investigate to what extent raising the abstraction level can help the learning process. Using The Open Racing Car Simulator (TORCS) environment and the Deep Deterministic Policy Gradients (DDPG) algorithm, we develop artificial agents based on deep neural networks, considering both numerical and visual inputs. These agents learn to compute either a target point on track or, additionally, a correction to the target maximum speed at the current position, which are then provided as input to a low-level control logic. Our results show that this approach achieves fair performance, though it is extremely sensitive to the low-level logic. Further work is necessary to understand how to fully exploit a high-level control design.
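
To make the two-level design described in the abstract more concrete, the following is a minimal sketch of how a high-level DDPG-style actor could emit a lateral target-point offset and a speed correction that a simple low-level controller then converts into steering, throttle, and brake. All names, network sizes, gains, and the observation layout here are illustrative assumptions, not the authors' implementation or the exact TORCS interface.

```python
import torch
import torch.nn as nn

class HighLevelActor(nn.Module):
    """DDPG-style actor: maps a numerical observation (track sensors, speed, ...)
    to a high-level action = (lateral target-point offset, speed correction),
    both squashed to [-1, 1]."""
    def __init__(self, state_dim=29, hidden=300):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)

def low_level_control(target_offset, speed_corr, track_pos, speed_x,
                      base_target_speed=150.0, steer_gain=0.5, max_speed_corr=30.0):
    """Hypothetical low-level logic: proportional steering toward the target
    point and a simple throttle/brake rule toward the corrected target speed."""
    steering = steer_gain * (target_offset - track_pos)
    target_speed = base_target_speed + max_speed_corr * speed_corr
    accel = 1.0 if speed_x < target_speed else 0.0
    brake = 0.0 if speed_x < target_speed else 0.3
    return float(steering), accel, brake

# One control step on a placeholder observation vector.
actor = HighLevelActor()
with torch.no_grad():
    target_offset, speed_corr = actor(torch.zeros(1, 29))[0].tolist()
print(low_level_control(target_offset, speed_corr, track_pos=0.1, speed_x=120.0))
```

The point of the split is that the learned policy only decides *where to aim* and *how much to adjust the speed target*, while the hand-coded low-level logic handles actuation; as the abstract notes, overall performance is highly sensitive to how that low-level logic is designed.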
2020
2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020
978-1-7281-2547-3

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1163656
Citations
  • PMC: ND
  • Scopus: 4
  • Web of Science (ISI): 1