Short-Term Trajectory Planning in TORCS using Deep Reinforcement Learning
Capo E.; Loiacono D.
2020-01-01
Abstract
Applications of deep reinforcement learning to racing games have so far struggled to reach performance competitive with the state of the art in this field. Previous work, mainly focused on a low-level input design, shows that artificial agents can learn to stay on track starting from no driving knowledge; however, their final performance is still far from that of competitive driving. The aim of this work is to investigate to what extent raising the abstraction level can help the learning process. Using The Open Racing Car Simulator (TORCS) environment and the Deep Deterministic Policy Gradients (DDPG) algorithm, we develop artificial agents based on deep neural networks, considering both numerical and visual inputs. These agents learn to compute either a target point on the track or, additionally, a correction to the target maximum speed at the current position; these outputs are then provided as input to a low-level control logic. Our results show that our approach achieves fair performance, though it is extremely sensitive to the low-level logic. Further work is necessary to understand how to fully exploit a high-level control design.