Variable Velocity Dynamic Window Approach with Learning-Based Warning Policy for Environments with Non-cooperators

Long J.; Matteucci M.
2025-01-01

Abstract

Pedestrians are typically cooperative: even if a robot cannot avoid a pedestrian, it can stop and wait for the pedestrian to pass around it. However, non-cooperators also exist, that is, pedestrians who are too distracted to avoid the robot, and some collisions are inevitable for robots in densely populated environments. To deal with this situation, we propose the variable velocity dynamic window approach with learning-based warning policy (VVDWA-LWP) for mobile robots. For the navigation policy, a network predicts the future trajectories of pedestrians; cost maps are then built from these predictions for each prediction time step and used to calculate the optimal velocity. For the warning policy, deep reinforcement learning is used to train a policy that decides whether the robot should perform a warning action. Its state consists of the collision risk derived from the trajectory prediction and the time of the last warning action. Experiments in the Gazebo simulator show that VVDWA-LWP performs well.
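The navigation idea in the abstract (score candidate velocities against a separate cost for each prediction time step) can be illustrated with a small sketch. This is not the authors' implementation: the unicycle motion model, the Gaussian cost that stands in for a cost-map lookup, the weights, and every function and parameter name (`rollout`, `step_cost`, `choose_velocity`, `sigma`, `w_ped`, `w_goal`) are assumptions made for illustration, and the predicted pedestrian positions are passed in by hand rather than produced by a learned network.

```python
import math


def rollout(pose, v, w, dt, steps):
    """Unicycle rollout: robot pose after each of `steps` intervals of length dt."""
    x, y, th = pose
    poses = []
    for _ in range(steps):
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        th += w * dt
        poses.append((x, y, th))
    return poses


def step_cost(px, py, pedestrians, sigma):
    """Assumed stand-in for a cost-map lookup at one prediction step.

    Each predicted pedestrian position contributes a Gaussian bump.
    """
    return sum(
        math.exp(-((px - qx) ** 2 + (py - qy) ** 2) / (2 * sigma ** 2))
        for qx, qy in pedestrians
    )


def choose_velocity(pose, goal, predictions, dt, v_window, w_window,
                    sigma=0.5, w_ped=5.0, w_goal=1.0):
    """Return the (v, w) sample with the lowest combined cost.

    predictions[k] lists the predicted pedestrian positions at step k + 1,
    so the robot pose at each step is scored against that step's cost only.
    """
    gx, gy = goal
    best, best_cost = None, float("inf")
    for v in v_window:
        for w in w_window:
            poses = rollout(pose, v, w, dt, len(predictions))
            ped_cost = sum(
                step_cost(x, y, predictions[k], sigma)
                for k, (x, y, _) in enumerate(poses)
            )
            x_end, y_end, _ = poses[-1]
            goal_cost = math.hypot(gx - x_end, gy - y_end)
            cost = w_ped * ped_cost + w_goal * goal_cost
            if cost < best_cost:
                best, best_cost = (v, w), cost
    return best


# Hypothetical velocity samples standing in for the dynamic window.
V_WINDOW = [0.0, 0.5, 1.0]
W_WINDOW = [-0.5, 0.0, 0.5]

# Free space: four prediction steps with no pedestrians.
free = choose_velocity((0.0, 0.0, 0.0), (5.0, 0.0), [[], [], [], []],
                       0.5, V_WINDOW, W_WINDOW)

# A pedestrian predicted to stay on the robot's straight-line path.
blocked = choose_velocity((0.0, 0.0, 0.0), (5.0, 0.0),
                          [[(1.0, 0.0)]] * 4, 0.5, V_WINDOW, W_WINDOW)
```

With these assumed weights, the free-space call selects the fastest straight-line sample, while the blocked call selects a zero linear velocity, which mirrors the stop-and-wait behaviour described for cooperative pedestrians. The warning policy is not sketched here, since the abstract specifies only its state and not its network or reward.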
Year: 2025
Published in: Advanced Intelligent Computing Technology and Applications. ICIC 2025
ISBN: 9789819699070; 9789819699087
Keywords: Cost Map; Deep Reinforcement Learning (DRL); Dynamic Window Approach (DWA); Robot Navigation
Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1298172