Variable Velocity Dynamic Window Approach with Learning-Based Warning Policy for Environments with Non-cooperators
Long J.;Matteucci M.
2025-01-01
Abstract
Typically, pedestrians are cooperative; that is, even if a robot cannot avoid a pedestrian, it can stop moving and wait for the pedestrian to bypass it. However, there are also non-cooperators, who fail to avoid robots because they are distracted, so some collisions are inevitable for robots in environments with dense pedestrian traffic. To deal with this situation, we propose the variable velocity dynamic window approach with learning-based warning policy (VVDWA-LWP) for mobile robots. For the navigation policy, a network predicts the future trajectories of pedestrians, and a cost map is then created for each prediction time step from these predictions and used to calculate the optimal velocity. For the warning policy, deep reinforcement learning is used to train a policy that decides whether the robot performs a warning action. The collision risk of the trajectory prediction and the time of the last warning action are used as states. Experiments in the Gazebo simulator show that VVDWA-LWP performs well.

| File | Size | Format |
|---|---|---|
| 978-981-96-9908-7_10.pdf (Publisher’s version, restricted access) | 1.18 MB | Adobe PDF |
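The navigation policy outlined in the abstract (score candidate velocities against a separate cost map for each prediction time step) can be illustrated with a minimal sketch. This is not the authors' implementation: the unicycle motion model, the inverse-distance cost, the goal term, and all names (`rollout`, `step_cost`, `choose_velocity`) are assumptions made for illustration, and the trajectory-prediction network is replaced by a precomputed list of pedestrian positions per time step. The learned warning policy is not covered.

```python
import math

def rollout(x, y, theta, v, w, dt, steps):
    """Forward-simulate an assumed unicycle model; one (x, y) per prediction step."""
    path = []
    for _ in range(steps):
        theta += w * dt
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        path.append((x, y))
    return path

def step_cost(px, py, pedestrians, radius):
    """Cost of a robot position against pedestrians predicted for the SAME step.

    Hypothetical cost: infinite inside the collision radius, otherwise the sum
    of inverse distances to the predicted pedestrian positions.
    """
    cost = 0.0
    for qx, qy in pedestrians:
        d = math.hypot(px - qx, py - qy)
        if d < radius:
            return math.inf
        cost += 1.0 / d
    return cost

def choose_velocity(state, goal, predicted, v_samples, w_samples,
                    dt=0.5, radius=0.5, goal_weight=1.0):
    """Pick the (v, w) sample with the lowest total cost.

    `predicted[k]` lists the pedestrian positions predicted for step k + 1,
    standing in for the output of a trajectory-prediction network.
    If every sample collides, the fallback is (0.0, 0.0), i.e. stop.
    """
    x, y, theta = state
    best, best_cost = (0.0, 0.0), math.inf
    for v in v_samples:
        for w in w_samples:
            path = rollout(x, y, theta, v, w, dt, len(predicted))
            # Each step of the rollout is scored against its own time slice.
            cost = sum(step_cost(px, py, peds, radius)
                       for (px, py), peds in zip(path, predicted))
            end_x, end_y = path[-1]
            cost += goal_weight * math.hypot(goal[0] - end_x, goal[1] - end_y)
            if cost < best_cost:
                best, best_cost = (v, w), cost
    return best, best_cost
```

Calling `choose_velocity((0.0, 0.0, 0.0), (2.0, 0.0), predicted, [0.0, 0.5, 1.0], [-0.5, 0.0, 0.5])` with four time slices returns the selected velocity pair together with its cost. Sampling linear velocity as well as angular velocity is what allows such a planner to slow down rather than only steer around a predicted pedestrian position.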
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.


