RE.PUBLIC@POLIMI: research publications of the Politecnico di Milano

Variable Velocity Dynamic Window Approach with Learning-Based Warning Policy for Environments with Non-cooperators

Long J.; Matteucci M.

2025-01-01

Abstract

Pedestrians are typically cooperative: even if a robot cannot avoid a pedestrian, it can stop and wait for the pedestrian to walk around it. However, non-cooperators also exist. Because they are distracted, they do not avoid robots, so some collisions are inevitable in environments with dense pedestrian traffic. To deal with this situation, we propose the variable velocity dynamic window approach with learning-based warning policy (VVDWA-LWP) for mobile robots. For the navigation policy, a network predicts the future trajectories of pedestrians, and a cost map is built from these predictions for each prediction time step; the cost maps are then used to calculate the optimal velocity. For the warning policy, deep reinforcement learning is used to train a policy that decides whether the robot performs a warning action. The state consists of the collision risk of the trajectory prediction and the time of the last warning action. Experiments in the Gazebo simulator show that VVDWA-LWP performs well.
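The navigation idea in the abstract (score each candidate velocity against pedestrian positions predicted for each future time step) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not describe the prediction network, the cost map construction, or the cost function. Every name, parameter, and default below is hypothetical, and a simple proximity penalty plus distance to goal stands in for the paper's cost maps.

```python
import math

def linspace(lo, hi, n):
    """Return n evenly spaced samples from lo to hi (inclusive)."""
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def dynamic_window(v, w, dt, v_max, w_max, a_max, alpha_max):
    """Velocities reachable from (v, w) within one control period dt."""
    return (max(0.0, v - a_max * dt), min(v_max, v + a_max * dt),
            max(-w_max, w - alpha_max * dt), min(w_max, w + alpha_max * dt))

def rollout(pose, v, w, dt, steps):
    """Forward-simulate a unicycle model; return one (x, y) per step."""
    x, y, th = pose
    traj = []
    for _ in range(steps):
        th += w * dt
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        traj.append((x, y))
    return traj

def proximity_cost(point, pedestrians, safe_radius):
    """Assumed stand-in for a cost map lookup: linear penalty inside safe_radius."""
    cost = 0.0
    for px, py in pedestrians:
        d = math.hypot(point[0] - px, point[1] - py)
        if d < safe_radius:
            cost += (safe_radius - d) / safe_radius
    return cost

def best_velocity(pose, v, w, goal, predictions, dt=0.2, v_max=1.0, w_max=1.0,
                  a_max=1.0, alpha_max=2.0, safe_radius=0.6, risk_weight=10.0):
    """Pick the (v, w) in the dynamic window with the lowest total cost.

    predictions[k] lists the predicted pedestrian (x, y) positions at step k,
    so the robot position at step k is scored against the prediction for k.
    """
    steps = len(predictions)
    v_lo, v_hi, w_lo, w_hi = dynamic_window(v, w, dt, v_max, w_max, a_max, alpha_max)
    best_cost, best_v, best_w = math.inf, 0.0, 0.0
    for cv in linspace(v_lo, v_hi, 7):
        for cw in linspace(w_lo, w_hi, 11):
            traj = rollout(pose, cv, cw, dt, steps)
            risk = sum(proximity_cost(p, predictions[k], safe_radius)
                       for k, p in enumerate(traj))
            goal_dist = math.hypot(goal[0] - traj[-1][0], goal[1] - traj[-1][1])
            cost = goal_dist + risk_weight * risk
            if cost < best_cost:
                best_cost, best_v, best_w = cost, cv, cw
    return best_v, best_w

# Example: a pedestrian predicted to stand on the straight path for 5 steps.
predictions = [[(0.5, 0.0)] for _ in range(5)]
v_cmd, w_cmd = best_velocity((0.0, 0.0, 0.0), 0.5, 0.0, (5.0, 0.0), predictions)
```

With no predicted pedestrians the sketch accelerates straight toward the goal; with a pedestrian on the path it selects a slower or turning command. The warning policy is not sketched here, since the abstract specifies only its state (collision risk and time of the last warning action), not its training setup.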

Year of publication	2025
Book title	Advanced Intelligent Computing Technology and Applications. ICIC 2025
Series title	Lecture Notes in Computer Science
ISBN (International Standard Book Number)	9789819699070; 9789819699087
Keywords	Cost Map; Deep Reinforcement Learning (DRL); Dynamic Window Approach (DWA); Robot Navigation
Appears in types	04.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
978-981-96-9908-7_10.pdf (restricted access, publisher's version)	1.18 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1298172
