RE.PUBLIC@POLIMI Research publications of Politecnico di Milano

Learning Continuous Control Actions for Robotic Grasping with Reinforcement Learning

Shahid A. A.; Roveda L.; Piga D.; Braghin F.

2020-01-01

Abstract

Robots are increasingly required to deal with (partially) unknown tasks and situations, and must therefore adapt their behavior to the specific working conditions. Classical control methods in robotics require manually programming all actions of a robot. While very effective in fixed conditions, such model-based approaches cannot handle variations and demand tedious parameter tuning for every new task. Reinforcement learning (RL) holds the promise of autonomously learning new control policies through trial and error. However, RL approaches typically require a large number of samples, particularly for continuous control problems. In this paper, a learning-based method is presented that leverages simulation data to learn an object manipulation task through RL. The control policy is parameterized by a neural network and learned using the Proximal Policy Optimization (PPO) algorithm. A dense reward function has been designed for the task to enable efficient learning by the agent. The proposed approach is trained entirely in simulation (using the MuJoCo environment) from scratch, without any demonstrations of the task. A grasping task involving a Franka Emika Panda manipulator is considered as the reference task to be learned. The task requires the robot to reach the part, grasp it, and lift it off the contact surface. The proposed approach has been shown to generalize across multiple object geometries and initial robot/part configurations, with the robot able to learn and re-execute the target task.
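The two technical ingredients named in the abstract, the PPO objective and a dense reward for the reach, grasp, and lift stages, can be illustrated with a minimal sketch. This record gives neither the paper's reward terms nor its hyperparameters, so everything specific below is an assumption made for illustration: the function names `ppo_clip_loss` and `dense_grasp_reward`, the clipping value 0.2, the `tanh` shaping, the unit grasp bonus, and the 0.10 m lift target. Only the clipped-surrogate form of the PPO loss is standard.

```python
import math


def ppo_clip_loss(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized).

    ratios:     pi_new(a|s) / pi_old(a|s) for each sample
    advantages: advantage estimate for each sample
    clip_eps:   clipping range; 0.2 is an assumed, commonly used value
    """
    terms = []
    for r, a in zip(ratios, advantages):
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped objectives.
        terms.append(min(r * a, clipped_r * a))
    return -sum(terms) / len(terms)


def dense_grasp_reward(gripper_pos, object_pos, is_grasped,
                       table_height, lift_target=0.10):
    """Hypothetical staged dense reward: reach, then grasp, then lift.

    Not the reward used in the paper; all terms and weights are assumed.
    """
    dist = math.dist(gripper_pos, object_pos)
    reward = 1.0 - math.tanh(5.0 * dist)      # reaching term, in (0, 1]
    if is_grasped:
        reward += 1.0                         # grasp bonus
        lift = min(max(object_pos[2] - table_height, 0.0), lift_target)
        reward += lift / lift_target          # lifting term, in [0, 1]
    return reward
```

A dense reward of this kind gives the agent a learning signal at every step instead of only at task completion, which is the usual motivation for preferring it over a sparse success flag in continuous control.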

Year of publication: 2020
Book title: Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
ISBN (International Standard Book Number): 978-1-7281-8526-2
Keywords: intelligent robotics; object manipulation; Proximal Policy Optimization; Reinforcement learning
Appears in types: 04.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
09282951.pdf (restricted access; description: paper, publisher's version)	523.38 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1163438
