Learning Continuous Control Actions for Robotic Grasping with Reinforcement Learning

Roveda L.; Braghin F.
2020-01-01

Abstract

Robots are increasingly required to deal with (partially) unknown tasks and situations, so they must adapt their behavior to the specific working conditions. Classical control methods in robotics require manually programming all actions of a robot. While very effective in fixed conditions, such model-based approaches cannot handle variations and demand tedious parameter tuning for every new task. Reinforcement learning (RL) holds the promise of autonomously learning new control policies through trial and error. However, RL approaches tend to require a large number of samples, particularly for continuous control problems. In this paper, a learning-based method is presented that leverages simulation data to learn an object manipulation task through RL. The control policy is parameterized by a neural network and learned using the Proximal Policy Optimization (PPO) algorithm. A dense reward function has been designed for the task to enable efficient learning by the agent. The proposed approach is trained entirely in simulation (using the MuJoCo environment) from scratch, without any demonstrations of the task. A grasping task involving a Franka Emika Panda manipulator is considered as the reference task to be learned. The task requires the robot to reach the part, grasp it, and lift it off the contact surface. The proposed approach is shown to generalize across multiple object geometries and initial robot/part configurations, with the robot able to learn and re-execute the target task.
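The abstract names PPO as the learning algorithm but gives no training details. As a minimal sketch of the idea, assuming the standard clipped surrogate objective of PPO, the per-batch quantity that the policy update maximizes could be computed as below. The function name `ppo_clipped_objective` and the clip range of 0.2 are illustrative assumptions, not values taken from the paper.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    All names and the default clip_eps are assumptions for illustration.
    new_log_probs / old_log_probs: log-probabilities of the taken actions
    under the current and the data-collecting policy, respectively.
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is one reason PPO is a common choice for continuous control tasks such as the grasping task described above.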
Published in: Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2020
ISBN: 978-1-7281-8526-2
Keywords: intelligent robotics; object manipulation; Proximal Policy Optimization; Reinforcement learning

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1163438
Citations
  • PMC: not available
  • Scopus: 27
  • Web of Science (ISI): 19