Robots are nowadays increasingly required to deal with (partially) unknown tasks and situations. The robot has, therefore, to adapt its behavior to the specific working conditions. Classical control methods in robotics require manually programming all actions of a robot. While very effective in fixed conditions, such model-based approaches cannot handle variations, demanding tedious tuning of parameters for every new task. Reinforcement learning (RL) holds the promise of autonomously learning new control policies through trial-and-error. However, RL approaches are prone to learning with high samples, particularly for continuous control problems. In this paper, a learning-based method is presented that leverages simulation data to learn an object manipulation task through RL. The control policy is parameterized by a neural network and learned using modern Proximal Policy Optimization (PPO) algorithm. A dense reward function has been designed for the task to enable efficient learning of an agent. The proposed approach is trained entirely in simulation (exploiting the MuJoCo environment) from scratch without any demonstrations of the task. A grasping task involving a Franka Emika Panda manipulator has been considered as the reference task to be learned. The task requires the robot to reach the part, grasp it, and lift it off the contact surface. The proposed approach has been demonstrated to be generalizable across multiple object geometries and initial robot/parts configurations, having the robot able to learn and re-execute the target task.
Learning Continuous Control Actions for Robotic Grasping with Reinforcement Learning
Roveda L.;Braghin F.
2020-01-01
Abstract
Robots are nowadays increasingly required to deal with (partially) unknown tasks and situations. The robot has, therefore, to adapt its behavior to the specific working conditions. Classical control methods in robotics require manually programming all actions of a robot. While very effective in fixed conditions, such model-based approaches cannot handle variations, demanding tedious tuning of parameters for every new task. Reinforcement learning (RL) holds the promise of autonomously learning new control policies through trial-and-error. However, RL approaches are prone to learning with high samples, particularly for continuous control problems. In this paper, a learning-based method is presented that leverages simulation data to learn an object manipulation task through RL. The control policy is parameterized by a neural network and learned using modern Proximal Policy Optimization (PPO) algorithm. A dense reward function has been designed for the task to enable efficient learning of an agent. The proposed approach is trained entirely in simulation (exploiting the MuJoCo environment) from scratch without any demonstrations of the task. A grasping task involving a Franka Emika Panda manipulator has been considered as the reference task to be learned. The task requires the robot to reach the part, grasp it, and lift it off the contact surface. The proposed approach has been demonstrated to be generalizable across multiple object geometries and initial robot/parts configurations, having the robot able to learn and re-execute the target task.File | Dimensione | Formato | |
---|---|---|---|
09282951.pdf
Accesso riservato
Descrizione: paper
:
Publisher’s version
Dimensione
523.38 kB
Formato
Adobe PDF
|
523.38 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.