Optimizing Parcels Sorting Through Reinforcement Learning for Intralogistics
Roveda, Loris;
2025-01-01
Abstract
Parcel sorting is a critical process in intralogistics, needed for the proper processing and dispatching of packages. The process is commonly executed manually by operators along the plant; this work adds no value and may cause musculoskeletal injuries because of the non-ergonomic working conditions. Automation solutions exist both on the market and in the scientific literature. However, the available solutions are usually implemented with pre-defined, simplified sorting rules or finite state machines that can manage only a limited number of parcel types and sorting scenarios. To generalize and fully automate the sorting process in intralogistics, we propose to employ Reinforcement Learning (RL) to derive sorting policies, in combination with machine vision for the online tracking of the parcels, whose tracked positions are used as the RL state. Specifically, the on-policy Proximal Policy Optimization (PPO) algorithm is used for RL, and YOLO is chosen as the machine vision algorithm for parcel recognition and tracking. Based on the AMS sorting module of the SAIET Engineering company, a modular kinematic model of the sorting system (an n-by-m matrix of AMS units, i.e., 2-action actuators), including parcel collision modeling, is derived and used as the environment for PPO. The sorting policy is trained offline by randomizing the number, size, and entry positions of the parcels. The trained policy is then deployed to the sorting module, which is equipped with cameras for machine vision implementation and performance evaluation. In-distribution and out-of-distribution tests (the latter with parcel types not considered in the offline training) achieved the target performance of 96.5% and 94% sorting accuracy, respectively.

| File | Description | Size | Format |
|---|---|---|---|
| FAIA-413-FAIA251484.pdf | Open access: Publisher's version | 2.25 MB | Adobe PDF |
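The abstract describes the training environment only at a high level. The following is a minimal, hypothetical sketch of what a randomized grid environment of 2-action actuators could look like; it is not the authors' kinematic model. Everything in it is an assumption made for illustration: the class `ToySortingGrid`, the meaning of the two actions (`FORWARD` moves a parcel one row ahead, `DIVERT` shifts it one column sideways), unit-size parcels, the crude "blocked if the next cell is occupied" collision rule, and the reward of 1 per parcel leaving the grid in its target column. A hand-written greedy policy stands in for the trained PPO policy.

```python
import random
from dataclasses import dataclass

FORWARD, DIVERT = 0, 1  # assumed meaning of the two actuator actions


@dataclass
class Parcel:
    row: int
    col: int
    target_col: int  # column in which the parcel should leave the grid


class ToySortingGrid:
    """Hypothetical n-by-m grid with one 2-action actuator per cell."""

    def __init__(self, n_rows, n_cols, max_parcels, seed=None):
        if max_parcels > n_cols:
            raise ValueError("max_parcels must not exceed n_cols")
        self.n_rows, self.n_cols, self.max_parcels = n_rows, n_cols, max_parcels
        self.rng = random.Random(seed)
        self.parcels = []

    def reset(self):
        # Randomize the parcel number and the (distinct) entry columns.
        k = self.rng.randint(1, self.max_parcels)
        entry_cols = self.rng.sample(range(self.n_cols), k)
        # Targets lie at or beyond the entry column, since DIVERT only
        # shifts in one direction in this toy model.
        self.parcels = [
            Parcel(0, c, self.rng.randrange(c, self.n_cols)) for c in entry_cols
        ]
        return self.state()

    def state(self):
        # Stand-in for the vision-based tracking output.
        return [(p.row, p.col, p.target_col) for p in self.parcels]

    def step(self, actions):
        """actions[r][c] is the action (0 or 1) of the actuator at cell (r, c)."""
        occupied = {(p.row, p.col) for p in self.parcels}
        reward, remaining = 0, []
        for p in self.parcels:
            if actions[p.row][p.col] == DIVERT:
                dest = (p.row, min(p.col + 1, self.n_cols - 1))
            else:
                dest = (p.row + 1, p.col)
            if dest[0] == self.n_rows:  # parcel leaves the grid
                occupied.discard((p.row, p.col))
                reward += int(p.col == p.target_col)
                continue
            if dest not in occupied:  # crude collision rule: wait if blocked
                occupied.discard((p.row, p.col))
                occupied.add(dest)
                p.row, p.col = dest
            remaining.append(p)
        self.parcels = remaining
        return self.state(), reward, not remaining


def greedy_actions(env):
    # Divert a parcel until it reaches its target column, then move it forward.
    actions = [[FORWARD] * env.n_cols for _ in range(env.n_rows)]
    for p in env.parcels:
        if p.col < p.target_col:
            actions[p.row][p.col] = DIVERT
    return actions


def run_episode(env, policy, max_steps=200):
    env.reset()
    total, sorted_ok = len(env.parcels), 0
    for _ in range(max_steps):
        _, reward, done = env.step(policy(env))
        sorted_ok += reward
        if done:
            break
    return sorted_ok, total
```

In an actual training setup, `greedy_actions` would be replaced by the policy being learned, and sorting accuracy over many randomized episodes would be computed as the ratio of correctly sorted parcels to total parcels returned by `run_episode`.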


