Precision UAV formation control via PGPE-enhanced NMPC
Olivieri, Pierriccardo; Sanchini, Andrea; Gatti, Nicola; Formentin, Simone
2025-01-01
Abstract
In formation control for unmanned aerial vehicles (UAVs), a fleet of drones is arranged in a predefined geometric configuration that must be maintained throughout the flight, while avoiding collisions with other drones and obstacles. In real-world applications, the need for quick deployment of UAV fleets often makes controller parameter tuning a significant challenge. In this paper, we introduce an end-to-end formation controller based on Nonlinear Model Predictive Control (NMPC), enhanced by a reinforcement learning algorithm for optimal hyperparameter tuning. Specifically, we adapt the Policy Gradient with Parameter-based Exploration (PGPE) algorithm to the formation control context. This method offers a fast and scalable solution for parameter tuning that does not require a differentiable controller and can be customized to the specific needs of the deployer. To validate our approach, we conduct simulation experiments using a realistic quadrotor model in a three-dimensional environment with static obstacles. Our results demonstrate the effectiveness and advantages of our method in comparison to state-of-the-art algorithms.
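The abstract notes that PGPE tunes NMPC hyperparameters without requiring a differentiable controller: parameters are sampled from a Gaussian, each sample is scored by a full rollout, and the distribution itself is updated. The sketch below shows the basic PGPE update in that black-box setting; it is illustrative only, with `mock_formation_cost` standing in for an actual NMPC rollout of the paper's formation controller, and all names and step sizes are assumptions rather than the authors' implementation.

```python
import numpy as np

def pgpe_tune(reward_fn, mu0, sigma0, iters=200, pop=20,
              lr_mu=0.2, lr_sigma=0.1, seed=0):
    """Basic PGPE: tune black-box controller hyperparameters by sampling
    them from an independent Gaussian N(mu, sigma^2) and following the
    parameter-based policy gradient. reward_fn is treated as a black box,
    so no gradient of the controller is ever needed."""
    rng = np.random.default_rng(seed)
    mu = np.array(mu0, dtype=float)
    sigma = np.array(sigma0, dtype=float)
    for _ in range(iters):
        # Sample a population of parameter vectors around the current mean.
        eps = rng.standard_normal((pop, mu.size)) * sigma
        thetas = mu + eps
        # Score each candidate with one (simulated) rollout.
        rewards = np.array([reward_fn(t) for t in thetas])
        adv = rewards - rewards.mean()          # baseline-subtracted returns
        scale = np.abs(adv).sum() + 1e-12       # normalize the step size
        # Parameter-based gradient estimates for mean and std.
        mu = mu + lr_mu * (adv @ eps) / scale
        sigma = sigma + lr_sigma * (adv @ ((eps**2 - sigma**2) / sigma)) / scale
        sigma = np.maximum(sigma, 1e-6)         # keep exploration noise valid
    return mu, sigma

# Hypothetical stand-in for an NMPC rollout: reward is highest when the
# tuned weights reach (2.0, 0.5); a real deployment would run the
# formation-control simulation here instead.
def mock_formation_cost(theta):
    return -np.sum((theta - np.array([2.0, 0.5]))**2)

mu, sigma = pgpe_tune(mock_formation_cost, mu0=[0.0, 0.0], sigma0=[1.0, 1.0])
```

Because only sampled rewards enter the update, the same loop applies whether the rollout is a cheap surrogate or a full quadrotor simulation with obstacles, which is what makes the approach fast to deploy.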


