RE.PUBLIC@POLIMI Research Publications of the Politecnico di Milano

Propulsive landing of launchers’ first stages with Deep Reinforcement Learning

Iafrate, Davide; Brandonisio, Andrea; Hinz, Robert; Lavagna, Michèle

2025-01-01

Abstract

The planetary landing problem is gaining relevance in the space sector, with applications ranging from unmanned probes landing on other planetary bodies to reusable first and second stages of launch vehicles. Existing methods lack flexibility in handling complex non-linear dynamics, particularly in the presence of non-convexifiable constraints. It is therefore crucial to assess the performance of novel techniques, together with their advantages and disadvantages. This work develops an integrated 6-DOF guidance and control approach based on reinforcement learning of deep neural network policies for fuel-optimal planetary landing control, applied specifically to the terminal landing of a launcher first stage, and assesses its performance and robustness. 3-DOF and 6-DOF simulators are developed and encapsulated in Markov Decision Process (MDP)-like environments compatible with industry standards. Particular care is taken in shaping reward functions that lead to a landing that is both successful and fuel-optimal. A cloud pipeline is developed to train an agent effectively with the PPO reinforcement learning algorithm so that it achieves the landing goal.
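To make the idea of wrapping a simulator in an MDP-like environment with a shaped reward more concrete, here is a minimal sketch. It is not the authors' simulator: it models only 1-D vertical descent rather than 3-DOF or 6-DOF dynamics, and the class name `VerticalLandingEnv`, every physical parameter, and every reward weight are hypothetical values chosen for illustration. The `reset`/`step` return convention mirrors the Gymnasium API, which is an assumption about what "industry-standard" refers to in the abstract.

```python
from dataclasses import dataclass

G0 = 9.81  # standard gravity, m/s^2


@dataclass
class VerticalLandingEnv:
    """Toy 1-D powered-descent MDP (hypothetical, not the paper's simulator)."""
    dry_mass: float = 1000.0     # kg, assumed
    fuel_mass: float = 300.0     # kg, assumed
    max_thrust: float = 30000.0  # N, assumed
    isp: float = 280.0           # s, assumed specific impulse
    dt: float = 0.1              # s, integration step
    h0: float = 500.0            # m, initial altitude
    v0: float = -50.0            # m/s, initial velocity (negative = descending)
    max_steps: int = 2000        # episode length cap

    def reset(self):
        """Start a new episode and return (observation, info)."""
        self.h, self.v, self.fuel, self.steps = self.h0, self.v0, self.fuel_mass, 0
        return self._obs(), {}

    def _obs(self):
        return (self.h, self.v, self.fuel)

    def step(self, throttle):
        """Advance one step; the action is a throttle command in [0, 1]."""
        throttle = min(max(float(throttle), 0.0), 1.0)
        thrust = throttle * self.max_thrust if self.fuel > 0.0 else 0.0
        mass = self.dry_mass + self.fuel
        # Propellant consumed this step, capped by what is left in the tank.
        burned = min(self.fuel, thrust / (self.isp * G0) * self.dt)
        # Semi-implicit Euler integration of the vertical dynamics.
        self.v += (thrust / mass - G0) * self.dt
        self.h += self.v * self.dt
        self.fuel -= burned
        self.steps += 1

        # Shaped reward: a small running fuel penalty plus a terminal term.
        reward = -0.01 * burned
        terminated = self.h <= 0.0
        truncated = (not terminated) and self.steps >= self.max_steps
        if terminated:
            self.h = 0.0
            soft_touchdown = abs(self.v) <= 2.0  # m/s, assumed tolerance
            reward += 100.0 if soft_touchdown else -abs(self.v)
        return self._obs(), reward, terminated, truncated, {"burned": burned}


if __name__ == "__main__":
    env = VerticalLandingEnv()
    obs, _ = env.reset()
    done, total = False, 0.0
    while not done:
        # Placeholder policy: constant half throttle instead of a trained agent.
        obs, r, terminated, truncated, _ = env.step(0.5)
        total += r
        done = terminated or truncated
    print("final observation:", obs, "return:", total)
```

In this sketch the running fuel penalty and the terminal touchdown term pull in different directions, which is the trade-off that reward shaping has to balance. A PPO implementation would replace the constant-throttle placeholder with a learned policy.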

Publication year: 2025

Journal title: ACTA ASTRONAUTICA

Keywords: Controls; GNC; Launchers; Machine learning; Reinforcement learning; Retropropulsive landing

Appears in types: 01.1 Journal Article

Files in this record:

File	Size	Format
IAFRD01-25.pdf (open access, publisher's version)	4.26 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1279157
