Deep reinforcement learning spacecraft guidance with state uncertainty for autonomous shape reconstruction of uncooperative target

Brandonisio, Andrea; Capra, Lorenzo; Lavagna, Michèle
2023-01-01

Abstract

In recent years, space research has shifted its focus heavily towards enhanced autonomy on board spacecraft for on-orbit servicing (OOS) activities. OOS and proximity operations encompass a variety of activities; the focal point of this work is the autonomous guidance of a chaser spacecraft for the shape reconstruction of an artificial uncooperative object. Adaptive guidance depends on the ability of the system to build a map of the uncertain environment, determine its own location within it, and derive the control law accordingly. Autonomous navigation is therefore framed as an active Simultaneous Localization and Mapping (SLAM) problem and modeled as a Partially Observable Markov Decision Process (POMDP). A state-of-the-art Deep Reinforcement Learning (DRL) method, Proximal Policy Optimization (PPO), is investigated to develop an agent capable of intelligently planning the shape reconstruction of the target. Building on previous research on the topic, this work goes further by proposing a continuous action space, so that the agent is no longer forced to choose from a predefined set of discrete actions fixed in both magnitude and direction: any combination of the three-dimensional thrust vector components is available. The chaser spacecraft is a small satellite equipped with an electric propulsion engine, which defines the range of the action space, in linearized eccentric relative motion with respect to the selected uncooperative object. The agent's geometry reconstruction and mapping capabilities are evaluated on a rendered triangular mesh, based on the number of high-quality pictures taken of each face. Extensive training runs are performed with random initial conditions to verify the generalization capability of the DRL agent. The results are then validated in a comprehensive testing campaign, whose primary focus is the introduction of noisy navigation measurements affecting pose estimation. The sensitivity of the proposed method to this condition is analyzed and the effectiveness of a retraining procedure is examined. The applicability of DRL methods and neural networks to support autonomous guidance in a close-proximity scenario is corroborated, and the employed technique is extensively tested and verified.
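The abstract invokes linearized eccentric relative motion without reproducing the dynamics. The standard linearized model for relative motion about an eccentric reference orbit is the Tschauner-Hempel system; the exact formulation used in the paper may differ. With $k = 1 + e\cos\nu$, primes denoting derivatives with respect to the true anomaly $\nu$, and $(\tilde{x}, \tilde{y}, \tilde{z})$ the scaled radial, along-track, and cross-track coordinates:

$$\tilde{x}'' = \frac{3}{k}\,\tilde{x} + 2\tilde{y}', \qquad \tilde{y}'' = -2\tilde{x}', \qquad \tilde{z}'' = -\tilde{z}$$

For $e = 0$ (so $k = 1$) these reduce to the normalized Clohessy-Wiltshire equations.

As a concrete illustration of the continuous action space described above, the sketch below pairs a minimal gymnasium environment, whose Box action holds the three thrust components, with the PPO implementation from stable-baselines3, which fits a Gaussian policy over such a space. All names, bounds, dynamics (circular-orbit Clohessy-Wiltshire standing in for the paper's eccentric model), and the standoff-keeping reward are placeholder assumptions, not the authors' setup.

# Illustrative sketch, not the authors' code: continuous 3D thrust
# commands for relative-motion guidance, trained with PPO.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class InspectionEnv(gym.Env):
    """Chaser in linearized relative motion; action = 3D thrust vector."""
    def __init__(self, n=0.0011, dt=10.0, t_max=0.05):
        super().__init__()
        self.n, self.dt = n, dt          # mean motion [rad/s], step [s]
        # Continuous action: each thrust component in [-t_max, t_max] (assumed units N/kg)
        self.action_space = spaces.Box(-t_max, t_max, shape=(3,), dtype=np.float32)
        # Observation: relative position and velocity in the LVLH frame
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        # Random initial conditions, mirroring the training strategy in the abstract
        self.x = self.np_random.uniform(-50, 50, size=6).astype(np.float32)
        self.x[3:] *= 0.01               # small initial relative velocity
        return self.x.copy(), {}

    def step(self, u):
        n, dt = self.n, self.dt
        r, v = self.x[:3], self.x[3:]
        # Clohessy-Wiltshire accelerations (circular-orbit limit of the
        # eccentric model used in the paper) plus the commanded thrust
        a = np.array([3 * n**2 * r[0] + 2 * n * v[1],
                      -2 * n * v[0],
                      -n**2 * r[2]]) + u
        self.x[:3] += v * dt + 0.5 * a * dt**2
        self.x[3:] += a * dt
        # Placeholder reward: hold a 100 m standoff; the paper instead
        # rewards picture coverage of the target mesh faces
        reward = -abs(np.linalg.norm(self.x[:3]) - 100.0) / 100.0
        return self.x.copy(), float(reward), False, False, {}

env = InspectionEnv()
model = PPO("MlpPolicy", env, verbose=0)   # Gaussian policy over the Box action
model.learn(total_timesteps=10_000)

The paper's reward instead scores the number of quality pictures obtained per mesh face; the placeholder reward above only demonstrates the plumbing of a continuous thrust command through PPO.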
Files in this item:
  File: BRANA03-23.pdf
  Description: Publisher's version (open access)
  Size: 2.47 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11311/1245738
Citations
  • Scopus: 1