RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Reinforcement Learning for Uncooperative Space Objects Smart Imaging Path-Planning

Brandonisio, Andrea; Lavagna, Michèle; Guzzetti, Davide

2021-01-01

Abstract

Leading space agencies are increasingly investing in the gradual automation of space missions. Autonomous flight operations may be a key enabler for on-orbit servicing, assembly and manufacturing (OSAM) missions, carrying inherent benefits such as cost and risk reduction. Within the spectrum of proximity operations, this work focuses on autonomous path-planning for the reconstruction of the geometric properties of an uncooperative target. This autonomous navigation problem is known as the active Simultaneous Localization and Mapping (SLAM) problem, and it has been extensively studied within the field of robotics. The active SLAM problem may be formulated as a Partially Observable Markov Decision Process (POMDP). Previous works in astrodynamics have demonstrated that it is possible to use Reinforcement Learning (RL) techniques to teach an agent moving along a pre-determined orbit when to collect measurements in order to optimize a given mapping goal. In this work, different RL methods are explored to develop an artificial intelligence agent capable of planning sub-optimal paths for autonomous shape reconstruction of an unknown and uncooperative object via imaging. Proximity orbit dynamics are linearized and include orbit eccentricity. The geometry of the target object is rendered as a polyhedron with a triangular mesh. Artificially intelligent agents are created using both the Deep Q-Network (DQN) and the Advantage Actor Critic (A2C) methods. State-action value functions are approximated using Artificial Neural Networks (ANN) and trained according to RL principles. Training of the RL agent architecture occurs under fixed or random initial environment conditions. A large database of training tests has been collected. Trained agents show promising performance in achieving extended coverage of the target. Policy learning is demonstrated by showing that RL agents, at minimum, achieve higher mapping performance than agents that behave randomly. Furthermore, RL agents may learn to maneuver the spacecraft to control target lighting conditions as a function of the Sun location. This work, therefore, preliminarily demonstrates the applicability of RL to autonomous imaging of an uncooperative space object, thus setting a baseline for future works.
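
As a rough illustration of the coverage objective described in the abstract, the sketch below counts which faces of a triangular mesh are both facing the camera and lit by the Sun. It is not the authors' implementation: the function names (`visible_lit_faces`, `coverage`), the simple back-face test, and the neglect of self-occlusion, camera field of view, and range are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def visible_lit_faces(
    vertices: List[Vec],
    faces: List[Tuple[int, int, int]],
    camera_pos: Vec,
    sun_dir: Vec,
) -> Set[int]:
    """Return indices of faces oriented toward both the camera and the Sun.

    Assumptions (hypothetical, for illustration only): faces are wound so
    that the cross product gives the outward normal, `sun_dir` points from
    the target toward the Sun, and self-occlusion is ignored.
    """
    seen: Set[int] = set()
    for k, (i0, i1, i2) in enumerate(faces):
        v0, v1, v2 = vertices[i0], vertices[i1], vertices[i2]
        normal = cross(sub(v1, v0), sub(v2, v0))
        centroid = (
            (v0[0] + v1[0] + v2[0]) / 3.0,
            (v0[1] + v1[1] + v2[1]) / 3.0,
            (v0[2] + v1[2] + v2[2]) / 3.0,
        )
        faces_camera = dot(normal, sub(camera_pos, centroid)) > 0.0
        faces_sun = dot(normal, sun_dir) > 0.0
        if faces_camera and faces_sun:
            seen.add(k)
    return seen


def coverage(mapped: Set[int], n_faces: int) -> float:
    """Fraction of mesh faces mapped so far."""
    return len(mapped) / n_faces


# Unit tetrahedron with outward-facing winding (illustrative target shape).
VERTICES: List[Vec] = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
FACES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

mapped: Set[int] = set()
# Accumulate mapped faces over two camera positions and Sun directions.
mapped |= visible_lit_faces(VERTICES, FACES, (5.0, 5.0, 5.0), (1.0, 1.0, 1.0))
mapped |= visible_lit_faces(VERTICES, FACES, (0.0, 0.0, -5.0), (0.0, 0.0, -1.0))
print(coverage(mapped, len(FACES)))
```

In an RL setting, a reward could plausibly be tied to the increase in such a coverage fraction from one step to the next; the specific reward used in the paper is not stated in this record.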

Publication year: 2021
Journal title: THE JOURNAL OF THE ASTRONAUTICAL SCIENCES
Appears in types: 01.1 Journal Article

Files in this item:

File	Access	Description	Size	Format
BRANA01-21.pdf	Open access	Paper: Publisher’s version	3 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1189138
