RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Integrating Inverse Reinforcement Learning and Direct Policy Search for Modeling Multipurpose Water Reservoir Systems

Giuliani, Matteo;Castelletti, Andrea

2024-01-01

Abstract

System identification and optimal control have always contributed to water resources systems planning and management. Although water control problems are commonly formulated as multi-objective Markov Decision Processes, accurately modeling reservoir systems controlled by human operators remains challenging due to the absence of a formal definition of the objective function guiding their behavior. In this letter, we introduce a mixed Reinforcement Learning approach to model the dynamics of multipurpose reservoir systems. Specifically, our method first uses Inverse Reinforcement Learning to extract the tradeoff among competing objectives from historical observations of the reservoir system dynamics. The identified objective function is then used in the formulation of an optimal control problem returning a closed-loop policy which allows the simulation of the observed dynamics of the reservoir system. We demonstrate the potential of the proposed method in a real-world application involving the multipurpose regulation of Lake Como in northern Italy. Results show that our approach effectively infers the tradeoff between flood control and water supply adopted in the observed system's operation, and yields a control policy that closely approximates the observed system dynamics.
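The two-step idea in the abstract (first infer the tradeoff from observed operation, then solve a control problem with the inferred objective) can be illustrated with a deliberately small sketch. Everything below is a hypothetical toy and not the authors' algorithm: the reservoir is a bare mass balance, the release rule `r = theta * storage`, the constants, and the function names are all assumptions, and a plain grid search stands in for both the Inverse Reinforcement Learning step and the policy search step.

```python
import math

# Hypothetical toy setup: all constants and names are illustrative assumptions,
# none of them come from the Lake Como case study.
INFLOW = [50.0 + 40.0 * math.sin(2.0 * math.pi * t / 50.0) for t in range(200)]
DEMAND = 55.0        # water supply target per time step
FLOOD_LEVEL = 120.0  # storage above this level counts as flooding
S0 = 100.0           # initial storage


def simulate(theta):
    """Run the mass balance under the release rule r = theta * storage."""
    s, storages, releases = S0, [], []
    for q in INFLOW:
        r = min(max(theta * s, 0.0), s + q)  # cannot release more than available
        storages.append(s)
        releases.append(r)
        s = s + q - r
    return storages, releases


def objectives(theta):
    """Return (flood cost, supply deficit cost), both as mean squared excess."""
    storages, releases = simulate(theta)
    n = len(INFLOW)
    flood = sum(max(s - FLOOD_LEVEL, 0.0) ** 2 for s in storages) / n
    deficit = sum(max(DEMAND - r, 0.0) ** 2 for r in releases) / n
    return flood, deficit


THETAS = [i / 100.0 for i in range(5, 100)]   # candidate policy parameters
WEIGHTS = [i / 20.0 for i in range(21)]       # candidate flood weights in [0, 1]
OBJ = {th: objectives(th) for th in THETAS}   # evaluate each policy once


def best_policy(w):
    """Policy search stand-in: minimise w * flood + (1 - w) * deficit on a grid."""
    return min(THETAS, key=lambda th: w * OBJ[th][0] + (1.0 - w) * OBJ[th][1])


def infer_weight(observed_releases):
    """IRL stand-in: pick the weight whose optimal policy best matches the data."""
    def mismatch(w):
        _, simulated = simulate(best_policy(w))
        return sum((a - b) ** 2 for a, b in zip(simulated, observed_releases))
    return min(WEIGHTS, key=mismatch)


# Synthetic "historical" operation, generated with a weight hidden from the inference.
true_w = WEIGHTS[6]
_, observed = simulate(best_policy(true_w))

w_hat = infer_weight(observed)         # step 1: recover the tradeoff
theta_hat = best_policy(w_hat)         # step 2: closed-loop policy for that tradeoff
_, reproduced = simulate(theta_hat)    # simulate the system under the new policy
```

In this toy the recovered weight need not equal the hidden one, because several weights can lead to the same optimal policy on the grid; what the sketch checks is that the policy obtained from the inferred tradeoff reproduces the observed releases.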

Year of publication: 2024
Journal title: IEEE Control Systems Letters
Publication type: 01.1 Journal article

Files in this record:

File	Access	Version	Size	Format
Giuliani2024_IRL.pdf	Open access	Publisher's version	1.33 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1275666
