RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Fitted Q-iteration for optimal water reservoir network operation under varying hydro-climatic conditions

Liang, Bin;Giuliani, Matteo;Zhang, Liping;Chen, Senlin;Castelletti, Andrea

2020-01-01

Abstract

Although it is one of the most important approaches for designing optimal water reservoir operating policies, Stochastic Dynamic Programming is challenged by the three curses of dimensionality, modeling, and multiple objectives, which make it unsuitable for most practical applications. Increased hydrological variability induced by climate change and human activities further challenges the control of hydraulic infrastructures, calling for more flexible and efficient approaches to operation design. Tree-based fitted Q-iteration (FQI) is a value-based, offline, batch-mode reinforcement learning method that builds a continuous approximation of the value function using a non-parametric randomized ensemble of regression trees, i.e. Extremely Randomized Trees. So far, FQI has been applied to relatively simple systems, comprising one dam and several state variables, and considering historical hydrology. In this work, we explore the potential of FQI to design reservoir network operation under varying hydro-climatological conditions. The approach is demonstrated on a real-world case study concerning the optimal operation of a network of three water reservoirs in the Qingjiang River basin, China. Preliminary results show that the computational efficiency and the performance of the policies derived by FQI are satisfactory compared to traditional Stochastic Dynamic Programming, and that these advantages become more relevant when the policies are evaluated under uncertain hydro-climatological and socio-economic conditions that require using more information for conditioning the control policy.
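To make the batch-mode idea in the abstract concrete, the following is a minimal sketch of tree-based fitted Q-iteration on a hypothetical single toy reservoir. It is not the authors' model of the three-reservoir Qingjiang network: the mass balance, the cost function, every constant (`CAPACITY`, `DEMAND`, `FLOOD_LEVEL`, `GAMMA`, the inflow range) and all function names are assumptions made purely for illustration. The regression trees are a stripped-down variant of Extremely Randomized Trees that tries a single random feature and cut-point per node (the "totally randomized" setting), written with the standard library only so that the sketch is self-contained.

```python
import random

ACTIONS = [0.0, 10.0, 20.0, 30.0, 40.0]   # hypothetical release decisions (toy units)
CAPACITY, DEMAND, FLOOD_LEVEL, GAMMA = 100.0, 20.0, 80.0, 0.9  # assumed constants

def step(storage, action, inflow):
    """Toy mass balance (an assumption): returns (immediate cost, next storage)."""
    release = min(action, storage + inflow)           # cannot release more than is available
    nxt = min(storage + inflow - release, CAPACITY)   # anything above capacity spills
    deficit = max(DEMAND - release, 0.0) / DEMAND
    flood = max(nxt - FLOOD_LEVEL, 0.0) / (CAPACITY - FLOOD_LEVEL)
    return deficit ** 2 + flood ** 2, nxt

def build_tree(X, y, rng, min_leaf=5):
    """Grow one randomized tree: random feature, random cut-point, no bootstrap."""
    mean = sum(y) / len(y)
    if len(y) <= min_leaf:
        return mean                                   # leaf stores the mean target
    j = rng.randrange(len(X[0]))
    lo, hi = min(x[j] for x in X), max(x[j] for x in X)
    cut = rng.uniform(lo, hi)
    left = [i for i, x in enumerate(X) if x[j] < cut]
    right = [i for i, x in enumerate(X) if x[j] >= cut]
    if not left or not right:
        return mean                                   # degenerate split: stop here
    return (j, cut,
            build_tree([X[i] for i in left], [y[i] for i in left], rng, min_leaf),
            build_tree([X[i] for i in right], [y[i] for i in right], rng, min_leaf))

def predict(tree, x):
    """Walk from the root to a leaf and return the stored mean."""
    while isinstance(tree, tuple):
        j, cut, left, right = tree
        tree = left if x[j] < cut else right
    return tree

def q_value(forest, storage, action):
    """Ensemble average; an empty forest encodes Q_0 = 0."""
    if not forest:
        return 0.0
    return sum(predict(t, (storage, action)) for t in forest) / len(forest)

def fitted_q_iteration(batch, n_iter=10, n_trees=10, seed=0):
    """Repeatedly regress cost + GAMMA * min_a' Q(s', a') on (s, a) over a fixed batch."""
    rng = random.Random(seed)
    X = [(s, a) for s, a, _, _ in batch]
    forest = []
    for _ in range(n_iter):
        y = [c + GAMMA * min(q_value(forest, s2, a2) for a2 in ACTIONS)
             for _, _, c, s2 in batch]
        forest = [build_tree(X, y, rng) for _ in range(n_trees)]
    return forest

def greedy_release(forest, storage):
    """Policy: pick the release decision with the lowest estimated cost-to-go."""
    return min(ACTIONS, key=lambda a: q_value(forest, storage, a))

# Offline batch of one-step transitions collected with random exploration.
rng = random.Random(42)
batch = []
for _ in range(400):
    s, a, q = rng.uniform(0, CAPACITY), rng.choice(ACTIONS), rng.uniform(0, 40)
    cost, s2 = step(s, a, q)
    batch.append((s, a, cost, s2))

forest = fitted_q_iteration(batch)
for s in (10.0, 50.0, 90.0):
    print(s, greedy_release(forest, s))
```

In this sketch the learner never queries a model of the system: it only sees the fixed batch of tuples, which is what "offline" and "batch mode" refer to. Conditioning the policy on more information, as the abstract discusses, would amount to appending further features to each `(storage, action)` input tuple, whereas a grid-based Stochastic Dynamic Programming solution would need an additional discretized state dimension for each of them.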

Year of publication

2020

Appears in types:

04.2 Abstract in conference proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1207444
