A particle-based policy for the optimal control of Markov decision processes
PIROTTA, MATTEO;MANGANINI, GIORGIO;PIRODDI, LUIGI;PRANDINI, MARIA;RESTELLI, MARCELLO
2014-01-01
Abstract
When the state dimension is large, classical approximate dynamic programming techniques may become computationally infeasible, since the complexity of the algorithm grows exponentially with the size of the state space (curse of dimensionality). Policy search techniques can overcome this problem because, instead of estimating the value function over the entire state space, they search for the optimal control policy in a restricted parameterized policy space. This paper presents a new policy parametrization that exploits a single point (particle) to represent an entire region of the state space and can be tuned through a recently introduced policy gradient method with parameter-based exploration. Experiments demonstrate the superior performance of the proposed approach in high-dimensional environments.
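The parameter-based exploration idea mentioned in the abstract can be illustrated with a minimal sketch: rather than injecting noise into individual actions, the algorithm samples complete policy parameter vectors from a Gaussian hyper-distribution and updates the distribution's mean with a likelihood-ratio gradient estimate. The quadratic return function, the baseline choice, and all numerical settings below are illustrative assumptions for a toy problem, not the particle-based policy or the experiments of the paper.

```python
import numpy as np

# Sketch of a policy gradient method with parameter-based exploration
# (PGPE-style). The "episode return" is a hypothetical stand-in for a
# full rollout of a deterministic policy with parameters theta.

rng = np.random.default_rng(0)
target = np.array([1.0, -1.0])  # unknown optimal policy parameters (toy)

def episode_return(theta):
    # Toy return: larger the closer theta is to the unknown optimum.
    return -np.sum((theta - target) ** 2)

mu = np.zeros(2)   # mean of the Gaussian hyper-distribution over parameters
sigma = 0.3        # fixed per-dimension exploration standard deviation
alpha = 0.05       # learning rate
batch = 10         # sampled policies per update

for _ in range(500):
    baseline = episode_return(mu)   # simple variance-reducing baseline
    grad = np.zeros(2)
    for _ in range(batch):
        theta = rng.normal(mu, sigma)            # sample policy parameters
        adv = episode_return(theta) - baseline
        grad += adv * (theta - mu) / sigma**2    # likelihood-ratio gradient
    mu += alpha * grad / batch                   # ascend the hyper-mean

print(mu)  # should end up close to the target [1, -1]
```

Because exploration happens in parameter space, each sampled policy is deterministic within an episode, which is what gives parameter-based exploration its lower-variance gradient estimates compared with action-space perturbation.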
File: 2014 - IFAC WC - PirottaManganiniPiroddiPrandiniRestelli.pdf (publisher's version, restricted access, 433.58 kB, Adobe PDF)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.