RE.PUBLIC@POLIMI Research Publications of the Politecnico di Milano

Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods.

LAZARIC, ALESSANDRO;RESTELLI, MARCELLO;BONARINI, ANDREA

2009-01-01

Abstract

Learning in real-world domains often requires dealing with continuous state and action spaces. Although many solutions have been proposed to apply reinforcement learning algorithms to continuous state problems, the same techniques can hardly be extended to continuous action spaces, where, besides the computation of a good approximation of the value function, a fast method for the identification of the highest-valued action is needed. In this paper, we propose a novel actor-critic approach in which the policy of the actor is estimated through sequential Monte Carlo methods. The importance sampling step is performed on the basis of the values learned by the critic, while the resampling step modifies the actor's policy. The proposed approach has been empirically compared to other learning algorithms in several domains; in this paper, we report results obtained in a control problem consisting of steering a boat across a river.
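
The two steps named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the Boltzmann form of the weights, the effective-sample-size trigger, the Gaussian jitter, the single-state setting, and the function `toy_critic` are all assumptions introduced here for illustration only.

```python
import math
import random


def importance_weights(q_values, temperature=1.0):
    """Turn critic estimates into normalized weights (Boltzmann form, an assumption)."""
    q_max = max(q_values)  # subtract the max for numerical stability
    unnorm = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def effective_sample_size(weights):
    """Standard SMC diagnostic: 1 / sum(w_i^2) for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def resample(actions, weights, rng, jitter=0.05, low=-1.0, high=1.0):
    """Draw a new action set in proportion to the weights, then perturb it."""
    drawn = rng.choices(actions, weights=weights, k=len(actions))
    moved = [min(high, max(low, a + rng.gauss(0.0, jitter))) for a in drawn]
    uniform = [1.0 / len(moved)] * len(moved)
    return moved, uniform


def toy_critic(action):
    """Hypothetical stand-in for learned action values; peaks at action = 0.3."""
    return -(action - 0.3) ** 2


rng = random.Random(0)
# The actor's policy in one state is represented by a set of sampled actions.
actions = [rng.uniform(-1.0, 1.0) for _ in range(50)]
for _ in range(30):
    # Importance sampling step: weight each action by the critic's value.
    weights = importance_weights([toy_critic(a) for a in actions], temperature=0.05)
    # Resampling step: replace the action set when the weights degenerate.
    if effective_sample_size(weights) < 0.9 * len(actions):
        actions, weights = resample(actions, weights, rng)
mean_action = sum(actions) / len(actions)
```

In this toy setting the sampled actions should concentrate around the action that the stand-in critic values most, which is the behaviour the abstract attributes to the resampling step.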

Year of publication: 2009
Book title: Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
ISBN (International Standard Book Number): 9781605609492
Appears in the types: 04.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
nips2007.pdf (restricted access; Post-Print: draft or Author's Accepted Manuscript, AAM)	220.53 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/539022
