Anytime learning and adaptation of fuzzy logic behaviors

Bonarini, Andrea
1997-01-01

Abstract

We present an approach to support effective learning and adaptation of behaviors for autonomous agents with reinforcement learning algorithms. These methods can identify control systems that optimize a reinforcement program, usually a straightforward representation of the designer's goals. Although reinforcement learning algorithms provide a suitable way to represent the desired behavior, they are usually too slow to be applied in real time on embodied agents. We have tackled three aspects of this problem: the speed of the algorithm, the learning procedure, and the control system architecture. The learning algorithm we have developed includes features that speed up learning, such as niche-based learning and a representation of the control modules in terms of fuzzy rules, which reduces the search space and improves robustness to noisy data. Our learning procedure exploits methodologies such as learning from easy missions and transfer of policies from simpler environments to more complex ones. The architecture of our control system is layered and modular, so that each module has low complexity and can be learned in a short time. The composition of the actions proposed by the modules is either learned or predefined. Finally, we adopt an anytime learning approach to improve the quality of the control system on-line and to adapt it to dynamic environments. The experiments presented in this article concern learning to reach another moving agent in a real, dynamic environment that includes nontrivial situations, such as the moving target being faster than the agent or hidden by obstacles.
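To make the ingredients named in the abstract concrete, here is a minimal, self-contained Python sketch. It is not the paper's actual system: the names (FuzzyRule, BehaviorModule, compose, anytime_learning), the triangular membership functions, the strength-update rule, and the toy mutate/evaluate pair are all illustrative assumptions. The sketch shows a fuzzy-rule control module whose rule strengths are reinforced only within the matching niche, a predefined activation-weighted composition of module proposals, and an anytime loop in which the best controller found so far is always the one in service while a learner keeps searching for a better candidate.

```python
import copy
import random

def triangular(x, a, b, c):
    """Degree in [0, 1] to which x belongs to the triangular fuzzy set (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

class FuzzyRule:
    """IF input is <fuzzy set> THEN action = <value>, with a learnable strength."""
    def __init__(self, fset, action):
        self.fset = fset        # (a, b, c) of the antecedent membership function
        self.action = action    # crisp consequent proposed by this rule
        self.strength = 0.5     # fitness value updated by reinforcement

    def match(self, x):
        return triangular(x, *self.fset)

class BehaviorModule:
    """One low-complexity control module: a small fuzzy rule base."""
    def __init__(self, rules):
        self.rules = rules

    def propose(self, x):
        """Return a defuzzified action plus the module's activation level."""
        num = den = 0.0
        for r in self.rules:
            w = r.match(x) * r.strength
            num += w * r.action
            den += w
        return (num / den if den > 0 else 0.0), den

    def reinforce(self, x, reward, lr=0.1):
        """Niche-based credit assignment: only rules matching the current
        state (the niche) compete for the reward, so learning in one region
        of the input space does not disturb the others."""
        for r in self.rules:
            m = r.match(x)
            if m > 0.0:
                r.strength += lr * m * (reward - r.strength)

def compose(proposals):
    """Predefined composition: blend module actions, weighted by activation."""
    den = sum(w for _, w in proposals)
    return sum(a * w for a, w in proposals) / den if den > 0 else 0.0

def anytime_learning(best, mutate, evaluate, budget):
    """Anytime loop: the best controller found so far is always the one
    running on the agent; candidates produced off-line (e.g. on a simulator)
    replace it only when they score better."""
    best_score = evaluate(best)
    for _ in range(budget):
        candidate = mutate(best)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Hypothetical "follow" module mapping target distance to forward speed.
follow = BehaviorModule([
    FuzzyRule((-1.0, 0.0, 2.0), 0.0),    # distance NEAR -> stop
    FuzzyRule((0.0, 2.0, 4.0), 0.5),     # distance MID  -> half speed
    FuzzyRule((2.0, 4.0, 10.0), 1.0),    # distance FAR  -> full speed
])
# A second hypothetical module; composition blends the active proposals.
avoid = BehaviorModule([FuzzyRule((-1.0, 0.0, 1.0), -1.0)])  # obstacle NEAR -> back off
command = compose([follow.propose(3.0), avoid.propose(0.4)])

def mutate(module):
    """Perturb one rule strength: a toy stand-in for the learner's search."""
    m = copy.deepcopy(module)
    r = random.choice(m.rules)
    r.strength = min(1.0, max(0.0, r.strength + random.uniform(-0.1, 0.1)))
    return m

def evaluate(module):
    """Toy stand-in for simulator trials: want ~0.9 speed at mid range."""
    action, _ = module.propose(3.0)
    return -abs(action - 0.9)

best_follow = anytime_learning(follow, mutate, evaluate, budget=50)
```

The fuzzy granularity is what keeps the search space small in a sketch like this: each rule covers a region of the input space rather than a single state, and the niche-based update touches only the rules responsible for the current situation, which is in the spirit of the speed-ups the abstract refers to.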
Robotics; Autonomous Robots; Machine Learning; Reinforcement Learning; Fuzzy Systems; Fuzzy Behavior
Files in this record:
10.1.1.57.3281.pdf (open access)
Description: Main article
Type: Post-Print (DRAFT or Author's Accepted Manuscript, AAM)
Size: 405.63 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/518618
Citations
  • Scopus: 27
  • Web of Science (ISI): 18