
Neural Adaptive Quantized Tracking Control Using Reinforcement Learning for Nonlinear Cyber-Physical Systems With Time-Reference-Dependent Constraints

Karimi, Hamid Reza;
2025-01-01

Abstract

In this article, the identifier–critic–actor neural adaptive optimal control problem is addressed for a class of fully nonaffine pure-feedback nonlinear cyber-physical systems with input quantization and time-reference-dependent output constraints. By constructing a time-varying asymmetric barrier Lyapunov function and integrating it with the dynamic surface control method and a reinforcement learning algorithm, a controller is designed based on a neural network approximation of the identifier–critic–actor structure. In this framework, the identifier estimates the unknown dynamics, the critic evaluates system performance, and the actor executes the control action. In the control scheme, the virtual controls and the actual control input, obtained via dynamic surface control, are designed as the optimal solutions of their corresponding subsystems. The update law is derived by taking the negative gradient of a simple positive function constructed from the partial derivatives of the Hamilton–Jacobi–Bellman equation. In parallel, the proposed quantizer combines the benefits of both hysteretic and uniform quantization. A key aspect of this article is the simultaneous consideration of constraint boundaries that depend on both the reference signal and time, which adds complexity to the design of the control algorithm. Stability analysis confirms that all signals remain bounded and that the output adheres to the time- and reference-dependent constraints.
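The abstract states that the proposed quantizer combines the benefits of hysteretic and uniform quantization. The paper's exact construction is not reproduced here, but the underlying idea can be sketched as follows: uniformly spaced quantization levels, with a hysteresis band around the currently held level so that small fluctuations of the control signal do not cause chattering between adjacent levels. The class name and the parameter values are hypothetical, not taken from the paper.

```python
class HysteresisUniformQuantizer:
    """Illustrative hybrid quantizer: uniform levels q_i = i * delta,
    switched with hysteresis to avoid chattering. Parameters are
    hypothetical and chosen only for demonstration."""

    def __init__(self, delta=0.5, hysteresis=0.2):
        self.delta = delta           # uniform quantization step
        self.h = hysteresis * delta  # extra hysteresis half-width
        self.level = 0.0             # last emitted quantization level

    def quantize(self, u):
        # Jump to a new uniform level only when the input leaves the
        # hysteresis band around the currently held level; otherwise
        # hold the previous level (this is the anti-chattering effect).
        if abs(u - self.level) > self.delta / 2 + self.h:
            self.level = self.delta * round(u / self.delta)
        return self.level
```

With these illustrative parameters, an input hovering near a level boundary keeps its current quantized value until it moves decisively past the enlarged threshold, which is the practical benefit of adding hysteresis to a uniform quantizer.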
adaptive control; cyber-physical systems; input quantization; reinforcement learning; time-reference-dependent constraints;
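The time-varying asymmetric barrier Lyapunov function used to enforce the output constraints is typically of logarithmic type, growing unbounded as the tracking error approaches either constraint boundary. A minimal illustrative sketch is given below, with hypothetical time-varying bounds `k_lower(t)` and `k_upper(t)` that are not taken from the paper; only the general log-type form is standard.

```python
import math

def barrier_lyapunov(e, k_lower, k_upper):
    """Asymmetric log-type barrier Lyapunov function: zero at e = 0,
    and unbounded as the tracking error e approaches the lower bound
    -k_lower or the upper bound k_upper."""
    assert -k_lower < e < k_upper, "error must stay inside the constraint band"
    if e >= 0:
        return 0.5 * math.log(k_upper**2 / (k_upper**2 - e**2))
    return 0.5 * math.log(k_lower**2 / (k_lower**2 - e**2))

# Hypothetical time-varying bounds, shrinking toward steady-state values,
# to illustrate how the constraint band can depend on time.
def k_upper(t):
    return 1.0 + 0.5 * math.exp(-t)

def k_lower(t):
    return 0.8 + 0.4 * math.exp(-t)
```

Keeping such a function bounded along the closed-loop trajectories implies that the tracking error never reaches the boundaries, which is how barrier Lyapunov analysis guarantees the output stays within its time- and reference-dependent constraints.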
Files in this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1310749
Citations
  • Scopus: 1
  • Web of Science: 1