
Neural Adaptive Quantized Tracking Control Using Reinforcement Learning for Nonlinear Cyber-Physical Systems With Time-Reference-Dependent Constraints

Karimi, Hamid Reza;
2025-01-01

Abstract

In this article, the identifier–critic–actor neural adaptive optimal control problem is addressed for a class of fully nonaffine pure-feedback nonlinear cyber-physical systems with input quantization and time-reference-dependent output constraints. By constructing a time-varying asymmetric barrier Lyapunov function and integrating it with the dynamic surface control method and a reinforcement learning algorithm, a controller is designed based on a neural network approximation of the identifier–critic–actor structure. In this framework, the identifier estimates the unknown dynamics, the critic evaluates system performance, and the actor executes the control action. In the control scheme, the virtual controls and the actual control input, obtained via dynamic surface control, are designed as the optimal solutions of their corresponding subsystems. The update law is derived by taking the negative gradient of a simple positive function constructed from the partial derivatives of the Hamilton–Jacobi–Bellman equation. In parallel, the proposed quantizer combines the benefits of both hysteretic and uniform quantization. A key aspect of this article is the simultaneous consideration of constraint boundaries that depend on both the reference signal and time, which adds complexity to the design of the control algorithm. Stability analysis confirms that all signals remain bounded and that the output adheres to the time- and reference-dependent constraints.
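The abstract states that the proposed quantizer combines the benefits of hysteretic and uniform quantization. The paper's exact construction is not reproduced here, but the underlying idea can be sketched as follows: uniformly spaced quantization levels, with a hysteresis band around the currently held level so that small fluctuations of the control signal do not cause chattering between adjacent levels. The class name and the parameter values are hypothetical, not taken from the paper.

```python
class HysteresisUniformQuantizer:
    """Illustrative hybrid quantizer: uniform levels q_i = i * delta,
    switched with hysteresis to avoid chattering. Parameters are
    hypothetical and chosen only for demonstration."""

    def __init__(self, delta=0.5, hysteresis=0.2):
        self.delta = delta           # uniform quantization step
        self.h = hysteresis * delta  # extra hysteresis half-width
        self.level = 0.0             # last emitted quantization level

    def quantize(self, u):
        # Jump to a new uniform level only when the input leaves the
        # hysteresis band around the currently held level; otherwise
        # hold the previous level (this is the anti-chattering effect).
        if abs(u - self.level) > self.delta / 2 + self.h:
            self.level = self.delta * round(u / self.delta)
        return self.level
```

With these illustrative parameters, an input hovering near a level boundary keeps its current quantized value until it moves decisively past the enlarged threshold, which is the practical benefit of adding hysteresis to a uniform quantizer.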
adaptive control; cyber-physical systems; input quantization; reinforcement learning; time-reference-dependent constraints;
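The time-varying asymmetric barrier Lyapunov function used to enforce the output constraints is typically of logarithmic type, growing unbounded as the tracking error approaches either constraint boundary. A minimal illustrative sketch is given below, with hypothetical time-varying bounds `k_lower(t)` and `k_upper(t)` that are not taken from the paper; only the general log-type form is standard.

```python
import math

def barrier_lyapunov(e, k_lower, k_upper):
    """Asymmetric log-type barrier Lyapunov function: zero at e = 0,
    and unbounded as the tracking error e approaches the lower bound
    -k_lower or the upper bound k_upper."""
    assert -k_lower < e < k_upper, "error must stay inside the constraint band"
    if e >= 0:
        return 0.5 * math.log(k_upper**2 / (k_upper**2 - e**2))
    return 0.5 * math.log(k_lower**2 / (k_lower**2 - e**2))

# Hypothetical time-varying bounds, shrinking toward steady-state values,
# to illustrate how the constraint band can depend on time.
def k_upper(t):
    return 1.0 + 0.5 * math.exp(-t)

def k_lower(t):
    return 0.8 + 0.4 * math.exp(-t)
```

Keeping such a function bounded along the closed-loop trajectories implies that the tracking error never reaches the boundaries, which is how barrier Lyapunov analysis guarantees the output stays within its time- and reference-dependent constraints.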
Files in this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1310749
Citations
  • Scopus: 1
  • Web of Science: 1