Reinforcement learning-based control for offshore crane load-landing operations

Karimi H. R.
2022-01-01

Abstract

Offshore crane operations are frequently carried out under adverse weather conditions. While offshore cranes attempt to complete a load-landing or load-lifting operation, the impact between the load and the vessel is critical, as it can cause serious injuries and extensive damage. Reinforcement learning (RL) has been applied to control multiple offshore crane operations, including load-landing operations. In this paper, the Q-learning algorithm is used to develop optimal control sequences for the offshore crane's actuators that minimize the impact velocity between the crane's load and the moving vessel. To construct the RL environment, a mathematical model of the crane dynamics is derived using the Denavit–Hartenberg (DH) technique and the Lagrange approach. The Double Q-learning algorithm is used to mitigate the overestimation bias common to Q-learning. The average return is studied to assess the performance of the Q-learning algorithm. Furthermore, the trained control sequence is tested on a separate sample of episodes, confirming, in this application domain, the hypothesis that, unlike supervised learning, reinforcement learning yields only a locally (rather than globally) optimal control sequence.
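As a point of reference for the Q-learning and Double Q-learning updates the abstract names, below is a minimal tabular sketch in Python. It is illustrative only: the state/action discretization, the hyperparameters, and the update routine names are hypothetical placeholders, not the paper's crane model or implementation.

```python
# Minimal tabular Double Q-learning sketch (illustrative; not the paper's code).
import numpy as np

n_states, n_actions = 100, 5        # hypothetical discretized state/action grid
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # hypothetical learning hyperparameters

rng = np.random.default_rng(0)
Q_a = np.zeros((n_states, n_actions))
Q_b = np.zeros((n_states, n_actions))

def select_action(state: int) -> int:
    """Epsilon-greedy selection over the sum of both value tables."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q_a[state] + Q_b[state]))

def double_q_update(s: int, a: int, r: float, s_next: int) -> None:
    """Randomly update one table, using the other to evaluate the greedy action.
    Decoupling action selection from action evaluation is what reduces the
    overestimation bias of standard (single-table) Q-learning."""
    if rng.random() < 0.5:
        a_star = int(np.argmax(Q_a[s_next]))
        Q_a[s, a] += alpha * (r + gamma * Q_b[s_next, a_star] - Q_a[s, a])
    else:
        b_star = int(np.argmax(Q_b[s_next]))
        Q_b[s, a] += alpha * (r + gamma * Q_a[s_next, b_star] - Q_b[s, a])
```

Standard Q-learning corresponds to keeping a single table and using it both to pick and to evaluate the greedy next action, which is the source of the bias the sketch above avoids.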
Keywords: Marine operation; offshore crane; Q-learning algorithm; reinforcement learning


Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1263231
Citations
  • Scopus: 3