Multi-Objective Optimal Design Based on Reinforcement Learning

De Santanna, Lorenzo; Guidotti, Giacomo; Mastinu, Gianpiero; Gobbi, Massimiliano
2025-01-01

Abstract

This article describes and applies a new optimization method that combines multi-objective programming and reinforcement learning. The method, called MORL-DB (multi-objective reinforcement learning, dominance based), introduces the concept of Pareto dominance into the reinforcement learning framework. MORL-DB employs the deep deterministic policy gradient (DDPG) algorithm with a reward function based on Pareto optimality. The method is first tested on Viennet's benchmark problem and then applied to the Osyczka and Kundu benchmark problem. Finally, it is used to compute the Pareto front for the vertical dynamics of the quarter vehicle model in terms of two design variables and three objective functions. The results of the three case studies are compared with those obtained using the parameter space investigation method and a nondominated sorting genetic algorithm. The comparison highlights the ability of MORL-DB to generate a large number of optimal solutions with few objective function evaluations.
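The abstract relies on Pareto dominance without defining it. As a minimal sketch, assuming all objectives are minimized, the standard dominance test and a nondominated filter can be written as below. The `dominance_reward` function is purely hypothetical: the abstract does not specify the reward used by MORL-DB, so it only illustrates one way a reward could be tied to dominance against an archive of known solutions.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dominance_reward(candidate: Sequence[float],
                     archive: List[Sequence[float]]) -> float:
    """Hypothetical reward (an assumption, not the paper's definition):
    +1 if no archived solution dominates the candidate, -1 otherwise."""
    return -1.0 if any(dominates(q, candidate) for q in archive) else 1.0


# Illustrative two-objective values, not taken from the paper.
points = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = pareto_front(points)
```

Here `(3.0, 3.0)` is discarded because `(2.0, 2.0)` is better in both objectives, while the remaining three points are mutually nondominated.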
2025
DDPG; design optimization; multi-objective optimization; quarter vehicle model; reinforcement learning
File in this record: md-24-1765.pdf (open access; Publisher’s version; Adobe PDF; 1.14 MB)


Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1311312