RE.PUBLIC@POLIMI Research publications of the Politecnico di Milano

Multi-Objective Optimal Design Based on Reinforcement Learning

De Santanna, Lorenzo;Guidotti, Giacomo;Mastinu, Gianpiero;Gobbi, Massimiliano

2025-01-01

Abstract

This article describes and applies a new optimization method based on multi-objective programming and reinforcement learning. The new method, called MORL-DB (multi-objective reinforcement learning, dominance based), introduces the concept of Pareto dominance into the reinforcement learning framework. MORL-DB employs deep deterministic policy gradient (DDPG) with a reward function based on Pareto optimality. The MORL-DB method is first tested on Viennet's benchmark problem, then applied to the Osyczka and Kundu benchmark problem. Finally, it is used to compute the Pareto front for the vertical dynamics of the quarter vehicle model in terms of two design variables and three objective functions. The results of these three case studies are compared with those obtained using the parameter space investigation method and a nondominated sorting genetic algorithm. The comparison highlights the ability of MORL-DB to generate a high number of optimal solutions with a low number of objective function evaluations.
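
As an illustration of the Pareto dominance concept mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes all objectives are minimized. The function names (`dominates`, `pareto_front`, `dominance_reward`) and the +1/-1 reward values are hypothetical choices made for this example; the reward actually used by MORL-DB is defined in the article itself.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a Pareto-dominates b (minimization).

    a dominates b when a is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dominance_reward(candidate: Sequence[float],
                     archive: List[Sequence[float]]) -> float:
    """Hypothetical reward: +1 if nothing in the archive dominates the
    candidate, -1 otherwise. This is an assumption for illustration only."""
    return -1.0 if any(dominates(q, candidate) for q in archive) else 1.0


# Four candidate designs evaluated on two objectives (both minimized).
points = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
front = pareto_front(points)
```

In this example `(3.0, 3.0)` is dominated by `(2.0, 2.0)` and is therefore excluded from `front`, while the other three points are mutually non-dominated.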

Year of publication: 2025
Journal title: JOURNAL OF MECHANICAL DESIGN
Keywords: DDPG; design optimization; multi-objective optimization; quarter vehicle model; reinforcement learning
Appears in categories: 01.1 Journal article

Files in this item:

File	Size	Format
md-24-1765.pdf (open access, Publisher’s version)	1.14 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1311312
