Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands

Jabali, O
2023-01-01

Abstract

We study a stochastic variant of the vehicle routing problem (VRP) arising in the context of domestic donor collection services. The problem we consider combines the following attributes. Customers requesting service are stochastic and are not restricted to a predefined set. Furthermore, demand volumes are also stochastic and are only observed upon visiting customers. The objective is to maximize the expected served demand while meeting vehicle capacity and time restrictions. We call this problem the VRP with a highly Variable Customer basis and Stochastic Demands (VRP-VCSD). We first propose a classical Markov Decision Process (MDP) formulation for the VRP-VCSD. The resulting model is, however, unusable due to the explosion in the dimension of the state and action spaces. To solve the VRP-VCSD, we propose a number of methodological contributions aimed at reducing the state and action spaces. We first reformulate the MDP as an MDP with a consecutive action selection procedure. In this formulation, we enforce the treatment of a single vehicle (as opposed to multiple vehicles) at each decision epoch. We then introduce an observation function that selects the subset of the available information deemed relevant for the considered vehicle in each epoch. We develop a Q-learning algorithm called QN-CO. In particular, we use a continuous state representation and incorporate a two-layer artificial neural network to approximate the Q-values. Furthermore, we propose an aggregation strategy yielding a fixed-size output. Finally, we enhance our algorithm with Replay Memory and a Double Q Network. We conduct a thorough computational analysis. Results show that QN-CO considerably outperforms five benchmark policies. Moreover, we show that QN-CO can compete with specialized methods developed for the particular case of the VRP-VCSD where customer locations and expected demands are known in advance.
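To make the algorithmic ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' QN-CO implementation: a two-layer neural network approximating Q-values over a continuous state representation, a replay memory, and a double Q-network update. The state dimension, number of actions, and hyperparameters are hypothetical placeholders.

```python
# Hedged sketch of the generic components described in the abstract
# (two-layer Q-network, replay memory, double Q update); dimensions
# and hyperparameters are illustrative assumptions, not values from the paper.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 16      # hypothetical size of the continuous observation vector
NUM_ACTIONS = 10    # hypothetical fixed-size action output
GAMMA = 0.99
BATCH_SIZE = 32


class QNetwork(nn.Module):
    """Two-layer feed-forward approximator returning Q(s, a) for all actions."""

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


online_net = QNetwork(STATE_DIM, NUM_ACTIONS)
target_net = QNetwork(STATE_DIM, NUM_ACTIONS)
target_net.load_state_dict(online_net.state_dict())
optimizer = torch.optim.Adam(online_net.parameters(), lr=1e-3)

# Replay memory of transitions (state, action, reward, next_state, done),
# with states stored as plain lists of floats.
replay_memory = deque(maxlen=10_000)


def double_q_update():
    """One gradient step on the online network using the double Q target."""
    if len(replay_memory) < BATCH_SIZE:
        return
    states, actions, rewards, next_states, dones = zip(*random.sample(replay_memory, BATCH_SIZE))
    s = torch.tensor(states, dtype=torch.float32)
    a = torch.tensor(actions, dtype=torch.int64)
    r = torch.tensor(rewards, dtype=torch.float32)
    s_next = torch.tensor(next_states, dtype=torch.float32)
    done = torch.tensor(dones, dtype=torch.float32)

    q_sa = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Double Q: the online net selects the next action, the target net evaluates it.
        best_next = online_net(s_next).argmax(dim=1, keepdim=True)
        q_next = target_net(s_next).gather(1, best_next).squeeze(1)
        target = r + GAMMA * (1.0 - done) * q_next

    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The periodic copy of the online weights into the target network, the exploration policy, and the problem-specific observation and aggregation functions are omitted here; the paper itself should be consulted for how QN-CO defines them.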
2023
Stochastic vehicle routing problem
Approximate dynamic programming
Q-learning
Files in this product:
VRP_VCSD_C_OR___.pdf — Post-Print (DRAFT or Author's Accepted Manuscript, AAM), Adobe PDF, 994.31 kB (embargo until 30/11/2024)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1255217
Citations
  • PMC: not available
  • Scopus: 4
  • Web of Science: 0