Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands

Jabali, O
2023-01-01

Abstract

We study a stochastic variant of the vehicle routing problem (VRP) arising in the context of domestic donor collection services. The problem we consider combines the following attributes. Customers requesting service are stochastic and are not restricted to a predefined set. Furthermore, demand volumes are also stochastic and are only observed upon visiting customers. The objective is to maximize the expected served demand while meeting vehicle capacity and time restrictions. We call this problem the VRP with a highly Variable Customer basis and Stochastic Demands (VRP-VCSD). We first propose a classical Markov Decision Process (MDP) formulation for the VRP-VCSD. The resulting model is, however, unusable due to the explosion in the dimension of the state and action spaces. To solve the VRP-VCSD, we propose a number of methodological contributions aimed at reducing the state and action spaces. We first reformulate the MDP as an MDP with a consecutive action selection procedure. In this formulation, we enforce the treatment of a single vehicle (as opposed to multiple vehicles) at each decision epoch. We then introduce an observation function that selects the subset of the available information deemed relevant for the considered vehicle in each epoch. We develop a Q-learning algorithm called QN-CO. In particular, we use a continuous state representation and incorporate a two-layer artificial neural network to approximate the Q-values. Furthermore, we propose an aggregation strategy yielding a fixed-size output. Finally, we enhance our algorithm with Replay Memory and a Double Q Network. We conduct a thorough computational analysis. Results show that QN-CO considerably outperforms five benchmark policies. Moreover, we show that QN-CO can compete with specialized methods developed for the particular case of the VRP-VCSD where customer locations and expected demands are known in advance.
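To make the algorithmic ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' QN-CO implementation: a two-layer neural network approximating Q-values over a continuous state representation, a replay memory, and a double Q-network update. The state dimension, number of actions, and hyperparameters are hypothetical placeholders.

```python
# Hedged sketch of the generic components described in the abstract
# (two-layer Q-network, replay memory, double Q update); dimensions
# and hyperparameters are illustrative assumptions, not values from the paper.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 16      # hypothetical size of the continuous observation vector
NUM_ACTIONS = 10    # hypothetical fixed-size action output
GAMMA = 0.99
BATCH_SIZE = 32


class QNetwork(nn.Module):
    """Two-layer feed-forward approximator returning Q(s, a) for all actions."""

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


online_net = QNetwork(STATE_DIM, NUM_ACTIONS)
target_net = QNetwork(STATE_DIM, NUM_ACTIONS)
target_net.load_state_dict(online_net.state_dict())
optimizer = torch.optim.Adam(online_net.parameters(), lr=1e-3)

# Replay memory of transitions (state, action, reward, next_state, done),
# with states stored as plain lists of floats.
replay_memory = deque(maxlen=10_000)


def double_q_update():
    """One gradient step on the online network using the double Q target."""
    if len(replay_memory) < BATCH_SIZE:
        return
    states, actions, rewards, next_states, dones = zip(*random.sample(replay_memory, BATCH_SIZE))
    s = torch.tensor(states, dtype=torch.float32)
    a = torch.tensor(actions, dtype=torch.int64)
    r = torch.tensor(rewards, dtype=torch.float32)
    s_next = torch.tensor(next_states, dtype=torch.float32)
    done = torch.tensor(dones, dtype=torch.float32)

    q_sa = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Double Q: the online net selects the next action, the target net evaluates it.
        best_next = online_net(s_next).argmax(dim=1, keepdim=True)
        q_next = target_net(s_next).gather(1, best_next).squeeze(1)
        target = r + GAMMA * (1.0 - done) * q_next

    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The periodic copy of the online weights into the target network, the exploration policy, and the problem-specific observation and aggregation functions are omitted here; the paper itself should be consulted for how QN-CO defines them.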
2023
Stochastic vehicle routing problem
Approximate dynamic programming
Q-learning
Files in this product:
VRP_VCSD_C_OR___.pdf — Post-Print (DRAFT or Author's Accepted Manuscript, AAM), Adobe PDF, 994.31 kB (embargo until 30/11/2024)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1255217
Citations
  • PMC: not available
  • Scopus: 4
  • Web of Science: 0