
A MARL Approach to Employ Intelligent Traffic Steering in SD-WAN

Giacometti L.;Selvamuthukumaran K.;Sguotti G.;Troia S.;Verticale G.
2025-01-01

Abstract

The network availability of business-critical applications is fundamental to reducing the risk of disruption to the daily operations that rely on them. One way to address these contingencies is to adopt the Software-Defined Wide Area Network (SD-WAN) paradigm, which optimizes network performance and reliability through intelligent traffic steering that reduces the number of disruptions. In this work, we explore an SD-WAN scenario in which clients communicate with servers over channels established through overlay tunnels. Our goal is to improve network availability by dynamically rerouting the traffic flow between clients and servers onto the channel with the best performance. This is accomplished by leveraging a multi-agent reinforcement learning environment designed to handle incoming telemetry data, with network agents that learn and adapt their decision-making based on real-time feedback. The results show that our approach, based on the Double Deep Q-Network (DDQN) algorithm, outperforms an RTT-based greedy policy both in a single-agent scenario and in a context where four agents are employed.
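To make the comparison in the abstract concrete, the following is a minimal sketch of the two policies involved: an RTT-based greedy channel selector and the Double DQN bootstrap target, in which the online network selects the next action and the target network evaluates it. All function names, telemetry fields, and numeric values are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def greedy_policy(rtt_ms):
    """RTT-based greedy baseline: pick the channel with the lowest
    round-trip time (hypothetical sketch of the comparison policy)."""
    return int(np.argmin(rtt_ms))

def ddqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN bootstrap target for one transition: the online
    network chooses the next action, the target network evaluates it.
    This decoupling is what reduces Q-value overestimation in DDQN."""
    if done:
        return reward
    a_star = int(np.argmax(q_online_next))          # action picked by online net
    return reward + gamma * q_target_next[a_star]   # value from target net

# Example: three overlay channels with telemetry RTTs in milliseconds
rtts = [42.0, 17.5, 63.2]
best = greedy_policy(rtts)   # channel 1 has the lowest RTT

target = ddqn_target(
    reward=1.0, gamma=0.95,
    q_online_next=np.array([0.2, 0.8, 0.5]),
    q_target_next=np.array([0.3, 0.6, 0.9]),
    done=False,
)  # 1.0 + 0.95 * 0.6 = 1.57
```

A learned agent replaces the fixed `argmin` rule with Q-values trained on telemetry, which is what allows it to outperform the greedy policy when instantaneous RTT is a poor predictor of future channel quality.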
2025
2025 IEEE Conference on Standards for Communications and Networking, CSCN 2025
double deep q-network
multi-agent reinforcement learning
path selection
reinforcement learning
sd-wan
traffic steering

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1305190
Citations
  • PMC: n/a
  • Scopus: 0
  • Web of Science: n/a