Resource Allocation in Flexible-Bandwidth Fine-Grained Optical Transport Networks for Geo-Distributed Machine Learning
Tornatore, Massimo;
2025-01-01
Abstract
Geo-distributed machine learning (GDML) enables collaborative learning among geographically dispersed data centers, meeting the demand for distributed, privacy-preserving training in large-scale Internet of Things applications. However, the efficiency of distributed training depends heavily on synchronized communication among the distributed models over bandwidth-limited wide area networks (WANs). The fine-grained optical transport network (fgOTN), thanks to its adjustable-bandwidth connections, offers more flexible transmission and can support accurate synchronization across GDML tasks in WANs. At the same time, flexible bandwidth assignment and complex interdependencies among tasks pose significant challenges to resource allocation for GDML in fgOTN. Specifically, flexible bandwidth assignment exacerbates resource competition among task flows, which decreases learning efficiency. This article proposes resource allocation solutions for GDML in fgOTN. We first formulate the problem as a linear program that maximizes the completion ratio of GDML tasks. We then propose a genetic-algorithm-based resource allocation algorithm (GARA) for GDML in fgOTN. GARA accounts for both task completion and bandwidth adjustment through population generation based on prior knowledge and adaptive mutation based on the completion ratio. Simulations show that GARA prioritizes resource allocation for high-priority tasks to alleviate resource competition, achieving the highest task completion ratio while avoiding excessive network reconfiguration.

| File | Description | Version | Access | Size | Format |
|---|---|---|---|---|---|
| LianM_IoT_25.pdf | LianN_IoT_25 | Publisher’s version | Restricted access | 2.44 MB | Adobe PDF |
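
The abstract names two ingredients of GARA: a population generated from prior knowledge and a mutation rate that adapts to the completion ratio. The sketch below illustrates those two ideas on a deliberately simplified toy problem and is not the authors' algorithm. Everything in it is an assumption made for illustration: a single link with a fixed `capacity`, one bandwidth demand per task, a binary admit/reject chromosome, a greedy smallest-demand-first seed as the "prior knowledge", and the constants in the mutation-rate formula. Task priorities, task interdependencies, and bandwidth reconfiguration cost are not modeled.

```python
import random

def completion_ratio(chrom, demands, capacity):
    """Fraction of tasks admitted; 0.0 if the admitted set exceeds capacity."""
    used = sum(d for bit, d in zip(chrom, demands) if bit)
    return sum(chrom) / len(chrom) if used <= capacity else 0.0

def greedy_seed(demands, capacity):
    """Hypothetical prior-knowledge individual: admit smallest demands first."""
    chrom, used = [0] * len(demands), 0
    for i in sorted(range(len(demands)), key=lambda i: demands[i]):
        if used + demands[i] <= capacity:
            chrom[i], used = 1, used + demands[i]
    return chrom

def run_ga(demands, capacity, pop_size=20, generations=50, seed=0):
    """Toy genetic algorithm; requires at least 2 tasks and pop_size >= 4."""
    rng = random.Random(seed)
    n = len(demands)

    def fit(chrom):
        return completion_ratio(chrom, demands, capacity)

    # Population generation: one greedy individual plus random individuals.
    pop = [greedy_seed(demands, capacity)]
    pop += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]

    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        best = fit(pop[0])
        # Adaptive mutation (assumed form): mutate more while the best
        # completion ratio is still low, less as it approaches 1.
        rate = 0.02 + 0.3 * (1.0 - best)
        elite = pop[: pop_size // 2]  # elitism keeps the best individuals
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < rate else bit for bit in child]
            children.append(child)
        pop = elite + children

    return max(pop, key=fit)

if __name__ == "__main__":
    demands = [4, 3, 2, 6, 5]  # illustrative bandwidth demands
    best = run_ga(demands, capacity=10)
    print(best, completion_ratio(best, demands, 10))
```

Because the elite half of each generation is carried over unchanged, the best completion ratio never decreases from one generation to the next, so the result is at least as good as the greedy seed.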


