Runtime Management of Artificial Intelligence Applications Through Hierarchical Reinforcement Learning
Riccardo Cavadini;Hamta Sedghani;Federica Filippini;Danilo Ardagna
2024-01-01
Abstract
New and evolving Artificial Intelligence (AI) applications span the full spectrum of computing resources, integrating seamlessly across Edge and Cloud platforms. Deploying applications at the network edge reduces latency, while cloud computing provides greater processing power. Managing resources in such a dynamic and diverse environment requires a strategic approach to satisfy Quality of Service (QoS) requirements and minimize costs. To address this challenge, we propose FIGARO (reinForcement learnInG mAnagement acRoss computing cOntinuum), which exploits offline training and imitation learning to speed up the training of reinforcement learning-based agents able to control resources in the full cloud continuum stack. By extending our framework, we designed a hierarchical system structure and tested agents that each need to manage only one computational layer at a time. This approach enables the system to efficiently manage multiple application components in complex AI pipelines. The results demonstrate the effectiveness of the hierarchical method: the local agents dynamically scale computational resources, limiting QoS constraint violations to a maximum of 1.4% in the reference use-case application.

| File | Access | Size | Format |
|---|---|---|---|
| RuntimeManagement_HRL.pdf | Open access | 4.5 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


