Greening AI: A Framework for Energy-Aware Resource Allocation of ML Training Jobs with Performance Guarantees
R. Sala; F. Filippini; D. Ardagna
2024-01-01
Abstract
The rapid expansion of Machine Learning (ML) and Artificial Intelligence (AI) has profoundly influenced the technological landscape, reshaping industries and applications. The resulting surge in computational demand has driven the widespread adoption of Cloud data centers, which are crucial for supporting the storage and processing requirements of these advanced technologies. However, this expansion poses significant challenges, particularly in terms of energy consumption and the associated carbon emissions. As the reliance on Cloud data centers intensifies, growing concern about their environmental impact calls for innovative solutions that enhance energy efficiency and reduce the ecological footprint of these computational infrastructures. This paper addresses the challenges of training ML and AI applications, emphasizing the importance of energy-efficient solutions. The proposed framework integrates components from the AI-SPRINT project toolchain, namely Krake, Space4AI-R, and PyCOMPSs. Our reference application trains a Random Forest model for electrocardiogram classification; the framework profiles the available resources to obtain a performance model able to predict the training time, and dynamically migrates the workload to sites with cleaner energy sources while guaranteeing the training process due date. Results demonstrate the framework's capacity to estimate execution time and resource requirements with low error, highlighting its potential for establishing an environmentally sustainable AI ecosystem.
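The abstract's core mechanism, profiling runs used to fit a performance model of training time and a placement decision that moves the job to the cleanest site still able to meet the due date, can be illustrated with a minimal sketch. The code below is not the AI-SPRINT, Krake, or Space4AI-R implementation; the site names, carbon intensities, profiling figures, and the linear-in-1/cores performance model are assumptions made purely for illustration.

```python
# Minimal sketch (illustrative only) of the two ideas described in the abstract:
# (1) fit a performance model from profiling runs to predict training time,
# (2) pick the candidate site with the cleanest energy among those whose
#     predicted completion time meets the due date.
import numpy as np
from sklearn.linear_model import LinearRegression

# --- 1. Performance model: training time vs. allocated cores ---------------
# Hypothetical profiling samples: (cores, measured training time in seconds).
profile_cores = np.array([2, 4, 8, 16]).reshape(-1, 1)
profile_times = np.array([480.0, 260.0, 150.0, 95.0])

# Assume Amdahl-like scaling: time is roughly linear in 1/cores.
perf_model = LinearRegression().fit(1.0 / profile_cores, profile_times)

def predict_training_time(cores: int) -> float:
    """Predicted training time (seconds) when running on `cores` cores."""
    return float(perf_model.predict(np.array([[1.0 / cores]]))[0])

# --- 2. Energy-aware site selection with a due-date guarantee --------------
# Hypothetical candidate sites: available cores and grid carbon intensity
# (gCO2eq/kWh). All values are invented for this example.
candidate_sites = [
    {"name": "site-hydro", "cores": 8,  "carbon_intensity": 25.0},
    {"name": "site-wind",  "cores": 4,  "carbon_intensity": 40.0},
    {"name": "site-coal",  "cores": 16, "carbon_intensity": 600.0},
]

def select_site(due_date_s: float):
    """Return the greenest site whose predicted training time meets the deadline."""
    feasible = [s for s in candidate_sites
                if predict_training_time(s["cores"]) <= due_date_s]
    if not feasible:
        return None  # no site meets the due date; keep the current placement
    return min(feasible, key=lambda s: s["carbon_intensity"])

if __name__ == "__main__":
    print("Selected site:", select_site(due_date_s=300.0))
```

In this toy setting the selector prefers a low-carbon site even when a dirtier one offers more cores, as long as the performance model predicts the due date can still be met, which is the trade-off the framework's migration policy is built around.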
File | Size | Format | Access
---|---|---|---
Greening_AI__A_Framework__for_Energy_Aware_Resource_Allocation_of_ML_Training_Jobs_with_Performance_Guarantees.pdf | 1.34 MB | Adobe PDF | Restricted access
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.