Dynamic Thermal Management (DTM) has become a major challenge since it directly affects Multiprocessors Systems-on-chip (MPSoCs) performance, power consumption, and reliability. In this work, we propose a transient fan model, enabling adaptive fan speed control simulation for efficient DTM. Our model is validated through a thermal test chip achieving less than 2°C error in the worst case. With multiple fan speeds, however, the DTM design space grows significantly, which can ultimately make conventional solutions impractical. We address this challenge through a reinforcement learning-based solution to proactively determine the number of active cores, operating frequency, and fan speed. The proposed solution is able to reduce fan power by up to 40% compared to a DTM with constant fan speed with less than 1% performance degradation. Also, compared to a state-of-the-art DTM technique our solution improves the performance by up to 19% for the same fan power.

Dynamic Thermal Management with Proactive Fan Speed Control Through Reinforcement Learning

Federico Terraneo;William Fornaciari;
2020-01-01

Abstract

Dynamic Thermal Management (DTM) has become a major challenge since it directly affects Multiprocessors Systems-on-chip (MPSoCs) performance, power consumption, and reliability. In this work, we propose a transient fan model, enabling adaptive fan speed control simulation for efficient DTM. Our model is validated through a thermal test chip achieving less than 2°C error in the worst case. With multiple fan speeds, however, the DTM design space grows significantly, which can ultimately make conventional solutions impractical. We address this challenge through a reinforcement learning-based solution to proactively determine the number of active cores, operating frequency, and fan speed. The proposed solution is able to reduce fan power by up to 40% compared to a DTM with constant fan speed with less than 1% performance degradation. Also, compared to a state-of-the-art DTM technique our solution improves the performance by up to 19% for the same fan power.
2020
Proceedings of the 2020 Design, Automation & Test in Europe (DATE) Conference
978-3-9819263-4-7
978-1-7281-4468-9
Dynamic Thermal Management, Multiprocessors Systems-on-chip, performance, power consumption, reliability.
File in questo prodotto:
File Dimensione Formato  
camera ready 383_OutputPaper(1).pdf

accesso aperto

Descrizione: Articolo principale
: Post-Print (DRAFT o Author’s Accepted Manuscript-AAM)
Dimensione 344.25 kB
Formato Adobe PDF
344.25 kB Adobe PDF Visualizza/Apri
DATE2020 published.pdf

Accesso riservato

Descrizione: versione pubblicata
: Publisher’s version
Dimensione 678.87 kB
Formato Adobe PDF
678.87 kB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1129453
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 11
  • ???jsp.display-item.citation.isi??? 6
social impact