
Intermittent Inference: Trading a 1% Accuracy Loss for a 1.9x Throughput Speedup

Barjami, R.; Miele, A.; Mottola, L.
2024-01-01

Abstract

We present INTERCEPT, a compile-time toolchain that enables manifold throughput improvements when running intermittent DNN inference on IoT devices, in exchange for a maximum 1% accuracy loss. Intermittently computing IoT devices rely on ambient energy harvesting and compute opportunistically, as energy is available. They use non-volatile memory (NVM) to persist intermediate results in anticipation of energy failures. Without requiring changes to existing models, and by exploiting the features of STT-MRAM as NVM, INTERCEPT optimizes the placement and configuration of state persistence operations during the inference process. This happens offline, with no user intervention, while enforcing a maximum 1% accuracy loss. Our results, obtained across three platforms and six diverse neural networks, indicate that INTERCEPT provides a 40% energy gain per inference, on average. With the same energy budget, this yields a 1.9x throughput speedup.
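To make the idea of persisting intermediate results concrete, the sketch below simulates intermittent inference over a layered model: after selected layers, the intermediate state is checkpointed to a stand-in for NVM, so a rerun after a power failure resumes from the last checkpoint rather than from scratch. This is a hypothetical illustration, not the INTERCEPT implementation; the layer functions, the `nvm` dict, and the `checkpoint_after` policy are all invented simplifications.

```python
def run_intermittent_inference(layers, x, nvm, checkpoint_after):
    """Run `layers` over input `x`, persisting state after selected layers.

    `nvm` is a dict standing in for non-volatile memory; `checkpoint_after`
    is the set of layer indices after which state is persisted.
    """
    # Resume from the last persisted checkpoint, if one exists.
    start, state = nvm.get("checkpoint", (0, x))
    for i in range(start, len(layers)):
        state = layers[i](state)
        if i in checkpoint_after:
            # Persist progress: next layer to run, plus the intermediate result.
            nvm["checkpoint"] = (i + 1, state)
    return state

# Toy usage: three "layers", checkpointing only after layer 1.
layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
nvm = {}
result = run_intermittent_inference(layers, 5, nvm, checkpoint_after={1})

# Simulated power failure and restart: the rerun resumes from the stored
# checkpoint, so the original input is no longer needed.
resumed = run_intermittent_inference(layers, None, nvm, checkpoint_after={1})
```

Where checkpoints are placed is the trade-off the paper targets: persisting after every layer maximizes safety but costs energy on every write, while sparser placement saves energy at the risk of redoing more work after a failure.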
Year: 2024
Venue: Proceedings of the 2024 ACM Conference on Embedded Networked Sensor Systems (SenSys 2024)
ISBN: 9798400706974
Keywords: deep neural network (DNN) inference; energy efficiency; intermittent computing
File: 3666025.3699364.pdf (Publisher's version, open access, Adobe PDF, 1.82 MB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1280391
Citations
  • Scopus: 2
  • Web of Science (ISI): 4