Cross-Layer Reliability Analysis of NVDLA Accelerators: Exploring the Configuration Space
Nazzari A.;Passarello D.;Cassano L.;Miele A.;Bolchini C.
2024-01-01
Abstract
Investigating the effects of Single Event Upsets in domain-specific accelerators is one of the key enablers for deploying Deep Neural Networks (DNNs) in mission-critical edge applications. Current reliability analyses of DNNs focus either on the DNN model, at the application level, or on the hardware accelerator, at the architecture level. This paper presents a systematic cross-layer reliability analysis of the NVIDIA Deep Learning Accelerator (NVDLA), a popular family of industry-grade, open and free DNN accelerators. The goals are i) to analyze the propagation of faults from the hardware to the application level, and ii) to compare different architectural configurations. Our investigation delivers new insights into the performance-accuracy-reliability trade-off spanned by the configuration space of Deep Learning accelerators. In particular, the Failure In Time (FIT) rate can be reduced by up to 4.3x for the same DNN model accuracy and by up to 9.4x for the same performance, at the cost of a 6.5x inference latency increase and a 1.1% accuracy drop, respectively.
File | Size | Format | Access
---|---|---|---
ETS2024.pdf (Pre-print, pre-refereeing) | 310.17 kB | Adobe PDF | Restricted access
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.