Parallelized Convolutions for Embedded Ultra Low Power Deep Learning SoC
Erdem A.; Silvano C.
2018-01-01
Abstract
Deep Convolutional Neural Networks (DCNNs) achieve state-of-the-art results compared to classic machine learning in many applications that require recognition, identification, and classification. DCNN inference engines are increasingly deployed on embedded devices, supporting the intelligence-close-to-the-sensor paradigm and overcoming limitations of cloud-based computing such as bandwidth requirements, security, privacy, scalability, and responsiveness. However, increasing the robustness and accuracy of DCNNs comes at the price of higher computational cost. As a result, implementing DCNNs on embedded devices with real-time constraints is a challenge when the lowest possible power consumption must be achieved. A solution to this challenge is to exploit the massive fine-grained intra-device parallelism offered by these systems and to benefit from the extensive concurrency exhibited by DCNN processing pipelines: the intensive tasks are divided into smaller, weakly interacting batches that can be processed in parallel. In this regard, this paper has two main goals: 1) to describe the implementation of a state-of-the-art technique for mapping the most intensive DCNN tasks (dominated by multiply-and-accumulate operations) onto Orlando, an ultra-low-power heterogeneous multi-core SoC developed by STMicroelectronics; 2) to integrate the proposed implementation into a toolchain that allows deep learning developers to deploy DCNNs in low-power applications.
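As a concrete illustration of the batching idea described in the abstract, the sketch below splits the MAC-dominated loop of one convolution layer into disjoint output-channel slices, each processed by its own POSIX thread. This is a minimal, hypothetical decomposition under assumed layer dimensions (the names, sizes, and pthread-based parallelism are illustrative assumptions), not the paper's actual mapping onto the Orlando SoC's compute clusters.

```c
/* Minimal sketch of the batching idea from the abstract: the MAC-dominated
 * convolution is split into weakly interacting slices (here, disjoint ranges
 * of output channels) that run in parallel. Generic pthread illustration,
 * NOT the Orlando SoC mapping; all dimensions and names are hypothetical. */
#include <pthread.h>

#define IN_CH 16
#define OUT_CH 32
#define DIM 28            /* square input maps, 3x3 kernel, stride 1, no padding */
#define K 3
#define OUT_DIM (DIM - K + 1)
#define NUM_WORKERS 4

static float input[IN_CH][DIM][DIM];
static float weights[OUT_CH][IN_CH][K][K];
static float output[OUT_CH][OUT_DIM][OUT_DIM];

typedef struct { int oc_begin, oc_end; } slice_t;

/* Each worker owns a disjoint output-channel range: no locks are needed,
 * since inputs/weights are read-only and output slices never overlap. */
static void *conv_worker(void *arg)
{
    const slice_t *s = (const slice_t *)arg;
    for (int oc = s->oc_begin; oc < s->oc_end; oc++)
        for (int y = 0; y < OUT_DIM; y++)
            for (int x = 0; x < OUT_DIM; x++) {
                float acc = 0.0f;               /* multiply-and-accumulate core */
                for (int ic = 0; ic < IN_CH; ic++)
                    for (int ky = 0; ky < K; ky++)
                        for (int kx = 0; kx < K; kx++)
                            acc += input[ic][y + ky][x + kx]
                                 * weights[oc][ic][ky][kx];
                output[oc][y][x] = acc;
            }
    return NULL;
}

int main(void)
{
    pthread_t tid[NUM_WORKERS];
    slice_t slices[NUM_WORKERS];
    const int per = OUT_CH / NUM_WORKERS;

    /* Carve the output channels into one batch per worker. */
    for (int w = 0; w < NUM_WORKERS; w++) {
        slices[w].oc_begin = w * per;
        slices[w].oc_end = (w == NUM_WORKERS - 1) ? OUT_CH : (w + 1) * per;
        pthread_create(&tid[w], NULL, conv_worker, &slices[w]);
    }
    for (int w = 0; w < NUM_WORKERS; w++)
        pthread_join(tid[w], NULL);
    return 0;
}
```

Because the slices share only read-only inputs and weights and write non-overlapping output regions, they are "weakly interacting" in the abstract's sense: no synchronization is required beyond the final join, which is what makes such a decomposition attractive for the fine-grained parallelism of a multi-core SoC.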
File | Description | Version | Size | Format | Access
---|---|---|---|---|---
RTSI_2018_08548362.pdf | Published article | Publisher's version | 886.02 kB | Adobe PDF | Restricted
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.