A framework with cloud integration for CNN acceleration on FPGA devices
Raspa, Niccolò; Natale, Giuseppe; Bacis, Marco; Santambrogio, Marco D.
2018-01-01
Abstract
Recent years have seen a rapid diffusion of deep learning algorithms such as Convolutional Neural Networks (CNNs) and, as a consequence, an intensification of industrial and academic research focused on optimizing their implementation. Different computing architectures have been explored, and among them FPGAs appear to be a very attractive choice: they can deliver sustained performance with high power efficiency, since CNNs can be mapped directly onto hardware, while still offering flexibility thanks to their programmability. In this paper, we present an end-to-end framework to implement CNNs using a dataflow acceleration methodology. The resulting spatial accelerator can be scaled in size if enough resources are available and can exploit both intra- and inter-layer parallelism. We integrate the proposed framework with the deep learning engine Caffe, meaning that we are able to generate the accelerator starting from a Caffe model. We also provide cloud integration for the framework, enabling users to synthesize and deploy the accelerator on Amazon F1 instances.
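The abstract states that the accelerator is generated starting from a Caffe model. As a purely illustrative sketch, and not the authors' framework, the snippet below uses Caffe's standard protobuf bindings to parse a network description and enumerate its convolutional layers, the kind of front-end step such a generator would begin with; the file name `lenet.prototxt` is a placeholder.

```python
# Illustrative only: parse a Caffe prototxt and list the convolution layers
# that a dataflow accelerator generator could map to spatial pipeline stages.
from google.protobuf import text_format
from caffe.proto import caffe_pb2

net = caffe_pb2.NetParameter()
with open("lenet.prototxt") as f:          # placeholder model description
    text_format.Merge(f.read(), net)

# Each convolution layer is a candidate hardware stage.
for layer in net.layer:
    if layer.type == "Convolution":
        conv = layer.convolution_param
        print(layer.name,
              "outputs:", conv.num_output,
              "kernel:", list(conv.kernel_size))
```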
File | Description | Size | Format
---|---|---|---
PID5270709.pdf | Pre-print (pre-refereeing), restricted access | 11.43 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.