FPGAs have proven to be valid architectures to accelerate the inference phase of Convolutional Neural Networks (CNNs). State-of-the-art works also demonstrated that it is possible to take advantage of a distributed FPGA-base system to improve performance, power consumption and scalability of such algorithms. However, the hardware resource usage, communication, and the nodes management become main aspects when dealing with an embedded distributed scenario. In this context, FINN optimizes the FPGA-based CNNs with binarization and FARD is a framework that allows the acceleration of fog computing-based application with FPGAs. In this work, we present how to extend FARD to deal with job-based applications rather than the event-based fog computing scenario. In particular, we analyzed two PYNQ-Z1 connected each other and we implemented a distributed BNN algorithm based on FINN's CnvW2A2. Results show how hardware resources vary according to the division of the network when splitting after each convolutional layer.

Hardware resources analysis of BNNs splitting for FARD-based multi-FPGAs Distributed Systems

Marco Speziali;Luca Stornaiuolo;Marco Santambrogio;Donatella Sciuto
2020-01-01

Abstract

FPGAs have proven to be valid architectures to accelerate the inference phase of Convolutional Neural Networks (CNNs). State-of-the-art works also demonstrated that it is possible to take advantage of a distributed FPGA-base system to improve performance, power consumption and scalability of such algorithms. However, the hardware resource usage, communication, and the nodes management become main aspects when dealing with an embedded distributed scenario. In this context, FINN optimizes the FPGA-based CNNs with binarization and FARD is a framework that allows the acceleration of fog computing-based application with FPGAs. In this work, we present how to extend FARD to deal with job-based applications rather than the event-based fog computing scenario. In particular, we analyzed two PYNQ-Z1 connected each other and we implemented a distributed BNN algorithm based on FINN's CnvW2A2. Results show how hardware resources vary according to the division of the network when splitting after each convolutional layer.
2020
Proceedings of 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
9781728174457
File in questo prodotto:
File Dimensione Formato  
12_2020_PyNOLI_RAW2020_cr_validated.pdf

accesso aperto

: Pre-Print (o Pre-Refereeing)
Dimensione 1.64 MB
Formato Adobe PDF
1.64 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1145905
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact