A Comprehensive Methodology to Optimize FPGA Designs via the Roofline Model
Del Sozzo E.; Rabozzi M.; Di Tucci L.; Sciuto D.; Santambrogio M. D.
2021-01-01
Abstract
With reconfigurable fabrics delivering increasing performance over the years, Field-Programmable Gate Arrays (FPGAs) are becoming an appealing solution for next-generation High-Performance Computing (HPC) systems. However, to gain traction alongside traditional Von Neumann architectures, the optimization process of FPGA designs should be further abstracted to a higher level. In fact, while High-Level Synthesis (HLS) already provides a handy way to write FPGA code with procedural languages, substantial effort and expertise are still required to optimize the resulting FPGA design for the underlying hardware. To overcome this problem, we propose a semi-automated performance optimization methodology based on a Hierarchical Roofline model for FPGAs. System-wide and application-specific optimizations, such as off-chip memory transfer and data locality optimizations, are guided by the FPGA Roofline model, whereas FPGA-specific optimizations are automatically searched by a Design Space Exploration (DSE) engine. We demonstrate how this methodology makes it easy to analyze and optimize a wide set of applications, ranging from particle methods and wavefront algorithms to sparse arithmetic computations. In addition, we illustrate how the integrated DSE engine achieves a maximum speedup of 14.36x compared to previous automated solutions in the literature.

File | Size | Format |
---|---|---|
A_Comprehensive_Methodology_to_Optimize_FPGA_Designs_via_the_Roofline_Model.pdf (open access; Post-Print: draft or Author's Accepted Manuscript, AAM) | 4.54 MB | Adobe PDF |
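
For readers unfamiliar with the Roofline model named in the abstract, the sketch below shows the classic bound it rests on: attainable performance is the smaller of the compute peak and the product of memory bandwidth and operational intensity. The hierarchical variant here simply takes the tightest bound across memory levels; this is a common formulation and an assumption on our part, not necessarily the one used in the paper. The function names, the memory-level labels, and all numbers are hypothetical.

```python
from typing import Dict, Tuple


def roofline_bound(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Classic Roofline bound in GFLOP/s.

    intensity is the operational intensity in FLOP per byte moved.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def hierarchical_bound(peak_gflops: float, levels: Dict[str, Tuple[float, float]]) -> float:
    """Tightest Roofline bound across memory levels (assumed formulation).

    levels maps a level name to (bandwidth in GB/s, operational intensity
    measured against the traffic of that level).
    """
    return min(roofline_bound(peak_gflops, bw, oi) for bw, oi in levels.values())


# Hypothetical device: 100 GFLOP/s peak, slow off-chip memory, fast on-chip memory.
example_levels = {
    "off_chip": (10.0, 2.0),   # 10 GB/s, 2 FLOP/byte
    "on_chip": (400.0, 0.5),   # 400 GB/s, 0.5 FLOP/byte
}
bound = hierarchical_bound(100.0, example_levels)
```

In this made-up configuration the off-chip level is the limiting roof, which is the kind of situation where off-chip transfer and data locality optimizations would be the first candidates to examine.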
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.