High-Level Synthesis of Parallel Specifications Coupling Static and Dynamic Controllers

Ferrandi, Fabrizio
2021

Abstract

Conventional High-Level Synthesis (HLS) tools mostly exploit Instruction Level Parallelism (ILP). They statically schedule the input specifications and build centralized Finite State Machine (FSM) controllers. However, aggressive exploitation of ILP has diminishing returns in many applications and, usually, centralized approaches do not efficiently exploit coarser parallelism, because FSMs are inherently serial. In this paper, we present an HLS framework able to synthesize applications that, besides ILP, also expose Task Level Parallelism (TLP). An application can expose TLP through annotations that identify the parallel functions (i.e., tasks). To generate accelerators that efficiently execute concurrent tasks, we need to solve several issues: devise a mechanism to support concurrent execution flows, exploit memory parallelism, and manage synchronization. To support concurrent execution flows, we introduce a novel adaptive controller. The adaptive controller is composed of a set of interacting control elements, each of which independently manages the execution of a task. These control elements check dependencies and resource constraints at runtime, enabling as-soon-as-possible execution. To support parallel access to shared memories and synchronization, we integrate the controller with a novel Hierarchical Memory Interface (HMI). With respect to previous solutions, the proposed interface supports multi-ported memories and atomic memory operations, which commonly occur in parallel programming. Our framework can generate the hardware implementation of C functions by employing two different approaches, depending on each function's characteristics. If a function exposes TLP, then the framework generates hardware implementations based on the adaptive controller. Otherwise, the framework implements the function through the FSM approach, which is optimized for ILP exploitation.
We evaluate our framework on a set of parallel applications and show substantial performance improvements (average speedup of 4.7) with limited area overheads (average area increase of 5.48 times).
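
The runtime behaviour described in the abstract (control elements that check dependencies and resource constraints at runtime and start their task as soon as both are satisfied) can be illustrated with a small software model. The sketch below is not the paper's controller or its hardware: the `Task` record, the `simulate_adaptive` and `simulate_serial` functions, the resource names `mem` and `alu`, and all latencies are hypothetical, chosen only to contrast as-soon-as-possible task start with a strictly serial order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record; not a structure taken from the paper."""
    name: str
    latency: int          # cycles the task holds its resource (assumed >= 1)
    resource: str         # assumed shared resource, e.g. a memory port
    deps: tuple = ()      # names of tasks that must complete first


def simulate_adaptive(tasks, capacity):
    """Toy cycle-level model: each pending task checks its own dependencies
    and resource availability every cycle and starts as soon as both hold."""
    by_name = {t.name: t for t in tasks}
    finished = set()
    running = {}                  # task name -> cycle at which it completes
    free = dict(capacity)         # resource -> currently free units
    start = {}                    # task name -> cycle at which it started
    pending = [t.name for t in tasks]
    cycle = 0
    while pending or running:
        # Retire tasks that complete at this cycle and release their resource.
        for name in [n for n, end in running.items() if end <= cycle]:
            del running[name]
            finished.add(name)
            free[by_name[name].resource] += 1
        # Local checks only: there is no precomputed global schedule.
        for name in list(pending):
            t = by_name[name]
            if all(d in finished for d in t.deps) and free[t.resource] > 0:
                free[t.resource] -= 1
                running[name] = cycle + t.latency
                start[name] = cycle
                pending.remove(name)
        if pending and not running:
            raise ValueError("unsatisfiable dependency or resource constraint")
        if pending or running:
            cycle += 1
    return cycle, start


def simulate_serial(tasks):
    """Baseline for comparison: tasks run strictly one after another."""
    return sum(t.latency for t in tasks)


# Hypothetical workload: two independent load/compute chains and a reduction.
tasks = [
    Task("load_a", 2, "mem"),
    Task("load_b", 2, "mem"),
    Task("compute_a", 3, "alu", ("load_a",)),
    Task("compute_b", 3, "alu", ("load_b",)),
    Task("reduce", 1, "alu", ("compute_a", "compute_b")),
]

if __name__ == "__main__":
    total, start = simulate_adaptive(tasks, {"mem": 2, "alu": 2})
    print("adaptive cycles:", total, "start cycles:", start)
    print("serial cycles:", simulate_serial(tasks))
```

In this toy configuration the runtime-checked schedule completes in 6 cycles against 11 for the serial order. Reducing the assumed `mem` capacity to 1 delays `load_b` until the single port is released, which shows how a resource constraint, rather than a data dependency, can postpone a task. These cycle counts describe the model only and say nothing about the speedups reported in the paper.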
Proceedings of International Symposium on Parallel and Distributed Processing (IPDPS)
978-1-6654-4066-0
Annotations, Memory management, Parallel processing, Tools, Dynamic scheduling, Hardware, Finite element analysis

Use this identifier to cite or link to this document: http://hdl.handle.net/11311/1180263