2010, International Journal of …
Field-Programmable Gate Arrays (FPGAs) are becoming increasingly important in embedded and high-performance computing systems. They offer performance close to that of Application-Specific Integrated Circuits (ASICs) while retaining design and implementation flexibility. However, programming FPGAs efficiently requires hardware-developer expertise and mastery of hardware description languages (HDLs) such as VHDL or Verilog. Attempts to provide a high-level compilation flow (e.g., from C programs) must still address open issues before broadly efficient results can be obtained. Bearing in mind the hardware resources available in contemporary FPGAs, we developed LALP (Language for Aggressive Loop Pipelining), a novel language for programming FPGA-based accelerators, together with its compilation framework. The main ideas behind LALP are to provide a higher abstraction level than HDLs, to exploit the intrinsic parallelism of hardware resources, and to let the programmer control execution stages whenever compiler techniques fail to generate efficient implementations. These features are particularly useful for implementing loop pipelining, a well-regarded technique for accelerating computations in several application domains. This paper describes LALP and shows how it can be used to achieve high-performance embedded computing solutions.
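The payoff of loop pipelining can be seen in a back-of-the-envelope cycle model (a minimal sketch of the general technique, not of LALP itself; the function names and parameters are ours): without pipelining, each iteration occupies the whole datapath, while with pipelining a new iteration starts every II (initiation interval) cycles.

```python
# Toy cycle model of loop pipelining: a loop body with `depth` pipeline
# stages, executed `n_iters` times.

def sequential_cycles(n_iters, depth):
    # Without pipelining, each iteration runs start-to-finish alone.
    return n_iters * depth

def pipelined_cycles(n_iters, depth, ii=1):
    # With pipelining, a new iteration enters every `ii` cycles; the
    # last iteration still needs `depth` cycles to drain the pipeline.
    return depth + (n_iters - 1) * ii

# A 5-stage body over 1000 iterations:
print(sequential_cycles(1000, 5))   # 5000
print(pipelined_cycles(1000, 5))    # 1004
```

At II = 1 the speedup approaches the pipeline depth for large iteration counts, which is why keeping the pipeline full is worth explicit programmer control.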
Journal of Signal Processing Systems, 2017
Current tools for High-Level Synthesis (HLS) excel at exploiting Instruction-Level Parallelism (ILP). The support for Data-Level Parallelism (DLP), one of the key advantages of Field Programmable Gate Arrays (FPGAs), is in contrast very limited. This work examines the exploitation of DLP on FPGAs using code generation for C-based HLS of image filters and streaming pipelines. In addition to well-known loop tiling techniques, we propose loop coarsening, which delivers superior performance and scalability. Loop tiling corresponds to splitting an image into separate regions, which are then processed in parallel by replicated accelerators. For data streaming, this also requires the generation of glue logic for the distribution of image data. Conversely, loop coarsening allows processing multiple pixels in parallel, whereby only the kernel operator is replicated within a single accelerator. We present concrete implementations of tiling and coarsening for Vivado HLS and Altera OpenCL. Furthermore, we present a comparison of our implementations to the keyword-driven parallelization support provided by the Altera Offline Compiler. We augment the FPGA back end of the heterogeneous Domain-Specific Language (DSL) framework Hipacc to generate loop coarsening implementations for Vivado HLS and Altera OpenCL. Moreover, we compare the resulting FPGA accelerators to highly optimized software implementations for Graphics Processing Units (GPUs), all generated from exactly the same code base.
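The distinction between tiling and coarsening can be illustrated on a 1-D 3-tap blur (a software sketch with our own naming, not code from the paper): tiling splits the image into regions, each processed by a replicated kernel with a small halo, while coarsening emits several output pixels per loop iteration inside one kernel.

```python
# (a) the reference filter
def blur(img):
    return [(img[i] + img[i+1] + img[i+2]) // 3 for i in range(len(img) - 2)]

# (b) "tiling": the image is split into regions processed by replicated
# accelerators; each tile needs a 2-pixel halo, which is what the glue
# logic distributing streamed image data must provide.
def blur_tiled(img, tile=4):
    out = []
    n = len(img) - 2                     # number of output pixels
    for start in range(0, n, tile):
        stop = min(start + tile, n)
        out.extend(blur(img[start:stop + 2]))  # region + halo
    return out

# (c) "coarsening": one iteration emits `factor` pixels; only the kernel
# operator is replicated within a single accelerator.
def blur_coarsened(img, factor=2):
    out = []
    n = len(img) - 2
    i = 0
    while i + factor <= n:
        out.extend((img[i+k] + img[i+k+1] + img[i+k+2]) // 3
                   for k in range(factor))
        i += factor
    out.extend(blur(img)[i:])            # leftover pixels (sketch only)
    return out

img = list(range(16))
print(blur_tiled(img) == blur(img), blur_coarsened(img) == blur(img))
```

Both variants reproduce the reference output; they differ in where replication happens (across accelerators vs. inside one), which is what drives the resource and scalability trade-off the paper measures.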
2013 23rd International Conference on Field Programmable Logic and Applications, 2013
Whether for use as the final target or simply a rapid prototyping platform, programming systems containing FPGAs is challenging. Some of the difficulty is due to the difference between the models used to program hardware and software, but great effort is also required to coordinate the simultaneous execution of the application running on the microprocessor with the accelerated kernel(s) running on the FPGA. In this paper we present a new methodology and programming model for introducing hardware-acceleration to an application running in software. The application is represented as a data-flow graph and the computation at each node in the graph is specified for execution either in software or on the FPGA using the programmer's language of choice. We have implemented an interface compiler which takes as its input the FIFO edges of the graph and generates code to connect all the different parts of the program, including those which communicate across the hardware/software boundary. Our methodology and compiler enable programmers to effectively exploit FPGA acceleration without ever leaving the application space.
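A software-only sketch of this model (our naming, not the paper's toolchain): nodes are plain functions, edges are FIFOs, and the interface compiler's role — connecting the parts, including across the hardware/software boundary — is played here by ordinary threads and queues.

```python
import queue, threading

def producer(out_fifo):
    for x in range(5):
        out_fifo.put(x)
    out_fifo.put(None)                 # end-of-stream token

def square(in_fifo, out_fifo):         # this node could run in SW or as an FPGA kernel
    while (x := in_fifo.get()) is not None:
        out_fifo.put(x * x)
    out_fifo.put(None)

def consumer(in_fifo, results):
    while (x := in_fifo.get()) is not None:
        results.append(x)

# Wire the graph: producer -> square -> consumer over FIFO edges.
q1, q2, results = queue.Queue(), queue.Queue(), []
threads = [threading.Thread(target=f, args=a) for f, a in
           [(producer, (q1,)), (square, (q1, q2)), (consumer, (q2, results))]]
for t in threads: t.start()
for t in threads: t.join()
print(results)   # [0, 1, 4, 9, 16]
```

Because each node touches only its FIFOs, moving `square` from software to an FPGA kernel changes the glue, not the application code — which is the methodology's point.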
Research efforts have shown the strength of FPGA-based acceleration in a wide range of application domains where compute kernels can execute efficiently on an FPGA device. Due to the complex process of FPGA-based accelerator design, design productivity is a major issue, restricting the effective use of these accelerators to niche disciplines involving highly skilled hardware engineers. Coarse-grained FPGA overlays, such as VectorBlox MXP and DSP-block-based overlays, have been shown to be effective when paired with general-purpose processors, offering software-like programmability, fast compilation, application portability, and improved design productivity. These architectures enable general-purpose hardware accelerators, allowing hardware design at a higher level of abstraction. This report presents an analysis of compute kernels (extracted from compute-intensive applications) and their implementation on multiple hardware accelerators, such as GPUs, Altera OpenCL (AOCL)-generated hardware accelerators, and FPGA-based overlays. We experiment with simple programming models such as OpenCL and overlay APIs to produce hardware-accelerated designs with software-like abstractions. We first analyze two existing use cases of hardware acceleration, one of which highlights the performance benefits of compiler optimizations: they provide almost a 16× improvement in execution time on the ARM processor of the Zedboard, owing to the SIMD NEON engine that accelerates execution. Using the MXP overlay for the same application improves execution time even beyond the SIMD NEON engine. The other case study analyzes the feasibility of dynamically loading tasks onto the FPGA fabric and the effect on execution time. We use AOCL to create accelerators for multiple tasks and then perform dynamic reconfiguration through the software API.
We show that overlays are preferable in such a scenario because of their ease of use, simple programming model, and dynamic task loading without actual reconfiguration of the FPGA fabric: the same task executed on an overlay runs much faster because no new bitstream is needed. We present experiments comparing naive implementations of several compute kernels with hardware-accelerated versions built using either OpenCL or overlay APIs. We observe up to a 10× timing improvement in applications such as 12-tap FIR filtering and almost 100× in applications such as 2D convolution. These improvements were obtained with very basic, naive hardware accelerators generated at a high level of programming abstraction (OpenCL/overlay APIs); with optimizations, performance can surely be improved further, a key area for future research beyond this thesis. Finally, we make the case for hardware virtualization through the cloud and demonstrate how a simple web browser can program remote computing platforms connected to cloud servers. Such virtualization methods could be used in teaching labs and for hardware evaluations.
2012
The development of applications for high-performance Field Programmable Gate Array (FPGA)-based embedded systems is a long and error-prone process. Typically, developers need to be deeply involved in all the stages of the translation and optimization of an application described in a high-level programming language to a lower-level design description to ensure the solution meets the required functionality and performance.
Proceedings of the 50th Annual Design Automation Conference on - DAC '13, 2013
FPGA-based accelerators have repeatedly demonstrated superior speed-ups on an ever-widening spectrum of applications. However, their use remains beyond the reach of traditionally trained application developers because of the complexity of their programming toolchain. Compilers for high-level languages targeting FPGAs have to bridge a huge abstraction gap between two divergent computational models: a temporal, sequentially consistent, control-driven execution in the stored-program model versus a spatial, parallel, data-flow-driven execution in the spatial hardware model. In this paper we discuss these challenges to the compiler designer and report on our experience with the ROCCC toolset.
Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021
Modern field-programmable gate arrays (FPGAs) have recently powered high-profile efficiency gains in systems from datacenters to embedded devices by offering ensembles of heterogeneous, reconfigurable hardware units. Programming stacks for FPGAs, however, are stuck in the past: they are based on traditional hardware languages, which were appropriate when FPGAs were simple, homogeneous fabrics of basic programmable primitives. We describe Reticle, a new low-level abstraction for FPGA programming that, unlike existing languages, explicitly represents the special-purpose units available on a particular FPGA device. Reticle has two levels: a portable intermediate language and a target-specific assembly language. We show how to use a standard instruction selection approach to lower intermediate programs to assembly programs, which can be both faster and more effective than the complex metaheuristics that existing FPGA toolchains use. We use Reticle to implement linear algebra operators and coroutines and find that Reticle compilation runs up to 100 times faster than current approaches while producing comparable or better run-time and utilization.
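In the spirit of the instruction-selection approach described (a generic sketch with made-up rules and names, not Reticle's actual languages), lowering can be as simple as pattern-matching each intermediate op to a target-specific unit, e.g. wide arithmetic to DSP blocks and narrow arithmetic to LUTs:

```python
# Toy lowering table: (op, width class) -> target-specific unit.
RULES = {("mul", "wide"): "dsp.mul", ("mul", "narrow"): "lut.mul",
         ("add", "wide"): "dsp.add", ("add", "narrow"): "lut.add"}

def lower(ir):
    """Map a list of (op, bit-width) intermediate instructions to
    target 'assembly' strings by rule lookup."""
    asm = []
    for op, width in ir:
        unit = RULES[(op, "wide" if width > 8 else "narrow")]
        asm.append(f"{unit} w{width}")
    return asm

print(lower([("mul", 16), ("add", 4)]))  # ['dsp.mul w16', 'lut.add w4']
```

A table-driven pass like this is deterministic and fast, which hints at why instruction selection can outrun the metaheuristic placement-style mapping of conventional toolchains.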
Lecture Notes in Computer Science, 2011
The recent increase in circuit complexity has made high-level synthesis tools a must in digital circuit design. However, these tools come with several limitations, one of which is the efficient use of pipelined arithmetic operators. This paper explains how to generate efficient hardware with pipelined operators for regular codes with perfect loop nests. The part to be mapped to the operator is identified, then the program is scheduled so that each operator result is available exactly when it is needed by the operator, keeping the operator busy and avoiding the use of a temporary buffer. Finally, we show how to generate the VHDL code for the control unit and how to link it with specialized pipelined floating-point operators generated using the open-source FloPoCo tool. The method has been implemented in the Bee research compiler, and experimental results on DSP kernels are promising, with at least 94% utilization of the pipelined operators for a complex kernel.
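The scheduling constraint can be illustrated with a classic reduction (a toy model of ours, not Bee or FloPoCo output): a pipelined adder of latency L delivers each result L cycles after it is issued, so to keep the adder busy without a temporary buffer, the schedule interleaves L independent partial sums — each result comes back exactly when its partial sum is next needed.

```python
# Model: a pipelined adder with latency `latency` accepts one operation
# per cycle but returns each result `latency` cycles later.  Interleaving
# `latency` partial sums means slot k's previous result is always ready
# by the time slot k is scheduled again.

def reduce_with_pipelined_adder(xs, latency):
    partial = [0] * latency            # one running sum per pipeline slot
    for i, x in enumerate(xs):
        slot = i % latency             # slot k recurs every `latency` cycles,
        partial[slot] += x             # exactly when its result is available
    return sum(partial)                # short final combine of the partials

print(reduce_with_pipelined_adder(range(10), 3))  # 45
```

The adder issues one operation every cycle with no stalls and no buffering of in-flight results, which is the "result available exactly when needed" property the paper's scheduler enforces.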
The Art, Science, and Engineering of Programming
In recent years, heterogeneous computing has emerged as the vital way to increase computers' performance and energy efficiency by combining diverse hardware devices, such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The rationale behind this trend is that different parts of an application can be offloaded from the main CPU to diverse devices, which can efficiently execute these parts as co-processors. FPGAs are among the most widely used co-processors, typically employed to accelerate specific workloads thanks to their flexible hardware and energy-efficient characteristics. These characteristics have made them prevalent in a broad spectrum of computing systems, ranging from low-power embedded systems to high-end data centers and cloud infrastructures. However, these hardware characteristics come at the cost of programmability. Developers who create their applications in high-level programming languages (e.g., Java, Python) must familiarize themselves with a hardware description language (e.g., VHDL, Verilog) or with recent heterogeneous programming models (e.g., OpenCL, HLS) in order to exploit the co-processors' capacity and tune the performance of their applications. Currently, the above-mentioned heterogeneous programming models exclusively support compilation from languages such as C and C++, so the transparent integration of heterogeneous co-processors into the software ecosystem of managed programming languages (e.g., Java, Python) is not seamless. In this paper we rethink the engineering trade-offs that we encountered, in terms of transparency and compilation overheads, while integrating FPGAs into high-level managed programming languages. We present a novel approach that enables runtime code specialization techniques for seamless and high-performance execution of Java programs on FPGAs.
The proposed solution is prototyped in the context of the Java programming language and TornadoVM, an open-source programming framework for Java execution on heterogeneous hardware. Finally, we evaluate the proposed solution for FPGA execution against both sequential and multithreaded Java implementations, showcasing up to 224× and 19.8× performance speedups, respectively, and up to 13.82× compared to TornadoVM running on an Intel integrated GPU. We also provide a breakdown analysis of the proposed compiler optimizations for FPGA execution, as a means to project their impact on the applications' characteristics.
2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), 2015
Hardware acceleration is often used to address the need for speed and computing power in embedded systems. FPGAs have always represented a good solution for HW acceleration and, recently, new SoC platforms have extended the flexibility of FPGAs by combining both high-performance CPUs and FPGA fabric on a single chip. The aim of this work is the implementation of hardware accelerators for these new SoCs. The innovative feature of these accelerators is on-the-fly reconfiguration of the hardware to dynamically adapt the accelerator's functionality to the current CPU workload. Realizing the accelerators also requires preliminary profiling of both SW (ARM CPU + NEON units) and HW (FPGA) performance, an evaluation of partial reconfiguration times, and the development of an application-specific IP-core library. This paper focuses on profiling both the SW and HW implementations of the same operations, using arithmetic routines (BLAS) as the benchmarking reference, and presents a comparison of the results in terms of speed, power consumption, and resource utilization.
ACM Transactions on Architecture and Code Optimization, 2008
The wider acceptance of FPGAs as computing devices requires a higher level of programming abstraction. ROCCC is an optimizing C-to-HDL compiler, and we describe its code generation approach. The smart buffer is a component that reuses input data between adjacent iterations; it significantly improves the performance of the circuit and simplifies loop control. The ROCCC-generated datapath can execute one loop iteration per clock cycle when there is no loop dependency or only a scalar recurrence-variable dependency. ROCCC's approach to supporting while-loops operating on scalars enables the compiler to move scalar iterative computation into hardware.
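The smart-buffer idea can be sketched in software (our illustration, not ROCCC output): for a 3-tap filter, adjacent iterations share two of their three inputs, so each "cycle" fetches only one new word and shifts it into a small window, instead of re-reading all three.

```python
from collections import deque

def fir3(samples, taps=(1, 2, 1)):
    window = deque([0, 0, 0], maxlen=3)   # the smart buffer: 3-word window
    out = []
    for s in samples:                     # one new memory read per iteration;
        window.append(s)                  # the other two values are reused
        out.append(sum(t * w for t, w in zip(taps, window)))
    return out

print(fir3([1, 0, 0, 1]))   # [1, 2, 1, 1]
```

With the window always full after the first few iterations, the datapath can accept one input and produce one output per clock cycle — the one-iteration-per-cycle behavior the abstract describes.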
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2008
Proceedings of the 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No. 97TB100186)
Lecture Notes in Computer Science, 2002
The Computer Journal, 2011
SIGPLAN Notices, 2002
2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2011
2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2015
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2011