2019
13 pages
FPGAs excel at performing simple operations on high-speed streaming data with high energy efficiency. However, their difficult programming model and poor floating-point support have so far prevented wide adoption for typical HPC applications. This is changing due to recent FPGA technology developments: support for the high-level OpenCL programming language, hard floating-point units, and tight integration with CPU cores. Combined, these are game changers: they dramatically reduce development times and allow using FPGAs for applications that were previously deemed too complex.
Astronomy and Computing, 2020
Realizing the next generation of radio telescopes such as the Square Kilometre Array (SKA) requires hardware and algorithms that are both more efficient than what today's technology provides. The image-domain gridding (IDG) algorithm is a novel approach towards solving the most compute-intensive parts of creating sky images: gridding and degridding. It alleviates the performance bottlenecks of traditional AW-projection gridding by applying instrumental and environmental corrections in the image domain instead of in the Fourier domain. In this paper, we present a thorough performance analysis of this algorithm for an Intel Xeon CPU, Intel Xeon Phi, and GPUs from AMD and NVIDIA. We show that, by evaluating trigonometric functions in hardware, GPUs are both much faster and more energy efficient than a CPU or Xeon Phi. Furthermore, on GPUs, IDG is an order of magnitude faster and more energy efficient than traditional AW-projection. IDG on GPUs is the ideal candidate imaging technique for the SKA, as it meets the computational and energy constraints of the SKA Science Data Processor system.
IEEE Access
Radio telescopes produce large volumes of data that need to be processed to obtain high-resolution sky images. This is a complex task that requires computing systems that provide both high performance and high energy efficiency. Hardware accelerators such as GPUs (Graphics Processing Units) and FPGAs (Field Programmable Gate Arrays) can provide these two features and are thus an appealing option for this application. Most HPC (High-Performance Computing) systems operate in double precision (64-bit) or in single precision (32-bit), and radio-astronomical imaging is no exception. With reduced-precision computing, smaller data types (e.g., 16-bit) are used to improve energy efficiency and throughput in noise-tolerant applications. We demonstrate that reduced precision can also be used to produce high-quality sky images. To this end, we analyze the gridding component (Image-Domain Gridding) of the widely used WSClean imaging application. Gridding is typically one of the most time-consuming steps in the imaging process and, therefore, an excellent candidate for acceleration. We identify the minimum required exponent and mantissa bits for a custom floating-point data type. Then, we propose the first custom floating-point accelerator on a Xilinx Alveo U50 FPGA using High-Level Synthesis. Our reduced-precision implementation improves throughput and energy efficiency by 1.84x and 2.03x, respectively, compared to the single-precision floating-point baseline on the same FPGA. Our solution is also 2.12x faster and 3.46x more energy-efficient than an Intel i9 9900k CPU (Central Processing Unit) and manages to keep up in throughput with an AMD RX 550 GPU.
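To make the idea of a reduced mantissa width concrete, the following minimal sketch emulates a narrower floating-point format in software by zeroing the low-order mantissa bits of a float32 value. It is an illustration only and not the method used in the paper: the function names `truncate_mantissa` and `max_relative_error` are hypothetical, rounding is simple truncation toward zero, and the exponent width is left at the float32 value of 8 bits.

```python
import struct


def truncate_mantissa(x: float, mantissa_bits: int) -> float:
    """Emulate a float with fewer explicit mantissa bits (hypothetical helper).

    The value is first converted to IEEE 754 float32, then the lowest
    (23 - mantissa_bits) mantissa bits are cleared (truncation toward zero).
    The 8-bit float32 exponent is left unchanged.
    """
    if not 0 <= mantissa_bits <= 23:
        raise ValueError("mantissa_bits must be in [0, 23]")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 23 - mantissa_bits
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    # Clear the dropped bits and reinterpret the result as a float32.
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def max_relative_error(values, mantissa_bits: int) -> float:
    """Largest relative error introduced by truncation over nonzero values."""
    return max(
        abs(truncate_mantissa(v, mantissa_bits) - v) / abs(v)
        for v in values
        if v != 0.0
    )


if __name__ == "__main__":
    samples = [0.1, 3.14159, 1234.5]  # arbitrary illustrative inputs
    for m in (23, 16, 10, 7):
        print(m, max_relative_error(samples, m))
```

Sweeping `mantissa_bits` downward and comparing the resulting error against an application's noise tolerance is one simple way to explore how few bits might suffice. A real study would also have to model the exponent range and the rounding mode.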
2006
Our group seeks to revolutionize the development of radio astronomy signal processing instrumentation by designing and demonstrating a scalable, upgradeable, FPGA-based computing platform and software design methodology that targets a range of real-time radio telescope signal processing applications. This project relies on the development of a small number of modular, connectible, upgradeable hardware components and platform-independent signal processing algorithms and libraries which can be reused and scaled as hardware capabilities expand. We have developed such a hardware platform and many of the necessary signal processing libraries for applications in antenna array correlation, wide-band spectroscopy, and pulsar surveys. We present this platform and two applications we have developed for it as demonstrations of the technology. We also identify future directions for the development of this platform, such as packetization, RFI rejection libraries, and real-time imaging.
MECO, 2020
Modern radio telescopes like the Square Kilometer Array (SKA) will need to process exabytes of radio-astronomical signals in real time to construct a high-resolution map of the sky. Near-Memory Computing (NMC) could alleviate the performance bottlenecks due to frequent memory accesses in a state-of-the-art radio-astronomy imaging algorithm. In this paper, we show that a sub-module performing a two-dimensional fast Fourier transform (2D FFT) is memory bound using CPI breakdown analysis on IBM Power9. Then, we present an NMC approach on FPGA for 2D FFT that outperforms a CPU by up to a factor of 120x and performs comparably to a high-end GPU, while using less bandwidth and memory.
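As background on why a 2D transform stresses memory, the sketch below shows the standard row-column decomposition: a 1D transform over every row, followed by a 1D transform over every column. The second pass reads the data with a stride, or after a transpose, which is a common source of memory traffic. This is a generic illustration and not the paper's implementation: the names `dft` and `fft2_row_column` are hypothetical, and a naive O(n²) DFT stands in for a real 1D FFT to keep the example short.

```python
import cmath


def dft(seq):
    """Naive 1D discrete Fourier transform (stand-in for a real 1D FFT)."""
    n = len(seq)
    return [
        sum(seq[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fft2_row_column(grid):
    """2D DFT of a rectangular grid via the row-column decomposition."""
    # Pass 1: transform each row (contiguous access in row-major storage).
    rows = [dft(row) for row in grid]
    # Pass 2: transform each column; zip(*rows) transposes, which models
    # the strided access pattern of the column pass.
    cols = [dft(list(col)) for col in zip(*rows)]
    # Transpose back so that result[u][v] follows the input layout.
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    impulse = [[1, 0], [0, 0]]
    print(fft2_row_column(impulse))
```

Every element is read and written once per pass, while the arithmetic per element is small. That is consistent with the abstract's observation that the 2D FFT sub-module is memory bound.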
Publications of the Astronomical Society of the Pacific, 2017
As a dedicated solar radio interferometer, the MingantU SpEctral RadioHeliograph (MUSER) generates massive observational data in the frequency range of 400 MHz-15 GHz. High-performance imaging is an important aspect of MUSER's massive data processing requirements. In this study, we implement a practical high-performance imaging pipeline for MUSER data processing. First, the specifications of the MUSER are introduced and its imaging requirements are analyzed. Referring to the most commonly used radio astronomy software such as CASA and MIRIAD, we then implement a high-performance imaging pipeline based on Graphics Processing Unit technology, taking into account the current operational status of the MUSER. A series of critical algorithms and their pseudocode, i.e., detection of the solar disk and sky brightness, automatic centering of the solar disk, and estimation of the number of iterations for clean algorithms, are presented in detail. The preliminary experimental results indicate that the proposed imaging approach significantly increases the processing performance of MUSER and generates high-quality images that meet the requirements of MUSER data processing.
Publications of the Astronomical Society of Australia, 2011
General purpose computing on graphics processing units (GPGPU) is dramatically changing the landscape of high performance computing in astronomy. In this paper, we identify and investigate several key decision areas, with a goal of simplifying the early adoption of GPGPU in astronomy. We consider the merits of OpenCL as an open standard in order to reduce risks associated with coding in a native, vendor-specific programming environment, and present a GPU programming philosophy based on using brute force solutions. We assert that effective use of new GPU-based supercomputing facilities will require a change in approach from astronomers. This will likely include improved programming training, an increased need for software development best-practice through the use of profiling and related optimisation tools, and a greater reliance on third-party code libraries. As with any new technology, those willing to take the risks, and make the investment of time and effort to become early adopters of GPGPU in astronomy, stand to reap great benefits.
2006 Fortieth Asilomar Conference on Signals, Systems and Computers, 2006
Our group, the Center for Astronomy Signal Processing and Electronics Research (CASPER), seeks to speed the development of radio astronomy signal processing instrumentation by designing and demonstrating a scalable, upgradeable, FPGA-based computing platform and software design methodology that targets a range of real-time radio telescope signal processing applications. This project relies on a small number of modular, connectible hardware components and open-source signal processing libraries which can be reused and scaled as hardware capabilities expand. We have demonstrated the use of 10 Gb Ethernet packetization and switches to manage high-bandwidth inter-board communication. Using these tools, we have built spectrometers, correlators, beamformers, VLBI data recorders, and many other applications. Future directions for the development include a fully packetized scalable correlator, additional library and toolflow development, and a next generation of modular FPGA-based hardware.
2017
In radio astronomy, Field Programmable Gate Array (FPGA) technology is widely used to implement digital signal processing techniques applied to antenna arrays. This is mainly due to the good trade-off among computing resources, power consumption, and cost offered by FPGA chips compared to other technologies such as ASICs, GPUs, and CPUs. In recent years, several digital backend systems based on such devices have been developed at the Medicina radio astronomical station (INAF-IRA, Bologna, Italy). Instruments such as an FX correlator, a direct imager, a beamformer, and a multi-beam system have been successfully designed and realized on CASPER (Collaboration for Astronomy Signal Processing and Electronics Research, https://casper.berkeley.edu) processing boards. In this paper, we present the experience gained in these applications.
2010
Next-generation radio telescopes are expected to produce petaflops of astronomical image data of the universe. To complicate matters, the raw astronomical data must be processed in various ways before it is usable by astronomers. Source extraction is the final stage in the astronomical image processing pipeline, where properties and characteristics of astronomical points of interest, or sources, are calculated and reported.
Monthly Notices of the Royal Astronomical Society, 2013
We present a high-performance, graphics processing unit (GPU)-based framework for the efficient analysis and visualization of (nearly) terabyte (TB)-sized 3-dimensional images. Using a cluster of 96 GPUs, we demonstrate for a 0.5 TB image: (1) volume rendering using an arbitrary transfer function at 7-10 frames per second; (2) computation of basic global image statistics such as the mean intensity and standard deviation in 1.7 s; (3) evaluation of the image histogram in 4 s; and (4) evaluation of the global image median intensity in just 45 s. Our measured results correspond to a raw computational throughput approaching one teravoxel per second, and are 10-100 times faster than the best possible performance with traditional single-node, multi-core CPU implementations. A scalability analysis shows the framework will scale well to images sized 1 TB and beyond. Other parallel data analysis algorithms can be added to the framework with relative ease, and accordingly, we present our framework as a possible solution to the image analysis and visualization requirements of next-generation telescopes, including the forthcoming Square Kilometre Array pathfinder radio telescopes.