2017, PubMed
19 pages
Graphics processing unit (GPU) implementations of signal processing algorithms can outperform CPU-based implementations. This paper describes the GPU implementation of several algorithms encountered in a wide range of high-data-rate communication receivers, including filters, multirate filters, numerically controlled oscillators, and multi-stage digital down converters. These structures are tested by processing the 20 MHz wide FM radio band (88-108 MHz). Two receiver structures are explored: a single-channel receiver and a filter bank channelizer. Both run in real time on an NVIDIA GeForce GTX 1080 graphics card.
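The building blocks named in this abstract (a numerically controlled oscillator, a low-pass filter, and decimation) can be illustrated with a minimal single-stage digital down converter. This is a CPU sketch in plain Python, not the paper's GPU code. The function names `nco`, `lowpass_taps`, and `ddc`, the Hamming-windowed sinc design, and all parameters are assumptions made for illustration.

```python
import cmath
import math


def nco(freq_hz, sample_rate_hz, n):
    """Numerically controlled oscillator: exp(-j*2*pi*f*k/fs) for k = 0..n-1."""
    step = -2.0 * math.pi * freq_hz / sample_rate_hz
    return [cmath.exp(1j * step * k) for k in range(n)]


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass filter (an assumed design choice).

    cutoff is a fraction of the sample rate, 0 < cutoff < 0.5.
    """
    mid = (num_taps - 1) / 2.0
    taps = []
    for k in range(num_taps):
        x = k - mid
        if x == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC


def ddc(samples, freq_hz, sample_rate_hz, taps, decim):
    """Mix the band centred at freq_hz down to baseband, filter, and decimate."""
    lo = nco(freq_hz, sample_rate_hz, len(samples))
    mixed = [s * o for s, o in zip(samples, lo)]
    out = []
    # Compute only the filter outputs that survive decimation.
    for n in range(len(taps) - 1, len(mixed), decim):
        acc = 0j
        for k, h in enumerate(taps):
            acc += h * mixed[n - k]
        out.append(acc)
    return out


if __name__ == "__main__":
    fs = 8000.0
    tone = [cmath.exp(2j * math.pi * 1000.0 * k / fs) for k in range(256)]
    baseband = ddc(tone, 1000.0, fs, lowpass_taps(31, 0.05), 4)
    print(len(baseband), abs(baseband[0]))
```

Each output sample depends only on a fixed window of input samples, so the outer loop is independent across outputs. That independence is what makes this kind of structure a natural candidate for one-thread-per-output GPU mapping, although the abstract does not state how the authors partition the work.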
This paper describes the benefits of using GPUs for large-scale signal processing applications from a holistic perspective. It considers not only the raw performance of the hardware but also other factors, from initial development to life cycle support, to understand the benefits of using such a powerful processing platform.
Journal of Signal Processing Systems, 2012
This article studies the integration of Graphics Processing Units in a Software Defined Radio environment. Two main solutions are considered, based on two levels of granularity for the parallelization. First, a fine-grain parallelism solution is proposed, which extends existing solutions but is adapted to operations of large computational complexity. Second, an original solution based on a coarse-grain approach is described, which allows better usage of the computing resources and easier parallelism extraction. For both solutions, scheduling and communication design as well as implementation are given, along with integration in the environment. Both solutions have been implemented and compared on different operation types and on multi-operation sequences. It is clearly shown that the second solution can provide a performance improvement, while the first one is not suited to SDR applications.
Design and Implementation of Digital Signal Processing Hardware for a Software Radio Receiver. This project summarizes the design and implementation of field programmable gate array (FPGA) based digital signal processing (DSP) hardware meant to be used in a software radio system. The filters and processing were first designed in MATLAB and then implemented using very high speed integrated circuit hardware description language (VHDL).
Proc. 46th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, USA, 4-7 Nov. 2012., 2012
High transmission bit rates in wireless channels give rise to severe inter-symbol interference (ISI), which makes the detection task very challenging. In such cases, Near-Maximum-Likelihood (NML) detection gives good performance. This paper describes the design and testing of a QPSK transceiver using NML detection on an NVIDIA Graphics Processing Unit (GPU) for Software Defined Radio (SDR) systems. Recent advances in programmable, highly parallel GPUs have enabled high-performance general-purpose computation. An NVIDIA GPU is used to realize this application. The data is transmitted over a frequency-selective channel and experimental results are obtained. All the processing is done on the NVIDIA GPU, and the results are compared across varying signal-to-noise ratios (SNR) for different channel configurations.
2013 International Conference on Parallel and Distributed Systems, 2013
Wideband channelization is a computationally intensive task within software-defined radio (SDR). To support this task, the underlying hardware should provide high performance and allow flexible implementations. Traditional solutions use field-programmable gate arrays (FPGAs) to satisfy these requirements. While FPGAs allow for flexible implementations, realizing an FPGA implementation is a difficult and time-consuming process. On the other hand, multicore processors, while more programmable, fail to satisfy performance requirements. Graphics processing units (GPUs) overcome the above limitations. However, traditional GPUs are power-hungry and can consume as much as 350 watts, making them ill-suited for many SDR environments, particularly those that are battery-powered. Here we explore the viability of low-power mobile graphics processors to simultaneously overcome the limitations of performance, flexibility, and power. Via execution profiling and performance analysis, we identify major bottlenecks in mapping the wideband channelization algorithm onto these devices and adopt several optimization techniques to achieve multiplicative speed-up over a multithreaded implementation. Overall, our approach delivers a speedup of up to 43-fold on the discrete AMD Radeon HD 6470M GPU and 27-fold on the integrated AMD Radeon HD 6480G GPU, when compared to a vectorized and multithreaded version running on the AMD A4-3300M CPU.
This paper presents the effective exploitation of the graphics processing unit (GPU) in the Raspberry Pi for fast Fourier transform (FFT) computation. Very fast FFT computation is useful in computer-vision-based navigation systems, the Global Positioning System (GPS), and ham radio, including on the Raspberry Pi. The speed of FFT computation on the BCM2835 GPU is compared with that of the 700 MHz ARM processor available in the Raspberry Pi and with Intel Core i5 processors. The FFT is computed for arbitrary one-dimensional input signals and analyzed on the different processors with varying signal lengths. GNU Radio is installed on the Raspberry Pi, and the FFT computation done in GNU Radio is accelerated using the Raspberry Pi GPU. Even though the Raspberry Pi GPU is primarily built for video enhancement, its parallel computational ability is utilized in this paper for accelerated FFT computation.
GPS Solutions, 2010
Off-the-shelf graphics processing units provide low-cost, massively parallel computing performance, which can be utilized for the implementation of a GPS software receiver. To realize a real-time capable system, the crucial stages of the receiver should be optimized to suit the requirements of a parallel processor. Moreover, the receiver should be capable of providing wider correlation functions and easy access to the spectral domain of the signals. Thus, the most suitable correlation algorithm, which forms the core of each receiver, should be chosen and implemented on the graphics processor. Since the sampling rate of the received signal limits the real-time capabilities of the software radio, it is necessary to determine an optimum value, considering that the precision of the observable varies with sampling bandwidth. We discuss the details and present our single-frequency multi-channel implementation, which is capable of operating in real-time mode. Our implementation differs from other solutions in the width of the correlation function and allows simple handling of data in the spectral domain. Comparison with output from a commercial hardware receiver, which shares the antenna with the software radio, confirms the consistency and accuracy of our development.
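The abstract does not say which correlation algorithm the authors selected. One common technique that yields the full correlation function and direct access to the spectral domain is FFT-based circular correlation, sketched below in plain Python under that assumption. The helper names `fft`, `circular_correlation`, and `acquire` are hypothetical, and the radix-2 FFT requires a power-of-two length.

```python
import cmath
import math


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1.0 if inverse else -1.0
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def circular_correlation(x, code):
    """r[d] = sum_n x[n] * conj(code[(n - d) mod N]), computed via the FFT."""
    n = len(x)
    assert n == len(code) and n > 0 and (n & (n - 1)) == 0
    spectrum_x = fft(x)
    spectrum_c = fft(code)
    product = [a * b.conjugate() for a, b in zip(spectrum_x, spectrum_c)]
    # The inverse transform is unnormalized, so divide by n here.
    return [v / n for v in fft(product, inverse=True)]


def acquire(x, code):
    """Return (delay, magnitude) of the largest correlation peak."""
    mags = [abs(v) for v in circular_correlation(x, code)]
    peak = max(range(len(mags)), key=mags.__getitem__)
    return peak, mags[peak]
```

All code delays are evaluated in one pass of three transforms, and the intermediate `product` is the cross-spectrum of the signal and the code. A GPU version would typically hand the transforms to a vendor FFT library rather than use a recursive routine like this one.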
ElConRusNW-2015
The present paper discusses radio monitoring tasks and their solution using DFT-modulated filter banks. Software-hardware implementations of the filter bank are studied on a Central Processing Unit (CPU) and on a Graphics Processing Unit (GPU) using the Compute Unified Device Architecture (CUDA). It is shown that CUDA technology is efficient for processing large datasets and outperforms the CPU implementation. The paper also considers real-time signal classification at different signal-to-noise ratios using a binary tree together with the iterative AdaBoost technique. Experiments show that it is possible to reach a total classification error of 10% for signals handled in radio monitoring tasks.
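A DFT-modulated filter bank can be sketched as a polyphase filter followed by a DFT across the branches. The plain-Python example below is a critically sampled, textbook form of that structure, not the paper's CUDA implementation. The function name `channelize`, the requirement that the prototype length be a multiple of the channel count, and the naive DFT are assumptions made to keep the sketch short.

```python
import cmath
import math


def channelize(x, h, num_channels):
    """Critically sampled DFT-modulated filter bank.

    x: complex input samples at rate fs
    h: prototype low-pass taps, len(h) a multiple of num_channels
    Returns out[i][m]: output i of channel m (centre frequency m*fs/M),
    at rate fs/M. The first row corresponds to block index len(h)//M.
    """
    M = num_channels
    assert len(h) % M == 0
    Q = len(h) // M  # taps per polyphase branch
    twiddle = [[cmath.exp(2j * math.pi * m * p / M) for p in range(M)]
               for m in range(M)]
    out = []
    for n in range(Q, len(x) // M):
        # Polyphase branch p filters the input samples x[(n - q) * M - p].
        u = []
        for p in range(M):
            acc = 0j
            for q in range(Q):
                acc += h[q * M + p] * x[(n - q) * M - p]
            u.append(acc)
        # A DFT across the branches separates the M channels.
        out.append([sum(twiddle[m][p] * u[p] for p in range(M))
                    for m in range(M)])
    return out
```

The result is equivalent to mixing each channel to baseband, filtering with `h`, and decimating by `M`, but the filtering work is shared by all channels. On a GPU the branch filters and the per-block DFT are independent across blocks, and the DFT would normally be replaced by a batched FFT.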
This paper describes the implementation of a streaming spectral processing system for realtime audio in a consumer-level onboard GPU (Graphics Processing Unit) attached to an off-the-shelf laptop computer. It explores the implementation of four processes: standard phase vocoder analysis and synthesis, additive synthesis and the sliding phase vocoder. These were developed under the CUDA development environment as plugins for the Csound 6 audio programming language. Following a detailed exposition of the GPU code, results of performance tests are discussed for each algorithm. They demonstrate that such a system is capable of realtime audio, even under the restrictions imposed by a limited GPU capability.
2009 IEEE Workshop on Signal Processing Systems, 2009
Multiple-input multiple-output (MIMO) is an existing technique that can significantly increase system throughput by employing multiple antennas at the transmitter and the receiver. Realizing the maximum benefit from this technique requires computationally intensive detectors, which pose significant challenges to receiver design. Furthermore, a flexible detector or multiple detectors are needed to handle different configurations. The Graphical Processor Unit (GPU), a highly parallel, programmable commodity co-processor, can deliver extremely high computation throughput and is well suited for signal processing applications. However, careful architecture-aware design is needed to leverage the performance offered by the GPU. We show that good performance can be achieved while maintaining flexibility by employing an optimized trellis-based MIMO detector on the GPU.