2009
The focus on throughput and large data volumes separates Information Retrieval (IR) from scientific computing, since for IR it is critical to process large amounts of data efficiently, a task at which the GPU does not currently excel. Only recently has the IR community begun to explore the possibilities, and an implementation of a search engine for the GPU was published in April 2009. This paper analyzes how GPUs can be improved to better suit such large-data-volume applications. Current graphics cards have a bottleneck in the transfer of data between the host and the GPU. One approach to resolving this bottleneck is to include the host memory as part of the GPU's memory hierarchy. Benchmarks from the NVIDIA ION, 9800M and GTX 240 are included. Several suggestions for future GPU features are also presented.
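The host-to-GPU transfer bottleneck described above can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only: the function name and every figure in it (kernel speedup, data volume, PCIe bandwidth) are assumptions for the example, not benchmark numbers from the paper.

```python
# Hypothetical model: why host-to-GPU transfer dominates data-heavy IR workloads.
def effective_speedup(compute_s, kernel_speedup, bytes_moved, pcie_gbps):
    """End-to-end speedup when every query batch must cross the PCIe bus."""
    transfer_s = bytes_moved / (pcie_gbps * 1e9)
    gpu_total = compute_s / kernel_speedup + transfer_s
    return compute_s / gpu_total

# A 10x kernel speedup on 1 s of CPU work collapses to ~1.3x once 4 GB must be
# copied over a ~6 GB/s PCIe link (all figures are illustrative assumptions).
print(round(effective_speedup(1.0, 10.0, 4e9, 6.0), 2))  # -> 1.3
```

With the transfer term removed, the same call returns the full kernel speedup, which is exactly why treating host memory as part of the GPU's hierarchy is attractive.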
International Journal of Electronic Business, 2005
E-commerce database and web servers are characterized by frequent accesses to large indexed data sets. The paper proposes a novel approach for exploiting the graphics resources of a commodity PC, transforming the graphics processing unit (GPU) into an indirection engine for fast retrieval of indexed data. We show how a GeForce FX graphics card can make use of vertices, textures and colours to solve up to six nested indexed lookups entirely in hardware, achieving a performance gain of up to 400% versus a Pentium 4 processor with a five times higher clock frequency. As they evolve, GPUs are turning into general-purpose processors capable of cooperating with the CPU for fast joint execution. Our results demonstrate the efficiency of the graphics processor for accessing indexed information, suggesting its extensive use in web servers, relational databases and XML systems, where multiple queries can run simultaneously on both processors.
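The "nested indexed lookup" access pattern the paper maps onto dependent texture fetches is simple to sketch on the CPU side: each lookup's result becomes the index for the next level. The tables and names below are illustrative, not the paper's data.

```python
# Sketch of a chained indirection: key -> index -> index -> payload.
# Each level plays the role of one dependent texture fetch on the GPU.
def chained_lookup(tables, key):
    """Resolve len(tables) nested indirections starting from `key`."""
    idx = key
    for table in tables:
        idx = table[idx]
    return idx

# Three indirection levels (the paper's GeForce FX resolves up to six in hardware).
level0 = [2, 0, 1]                     # e.g. hash bucket -> row id
level1 = [10, 20, 30]                  # row id -> record offset
level2 = {10: "a", 20: "b", 30: "c"}   # record offset -> payload
print(chained_lookup([level0, level1, level2], 0))  # -> c
```

On the GPU, each level is stored as a texture and the intermediate index is carried as a texture coordinate, so all hops complete without a round trip to the CPU.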
Studies in Informatics and Control
Parallel computing platforms enable dramatic increases in computing performance by harnessing the power of Graphics Processing Units (GPUs). Their design, based on a high level of hardware parallelization achieved through a large number of processing cores, has made GPUs serious competitors to CPU-based processing architectures. This is most obvious when it comes to processing huge amounts of data. Many recent studies have aimed at developing GPU-based implementations for fields such as astronomy, medicine, image processing, data compression and others. However, very few of them have aimed at improving information retrieval with GPUs. Considering this, along with the current state of content-based information retrieval algorithms and their practical efficiency for a wide variety of applications, this paper focuses on comparing the performance of parallel GPGPU algorithms against their CPU equivalents.
Procedia Computer Science, 2010
This study is devoted to exploring possible applications of GPU technology for accelerating database access. We use an n-gram-based approximate text search engine as a test bed for GPU-based acceleration algorithms. Two solutions, a hybrid CPU/GPU algorithm and a pure GPU algorithm for query processing, are studied and compared with the baseline CPU algorithm as well as with optimized versions of the CPU algorithm. The hybrid algorithm performs poorly on most queries, and only modest acceleration is achievable for long queries with a high error level. On the other hand, speedups of up to 18 times were achieved with the pure GPU algorithm. Application of GPU acceleration to more general database problems is discussed.
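The n-gram filtering idea behind such a search engine can be sketched in a few lines: build an inverted index from n-grams to words, then keep only candidates that share enough n-grams with the query. This is a generic count-filter sketch, not the paper's algorithm; all names and the tolerance parameter are illustrative.

```python
from collections import defaultdict

def ngrams(s, n=3):
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def build_index(words, n=3):
    """Inverted index: n-gram -> ids of words containing it."""
    index = defaultdict(set)
    for wid, w in enumerate(words):
        for g in ngrams(w, n):
            index[g].add(wid)
    return index

def approx_search(index, words, query, max_missing=1, n=3):
    """Candidate filter: keep words sharing all but `max_missing` query n-grams."""
    counts = defaultdict(int)
    qgrams = ngrams(query, n)
    for g in qgrams:
        for wid in index.get(g, ()):
            counts[wid] += 1
    need = len(qgrams) - max_missing
    return [words[wid] for wid, c in counts.items() if c >= need]

words = ["retrieval", "retrieved", "reversal"]
idx = build_index(words)
print(approx_search(idx, words, "retrieval"))  # -> ['retrieval']
```

The per-n-gram counting step is embarrassingly parallel, which is what makes this filter a natural fit for a pure-GPU implementation.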
Proceedings of the 21st spring conference on Computer graphics - SCCG '05, 2005
With the introduction in 2003 of standard GPUs with 32-bit floating-point numbers and programmable vertex and fragment processors, the processing power of the GPU was made available to non-graphics applications. As the GPU is aimed at computer graphics, the concepts in GPU programming are based on computer graphics terminology, and programming strategies have to be based on the architecture of the graphics pipeline. At SINTEF in Norway, a 4-year strategic institute project (2004-2007), "Graphics hardware as a high-end computational resource", aims at making GPUs available as a computational resource to both academia and industry. This paper addresses the challenges of GPU programming and presents results from the project's first year.
Proceedings of the IEEE, 2008
The graphics processing unit (GPU) has become an integral part of today's mainstream computing systems. Over the past six years, there has been a marked increase in the performance and capabilities of GPUs. The modern GPU is not only a powerful graphics engine but also a highly parallel programmable processor featuring peak arithmetic and memory bandwidth that substantially outpaces its CPU counterpart. The GPU's rapid increase in both programmability and capability has spawned a research community that has successfully mapped a broad range of computationally demanding, complex problems to the GPU. This effort in general-purpose computing on the GPU, also known as GPU computing, has positioned the GPU as a compelling alternative to traditional microprocessors in high-performance computer systems of the future. We describe the background, hardware, and programming model for GPU computing, summarize the state of the art in tools and techniques, and present four GPU computing successes in game physics and computational biophysics that deliver order-of-magnitude performance gains over optimized CPU applications.

KEYWORDS: General-purpose computing on the graphics processing unit (GPGPU); GPU computing; parallel computing

I. INTRODUCTION

Parallelism is the future of computing. Future microprocessor development efforts will continue to concentrate on adding cores rather than increasing single-thread performance. One example of this trend, the heterogeneous nine-core Cell broadband engine, is the main processor in the Sony PlayStation 3 and has also attracted substantial interest from the scientific computing community. Similarly, the highly parallel graphics processing unit (GPU) is rapidly gaining maturity as a powerful engine for computationally demanding applications.
The GPU's performance and potential offer a great deal of promise for future computing systems, yet the architecture and programming model of the GPU are markedly different from those of most other commodity single-chip processors. The GPU is designed for a particular class of applications with the following characteristics; over the past few years, a growing community has identified other applications with similar characteristics and successfully mapped them onto the GPU.

Computational requirements are large. Real-time rendering requires billions of pixels per second, and each pixel requires hundreds or more operations. GPUs must deliver an enormous amount of compute performance to satisfy the demand of complex real-time applications.

Parallelism is substantial. Fortunately, the graphics pipeline is well suited to parallelism. Operations on vertices and fragments are well matched to fine-grained, closely coupled programmable parallel compute units, which in turn are applicable to many other computational domains.

Throughput is more important than latency. GPU implementations of the graphics pipeline prioritize throughput over latency. The human visual system operates on millisecond time scales, while operations within a modern processor take nanoseconds. This six-order-of-magnitude gap means that the latency of any individual operation is unimportant. As a consequence, the graphics pipeline is quite …
2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications, 2012
Similarity search has been widely studied in recent years, as it can be applied to fields such as content-based search in multimedia objects, text retrieval and computational biology. These applications usually work on very large databases that are often indexed off-line to accelerate on-line searches. However, to maintain an acceptable throughput, it is essential to exploit the intrinsic parallelism of the algorithms used for on-line query processing, even with indexed databases. Therefore, many strategies have been proposed in the literature to parallelize these algorithms, on both shared- and distributed-memory multiprocessor systems. Lately, GPUs have also been used to implement brute-force approaches instead of indexing structures, due to the difficulties the index introduces in efficiently exploiting GPU resources. In this work we propose a multi-GPU metric-space technique that efficiently exploits index data structures for similarity search in large databases, and show how it outperforms previous OpenMP and GPU brute-force strategies. Furthermore, our analysis covers the effects of the database size and its nature.
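The core of most metric-space indexes is pivot-based pruning via the triangle inequality: if |d(q, p) - d(x, p)| > r for some pivot p, object x cannot be within radius r of query q. The sketch below is a generic pivot filter in a toy 1-D metric space, not the paper's multi-GPU structure; the pivots, radius and data are illustrative.

```python
def pivot_filter(pivots, dist, db, query, radius):
    """Range search with pivot pruning: discard x when |d(q,p) - d(x,p)| > r
    for some pivot p (triangle inequality); verify only the survivors."""
    # In a real index this table is precomputed off-line, once per database.
    table = {x: [dist(x, p) for p in pivots] for x in db}
    qdists = [dist(query, p) for p in pivots]
    results = []
    for x in db:
        if any(abs(qd - xd) > radius for qd, xd in zip(qdists, table[x])):
            continue  # pruned without computing d(query, x)
        if dist(query, x) <= radius:
            results.append(x)
    return results

# Toy 1-D metric space; pivots 0 and 10 are arbitrary illustrative choices.
dist = lambda a, b: abs(a - b)
db = [1, 4, 7, 12]
print(pivot_filter([0, 10], dist, db, 5, 2))  # -> [4, 7]
```

The verification loop over surviving candidates is the part that parallelizes naturally across GPU threads, which is why index traversal, not distance evaluation, is the hard part of a GPU port.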
Data Science and Engineering
Once exotic, computational accelerators are now commonly available in many computing systems. Graphics processing units (GPUs) are perhaps the most frequently encountered computational accelerators. Recent work has shown that GPUs are beneficial when analyzing massive data sets. Specifically related to this study, it has been demonstrated that GPUs can significantly reduce the query processing time of database bitmap index queries. Bitmap indices are typically used for large, read-only data sets and are often compressed using some form of hybrid run-length compression. In this paper, we present three GPU algorithm enhancement strategies for executing queries over bitmap indices compressed using word-aligned hybrid compression: (1) data structure reuse, (2) metadata creation with various type alignments, and (3) a preallocated memory pool. The data structure reuse greatly reduces the number of costly memory system calls. The use of metadata exploits the immutable nature of bitmaps to pre-...
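Word-aligned hybrid (WAH) compression packs a bitmap into 32-bit words: a "literal" word carries 31 raw bits, while a "fill" word run-length encodes consecutive all-zero or all-one 31-bit chunks. The toy codec below illustrates only that idea; it is a simplified sketch, and the real WAH word layout the paper queries on the GPU differs in its bit-level encoding.

```python
# Toy word-aligned hybrid (WAH) codec over 31-bit payloads (simplified sketch).
W = 31  # payload bits per 32-bit word in WAH

def wah_encode(bits):
    """bits: string of '0'/'1' whose length is a multiple of W."""
    words, i = [], 0
    while i < len(bits):
        chunk = bits[i:i + W]
        if chunk == "0" * W or chunk == "1" * W:
            run = 1  # fill word: count identical chunks that follow
            while bits[i + run * W:i + (run + 1) * W] == chunk:
                run += 1
            words.append(("fill", chunk[0], run))
            i += run * W
        else:
            words.append(("literal", chunk))
            i += W
    return words

def wah_decode(words):
    out = []
    for w in words:
        if w[0] == "fill":
            out.append(w[1] * (W * w[2]))
        else:
            out.append(w[1])
    return "".join(out)

# Three all-ones chunks collapse into a single fill word.
bits = "1" * 93 + ("01" * 16)[:31]
enc = wah_encode(bits)
print(enc[0])  # -> ('fill', '1', 3)
```

Fill words are immutable for a read-only index, which is what makes precomputing per-word metadata (strategy 2 above) pay off across repeated queries.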
2009 International Conference on High Performance Computing (HiPC), 2009
With general-purpose programmable GPUs becoming increasingly popular, automated tools are needed to bridge the gap between the performance achievable on highly parallel architectures and the performance required by applications. In this paper, we concentrate on improving GPU memory management for applications with large and intermediate data sets that do not completely fit in GPU memory. For such applications, the movement of the extra data to CPU memory must be carefully managed. In particular, we focus on solving the joint task scheduling and data transfer scheduling problem posed in [1], and propose an algorithm that gives close-to-optimal results (as measured by running simulated annealing overnight) in terms of the amount of data transferred for image processing benchmarks such as edge detection and convolutional neural networks. Our results enable a reduction of up to 30× in the amount of data transfers compared to an unoptimized implementation. They are up to 2× better than the methods previously proposed in [1] and less than 16% away from the optimal solution.
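A simplified version of the transfer-minimization problem treats GPU memory as a cache of data blocks over a known task order, where evicting the block whose next use is furthest in the future (Belady's rule) minimizes host-to-GPU copies. This is a stand-in model to illustrate the objective, not the paper's joint scheduling algorithm; the trace and capacity below are made up.

```python
def transfers(sequence, capacity):
    """Count host->GPU copies for a known block-access order, evicting the
    block whose next use is furthest in the future (Belady's offline rule)."""
    mem, copies = set(), 0
    for i, blk in enumerate(sequence):
        if blk in mem:
            continue  # already resident, no transfer needed
        copies += 1
        if len(mem) == capacity:
            def next_use(b):
                for j in range(i + 1, len(sequence)):
                    if sequence[j] == b:
                        return j
                return float("inf")  # never used again: ideal eviction victim
            mem.remove(max(mem, key=next_use))
        mem.add(blk)
    return copies

# Illustrative tile-access trace with room for only 2 blocks on the device.
print(transfers(["A", "B", "C", "A", "B", "D", "A"], capacity=2))  # -> 5
```

The paper's setting is harder because the task order itself is also a decision variable, so scheduling and eviction must be optimized jointly rather than fixing the sequence as this sketch does.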
Pollack Periodica, 2008
The evolution of GPUs (graphics processing units) has been enormous in the past few years. Their computational power has improved exponentially, while the range of tasks computable on GPUs has grown significantly wider. The milestone of recent GPU development is the appearance of unified-architecture devices. These GPUs implement a massively parallel design, which makes them capable not only of processing common computer graphics tasks but also of performing highly parallel mathematical algorithms effectively. Recognizing this, GPU vendors have issued developer platforms that let programmers manage computations on the GPU as a data-parallel computing device without the need to map them to a graphics API. Researchers have welcomed this initiative, and the application of the new technology is quickly spreading through various branches of science.
ACM SIGGRAPH 2005 Courses on - SIGGRAPH '05, 2005
International Journal of …, 2011
Information Processing & …, 2011
2016
Bulletin of Electrical Engineering and Informatics, 2021
Lecture Notes in Computer Science, 2012
The Journal of Supercomputing, 2014