2016
GraphBLAS is an emerging paradigm for graph computation that makes it easy to program new graph algorithms in a highly abstract language of linear algebra. The promise of GraphBLAS is that an abstract graph program will execute in a wide variety of programming environments, ranging from embedded environments to distributed-memory computers. In this paper we present our initial implementation of GraphBLAS primitives for graphics processing unit (GPU) systems, called the GraphBLAS Template Library (GBTL). Our implementation is an ongoing effort in the context of the GraphBLAS standardization effort by a diverse group of academics and industry representatives. Our implementation consists of a high-level C++ frontend; the GPU functionality is implemented with a combination of the CUSP library for sparse-matrix computation on the GPU and the NVIDIA Thrust framework for abstract GPU programs. We give initial performance results of our implementations, and we discuss solutions to the problems we encountered when providing a low-level implementation for a high-level generic interface.
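As a rough illustration of the kind of high-level C++ frontend the paper describes, the sketch below shows what a GraphBLAS-style matrix-vector multiply over a user-chosen semiring might look like. All names (the MinPlusSemiring, Matrix, Vector, and mxv identifiers) are hypothetical placeholders, not the actual GBTL API, and dense containers stand in for the library's sparse ones to keep the sketch self-contained.

    #include <cstddef>
    #include <limits>
    #include <vector>

    // Hypothetical GraphBLAS-style frontend sketch (names are illustrative, not the real GBTL API).
    // A (min,+) semiring: "add" is min, "multiply" is +, identity is +infinity.
    struct MinPlusSemiring {
        static double zero() { return std::numeric_limits<double>::infinity(); }
        static double add(double a, double b)  { return a < b ? a : b; }
        static double mult(double a, double b) { return a + b; }
    };

    // Dense stand-ins for the library's sparse containers; "no edge" is +infinity.
    using Matrix = std::vector<std::vector<double>>;
    using Vector = std::vector<double>;

    // w = A (x) u over the given semiring: one relaxation step of single-source shortest paths.
    template <typename Semiring>
    Vector mxv(const Matrix &A, const Vector &u)
    {
        Vector w(A.size(), Semiring::zero());
        for (std::size_t i = 0; i < A.size(); ++i)
            for (std::size_t j = 0; j < u.size(); ++j)
                w[i] = Semiring::add(w[i], Semiring::mult(A[i][j], u[j]));
        return w;
    }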
ACM Transactions on Mathematical Software
High-performance implementations of graph algorithms are difficult to achieve on new parallel hardware such as GPUs because of three challenges: (1) the difficulty of devising graph building blocks, (2) load imbalance on parallel hardware, and (3) graph problems having low arithmetic intensity. To address some of these challenges, GraphBLAS is an innovative, ongoing effort by the graph analytics community to propose building blocks based on sparse linear algebra, which allow graph algorithms to be expressed in a performant, succinct, composable, and portable manner. In this paper, we examine the performance challenges of a linear-algebra-based approach to building graph frameworks and describe new design principles for overcoming these bottlenecks. Among the new design principles is exploiting input sparsity, which allows users to write graph algorithms without specifying push and pull direction. Exploiting output sparsity allows users to tell the backend which values of...
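The push/pull distinction mentioned here can be made concrete with a small sketch: in a push step the sparse frontier scatters updates along outgoing edges, while in a pull step every unvisited vertex gathers from incoming edges; a direction-optimizing backend chooses between the two based on frontier size, which is what exploiting input sparsity automates. The CSR layout and the specific routines below are illustrative assumptions, not the paper's implementation.

    #include <vector>

    // Illustrative compressed-sparse-row graph; one instance indexes outgoing edges, another incoming.
    struct Csr { std::vector<int> row_ptr, col_idx; };

    // Push step: iterate only the (sparse) frontier and scatter along outgoing edges.
    void push_step(const Csr &out, const std::vector<int> &frontier,
                   std::vector<int> &level, int depth, std::vector<int> &next)
    {
        for (int v : frontier)
            for (int e = out.row_ptr[v]; e < out.row_ptr[v + 1]; ++e) {
                int u = out.col_idx[e];
                if (level[u] < 0) { level[u] = depth; next.push_back(u); }
            }
    }

    // Pull step: iterate every unvisited vertex and gather from incoming edges.
    // in_frontier is a dense per-vertex flag marking frontier membership.
    void pull_step(const Csr &in, const std::vector<char> &in_frontier,
                   std::vector<int> &level, int depth, std::vector<int> &next)
    {
        for (int u = 0; u < static_cast<int>(level.size()); ++u) {
            if (level[u] >= 0) continue;
            for (int e = in.row_ptr[u]; e < in.row_ptr[u + 1]; ++e)
                if (in_frontier[in.col_idx[e]]) { level[u] = depth; next.push_back(u); break; }
        }
    }

A direction-optimizing backend would call push_step while the frontier is small relative to the unexplored edges and switch to pull_step otherwise.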
2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017
The purpose of the GraphBLAS Forum is to standardize linear-algebraic building blocks for graph computations. An important part of this standardization effort is to translate the mathematical specification into an actual Application Programming Interface (API) that (i) is faithful to the mathematics and (ii) enables efficient implementations on modern hardware. This paper documents the approach taken by the C language specification subcommittee and presents the main concepts, constructs, and objects within the GraphBLAS API. Use of the API is illustrated by showing an implementation of the betweenness centrality algorithm. For example, a GraphBLAS vector is defined by its domain D (the data type of the vector elements) and its size N > 0.
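A minimal sketch of how these objects compose, assuming a conforming GraphBLAS C implementation: one breadth-first-search frontier-expansion step written as a masked vector-matrix product. The semiring is built by hand from predefined operators; exact descriptor constant names (e.g., GrB_SCMP) vary across spec revisions.

    #include <GraphBLAS.h>

    // Minimal sketch: one BFS frontier-expansion step with the GraphBLAS C API.
    // Assumes a conforming implementation; descriptor constants follow the early spec revisions.
    void bfs_step(GrB_Matrix A, GrB_Vector frontier, GrB_Vector visited)
    {
        // Build the boolean (logical-or, logical-and) semiring from predefined operators.
        GrB_Monoid lor;
        GrB_Monoid_new_BOOL(&lor, GrB_LOR, false);
        GrB_Semiring lor_land;
        GrB_Semiring_new(&lor_land, lor, GrB_LAND);

        // Complement the mask (write only to unvisited vertices) and replace the old frontier.
        GrB_Descriptor desc;
        GrB_Descriptor_new(&desc);
        GrB_Descriptor_set(desc, GrB_MASK, GrB_SCMP);
        GrB_Descriptor_set(desc, GrB_OUTP, GrB_REPLACE);

        // frontier<!visited> = frontier * A
        GrB_vxm(frontier, visited, GrB_NULL, lor_land, frontier, A, desc);

        // visited = visited OR frontier
        GrB_Vector_eWiseAdd_BinaryOp(visited, GrB_NULL, GrB_NULL, GrB_LOR, visited, frontier, GrB_NULL);

        GrB_Descriptor_free(&desc);
        GrB_Semiring_free(&lor_land);
        GrB_Monoid_free(&lor);
    }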
The focus of traditional scientific computing has been on solving large systems of PDEs (and the corresponding linear algebra problems that they induce). Hardware architectures, computer systems, and software platforms have evolved together to efficiently support solving these kinds of problems. Similar attention has not been devoted to solving large-scale graph problems. Recently this class of applications has seen increased attention. The irregular, nonlocal, and dynamic characteristics of these problems require new programming techniques to adapt them to modern HPC systems offering multiple levels of parallelism. We describe a library for implementing graph algorithms based on asynchronous execution of fine-grained, concurrent operations. Prototype implementations of two graph kernels which combine lightweight graph metadata transactions with generalized active messages demonstrate that it is possible to implement graph applications which efficiently leverage both shared- and distributed-memory parallelism.
2015 IEEE International Symposium on Workload Characterization, 2015
We identify several factors that are critical to high-performance GPU graph analytics: efficient building block operators, synchronization and data movement, workload distribution and load balancing, and memory access patterns. We analyze the impact of these critical factors through three GPU graph analytic frameworks, Gunrock, MapGraph, and VertexAPI2. We also examine their effect on different workloads: four common graph primitives from multiple graph application domains, evaluated through real-world and synthetic graphs. We show that efficient building block operators enable more powerful operations for fast information propagation and result in fewer device kernel invocations, less data movement, and fewer global synchronizations, and thus are key focus areas for efficient large-scale graph analytics on the GPU.
Procedia Computer Science, 2015
The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a number of promising solutions to these challenges. The GraphBLAS standard (istcbigdata.org/GraphBlas) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. The GraphBLAS mathematically defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the GraphBLAS and describes how the GraphBLAS can be used to address many of the challenges associated with the analysis of graphs.
Given the growing importance of large-scale graph analytics, there is a need to improve the performance of graph analysis frameworks without compromising on productivity. GraphMat is our solution to bridge this gap between a user-friendly graph analytics framework and native, hand-optimized code. GraphMat functions by taking vertex programs and mapping them to high-performance sparse matrix operations in the backend. We get the productivity benefits of a vertex programming framework without sacrificing performance. GraphMat is in C++, and we have been able to write a diverse set of graph algorithms in this framework with the same effort as in other vertex programming frameworks. GraphMat performs 1.2-7X faster than high-performance frameworks such as GraphLab, CombBLAS and Galois. It achieves better multicore scalability (13-15X on 24 cores) than other frameworks and is 1.2X off native, hand-optimized code on a variety of different graph algorithms. Since GraphMat performance ...
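A rough sketch of the mapping GraphMat performs: a vertex program supplies "process message", "reduce", and "apply" callbacks, which play the role of the multiply and add in a generalized sparse matrix-vector product. The names and structure below are illustrative, not GraphMat's actual interface.

    #include <cstddef>
    #include <limits>
    #include <vector>

    struct Edge { int src, dst; double weight; };

    // Illustrative vertex program for single-source shortest paths; the callbacks correspond to
    // multiply (process an edge), add (reduce messages), and apply (update vertex state).
    struct SsspProgram {
        static double process(double edge_weight, double src_state) { return src_state + edge_weight; }
        static double reduce(double a, double b) { return a < b ? a : b; }
        static bool apply(double &dst_state, double incoming) {
            if (incoming < dst_state) { dst_state = incoming; return true; }  // vertex became active
            return false;
        }
    };

    // One backend iteration: a generalized SpMV over the edge list.
    template <typename Program>
    bool iterate(const std::vector<Edge> &edges, std::vector<double> &state)
    {
        const double inf = std::numeric_limits<double>::infinity();
        std::vector<double> msg(state.size(), inf);
        for (const Edge &e : edges)                       // "multiply" then "add"
            msg[e.dst] = Program::reduce(msg[e.dst], Program::process(e.weight, state[e.src]));
        bool any_active = false;
        for (std::size_t v = 0; v < state.size(); ++v)    // "apply"
            if (msg[v] < inf && Program::apply(state[v], msg[v])) any_active = true;
        return any_active;
    }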
1999
In this paper we present the Generic Graph Component Library (GGCL), a generic programming framework for graph data structures and graph algorithms. Following the theme of the Standard Template Library (STL), the graph algorithms in GGCL do not depend on the particular data structures upon which they operate, meaning a single algorithm can operate on arbitrary concrete representations of graphs. To attain this type of flexibility for graph data structures, which are more complicated than the containers in STL, we introduce several concepts to form the generic interface between the algorithms and the data structures, namely, Vertex, Edge, Visitor, and Decorator. We describe the principal abstractions comprising the GGCL, the algorithms and data structures that it provides, and provide examples that demonstrate the use of GGCL to implement some common graph algorithms. Performance results are presented which demonstrate that the use of novel lightweight implementation techniques and static polymorphism in GGCL results in code which is significantly more efficient than similar libraries written using the object-oriented paradigm.
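To make the Vertex/Visitor style of genericity concrete, here is a small sketch in the spirit of GGCL (and similar generic graph libraries): the traversal is written once against abstract graph and visitor interfaces, so the same algorithm works with any concrete representation. The interfaces below are simplified placeholders, not GGCL's actual concepts.

    #include <queue>
    #include <vector>

    // A minimal adjacency-list "graph concept": vertices are 0..n-1; neighbors(v) returns out-edges.
    struct AdjacencyList {
        std::vector<std::vector<int>> adj;
        int num_vertices() const { return static_cast<int>(adj.size()); }
        const std::vector<int> &neighbors(int v) const { return adj[v]; }
    };

    // Generic BFS written against the graph and visitor "concepts": any type providing
    // discover_vertex/examine_edge hooks and the graph interface above can be plugged in.
    template <typename Graph, typename Visitor>
    void breadth_first_search(const Graph &g, int source, Visitor vis)
    {
        std::vector<bool> visited(g.num_vertices(), false);
        std::queue<int> q;
        visited[source] = true;
        vis.discover_vertex(source);
        q.push(source);
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int v : g.neighbors(u)) {
                vis.examine_edge(u, v);
                if (!visited[v]) {
                    visited[v] = true;
                    vis.discover_vertex(v);
                    q.push(v);
                }
            }
        }
    }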
2010
Basic operations on graphs with millions of vertices are common in various applications, and faster execution of such operations is essential to reducing overall computation time. Today's graphics processing units (GPUs) offer high computation power at a low price. Using Nvidia's CUDA software interface, such a device can be treated as an array of Single Instruction Multiple Data (SIMD) processors. The massively multithreaded architecture of a CUDA device allows many threads to run in parallel, making optimum use of the GPU's available computation power. For graph algorithms, the vertices of the graph are processed in parallel by mapping them to threads on the device. By running thousands of threads in parallel, the computation time required for these algorithms is drastically decreased compared to their CPU implementations.
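A sketch of the thread-per-vertex mapping described above: each CUDA thread owns one vertex and, during a level-synchronous BFS step, relaxes that vertex's outgoing edges. The CSR layout and kernel below are a common pattern for this approach, not code from the paper.

    // Level-synchronous BFS step with one CUDA thread per vertex over a CSR graph.
    __global__ void bfs_level_step(const int *row_ptr, const int *col_idx, int *level,
                                   int current_level, int num_vertices, int *frontier_nonempty)
    {
        int v = blockIdx.x * blockDim.x + threadIdx.x;
        if (v >= num_vertices || level[v] != current_level) return;   // only frontier vertices work
        for (int e = row_ptr[v]; e < row_ptr[v + 1]; ++e) {
            int u = col_idx[e];
            if (level[u] == -1) {             // unvisited neighbor
                level[u] = current_level + 1; // benign race: all writers store the same value
                *frontier_nonempty = 1;
            }
        }
    }
    // Host side: launch once per level until *frontier_nonempty stays 0, e.g.
    //   bfs_level_step<<<(n + 255) / 256, 256>>>(row_ptr, col_idx, level, d, n, flag);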
Euro-Par 2018: Parallel Processing, 2018
The trend towards processor heterogeneity and distributed memory has significantly increased the complexity of parallel programming. In addition, the mix of applications that need to run on parallel platforms today is very diverse, and includes graph applications that typically have irregular memory accesses and unpredictable control-flow. To simplify the programming of graph applications on such platforms, we have implemented a compiler called Abelian that translates shared-memory descriptions of graph algorithms written in the Galois programming model into efficient code for distributed-memory platforms with heterogeneous processors. The compiler manages inter-device synchronization and communication while leveraging state-of-the-art compilers for generating device-specific code. The experimental results show that the novel communication optimizations in the Abelian compiler reduce the volume of communication by 23×, enabling the code produced by Abelian to match the performance of handwritten distributed CPU and GPU programs that use the same runtime. The programs produced by Abelian for distributed CPUs are roughly 2.4× faster than those in the Gemini system, a third-party distributed CPU-only system, demonstrating that Abelian can manage heterogeneity and distributed-memory successfully while generating high-performance code.
2017 IEEE High Performance Extreme Computing Conference (HPEC), 2017
The GraphBLAS C specification provisional release 1.0 is complete. To manage the scope of the project, we had to defer important functionality to a future version of the specification. For example, we are well aware that many algorithms benefit from an inspector-executor execution strategy. We also know that users would benefit from a number of standard predefined semirings as well as more general user-defined types. These and other features are described in this paper in the context of a future release of the GraphBLAS C API.
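For instance, until predefined semirings are standardized, a user of the provisional API builds them by hand from a monoid and a binary operator. The sketch below assembles the (min, +) "tropical" semiring used for shortest paths, assuming a conforming GraphBLAS C implementation.

    #include <GraphBLAS.h>
    #include <math.h>

    // Hand-built (min,+) semiring: the kind of object a future spec release could predefine.
    GrB_Info make_min_plus(GrB_Semiring *min_plus)
    {
        GrB_Monoid min_monoid;
        GrB_Info info = GrB_Monoid_new_FP64(&min_monoid, GrB_MIN_FP64, INFINITY);  // identity of min
        if (info != GrB_SUCCESS) return info;
        return GrB_Semiring_new(min_plus, min_monoid, GrB_PLUS_FP64);              // "multiply" is +
    }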
1999
by Lie-Quan Lee. In this thesis I present the Generic Graph Component Library (GGCL), a generic programming framework for graph data structures and graph algorithms. Following the theme of the Standard Template Library (STL), the graph algorithms in GGCL do not depend on the particular data structures upon which they operate, meaning a single algorithm can operate on arbitrary concrete representations of graphs. I describe the principal abstractions comprising the GGCL, the algorithms and data structures that it provides, and provide examples that demonstrate the use of GGCL to implement some common graph algorithms. Performance results are presented which demonstrate that the use of novel lightweight implementation techniques and static polymorphism in GGCL results in code which is significantly more efficient than similar libraries written using the object-oriented paradigm.
Large graphs involving millions of vertices are common in many practical applications and are challenging to process. Practical-time implementations using high-end computers have been reported but are accessible only to a few. Graphics Processing Units (GPUs) of today have high computation power and low price, but they have a restrictive programming model and are tricky to use. The G80 line of Nvidia GPUs can be treated as a SIMD processor array using the CUDA programming model. We present a few fundamental algorithms, including breadth-first search, single-source shortest path, and all-pairs shortest path, using CUDA on large graphs. We can compute the single-source shortest path on a 10 million vertex graph in 1.5 seconds using the Nvidia 8800GTX GPU costing $600. In some cases the optimal sequential algorithm is not the fastest on the GPU architecture. GPUs have great potential as high-performance co-processors.
2009 IEEE International Symposium on Parallel & Distributed Processing, 2009
2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2019
In 2013, we released a position paper to launch a community effort to define a common set of building blocks for constructing graph algorithms in the language of linear algebra. This led to the GraphBLAS. We released a specification for the C programming language binding to the GraphBLAS in 2017. Since that release, multiple libraries that conform to the GraphBLAS C specification have been produced. In this position paper, we launch the next phase of this ongoing community effort: a project to assemble a set of high-level graph algorithms built on top of the GraphBLAS. While many of these algorithms are well known with high-quality implementations available, they have not been assembled in one place and integrated with the GraphBLAS. We call this project the LAGraph graph algorithms project, and with this position paper we put out a call for collaborators to join us. While the initial goal is simply to assemble these algorithms into a single framework, the long-term goal is a library of production-worthy code, with the LAGraph library serving as an open source repository of verified graph algorithms that use the GraphBLAS.
2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017
Existing GPU graph analytics frameworks are typically built from specialized, bottom-up implementations of graph operators that are customized to graph computation. In this work we describe Mini-Gunrock, a lightweight graph analytics framework on the GPU. Unlike existing frameworks, Mini-Gunrock is built from graph operators implemented with generic transform-based data-parallel primitives. Using this method to bridge the gap between programmability and high performance for GPU graph analytics, we demonstrate operator performance on scale-free graphs with an average 1.5x speedup compared to Gunrock's corresponding operator performance. Mini-Gunrock's graph operators, optimizations, and applications code have 10x smaller code size and comparable overall performance vs. Gunrock.
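To illustrate what a transform-based, generic data-parallel primitive looks like in this style, the sketch below expresses a Gunrock-style "filter" operator (building the next frontier) as stream compaction with Thrust. It is an illustrative assumption about the approach, not Mini-Gunrock's code; the build_frontier name and the level-array convention are ours.

    #include <thrust/copy.h>
    #include <thrust/device_vector.h>
    #include <thrust/iterator/counting_iterator.h>

    // Predicate: a vertex belongs to the frontier if its BFS level equals the current depth.
    struct is_new_frontier {
        const int *level; int current;
        __host__ __device__ bool operator()(int v) const { return level[v] == current; }
    };

    // "Filter" operator expressed with a generic data-parallel primitive (stream compaction).
    thrust::device_vector<int> build_frontier(const thrust::device_vector<int> &level, int current)
    {
        int n = static_cast<int>(level.size());
        thrust::device_vector<int> frontier(n);
        auto end = thrust::copy_if(thrust::counting_iterator<int>(0),
                                   thrust::counting_iterator<int>(n),
                                   frontier.begin(),
                                   is_new_frontier{thrust::raw_pointer_cast(level.data()), current});
        frontier.resize(end - frontier.begin());
        return frontier;
    }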
2009
Graphics Processing Units (GPUs) provide high computation power at a low cost and are important compute accelerators with a massively multithreaded architecture. In this paper, we present fast implementations of common graph operations like breadth-first search, st-connectivity, single-source shortest path, all-pairs shortest path, minimum spanning tree, and maximum flow for undirected graphs on the GPU using the CUDA programming model. Our implementations exhibit high performance, especially on large graphs. We use two data-parallel programming methodologies for these algorithms. One is an iterative, mask-based approach that processes valid data elements like vertices and edges using independent threads for each. The other is a divide-and-conquer approach that reduces the problem into smaller problems that are handled later using the same approach. Parallel algorithms for such problems have been reported in the literature before, especially on supercomputers. The massively mul...
The increasing scale and wealth of interconnected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable knowledge from large-scale graphs. However, large real-world graphs are famously difficult to process efficiently. Not only do they have a large memory footprint, but most graph algorithms also entail memory access patterns with poor locality, data-dependent parallelism, and a low compute-to-memory access ratio. To complicate matters further, most real-world graphs have a highly heterogeneous node degree distribution, hence partitioning these graphs for parallel processing while simultaneously achieving access locality and load balancing is difficult if not impossible.
Field-Programmable Custom Computing Machines, 2006
Many important applications are organized around long-lived, irregular sparse graphs (e.g., data and knowledge bases, CAD optimization, numerical problems, simulations). The graph structures are large, and the applications need regular access to a large, data-dependent portion of the graph for each operation (e.g., the algorithm may need to walk the graph, visiting all nodes, or propagate changes through many nodes in the graph). On conventional microprocessors, the graph structures exceed on-chip cache capacities, making main-memory bandwidth and latency the key performance limiters. To avoid this "memory wall," we introduce a concurrent system architecture for sparse graph algorithms that places graph nodes in small distributed memories paired with specialized graph processing nodes interconnected by a lightweight network. This gives us a scalable way to map these applications so that they can exploit the high-bandwidth and low-latency capabilities of embedded memories (e.g., FPGA Block RAMs). On typical spreading-activation queries on the ConceptNet Knowledge Base, a sample application, this translates into an order-of-magnitude speedup per FPGA compared to a state-of-the-art Pentium processor.
2016
The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. Mathematically, the GraphBLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the mathematics of the GraphBLAS. Graphs represent connections between vertices with edges. Matrices can represent a wide range of graphs using adjacency matrices or incidence matrices. Adjacency matrices are often easier to analyze while incidence matrices are often better for representing data. Fortunately, the two are easily connected by matrix multiplication. A key feature of matrix mathematics is that a very small number of matrix operations can be used to manipulate a very wide range of graphs. This composability of a small number of operations is the foundation of the GraphBLAS. A standard such as the GraphBLAS can only be effective if it has low performance overhead. Performance measurements of prototype GraphBLAS implementations indicate that the overhead is low.
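As a concrete statement of that connection (a standard identity; the notation E_out and E_in for the out- and in-incidence matrices is ours): if E_out has a 1 in row k, column i when edge k leaves vertex i, and E_in has a 1 in row k, column j when edge k enters vertex j, then the adjacency matrix is recovered by a single matrix product:

    A \;=\; E_{\mathrm{out}}^{\mathsf{T}} E_{\mathrm{in}},
    \qquad
    A(i,j) \;=\; \sum_{k} E_{\mathrm{out}}(k,i)\, E_{\mathrm{in}}(k,j)
           \;=\; \#\{\text{edges from vertex } i \text{ to vertex } j\}.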
2020 Proceedings of the SIAM Workshop on Combinatorial Scientific Computing, 2020
SuiteSparse:GraphBLAS is a complete implementation of the GraphBLAS standard. It provides a powerful and expressive framework for creating graph algorithms based on the elegant mathematics of sparse matrix operations on a semiring. Algorithms written with the GraphBLAS achieve high performance with minimal development time. Multithreaded parallelism through OpenMP provides additional speedup, which we illustrate on a 20-core Intel Xeon E5-2698 CPU system when solving various problems (triangle counting, k-truss, breadth-first search, Bellman-Ford, local clustering coefficient, and a sparse deep neural network problem). This wide variety of algorithms illustrates the expressiveness of the GraphBLAS API to create new graph algorithms. We present performance results with these algorithms on a set of large real-world graphs, using the newly developed SuiteSparse:GraphBLAS v3.0.1.
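To illustrate the expressiveness claim with one of the listed problems, triangle counting has a compact algebraic formulation (a standard identity, independent of this implementation): for a simple undirected graph with adjacency matrix A and strictly lower-triangular part L,

    \mathrm{triangles} \;=\; \tfrac{1}{6} \sum_{i,j} \bigl(A^{2} \circ A\bigr)_{ij}
                       \;=\; \sum_{i,j} \bigl(L L \circ L\bigr)_{ij},
    \qquad \circ \ \text{denoting the elementwise (Hadamard) product.}

In GraphBLAS terms, the elementwise product with L becomes the mask of a single matrix-multiply, and the outer sum is a reduction over the result.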