PyCUDA Guide for Scientists

Short tutorial, which I gave during Advanced School on High Performance and Grid Computing in Abdus Salam International Center for Theoretical Physics (11-22 April 2011).

Uploaded by PhtRaveller
Copyright: © Attribution Non-Commercial (BY-NC)

What is PyCUDA?

Why Python PyCUDA

PyCUDA: short tutorial

Glib Ivashkevych

A.I. Akhiezer Institute of Theoretical Physics, NSC KIPT


Kharkov, Ukraine

Glib Ivashkevych PyCUDA: short tutorial



What is PyCUDA?

A simple way to compute on the GPU from Python:

- complete access to CUDA features
- automatic resource management
- error checking and reporting
- high-level abstractions, for example GPUArray
- integration with NumPy
- documentation


But before PyCUDA: why Python at all?

- general purpose
- interpreted
- simple to learn and use
- extensible and embeddable: Python C API
- science oriented too: NumPy, SciPy, SymPy, mpi4py, Matplotlib
- very well documented

NumPy
- flexible and efficient array creation and manipulation
- FFTs, signal processing, efficient I/O and more

SciPy
- ODEs, special functions, linear algebra, root finding and more


Python goodies for scientific computing: NumPy arrays

Example:

>>> import numpy as np
>>> a = np.arange(1., 2., 0.1)
>>> a
array([1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9])
>>> a[0:3]
array([1. , 1.1, 1.2])
>>> a[::3]
array([1. , 1.3, 1.6, 1.9])
>>> a[a > 1.45]
array([1.5, 1.6, 1.7, 1.8, 1.9])


Python goodies for scientific computing: SciPy

How could we use C/C++ from Python?


- Python C API – the hard way
- SWIG, Boost::Python – simpler, but not simple enough
- scipy.weave.inline – just pass your C code as a string

Example:
>>> import numpy as np
>>> from scipy.weave import inline  # Python 2 only; weave later became a standalone package
>>> b = np.ndarray(shape=(10,), dtype=float); b.fill(1.)
>>> b
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
>>> inum = b.size
>>> C_code = 'for (int i=0; i<inum; i++) b[i] *= 2.;'
>>> inline(C_code, ['b', 'inum'])
>>> b
array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])
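
scipy.weave targets Python 2. As a rough present-day illustration of the same idea (handing a NumPy buffer to compiled C code), the sketch below uses the standard-library ctypes module instead of weave. It is not from the original slides; it assumes a POSIX system where the C library can be loaded, and the array names src and dst are illustrative.

```python
import ctypes
import ctypes.util

import numpy as np

# Load the C standard library. Assumption: POSIX system; if the library
# cannot be located by name, CDLL(None) exposes the symbols already linked
# into the running interpreter, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare memcpy's C signature so ctypes converts the arguments correctly.
libc.memcpy.restype = ctypes.c_void_p
libc.memcpy.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]

src = np.full(10, 2.0)     # ten float64 values, all equal to 2.0
dst = np.zeros_like(src)   # destination buffer of the same size and type

# ndarray.ctypes.data is the address of the array's first element, so the
# C function reads and writes the NumPy buffers directly, without copies
# into Python objects.
libc.memcpy(dst.ctypes.data, src.ctypes.data, src.nbytes)

print(dst)
```

The point is the same as in the weave example: the C side only needs a pointer and a size, and NumPy arrays can provide both.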


Why is it so simple to use C/C++ from Python?


Python has C under the hood
- Python objects are in fact C structs (roughly speaking)
- ... and NumPy arrays too

Look under the hood: NumPy C API

typedef struct PyArrayObject {
    PyObject_HEAD
    char *data;
    int nd;
    npy_intp *dimensions;
    npy_intp *strides;
    PyObject *base;
    PyArray_Descr *descr;
    int flags;
    PyObject *weakreflist;
} PyArrayObject;
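
Most of these C fields are visible from Python as array attributes. A minimal sketch, not from the original slides (the arrays a and row are illustrative):

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.float32)  # owns its own memory

print(a.ndim)         # nd: number of dimensions
print(a.shape)        # dimensions: extent along each axis
print(a.strides)      # strides: bytes to step along each axis
print(a.dtype)        # descr: element type descriptor
print(a.ctypes.data)  # data: address of the first element
print(a.base)         # base: None, because a owns its memory

# A slice is a new array object that points into the same buffer.
row = a[1]
print(row.base is a)                     # the view keeps a reference to a
print(row.ctypes.data - a.ctypes.data)   # offset equals one row stride
```

Because the data pointer, shape and strides fully describe the memory layout, C or CUDA code can work on the buffer without knowing anything else about Python.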


The same simplicity with CUDA in Python? Yes!

Image courtesy: Andreas Klöckner.



Concept behind PyCUDA: code is not a compile-time constant → metaprogramming

Image courtesy: Andreas Klöckner.



Metaprogramming = runtime generation of code from templates or code snippets

Example 1: simple metaprogramming (simple_mprog.py)

...
from jinja2 import Template
...
module_tpl = Template("""
__global__ void
kernel({{ type }} *a, {{ type }} *b, {{ type }} *res)
{
    int idx = threadIdx.x + {{ tperblk }} * blockIdx.x;
    res[idx] = a[idx] {{ op }} b[idx];
}
""")
...
module_code = module_tpl.render(type="float",
                                op="+", tperblk=k)
module = pycuda.compiler.SourceModule(module_code)
kernel_func = module.get_function("kernel")
...
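
The template step can be tried without a GPU or Jinja2. The sketch below is not the tutorial's code: it swaps Jinja2 for the standard-library string.Template (so placeholders are written ${name} rather than {{ name }}), and the helper make_kernel_source is a hypothetical name introduced for illustration. It only generates CUDA source text; compiling it would still go through pycuda.compiler.SourceModule as in Example 1.

```python
from string import Template

# CUDA kernel source with placeholders for the element type, the operator
# and the number of threads per block.
kernel_tpl = Template("""
__global__ void kernel(${type} *a, ${type} *b, ${type} *res)
{
    int idx = threadIdx.x + ${tperblk} * blockIdx.x;
    res[idx] = a[idx] ${op} b[idx];
}
""")


def make_kernel_source(ctype, op, tperblk):
    """Return CUDA C source specialised for one type/operator/block size."""
    return kernel_tpl.substitute(type=ctype, op=op, tperblk=tperblk)


# Two different kernels generated at run time from the same template.
add_src = make_kernel_source("float", "+", 256)
mul_src = make_kernel_source("double", "*", 128)

print(add_src)
print(mul_src)
```

Since the source is an ordinary Python string, the choice of type, operator or block size can depend on run-time information such as the array dtype or the device in use.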

PyCUDA abstractions: GPUArray

- you could do GPU programming that way (kinda C style) or ...
- ... make use of high-level PyCUDA abstractions

Example 2: GPUArray class (add_arrays.py)

import numpy as np
import pycuda.autoinit
import pycuda.gpuarray as garray

# Two host arrays whose element-wise sum is constant
a = np.arange(0., 128.*128., 1., dtype=np.float32)
b = np.arange(128.*128., 0., -1., dtype=np.float32)

# Copy to the GPU, add there, copy the result back
a_gpu = garray.to_gpu(a)
b_gpu = garray.to_gpu(b)
c = (a_gpu + b_gpu).get()
print(np.amax(c - a - b))


PyCUDA abstractions: GPUArray

Example 3: GPUArray class (double_array.py)

...
a = np.ndarray(shape=(128, 128), dtype=np.float32)
a.fill(1.)
a_gpu = gpuarray.to_gpu(a)
b = (2 * a_gpu).get()
print(np.amax(b - 2*a))

GPUArray is handy: no need to ...
- ... allocate and free memory
- ... copy data between host and device
- ... write kernels for (at least) simple operations
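
For comparison, the three steps in the list above can be written out by hand with the PyCUDA driver interface. This is a sketch rather than the tutorial's code: the kernel name double_it, the helper double_by_hand and the block size of 256 threads are choices made here, and the NumPy fallback is added only so that the script also runs on a machine without PyCUDA or a GPU.

```python
import numpy as np

try:
    import pycuda.autoinit  # creates a CUDA context on import
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule
    HAVE_CUDA = True
except Exception:  # PyCUDA not installed, or no usable GPU
    HAVE_CUDA = False

KERNEL_SRC = """
__global__ void double_it(float *a, int n)
{
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    if (idx < n)
        a[idx] *= 2.0f;
}
"""


def double_by_hand(a):
    """Return 2*a, managing device memory and the kernel explicitly."""
    a = np.ascontiguousarray(a, dtype=np.float32)
    if not HAVE_CUDA:
        return 2 * a  # CPU fallback (assumption: keeps the sketch runnable)

    func = SourceModule(KERNEL_SRC).get_function("double_it")

    a_gpu = cuda.mem_alloc(a.nbytes)   # 1. allocate device memory
    cuda.memcpy_htod(a_gpu, a)         # 2. copy host -> device

    threads = 256
    blocks = (a.size + threads - 1) // threads
    func(a_gpu, np.int32(a.size),      # 3. launch a hand-written kernel
         block=(threads, 1, 1), grid=(blocks, 1))

    result = np.empty_like(a)
    cuda.memcpy_dtoh(result, a_gpu)    # copy device -> host
    a_gpu.free()                       # release device memory
    return result


a = np.ones((128, 128), dtype=np.float32)
b = double_by_hand(a)
print(np.amax(np.abs(b - 2 * a)))
```

With GPUArray the whole function body collapses to (2 * gpuarray.to_gpu(a)).get().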


PyCUDA abstractions: ReductionKernel


Example: $I = \int_{-\infty}^{\infty} e^{-x^2}\,dx$

Example 4: ReductionKernel (gaussian_integral.py)


...
GaussianInt = RednKer(np.float32, neutral="0.",
                      reduce_expr="a+b", map_expr="x*y[i]",
                      arguments="float x, float *y")
...

ReductionKernel for ...
- ... scalar products
- ... integrals
- ... even n-body and more
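
What a reduction kernel computes can be illustrated on the CPU. The sketch below is a NumPy analogue, not PyCUDA code, of the map and reduce expressions from Example 4. It assumes that the scalar x is the grid step and that y holds samples of the integrand; the elided parts of the example do not confirm this, and the grid limits and point count are choices made here.

```python
import numpy as np

# Sample exp(-x^2) on a uniform grid. The integrand is negligible outside
# [-10, 10], so truncating the infinite range there is harmless.
n = 20001
xs = np.linspace(-10.0, 10.0, n)
dx = xs[1] - xs[0]
y = np.exp(-xs**2)

# Map step: apply the expression "x*y[i]" to every element (x is dx here).
mapped = dx * y

# Reduce step: combine the mapped values pairwise with "a+b",
# starting from the neutral element "0.".
integral = mapped.sum()

print(integral, np.sqrt(np.pi))  # the exact value of I is sqrt(pi)
```

On the GPU the map step runs element-wise in parallel and the reduce step is done as a tree of partial sums, but the result is defined by the same two expressions.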


And even more:

- handy GPUArray creation routines (mimicking NumPy)
- pycuda.cumath, ElementwiseKernel and prefix sum
- FFT: PyFFT, designed to work with GPUArrays, by Bogdan Opanchuk
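
A short sketch of the first two items, not taken from the original slides: pycuda.cumath.sin applied to a GPUArray, gpuarray.empty_like as a NumPy-style creation routine, and an ElementwiseKernel for a linear combination. The function name sin_and_lincomb and the coefficients 2 and 3 are illustrative, and the NumPy fallback is there only so that the script runs without PyCUDA or a GPU.

```python
import numpy as np

try:
    import pycuda.autoinit  # creates a CUDA context on import
    import pycuda.gpuarray as gpuarray
    import pycuda.cumath as cumath
    from pycuda.elementwise import ElementwiseKernel
    HAVE_CUDA = True
except Exception:  # PyCUDA not installed, or no usable GPU
    HAVE_CUDA = False


def sin_and_lincomb(n=1024):
    """Return sin(x) and 2*x + 3*sin(x) for n points x in [0, 1]."""
    x = np.linspace(0.0, 1.0, n).astype(np.float32)
    if not HAVE_CUDA:
        s = np.sin(x)
        return s, 2 * x + 3 * s  # CPU fallback (assumption, see lead-in)

    x_gpu = gpuarray.to_gpu(x)
    s_gpu = cumath.sin(x_gpu)           # element-wise math on the GPU
    z_gpu = gpuarray.empty_like(x_gpu)  # NumPy-style creation routine

    # Only the per-element expression is written by hand; PyCUDA generates
    # the surrounding kernel and the index variable i.
    lin_comb = ElementwiseKernel(
        "float a, float *x, float b, float *y, float *z",
        "z[i] = a*x[i] + b*y[i]",
        "lin_comb")
    lin_comb(2, x_gpu, 3, s_gpu, z_gpu)

    return s_gpu.get(), z_gpu.get()


s, z = sin_and_lincomb()
print(z[:5])
```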

PyOpenCL
Pretty much the same in concept, but for OpenCL: platform independent.


Links:

- PyCUDA documentation: http://documen.tician.de/pycuda/
- GTC 2010 presentations archive: http://www.nvidia.com/object/gtc2010-presentation-archive.html
- PASI screencasts (including 4 lectures on OpenCL & PyOpenCL by Andreas): http://www.bu.edu/pasi/materials/
- Scientific and numerical packages for Python: http://wiki.python.org/moin/NumericAndScientific


Thanks

To our directors for the opportunity to present this tutorial


To Andreas Klöckner for PyCUDA, useful hints and images
