Welcome to PyCUDA’s documentation!

PyCUDA gives you easy, Pythonic access to Nvidia’s CUDA parallel computation API. Several wrappers of the CUDA API already exist, so why the need for PyCUDA?

Here’s an example to give you an impression:

import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as drv
import numpy

from pycuda.compiler import SourceModule

# Compile the CUDA C source at run time.
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
  // One thread per array element.
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)

dest = numpy.zeros_like(a)

# drv.In copies an array to the GPU before the launch;
# drv.Out copies the result back afterwards.
multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1), grid=(1,1))

print(dest - a*b)

(This example is examples/hello_gpu.py in the PyCUDA source distribution.)
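To make explicit what each GPU thread computes, here is a CPU-only sketch of the same kernel. It is not part of PyCUDA; the function name multiply_them_cpu and its block_size parameter are made up for illustration. Each loop iteration stands in for one thread of the 400-thread block, with the loop counter playing the role of threadIdx.x.

```python
import numpy


def multiply_them_cpu(dest, a, b, block_size):
    """Hypothetical CPU stand-in for the multiply_them kernel."""
    # On the GPU these iterations run as parallel threads;
    # here they simply run one after another.
    for thread_idx_x in range(block_size):
        i = thread_idx_x  # mirrors `const int i = threadIdx.x;`
        dest[i] = a[i] * b[i]


a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)
dest = numpy.zeros_like(a)

# block_size=400 corresponds to block=(400,1,1) in the GPU launch.
multiply_them_cpu(dest, a, b, block_size=400)

# Largest deviation from the vectorized NumPy product.
print(numpy.abs(dest - a * b).max())
```

The GPU version does the same element-wise work, but with all 400 multiplications assigned to separate threads of a single block.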

On the surface, this program prints a screenful of zeros. Behind the scenes, a lot more is going on.
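One of the things hidden behind drv.In and drv.Out is the movement of data between host and GPU memory. The following is a minimal sketch of the same computation with those transfers written out by hand, using drv.mem_alloc, drv.memcpy_htod and drv.memcpy_dtoh. The wrapper function multiply_explicit and the constant KERNEL_SOURCE are names invented for this illustration, and the single-block launch assumes the arrays are small enough to fit in one thread block.

```python
import numpy

KERNEL_SOURCE = """
__global__ void multiply_them(float *dest, float *a, float *b)
{
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
"""


def multiply_explicit(a, b):
    """Hypothetical helper: multiply two small float32 arrays on the GPU."""
    # Imported here so that merely defining this function does not
    # require a GPU; importing pycuda.autoinit creates a CUDA context.
    import pycuda.autoinit  # noqa: F401
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    multiply_them = SourceModule(KERNEL_SOURCE).get_function("multiply_them")

    # Allocate device memory and copy the inputs to the GPU.
    a_gpu = drv.mem_alloc(a.nbytes)
    b_gpu = drv.mem_alloc(b.nbytes)
    dest_gpu = drv.mem_alloc(a.nbytes)
    drv.memcpy_htod(a_gpu, a)
    drv.memcpy_htod(b_gpu, b)

    # One block with one thread per element (assumes a small array).
    multiply_them(dest_gpu, a_gpu, b_gpu, block=(a.size, 1, 1), grid=(1, 1))

    # Copy the result back into a host array.
    dest = numpy.empty_like(a)
    drv.memcpy_dtoh(dest, dest_gpu)
    return dest
```

On a machine with a CUDA-capable GPU, calling multiply_explicit(a, b) with the two 400-element float32 arrays from the example should return the same values as a*b.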

Curious? Let’s get started.

Contents

Note that this guide will not explain CUDA programming and technology. Please refer to Nvidia’s programming documentation for that.

PyCUDA also has its own web site, where you can find updates, new versions, documentation, and support.
