Welcome to PyOpenCL’s documentation!

PyOpenCL gives you easy, Pythonic access to the OpenCL parallel computation API. What makes PyOpenCL special?

  • Object cleanup tied to lifetime of objects. This idiom, often called RAII in C++, makes it much easier to write correct, leak- and crash-free code.

  • Completeness. PyOpenCL puts the full power of OpenCL’s API at your disposal, if you wish. Every obscure get_info() query and all CL calls are accessible.

  • Automatic Error Checking. All errors are automatically translated into Python exceptions.

  • Speed. PyOpenCL’s base layer is written in C++, so the conveniences above add very little overhead.

  • Helpful Documentation. You’re looking at it. ;)

  • Liberal license. PyOpenCL is open-source under the MIT license and free for commercial, academic, and private use.
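As a rough illustration of the completeness and error-checking points above, the sketch below lists the available devices through `get_info()` queries and catches a failing CL call as an ordinary Python exception. The helper name `describe_devices` is hypothetical and not part of PyOpenCL, and the lazy import exists only so that the sketch also runs on machines without PyOpenCL installed.

```python
def describe_devices():
    """Return one summary line per OpenCL device, or a short notice if none is usable.

    Hypothetical helper for illustration; it is not part of PyOpenCL.
    """
    try:
        import pyopencl as cl
    except ImportError:
        # Assumption: the sketch should still run where PyOpenCL is missing.
        return ["pyopencl is not installed"]

    lines = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                # get_info() takes a constant from the matching *_info class.
                name = device.get_info(cl.device_info.NAME)
                units = device.get_info(cl.device_info.MAX_COMPUTE_UNITS)
                lines.append(f"{platform.name}: {name} ({units} compute units)")
    except cl.Error as exc:
        # A failing CL call surfaces as a Python exception, not a status code.
        lines.append(f"OpenCL error: {exc}")

    return lines or ["no OpenCL devices found"]


for line in describe_devices():
    print(line)
```

The same information is also reachable through lower-case attributes such as `device.name`, which is how `platform.name` is read in the sketch.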

Here’s an example, to give you an impression:

#!/usr/bin/env python

import numpy as np

import pyopencl as cl


rng = np.random.default_rng()
a_np = rng.random(50000, dtype=np.float32)
b_np = rng.random(50000, dtype=np.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

prg = cl.Program(ctx, """
__kernel void sum(
    __global const float *a_g, __global const float *b_g, __global float *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()

res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
knl = prg.sum  # Use this Kernel object for repeated calls
knl(queue, a_np.shape, None, a_g, b_g, res_g)

res_np = np.empty_like(a_np)
cl.enqueue_copy(queue, res_np, res_g)

# Check on CPU with Numpy:
error_np = res_np - (a_np + b_np)
print(f"Error:\n{error_np}")
print(f"Norm: {np.linalg.norm(error_np):.16e}")
assert np.allclose(res_np, a_np + b_np)

(You can find this example as examples/demo.py in the PyOpenCL source distribution.)
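For comparison, the same vector sum can probably be written more compactly with the `pyopencl.array` module, which wraps device buffers in NumPy-like array objects. This is a minimal sketch and not the code from examples/demo.py. The function name `add_on_device_or_host` is hypothetical, and the fallback to plain NumPy when PyOpenCL or a usable device is missing is an assumption added so the snippet runs anywhere.

```python
import numpy as np

try:
    import pyopencl as cl
    import pyopencl.array as cl_array
except ImportError:
    # Assumption: fall back to NumPy when PyOpenCL is not installed.
    cl = None


def add_on_device_or_host(a_np, b_np):
    """Add two arrays on an OpenCL device if one is usable, else on the host.

    Hypothetical helper for illustration; it is not part of PyOpenCL.
    """
    if cl is None:
        return a_np + b_np

    try:
        ctx = cl.create_some_context(interactive=False)
        queue = cl.CommandQueue(ctx)
    except (cl.Error, RuntimeError):
        # No platform or device available: compute on the host instead.
        return a_np + b_np

    # to_device() copies host data into a device-side Array.
    a_dev = cl_array.to_device(queue, a_np)
    b_dev = cl_array.to_device(queue, b_np)

    # The "+" runs an element-wise kernel on the device; get() copies back.
    return (a_dev + b_dev).get()


rng = np.random.default_rng()
a_np = rng.random(50000, dtype=np.float32)
b_np = rng.random(50000, dtype=np.float32)

res_np = add_on_device_or_host(a_np, b_np)
assert np.allclose(res_np, a_np + b_np)
```

Writing the kernel by hand, as in the demo above, gives full control over the device code, while the array interface trades that control for brevity.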

Tutorials

Software that works with or enhances PyOpenCL

  • Jon Roose’s pyclblas (code) makes BLAS in the form of clBLAS available from within pyopencl code.

    Two earlier wrappers continue to be available: one by Eric Hunsberger and one by Lars Ericson.

  • Cedric Nugteren provides a wrapper for the CLBlast OpenCL BLAS library: PyCLBlast.

  • Gregor Thalhammer’s gpyfft provides a Python wrapper for the OpenCL FFT library clFFT from AMD.

  • Bogdan Opanchuk’s reikna offers a variety of GPU-based algorithms (FFT, random number generation, matrix multiplication) designed to work with pyopencl.array.Array objects.

  • Troels Henriksen, Ken Friis Larsen, and Cosmin Oancea’s Futhark programming language offers a nice way to code nested-parallel programs with reductions and scans on data in pyopencl.array.Array instances.

  • Robbert Harms and Alard Roebroeck’s MOT offers a variety of GPU-enabled non-linear optimization algorithms and MCMC sampling routines for parallel optimization and sampling of multiple problems.

  • Vincent Favre-Nicolin’s pyvkfft makes vkfft accessible from PyOpenCL.

If you know of a piece of software that you feel should be on this list, please let me know, or, even better, send a patch!

Contents

Note that this guide does not explain OpenCL programming and technology. Please refer to the official Khronos OpenCL documentation for that.

PyOpenCL also has its own web site, where you can find updates, new versions, documentation, and support.

Indices and tables