Multi-dimensional arrays

The functionality in this module provides something of a work-alike for numpy arrays, but with all operations executed on the CL compute device.

Data Types

PyOpenCL provides some amount of integration between the numpy type system, as represented by numpy.dtype, and the types available in OpenCL. All the simple scalar types map straightforwardly to their CL counterparts.

Vector Types

class pyopencl.array.vec

All of OpenCL’s supported vector types, such as float3 and long4, are available as numpy data types within this class. These numpy.dtype instances have field names of x, y, z, and w just like their OpenCL counterparts. They work both for passing parameters to kernels and for passing data back and forth between kernels and Python code. For each type, a make_type function is also provided (e.g. make_float3(x, y, z)).

If you want to construct a pre-initialized vector type you have three new functions to choose from:

  • zeros_type()

  • ones_type()

  • filled_type(fill_value)

Added in version 2014.1.

Changed in version 2014.1: The make_type functions have a default value (0) for each component. Relying on the default values has been deprecated. Either specify all components or use one of the new flavors mentioned above for constructing a vector.

Custom data types

If you would like to use your own (struct/union/whatever) data types in array operations where you supply operation source code, define those types in the preamble passed to pyopencl.elementwise.ElementwiseKernel, pyopencl.reduction.ReductionKernel (or similar), and let PyOpenCL know about them using this function:

pyopencl.tools.get_or_register_dtype(c_names, dtype=None)

Get or register a numpy.dtype associated with the C type names in the string list c_names. If dtype is None, no registration is performed, and the numpy.dtype must already have been registered. If so, it is returned. If not, TypeNameNotKnown is raised.

If dtype is not None, registration is attempted. If the c_names are already known and registered to identical numpy.dtype objects, then the dtype object of the previously registered type is returned. If the c_names are not yet known, the type is registered. If one of the c_names is known but registered to a different type, an error is raised. In this latter case, the type may end up partially registered and any further behavior is undefined.

Added in version 2012.2.

exception pyopencl.tools.TypeNameNotKnown

Added in version 2013.1.

pyopencl.tools.register_dtype(dtype, name)

Changed in version 2013.1: This function has been deprecated. It is recommended that you develop against the new interface, get_or_register_dtype().

pyopencl.tools.dtype_to_ctype(dtype)

Returns a C name registered for dtype.

This function helps with producing C/OpenCL declarations for structured numpy.dtype instances:

pyopencl.tools.match_dtype_to_c_struct(device, name, dtype, context=None)

Return a tuple (dtype, c_decl) such that the C struct declaration in c_decl and the structure numpy.dtype instance dtype have the same memory layout.

Note that dtype may be modified from the value that was passed in, for example to insert padding.

(As a remark on implementation, this routine runs a small kernel on the given device to ensure that numpy and C offsets and sizes match.)

Added in version 2013.1.
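To see why padding may need to be inserted, the following host-only sketch uses the standard library's ctypes module to lay out a hypothetical struct (IdVal, a 1-byte id followed by a 4-byte float; not taken from PyOpenCL). It does not call PyOpenCL and only illustrates how a C compiler's alignment rules can make a struct larger than the sum of its fields. The exact numbers depend on the platform.

```python
import ctypes


class IdVal(ctypes.Structure):
    # Hypothetical struct for illustration: a 1-byte id, then a 4-byte float.
    _fields_ = [("id", ctypes.c_uint8), ("value", ctypes.c_float)]


# Size if the fields were stored back to back with no padding.
packed_size = ctypes.sizeof(ctypes.c_uint8) + ctypes.sizeof(ctypes.c_float)

# Offset and size as laid out by the platform's C alignment rules.
value_offset = IdVal.value.offset
struct_size = ctypes.sizeof(IdVal)

print("packed size:", packed_size)
print("offset of value:", value_offset)
print("struct size:", struct_size)
```

If the struct size printed here is larger than the packed size, a tightly packed numpy.dtype with the same fields would not match the C layout, which is the situation that the returned, possibly modified dtype is meant to resolve.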

This example explains the use of this function:

>>> import numpy as np
>>> import pyopencl as cl
>>> import pyopencl.tools
>>> ctx = cl.create_some_context()
>>> dtype = np.dtype([("id", np.uint32), ("value", np.float32)])
>>> dtype, c_decl = pyopencl.tools.match_dtype_to_c_struct(
...     ctx.devices[0], 'id_val', dtype)
>>> print(c_decl)
typedef struct {
  unsigned id;
  float value;
} id_val;
>>> print(dtype)
[('id', '<u4'), ('value', '<f4')]
>>> cl.tools.get_or_register_dtype('id_val', dtype)

As this example shows, it is important to call get_or_register_dtype() on the modified dtype returned by this function, not the original one.

A more complete example of how to use custom structured types can be found in examples/demo-struct-reduce.py in the PyOpenCL distribution.

Complex Numbers

PyOpenCL’s Array type supports complex numbers out of the box, by simply using the corresponding numpy types.

If you would like to use this support in your own kernels, here’s how to proceed: OpenCL 1.2 (and earlier) does not specify native complex number support, so PyOpenCL works around that deficiency. By saying:

#include <pyopencl-complex.h>

in your kernel, you get complex types cfloat_t and cdouble_t, along with functions defined on them such as cfloat_mul(a, b) or cdouble_log(z). Elementwise kernels automatically include the header if your kernel has complex input or output. See the source file for a precise list of what’s available.

If you need double precision support, please:

#define PYOPENCL_DEFINE_CDOUBLE

before including the header, as DP support apparently cannot be reliably autodetected.

Under the hood, the complex types are struct types as defined in the header. Ideally, you should only access the structs through the provided functions, never directly.

Added in version 2012.1.

Changed in version 2015.2: [INCOMPATIBLE] Changed PyOpenCL’s complex numbers from float2 and double2 OpenCL vector types to a custom struct. This was changed because the vector types very easily introduced bugs where

  • complex*complex

  • real+complex

look like they may do the right thing, but silently do the wrong thing.
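The pitfall can be modeled on the host in plain Python. The sketch below represents a complex number as a (re, im) pair and compares the componentwise arithmetic that OpenCL applies to vector types with the intended complex arithmetic. All four helper names are hypothetical and exist only for this illustration.

```python
def float2_mul(a, b):
    # Componentwise product, which is what * means for OpenCL vector types.
    return (a[0] * b[0], a[1] * b[1])


def complex_mul(a, b):
    # Complex product of two (re, im) pairs.
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def float2_add_scalar(s, a):
    # A scalar operand is widened to every component of the vector.
    return (s + a[0], s + a[1])


def complex_add_real(s, a):
    # Adding a real number only affects the real part.
    return (s + a[0], a[1])


a = (1.0, 2.0)  # 1 + 2i
b = (3.0, 4.0)  # 3 + 4i

print(float2_mul(a, b), complex_mul(a, b))
print(float2_add_scalar(1.0, b), complex_add_real(1.0, b))
```

Both vector expressions compile and run without complaint, yet they yield (3.0, 8.0) instead of (-5.0, 10.0) and (4.0, 5.0) instead of (4.0, 4.0).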

The Array Class

class pyopencl.array.Array(cq: Context | CommandQueue | None, shape: Tuple[int, ...] | int, dtype: Any, order: str = 'C', allocator: AllocatorBase | None = None, data: Any = None, offset: int = 0, strides: Tuple[int, ...] | None = None, events: List[Event] | None = None, _flags: Any = None, _fast: bool = False, _size: int | None = None, _context: Context | None = None, _queue: CommandQueue | None = None)

A numpy.ndarray work-alike that stores its data and performs its computations on the compute device. shape and dtype work exactly as in numpy. Arithmetic methods in Array support the broadcasting of scalars (e.g. array + 5).

cq must be a CommandQueue or a Context.

If it is a queue, cq specifies the queue in which the array carries out its computations by default. If a default queue (and thereby overloaded operators and many other niceties) is not desired, pass a Context.

allocator may be None or a callable that, upon being called with an argument of the number of bytes to be allocated, returns a pyopencl.Buffer object. (A pyopencl.tools.MemoryPool instance is one useful example of an object to pass here.)

Changed in version 2011.1: Renamed context to cqa, made it general-purpose.

All arguments beyond order should be considered keyword-only.

Changed in version 2015.2: Renamed context to cq, disallowed passing allocators through it.

data

The pyopencl.MemoryObject instance created for the memory that backs this Array.

Changed in version 2013.1: If a non-zero offset has been specified for this array, this will fail with ArrayHasOffsetError.

base_data

The pyopencl.MemoryObject instance created for the memory that backs this Array. Unlike data, the base address of base_data is allowed to be different from the beginning of the array. The actual beginning is the base address of base_data plus offset bytes.

Unlike data, retrieving base_data always succeeds.

Added in version 2013.1.

offset

See base_data.

Added in version 2013.1.

shape

A tuple of lengths of each dimension in the array.

ndim

The number of dimensions in shape.

dtype

The numpy.dtype of the items in the GPU array.

size

The number of meaningful entries in the array. Can also be computed by multiplying up the numbers in shape.

nbytes

The size of the entire array in bytes. Computed as size times dtype.itemsize.

strides

A tuple of bytes to step in each dimension when traversing an array.
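The relationships between shape, size, nbytes and strides can be checked with a few lines of plain Python. This sketch assumes a C-contiguous array (order='C') and a 4-byte item such as float32. The helper c_strides is hypothetical and is not part of PyOpenCL.

```python
from math import prod


def c_strides(shape, itemsize):
    # Byte strides of a C-contiguous array: the last axis varies fastest.
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


shape = (3, 4)
itemsize = 4  # e.g. float32

size = prod(shape)           # number of entries
nbytes = size * itemsize     # size times dtype.itemsize
strides = c_strides(shape, itemsize)

print(size, nbytes, strides)
```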

flags

An object with attributes c_contiguous, f_contiguous and forc, which may be used to query contiguity properties in analogy to numpy.ndarray.flags.

Methods

with_queue(queue)

Return a copy of self with the default queue set to queue.

None is allowed as a value for queue.

Added in version 2013.1.

__len__()

Returns the size of the leading dimension of self.

reshape(*shape, **kwargs)

Returns an array containing the same data with a new shape.

ravel(order='C')

Returns a flattened array containing the same data.

view(dtype=None)

Returns a view of the array with the same data. If dtype is different from the current dtype, the actual bytes of memory will be reinterpreted.
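What reinterpreting bytes means can be shown on the host with the standard library's struct module. This sketch does not use PyOpenCL. It packs a 32-bit float and reads the same four bytes back as a 32-bit unsigned integer, which is the kind of change a view with a different dtype makes to every element.

```python
import struct

# The four bytes that store 1.0 as a little-endian 32-bit float.
raw = struct.pack("<f", 1.0)

# The same bytes read as a little-endian 32-bit unsigned integer.
(as_uint32,) = struct.unpack("<I", raw)

print(hex(as_uint32))
```

No numeric conversion takes place: the value 1.0 does not become the integer 1, it becomes the bit pattern 0x3f800000.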

squeeze()

Returns a view of the array with dimensions of length 1 removed.

Added in version 2015.2.

transpose(axes=None)

Permute the dimensions of an array.

Parameters:

axes – list of ints, optional. By default, reverse the dimensions, otherwise permute the axes according to the values given.

Returns:

Array – A view of the array with its axes permuted.

Added in version 2015.2.

T

set(ary, queue=None, async_=None, **kwargs)

Transfer the contents of the numpy.ndarray object ary onto the device.

ary must have the same dtype and size (not necessarily shape) as self.

async_ is a Boolean indicating whether the function is allowed to return before the transfer completes. To avoid synchronization bugs, this defaults to False.

Changed in version 2017.2.1: Python 3.7 makes async a reserved keyword. On older Pythons, we will continue to accept async as a parameter, however this should be considered deprecated. async_ is the new, official spelling.

get(queue=None, ary=None, async_=None, **kwargs)

Transfer the contents of self into ary or a newly allocated numpy.ndarray. If ary is given, it must have the same shape and dtype.

Changed in version 2019.1.2: Calling with async_=True was deprecated and replaced by get_async(). The event returned by pyopencl.enqueue_copy() is now stored into events to ensure data is not modified before the copy is complete.

Changed in version 2015.2: ary with different shape was deprecated.

Changed in version 2017.2.1: Python 3.7 makes async a reserved keyword. On older Pythons, we will continue to accept async as a parameter, however this should be considered deprecated. async_ is the new, official spelling.

get_async(queue=None, ary=None, **kwargs)

Asynchronous version of get() which returns a tuple (ary, event) containing the host array ary and the pyopencl.NannyEvent event returned by pyopencl.enqueue_copy().

Added in version 2019.1.2.

copy(queue=<class 'pyopencl.array._copy_queue'>)

Parameters:

queue – The CommandQueue for the returned array.

Changed in version 2017.1.2: Updates the queue of the returned array.

Added in version 2013.1.

__str__()

Return str(self).

__repr__()

Return repr(self).

mul_add(selffac, other, otherfac, queue=None)

Return selffac * self + otherfac * other.

__add__(other)

Add an array with an array or an array with a scalar.

__sub__(other)

Subtract an array from an array or a scalar from an array.

__iadd__(other)
__isub__(other)
__pos__()
__neg__()
__mul__(other)
__div__(other)

Divides an array by an array or a scalar, i.e. self / other.

__rdiv__(other)

Divides an array by a scalar or an array, i.e. other / self.

__pow__(other)

Exponentiation by a scalar or elementwise by another Array.

__and__(other)
__xor__(other)
__or__(other)

Return self|value.

__iand__(other)
__ixor__(other)
__ior__(other)
__abs__()

Return an Array of the absolute values of the elements of self.

__invert__()
fill(value, queue=None, wait_for=None)

Fill the array with the scalar value.

Returns:

self.

astype(dtype, queue=None)

Return a copy of self, cast to dtype.

real

Added in version 2012.1.

imag

Added in version 2012.1.

conj()

Added in version 2012.1.

conjugate()

Added in version 2012.1.

__getitem__(index)

Added in version 2013.1.

__setitem__(subscript, value)

Set the slice of self identified by subscript to value.

value is allowed to be:

Non-scalar broadcasting is not currently supported.

Added in version 2013.1.

setitem(subscript, value, queue=None, wait_for=None)

Like __setitem__(), but with the ability to specify a queue and wait_for.

Added in version 2013.1.

Changed in version 2013.2: Added wait_for.

map_to_host(queue=None, flags=None, is_blocking=True, wait_for=None)

If is_blocking, return a numpy.ndarray corresponding to the same memory as self.

If is_blocking is not true, return a tuple (ary, evt), where ary is the above-mentioned array.

The host array is obtained using pyopencl.enqueue_map_buffer(). See there for further details.

Parameters:

flags – A combination of pyopencl.map_flags. Defaults to read-write.

Added in version 2013.2.

Comparisons, conditionals, any, all

Added in version 2013.2.

Boolean arrays are stored as numpy.int8 because bool has an unspecified size in the OpenCL spec.

__bool__()

Only works for device scalars (i.e. “arrays” with shape == ()).

any(queue=None, wait_for=None)
all(queue=None, wait_for=None)
__eq__(other)

Return self==value.

__ne__(other)

Return self!=value.

__lt__(other)

Return self<value.

__le__(other)

Return self<=value.

__gt__(other)

Return self>value.

__ge__(other)

Return self>=value.

Event management

If an array is used from within an out-of-order queue, it needs to take care of its own operation ordering. The facilities in this section make this possible.

Added in version 2014.1.1.

events

A list of pyopencl.Event instances that the current content of this array depends on. User code may read, but should never modify this list directly. To update this list, instead use the following methods.

add_event(evt)

Add evt to events. If events is too long, this method may implicitly wait for a subset of events and clear them from the list.

finish()

Wait for the entire contents of events, then clear it.

exception pyopencl.array.ArrayHasOffsetError(val='The operation you are attempting does not yet support arrays that start at an offset from the beginning of their buffer.')

Added in version 2013.1.

Constructing Array Instances

pyopencl.array.to_device(queue, ary, allocator=None, async_=None, array_queue=<class 'pyopencl.array._same_as_transfer'>, **kwargs)

Return an Array that is an exact copy of the numpy.ndarray instance ary.

Parameters:

array_queue – The CommandQueue which will be stored in the resulting array. Useful to make sure there is no implicit queue associated with the array by passing None.

See Array for the meaning of allocator.

Changed in version 2015.2: array_queue argument was added.

Changed in version 2017.2.1: Python 3.7 makes async a reserved keyword. On older Pythons, we will continue to accept async as a parameter, however this should be considered deprecated. async_ is the new, official spelling.

pyopencl.array.empty(queue, shape, dtype, order='C', allocator=None, data=None)

A synonym for the Array constructor.

pyopencl.array.zeros(queue, shape, dtype, order='C', allocator=None)

Same as empty(), but the Array is zero-initialized before being returned.

Changed in version 2011.1: context argument was deprecated.

pyopencl.array.empty_like(ary, queue=<class 'pyopencl.array._copy_queue'>, allocator=None)

Make a new, uninitialized Array having the same properties as ary.

pyopencl.array.zeros_like(ary)

Make a new, zero-initialized Array having the same properties as ary.

pyopencl.array.arange(queue, [start, ]stop, [step, ]**kwargs)

Create an Array filled with numbers spaced step apart, starting from start and ending at stop. If not given, start defaults to 0 and step defaults to 1.

For floating point arguments, the length of the result is ceil((stop - start)/step). This rule may result in the last element of the result being greater than stop.

dtype is a required keyword argument.

Changed in version 2011.1: context argument was deprecated.

Changed in version 2011.2: allocator keyword argument was added.
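The length rule for arange() with floating point arguments can be tried out on the host. The two helpers below are hypothetical, plain-Python models of the formula ceil((stop - start)/step). They do not call PyOpenCL and make no claim about how the device kernel computes each entry.

```python
import math


def arange_length(start, stop, step):
    # Length rule quoted above for floating point arguments.
    return math.ceil((stop - start) / step)


def arange_values(start, stop, step):
    # Entries modeled as start + i * step; an assumption for illustration.
    return [start + i * step for i in range(arange_length(start, stop, step))]


print(arange_length(0.0, 1.0, 0.25))
print(arange_length(0.0, 1.0, 0.3))
print(arange_values(0.0, 1.0, 0.25))
```

When (stop - start)/step is not an integer, the length is rounded up, so a step of 0.3 over [0, 1) gives four entries rather than three.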

pyopencl.array.take(a, indices, out=None, queue=None, wait_for=None)

Return the Array [a[indices[0]], ..., a[indices[n]]]. For the moment, a must be a type that can be bound to a texture.

pyopencl.array.concatenate(arrays, axis=0, queue=None, allocator=None)

Added in version 2013.1.

Note

The returned array is of the same type as the first array in the list.

pyopencl.array.stack(arrays, axis=0, queue=None)

Join a sequence of arrays along a new axis.

Parameters:
  • arrays – A sequence of Array.

  • axis – Index of the dimension of the new axis in the result array. Can be -1, for the new axis to be the last dimension.

Returns:

Array

Manipulating Array instances

pyopencl.array.transpose(a, axes=None)

Permute the dimensions of an array.

Parameters:
  • a – Array

  • axes – list of ints, optional. By default, reverse the dimensions, otherwise permute the axes according to the values given.

Returns:

Array – A view of the array with its axes permuted.

pyopencl.array.reshape(a, shape)

Gives a new shape to an array without changing its data.

Added in version 2015.2.

Conditionals

pyopencl.array.if_positive(criterion, then_, else_, out=None, queue=None)

Return an array like then_, which, for the element at index i, contains then_[i] if criterion[i]>0, else else_[i].
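The selection rule can be written as a plain-Python reference that works on lists. This host-side function shares the name if_positive only for readability. It is a sketch of the described semantics, not the PyOpenCL implementation, and it ignores the out and queue arguments.

```python
def if_positive(criterion, then_, else_):
    # Pick then_[i] where criterion[i] > 0, otherwise else_[i].
    return [t if c > 0 else e for c, t, e in zip(criterion, then_, else_)]


criterion = [1, 0, -2]
then_ = [10, 20, 30]
else_ = [-1, -2, -3]

print(if_positive(criterion, then_, else_))
```

A criterion of exactly zero selects the else_ entry, since the test is strictly greater than zero.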

pyopencl.array.maximum(a, b, out=None, queue=None)

Return the elementwise maximum of a and b.

pyopencl.array.minimum(a, b, out=None, queue=None)

Return the elementwise minimum of a and b.

Logical Operations

pyopencl.array.logical_and(x1, x2, /, out=None, queue=None)

Returns the element-wise logical AND of x1 and x2.

pyopencl.array.logical_or(x1, x2, /, out=None, queue=None)

Returns the element-wise logical OR of x1 and x2.

pyopencl.array.logical_not(x, /, out=None, queue=None)

Returns the element-wise logical NOT of x.

Reductions

pyopencl.array.sum(a, dtype=None, queue=None, slice=None, initial=<no value>)

Added in version 2011.1.

pyopencl.array.all(a, queue=None, wait_for=None)
pyopencl.array.any(a, queue=None, wait_for=None)
pyopencl.array.dot(a, b, dtype=None, queue=None, slice=None)

Added in version 2011.1.

pyopencl.array.vdot(a, b, dtype=None, queue=None, slice=None)

Like numpy.vdot().

Added in version 2013.1.
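For readers unfamiliar with numpy.vdot(), its defining property is that the first argument is complex-conjugated before the products are summed. The plain-Python function below is a host-side reference for that behavior on lists of complex numbers. It is a sketch and does not call PyOpenCL or numpy.

```python
def vdot(a, b):
    # Conjugate the entries of the first argument, multiply pairwise, then sum.
    return sum(x.conjugate() * y for x, y in zip(a, b))


a = [1 + 2j, 3 + 4j]
b = [5 + 6j, 7 + 8j]

print(vdot(a, b))
print(vdot(b, a))
```

Because of the conjugation, swapping the arguments yields the complex conjugate of the result rather than the same value.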

pyopencl.array.subset_dot(subset, a, b, dtype=None, queue=None, slice=None)

Added in version 2011.1.

pyopencl.array.max(a, queue=None, initial=<no value>)

Added in version 2011.1.

pyopencl.array.min(a, queue=None, initial=<no value>)

Added in version 2011.1.

pyopencl.array.subset_max(subset, a, queue=None, slice=None)

Added in version 2011.1.

pyopencl.array.subset_min(subset, a, queue=None, slice=None)

Added in version 2011.1.

See also Sums and counts (“reduce”).

Elementwise Functions on Array Instances

The pyopencl.clmath module exposes array versions of the C functions available in the OpenCL standard. (See table 6.8 in the spec.)

pyopencl.clmath.acos(array, queue=None)
pyopencl.clmath.acosh(array, queue=None)
pyopencl.clmath.acospi(array, queue=None)
pyopencl.clmath.asin(array, queue=None)
pyopencl.clmath.asinh(array, queue=None)
pyopencl.clmath.asinpi(array, queue=None)
pyopencl.clmath.atan(array, queue=None)
pyopencl.clmath.atan2(y, x, queue=None)

Added in version 2013.1.

pyopencl.clmath.atanh(array, queue=None)
pyopencl.clmath.atanpi(array, queue=None)
pyopencl.clmath.atan2pi(y, x, queue=None)

Added in version 2013.1.

pyopencl.clmath.cbrt(array, queue=None)
pyopencl.clmath.ceil(array, queue=None)
pyopencl.clmath.cos(array, queue=None)
pyopencl.clmath.cosh(array, queue=None)
pyopencl.clmath.cospi(array, queue=None)
pyopencl.clmath.erfc(array, queue=None)
pyopencl.clmath.erf(array, queue=None)
pyopencl.clmath.exp(array, queue=None)
pyopencl.clmath.exp2(array, queue=None)
pyopencl.clmath.exp10(array, queue=None)
pyopencl.clmath.expm1(array, queue=None)
pyopencl.clmath.fabs(array, queue=None)
pyopencl.clmath.floor(array, queue=None)
pyopencl.clmath.fmod(arg, mod, queue=None)

Return the floating point remainder of the division arg / mod, for each element in arg and mod.
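The C-style floating point remainder differs from Python's % operator for negative operands, which is easy to overlook. The host-side sketch below uses math.fmod from the standard library, which follows the same C convention (the result carries the sign of the first argument). It does not call PyOpenCL.

```python
import math

# C-style remainder: the result takes the sign of the dividend.
c_style = math.fmod(-7.0, 3.0)

# Python's % operator: the result takes the sign of the divisor.
python_style = -7.0 % 3.0

print(c_style, python_style)
print(math.fmod(7.5, 2.0))
```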

pyopencl.clmath.frexp(arg, queue=None)

Return a tuple (significands, exponents) such that arg == significand * 2**exponent.

pyopencl.clmath.ilogb(array, queue=None)
pyopencl.clmath.ldexp(significand, exponent, queue=None)

Return a new array of floating point values composed from the entries of significand and exponent, paired together as result = significand * 2**exponent.
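frexp() and ldexp() are inverses of each other, which the following host-side sketch demonstrates with the scalar functions of the same name from Python's math module. It illustrates the identity arg == significand * 2**exponent one value at a time and does not call PyOpenCL.

```python
import math

values = [8.0, 0.15625, -3.0]

# Split each value into (significand, exponent).
parts = [math.frexp(v) for v in values]

# Recombine the pairs as significand * 2**exponent.
rebuilt = [math.ldexp(m, e) for m, e in parts]

print(parts)
print(rebuilt)
```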

pyopencl.clmath.lgamma(array, queue=None)
pyopencl.clmath.log(array, queue=None)
pyopencl.clmath.log2(array, queue=None)
pyopencl.clmath.log10(array, queue=None)
pyopencl.clmath.log1p(array, queue=None)
pyopencl.clmath.logb(array, queue=None)
pyopencl.clmath.modf(arg, queue=None)

Return a tuple (fracpart, intpart) of arrays containing the fractional and integer parts of arg.
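The order of the returned pair matches that of Python's scalar math.modf, which makes for a quick host-side illustration. This sketch uses only the standard library and does not call PyOpenCL.

```python
import math

# The fractional part comes first, then the integer part (as a float).
frac_pos, int_pos = math.modf(3.25)
frac_neg, int_neg = math.modf(-3.25)

print(frac_pos, int_pos)
print(frac_neg, int_neg)
```

Both parts carry the sign of the input, so their sum reproduces the original value.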

pyopencl.clmath.nan(array, queue=None)
pyopencl.clmath.rint(array, queue=None)
pyopencl.clmath.round(array, queue=None)
pyopencl.clmath.sin(array, queue=None)
pyopencl.clmath.sinh(array, queue=None)
pyopencl.clmath.sinpi(array, queue=None)
pyopencl.clmath.sqrt(array, queue=None)
pyopencl.clmath.tan(array, queue=None)
pyopencl.clmath.tanh(array, queue=None)
pyopencl.clmath.tanpi(array, queue=None)
pyopencl.clmath.tgamma(array, queue=None)
pyopencl.clmath.trunc(array, queue=None)

Generating Arrays of Random Numbers

PyOpenCL includes and uses some of the Random123 random number generators by D.E. Shaw Research. In addition to being usable through the convenience functions documented in this section, they are available in any piece of code compiled through PyOpenCL by:

#include <pyopencl-random123/philox.cl>
#include <pyopencl-random123/threefry.cl>

See the Philox source and the Threefry source for some documentation if you’re planning on using Random123 directly.

class pyopencl.clrandom.PhiloxGenerator(context, key=None, counter=None, seed=None)

Added in version 2016.2.

fill_uniform(ary, a=0, b=1, queue=None)
uniform(*args, **kwargs)

Make a new empty array, apply fill_uniform() to it.

fill_normal(ary, mu=0, sigma=1, queue=None)

Fill ary with normally distributed numbers with mean mu and standard deviation sigma.

normal(*args, **kwargs)

Make a new empty array, apply fill_normal() to it.

class pyopencl.clrandom.ThreefryGenerator(context, key=None, counter=None, seed=None)

Added in version 2016.2.

fill_uniform(ary, a=0, b=1, queue=None)
uniform(*args, **kwargs)

Make a new empty array, apply fill_uniform() to it.

fill_normal(ary, mu=0, sigma=1, queue=None)

Fill ary with normally distributed numbers with mean mu and standard deviation sigma.

normal(*args, **kwargs)

Make a new empty array, apply fill_normal() to it.

pyopencl.clrandom.rand(queue, shape, dtype, luxury=None, a=0, b=1)

Return an array of shape filled with random values of dtype in the range [a, b).

pyopencl.clrandom.fill_rand(result, queue=None, a=0, b=1)

Fill result with random values in the range [0, 1).