The functionality in this module provides something of a workalike for numpy arrays, but with all operations executed on the CL compute device.
PyOpenCL provides some amount of integration between the numpy type system, as represented by numpy.dtype, and the types available in OpenCL. All the simple scalar types map straightforwardly to their CL counterparts.
pyopencl.array.vec
All of OpenCL’s supported vector types, such as float3 and long4, are available as numpy data types within this class. These numpy.dtype instances have field names of x, y, z, and w just like their OpenCL counterparts. They will work both for parameter passing to kernels as well as for passing data back and forth between kernels and Python code. For each type, a make_type function is also provided (e.g. make_float3(x,y,z)).
If you want to construct a pre-initialized vector type you have three new functions to choose from:
New in version 2014.1.
Changed in version 2014.1: The make_type functions have a default value (0) for each component. Relying on the default values has been deprecated. Either specify all components or use one of the new flavors mentioned above for constructing a vector.
If you would like to use your own (struct/union/whatever) data types in array operations where you supply operation source code, define those types in the preamble passed to pyopencl.elementwise.ElementwiseKernel, pyopencl.reduction.ReductionKernel (or similar), and let PyOpenCL know about them using this function:
pyopencl.tools.get_or_register_dtype(self, c_names, dtype=None)
Get or register a numpy.dtype associated with the C type names in the string list c_names. If dtype is None, no registration is performed, and the numpy.dtype must already have been registered. If so, it is returned. If not, TypeNameNotKnown is raised.
If dtype is not None, registration is attempted. If the c_names are already known and registered to identical numpy.dtype objects, then the dtype object of the previously registered type is returned. If the c_names are not yet known, the type is registered. If one of the c_names is known but registered to a different type, an error is raised. In this latter case, the type may end up partially registered and any further behavior is undefined.
New in version 2012.2.
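The lookup and registration rules described for get_or_register_dtype() can be modelled with a plain dictionary. The sketch below is a toy model only, not PyOpenCL's implementation: ToyDtypeRegistry is a hypothetical name, the exception class is a local stand-in, and tuples are used in place of real numpy.dtype objects so that the example runs without numpy or an OpenCL device.

```python
class TypeNameNotKnown(RuntimeError):
    """Local stand-in for pyopencl.tools.TypeNameNotKnown (toy model only)."""


class ToyDtypeRegistry:
    """Hypothetical dictionary-based model of the rules described above."""

    def __init__(self):
        self._name_to_dtype = {}

    def get_or_register_dtype(self, c_names, dtype=None):
        if isinstance(c_names, str):
            c_names = [c_names]

        if dtype is None:
            # Lookup only: the name must already be known.
            try:
                return self._name_to_dtype[c_names[0]]
            except KeyError:
                raise TypeNameNotKnown(c_names[0]) from None

        result = dtype
        for name in c_names:
            known = self._name_to_dtype.get(name)
            if known is None:
                # Unknown name: register it.
                self._name_to_dtype[name] = dtype
            elif known == dtype:
                # Same type already registered: hand back the earlier object.
                result = known
            else:
                # Names processed before this one stay registered, which
                # mirrors the "partially registered" caveat in the text.
                raise RuntimeError(
                    f"{name!r} is registered to a different type")
        return result


registry = ToyDtypeRegistry()
# Tuples stand in for numpy.dtype objects in this sketch.
first = registry.get_or_register_dtype(["id_val"], ("uint32", "float32"))
again = registry.get_or_register_dtype(["id_val"], ("uint32", "float32"))
looked_up = registry.get_or_register_dtype("id_val")
print(again is first, looked_up is first)
```

In this model, registering the same names twice returns the object stored the first time, and a lookup of a name that was never registered raises the stand-in TypeNameNotKnown.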
pyopencl.tools.TypeNameNotKnown
New in version 2013.1.
pyopencl.tools.register_dtype(dtype, name)
Changed in version 2013.1: This function has been deprecated. It is recommended that you develop against the new interface, get_or_register_dtype().
pyopencl.tools.dtype_to_ctype(dtype)
Returns a C name registered for dtype.
This function helps with producing C/OpenCL declarations for structured numpy.dtype instances:
pyopencl.tools.match_dtype_to_c_struct(device, name, dtype, context=None)
Return a tuple (dtype, c_decl) such that the C struct declaration in c_decl and the structure numpy.dtype instance dtype have the same memory layout.
Note that dtype may be modified from the value that was passed in, for example to insert padding.
(As a remark on implementation, this routine runs a small kernel on the given device to ensure that numpy and C offsets and sizes match.)
This example explains the use of this function:
>>> import numpy as np
>>> import pyopencl as cl
>>> import pyopencl.tools
>>> ctx = cl.create_some_context()
>>> dtype = np.dtype([("id", np.uint32), ("value", np.float32)])
>>> dtype, c_decl = pyopencl.tools.match_dtype_to_c_struct(
...     ctx.devices[0], 'id_val', dtype)
>>> print(c_decl)
typedef struct {
  unsigned id;
  float value;
} id_val;
>>> print(dtype)
[('id', '<u4'), ('value', '<f4')]
>>> cl.tools.get_or_register_dtype('id_val', dtype)
As this example shows, it is important to call get_or_register_dtype() on the modified dtype returned by this function, not the original one.
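To see why the returned dtype can differ from the one passed in, it may help to look at how alignment padding changes the size of a C struct. The sketch below uses only Python's standard struct module on the host and does not call PyOpenCL. The layout chosen by an OpenCL device compiler is not necessarily the same as the host's, which is presumably why the routine checks the layout on the device itself.

```python
import struct

# Layout: one unsigned char followed by one unsigned int.
packed_size = struct.calcsize("=BI")   # "=": standard sizes, no alignment padding
native_size = struct.calcsize("@BI")   # "@": host C compiler sizes and alignment

print("packed:", packed_size, "bytes")
print("native:", native_size, "bytes")
print("padding inserted:", native_size - packed_size, "bytes")
```

On most mainstream platforms the native layout is larger than the packed one because the int member is aligned to a 4-byte boundary. A dtype without the matching padding would read the second field from the wrong offset.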
A more complete example of how to use custom structured types can be found in examples/demostructreduce.py in the PyOpenCL distribution.
PyOpenCL’s Array type supports complex numbers out of the box, by simply using the corresponding numpy types.
If you would like to use this support in your own kernels, here’s how to proceed: Since OpenCL 1.2 (and earlier) does not specify native complex number support, PyOpenCL works around that deficiency. By saying:
#include <pyopencl-complex.h>
in your kernel, you get complex types cfloat_t and cdouble_t, along with functions defined on them such as cfloat_mul(a, b) or cdouble_log(z). Elementwise kernels automatically include the header if your kernel has complex input or output. See the source file for a precise list of what’s available.
If you need double precision support, please:
#define PYOPENCL_DEFINE_CDOUBLE
before including the header, as DP support apparently cannot be reliably autodetected.
Under the hood, the complex types are simply float2 and double2.
Warning
Note that, at the OpenCL source code level, addition (real + complex) and multiplication (complex*complex) are defined for e.g. float2, but yield wrong results, so that you need to use the corresponding functions. (The Array type implements complex arithmetic as you remember it, without any idiotic quirks like this.)
New in version 2012.1.
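The warning about float2 arithmetic follows from OpenCL vector operators working component by component. The sketch below mimics that behavior in plain Python with (real, imag) tuples and compares it with true complex arithmetic. The function names are illustrative only and are not part of PyOpenCL or of pyopencl-complex.h.

```python
def componentwise_mul(a, b):
    """What a plain vector `a * b` computes for two-component vectors."""
    return (a[0] * b[0], a[1] * b[1])


def complex_mul(a, b):
    """Complex product of two (real, imag) pairs."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def broadcast_add(r, z):
    """What `r + z` computes for a scalar r and a two-component vector z."""
    return (r + z[0], r + z[1])


def real_plus_complex(r, z):
    """Complex sum of a real number r and a (real, imag) pair z."""
    return (r + z[0], z[1])


a = (1.0, 2.0)   # 1 + 2i
b = (3.0, 4.0)   # 3 + 4i

wrong_product = componentwise_mul(a, b)
right_product = complex_mul(a, b)
reference = complex(*a) * complex(*b)

wrong_sum = broadcast_add(1.0, a)
right_sum = real_plus_complex(1.0, a)

print(wrong_product, right_product, reference)
print(wrong_sum, right_sum)
```

The component-wise product never mixes real and imaginary parts, and the broadcast sum adds the real number to the imaginary part as well, which is why the dedicated functions are needed in kernel code.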
Array Class
pyopencl.array.Array(cq, shape, dtype, order='C', allocator=None, data=None, offset=0, strides=None, events=None)
A numpy.ndarray workalike that stores its data and performs its computations on the compute device. shape and dtype work exactly as in numpy. Arithmetic methods in Array support the broadcasting of scalars (e.g. array+5).
cq must be a pyopencl.CommandQueue or a pyopencl.Context.
If it is a queue, cq specifies the queue in which the array carries out its computations by default. If a default queue (and thereby overloaded operators and many other niceties) is not desired, pass a Context.
allocator may be None or a callable that, upon being called with an argument of the number of bytes to be allocated, returns a pyopencl.Buffer object. (A pyopencl.tools.MemoryPool instance is one useful example of an object to pass here.)
Changed in version 2011.1: Renamed context to cqa, made it general-purpose.
All arguments beyond order should be considered keyword-only.
Changed in version 2015.2: Renamed context to cq, disallowed passing allocators through it.
data
The pyopencl.MemoryObject instance created for the memory that backs this Array.
Changed in version 2013.1: If a nonzero offset has been specified for this array, this will fail with ArrayHasOffsetError.
base_data
The pyopencl.MemoryObject instance created for the memory that backs this Array. Unlike data, the base address of base_data is allowed to be different from the beginning of the array. The actual beginning is the base address of base_data plus offset in units of dtype.
Unlike data, retrieving base_data always succeeds.
New in version 2013.1.
shape
The tuple of lengths of each dimension in the array.
dtype
The numpy.dtype of the items in the GPU array.
size
The number of meaningful entries in the array. Can also be computed by multiplying up the numbers in shape.
strides
Tuple of bytes to step in each dimension when traversing an array.
flags
Return an object with attributes c_contiguous, f_contiguous and forc, which may be used to query contiguity properties in analogy to numpy.ndarray.flags.
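The relationship between shape, size and strides can be worked out by hand. The sketch below is a host-side illustration in plain Python. It assumes a C-ordered (row-major) array with no padding between elements, and the helper name c_contiguous_strides is made up for this example.

```python
from math import prod


def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-ordered (row-major) array with no padding."""
    strides = []
    step = itemsize
    # The last axis varies fastest, so walk the shape from the back.
    for length in reversed(shape):
        strides.append(step)
        step *= length
    return tuple(reversed(strides))


shape = (2, 3, 4)
itemsize = 4                      # bytes per element, e.g. a 32-bit float

size = prod(shape)                # number of entries in the array
strides = c_contiguous_strides(shape, itemsize)

print(size)      # 24
print(strides)   # (48, 16, 4)
```

Moving one step along the last axis advances by one element (4 bytes), while one step along the first axis skips a whole 3-by-4 block (48 bytes).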
Methods
with_queue(queue)
Return a copy of self with the default queue set to queue.
None is allowed as a value for queue.
New in version 2013.1.
__len__()
Returns the size of the leading dimension of self.
reshape(*shape, **kwargs)
Returns an array containing the same data with a new shape.
ravel()
Returns flattened array containing the same data.
view(dtype=None)
Returns view of array with the same data. If dtype is different from current dtype, the actual bytes of memory will be reinterpreted.
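Reinterpreting bytes is different from converting values. The host-side sketch below uses the standard struct module, rather than PyOpenCL, to read the four bytes of a 32-bit float as an unsigned 32-bit integer. This is the kind of reinterpretation that view() with a different dtype is described as performing.

```python
import struct

raw = struct.pack("<f", 1.0)              # the 4 bytes of a little-endian float32
as_uint32 = struct.unpack("<I", raw)[0]   # the same 4 bytes read as an unsigned int

print(hex(as_uint32))   # 0x3f800000
```

The result is the IEEE 754 bit pattern of 1.0, not the integer 1. A value conversion is what astype() is for.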
transpose(axes=None)
Permute the dimensions of an array.
Parameters: axes – list of ints, optional. By default, reverse the dimensions, otherwise permute the axes according to the values given.
Returns: Array. A view of the array with its axes permuted.
New in version 2015.2.
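Because transpose() returns a view, only the bookkeeping changes, not the data. The sketch below shows the usual way a strided array library expresses this, by permuting shape and strides together. That PyOpenCL does exactly this internally is an assumption, and transposed_layout is a hypothetical helper written for this illustration.

```python
def transposed_layout(shape, strides, axes=None):
    """Shape and strides of a transposed view; no data is moved."""
    if axes is None:
        # Default: reverse the order of the dimensions.
        axes = tuple(reversed(range(len(shape))))
    if sorted(axes) != list(range(len(shape))):
        raise ValueError("axes must be a permutation of the dimensions")
    return (tuple(shape[i] for i in axes),
            tuple(strides[i] for i in axes))


shape, strides = (2, 3, 4), (48, 16, 4)

reversed_layout = transposed_layout(shape, strides)
permuted_layout = transposed_layout(shape, strides, axes=(1, 0, 2))

print(reversed_layout)   # ((4, 3, 2), (4, 16, 48))
print(permuted_layout)   # ((3, 2, 4), (16, 48, 4))
```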
T
set(ary, queue=None, async=False)
Transfer the contents of the numpy.ndarray object ary onto the device.
ary must have the same dtype and size (not necessarily shape) as self.
get(queue=None, ary=None, async=False)
Transfer the contents of self into ary or a newly allocated numpy.ndarray. If ary is given, it must have the same shape and dtype.
Changed in version 2015.2: ary with different shape was deprecated.
copy(queue=None)
New in version 2013.1.
__str__()
__repr__()
mul_add(selffac, other, otherfac, queue=None)
Return selffac * self + otherfac*other.
__add__(other)
Add an array with an array or an array with a scalar.
__sub__(other)
Subtract an array from an array or a scalar from an array.
__iadd__(other)
__isub__(other)
__neg__()
__mul__(other)
__div__(other)
Divides an array by an array or a scalar, i.e. self / other.
__rdiv__(other)
Divides a scalar or an array by this array, i.e. other / self.
__abs__()
Return an Array of the absolute values of the elements of self.
fill(value, queue=None, wait_for=None)
Fill the array with scalar.
Returns: self.
astype(dtype, queue=None)
Return a copy of self, cast to dtype.
real
New in version 2012.1.
imag
New in version 2012.1.
conj()
New in version 2012.1.
__getitem__(index)
New in version 2013.1.
__setitem__(subscript, value)
Set the slice of self identified by subscript to value.
value is allowed to be:
An Array of the same shape and (for now) strides, but with potentially different dtype.
A numpy.ndarray of the same shape and (for now) strides, but with potentially different dtype.
Non-scalar broadcasting is not currently supported.
New in version 2013.1.
setitem(subscript, value, queue=None, wait_for=None)
Like __setitem__(), but with the ability to specify a queue and wait_for.
New in version 2013.1.
Changed in version 2013.2: Added wait_for.
map_to_host(queue=None, flags=None, is_blocking=True, wait_for=None)
If is_blocking, return a numpy.ndarray corresponding to the same memory as self.
If is_blocking is not true, return a tuple (ary, evt), where ary is the above-mentioned array.
The host array is obtained using pyopencl.enqueue_map_buffer(). See there for further details.
Parameters: flags – A combination of pyopencl.map_flags. Defaults to read-write.
New in version 2013.2.
Comparisons, conditionals, any, all
New in version 2013.2.
Boolean arrays are stored as numpy.int8 because bool has an unspecified size in the OpenCL spec.
__nonzero__()
Only works for device scalars (i.e. “arrays” with shape == ()).
any(queue=None, wait_for=None)
all(queue=None, wait_for=None)
__eq__(other)
__ne__(other)
__lt__(other)
__le__(other)
__gt__(other)
__ge__(other)
Event management
If an array is used from within an out-of-order queue, it needs to take care of its own operation ordering. The facilities in this section make this possible.
New in version 2014.1.1.
events
A list of pyopencl.Event instances that the current content of this array depends on. User code may read, but should never modify this list directly. To update this list, instead use the following methods.
pyopencl.array.ArrayHasOffsetError(val='The operation you are attempting does not yet support arrays that start at an offset from the beginning of their buffer.')
New in version 2013.1.
Array Instances
pyopencl.array.to_device(queue, ary, allocator=None, async=False)
Return an Array that is an exact copy of the numpy.ndarray instance ary.
See Array for the meaning of allocator.
Changed in version 2011.1: context argument was deprecated.
pyopencl.array.empty(queue, shape, dtype, order="C", allocator=None, data=None)
A synonym for the Array constructor.
pyopencl.array.zeros(queue, shape, dtype, order='C', allocator=None)
Same as empty(), but the Array is zero-initialized before being returned.
Changed in version 2011.1: context argument was deprecated.
pyopencl.array.empty_like(ary)
Make a new, uninitialized Array having the same properties as ary.
pyopencl.array.zeros_like(ary)
Make a new, zero-initialized Array having the same properties as ary.
pyopencl.array.arange(queue, *args, **kwargs)
Create an Array filled with numbers spaced step apart, starting from start and ending at stop.
For floating point arguments, the length of the result is ceil((stop - start)/step). This rule may result in the last element of the result being greater than stop.
dtype, if not specified, is taken as the largest common type of start, stop and step.
Changed in version 2011.1: context argument was deprecated.
Changed in version 2011.2: allocator keyword argument was added.
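The length rule for floating point arguments can be checked on the host with plain Python. The sketch below is not a call into PyOpenCL. The helper name arange_length is made up, and generating the elements as start + i * step is an assumption made for illustration.

```python
import math


def arange_length(start, stop, step):
    """Number of elements according to the ceil((stop - start)/step) rule."""
    return int(math.ceil((stop - start) / step))


start, stop, step = 1.0, 1.3, 0.1
n = arange_length(start, stop, step)
values = [start + i * step for i in range(n)]

print(n, values)
print("last element reaches stop:", values[-1] >= stop)
```

In exact arithmetic (1.3 - 1.0)/0.1 is 3, but in binary floating point the quotient comes out slightly above 3, so the rule yields four elements and the last one is no longer strictly below stop.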
pyopencl.array.take(a, indices, out=None, queue=None, wait_for=None)
Return the Array [a[indices[0]], ..., a[indices[n]]].
For the moment, a must be a type that can be bound to a texture.
pyopencl.array.concatenate(arrays, axis=0, queue=None, allocator=None)
New in version 2013.1.
Array instances
pyopencl.array.transpose(a, axes=None)
Permute the dimensions of an array.
pyopencl.array.reshape(a, shape)
Gives a new shape to an array without changing its data.
New in version 2015.2.
pyopencl.array.if_positive(criterion, then_, else_, out=None, queue=None)
Return an array like then_, which, for the element at index i, contains then_[i] if criterion[i]>0, else else_[i].
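The selection rule of if_positive() can be written out for ordinary Python lists. This is a host-side sketch of the semantics stated above and not the device implementation. It reuses the function name purely for readability.

```python
def if_positive(criterion, then_, else_):
    """Element-wise select: then_[i] where criterion[i] > 0, else else_[i]."""
    return [t if c > 0 else e for c, t, e in zip(criterion, then_, else_)]


result = if_positive([1, 0, -2, 5], [10, 20, 30, 40], [-1, -2, -3, -4])
print(result)   # [10, -2, -3, 40]
```

Note that a criterion of exactly zero is not positive, so it selects from else_.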
pyopencl.array.maximum(a, b, out=None, queue=None)
Return the elementwise maximum of a and b.
pyopencl.array.minimum(a, b, out=None, queue=None)
Return the elementwise minimum of a and b.
pyopencl.array.sum(a, dtype=None, queue=None)
New in version 2011.1.
pyopencl.array.dot(a, b, dtype=None, queue=None)
New in version 2011.1.
pyopencl.array.vdot(a, b, dtype=None, queue=None)
Like numpy.vdot().
New in version 2013.1.
pyopencl.array.subset_dot(subset, a, b, dtype=None, queue=None)
New in version 2011.1.
pyopencl.array.max(a, queue=None)
New in version 2011.1.
pyopencl.array.min(a, queue=None)
New in version 2011.1.
pyopencl.array.subset_max(subset, a, queue=None)
New in version 2011.1.
pyopencl.array.subset_min(subset, a, queue=None)
New in version 2011.1.
See also Sums and counts (“reduce”).
Array Instances
The pyopencl.clmath module exposes array versions of the C functions available in the OpenCL standard. (See table 6.8 in the spec.)
pyopencl.clmath.acos(array, queue=None)
pyopencl.clmath.acosh(array, queue=None)
pyopencl.clmath.acospi(array, queue=None)
pyopencl.clmath.asin(array, queue=None)
pyopencl.clmath.asinh(array, queue=None)
pyopencl.clmath.asinpi(array, queue=None)
pyopencl.clmath.atan(array, queue=None)
pyopencl.clmath.atan2(y, x, queue=None)
New in version 2013.1.
pyopencl.clmath.atanh(array, queue=None)
pyopencl.clmath.atanpi(array, queue=None)
pyopencl.clmath.atan2pi(y, x, queue=None)
New in version 2013.1.
pyopencl.clmath.cbrt(array, queue=None)
pyopencl.clmath.ceil(array, queue=None)
pyopencl.clmath.cos(array, queue=None)
pyopencl.clmath.cosh(array, queue=None)
pyopencl.clmath.cospi(array, queue=None)
pyopencl.clmath.erfc(array, queue=None)
pyopencl.clmath.erf(array, queue=None)
pyopencl.clmath.exp(array, queue=None)
pyopencl.clmath.exp2(array, queue=None)
pyopencl.clmath.exp10(array, queue=None)
pyopencl.clmath.expm1(array, queue=None)
pyopencl.clmath.fabs(array, queue=None)
pyopencl.clmath.floor(array, queue=None)
pyopencl.clmath.fmod(arg, mod, queue=None)
Return the floating point remainder of the division arg/mod, for each element in arg and mod.
pyopencl.clmath.frexp(arg, queue=None)
Return a tuple (significands, exponents) such that arg == significand * 2**exponent.
pyopencl.clmath.ilogb(array, queue=None)
pyopencl.clmath.ldexp(significand, exponent, queue=None)
Return a new array of floating point values composed from the entries of significand and exponent, paired together as result = significand * 2**exponent.
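frexp and ldexp are inverses of each other. The scalar functions of the same names in Python's math module follow the same C definitions, so the relationship can be illustrated on the host without a device. This sketch shows the scalar semantics only; the pyopencl.clmath versions operate on whole arrays.

```python
import math

values = [0.75, 8.0, -3.5]

# Split each value into (significand, exponent) ...
pairs = [math.frexp(v) for v in values]
# ... and put it back together as significand * 2**exponent.
rebuilt = [math.ldexp(m, e) for m, e in pairs]

print(pairs)     # [(0.75, 0), (0.5, 4), (-0.875, 2)]
print(rebuilt)   # [0.75, 8.0, -3.5]
```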
pyopencl.clmath.lgamma(array, queue=None)
pyopencl.clmath.log(array, queue=None)
pyopencl.clmath.log2(array, queue=None)
pyopencl.clmath.log10(array, queue=None)
pyopencl.clmath.log1p(array, queue=None)
pyopencl.clmath.logb(array, queue=None)
pyopencl.clmath.modf(arg, queue=None)
Return a tuple (fracpart, intpart) of arrays containing the fractional and integer parts of arg.
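The order of the returned pair is easy to get wrong. The scalar math.modf in Python's standard library uses the same (fractional, integer) order and serves as a host-side illustration here. It is not the array version from pyopencl.clmath.

```python
import math

frac_part, int_part = math.modf(3.75)
neg_frac, neg_int = math.modf(-2.5)

print(frac_part, int_part)   # 0.75 3.0
print(neg_frac, neg_int)     # -0.5 -2.0
```

Both parts carry the sign of the input, and the integer part is returned as a floating point value.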
pyopencl.clmath.nan(array, queue=None)
pyopencl.clmath.rint(array, queue=None)
pyopencl.clmath.round(array, queue=None)
pyopencl.clmath.sin(array, queue=None)
pyopencl.clmath.sinh(array, queue=None)
pyopencl.clmath.sinpi(array, queue=None)
pyopencl.clmath.sqrt(array, queue=None)
pyopencl.clmath.tan(array, queue=None)
pyopencl.clmath.tanh(array, queue=None)
pyopencl.clmath.tanpi(array, queue=None)
pyopencl.clmath.tgamma(array, queue=None)
pyopencl.clmath.trunc(array, queue=None)
PyOpenCL now includes and uses the RANLUXCL random number generator by Ivar Ursin Nikolaisen. In addition to being usable through the convenience functions rand() and fill_rand(), it is available in any piece of code compiled through PyOpenCL by:
#include <pyopencl-ranluxcl.cl>
See the source for some documentation if you’re planning on using RANLUXCL directly.
The RANLUX generator is described in the following two articles. If you use the generator for scientific purposes, please consider citing them:
pyopencl.clrandom.RanluxGenerator(queue, num_work_items=None, luxury=None, seed=None, no_warmup=False, use_legacy_init=False, max_work_items=None)
New in version 2011.2.
state
A pyopencl.array.Array containing the state of the generator.
nskip
nskip is an integer which can (optionally) be defined in the kernel code as RANLUXCL_NSKIP. If this is done the generator will be faster for luxury setting 0 and 1, or when the p-value is manually set to a multiple of 24.
Changed in version 2013.1: Added default value for num_work_items.
fill_uniform(ary, a=0, b=1, queue=None)
Fill ary with uniformly distributed random numbers in the interval (a, b), endpoints excluded.
Returns: a pyopencl.Event
Changed in version 2014.1.1: Added return value.
uniform(*args, **kwargs)
Make a new empty array, apply fill_uniform() to it.
fill_normal(ary, mu=0, sigma=1, queue=None)
Fill ary with normally distributed numbers with mean mu and standard deviation sigma.
Changed in version 2014.1.1: Added return value.
normal(*args, **kwargs)
Make a new empty array, apply fill_normal() to it.
synchronize(queue)
The generator gets inefficient when different work items invoke the generator a differing number of times. This function ensures efficiency.
pyopencl.clrandom.rand(queue, shape, dtype, luxury=None, a=0, b=1)
Return an array of shape filled with random values of dtype in the range [a,b).
pyopencl.clrandom.fill_rand(result, queue=None, luxury=4, a=0, b=1)
Fill result with random values of dtype in the range [a,b), which is [0,1) with the default arguments.