OpenCL Runtime: Memory

class pyopencl.MemoryObject
info

Lower case versions of the mem_info constants may be used as attributes on instances of this class to directly query info attributes.

hostbuf
get_info(param)

See mem_info for values of param.

release()
get_host_array(shape, dtype, order="C")

Return the memory object’s associated host memory area as a numpy.ndarray of the given shape, dtype and order.

from_int_ptr(int_ptr_value, retain=True)

Constructs a pyopencl handle from a C-level pointer (given as the integer int_ptr_value). If retain is True (the default), pyopencl will call clRetainXXX on the provided object. If the previous owner of the object will not release the reference, retain should be set to False, to effectively transfer ownership to pyopencl.

Changed in version 2016.1: retain added

int_ptr

Instances of this class are hashable, and two instances of this class may be compared using “==” and “!=”. (Hashability was added in version 2011.2.) Two objects are considered the same if the underlying OpenCL object is the same, as established by C pointer equality.

Memory Migration

pyopencl.enqueue_migrate_mem_objects(queue, mem_objects, flags=0, wait_for=None)
Parameters: flags – from mem_migration_flags

New in version 2011.2.

Only available with CL 1.2.

pyopencl.enqueue_migrate_mem_object_ext(queue, mem_objects, flags=0, wait_for=None)
Parameters: flags – from migrate_mem_object_flags_ext

New in version 2011.2.

Only available with the cl_ext_migrate_memobject extension.

Buffer

class pyopencl.Buffer(context, flags, size=0, hostbuf=None)

Create a Buffer. See mem_flags for values of flags. If hostbuf is specified and size is passed as zero, size defaults to the size of hostbuf.

Buffer inherits from MemoryObject.

Note

Python also defines a type of buffer object, and PyOpenCL interacts with those, too, as the host-side target of enqueue_copy(). Make sure to always be clear on whether a Buffer or a Python buffer object is needed.

Note that actual memory allocation in OpenCL may be deferred. Buffers are attached to a Context and are only moved to a device once the buffer is used on that device. That is also the point when out-of-memory errors will occur. If you’d like to be sure that there’s enough memory for your allocation, either use enqueue_migrate_mem_objects() (if available) or simply perform a small transfer to the buffer. See also pyopencl.tools.ImmediateAllocator.
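To make the distinction between a Buffer and a Python buffer object concrete, the following sketch uses only the standard library to show what counts as a Python buffer object and how its size in bytes can be read. The helper host_nbytes is hypothetical and not part of PyOpenCL; it merely illustrates the size that a Buffer would plausibly default to when hostbuf is given and size is zero.

```python
import array


def host_nbytes(obj):
    """Return the size in bytes of a Python buffer object (hypothetical helper)."""
    # memoryview() accepts any object that implements the buffer protocol
    # and raises TypeError for objects that do not (e.g. a plain list).
    return memoryview(obj).nbytes


doubles = array.array("d", [0.0] * 8)
print(host_nbytes(bytearray(64)))  # a bytearray of 64 one-byte items
print(host_nbytes(doubles))        # 8 items of doubles.itemsize bytes each
```

A numpy.ndarray exposes the same interface, which is why it can serve as hostbuf or as the host-side argument of enqueue_copy().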

get_sub_region(origin, size, flags=0)

Only available in OpenCL 1.1 and newer.

__getitem__(slc)

slc is a slice object indicating from which byte index range a sub-buffer is to be created. The flags argument of get_sub_region() is set to the same flags with which self was created.
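As an illustration of how a byte slice could translate into the origin and size arguments of get_sub_region(), here is a small standard-library sketch. The helper slice_to_sub_region is hypothetical, and rejecting strided or empty slices is an assumption about reasonable behavior, not a statement of what PyOpenCL does internally.

```python
def slice_to_sub_region(slc, buffer_size):
    """Map a byte slice onto (origin, size) for a buffer of buffer_size bytes."""
    # slice.indices() resolves None and negative bounds against the buffer size.
    start, stop, stride = slc.indices(buffer_size)
    if stride != 1:
        raise ValueError("sub-buffers must be contiguous (stride 1)")
    if stop <= start:
        raise ValueError("sub-buffer must have positive size")
    return start, stop - start


# Bytes 256 up to (but excluding) 512 of a 1024-byte buffer.
origin, size = slice_to_sub_region(slice(256, 512), 1024)
print(origin, size)
```

Under these assumptions, buf[256:512] on a 1024-byte Buffer would correspond to buf.get_sub_region(256, 256, flags), with flags taken from the creation of buf.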

pyopencl.enqueue_fill_buffer(queue, mem, pattern, offset, size, wait_for=None)
Parameters:
  • mem – the on device Buffer
  • pattern – a buffer object (likely a numpy.ndarray, e.g. np.uint32(0))

Fills a buffer with the provided pattern.

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

Only available with CL 1.2.

New in version 2011.2.
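The underlying OpenCL call, clEnqueueFillBuffer, requires the pattern size to be a power of two of at most 128 bytes, and requires offset and size to be multiples of the pattern size. The following standard-library sketch checks those constraints before a fill is enqueued. check_fill_args is a hypothetical helper, not a PyOpenCL function, and a bytes object stands in for the NumPy scalar to keep the example dependency-free.

```python
def check_fill_args(pattern, offset, size):
    """Validate fill arguments against the clEnqueueFillBuffer constraints."""
    pattern_size = memoryview(pattern).nbytes
    if pattern_size not in (1, 2, 4, 8, 16, 32, 64, 128):
        raise ValueError("pattern size must be a power of two up to 128 bytes")
    if offset % pattern_size or size % pattern_size:
        raise ValueError("offset and size must be multiples of the pattern size")
    # Number of times the pattern is repeated within the filled range.
    return size // pattern_size


zero_u32 = (0).to_bytes(4, "little")  # four zero bytes, like np.uint32(0)
print(check_fill_args(zero_u32, offset=0, size=4096))
```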

pyopencl.enqueue_copy_buffer(queue, src, dst, byte_count=-1, src_offset=0, dst_offset=0, wait_for=None)
Parameters:
  • src – the source Buffer
  • dst – the destination device Buffer
  • byte_count – the number of bytes to copy

Performs a device to device copy from src to dst.

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

Shared Virtual Memory (SVM)

Shared virtual memory allows the host and the compute device to share address space, so that pointers on the host and on the device may have the same meaning. In addition, it allows the same memory to be accessed by both the host and the device. Coarse-grain SVM requires that buffers be mapped before being accessed on the host; fine-grain SVM does away with that requirement.

SVM requires OpenCL 2.0.

class pyopencl.SVM(mem)

Tags an object exhibiting the Python buffer interface (such as a numpy.ndarray) as referring to shared virtual memory.

Depending on the features of the OpenCL implementation, the following types of objects may be passed to/wrapped in this type:

  • coarse-grain shared memory as returned by (e.g.) csvm_empty() for any implementation of OpenCL 2.0.

    This is how coarse-grain SVM may be used from both host and device:

    svm_ary = cl.SVM(cl.csvm_empty(ctx, 1000, np.float32, alignment=64))
    assert isinstance(svm_ary.mem, np.ndarray)
    
    with svm_ary.map_rw(queue) as ary:
        ary.fill(17)  # use from host
    
    prg.twice(queue, svm_ary.mem.shape, None, svm_ary)
    
  • fine-grain shared memory as returned by (e.g.) fsvm_empty(), if the implementation supports fine-grained shared virtual memory. This memory may directly be passed to a kernel:

    ary = cl.fsvm_empty(ctx, 1000, np.float32)
    assert isinstance(ary, np.ndarray)
    
    prg.twice(queue, ary.shape, None, cl.SVM(ary))
    queue.finish() # synchronize
    print(ary) # access from host
    

    Observe how mapping (as needed in coarse-grain SVM) is no longer necessary.

  • any numpy.ndarray (or other Python object with a buffer interface) if the implementation supports fine-grained system shared virtual memory.

    This is how plain numpy arrays may directly be passed to a kernel:

    ary = np.zeros(1000, np.float32)
    prg.twice(queue, ary.shape, None, cl.SVM(ary))
    queue.finish() # synchronize
    print(ary) # access from host
    

Objects of this type may be passed to kernel calls and enqueue_copy(). Coarse-grain shared-memory must be mapped into host address space using map() before being accessed through the numpy interface.

Note

This object merely serves as a ‘tag’ that changes the behavior of functions to which it is passed. It has no special management relationship to the memory it tags. For example, it is permissible to grab a numpy.ndarray out of SVM.mem of one SVM instance and use the array to construct another. Neither of the tags needs to be kept alive.

New in version 2016.2.

mem

The wrapped object.

__init__(mem)
map(queue, flags, is_blocking=True, wait_for=None)
Parameters:
  • is_blocking – If False, subsequent code must wait on SVMMap.event in the returned object before accessing the mapped memory.
  • flags – a combination of pyopencl.map_flags, defaults to read-write.
Returns:

an SVMMap instance

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

map_ro(queue, is_blocking=True, wait_for=None)

Like map(), but with flags set for a read-only map.

map_rw(queue, is_blocking=True, wait_for=None)

Like map(), but with flags set for a read-write map.

as_buffer(ctx, flags=None)
Parameters:
  • ctx – a Context
  • flags – a combination of pyopencl.map_flags, defaults to read-write.
Returns:

a Buffer corresponding to self.

The memory referred to by this object must not be freed before the returned Buffer is released.

class pyopencl.SVMMap(svm, queue, event)
event

New in version 2016.2.

release(queue=None, wait_for=None)
Parameters: queue – a pyopencl.CommandQueue. Defaults to the one with which the map was created, if not specified.
Returns: a pyopencl.Event

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

This class may also be used as a context manager in a with statement. release() will be called upon exit from the with region. The value bound by the as clause of the with statement is the mapped Python object (e.g. a numpy array).

Allocating SVM

pyopencl.svm_empty(ctx, flags, shape, dtype, order='C', alignment=None)

Allocate an empty numpy.ndarray of the given shape, dtype and order. (See numpy.empty() for the meaning of these arguments.) The array will be allocated in shared virtual memory belonging to ctx.

Parameters:
  • ctx – a Context
  • flags – a combination of flags from svm_mem_flags.
  • alignment – the number of bytes to which the beginning of the memory is aligned. Defaults to the numpy.dtype.itemsize of dtype.
Returns:

a numpy.ndarray whose numpy.ndarray.base attribute is a SVMAllocation.

To pass the resulting array to an OpenCL kernel or enqueue_copy(), you will likely want to wrap the returned array in an SVM tag.

New in version 2016.2.

pyopencl.svm_empty_like(ctx, flags, ary, alignment=None)

Allocate an empty numpy.ndarray like the existing numpy.ndarray ary. The array will be allocated in shared virtual memory belonging to ctx.

Parameters:
  • ctx – a Context
  • flags – a combination of flags from svm_mem_flags.
  • alignment – the number of bytes to which the beginning of the memory is aligned. Defaults to the numpy.dtype.itemsize of the dtype of ary.
Returns:

a numpy.ndarray whose numpy.ndarray.base attribute is a SVMAllocation.

To pass the resulting array to an OpenCL kernel or enqueue_copy(), you will likely want to wrap the returned array in an SVM tag.

New in version 2016.2.

pyopencl.csvm_empty(ctx, shape, dtype, order='C', alignment=None)

Like svm_empty(), but with flags set for a coarse-grain read-write buffer.

New in version 2016.2.

pyopencl.csvm_empty_like(ctx, ary, alignment=None)

Like svm_empty_like(), but with flags set for a coarse-grain read-write buffer.

New in version 2016.2.

pyopencl.fsvm_empty(ctx, shape, dtype, order='C', alignment=None)

Like svm_empty(), but with flags set for a fine-grain read-write buffer.

New in version 2016.2.

pyopencl.fsvm_empty_like(ctx, ary, alignment=None)

Like svm_empty_like(), but with flags set for a fine-grain read-write buffer.

New in version 2016.2.

Operations on SVM

(See also Transfers.)

pyopencl.enqueue_svm_memfill(queue, dest, pattern, byte_count=None, wait_for=None)

Fill shared virtual memory with a pattern.

Parameters:
  • dest – a Python buffer object, optionally wrapped in an SVM object
  • pattern – a Python buffer object (e.g. a numpy.ndarray) with the fill pattern to be used.
  • byte_count – The size of the memory to be filled. Defaults to the entirety of dest.

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

New in version 2016.2.

pyopencl.enqueue_svm_migratemem(queue, svms, flags, wait_for=None)
Parameters:
  • svms – a collection of Python buffer objects (e.g. numpy arrays), optionally wrapped in SVM objects.
  • flags – a combination of mem_migration_flags

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

New in version 2016.2.

This function requires OpenCL 2.1.

SVM Allocation Holder

class pyopencl.SVMAllocation(ctx, size, alignment, flags, _interface=None)

An object whose lifetime is tied to an allocation of shared virtual memory.

Note

Most likely, you will not want to use this directly, but rather svm_empty() and related functions which allow access to this functionality using a friendlier, more Pythonic interface.

New in version 2016.2.

__init__(self, ctx, size, alignment, flags=None)
release()
enqueue_release(queue, wait_for=None)
Parameters: flags – a combination of pyopencl.map_flags
Returns: a pyopencl.Event

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

Image

class pyopencl.ImageFormat([channel_order, channel_type])
channel_order

See channel_order for possible values.

channel_data_type

See channel_type for possible values.

channel_count

New in version 0.91.5.

dtype_size

New in version 0.91.5.

itemsize

New in version 0.91.5.

__repr__()

Returns a str representation of the image format.

New in version 0.91.

Instances of this class are hashable, and two instances of this class may be compared using “==” and “!=”. (Hashability was added in version 2011.2.) Two objects are considered the same if the underlying OpenCL object is the same, as established by C pointer equality.

Changed in version 0.91: Constructor arguments added.

Changed in version 2013.2: ImageFormat was made comparable and hashable

pyopencl.get_supported_image_formats(context, flags, image_type)

See mem_flags for possible values of flags and mem_object_type for possible values of image_type.

class pyopencl.Image(context, flags, format, shape=None, pitches=None, hostbuf=None, is_array=False, buffer=None)

See mem_flags for values of flags. shape is a 2- or 3-tuple. format is an instance of ImageFormat. pitches is a 1-tuple for 2D images and a 2-tuple for 3D images, indicating the distance in bytes from one scan line to the next, and from one 2D image slice to the next.

If hostbuf is given and shape is None, then hostbuf.shape is used as the shape parameter.

Image inherits from MemoryObject.

Note

If you want to load images from numpy.ndarray instances or read images back into them, be aware that OpenCL images expect the x dimension to vary fastest, whereas in the default (C) order of numpy arrays, the last index varies fastest. If your array is arranged in the wrong order in memory, there are two possible fixes for this:

  • Convert the array to Fortran (column-major) order using numpy.asarray().
  • Pass ary.T.copy() to the image creation function.
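That the two fixes lead to the same memory layout can be checked with NumPy alone. The sketch below assumes a hypothetical array indexed as ary[x, y]; the variable names are illustrative only.

```python
import numpy as np

width, height = 4, 3
# Hypothetical input indexed as ary[x, y]; in C order, y varies fastest.
ary = np.arange(width * height, dtype=np.float32).reshape(width, height)

# Fix 1: same shape and indexing, but column-major memory (x varies fastest).
ary_f = np.asarray(ary, order="F")

# Fix 2: transposed C-order copy, indexed as ary_t[y, x] (x varies fastest).
ary_t = ary.T.copy()

# Compare the raw bytes in memory order; both fixes produce the same layout.
same_layout = ary_f.tobytes(order="A") == ary_t.tobytes()
print(ary_f.flags.f_contiguous, ary_t.flags.c_contiguous, same_layout)
```

The first fix keeps the ary[x, y] indexing on the host, while the second changes the host-side shape to (height, width).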

New in version 0.91.

Changed in version 2011.2: Added is_array and buffer, which are only available on CL 1.2 and newer.

info

Lower case versions of the mem_info and image_info constants may be used as attributes on instances of this class to directly query info attributes.

shape

Return the value of the shape constructor argument as a tuple.

get_image_info(param)

See image_info for values of param.

release()

Instances of this class are hashable, and two instances of this class may be compared using “==” and “!=”. (Hashability was added in version 2011.2.) Two objects are considered the same if the underlying OpenCL object is the same, as established by C pointer equality.

pyopencl.image_from_array(ctx, ary, num_channels=None, mode="r", norm_int=False)

Build a 2D or 3D Image from the numpy.ndarray ary. If num_channels is greater than one, the last dimension of ary must be identical to num_channels. ary must be in C order. If num_channels is not given, it defaults to 1 for scalar types and the number of entries for Vector Types.

The ImageFormat is chosen as the first num_channels components of “RGBA”.

Parameters: mode – “r” or “w” for read/write

Note

When reading from the image object, the indices passed to read_imagef are in the reverse order from what they would be when accessing ary from Python.

If norm_int is True, then the integer values are normalized to a floating point scale of 0..1 when read.

New in version 2011.2.

pyopencl.enqueue_fill_image(queue, mem, color, origin, region, wait_for=None)
Parameters: color – a buffer object (likely a numpy.ndarray)

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

Only available with CL 1.2.

New in version 2011.2.

Transfers

pyopencl.enqueue_copy(queue, dest, src, **kwargs)

Copy from Image, Buffer or the host to Image, Buffer or the host. (Note: host-to-host copies are unsupported.)

The following keyword arguments are available:

Parameters:
  • wait_for – (optional, default empty)
  • is_blocking – Wait for completion. Defaults to True. (Available on any copy involving host memory)
Returns:

A NannyEvent if the transfer involved a host-side buffer, otherwise an Event.

Note

Two types of ‘buffer’ occur in the arguments to this function, Buffer and ‘host-side buffers’. The latter are defined by Python and commonly called buffer objects. numpy arrays are a very common example. Make sure to always be clear on whether a Buffer or a Python buffer object is needed.
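A minimal round trip through a device Buffer might look as follows. This is a sketch that assumes PyOpenCL is installed and at least one OpenCL platform is available. The function name roundtrip_through_device is hypothetical, and create_some_context(interactive=False) is used only to pick some available device.

```python
import numpy as np


def roundtrip_through_device(host_ary):
    """Copy host_ary into a device Buffer and read it back (sketch)."""
    # Imported lazily so that merely defining this helper does not
    # require PyOpenCL or an OpenCL platform.
    import pyopencl as cl

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)

    # A Buffer (device side) as large as the host-side buffer object.
    dev_buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, size=host_ary.nbytes)

    # host -> Buffer; blocks by default because host memory is involved.
    cl.enqueue_copy(queue, dev_buf, host_ary)

    # Buffer -> host, into a freshly allocated array of the same shape.
    result = np.empty_like(host_ary)
    cl.enqueue_copy(queue, result, dev_buf)
    return result
```

Here dev_buf is a Buffer, while host_ary and result are Python buffer objects. Calling roundtrip_through_device(np.arange(16, dtype=np.float32)) should return an array equal to its input.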

Transfer Buffer ↔ host

Parameters: device_offset – offset in bytes (optional)

Note

The size of the transfer is controlled by the size of the host-side buffer. If the host-side buffer is a numpy.ndarray, you can control the transfer size by transferring into a smaller ‘view’ of the target array, like this:

cl.enqueue_copy(queue, large_dest_numpy_array[:15], src_buffer)
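This works because a NumPy slice is a view onto the same memory with a smaller size in bytes. The following NumPy-only sketch shows the size that such a transfer would cover; the array length and dtype are illustrative assumptions.

```python
import numpy as np

large_dest_numpy_array = np.zeros(100, dtype=np.float32)
view = large_dest_numpy_array[:15]

# The view shares memory with the full array but spans only 15 items,
# so a transfer into it would write 15 * itemsize bytes.
print(view.nbytes, np.shares_memory(view, large_dest_numpy_array))
```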

Transfer Buffer ↔ Buffer

Parameters:
  • byte_count – (optional) If not specified, defaults to the size of the source in versions 2012.x and earlier, and to the minimum of the size of the source and target from 2013.1 on.
  • src_offset – (optional)
  • dest_offset – (optional)

Rectangular Buffer ↔ host transfers (CL 1.1 and newer)

Parameters:
  • buffer_origin – tuple of int of length three or shorter. (mandatory)
  • host_origin – tuple of int of length three or shorter. (mandatory)
  • region – tuple of int of length three or shorter. (mandatory)
  • buffer_pitches – tuple of int of length two or shorter. (optional, “tightly-packed” if unspecified)
  • host_pitches – tuple of int of length two or shorter. (optional, “tightly-packed” if unspecified)
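To illustrate what “tightly-packed” means, the sketch below follows the OpenCL rules for clEnqueueReadBufferRect: an unspecified row pitch equals the region width in bytes, an unspecified slice pitch equals the row pitch times the region height, and the starting byte offset is derived from the origin. The helper names are hypothetical, and completing short tuples with 0 for origins and 1 for regions is an assumption about how shorter tuples are padded.

```python
def pad(values, length, fill):
    """Extend a short tuple to the given length (assumed padding rule)."""
    return tuple(values) + (fill,) * (length - len(values))


def rect_layout(origin, region, pitches=()):
    """Return (row_pitch, slice_pitch, start_offset), all in bytes."""
    x, y, z = pad(origin, 3, 0)
    width_bytes, rows, _slices = pad(region, 3, 1)
    row_pitch, slice_pitch = pad(pitches, 2, 0)
    # A pitch of 0 (unspecified) means tightly packed.
    row_pitch = row_pitch or width_bytes
    slice_pitch = slice_pitch or rows * row_pitch
    return row_pitch, slice_pitch, z * slice_pitch + y * row_pitch + x


# A region 16 bytes wide and 4 rows high, starting at byte 8 of row 2,
# in memory whose rows are 64 bytes apart.
print(rect_layout(origin=(8, 2), region=(16, 4), pitches=(64,)))
```

Note that the first entry of each origin and region is measured in bytes, while the remaining entries count rows and slices.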

Rectangular Buffer ↔ Buffer transfers (CL 1.1 and newer)

Parameters:
  • src_origin – tuple of int of length three or shorter. (mandatory)
  • dst_origin – tuple of int of length three or shorter. (mandatory)
  • region – tuple of int of length three or shorter. (mandatory)
  • src_pitches – tuple of int of length two or shorter. (optional, “tightly-packed” if unspecified)
  • dst_pitches – tuple of int of length two or shorter. (optional, “tightly-packed” if unspecified)

Transfer Image ↔ host

Parameters:
  • origin – tuple of int of length three or shorter. (mandatory)
  • region – tuple of int of length three or shorter. (mandatory)
  • pitches – tuple of int of length two or shorter. (optional)

Transfer Buffer ↔ Image

Parameters:
  • offset – offset in buffer (mandatory)
  • origin – tuple of int of length three or shorter. (mandatory)
  • region – tuple of int of length three or shorter. (mandatory)

Transfer Image ↔ Image

Parameters:
  • src_origin – tuple of int of length three or shorter. (mandatory)
  • dest_origin – tuple of int of length three or shorter. (mandatory)
  • region – tuple of int of length three or shorter. (mandatory)

Transfer SVM/host ↔ SVM/host

Parameters: byte_count – (optional) If not specified, defaults to the size of the source in versions 2012.x and earlier, and to the minimum of the size of the source and target from 2013.1 on.

Returns a new pyopencl.Event. wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution.

New in version 2011.1.

Mapping Memory into Host Address Space

class pyopencl.MemoryMap

release()

This class may also be used as a context manager in a with statement.

pyopencl.enqueue_map_buffer(queue, buf, flags, offset, shape, dtype, order="C", strides=None, wait_for=None, is_blocking=True)

wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution. shape, dtype, and order have the same meaning as in numpy.empty(). See map_flags for possible values of flags. strides, if given, overrides order.

Returns: a tuple (array, event). array is a numpy.ndarray representing the host side of the map. Its .base member contains a MemoryMap.

Changed in version 2011.1: is_blocking now defaults to True.

Changed in version 2013.1: order now defaults to “C”.

Changed in version 2013.2: Added strides argument.

pyopencl.enqueue_map_image(queue, buf, flags, origin, region, shape, dtype, order="C", strides=None, wait_for=None, is_blocking=True)

wait_for may either be None or a list of pyopencl.Event instances for whose completion this command waits before starting execution. shape, dtype, and order have the same meaning as in numpy.empty(). See map_flags for possible values of flags. strides, if given, overrides order.

Returns: a tuple (array, event). array is a numpy.ndarray representing the host side of the map. Its .base member contains a MemoryMap.

Changed in version 2011.1: is_blocking now defaults to True.

Changed in version 2013.1: order now defaults to “C”.

Changed in version 2013.2: Added strides argument.

Samplers

class pyopencl.Sampler(context, normalized_coords, addressing_mode, filter_mode)

normalized_coords is a bool indicating whether to use coordinates between 0 and 1 (True) or the texture’s natural pixel size (False). See addressing_mode and filter_mode for possible argument values.

info

Lower case versions of the sampler_info constants may be used as attributes on instances of this class to directly query info attributes.

get_info(param)

See sampler_info for values of param.

from_int_ptr(int_ptr_value, retain=True)

Constructs a pyopencl handle from a C-level pointer (given as the integer int_ptr_value). If retain is True (the default), pyopencl will call clRetainXXX on the provided object. If the previous owner of the object will not release the reference, retain should be set to False, to effectively transfer ownership to pyopencl.

Changed in version 2016.1: retain added

int_ptr

Instances of this class are hashable, and two instances of this class may be compared using “==” and “!=”. (Hashability was added in version 2011.2.) Two objects are considered the same if the underlying OpenCL object is the same, as established by C pointer equality.