Reference: Documentation for Internal API#

Targets#

See also Targets.

class loopy.target.c.POD(ast_builder, dtype, name)[source]#

A simple declarator: The type is given as a numpy.dtype and the name is given as a string.

class loopy.target.c.ScopingBlock(contents=[])[source]#

A block that is mandatory for scoping and may not be simplified away by loopy.codegen.result.merge_codegen_results().

class loopy.target.c.codegen.expression.ExpressionToCExpressionMapper(codegen_state, fortran_abi=False, type_inf_mapper=None)[source]#

Mapper that converts a loopy-semantic expression to a C-semantic expression with typecasts, appropriate arithmetic semantic mapping, etc.

Symbolic#

See also Expressions.

Loopy-specific expression types#

class loopy.symbolic.Literal(s)[source]#

A literal to be used during code generation.

Note

Only used in the output of loopy.target.c.codegen.expression.ExpressionToCExpressionMapper (and similar mappers). Not for use in Loopy source representation.

class loopy.symbolic.ArrayLiteral(children)[source]#

An array literal.

Note

Only used in the output of loopy.target.c.codegen.expression.ExpressionToCExpressionMapper (and similar mappers). Not for use in Loopy source representation.

class loopy.symbolic.FunctionIdentifier[source]#

A base class for symbols representing functions.

class loopy.symbolic.TypedCSE(child, prefix=None, dtype=None)[source]#

A pymbolic.primitives.CommonSubexpression annotated with a numpy.dtype.

class loopy.symbolic.TypeCast(type, child)[source]#

Only defined for numerical types with semantics matching numpy.ndarray.astype().

child#

The expression to be cast.

class loopy.symbolic.TaggedVariable(name, tags)[source]#

This is an identifier with tags, such as matrix$one, where ‘one’ identifies this specific use of the identifier. This mechanism may then be used to address these uses–such as by prefetching only accesses tagged a certain way.

tags#

A frozenset of subclasses of pytools.tag.Tag used to provide metadata on this object. Legacy string tags are converted to LegacyStringInstructionTag or, if they used to carry a functional meaning, the tag carrying that same functional meaning (e.g. UseStreamingStoreTag).

Inherits from pymbolic.primitives.Variable and pytools.tag.Taggable.
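The name$tag spelling described for TaggedVariable can be illustrated with a small standalone sketch. The helper split_tagged_identifier below is hypothetical and is not part of loopy; it only mimics how an identifier such as matrix$one separates into a name and a tag, and how individual uses could then be selected by tag.

```python
def split_tagged_identifier(identifier):
    """Split 'name$tag' into (name, tag); tag is None when absent.

    Hypothetical helper for illustration only, not loopy's parser.
    """
    name, sep, tag = identifier.partition("$")
    return name, (tag if sep else None)


# Three uses of the same array, two of them tagged.
uses = ["matrix$one", "matrix$two", "matrix"]
parsed = [split_tagged_identifier(u) for u in uses]

# Select only the uses carrying the tag 'one', e.g. as prefetch candidates.
tagged_one = [name for name, tag in parsed if tag == "one"]
```

In loopy itself the tags are stored as pytools.tag.Tag objects rather than bare strings; the string form here is a simplification.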

class loopy.symbolic.Reduction(operation, inames, expr, allow_simultaneous=False)[source]#

Represents a reduction operation on expr across inames.

operation#

An instance of loopy.library.reduction.ReductionOperation.

inames#

a list of inames across which reduction on expr is being carried out.

expr#

An expression which may have tuple type. If the expression has tuple type, it must be one of the following:

  • a tuple of pymbolic.primitives.Expression, or

  • a loopy.symbolic.Reduction, or

  • a function call or substitution rule invocation.

allow_simultaneous#

A bool. If not True, an iname is allowed to be used in precisely one reduction, to avoid mis-nesting errors.

class loopy.symbolic.LinearSubscript(aggregate, index)[source]#

Represents a linear index into a multi-dimensional array, completely ignoring any multi-dimensional layout.
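The difference between an ordinary subscript and a linear one can be sketched without loopy. The example below assumes a 3x4 array stored row-major in a flat Python list; the function names are illustrative only and are not part of loopy.

```python
# A 3x4 array stored row-major in one flat buffer (assumed layout).
n_rows, n_cols = 3, 4
flat = list(range(n_rows * n_cols))


def multi_dim_access(i, j):
    """Ordinary subscript a[i, j]: the layout decides where the entry lives."""
    return flat[i * n_cols + j]


def linear_access(k):
    """Linear subscript: index the underlying storage directly."""
    return flat[k]


# Element (1, 2) of the row-major view is the entry at linear index 6.
same_element = multi_dim_access(1, 2) == linear_access(6)
```

The linear access never consults n_rows or n_cols, which is the sense in which the multi-dimensional layout is ignored.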

class loopy.symbolic.RuleArgument(index)[source]#

Represents a (numbered) argument of a loopy.SubstitutionRule. Only used internally in the rule-aware mappers to match subst rules independently of argument names.

class loopy.symbolic.ExpansionState(*args, **kwargs)[source]#
kernel#
instruction#
stack#

a tuple representing the current expansion stack, as a tuple of (name, tag) pairs.

arg_context#

a dict representing current argument values

class loopy.symbolic.RuleAwareIdentityMapper(rule_mapping_context)[source]#

Note: the third argument dragged around by this mapper is the current ExpansionState.

Subclasses of this must be careful to not touch identifiers that are in ExpansionState.arg_context.

class loopy.symbolic.ResolvedFunction(function)[source]#

A function identifier whose definition is known in a loopy program. A function is said to be known in a TranslationUnit if its name maps to an InKernelCallable in loopy.TranslationUnit.callables_table. Refer to Function Interface.

function#

An instance of pymbolic.primitives.Variable or loopy.library.reduction.ReductionOpFunction.

class loopy.symbolic.SubArrayRef(swept_inames, subscript)[source]#

An algebraic expression to map an affine memory layout pattern (known as a sub-array) as consecutive elements of the sweeping axes, which are defined using SubArrayRef.swept_inames.

swept_inames#

An instance of tuple denoting the axes to which the sub-array is to be mapped.

subscript#

An instance of pymbolic.primitives.Subscript denoting the array in the kernel.

is_equal(other)[source]#

Returns True iff the sub-array refs have identical expressions.

Expression Manipulation Helpers#

loopy.symbolic.simplify_using_aff(kernel, expr)[source]#

Simplifies expr on kernel’s domain.

Parameters

expr – An instance of pymbolic.primitives.Expression.

Types#

DTypes of variables in a loopy.LoopKernel must be picklable, so in the codegen pipeline user-provided types are converted to loopy.types.LoopyType.

class loopy.types.LoopyType[source]#

Abstract class for dtypes of variables encountered in a loopy.LoopKernel.

class loopy.types.NumpyType(dtype, target=None)[source]#
class loopy.types.AtomicType[source]#

Abstract class for dtypes of variables encountered in a loopy.LoopKernel on which atomic operations are performed.

class loopy.types.AtomicNumpyType(dtype, target=None)[source]#

A dtype wrapper that indicates that the described type should be capable of atomic operations.

Codegen#

class loopy.codegen.ImplementedDataInfo(target, name, dtype, arg_class, base_name=None, shape=None, strides=None, unvec_shape=None, unvec_strides=None, offset_for_name=None, stride_for_name_and_axis=None, allows_offset=None, is_written=None)[source]#
name#

The expanded name of the array. Note that, for example in the case of separate-array-tagged axes, multiple implemented arrays may correspond to one user-facing array.

dtype#
arg_class#
base_name#

The user-facing name of the underlying array. May be None for non-array arguments.

shape#
strides#

Strides in multiples of dtype.itemsize.

unvec_shape#
unvec_strides#

Strides in multiples of dtype.itemsize that account for loopy.kernel.array.VectorArrayDimTag in a scalar manner.

offset_for_name#
stride_for_name_and_axis#

A tuple (name, axis) indicating the (implementation-facing) name of the array and axis number for which this argument provides the strides.

allows_offset#
is_written#
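As a note on the strides attribute of ImplementedDataInfo: strides there are counted in multiples of the item size, whereas Python buffers report strides in bytes. The following sketch uses only the standard library's memoryview (no loopy objects are involved) to show the conversion between the two conventions for a 3x4 array of doubles.

```python
# Twelve doubles viewed as a 3x4 row-major array via the standard library.
buf = bytearray(3 * 4 * 8)
view = memoryview(buf).cast("d", shape=[3, 4])

byte_strides = view.strides  # strides in bytes, as Python reports them

# Divide by the item size to get strides in multiples of the element size.
element_strides = tuple(s // view.itemsize for s in byte_strides)
```

Here the byte strides are (32, 8) and the element strides are (4, 1).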
class loopy.codegen.PreambleInfo(*args, **kwargs)[source]#
kernel#
seen_dtypes#
seen_functions#
seen_atomic_dtypes#
codegen_state#
class loopy.codegen.VectorizationInfo(iname, length, space)[source]#
iname#
length#
space#
class loopy.codegen.SeenFunction(name, c_name, arg_dtypes, result_dtypes)[source]#

This is used to track functions that emerge late during code generation, e.g. C functions to realize arithmetic. No connection with InKernelCallable.

name#
c_name#
arg_dtypes#

a tuple of arg dtypes

result_dtypes#

a tuple of result dtypes

class loopy.codegen.CodeGenerationState(kernel, target, implemented_data_info, implemented_domain, implemented_predicates, seen_dtypes, seen_functions, seen_atomic_dtypes, var_subst_map, allow_complex, callables_table, is_entrypoint, vectorization_info=None, var_name_generator=None, is_generating_device_code=None, gen_program_name=None, schedule_index_end=None, codegen_cachemanager=None)[source]#
kernel#
target#
implemented_data_info#

a list of ImplementedDataInfo objects.

implemented_domain#

The entire implemented domain (as an islpy.Set) i.e. all constraints that have been enforced so far.

implemented_predicates#

A frozenset of predicates for which checks have been implemented.

seen_dtypes#

set of dtypes that were encountered

seen_functions#

set of SeenFunction instances

seen_atomic_dtypes#
var_subst_map#
allow_complex#
vectorization_info#

None or an instance of VectorizationInfo

is_generating_device_code#
gen_program_name#

None (indicating that host code is being generated) or the name of the device program currently being generated.

schedule_index_end#
callables_table#

A mapping from callable names to instances of loopy.kernel.function_interface.InKernelCallable.

is_entrypoint#

A bool to indicate if the code is being generated for an entrypoint kernel

codegen_cache_manager#

An instance of loopy.codegen.tools.CodegenOperationCacheManager.

class loopy.codegen.TranslationUnitCodeGenerationResult(*args, **kwargs)[source]#
host_program#

A mapping from names of entrypoints to their host GeneratedProgram.

device_programs#

A list of GeneratedProgram instances intended to run on the compute device.

host_preambles#
device_preambles#
implemented_data_infos#

A mapping from names of entrypoints to their list of ImplementedDataInfo objects.

host_code()[source]#
device_code()[source]#
all_code()[source]#
class loopy.codegen.result.GeneratedProgram(*args, **kwargs)[source]#
name#
is_device_program#
ast#

Once generated, this captures the AST of the overall function definition, including the body.

body_ast#

Once generated, this captures the AST of the operative function body (including declaration of necessary temporaries), but not the overall function definition.

class loopy.codegen.result.CodeGenerationResult(*args, **kwargs)[source]#
host_program#
device_programs#

A list of GeneratedProgram instances intended to run on the compute device.

implemented_domains#

A mapping from instruction ID to a list of islpy.Set objects.

host_preambles#
device_preambles#
host_code()[source]#
device_code()[source]#
all_code()[source]#
implemented_data_info#

a list of loopy.codegen.ImplementedDataInfo objects. Only added at the very end of code generation.

loopy.codegen.result.merge_codegen_results(codegen_state, elements, collapse=True)[source]#
loopy.codegen.result.generate_host_or_device_program(codegen_state, schedule_index)[source]#
class loopy.codegen.tools.KernelProxyForCodegenOperationCacheManager(instructions: List[loopy.kernel.instruction.InstructionBase], linearization: List[loopy.schedule.ScheduleItem], inames: Dict[str, loopy.kernel.data.Iname])[source]#

Proxy to loopy.LoopKernel to be used by CodegenOperationCacheManager.

class loopy.codegen.tools.CodegenOperationCacheManager(kernel_proxy)[source]#

Caches operations arising during the codegen pipeline.

kernel_proxy#

An instance of KernelProxyForCodegenOperationCacheManager.

with_kernel(kernel)[source]#

Returns a new instance of CodegenOperationCacheManager corresponding to kernel if the cached variables in self would be invalid for kernel, else returns self.

get_parallel_inames_in_a_callkernel(callkernel_index)[source]#

Returns a frozenset of the parallel inames in a callkernel.

Parameters

callkernel_index – Index of the loopy.schedule.CallKernel in the CodegenOperationCacheManager.kernel_proxy’s schedule, whose parallel inames are to be found.

Reduction Operation#

class loopy.library.reduction.ReductionOperation[source]#

Subclasses of this type have to be hashable, picklable, and equality-comparable.

class loopy.library.reduction.ScalarReductionOperation[source]#
class loopy.library.reduction.SumReductionOperation[source]#
class loopy.library.reduction.ProductReductionOperation[source]#
class loopy.library.reduction.MaxReductionOperation[source]#
class loopy.library.reduction.MinReductionOperation[source]#
class loopy.library.reduction.ReductionOpFunction(reduction_op)[source]#

Iname Tags#

loopy.kernel.data.filter_iname_tags_by_type(tags, tag_type, max_num=None, min_num=None)[source]#

Return the subset of tags that are of type tag_type. Raises an exception if the number of tags found is greater than max_num or less than min_num.

Parameters
  • tags – An iterable of tags.

  • tag_type – a subclass of loopy.kernel.data.InameImplementationTag.

  • max_num – the maximum number of tags expected to be found.

  • min_num – the minimum number of tags expected to be found.
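The behaviour described for filter_iname_tags_by_type can be sketched in plain Python. This is not loopy's implementation: the tag classes are stand-ins, the function name filter_tags_by_type is hypothetical, and both the frozenset return type and the use of ValueError are assumptions, since the text above only says that an exception is raised.

```python
class Tag:
    """Stand-in base class for tags (hypothetical)."""


class LocalTag(Tag):
    pass


class UnrollTag(Tag):
    pass


def filter_tags_by_type(tags, tag_type, max_num=None, min_num=None):
    """Return the tags that are instances of tag_type, checking the count."""
    result = frozenset(tag for tag in tags if isinstance(tag, tag_type))
    if max_num is not None and len(result) > max_num:
        raise ValueError(f"more than {max_num} tags of type {tag_type.__name__}")
    if min_num is not None and len(result) < min_num:
        raise ValueError(f"fewer than {min_num} tags of type {tag_type.__name__}")
    return result


tags = [LocalTag(), UnrollTag()]
local_tags = filter_tags_by_type(tags, LocalTag, max_num=1)

# Asking for at least two local tags fails, since only one is present.
try:
    filter_tags_by_type(tags, LocalTag, min_num=2)
    too_few_detected = False
except ValueError:
    too_few_detected = True
```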

class loopy.kernel.data.InameImplementationTag(*args, **kwargs)[source]#
class loopy.kernel.data.ConcurrentTag(*args, **kwargs)[source]#
class loopy.kernel.data.UniqueInameTag(*args, **kwargs)[source]#
class loopy.kernel.data.AxisTag(axis)[source]#
class loopy.kernel.data.LocalInameTag(axis)[source]#
class loopy.kernel.data.GroupInameTag(axis)[source]#
class loopy.kernel.data.VectorizeTag(*args, **kwargs)[source]#
class loopy.kernel.data.UnrollTag(*args, **kwargs)[source]#
class loopy.kernel.data.Iname(name, tags=frozenset({}))[source]#

Records an iname in a LoopKernel. See Loop Domain Forest for semantics of inames in loopy.

This class records the metadata attached to an iname as instances of pytools.tag.Tag. A tag may be a built-in tag like loopy.kernel.data.InameImplementationTag or a user-defined custom tag. Custom tags may be attached to inames to be used in targeting later during transformations.

name#

An instance of str, denoting the iname’s name.

tags#

An instance of frozenset of pytools.tag.Tag.

class loopy.kernel.data.KernelArgument(**kwargs)[source]#

Base class for all argument types

Array#

class loopy.kernel.array.ArrayDimImplementationTag(*args, **kwargs)[source]#
class loopy.kernel.array._StrideArrayDimTagBase(*args, **kwargs)[source]#
target_axis#

For objects (such as images) with more than one axis, target_axis sets which of these indices is being targeted by this dimension. Note that there may be multiple dim_tags with the same target_axis; their contributions are combined additively.

Note that “normal” arrays only have one target_axis.

layout_nesting_level#

For determining the stride of ComputedStrideArrayDimTag, this determines the layout nesting level of this axis. This must be a contiguous sequence of unique integers starting at 0 in a single ArrayBase.dim_tags. The lowest nesting level varies fastest when viewed in linear memory.

May be None on FixedStrideArrayDimTag, in which case no ComputedStrideArrayDimTag instances may occur.
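How layout nesting levels translate into strides can be sketched as follows. This is a simplified illustration under stated assumptions, not loopy's stride computation: it assumes a single target_axis, no padding (pad_to is ignored), and strides counted in elements. The function name strides_from_nesting_levels is hypothetical.

```python
def strides_from_nesting_levels(shape, nesting_levels):
    """Return per-axis strides in elements, without padding.

    The axis with nesting level 0 varies fastest and gets stride 1; each
    higher level gets the product of the lengths of all lower-level axes.
    """
    if sorted(nesting_levels) != list(range(len(shape))):
        raise ValueError("nesting levels must be 0..n-1, each used once")

    strides = [None] * len(shape)
    running = 1
    # Visit axes from the fastest-varying (level 0) to the slowest.
    for axis in sorted(range(len(shape)), key=lambda ax: nesting_levels[ax]):
        strides[axis] = running
        running *= shape[axis]
    return tuple(strides)


# Last axis fastest (row-major) versus first axis fastest (column-major).
row_major = strides_from_nesting_levels((3, 4), (1, 0))
column_major = strides_from_nesting_levels((3, 4), (0, 1))
```

For the shape (3, 4), the row-major assignment yields strides (4, 1) and the column-major assignment yields (1, 3).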

class loopy.kernel.array.FixedStrideArrayDimTag(stride, target_axis=0, layout_nesting_level=None)[source]#

An arg dimension implementation tag for a fixed (potentially symbolic) stride.

stride#

May be one of the following:

The stride is given in units of ArrayBase.dtype.

class loopy.kernel.array.ComputedStrideArrayDimTag(layout_nesting_level, pad_to=None, target_axis=0)[source]#
pad_to#

ArrayBase.dtype granularity to which to pad this dimension

This type of stride arg dim gets converted to FixedStrideArrayDimTag on input to ArrayBase subclasses.

class loopy.kernel.array.SeparateArrayArrayDimTag(*args, **kwargs)[source]#
class loopy.kernel.array.VectorArrayDimTag(*args, **kwargs)[source]#
loopy.kernel.array.parse_array_dim_tags(dim_tags, n_axes=None, use_increasing_target_axes=False, dim_names=None)[source]#

Checks#

loopy.check.check_for_integer_subscript_indices(t_unit)[source]#

Checks if every array access is of type int.

loopy.check.check_for_duplicate_insn_ids(knl)[source]#

Check if multiple instructions of knl have the same loopy.InstructionBase.id.
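The idea behind a duplicate-ID check can be sketched with the standard library. The function find_duplicate_ids is hypothetical and works on plain strings rather than on loopy instructions; how loopy reports a violation is not stated here, so this sketch simply returns the offending IDs.

```python
from collections import Counter


def find_duplicate_ids(insn_ids):
    """Return the sorted list of IDs that occur more than once."""
    counts = Counter(insn_ids)
    return sorted(insn_id for insn_id, n in counts.items() if n > 1)


duplicates = find_duplicate_ids(["init", "update", "init", "store"])
```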

loopy.check.check_for_double_use_of_hw_axes(t_unit)[source]#

Check if any instruction of kernel is within multiple inames tagged with the same hw axis tag.

loopy.check.check_insn_attributes(kernel)[source]#

Check for legality of attributes of every instruction in kernel.

loopy.check.check_loop_priority_inames_known(kernel)[source]#

Checks if the inames in loopy.LoopKernel.loop_priority are part of the kernel’s domain.

loopy.check.check_multiple_tags_allowed(kernel)[source]#

Checks if multiple tags of an iname are compatible.

loopy.check.check_for_inactive_iname_access(kernel)[source]#

Check if any instruction accesses an iname but is not within it.

loopy.check.check_for_unused_inames(kernel)[source]#

Check if there are any unused inames in the kernel.

loopy.check.check_for_write_races(kernel)[source]#

Check if any memory accesses lead to write races.

loopy.check.check_for_data_dependent_parallel_bounds(kernel)[source]#

Check that inames tagged as hw axes have bounds that are known at kernel launch.

loopy.check.check_bounds(t_unit)[source]#

Performs out-of-bound check for every array access.

loopy.check.check_variable_access_ordered(kernel)[source]#

Checks that between each write to a variable and all other accesses to the variable there is either:

Schedule#

class loopy.schedule.ScheduleItem(*args, **kwargs)[source]#
class loopy.schedule.BeginBlockItem(*args, **kwargs)[source]#
class loopy.schedule.EndBlockItem(*args, **kwargs)[source]#
class loopy.schedule.CallKernel(*args, **kwargs)[source]#
class loopy.schedule.Barrier(*args, **kwargs)[source]#
comment#

A plain-text comment explaining why the barrier was inserted.

synchronization_kind#

"local" or "global"

mem_kind#

"local" or "global"

originating_insn_id#
class loopy.schedule.RunInstruction(*args, **kwargs)[source]#
class loopy.schedule.MinRecursionLimitForScheduling(kernel)[source]#