# Reference: Documentation for Internal API

## Targets

class loopy.target.c.POD(ast_builder, dtype, name)

A simple declarator: The type is given as a numpy.dtype and the name is given as a string.

class loopy.target.c.ScopingBlock(contents=[])

A block that is mandatory for scoping and may not be simplified away by loopy.codegen.result.merge_codegen_results().

class loopy.target.c.codegen.expression.ExpressionToCExpressionMapper(codegen_state, fortran_abi=False, type_inf_mapper=None)

Mapper that converts a loopy-semantic expression to a C-semantic expression with typecasts, appropriate arithmetic semantic mapping, etc.

## Symbolic

class loopy.symbolic.Literal(s)

A literal to be used during code generation.

Note

Only used in the output of loopy.target.c.codegen.expression.ExpressionToCExpressionMapper (and similar mappers). Not for use in Loopy source representation.

class loopy.symbolic.ArrayLiteral(children)

An array literal.

Note

Only used in the output of loopy.target.c.codegen.expression.ExpressionToCExpressionMapper (and similar mappers). Not for use in Loopy source representation.

class loopy.symbolic.FunctionIdentifier

A base class for symbols representing functions.

class loopy.symbolic.TypedCSE(child, prefix=None, dtype=None)

A pymbolic.primitives.CommonSubexpression annotated with a numpy.dtype.

class loopy.symbolic.TypeCast(type, child)

Only defined for numerical types with semantics matching numpy.ndarray.astype().

child

The expression to be cast.

class loopy.symbolic.TaggedVariable(name, tag)

This is an identifier with a tag, such as ‘matrix$one’, where ‘one’ identifies this specific use of the identifier. This mechanism may then be used to address these uses, for example by prefetching only accesses tagged a certain way.
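The naming convention can be illustrated with a small sketch. This is not loopy's parser; the helper name `split_tagged_identifier` is hypothetical and only shows how a name and its tag relate when `$` is the separator.

```python
def split_tagged_identifier(identifier):
    """Split 'name$tag' into (name, tag); tag is None if there is no '$'.

    Hypothetical helper for illustration only; not part of loopy.
    """
    name, sep, tag = identifier.partition("$")
    return (name, tag if sep else None)


print(split_tagged_identifier("matrix$one"))
print(split_tagged_identifier("matrix"))
```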

class loopy.symbolic.Reduction(operation, inames, expr, allow_simultaneous=False)

Represents a reduction operation on expr across inames.

operation

an instance of loopy.library.reduction.ReductionOperation

inames

a list of inames across which reduction on expr is being carried out.

expr

An expression which may have tuple type. If the expression has tuple type, it must be one of the following:

• a tuple of pymbolic.primitives.Expression, or

• a loopy.symbolic.Reduction, or

• a function call or substitution rule invocation.

allow_simultaneous

A bool. If not True, an iname is allowed to be used in precisely one reduction, to avoid mis-nesting errors.

class loopy.symbolic.LinearSubscript(aggregate, index)

Represents a linear index into a multi-dimensional array, completely ignoring any multi-dimensional layout.
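To make the distinction concrete, the sketch below contrasts a multi-dimensional index with the flat position it would occupy. It assumes a row-major (C-order) layout purely for the comparison; the function name is hypothetical and nothing here is loopy code.

```python
def row_major_linear_index(shape, indices):
    """Flatten a multi-dimensional index, assuming row-major (C) order."""
    linear = 0
    for extent, idx in zip(shape, indices):
        if not 0 <= idx < extent:
            raise IndexError("index out of range")
        linear = linear * extent + idx
    return linear


shape = (3, 4)
flat_storage = list(range(12))  # stand-in for the array's linear memory

# The multi-dimensional subscript [1, 2] and the linear subscript [6]
# address the same element under the row-major assumption.
position = row_major_linear_index(shape, (1, 2))
element = flat_storage[position]
```

A linear subscript corresponds to indexing `flat_storage` directly, without going through the per-axis computation.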

class loopy.symbolic.RuleArgument(index)

Represents a (numbered) argument of a loopy.SubstitutionRule. Only used internally in the rule-aware mappers to match subst rules independently of argument names.

class loopy.symbolic.ExpansionState(*args, **kwargs)
kernel
instruction
stack

a tuple representing the current expansion stack, as a tuple of (name, tag) pairs.

arg_context

a dict representing current argument values

class loopy.symbolic.RuleAwareIdentityMapper(rule_mapping_context)

Note: the third argument dragged around by this mapper is the current ExpansionState.

Subclasses of this must be careful to not touch identifiers that are in ExpansionState.arg_context.

## Types

DTypes of variables in a loopy.LoopKernel must be picklable, so in the codegen pipeline user-provided types are converted to loopy.types.LoopyType.

class loopy.types.LoopyType

Abstract class for dtypes of variables encountered in a loopy.LoopKernel.

class loopy.types.NumpyType(dtype, target=None)

This object works around several issues with pickling numpy.dtype objects. It does so by serving as a picklable wrapper around the original dtype.

The issues are the following:

• numpy.dtype objects for custom types in loopy are usually registered in the target’s dtype registry. This registration may have been lost after unpickling. This container restores it implicitly, as part of unpickling.

• There is a numpy bug (https://github.com/numpy/numpy/issues/4317) that prevents unpickled dtypes from hashing properly. This is solved by retrieving the ‘canonical’ type from the dtype registry.
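The general pattern behind both points can be sketched with the standard library alone. Everything below is a hypothetical stand-in: `FakeDType`, `DTYPE_REGISTRY`, `register` and `PicklableTypeWrapper` are invented for illustration and do not reflect how loopy.types.NumpyType is actually implemented.

```python
import pickle

# Hypothetical stand-in for a target's dtype registry: name -> canonical object.
DTYPE_REGISTRY = {}


class FakeDType:
    """Stand-in for numpy.dtype, identified only by its name."""

    def __init__(self, name):
        self.name = name


def register(dtype):
    """Return the canonical instance for dtype.name, registering it if new."""
    return DTYPE_REGISTRY.setdefault(dtype.name, dtype)


class PicklableTypeWrapper:
    """Pickles only the type name and resolves the canonical type on load."""

    def __init__(self, dtype):
        self.dtype = register(dtype)

    def __getstate__(self):
        return {"name": self.dtype.name}

    def __setstate__(self, state):
        # Re-register on load so a lost registry entry is restored, and
        # always keep the canonical instance so hashing stays consistent.
        self.dtype = register(FakeDType(state["name"]))

    def __eq__(self, other):
        return (isinstance(other, PicklableTypeWrapper)
                and self.dtype is other.dtype)

    def __hash__(self):
        return hash(self.dtype.name)


original = PicklableTypeWrapper(FakeDType("float64"))
restored = pickle.loads(pickle.dumps(original))
```

Because the wrapper stores only a name and looks the type up again when loaded, the restored object compares and hashes like the original even if the registry was emptied in between.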

class loopy.types.AtomicType

Abstract class for dtypes of variables encountered in a loopy.LoopKernel on which atomic operations are performed.

class loopy.types.AtomicNumpyType(dtype, target=None)

A dtype wrapper that indicates that the described type should be capable of atomic operations.

## Codegen

class loopy.codegen.ImplementedDataInfo(target, name, dtype, arg_class, base_name=None, shape=None, strides=None, unvec_shape=None, unvec_strides=None, offset_for_name=None, stride_for_name_and_axis=None, allows_offset=None, is_written=None)
name

The expanded name of the array. Note that, for example in the case of separate-array-tagged axes, multiple implemented arrays may correspond to one user-facing array.

dtype
arg_class
base_name

The user-facing name of the underlying array. May be None for non-array arguments.

shape
strides

Strides in multiples of dtype.itemsize.
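For comparison, numpy.ndarray.strides reports strides in bytes, so the unit used here corresponds to dividing those values by the item size. The sketch below shows the conversion in plain Python; the helper name is hypothetical, and the example values are those of a C-contiguous 3x4 array of 8-byte elements.

```python
def element_strides(byte_strides, itemsize):
    """Convert byte strides to strides in multiples of the item size."""
    for stride in byte_strides:
        if stride % itemsize != 0:
            raise ValueError("byte stride is not a multiple of itemsize")
    return tuple(stride // itemsize for stride in byte_strides)


# A C-contiguous 3x4 array of 8-byte elements has byte strides (32, 8).
strides = element_strides((32, 8), itemsize=8)
```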

unvec_shape
unvec_strides

Strides in multiples of dtype.itemsize that account for loopy.kernel.array.VectorArrayDimTag in a scalar manner.

offset_for_name
stride_for_name_and_axis

A tuple (name, axis) indicating the (implementation-facing) name of the array and axis number for which this argument provides the strides.

allows_offset
is_written
class loopy.codegen.PreambleInfo(*args, **kwargs)
kernel
seen_dtypes
seen_functions
seen_atomic_dtypes
codegen_state
class loopy.codegen.VectorizationInfo(iname, length, space)
iname
length
space
class loopy.codegen.SeenFunction(name, c_name, arg_dtypes, result_dtypes)
name
c_name
arg_dtypes

a tuple of arg dtypes

result_dtypes

a tuple of result dtypes

class loopy.codegen.CodeGenerationState(kernel, implemented_data_info, implemented_domain, implemented_predicates, seen_dtypes, seen_functions, seen_atomic_dtypes, var_subst_map, allow_complex, vectorization_info=None, var_name_generator=None, is_generating_device_code=None, gen_program_name=None, schedule_index_end=None)
kernel
implemented_data_info

a list of ImplementedDataInfo objects.

implemented_domain

The entire implemented domain (as an islpy.Set), i.e. all constraints that have been enforced so far.

implemented_predicates

A frozenset of predicates for which checks have been implemented.

seen_dtypes

set of dtypes that were encountered

seen_functions

set of SeenFunction instances

seen_atomic_dtypes
var_subst_map
allow_complex
vectorization_info

None or an instance of VectorizationInfo

is_generating_device_code
gen_program_name

None (indicating that host code is being generated) or the name of the device program currently being generated.

schedule_index_end
class loopy.codegen.result.GeneratedProgram(*args, **kwargs)
name
is_device_program
ast

Once generated, this captures the AST of the overall function definition, including the body.

body_ast

Once generated, this captures the AST of the operative function body (including declaration of necessary temporaries), but not the overall function definition.

class loopy.codegen.result.CodeGenerationResult(*args, **kwargs)
host_program
device_programs

A list of GeneratedProgram instances intended to run on the compute device.

implemented_domains

A mapping from instruction ID to a list of islpy.Set objects.

host_preambles
device_preambles
host_code()
device_code()
all_code()
implemented_data_info

a list of loopy.codegen.ImplementedDataInfo objects. Only added at the very end of code generation.

loopy.codegen.result.merge_codegen_results(codegen_state, elements, collapse=True)
loopy.codegen.result.generate_host_or_device_program(codegen_state, schedule_index)

## Reduction Operation

class loopy.library.reduction.ReductionOperation

Subclasses of this type have to be hashable, picklable, and equality-comparable.
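One way to meet all three requirements at once is sketched below. `ExampleSumOperation` is a hypothetical stand-in that does not derive from loopy's class; it only demonstrates the three properties on an object carrying a `forced_result_type` field.

```python
import pickle
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExampleSumOperation:
    """Hypothetical stand-in, not a loopy class.

    frozen=True (with the default eq=True) makes the dataclass generate
    field-based __eq__ and __hash__; plain attribute state keeps
    instances picklable.
    """

    forced_result_type: Optional[str] = None


first = ExampleSumOperation("float32")
second = pickle.loads(pickle.dumps(first))
```

Deriving equality and the hash from the same fields keeps the two consistent, which matters when such objects are used as dictionary keys or set members.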

class loopy.library.reduction.ScalarReductionOperation(forced_result_type=None)
class loopy.library.reduction.SumReductionOperation(forced_result_type=None)
class loopy.library.reduction.ProductReductionOperation(forced_result_type=None)
class loopy.library.reduction.MaxReductionOperation(forced_result_type=None)
class loopy.library.reduction.MinReductionOperation(forced_result_type=None)

## Iname Tags

loopy.kernel.data.filter_iname_tags_by_type(tags, tag_type, max_num=None, min_num=None)

Return the subset of tags that matches type tag_type. Raises an exception if the number of tags found is greater than max_num or less than min_num.

Parameters
• tags – An iterable of tags.

• tag_type – a subclass of loopy.kernel.data.IndexTag.

• max_num – the maximum number of tags expected to be found.

• min_num – the minimum number of tags expected to be found.
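The documented behavior can be approximated in a few lines. This is a simplified sketch, not loopy's source: the function name `filter_tags_by_type`, the use of `isinstance` for matching, the `ValueError` exception type and the stand-in tag classes are all assumptions made for illustration.

```python
class IndexTag:
    """Hypothetical stand-in for a tag base class."""


class UnrollTag(IndexTag):
    """Hypothetical stand-in."""


class VectorizeTag(IndexTag):
    """Hypothetical stand-in."""


def filter_tags_by_type(tags, tag_type, max_num=None, min_num=None):
    """Return the set of tags that are instances of tag_type.

    Raises ValueError if the number of matches violates max_num or min_num.
    """
    result = {tag for tag in tags if isinstance(tag, tag_type)}
    if max_num is not None and len(result) > max_num:
        raise ValueError(
            f"found more than {max_num} tags of type {tag_type.__name__}")
    if min_num is not None and len(result) < min_num:
        raise ValueError(
            f"found fewer than {min_num} tags of type {tag_type.__name__}")
    return result


tags = [UnrollTag(), VectorizeTag()]
unroll_tags = filter_tags_by_type(tags, UnrollTag, max_num=1)
```

Passing `max_num=1` is a convenient way to assert that at most one tag of a given kind is attached to an iname while retrieving it.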

class loopy.kernel.data.IndexTag(*args, **kwargs)
class loopy.kernel.data.ConcurrentTag(*args, **kwargs)
class loopy.kernel.data.UniqueTag(*args, **kwargs)
class loopy.kernel.data.AxisTag(axis)
class loopy.kernel.data.LocalIndexTag(axis)
class loopy.kernel.data.GroupIndexTag(axis)
class loopy.kernel.data.VectorizeTag(*args, **kwargs)
class loopy.kernel.data.UnrollTag(*args, **kwargs)

## Array

class loopy.kernel.array.ArrayDimImplementationTag(*args, **kwargs)
class loopy.kernel.array._StrideArrayDimTagBase(*args, **kwargs)
target_axis

For objects (such as images) with more than one axis, target_axis sets which of these indices is being targeted by this dimension. Note that there may be multiple dim_tags with the same target_axis; their contributions are combined additively.

Note that “normal” arrays only have one target_axis.

layout_nesting_level

For determining the stride of ComputedStrideArrayDimTag, this determines the layout nesting level of this axis. This must be a contiguous sequence of unique integers starting at 0 in a single ArrayBase.dim_tags. The lowest nesting level varies fastest when viewed in linear memory.

May be None on FixedStrideArrayDimTag, in which case no ComputedStrideArrayDimTag instances may occur.
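The relationship between nesting levels and strides can be sketched as follows. This is a simplified illustration under stated assumptions: fixed integer extents, no padding, a single target axis, and a hypothetical function name. It is not the routine loopy uses internally.

```python
def strides_from_nesting_levels(shape, nesting_levels):
    """Compute per-axis strides (in elements) from layout nesting levels.

    The axis with nesting level 0 gets stride 1; every other axis gets the
    product of the extents of all axes with lower nesting levels.
    """
    if sorted(nesting_levels) != list(range(len(shape))):
        raise ValueError("nesting levels must be the unique integers 0..n-1")
    strides = [None] * len(shape)
    stride = 1
    # Visit axes from the fastest-varying (level 0) to the slowest.
    for axis in sorted(range(len(shape)), key=lambda ax: nesting_levels[ax]):
        strides[axis] = stride
        stride *= shape[axis]
    return tuple(strides)


# Last axis has level 0: row-major (C order) strides for a 3x4 array.
c_strides = strides_from_nesting_levels((3, 4), (1, 0))
# First axis has level 0: column-major (Fortran order) strides.
f_strides = strides_from_nesting_levels((3, 4), (0, 1))
```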

class loopy.kernel.array.FixedStrideArrayDimTag(stride, target_axis=0, layout_nesting_level=None)

An arg dimension implementation tag for a fixed (potentially symbolic) stride.

stride

May be one of the following:

The stride is given in units of ArrayBase.dtype.

class loopy.kernel.array.ComputedStrideArrayDimTag(layout_nesting_level, pad_to=None, target_axis=0)
pad_to

ArrayBase.dtype granularity to which to pad this dimension

This type of stride arg dim gets converted to FixedStrideArrayDimTag on input to ArrayBase subclasses.

class loopy.kernel.array.SeparateArrayArrayDimTag(*args, **kwargs)
class loopy.kernel.array.VectorArrayDimTag(*args, **kwargs)
loopy.kernel.array.parse_array_dim_tags(dim_tags, n_axes=None, use_increasing_target_axes=False, dim_names=None)

## Checks

loopy.check.check_for_integer_subscript_indices(kernel)

Checks if every array access is of type int.

loopy.check.check_for_duplicate_insn_ids(knl)

Check if multiple instructions of knl have the same loopy.InstructionBase.id.

loopy.check.check_for_double_use_of_hw_axes(kernel)

Check if any instruction of kernel is within multiple inames tagged with the same hw axis tag.
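The condition being checked can be sketched on plain dictionaries. The data layout here is an assumption for illustration: instructions map to the set of inames they are nested in, and inames map to a string naming their hardware axis tag (such as `"l.0"` or `"g.0"`). The function name is hypothetical and the real check operates on kernel objects.

```python
def find_double_used_hw_axes(insn_inames, iname_tags):
    """Map instruction id -> {hw tag: inames} for tags used more than once."""
    problems = {}
    for insn_id, inames in insn_inames.items():
        inames_by_tag = {}
        for iname in sorted(inames):
            tag = iname_tags.get(iname)
            if tag is None:
                continue  # untagged (sequential) inames cannot clash
            inames_by_tag.setdefault(tag, []).append(iname)
        clashes = {tag: names for tag, names in inames_by_tag.items()
                   if len(names) > 1}
        if clashes:
            problems[insn_id] = clashes
    return problems


insn_inames = {"insn_a": {"i", "j"}, "insn_b": {"i", "k"}}
iname_tags = {"i": "l.0", "j": "l.0", "k": "g.0"}
problems = find_double_used_hw_axes(insn_inames, iname_tags)
```

Here `insn_a` is flagged because both of its inames carry the same hardware axis tag, while `insn_b` is fine.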

loopy.check.check_insn_attributes(kernel)

Check for legality of attributes of every instruction in kernel.

loopy.check.check_loop_priority_inames_known(kernel)

Checks if the inames in loopy.LoopKernel.loop_priority are part of the kernel’s domain.

loopy.check.check_multiple_tags_allowed(kernel)

Checks if multiple tags of an iname are compatible.

loopy.check.check_for_inactive_iname_access(kernel)

Check if any instruction accesses an iname but is not within it.

loopy.check.check_for_unused_inames(kernel)

Check if there are any unused inames in the kernel.

loopy.check.check_for_write_races(kernel)

Check if any memory accesses lead to write races.

loopy.check.check_for_data_dependent_parallel_bounds(kernel)

Check that inames tagged as hw axes have bounds that are known at kernel launch.

loopy.check.check_bounds(kernel)

Performs out-of-bound check for every array access.

loopy.check.check_variable_access_ordered(kernel)

Checks that between each write to a variable and all other accesses to the variable there is either:

## Schedule

class loopy.schedule.ScheduleItem(*args, **kwargs)
class loopy.schedule.MinRecursionLimitForScheduling(kernel)