Reference: Documentation for Internal API#
Targets#
See also Targets.
- class loopy.target.c.POD(ast_builder, dtype, name)[source]#
A simple declarator: The type is given as a
numpy.dtype
and the name is given as a string.
- class loopy.target.c.ScopingBlock(contents=[])[source]#
A block that is mandatory for scoping and may not be simplified away by
loopy.codegen.result.merge_codegen_results()
.
Symbolic#
See also Expressions.
Loopy-specific expression types#
- class loopy.symbolic.Literal(s)[source]#
A literal to be used during code generation.
Note
Only used in the output of
loopy.target.c.codegen.expression.ExpressionToCExpressionMapper
(and similar mappers). Not for use in Loopy source representation.
- class loopy.symbolic.ArrayLiteral(children)[source]#
An array literal.
Note
Only used in the output of
loopy.target.c.codegen.expression.ExpressionToCExpressionMapper
(and similar mappers). Not for use in Loopy source representation.
- class loopy.symbolic.TypedCSE(child, prefix=None, dtype=None)[source]#
A
pymbolic.primitives.CommonSubexpression
annotated with a numpy.dtype
.
- class loopy.symbolic.TypeCast(type, child)[source]#
Only defined for numerical types with semantics matching
numpy.ndarray.astype()
.
- child#
The expression to be cast.
- class loopy.symbolic.TaggedVariable(name, tags)[source]#
This is an identifier with tags, such as
matrix$one
, where ‘one’ identifies this specific use of the identifier. This mechanism may then be used to address these uses, for example by prefetching only accesses tagged a certain way.
- tags#
A
frozenset
of subclasses of pytools.tag.Tag
used to provide metadata on this object. Legacy string tags are converted to LegacyStringInstructionTag
or, if they used to carry a functional meaning, the tag carrying that same functional meaning (e.g. UseStreamingStoreTag
).
Inherits from
pymbolic.primitives.Variable
and pytools.tag.Taggable
.
- class loopy.symbolic.Reduction(operation, inames, expr, allow_simultaneous=False)[source]#
Represents a reduction operation on
expr
across inames
.
- operation#
An instance of loopy.library.reduction.ReductionOperation.
- expr#
An expression which may have tuple type. If the expression has tuple type, it must be one of the following:
a tuple of pymbolic.primitives.Expression, or
a loopy.symbolic.Reduction, or
a function call or substitution rule invocation.
- class loopy.symbolic.LinearSubscript(aggregate, index)[source]#
Represents a linear index into a multi-dimensional array, completely ignoring any multi-dimensional layout.
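To make the idea of a linear index concrete, here is a minimal plain-Python sketch. It does not use loopy; the row-major layout and the helper names `multi_dim_access` and `linear_access` are assumptions made only for this illustration.

```python
# Plain-Python illustration, not loopy API.
# A 3x4 array stored row-major in a flat buffer (layout assumed for the example).
shape = (3, 4)
flat = list(range(shape[0] * shape[1]))


def multi_dim_access(buf, i, j):
    # Ordinary subscript: the layout (row-major here) determines the offset.
    return buf[i * shape[1] + j]


def linear_access(buf, k):
    # Linear subscript: k addresses the underlying storage directly,
    # without consulting the shape or the layout.
    return buf[k]


# Both accesses reach the same element of the flat storage.
same_element = multi_dim_access(flat, 1, 2) == linear_access(flat, 6)
```

The point of the sketch is only that the linear form bypasses the index-to-offset computation that a multi-dimensional subscript would perform.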
- class loopy.symbolic.RuleArgument(index)[source]#
Represents a (numbered) argument of a
loopy.SubstitutionRule
. Only used internally in the rule-aware mappers to match subst rules independently of argument names.
- class loopy.symbolic.ExpansionState(*args, **kwargs)[source]#
- kernel#
- instruction#
- stack#
a tuple representing the current expansion stack, as a tuple of (name, tag) pairs.
- arg_context#
a dict representing current argument values
- class loopy.symbolic.RuleAwareIdentityMapper(rule_mapping_context)[source]#
Note: the third argument dragged around by this mapper is the current
ExpansionState
.
Subclasses of this must be careful to not touch identifiers that are in
ExpansionState.arg_context
.
- class loopy.symbolic.ResolvedFunction(function)[source]#
A function identifier whose definition is known in a
loopy
program. A function is said to be known in a TranslationUnit
if its name maps to an InKernelCallable
in loopy.TranslationUnit.callables_table
. Refer to Function Interface.
- function#
An instance of
pymbolic.primitives.Variable
or loopy.library.reduction.ReductionOpFunction
.
- class loopy.symbolic.SubArrayRef(swept_inames, subscript)[source]#
An algebraic expression to map an affine memory layout pattern (known as sub-array) as consecutive elements of the sweeping axes which are defined using
SubArrayRef.swept_inames
.
- swept_inames#
An instance of
tuple
denoting the axes onto which the sub-array is to be mapped.
- subscript#
An instance of
pymbolic.primitives.Subscript
denoting the array in the kernel.
Expression Manipulation Helpers#
- loopy.symbolic.simplify_using_aff(kernel, expr)[source]#
Simplifies expr on kernel’s domain.
- Parameters
expr – An instance of
pymbolic.primitives.Expression
.
Types#
DTypes of variables in a loopy.LoopKernel
must be picklable, so in
the codegen pipeline user-provided types are converted to
loopy.types.LoopyType
.
- class loopy.types.LoopyType[source]#
Abstract class for dtypes of variables encountered in a
loopy.LoopKernel
.
- class loopy.types.AtomicType[source]#
Abstract class for dtypes of variables encountered in a
loopy.LoopKernel
on which atomic operations are performed.
Codegen#
- class loopy.codegen.ImplementedDataInfo(target, name, dtype, arg_class, base_name=None, shape=None, strides=None, unvec_shape=None, unvec_strides=None, offset_for_name=None, stride_for_name_and_axis=None, allows_offset=None, is_written=None)[source]#
- name#
The expanded name of the array. Note that, for example in the case of separate-array-tagged axes, multiple implemented arrays may correspond to one user-facing array.
- dtype#
- arg_class#
- base_name#
The user-facing name of the underlying array. May be None for non-array arguments.
- shape#
- strides#
Strides in multiples of
dtype.itemsize
.
- unvec_shape#
- unvec_strides#
Strides in multiples of
dtype.itemsize
that accounts for loopy.kernel.array.VectorArrayDimTag
in a scalar manner.
- offset_for_name#
- stride_for_name_and_axis#
A tuple (name, axis) indicating the (implementation-facing) name of the array and axis number for which this argument provides the strides.
- allows_offset#
- is_written#
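The strides and unvec_strides attributes above are expressed in multiples of dtype.itemsize, that is, in elements rather than bytes. The following plain-Python sketch shows the conversion from byte strides to element strides. It does not use loopy; the helper name `element_strides` and the choice of a C double as the element type are assumptions for the example.

```python
import struct

# Size in bytes of one element; a C double is assumed for this illustration.
itemsize = struct.calcsize("d")


def element_strides(byte_strides, itemsize):
    """Convert byte strides into strides counted in array elements."""
    for stride in byte_strides:
        if stride % itemsize != 0:
            raise ValueError("byte stride is not a multiple of the item size")
    return tuple(stride // itemsize for stride in byte_strides)


# A row-major 3x4 array of doubles: one row is 4 elements long.
byte_strides = (4 * itemsize, itemsize)
strides_in_elements = element_strides(byte_strides, itemsize)
```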
- class loopy.codegen.PreambleInfo(*args, **kwargs)[source]#
- kernel#
- seen_dtypes#
- seen_functions#
- seen_atomic_dtypes#
- codegen_state#
- class loopy.codegen.SeenFunction(name, c_name, arg_dtypes, result_dtypes)[source]#
This is used to track functions that emerge late during code generation, e.g. C functions to realize arithmetic. No connection with
InKernelCallable
.
- name#
- c_name#
- arg_dtypes#
a tuple of arg dtypes
- result_dtypes#
a tuple of result dtypes
- class loopy.codegen.CodeGenerationState(kernel, target, implemented_data_info, implemented_domain, implemented_predicates, seen_dtypes, seen_functions, seen_atomic_dtypes, var_subst_map, allow_complex, callables_table, is_entrypoint, vectorization_info=None, var_name_generator=None, is_generating_device_code=None, gen_program_name=None, schedule_index_end=None, codegen_cachemanager=None)[source]#
- kernel#
- target#
- implemented_data_info#
a list of
ImplementedDataInfo
objects.
- implemented_domain#
The entire implemented domain (as an
islpy.Set
) i.e. all constraints that have been enforced so far.
- seen_dtypes#
set of dtypes that were encountered
- seen_functions#
set of
SeenFunction
instances
- seen_atomic_dtypes#
- var_subst_map#
- allow_complex#
- vectorization_info#
None or an instance of
VectorizationInfo
- is_generating_device_code#
- gen_program_name#
None (indicating that host code is being generated) or the name of the device program currently being generated.
- schedule_index_end#
- callables_table#
A mapping from callable names to instances of
loopy.kernel.function_interface.InKernelCallable
.
- codegen_cache_manager#
An instance of
loopy.codegen.tools.CodegenOperationCacheManager
.
- class loopy.codegen.TranslationUnitCodeGenerationResult(*args, **kwargs)[source]#
- host_program#
A mapping from names of entrypoints to their host
GeneratedProgram
.
- device_programs#
A list of
GeneratedProgram
instances intended to run on the compute device.
- host_preambles#
- device_preambles#
- implemented_data_infos#
A mapping from names of entrypoints to their list of
ImplementedDataInfo
objects.
- class loopy.codegen.result.GeneratedProgram(*args, **kwargs)[source]#
- name#
- is_device_program#
- ast#
Once generated, this captures the AST of the overall function definition, including the body.
- body_ast#
Once generated, this captures the AST of the operative function body (including declaration of necessary temporaries), but not the overall function definition.
- class loopy.codegen.result.CodeGenerationResult(*args, **kwargs)[source]#
- host_program#
- device_programs#
A list of
GeneratedProgram
instances intended to run on the compute device.
- host_preambles#
- device_preambles#
- implemented_data_info#
a list of
loopy.codegen.ImplementedDataInfo
objects. Only added at the very end of code generation.
- class loopy.codegen.tools.KernelProxyForCodegenOperationCacheManager(instructions: List[loopy.kernel.instruction.InstructionBase], linearization: List[loopy.schedule.ScheduleItem], inames: Dict[str, loopy.kernel.data.Iname])[source]#
Proxy to
loopy.LoopKernel
to be used by CodegenOperationCacheManager
.
- class loopy.codegen.tools.CodegenOperationCacheManager(kernel_proxy)[source]#
Caches operations arising during the codegen pipeline.
- kernel_proxy#
An instance of
KernelProxyForCodegenOperationCacheManager
.
- with_kernel(kernel)[source]#
Returns a new instance of
CodegenOperationCacheManager
corresponding to kernel if the cached variables in self would be invalid for kernel, else returns self.
- get_parallel_inames_in_a_callkernel(callkernel_index)[source]#
Returns a
frozenset
of parallel inames in a callkernel.
- Parameters
callkernel_index – Index of the
loopy.schedule.CallKernel
in the CodegenOperationCacheManager.kernel_proxy
’s schedule, whose parallel inames are to be found.
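The reuse-or-rebuild behavior described for with_kernel can be sketched in plain Python as follows. This is not loopy's implementation: the classes `ProxySketch` and `CacheManagerSketch`, and the rule that the cache is valid exactly when instructions, linearization and inames compare equal, are assumptions made for illustration.

```python
from types import SimpleNamespace


class ProxySketch:
    """Hypothetical stand-in holding the kernel fields the cache depends on."""

    def __init__(self, instructions, linearization, inames):
        self.instructions = instructions
        self.linearization = linearization
        self.inames = inames


class CacheManagerSketch:
    """Hypothetical cache manager that is reused while its proxy stays valid."""

    def __init__(self, kernel_proxy):
        self.kernel_proxy = kernel_proxy

    def with_kernel(self, kernel):
        proxy = self.kernel_proxy
        if (proxy.instructions == kernel.instructions
                and proxy.linearization == kernel.linearization
                and proxy.inames == kernel.inames):
            # Cached results would still be valid: keep using this manager.
            return self
        # Otherwise start over with a fresh manager for the new kernel.
        return CacheManagerSketch(
            ProxySketch(kernel.instructions, kernel.linearization, kernel.inames))


# Stand-in "kernels" with just the three fields used above.
kernel_a = SimpleNamespace(instructions=["insn0"], linearization=[], inames={"i": None})
kernel_b = SimpleNamespace(instructions=["insn0", "insn1"], linearization=[], inames={"i": None})

manager = CacheManagerSketch(
    ProxySketch(kernel_a.instructions, kernel_a.linearization, kernel_a.inames))
```

Calling `manager.with_kernel(kernel_a)` returns `manager` itself, while `manager.with_kernel(kernel_b)` returns a new object, since the instruction list differs.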
Reduction Operation#
Iname Tags#
- loopy.kernel.data.filter_iname_tags_by_type(tags, tag_type, max_num=None, min_num=None)[source]#
Return the subset of tags that matches type tag_type. Raises an exception if the number of tags found is greater than max_num or less than min_num.
- Parameters
tags – An iterable of tags.
tag_type – a subclass of
loopy.kernel.data.InameImplementationTag
.
max_num – the maximum number of tags expected to be found.
min_num – the minimum number of tags expected to be found.
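The documented behavior of filter_iname_tags_by_type can be approximated by the plain-Python sketch below. It is not loopy's code: the function name `filter_tags_by_type`, the stand-in tag classes, the use of isinstance for matching, and the choice of ValueError as the exception type are all assumptions for the example.

```python
def filter_tags_by_type(tags, tag_type, max_num=None, min_num=None):
    """Hypothetical re-implementation of the behavior described above."""
    result = frozenset(tag for tag in tags if isinstance(tag, tag_type))
    if max_num is not None and len(result) > max_num:
        raise ValueError(f"found more than {max_num} tags of type {tag_type.__name__}")
    if min_num is not None and len(result) < min_num:
        raise ValueError(f"found fewer than {min_num} tags of type {tag_type.__name__}")
    return result


# Stand-in tag hierarchy, for illustration only.
class TagSketch:
    pass


class ImplementationTagSketch(TagSketch):
    pass


class CustomTagSketch(TagSketch):
    pass


impl_tag = ImplementationTagSketch()
custom_tag = CustomTagSketch()
tags = [impl_tag, custom_tag]

only_impl = filter_tags_by_type(tags, ImplementationTagSketch, max_num=1)
```

Passing max_num=1 is a convenient way to assert that at most one tag of the given type is present while retrieving it.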
- class loopy.kernel.data.Iname(name, tags=frozenset({}))[source]#
Records an iname in a
LoopKernel
. See Loop Domain Forest for semantics of inames in loopy
.
This class records the metadata attached to an iname as instances of pytools.tag.Tag. A tag may be a builtin tag like
loopy.kernel.data.InameImplementationTag
or a user-defined custom tag. Custom tags may be attached to inames to be used in targeting later during transformations.
- tags#
An instance of
frozenset
of pytools.tag.Tag
.
Array#
- class loopy.kernel.array._StrideArrayDimTagBase(*args, **kwargs)[source]#
- target_axis#
For objects (such as images) with more than one axis, target_axis sets which of these indices is being targeted by this dimension. Note that there may be multiple dim_tags with the same target_axis, their contributions are combined additively.
Note that “normal” arrays only have one target_axis.
- layout_nesting_level#
For determining the stride of
ComputedStrideArrayDimTag
, this determines the layout nesting level of this axis. This must be a contiguous sequence of unique integers starting at 0 in a single ArrayBase.dim_tags
. The lowest nesting level varies fastest when viewed in linear memory.
May be None on
FixedStrideArrayDimTag
, in which case no ComputedStrideArrayDimTag
instances may occur.
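How nesting levels translate into strides can be sketched in plain Python as below. This is an illustration of the rule "the lowest nesting level varies fastest", not loopy's implementation; the function name `strides_from_nesting_levels` is hypothetical, and padding (pad_to) and target_axis are deliberately ignored.

```python
def strides_from_nesting_levels(shape, nesting_levels):
    """Return per-axis strides, in array elements, for the given nesting levels."""
    if sorted(nesting_levels) != list(range(len(shape))):
        raise ValueError("nesting levels must be the unique integers 0..n-1")

    strides = [None] * len(shape)
    stride = 1
    # Visit axes from the fastest-varying (level 0) to the slowest-varying one.
    for axis in sorted(range(len(shape)), key=lambda ax: nesting_levels[ax]):
        strides[axis] = stride
        stride *= shape[axis]
    return tuple(strides)


# Last axis has level 0, so it varies fastest: row-major ("C") order.
row_major = strides_from_nesting_levels((3, 4), (1, 0))
# First axis has level 0, so it varies fastest: column-major ("Fortran") order.
column_major = strides_from_nesting_levels((3, 4), (0, 1))
```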
- class loopy.kernel.array.FixedStrideArrayDimTag(stride, target_axis=0, layout_nesting_level=None)[source]#
An arg dimension implementation tag for a fixed (potentially symbolic) stride.
- stride#
May be one of the following:
A
pymbolic.primitives.Expression
, including an integer, indicating the stride in units of the underlying array’s ArrayBase.dtype
.
loopy.auto
, indicating that a new kernel argument for this stride should automatically be created.
The stride is given in units of
ArrayBase.dtype
.
- class loopy.kernel.array.ComputedStrideArrayDimTag(layout_nesting_level, pad_to=None, target_axis=0)[source]#
- pad_to#
ArrayBase.dtype
granularity to which to pad this dimension
This type of stride arg dim gets converted to
FixedStrideArrayDimTag
on input to ArrayBase
subclasses.
Checks#
- loopy.check.check_for_integer_subscript_indices(t_unit)[source]#
Checks if every array access is of type
int
.
- loopy.check.check_for_duplicate_insn_ids(knl)[source]#
Check if multiple instructions of knl have the same
loopy.InstructionBase.id
.
- loopy.check.check_for_double_use_of_hw_axes(t_unit)[source]#
Check if any instruction of kernel is within multiple inames tagged with the same hw axis tag.
- loopy.check.check_insn_attributes(kernel)[source]#
Check for legality of attributes of every instruction in kernel.
- loopy.check.check_loop_priority_inames_known(kernel)[source]#
Checks if the inames in
loopy.LoopKernel.loop_priority
are part of the kernel’s domain.
- loopy.check.check_multiple_tags_allowed(kernel)[source]#
Checks if multiple tags of an iname are compatible.
- loopy.check.check_for_inactive_iname_access(kernel)[source]#
Check if any instruction accesses an iname but is not within it.
- loopy.check.check_for_unused_inames(kernel)[source]#
Check if there are any unused inames in the kernel.
- loopy.check.check_for_write_races(kernel)[source]#
Check if any memory accesses lead to write races.
- loopy.check.check_for_data_dependent_parallel_bounds(kernel)[source]#
Check that inames tagged as hw axes have bounds that are known at kernel launch.
- loopy.check.check_variable_access_ordered(kernel)[source]#
Checks that between each write to a variable and all other accesses to the variable there is either:
a direct/indirect dependency edge, or
an explicit statement that no ordering is necessary (expressed through a bi-directional
loopy.InstructionBase.no_sync_with
)
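The ordering condition above can be illustrated with a small plain-Python sketch. It is not loopy's checker: the helper names `reachable` and `is_ordered` are hypothetical, dependencies are modeled as a dict from an instruction id to the ids it depends on, and no_sync_with is simplified to a dict of plain instruction ids.

```python
def reachable(deps, start):
    """Return every instruction id reachable from start via dependency edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for dep in deps.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def is_ordered(a, b, deps, no_sync_with):
    """Check whether accesses in instructions a and b are ordered or exempted."""
    # A direct or indirect dependency edge in either direction orders the pair.
    if b in reachable(deps, a) or a in reachable(deps, b):
        return True
    # Otherwise both instructions must declare that no ordering is needed.
    return b in no_sync_with.get(a, ()) and a in no_sync_with.get(b, ())


# "read" depends on "mid", which depends on "write"; "other" is unrelated.
deps = {"read": {"mid"}, "mid": {"write"}}
no_sync = {"write": {"other"}, "other": {"write"}}
```

Here `is_ordered("write", "read", deps, {})` holds through the indirect dependency, `is_ordered("write", "other", deps, {})` does not, and supplying the bi-directional `no_sync` mapping makes the second pair acceptable.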