Reference: Documentation for Internal API¶
Targets¶
See also Targets.
-
class
loopy.target.c.
POD
(ast_builder, dtype, name)¶ A simple declarator: The type is given as a
numpy.dtype
and the name is given as a string.
-
class
loopy.target.c.
ScopingBlock
(contents=[])¶ A block that is mandatory for scoping and may not be simplified away by
loopy.codegen.result.merge_codegen_results()
.
-
class
loopy.target.c.codegen.expression.
ExpressionToCExpressionMapper
(codegen_state, fortran_abi=False, type_inf_mapper=None)¶ Mapper that converts a loopy-semantic expression to a C-semantic expression with typecasts, appropriate arithmetic semantic mapping, etc.
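As an illustration of the kind of semantic gap such a mapper has to bridge, consider integer division: Python-style floor division rounds toward negative infinity, while C integer division truncates toward zero. The sketch below is not taken from loopy; the function names are hypothetical, and it is only an assumption that this particular case is among the mappings the class performs.

```python
def c_style_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as C99 '/' does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def floor_div_using_c_ops(a: int, b: int) -> int:
    """Floor division expressed only through truncating division."""
    q = c_style_div(a, b)
    r = a - q * b  # C-style remainder: its sign follows the dividend
    # Truncation and flooring differ only when the remainder is nonzero
    # and has the opposite sign of the divisor.
    if r != 0 and ((r < 0) != (b < 0)):
        q -= 1
    return q
```

For example, `c_style_div(-7, 2)` truncates while `floor_div_using_c_ops(-7, 2)` agrees with Python's `-7 // 2`.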
Symbolic¶
See also Expressions.
-
class
loopy.symbolic.
Literal
(s)¶ A literal to be used during code generation.
Note
Only used in the output of
loopy.target.c.codegen.expression.ExpressionToCExpressionMapper
(and similar mappers). Not for use in Loopy source representation.
-
class
loopy.symbolic.
ArrayLiteral
(children)¶ An array literal.
Note
Only used in the output of
loopy.target.c.codegen.expression.ExpressionToCExpressionMapper
(and similar mappers). Not for use in Loopy source representation.
-
class
loopy.symbolic.
FunctionIdentifier
¶ A base class for symbols representing functions.
-
class
loopy.symbolic.
TypedCSE
(child, prefix=None, dtype=None)¶ A
pymbolic.primitives.CommonSubexpression
annotated with a
numpy.dtype
.
-
class
loopy.symbolic.
TypeCast
(type, child)¶ Only defined for numerical types with semantics matching
numpy.ndarray.astype()
.
-
child
¶ The expression to be cast.
-
-
class
loopy.symbolic.
TaggedVariable
(name, tags)¶ This is an identifier with tags, such as
matrix$one
, where ‘one’ identifies this specific use of the identifier. This mechanism may then be used to address these uses, such as by prefetching only accesses tagged a certain way.
A
frozenset
of subclasses of
pytools.tag.Tag
used to provide metadata on this object. Legacy string tags are converted to
LegacyStringInstructionTag
or, if they used to carry a functional meaning, the tag carrying that same functional meaning (e.g.
UseStreamingStoreTag
).
Inherits from
pymbolic.primitives.Variable
and
pytools.tag.Taggable
.
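The tagged-identifier notation can be sketched as follows. This is a minimal illustration in plain Python, not loopy's parser: the helper names are hypothetical, and plain strings stand in for the tag objects described above.

```python
def split_tagged_identifier(text):
    """Split 'matrix$one' into the name 'matrix' and the tag set {'one'}."""
    name, sep, tag = text.partition("$")
    tags = frozenset([tag]) if sep else frozenset()
    return name, tags


def select_uses_with_tag(identifiers, wanted_tag):
    """Return the names of only those uses that carry *wanted_tag*."""
    selected = []
    for text in identifiers:
        name, tags = split_tagged_identifier(text)
        if wanted_tag in tags:
            selected.append(name)
    return selected


uses = ["matrix$one", "matrix$two", "vector", "vector$one"]
tagged_one = select_uses_with_tag(uses, "one")
```

Here `tagged_one` contains only the uses marked with `one`, which is the kind of selection a transformation could then act on.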
-
class
loopy.symbolic.
Reduction
(operation, inames, expr, allow_simultaneous=False)¶ Represents a reduction operation on
expr
across
inames
.
-
operation
¶ an instance of
loopy.library.reduction.ReductionOperation
-
expr
¶ An expression which may have tuple type. If the expression has tuple type, it must be one of the following:
a
tuple
of
pymbolic.primitives.Expression
, or
a
loopy.symbolic.Reduction
, or
a function call or substitution rule invocation.
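What a reduction of an expression across several inames means can be sketched in plain Python. This is only a reference evaluation under stated assumptions, not loopy's implementation: `evaluate_reduction`, the explicit `neutral` element and the `ranges` dictionary are all hypothetical.

```python
from itertools import product
import operator


def evaluate_reduction(operation, neutral, inames, ranges, expr):
    """Fold *expr* over every combination of values of *inames*."""
    result = neutral
    for values in product(*(ranges[name] for name in inames)):
        # Bind each iname to its current value and evaluate the expression.
        result = operation(result, expr(dict(zip(inames, values))))
    return result


# Sum of i*j for i in 0..2 and j in 0..1.
total = evaluate_reduction(
    operator.add, 0, ("i", "j"),
    {"i": range(3), "j": range(2)},
    lambda env: env["i"] * env["j"],
)
```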
-
-
class
loopy.symbolic.
LinearSubscript
(aggregate, index)¶ Represents a linear index into a multi-dimensional array, completely ignoring any multi-dimensional layout.
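The difference between an ordinary multi-dimensional subscript and a linear one can be sketched with a flat Python list as storage. The names below are hypothetical and the row-major strides are an assumption made for the example.

```python
def element_offset(index, strides):
    """Offset of a multi-dimensional index, given per-axis element strides."""
    return sum(i * s for i, s in zip(index, strides))


# A 2x3 array stored row-major in flat storage.
shape = (2, 3)
strides = (3, 1)
storage = [10, 11, 12, 13, 14, 15]

# Ordinary subscript a[1, 2]: the layout (strides) decides the offset.
via_layout = storage[element_offset((1, 2), strides)]

# Linear subscript: the index addresses the storage directly,
# ignoring shape and strides altogether.
via_linear = storage[5]
```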
-
class
loopy.symbolic.
RuleArgument
(index)¶ Represents a (numbered) argument of a
loopy.SubstitutionRule
. Only used internally in the rule-aware mappers to match subst rules independently of argument names.
-
class
loopy.symbolic.
ExpansionState
(*args, **kwargs)¶ -
kernel
¶
-
instruction
¶
-
stack
¶ a tuple representing the current expansion stack, as a tuple of (name, tag) pairs.
-
arg_context
¶ a dict representing current argument values
-
-
class
loopy.symbolic.
RuleAwareIdentityMapper
(rule_mapping_context)¶ Note: the third argument dragged around by this mapper is the current
ExpansionState
.
Subclasses of this must be careful not to touch identifiers that are in
ExpansionState.arg_context
.
Types¶
DTypes of variables in a loopy.LoopKernel
must be picklable, so in
the codegen pipeline user-provided types are converted to
loopy.types.LoopyType
.
-
class
loopy.types.
LoopyType
¶ Abstract class for dtypes of variables encountered in a
loopy.LoopKernel
.
-
class
loopy.types.
NumpyType
(dtype, target=None)¶ This object works around several issues with pickling
numpy.dtype
objects. It does so by serving as a picklable wrapper around the original dtype. The issues are the following:
numpy.dtype
objects for custom types in
loopy
are usually registered in the target’s dtype registry. This registration may have been lost after unpickling. This container restores it implicitly, as part of unpickling.
There is a numpy bug (https://github.com/numpy/numpy/issues/4317) that prevents unpickled dtypes from hashing properly. This is solved by retrieving the ‘canonical’ type from the dtype registry.
-
class
loopy.types.
AtomicType
¶ Abstract class for dtypes of variables encountered in a
loopy.LoopKernel
on which atomic operations are performed.
-
class
loopy.types.
AtomicNumpyType
(dtype, target=None)¶ A dtype wrapper that indicates that the described type should be capable of atomic operations.
Codegen¶
-
class
loopy.codegen.
ImplementedDataInfo
(target, name, dtype, arg_class, base_name=None, shape=None, strides=None, unvec_shape=None, unvec_strides=None, offset_for_name=None, stride_for_name_and_axis=None, allows_offset=None, is_written=None)¶ -
name
¶ The expanded name of the array. Note that, for example in the case of separate-array-tagged axes, multiple implemented arrays may correspond to one user-facing array.
-
dtype
¶
-
arg_class
¶
-
base_name
¶ The user-facing name of the underlying array. May be None for non-array arguments.
-
shape
¶
-
strides
¶ Strides in multiples of
dtype.itemsize
.
-
unvec_shape
¶
-
unvec_strides
¶ Strides in multiples of
dtype.itemsize
that accounts for
loopy.kernel.array.VectorArrayDimTag
in a scalar manner.
-
offset_for_name
¶
-
stride_for_name_and_axis
¶ A tuple (name, axis) indicating the (implementation-facing) name of the array and axis number for which this argument provides the strides.
-
allows_offset
¶
-
is_written
¶
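Since strides are expressed in multiples of the item size rather than in bytes, a conversion from byte strides can be sketched as below. The helper name is hypothetical and the example values assume an 8-byte element type.

```python
def byte_strides_to_element_strides(byte_strides, itemsize):
    """Convert strides in bytes to strides in multiples of *itemsize*."""
    for stride in byte_strides:
        if stride % itemsize != 0:
            raise ValueError(
                f"stride {stride} is not a multiple of itemsize {itemsize}")
    return tuple(stride // itemsize for stride in byte_strides)


# A 3x4 row-major array of 8-byte elements has byte strides (32, 8).
element_strides = byte_strides_to_element_strides((32, 8), 8)
```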
-
-
class
loopy.codegen.
PreambleInfo
(*args, **kwargs)¶ -
kernel
¶
-
seen_dtypes
¶
-
seen_functions
¶
-
seen_atomic_dtypes
¶
-
codegen_state
¶
-
-
class
loopy.codegen.
SeenFunction
(name, c_name, arg_dtypes, result_dtypes)¶ -
name
¶
-
c_name
¶
-
arg_dtypes
¶ a tuple of arg dtypes
-
result_dtypes
¶ a tuple of result dtypes
-
-
class
loopy.codegen.
CodeGenerationState
(kernel, implemented_data_info, implemented_domain, implemented_predicates, seen_dtypes, seen_functions, seen_atomic_dtypes, var_subst_map, allow_complex, vectorization_info=None, var_name_generator=None, is_generating_device_code=None, gen_program_name=None, schedule_index_end=None, codegen_cachemanager=None)¶ -
kernel
¶
-
implemented_data_info
¶ a list of
ImplementedDataInfo
objects.
-
implemented_domain
¶ The entire implemented domain (as an
islpy.Set
) i.e. all constraints that have been enforced so far.
-
seen_dtypes
¶ set of dtypes that were encountered
-
seen_functions
¶ set of
SeenFunction
instances
-
seen_atomic_dtypes
¶
-
var_subst_map
¶
-
allow_complex
¶
-
vectorization_info
¶ None or an instance of
VectorizationInfo
-
is_generating_device_code
¶
-
gen_program_name
¶ None (indicating that host code is being generated) or the name of the device program currently being generated.
-
schedule_index_end
¶
-
codegen_cache_manager
¶ An instance of
loopy.codegen.tools.CodegenOperationCacheManager
.
-
-
class
loopy.codegen.result.
GeneratedProgram
(*args, **kwargs)¶ -
name
¶
-
is_device_program
¶
-
ast
¶ Once generated, this captures the AST of the overall function definition, including the body.
-
body_ast
¶ Once generated, this captures the AST of the operative function body (including declaration of necessary temporaries), but not the overall function definition.
-
-
class
loopy.codegen.result.
CodeGenerationResult
(*args, **kwargs)¶ -
host_program
¶
-
device_programs
¶ A list of
GeneratedProgram
instances intended to run on the compute device.
-
host_preambles
¶
-
device_preambles
¶
-
host_code
()¶
-
device_code
()¶
-
all_code
()¶
-
implemented_data_info
¶ a list of
loopy.codegen.ImplementedDataInfo
objects. Only added at the very end of code generation.
-
-
loopy.codegen.result.
merge_codegen_results
(codegen_state, elements, collapse=True)¶
-
loopy.codegen.result.
generate_host_or_device_program
(codegen_state, schedule_index)¶
-
class
loopy.codegen.tools.
KernelProxyForCodegenOperationCacheManager
(instructions, schedule, inames)¶ Proxy to
loopy.LoopKernel
to be used by
CodegenOperationCacheManager
.
-
class
loopy.codegen.tools.
CodegenOperationCacheManager
(kernel_proxy)¶ Caches operations arising during the codegen pipeline.
-
kernel_proxy
¶ An instance of
KernelProxyForCodegenOperationCacheManager
.
-
with_kernel
(kernel)¶ Returns a new instance of
CodegenOperationCacheManager
corresponding to kernel if the cached variables in self would be invalid for kernel, else returns self.
-
get_parallel_inames_in_a_callkernel
(callkernel_index)¶ Returns a
frozenset
of parallel inames in a callkernel.
- Parameters
callkernel_index – Index of the
loopy.schedule.CallKernel
in the
CodegenOperationCacheManager.kernel_proxy
’s schedule, whose parallel inames are to be found.
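The "return self unless the cache would be stale" behaviour of with_kernel can be sketched as follows. The class names are hypothetical, and deciding staleness by identity comparison of the instructions, schedule and inames is an assumption of this sketch; the real criterion may differ.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical stand-in for the kernel proxy.
KernelProxySketch = namedtuple(
    "KernelProxySketch", ["instructions", "schedule", "inames"])


class CacheManagerSketch:
    def __init__(self, kernel_proxy):
        self.kernel_proxy = kernel_proxy
        self.cache = {}  # results of expensive queries would live here

    def with_kernel(self, kernel):
        proxy = self.kernel_proxy
        # Assumed validity test: cached results stay usable while the
        # kernel still refers to the very same objects.
        if (proxy.instructions is kernel.instructions
                and proxy.schedule is kernel.schedule
                and proxy.inames is kernel.inames):
            return self
        return CacheManagerSketch(KernelProxySketch(
            kernel.instructions, kernel.schedule, kernel.inames))


kernel = SimpleNamespace(
    instructions=("insn_a",), schedule=("call",), inames=frozenset({"i"}))
manager = CacheManagerSketch(KernelProxySketch(
    kernel.instructions, kernel.schedule, kernel.inames))

rescheduled = SimpleNamespace(
    instructions=kernel.instructions,
    schedule=("call", "barrier"),
    inames=kernel.inames)
```

Calling `manager.with_kernel(kernel)` hands back `manager` itself, while `manager.with_kernel(rescheduled)` produces a fresh manager with an empty cache.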
-
Reduction Operation¶
-
class
loopy.library.reduction.
ReductionOperation
¶ Subclasses of this type have to be hashable, picklable, and equality-comparable.
-
class
loopy.library.reduction.
ScalarReductionOperation
(forced_result_type=None)¶
-
class
loopy.library.reduction.
SumReductionOperation
(forced_result_type=None)¶
-
class
loopy.library.reduction.
ProductReductionOperation
(forced_result_type=None)¶
-
class
loopy.library.reduction.
MaxReductionOperation
(forced_result_type=None)¶
-
class
loopy.library.reduction.
MinReductionOperation
(forced_result_type=None)¶
Iname Tags¶
Returns the subset of tags that match the type tag_type. Raises an exception if the number of tags found is greater than max_num or less than min_num.
- Parameters
tags – An iterable of tags.
tag_type – a subclass of
loopy.kernel.data.IndexTag
.
max_num – the maximum number of tags expected to be found.
min_num – the minimum number of tags expected to be found.
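The filtering behaviour described above can be sketched in a few lines. The function name, the `None` defaults, the `ValueError` exception type and the tag classes are all assumptions made for this illustration; none of them are taken from the text.

```python
class IndexTagSketch:
    """Hypothetical base class for tags."""


class UnrollTagSketch(IndexTagSketch):
    pass


class VectorizeTagSketch(IndexTagSketch):
    pass


def filter_tags_by_type_sketch(tags, tag_type, max_num=None, min_num=None):
    """Return the tags that are instances of *tag_type*, checking the count."""
    found = {tag for tag in tags if isinstance(tag, tag_type)}
    if max_num is not None and len(found) > max_num:
        raise ValueError(f"more than {max_num} tags of the requested type")
    if min_num is not None and len(found) < min_num:
        raise ValueError(f"fewer than {min_num} tags of the requested type")
    return found


unroll = UnrollTagSketch()
vectorize = VectorizeTagSketch()
only_unroll = filter_tags_by_type_sketch(
    [unroll, vectorize], UnrollTagSketch, max_num=1)
```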
-
class
loopy.kernel.data.
IndexTag
(*args, **kwargs)¶
-
class
loopy.kernel.data.
ConcurrentTag
(*args, **kwargs)¶
-
class
loopy.kernel.data.
UniqueTag
(*args, **kwargs)¶
-
class
loopy.kernel.data.
AxisTag
(axis)¶
-
class
loopy.kernel.data.
LocalIndexTag
(axis)¶
-
class
loopy.kernel.data.
GroupIndexTag
(axis)¶
-
class
loopy.kernel.data.
VectorizeTag
(*args, **kwargs)¶
-
class
loopy.kernel.data.
UnrollTag
(*args, **kwargs)¶
-
class
loopy.kernel.data.
Iname
(name, tags=frozenset({}))¶ Records an iname in a
LoopKernel
. See Loop Domain Forest for semantics of inames in
loopy
.
This class records the metadata attached to an iname as instances of
pytools.tag.Tag
. A tag may be a builtin tag like
loopy.kernel.data.IndexTag
or a user-defined custom tag. Custom tags may be attached to inames to be used in targeting later during transformations.
-
tags
¶ An instance of
frozenset
of
pytools.tag.Tag
.
-
Array¶
-
class
loopy.kernel.array.
ArrayDimImplementationTag
(*args, **kwargs)¶
-
class
loopy.kernel.array.
_StrideArrayDimTagBase
(*args, **kwargs)¶ -
target_axis
¶ For objects (such as images) with more than one axis, target_axis sets which of these indices is being targeted by this dimension. Note that there may be multiple dim_tags with the same target_axis; their contributions are combined additively.
Note that “normal” arrays only have one target_axis.
-
layout_nesting_level
¶ For determining the stride of
ComputedStrideArrayDimTag
, this determines the layout nesting level of this axis. This must be a contiguous sequence of unique integers starting at 0 in a single
ArrayBase.dim_tags
. The lowest nesting level varies fastest when viewed in linear memory.
May be None on
FixedStrideArrayDimTag
, in which case no
ComputedStrideArrayDimTag
instances may occur.
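How nesting levels translate into strides can be sketched as below: the axis with level 0 gets stride 1, and each further level gets the product of the lengths of all faster-varying axes. The function name is hypothetical, and the sketch deliberately ignores padding and target axes.

```python
def strides_from_nesting_levels(shape, nesting_levels):
    """Compute element strides; the lowest nesting level varies fastest."""
    if sorted(nesting_levels) != list(range(len(shape))):
        raise ValueError(
            "nesting levels must be unique, contiguous and start at 0")
    strides = [None] * len(shape)
    stride = 1
    for level in range(len(shape)):
        axis = nesting_levels.index(level)
        strides[axis] = stride
        stride *= shape[axis]
    return tuple(strides)


# Last axis fastest (row-major) versus first axis fastest (column-major).
row_major = strides_from_nesting_levels((3, 4), (1, 0))
column_major = strides_from_nesting_levels((3, 4), (0, 1))
```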
-
-
class
loopy.kernel.array.
FixedStrideArrayDimTag
(stride, target_axis=0, layout_nesting_level=None)¶ An arg dimension implementation tag for a fixed (potentially symbolic) stride.
-
stride
¶ May be one of the following:
A
pymbolic.primitives.Expression
, including an integer, indicating the stride in units of the underlying array’s
ArrayBase.dtype
.
loopy.auto
, indicating that a new kernel argument for this stride should automatically be created.
The stride is given in units of
ArrayBase.dtype
.
-
-
class
loopy.kernel.array.
ComputedStrideArrayDimTag
(layout_nesting_level, pad_to=None, target_axis=0)¶ -
pad_to
¶ ArrayBase.dtype
granularity to which to pad this dimension
This type of stride arg dim gets converted to
FixedStrideArrayDimTag
on input to
ArrayBase
subclasses.
-
-
class
loopy.kernel.array.
SeparateArrayArrayDimTag
(*args, **kwargs)¶
-
class
loopy.kernel.array.
VectorArrayDimTag
(*args, **kwargs)¶
Checks¶
-
loopy.check.
check_for_integer_subscript_indices
(kernel)¶ Checks whether every array subscript index is of type
int
.
-
loopy.check.
check_for_duplicate_insn_ids
(knl)¶ Check if multiple instructions of knl have the same
loopy.InstructionBase.id
.
-
loopy.check.
check_for_double_use_of_hw_axes
(kernel)¶ Check if any instruction of kernel is within multiple inames tagged with the same hw axis tag.
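The condition this check looks for can be sketched on plain dictionaries. The data layout (instruction id to enclosing inames, iname to tags), the function name, and the string spelling of hardware axis tags such as `l.0` and `g.0` are assumptions of this illustration, not the check's actual inputs.

```python
HW_AXIS_PREFIXES = ("l.", "g.")  # assumed spelling of local/group axis tags


def find_double_use_of_hw_axes(within_inames, iname_tags):
    """Map each offending instruction id to the hw axis tag it uses twice."""
    problems = {}
    for insn_id, inames in within_inames.items():
        seen = set()
        for iname in sorted(inames):
            for tag in iname_tags.get(iname, ()):
                if not tag.startswith(HW_AXIS_PREFIXES):
                    continue  # not a hardware axis tag
                if tag in seen:
                    problems[insn_id] = tag
                seen.add(tag)
    return problems


within_inames = {"insn_a": {"i", "j"}, "insn_b": {"i", "k"}}
iname_tags = {"i": {"l.0"}, "j": {"l.0"}, "k": {"g.0"}}
offenders = find_double_use_of_hw_axes(within_inames, iname_tags)
```

Only `insn_a` is reported, because both of its enclosing inames carry the same hardware axis tag.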
-
loopy.check.
check_insn_attributes
(kernel)¶ Check for legality of attributes of every instruction in kernel.
-
loopy.check.
check_loop_priority_inames_known
(kernel)¶ Checks if the inames in
loopy.LoopKernel.loop_priority
are part of the kernel’s domain.
Checks if multiple tags of an iname are compatible.
-
loopy.check.
check_for_inactive_iname_access
(kernel)¶ Check if any instruction accesses an iname but is not within it.
-
loopy.check.
check_for_unused_inames
(kernel)¶ Check if there are any unused inames in the kernel.
-
loopy.check.
check_for_write_races
(kernel)¶ Check if any memory accesses lead to write races.
-
loopy.check.
check_for_data_dependent_parallel_bounds
(kernel)¶ Check that inames tagged as hw axes have bounds that are known at kernel launch.
-
loopy.check.
check_bounds
(kernel)¶ Performs out-of-bound check for every array access.
-
loopy.check.
check_variable_access_ordered
(kernel)¶ Checks that between each write to a variable and all other accesses to the variable there is either:
a direct/indirect dependency edge, or
an explicit statement that no ordering is necessary (expressed through a bi-directional
loopy.InstructionBase.no_sync_with
)
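The two conditions above can be sketched as a reachability test on a dependency graph plus a lookup of mutual no-sync declarations. The dictionaries and function names are hypothetical; this only illustrates the rule and is not the check itself.

```python
def reaches(start, goal, depends_on):
    """True if *start* depends on *goal* directly or indirectly."""
    stack, visited = [start], set()
    while stack:
        node = stack.pop()
        for dep in depends_on.get(node, ()):
            if dep == goal:
                return True
            if dep not in visited:
                visited.add(dep)
                stack.append(dep)
    return False


def access_is_ordered(writer, other, depends_on, no_sync_with):
    """Apply the rule: a dependency path, or a bi-directional no-sync."""
    if reaches(writer, other, depends_on) or reaches(other, writer, depends_on):
        return True
    return (other in no_sync_with.get(writer, ())
            and writer in no_sync_with.get(other, ()))


# "c" depends on "b", which depends on "a"; "d" is unrelated.
depends_on = {"c": {"b"}, "b": {"a"}}
```

With these inputs, `a` and `c` are ordered through the indirect edge, whereas `a` and `d` are ordered only if each lists the other in its no-sync set.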
Schedule¶
-
class
loopy.schedule.
ScheduleItem
(*args, **kwargs)¶
-
class
loopy.schedule.
BeginBlockItem
(*args, **kwargs)¶
-
class
loopy.schedule.
EndBlockItem
(*args, **kwargs)¶
-
class
loopy.schedule.
CallKernel
(*args, **kwargs)¶
-
class
loopy.schedule.
Barrier
(*args, **kwargs)¶ -
comment
¶ A plain-text comment explaining why the barrier was inserted.
-
synchronization_kind
¶ "local"
or "global"
-
mem_kind
¶ "local"
or "global"
-
originating_insn_id
¶
-
-
class
loopy.schedule.
RunInstruction
(*args, **kwargs)¶
-
class
loopy.schedule.
MinRecursionLimitForScheduling
(kernel)¶