Reference: Creating Kernels

From Loop Domains and Instructions

loopy.make_kernel(domains, instructions, kernel_data=['...'], **kwargs)

User-facing kernel creation entrypoint.

Parameters:
  • domains – A list of islpy.BasicSet (i.e. convex set) instances representing the Loop Domain Forest. May also be a list of strings which will be parsed into such instances according to ISL syntax.
  • instructions – A list of Assignment (or other InstructionBase subclasses), possibly intermixed with instances of SubstitutionRule. This same list may also contain strings which will be parsed into such objects using the Textual Assignment Syntax and the Textual Syntax for Substitution Rules. May also be a single multi-line string which will be split into lines and then parsed.
  • kernel_data

    A list of ValueArg, ArrayArg, … (etc.) instances. The order of these arguments determines the order of the arguments to the generated kernel.

    May also contain TemporaryVariable instances (which do not give rise to kernel-level arguments).

    The string "..." may be passed as one of the entries of the list, in which case loopy will infer names, shapes, and types of arguments from the kernel code. It is possible to just pass the list ["..."], in which case all arguments are inferred.

    In Python 3, the string "..." may be spelled somewhat more sensibly as just ... (the ellipsis), for the same meaning.

    As an additional option, each argument may be specified as just a name (a string). This is useful to specify argument ordering. All other characteristics of the named arguments are inferred.
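As a minimal sketch of how the three positional arguments fit together, the following builds a kernel that doubles an array. The kernel itself (the names out, a, n and the helper build_kernel) is made up for illustration; the language version (2018, 2) is taken from the version history in this section. The call to loopy is wrapped in a function so that the inputs can be inspected on their own.

```python
# Loop domain in ISL syntax: one iname `i`, bounded by the parameter `n`.
domains = ["{ [i]: 0 <= i < n }"]

# A single multi-line string; it is split into lines and each line is parsed
# using the textual assignment syntax.
instructions = """
out[i] = 2 * a[i]
"""

# Plain names fix the argument order of the generated kernel; the trailing
# "..." asks loopy to infer shapes, types and any remaining arguments.
kernel_data = ["out", "a", "n", "..."]


def build_kernel():
    """Hypothetical helper: create the kernel from the inputs above."""
    # Imported here so the inputs above can be examined without loopy.
    import loopy as lp

    return lp.make_kernel(
        domains,
        instructions,
        kernel_data,
        lang_version=(2018, 2),  # pinned explicitly, see lang_version
    )
```

Listing the names explicitly matters mostly for the calling convention: with only ["..."], the argument order is left to loopy.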

The following keyword arguments are recognized:

Parameters:
  • preambles – a list of (tag, code) tuples that identify preamble snippets. Each tag’s snippet is only included once, at its first occurrence. The preambles will be inserted in order of their tags.
  • preamble_generators – a list of functions of signature (seen_dtypes, seen_functions) where seen_functions is a set of (name, c_name, arg_dtypes), generating extra entries for preambles.
  • default_order – “C” (default) or “F”
  • default_offset – 0 or loopy.auto. The default value of offset in ArrayArg for guessed arguments. Defaults to 0.
  • function_manglers – list of functions of signature (target, name, arg_dtypes) returning a loopy.CallMangleInfo.
  • symbol_manglers – list of functions of signature (name) returning a tuple (result_dtype, c_name), where c_name is the C-level symbol to be evaluated.
  • assumptions – the initial implemented_domain, which captures assumptions on loop domain parameters. May be an isl.Set or a string in ISL syntax. If given as a string, only the CONDITIONS part of the set notation should be given.
  • local_sizes – A dictionary from integers to integers, mapping workgroup axes to their sizes, e.g. {0: 16} forces axis 0 to be length 16.
  • silenced_warnings – a list (or semicolon-separated string) of warnings to silence
  • options – an instance of loopy.Options or an equivalent string representation
  • target – an instance of loopy.TargetBase, or None, to use the default target.
  • seq_dependencies – If True, dependencies that sequentially connect the given instructions will be added. Defaults to False.
  • fixed_parameters – A dictionary of name/value pairs, where name will be fixed to value. name may refer to Domain parameters or Arguments. See also loopy.fix_parameters().
  • lang_version

    The language version against which the kernel was written, a tuple. To ensure future compatibility, copy the current value of loopy.MOST_RECENT_LANGUAGE_VERSION and pass that value.

    (If you just pass loopy.MOST_RECENT_LANGUAGE_VERSION directly, breaking language changes will apply to your kernel without asking, likely breaking your code.)

    If not given, this value defaults to version (2017, 2, 1) and a warning will be issued.

    To set the kernel version for all loopy kernels in a (Python) source file, you may simply say:

    from loopy.version import LOOPY_USE_LANGUAGE_VERSION_2018_2
    

    If lang_version is not explicitly given, that version value will be used.

    See also Loopy Language Versioning.

Changed in version 2017.2.1: lang_version added.

Changed in version 2017.2: fixed_parameters added.

Changed in version 2016.3: seq_dependencies added.
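The rule stated for preambles (each tag contributes one snippet, taken from its first occurrence, and snippets are emitted in tag order) can be modeled in a few lines of plain Python. This is only an illustration of the documented behavior, not loopy's own code; assemble_preambles and the sample tags are hypothetical.

```python
def assemble_preambles(preambles):
    """Keep the first snippet seen for each tag, then join them in tag order."""
    first_by_tag = {}
    for tag, code in preambles:
        # setdefault leaves an existing entry alone, so later duplicates
        # of a tag are ignored.
        first_by_tag.setdefault(tag, code)
    return "\n".join(code for _tag, code in sorted(first_by_tag.items()))


preambles = [
    ("10_math", "#include <math.h>"),
    ("00_defs", "#define PI 3.14159"),
    ("10_math", "#include <math.h>"),  # repeated tag, dropped
]
preamble_text = assemble_preambles(preambles)
```

Choosing tags with a sortable prefix, as in the sample, is one way to control where a snippet ends up relative to the others.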

From Fortran

loopy.parse_fortran(source, filename='<floopy code>', free_form=True, strict=True, seq_dependencies=None, auto_dependencies=None, target=None)
Returns: a list of loopy.LoopKernel objects

loopy.parse_transformed_fortran(source, free_form=True, strict=True, pre_transform_code=None, transform_code_context=None, filename='<floopy code>')
Parameters:
  • source – a string of Fortran source code which must include a snippet of transform code as described below.
  • pre_transform_code – code that is run in the same context as the transform

source may contain snippets of loopy transform code between markers:

!$loopy begin
! ...
!$loopy end

Within the transform code, the following symbols are predefined:

  • lp: a reference to the loopy package
  • np: a reference to the numpy package
  • SOURCE: the source code surrounding the transform block. This may be processed using c_preprocess() and parse_fortran().
  • FILENAME: the file name of the code being processed

The transform code must define RESULT, conventionally a list of kernels, which is returned from this function unmodified.
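To make the role of the markers concrete, here is a rough stand-alone sketch of how a source string could be separated into the surrounding Fortran code and the Python transform code. It assumes a single transform block whose lines start with "!" followed by an optional space; split_transform_block is a hypothetical helper, and loopy's actual handling may differ in detail.

```python
def split_transform_block(source):
    """Return (fortran_source, transform_code) for one !$loopy block."""
    fortran_lines = []
    transform_lines = []
    in_block = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == "!$loopy begin":
            in_block = True
        elif stripped == "!$loopy end":
            in_block = False
        elif in_block:
            # Transform lines are Fortran comments: drop the leading "!"
            # and at most one following space, keeping any indentation.
            body = line.lstrip()
            if body.startswith("!"):
                body = body[1:]
            if body.startswith(" "):
                body = body[1:]
            transform_lines.append(body)
        else:
            fortran_lines.append(line)
    return "\n".join(fortran_lines), "\n".join(transform_lines)


sample = """\
subroutine noop()
end

!$loopy begin
! noop, = lp.parse_fortran(SOURCE, FILENAME)
! RESULT = [noop]
!$loopy end
"""
fortran_source, transform_code = split_transform_block(sample)
```

In this picture, fortran_source plays the role of SOURCE, and transform_code is what gets executed with lp, np, SOURCE and FILENAME predefined.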

An example of source may look as follows:

subroutine fill(out, a, n)
  implicit none

  real*8 a, out(n)
  integer n, i

  do i = 1, n
    out(i) = a
  end do
end

!$loopy begin
!
! fill, = lp.parse_fortran(SOURCE, FILENAME)
! fill = lp.split_iname(fill, "i", split_amount,
!     outer_tag="g.0", inner_tag="l.0")
! RESULT = [fill]
!
!$loopy end

loopy.c_preprocess(source, defines=None, filename=None, include_paths=None)
Parameters:
  • source – a string, possibly containing C preprocessor constructs
  • defines – a list of strings as they might occur after a C-style #define directive, for example deg2rad(x) (x/180d0 * 3.14d0).
Returns: a string

From Other Kernels

loopy.fuse_kernels(kernels, suffixes=None, data_flow=None)

Return a kernel that performs all the operations in all entries of kernels.

Parameters:
  • kernels – A list of loopy.LoopKernel instances to be fused.
  • suffixes – If given, must be a list of strings of a length matching that of kernels. This will be used to disambiguate the names of temporaries, as described below.
  • data_flow – A list of data dependencies [(var_name, from_kernel, to_kernel), ...]. Based on this, the fuser will create dependencies between all writers of var_name in kernels[from_kernel] to readers of var_name in kernels[to_kernel]. from_kernel and to_kernel are indices into kernels.

The components of the kernels are fused as follows:

  • The resulting kernel will have a domain involving all the inames and parameters occurring across kernels. Inames with matching names across kernels are fused in such a way that they remain a single iname in the fused kernel. Use loopy.rename_iname() if this is not desired.
  • The projection of the domains of each pair of kernels onto their common subset of inames must match in order for fusion to succeed.
  • Assumptions are fused by taking their conjunction.
  • If kernel arguments with matching names are encountered across kernels, their declarations must match in order for fusion to succeed.
  • Temporaries are automatically renamed to remain uniquely associated with each instruction stream.
  • The resulting kernel will contain all instructions from each entry of kernels. Clashing instruction IDs will be renamed to ensure uniqueness.

Changed in version 2016.2: data_flow added.
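The meaning of a data_flow entry can be illustrated with a small model in plain Python: for each (var_name, from_kernel, to_kernel) triple, every reader of var_name in the target kernel is made to depend on every writer of var_name in the source kernel. The writers and readers tables and the function data_flow_dependencies are hypothetical stand-ins, not part of loopy.

```python
def data_flow_dependencies(writers, readers, data_flow):
    """Return the set of (reader_id, writer_id) pairs implied by data_flow.

    writers[k][var] and readers[k][var] list the instruction ids in
    kernel k that write or read var, respectively.
    """
    deps = set()
    for var_name, from_kernel, to_kernel in data_flow:
        for writer_id in writers[from_kernel].get(var_name, []):
            for reader_id in readers[to_kernel].get(var_name, []):
                deps.add((reader_id, writer_id))
    return deps


# Kernel 0 writes tmp in instruction "produce";
# kernel 1 reads tmp in instruction "consume".
writers = [{"tmp": ["produce"]}, {"out": ["consume"]}]
readers = [{"a": ["produce"]}, {"tmp": ["consume"]}]
deps = data_flow_dependencies(writers, readers, [("tmp", 0, 1)])
```

Here the single entry ("tmp", 0, 1) yields one dependency, from "consume" on "produce".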

To Copy between Data Formats

loopy.make_copy_kernel(new_dim_tags, old_dim_tags=None)

Returns a LoopKernel that changes the data layout of a variable (called “input”) from the layout specified by old_dim_tags to the new layout specified by new_dim_tags. old_dim_tags defaults to an all-C layout of the same rank as the one given by new_dim_tags.

loopy.VERSION

A tuple representing the current version number of loopy, for example (2017, 2, 1). Direct comparison of these tuples will always yield valid version comparisons.
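Because the version is a tuple of integers, ordinary Python tuple comparison is enough for feature checks. The sketch below uses a literal tuple as a stand-in for loopy.VERSION so that it runs on its own; parse_version is a hypothetical helper.

```python
def parse_version(text):
    """Turn a dotted version string such as "2016.3" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


installed = (2017, 2, 1)  # stand-in for loopy.VERSION

# seq_dependencies is documented as added in version 2016.3.
has_seq_dependencies = installed >= parse_version("2016.3")

# Tuples compare element by element; a shorter tuple that is a prefix
# of a longer one sorts first, so 2017.2 precedes 2017.2.1.
older_first = parse_version("2017.2") < parse_version("2017.2.1")
```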

Loopy Language Versioning

In version 2018.1, loopy introduced a language versioning scheme to make it easier to evolve the language while retaining backward compatibility. This was prompted by the addition of loopy.Options.enforce_variable_access_ordered, which (despite its name) serves to enable a new check that helps ensure that all variable accesses in a kernel are ordered as intended. Since that has the potential to break existing programs, kernels now have to declare support for a given language version to let them take advantage of this check.

As a result, loopy will now issue a warning when a call to loopy.make_kernel() does not declare a language version. Such kernels will (indefinitely) default to language version 2017.2.1. If passing a language version to make_kernel() is impractical, you may also import one of the LOOPY_USE_LANGUAGE_VERSION_... symbols given below using:

from loopy.version import LOOPY_USE_LANGUAGE_VERSION_2018_2

in the global namespace of the function calling make_kernel(). If lang_version in that call is not explicitly given, this value will be used.
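The lookup order described above (an explicit lang_version first, then a LOOPY_USE_LANGUAGE_VERSION_... symbol in the caller's globals, then the legacy default together with a warning) can be pictured with a small stand-alone model. It assumes, purely for illustration, that each such symbol is bound to its version tuple; resolve_lang_version is hypothetical and does not reproduce loopy's implementation.

```python
import warnings

DEFAULT_LANG_VERSION = (2017, 2, 1)
SYMBOL_PREFIX = "LOOPY_USE_LANGUAGE_VERSION_"


def resolve_lang_version(lang_version=None, caller_globals=None):
    """Pick the language version a make_kernel()-like call would use."""
    if lang_version is not None:
        # An explicitly passed version always wins.
        return lang_version
    for name, value in (caller_globals or {}).items():
        if name.startswith(SYMBOL_PREFIX):
            # Assumed: the imported symbol carries the version tuple.
            return value
    warnings.warn(
        "no language version declared, defaulting to 2017.2.1",
        stacklevel=2,
    )
    return DEFAULT_LANG_VERSION


explicit = resolve_lang_version((2018, 1))
from_symbol = resolve_lang_version(
    caller_globals={"LOOPY_USE_LANGUAGE_VERSION_2018_2": (2018, 2)}
)
```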

Language versions will generally reflect the version number of loopy in which they were introduced, though it is likely that most versions of loopy do not introduce language incompatibilities. In such situations, the previous language version number remains. (In fact, we will work hard to avoid backward-incompatible language changes.)

loopy.MOST_RECENT_LANGUAGE_VERSION

A tuple representing the most recent language version number of loopy, for example (2018, 1). Direct comparison of these tuples will always yield valid version comparisons.

History of Language Versions

loopy.LOOPY_USE_LANGUAGE_VERSION_2018_2

loopy.Options.ignore_boostable_into is turned on by default.

loopy.LOOPY_USE_LANGUAGE_VERSION_2018_1

loopy.Options.enforce_variable_access_ordered is turned on by default.

loopy.LOOPY_USE_LANGUAGE_VERSION_2017_2_1

Initial legacy language version.