Distributed Computation
High-level Point FMM Interface
To perform a distributed point FMM, first construct a
boxtree.distributed.DistributedFMMRunner object. The constructor distributes
the necessary information from the root rank to all worker ranks. Then
boxtree.distributed.DistributedFMMRunner.drive_dfmm() can be used to launch
the FMM evaluation.
- class boxtree.distributed.DistributedFMMRunner(array_context: ~arraycontext.context.ArrayContext, global_tree, traversal_builder, wrangler_factory, calibration_params=None, comm=<mpi4py.MPI.Intracomm object>)[source]
Helper class for setting up and running distributed point FMM.
- __init__(array_context: ~arraycontext.context.ArrayContext, global_tree, traversal_builder, wrangler_factory, calibration_params=None, comm=<mpi4py.MPI.Intracomm object>)[source]
Construct a DistributedFMMRunner object.
- Parameters:
global_tree – a boxtree.Tree object. This tree may live in host or device
memory, depending on the wrangler. This argument is only significant on the
root rank.
traversal_builder – an object which, when called, takes a
pyopencl.CommandQueue object and a boxtree.Tree object, and generates a
boxtree.traversal.FMMTraversalInfo object from the tree using the command
queue.
wrangler_factory – an object which, when called, takes the local traversal
and the global traversal objects and returns a
boxtree.fmm.ExpansionWranglerInterface object.
calibration_params – calibration parameters for the cost model, if supplied.
The cost model is used to estimate the execution time of each box, which in
turn is used to improve load balancing.
comm – MPI communicator.
- drive_dfmm(actx: ArrayContext, source_weights)[source]
Calculate potentials at target points.
Distributed Algorithm Overview
1. Construct the global tree and traversal lists on the root rank and broadcast them to all worker ranks.
2. Partition boxes into disjoint sets, where the number of sets is the number of MPI ranks. (See Partition Boxes.)
3. Each rank constructs its local tree and traversal lists independently, according to the partition. (See Construct Local Tree and Traversal.)
4. Distribute source weights from the root rank to all worker ranks. (See Distributed Wrangler.)
5. Each rank independently forms multipole expansions from the leaf nodes of its local tree and propagates the partial multipole expansions upwards.
6. Communicate multipole expansions so that each rank has all the multipole expansions it needs.
7. Each rank independently forms local expansions, propagates the local expansions downwards, and evaluates potentials at the target points in its partition. The calculated potentials are then assembled on the root rank.
For steps 5-7, see Distributed FMM Evaluation.
Note that steps 4-7 may be repeated multiple times with the same tree and traversal objects built in steps 1-3. For example, when iteratively solving a PDE, steps 4-7 are executed for each iteration of the linear solver.
The next sections will cover the interfaces of these steps.
Partition Boxes
- boxtree.distributed.partition.partition_work(cost_per_box, traversal, comm)[source]
This function assigns a set of responsible boxes to each rank.
If a rank is responsible for a box, it will calculate the multipole expansion of the box and evaluate target potentials in the box.
- Parameters:
cost_per_box – the expected running time of each box. This argument is only significant on the root rank.
traversal – the global traversal object containing all particles. This argument is significant on all ranks.
comm – MPI communicator.
- Returns:
A numpy array containing the responsible boxes of the current rank.
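The load-balancing idea can be illustrated with a plain-numpy sketch. The greedy_partition helper below is hypothetical, not the library routine: it assigns each box, heaviest first, to the rank with the smallest accumulated cost.

```python
import numpy as np

def greedy_partition(cost_per_box, num_ranks):
    """Toy cost-balanced partition: assign each box (heaviest first)
    to the rank with the least accumulated cost so far."""
    order = np.argsort(cost_per_box)[::-1]       # heaviest boxes first
    rank_cost = np.zeros(num_ranks)
    owner = np.empty(len(cost_per_box), dtype=np.int32)
    for box in order:
        r = int(np.argmin(rank_cost))            # least-loaded rank
        rank_cost[r] += cost_per_box[box]
        owner[box] = r
    return owner, rank_cost

costs = np.array([5.0, 1.0, 3.0, 2.0, 4.0, 1.0])
owner, rank_cost = greedy_partition(costs, 2)
```

In the actual library, the cost per box comes from the cost model (see calibration_params above), so that expensive boxes are spread across ranks rather than counted uniformly.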
- class boxtree.distributed.partition.BoxMasks(responsible_boxes: Array, ancestor_boxes: Array, point_src_boxes: Array, multipole_src_boxes: Array)[source]
Box masks needed for the distributed calculation. Each of these masks is an array of length
tree.nboxes, whose i-th entry is 1 if box i is set.
- responsible_boxes
Current process will evaluate target potentials and multipole expansions in these boxes. Sources and targets in these boxes are needed.
- ancestor_boxes
Ancestors of the responsible boxes.
- point_src_boxes
Current process needs sources but not targets in these boxes.
- multipole_src_boxes
Current process needs multipole expansions in these boxes.
- boxtree.distributed.partition.get_box_masks(actx, traversal, responsible_boxes_list)[source]
Given the responsible boxes for a rank, this helper function calculates the relevant masks.
- Parameters:
responsible_boxes_list – a numpy array of responsible box indices.
- Returns:
A BoxMasks object of the relevant masks.
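As a rough illustration of what the responsible and ancestor masks contain, here is a plain-numpy toy version. The toy_box_masks helper and its tree data are hypothetical; the actual routine works on device arrays via the array context.

```python
import numpy as np

# Toy tree: box 0 is the root; box_parent_ids[b] is the parent of box b.
# Boxes 1 and 2 are children of the root; boxes 3 and 4 are children of 1.
box_parent_ids = np.array([0, 0, 0, 1, 1])

def toy_box_masks(box_parent_ids, responsible_boxes_list):
    """Mark responsible boxes, then walk parent links to mark ancestors."""
    nboxes = len(box_parent_ids)
    responsible = np.zeros(nboxes, dtype=np.int8)
    responsible[responsible_boxes_list] = 1

    ancestor = np.zeros(nboxes, dtype=np.int8)
    for box in responsible_boxes_list:
        b = box
        while b != 0:                  # walk up until the root is reached
            b = box_parent_ids[b]
            if ancestor[b]:
                break                  # everything above is already marked
            ancestor[b] = 1
    return responsible, ancestor

responsible, ancestor = toy_box_masks(box_parent_ids, np.array([3]))
```

With box 3 as the only responsible box, its ancestors (boxes 1 and 0) are marked in the ancestor mask but not in the responsible mask.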
Construct Local Tree and Traversal
- class boxtree.distributed.local_tree.LocalTree(root_extent: Array, box_centers: Array, box_parent_ids: Array, box_child_ids: Array, box_levels: Array, box_flags: Array | None, level_start_box_nrs: Array | None, box_id_dtype: dtype, box_level_dtype: dtype, coord_dtype: dtype, sources_have_extent: bool, targets_have_extent: bool, extent_norm: ExtentNorm, stick_out_factor: float, _is_pruned: bool, sources_are_targets: bool, particle_id_dtype: dtype, sources: Array, source_radii: Array, targets: Array, target_radii: Array, bounding_box: tuple[Array, Array], user_source_ids: Array, sorted_target_ids: Array, box_source_starts: Array, box_source_counts_nonchild: Array, box_source_counts_cumul: Array, box_target_starts: Array, box_target_counts_nonchild: Array, box_target_counts_cumul: Array, box_source_bounding_box_min: Array, box_source_bounding_box_max: Array, box_target_bounding_box_min: Array, box_target_bounding_box_max: Array, box_to_user_rank_starts: Array, box_to_user_rank_lists: Array, responsible_boxes_list: Array, responsible_boxes_mask: Array, ancestor_mask: Array)[source]
Inherits from boxtree.Tree.
- box_to_user_rank_starts
box_id_t [nboxes + 1]
- box_to_user_rank_lists
int32 [*]
A CSR-like interaction list storage array, together with
box_to_user_rank_starts. For each box, the list of ranks which own targets that use the multipole expansion at this box, via either List 3 or (possibly downward propagated from an ancestor) List 2.
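The CSR convention can be illustrated with hypothetical data. The arrays and the users_of_box helper below are made up for illustration; only the indexing scheme matches the documented attributes.

```python
import numpy as np

# Hypothetical CSR pair for a 3-box tree: the ranks that use box b's
# multipole expansion live in lists[starts[b] : starts[b + 1]].
box_to_user_rank_starts = np.array([0, 2, 2, 4])             # nboxes + 1 entries
box_to_user_rank_lists = np.array([1, 3, 0, 2], dtype=np.int32)

def users_of_box(b):
    """Return the ranks whose targets use box b's multipole expansion."""
    start = box_to_user_rank_starts[b]
    end = box_to_user_rank_starts[b + 1]
    return box_to_user_rank_lists[start:end]
```

Here box 0's expansion is used by ranks 1 and 3, box 1's by no remote rank (an empty segment), and box 2's by ranks 0 and 2.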
- boxtree.distributed.local_tree.generate_local_tree(actx: ArrayContext, global_traversal, responsible_boxes_list, comm)[source]
Generate the local tree for the current rank.
This is an MPI-collective routine on comm.
- Parameters:
global_traversal – global boxtree.traversal.FMMTraversalInfo object in host memory.
responsible_boxes_list – a numpy.ndarray object containing the responsible boxes of the current rank.
- Returns:
A tuple (local_tree, src_idx, tgt_idx), where local_tree is a
boxtree.distributed.local_tree.LocalTree object for the generated local tree,
src_idx is the indices of the local sources in the global tree, and tgt_idx
is the indices of the local targets in the global tree. src_idx and tgt_idx
are needed for distributing source weights from the root rank and for
assembling calculated potentials on the root rank.
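A minimal sketch of how src_idx and tgt_idx could be used for scattering weights and gathering potentials. All data, the index lists, and the stand-in "computation" here are hypothetical, chosen only to show the index bookkeeping.

```python
import numpy as np

nparticles = 5
global_weights = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Hypothetical per-rank index lists into the global particle ordering.
src_idx_all_ranks = [np.array([0, 3]), np.array([1, 2, 4])]
tgt_idx_all_ranks = src_idx_all_ranks          # sources == targets here

# Scatter: each rank receives the weights of its local sources.
local_weights = [global_weights[idx] for idx in src_idx_all_ranks]

# Each rank computes its local potentials (doubling as a stand-in for FMM).
local_potentials = [w * 2.0 for w in local_weights]

# Gather: local potentials are written back into the global ordering.
global_potentials = np.empty(nparticles)
for idx, pot in zip(tgt_idx_all_ranks, local_potentials):
    global_potentials[idx] = pot
```

Because each particle index appears in exactly one rank's list, the scatter and gather are simple fancy-indexing operations in each direction.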
- boxtree.distributed.local_traversal.generate_local_travs(actx, local_tree, traversal_builder, merge_close_lists=False)[source]
Generate local traversal from local tree.
- Parameters:
local_tree – the local tree on which the local traversal object will be constructed.
traversal_builder – a function that takes an arraycontext.ArrayContext and a tree, and returns the traversal object based on the tree.
- Returns:
The generated local traversal object in device memory.
Distributed Wrangler
- class boxtree.distributed.calculation.DistributedExpansionWranglerMixin[source]
Distributed expansion wrangler base class.
This class is meant to aid in adding distributed capabilities to wranglers. All distributed wranglers should inherit from this class.
- comm
- global_traversal
- communicate_mpoles_via_allreduce
- distribute_source_weights(actx: ArrayContext, src_weight_vecs, src_idx_all_ranks)[source]
- gather_potential_results(actx: ArrayContext, potentials, tgt_idx_all_ranks)[source]
- communicate_mpoles(actx: ArrayContext, mpole_exps, return_stats=False)[source]
Based on Algorithm 3: Reduce and Scatter in Lashuk et al. [1].
The main idea is to mimic an allreduce as done on a hypercube network, but to decrease the bandwidth cost by sending only information that is relevant to the rank receiving the message.
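The hypercube communication pattern can be simulated in plain Python without MPI. This is a toy sum-allreduce only: it shows the pairwise exchange structure, not the bandwidth-reduced variant the method actually implements, which filters each message down to the expansions relevant to the receiver.

```python
def hypercube_allreduce(values):
    """Simulate a sum-allreduce on a hypercube of 2**d ranks: in round k,
    rank r exchanges its current partial sum with partner r ^ (1 << k)."""
    nranks = len(values)
    assert nranks & (nranks - 1) == 0, "rank count must be a power of two"
    partial = list(values)
    step = 1
    while step < nranks:
        # Each rank adds the partial sum held by its partner in this round.
        partial = [partial[r] + partial[r ^ step] for r in range(nranks)]
        step <<= 1
    return partial

result = hypercube_allreduce([1.0, 2.0, 3.0, 4.0])
```

After log2(nranks) rounds every rank holds the full sum, which is why the hypercube schedule needs only logarithmically many communication steps.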
Distributed FMM Evaluation
The distributed version of the FMM evaluation shares the same interface as the
shared-memory version. To evaluate the FMM in a distributed manner, pass a
wrangler that inherits from
boxtree.distributed.calculation.DistributedExpansionWranglerMixin
to boxtree.fmm.drive_fmm().