Ubuntu Manpage: lamssi_collectives - overview of LAM's MPI collective SSI modules

Provided by: lam-runtime_7.1.4-6build2_amd64

NAME

       lamssi_collectives - overview of LAM's MPI collective SSI modules

DESCRIPTION

       The  "kind"  for  collectives  SSI  modules  is  "coll".   Specifically, the string "coll"
       (without the quotes) is the prefix that should be used with the mpirun command  line  with
       the -ssi switch.  For example:

       mpirun -ssi coll_base_crossover 4 C my_mpi_program

       LAM currently has four coll modules:

       lam_basic
           A  full  implementation  of MPI collectives on intracommunicators.  The algorithms are
           the same as were in  the  LAM  6.5  series.   Collectives  on  intercommunicators  are
           undefined, and will result in run-time errors.

       impi
           Collective  functions  for  IMPI communicators.  These are mostly un-implemented; only
           the basics exist: MPI_BARRIER and MPI_REDUCE.

       shmem
           Shared memory collectives.

       smp
           SMP-aware collectives (based on the  MagPIe  algorithms).   The  following  algorithms
           provide   SMP-aware   performance  on  multiprocessors:  MPI_ALLREDUCE,  MPI_ALLTOALL,
           MPI_ALLTOALLV,   MPI_BARRIER,   MPI_BCAST,   MPI_GATHER,   MPI_GATHERV,    MPI_REDUCE,
           MPI_SCATTER,   and   MPI_SCATTERV.    Note  that  the  reduction  algorithms  must  be
           specifically enabled by marking the operations as  associative  before  they  will  be
           used.  All other MPI collectives will fall back to their lam_basic equivalents.

       More collective modules are likely to be implemented in the future.

COLL MODULE PARAMETERS

The parameters below are described in terms of kind and value. Unlike other SSI module
kinds, coll modules are selected on a per-communicator basis, so the kind and value may
also be specified as attributes on a parent communicator.

Selecting a coll module
coll modules are selected on a per-communicator basis. They are selected when the
communicator is created, and remain the active coll module for the life of that
communicator. For example, different coll modules may be assigned to MPI_COMM_WORLD and
MPI_COMM_SELF. In most cases LAM/MPI will select the best coll module automatically. For
example, when a communicator spans multiple nodes and at least one node has multiple MPI
processes, the smp module will automatically be selected.

However, the LAM_MPI_SSI_COLL keyval can be used to set an attribute on a parent
communicator, that is, a communicator from which a new communicator is created. The
attribute value should be the string name of the coll module to use. If that module
cannot be used, an MPI exception will occur. This attribute is only examined on the
parent communicator when a new communicator is created.

coll SSI Parameters
The coll modules accept several parameters:

coll_associative
Because of specific wording in the MPI standard, LAM/MPI effectively cannot assume
that any reduction operator is associative (at least, not without additional
overhead). Hence, LAM/MPI relies on the user to indicate that certain operations are
associative. If the user sets the coll_associative SSI parameter to 1, LAM/MPI may
assume that the reduction operator is associative, and may be able to optimize the
overall reduction operation. If it is 0 or undefined, LAM/MPI will assume that the
reduction operation is not associative, and will use strict linear ordering of
reduction operations (regardless of data locality). This attribute is checked every
time a reduction operator is invoked. The User's Guide contains more information on
this topic.

coll_crossover
This parameter determines the maximum number of processes in a communicator that will
use linear algorithms. This SSI parameter is only checked during MPI_INIT.

coll_reduce_crossover
During reduction operations, it makes sense to use the number of bytes to be
transferred, rather than the number of processes, as the metric for deciding whether
to use linear or logarithmic algorithms. This parameter indicates the maximum number
of bytes to be transferred by each process by a linear algorithm. This SSI parameter
is only checked during MPI_INIT.

Notes on the smp coll Module
The smp coll module is based on the algorithms from the MagPIe project. It is not yet
complete: more algorithms can still be optimized for SMP-aware execution. By the time
LAM/MPI was frozen in preparation for release, only some of the algorithms had been
completed. It is expected that future versions of LAM/MPI will have more SMP-optimized
algorithms.

The User's Guide contains much more detail about the smp module. In particular, the
coll_associative SSI parameter must be 1 for the SMP-aware reduction algorithms to be
used. If it is 0 or undefined, the corresponding lam_basic algorithms will be used. The
coll_associative attribute is checked at every invocation of the reduction algorithms.
