Provided by: lam-runtime_7.1.1a-1_i386

NAME

       LAM SSI collectives - overview of LAM’s MPI collective SSI modules

DESCRIPTION

       The  "kind"  for  collectives SSI modules is "coll".  Specifically, the
       string "coll" (without the quotes) is the prefix that  should  be  used
       with the mpirun command line with the -ssi switch.  For example:

       mpirun -ssi coll_base_crossover 4 C my_mpi_program

       LAM currently has four coll modules:

       lam_basic
           A  full  implementation  of  MPI collectives on intracommunicators.
           The algorithms are  the  same  as  were  in  the  LAM  6.5  series.
           Collectives on intercommunicators are undefined, and will result in
           run-time errors.

       impi
           Collective functions for IMPI communicators.  These are mostly
           unimplemented; only the basics exist: MPI_BARRIER and MPI_REDUCE.

       shmem
           Shared memory collectives.

       smp
           SMP-aware collectives (based on the MagPIe algorithms).  The
           following algorithms provide SMP-aware performance on
           multiprocessors: MPI_ALLREDUCE, MPI_ALLTOALL, MPI_ALLTOALLV,
           MPI_BARRIER, MPI_BCAST, MPI_GATHER, MPI_GATHERV, MPI_REDUCE,
           MPI_SCATTER, and MPI_SCATTERV.  Note that the reduction algorithms
           must be specifically enabled by marking the operations as
           associative before they will be used.  All other MPI collectives
           will fall back to their lam_basic equivalents.

       More collective modules are likely to be implemented in the future.

COLL MODULE PARAMETERS

       The parameters below are described in terms of kind and value.  Unlike
       other SSI module kinds, coll modules are selected on a per-communicator
       basis, so the kind and value may also be specified as attributes on a
       parent communicator.

   Selecting a coll module
       coll  modules  are  selected  on  a  per-communicator  basis.  They are
       selected when the communicator is created, and remain the  active  coll
       module  for the life of that communicator.  For example, different coll
       modules may be assigned to MPI_COMM_WORLD and MPI_COMM_SELF.   In  most
       cases  LAM/MPI  will  select  the  best coll module automatically.  For
       example, when a communicator spans multiple nodes and at least one node
       has  multiple  MPI  processes,  the  smp  module  will automatically be
       selected.
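
       The automatic selection condition described above (a communicator that
       spans multiple nodes, with at least one node hosting more than one MPI
       process) can be sketched as a small predicate.  This is only an
       illustration of the stated rule, not LAM/MPI's actual selection code;
       the function name and the node_of_rank array are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: node_of_rank[i] is an assumed node identifier for
 * rank i in the communicator.  Returns true when the communicator spans
 * more than one node AND at least two ranks share a node.
 */
static bool spans_nodes_with_shared_node(const int *node_of_rank,
                                         size_t nprocs)
{
    assert(node_of_rank != NULL || nprocs == 0);

    bool multiple_nodes = false; /* some rank is on a different node than rank 0 */
    bool shared_node = false;    /* some pair of ranks is on the same node */

    for (size_t i = 1; i < nprocs; i++) {
        if (node_of_rank[i] != node_of_rank[0]) {
            multiple_nodes = true;
        }
        for (size_t j = 0; j < i; j++) {
            if (node_of_rank[i] == node_of_rank[j]) {
                shared_node = true;
            }
        }
    }
    return multiple_nodes && shared_node;
}
```

       Under this sketch, four processes placed two per node satisfy the
       condition, while one process per node, or all processes on a single
       node, do not.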

       However, the LAM_MPI_SSI_COLL keyval can be used to set an attribute on
       a  communicator  that  is  used  to  create  a  new  communicator.  The
       attribute should have the value of the string name of the  coll  module
       to  use.   If  that module cannot be used, an MPI exception will occur.
       This attribute is only examined on the parent communicator when  a  new
       communicator is created.

   coll SSI Parameters
       The coll modules accept several parameters:

       coll_associative
           Because of specific wording in the MPI standard, LAM/MPI
           effectively cannot assume that any reduction operator is
           associative (at least, not without additional overhead).  Hence,
           LAM/MPI relies on the user to indicate that certain operations are
           associative.  If the user sets the coll_associative SSI parameter
           to 1, LAM/MPI may assume that the reduction operator is
           associative, and may be able to optimize the overall reduction
           operation.  If it is 0 or undefined, LAM/MPI will assume that the
           reduction operation is not associative, and will use strict linear
           ordering of reduction operations (regardless of data locality).
           This attribute is checked every time a reduction operator is
           invoked.  The User’s Guide contains more information on this topic.

       coll_crossover
           This parameter determines the maximum  number  of  processes  in  a
           communicator  that  will use linear algorithms.  This SSI parameter
           is only checked during MPI_INIT.

       coll_reduce_crossover
           During reduction operations, it makes sense to use the number of
           bytes to be transferred, rather than the number of processes, as
           the metric for choosing between linear and logarithmic algorithms.
           This parameter indicates the maximum number of bytes to be
           transferred by each process by a linear algorithm.  This SSI
           parameter is only checked during MPI_INIT.
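
       The two crossover parameters can be read as simple thresholds.  The
       sketch below illustrates that reading only; the enum and function names
       are hypothetical, and it assumes that values at or below the threshold
       use the linear algorithm and that reductions consult only the byte
       count.  LAM/MPI's real decision logic may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the two algorithm families named in the text. */
enum coll_algorithm { COLL_LINEAR, COLL_LOGARITHMIC };

/*
 * Assumed reading of coll_crossover: communicators with at most
 * `crossover` processes use linear algorithms.
 */
static enum coll_algorithm choose_by_procs(int nprocs, int crossover)
{
    assert(nprocs > 0);
    return (nprocs <= crossover) ? COLL_LINEAR : COLL_LOGARITHMIC;
}

/*
 * Assumed reading of coll_reduce_crossover: reductions that transfer at
 * most `reduce_crossover` bytes per process use a linear algorithm.
 */
static enum coll_algorithm choose_reduce_by_bytes(size_t bytes_per_proc,
                                                  size_t reduce_crossover)
{
    return (bytes_per_proc <= reduce_crossover) ? COLL_LINEAR
                                                : COLL_LOGARITHMIC;
}
```

       With a crossover of 4, as in the mpirun example near the top of this
       page, this sketch would pick the linear family for a 4-process
       communicator and the logarithmic family for a 5-process one.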

   Notes on the smp coll Module
       The smp coll module is based on the algorithms from the MagPIe project.
       It is not yet complete; more algorithms can still be optimized for
       SMP-aware execution.  When LAM/MPI was frozen in preparation for
       release, only some of the algorithms had been completed.  It is
       expected that future versions of LAM/MPI will have more SMP-optimized
       algorithms.

       The User’s Guide contains much more detail about the  smp  module.   In
       particular,  the  coll_associative SSI parameter must be 1 for the SMP-
       aware reduction algorithms to be used.  If it is 0  or  undefined,  the
       corresponding  lam_basic algorithms will be used.  The coll_associative
       attribute is checked at every invocation of the reduction algorithms.
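
       To see why associativity matters here, the sketch below applies the
       same operator in strict linear order and in a pairwise (tree) order.
       It is a standalone illustration, not LAM/MPI code: the function names
       are hypothetical, and integer subtraction is used as a deliberately
       non-associative stand-in for a user-defined reduction operator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical binary reduction operator type. */
typedef long (*reduce_op)(long, long);

static long op_add(long a, long b) { return a + b; } /* associative */
static long op_sub(long a, long b) { return a - b; } /* not associative */

/* Strict linear ordering: ((v[0] op v[1]) op v[2]) op ... */
static long reduce_linear(const long *v, size_t n, reduce_op op)
{
    assert(n > 0);
    long acc = v[0];
    for (size_t i = 1; i < n; i++) {
        acc = op(acc, v[i]);
    }
    return acc;
}

/* Pairwise ordering: (reduce of first half) op (reduce of second half). */
static long reduce_tree(const long *v, size_t n, reduce_op op)
{
    assert(n > 0);
    if (n == 1) {
        return v[0];
    }
    size_t half = n / 2;
    return op(reduce_tree(v, half, op),
              reduce_tree(v + half, n - half, op));
}
```

       For the values 8, 4, 2, 1, both orderings agree under addition, but
       under subtraction the linear order computes ((8 - 4) - 2) - 1 while the
       pairwise order computes (8 - 4) - (2 - 1).  A reordered reduction is
       therefore only safe when the user has indicated that the operator is
       associative.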

SEE ALSO

       lamssi(7), mpirun(1), LAM User’s Guide