oracular (7) fi_psm.7.gz

Provided by: libfabric-dev_1.17.0-3ubuntu1_amd64 bug

NAME

       fi_psm - The PSM Fabric Provider

OVERVIEW

       The  psm provider runs over the PSM 1.x interface that is currently supported by the Intel
       TrueScale Fabric.  PSM provides tag-matching message queue functions  that  are  optimized
       for  MPI  implementations.   PSM  also  has  limited  Active Message support, which is not
       officially published but is quite stable and well documented in the source code  (part  of
       the  OFED  release).   The  psm  provider makes use of both the tag-matching message queue
       functions and the Active Message functions to support a variety of libfabric data transfer
       APIs, including tagged message queue, message queue, RMA, and atomic operations.

       The  psm provider can work with the psm2-compat library, which exposes a PSM 1.x interface
       over the Intel Omni-Path Fabric.

LIMITATIONS

       The psm provider doesn’t support all the features defined in the libfabric API.  Here  are
       some of the limitations:

       Endpoint types
              Only support non-connection based types FI_DGRAM and FI_RDM

       Endpoint capabilities
              Endpoints  can  support  any  combination  of data transfer capabilities FI_TAGGED,
              FI_MSG, FI_ATOMICS, and FI_RMA.  These  capabilities  can  be  further  refined  by
              FI_SEND,  FI_RECV,  FI_READ, FI_WRITE, FI_REMOTE_READ, and FI_REMOTE_WRITE to limit
              the direction of operations.  The limitation is that  no  two  endpoints  can  have
              overlapping receive or RMA target capabilities in any of the above categories.  For
              example it is fine to have two endpoints with FI_TAGGED  |  FI_SEND,  one  endpoint
              with  FI_TAGGED  |  FI_RECV,  one  endpoint with FI_MSG, one endpoint with FI_RMA |
              FI_ATOMICS.  But it is not allowed to have two endpoints  with  FI_TAGGED,  or  two
              endpoints with FI_RMA.

       FI_MULTI_RECV is supported for non-tagged message queue only.

       Other supported capabilities include FI_TRIGGER.

       Modes  FI_CONTEXT  is required for the FI_TAGGED and FI_MSG capabilities.  That means, any
              request belonging to these two categories that generates a completion must pass  as
              the  operation  context  a  valid  pointer to type struct fi_context, and the space
              referenced by the pointer must remain untouched until the  request  has  completed.
              If none of FI_TAGGED and FI_MSG is asked for, the FI_CONTEXT mode is not required.

       Progress
              The  psm  provider  requires  manual progress.  The application is expected to call
              fi_cq_read or fi_cntr_read function from time  to  time  when  no  other  libfabric
              function  is  called  to  ensure progress is made in a timely manner.  The provider
              does support auto progress mode.  However, the  performance  can  be  significantly
              impacted if the application purely depends on the provider to make auto progress.

       Unsupported features
              These  features  are unsupported: connection management, scalable endpoint, passive
              endpoint, shared receive context, send/inject with immediate data.

RUNTIME PARAMETERS

       The psm provider checks for the following environment variables:

       FI_PSM_UUID
              PSM requires that each job has a unique ID (UUID).  All the processes in  the  same
              job  need  to use the same UUID in order to be able to talk to each other.  The PSM
              reference manual advises to  keep  UUID  unique  to  each  job.   In  practice,  it
              generally  works  fine  to reuse UUID as long as (1) no two jobs with the same UUID
              are running at the same time; and (2) previous jobs with the same UUID have  exited
              normally.   If  running  into  “resource  busy” or “connection failure” issues with
              unknown reason, it is advisable to manually set the UUID to a value different  from
              the default.

       The default UUID is 0FFF0FFF-0000-0000-0000-0FFF0FFF0FFF.

       FI_PSM_NAME_SERVER
              The  psm  provider has a simple built-in name server that can be used to resolve an
              IP address or host name into a transport address needed by the  fi_av_insert  call.
              The  main  purpose  of  this  name  server  is  to  allow simple client-server type
              applications (such as those in fabtests)  to  be  written  purely  with  libfabric,
              without  using any out-of-band communication mechanism.  For such applications, the
              server would run first to allow endpoints be created and registered with  the  name
              server,  and  then  the client would call fi_getinfo with the node parameter set to
              the IP address or host name of the server.  The resulting fi_info  structure  would
              have  the  transport address of the endpoint created by the server in the dest_addr
              field.  Optionally the service parameter can be used in addition to  node.   Notice
              that  the  service  number  is interpreted by the provider and is not a TCP/IP port
              number.

       The name server is on by default.  It can be turned off by  setting  the  variable  to  0.
       This  may save a small amount of resource since a separate thread is created when the name
       server is on.

       The provider detects OpenMPI and MPICH runs and changes the default setting to off.

       FI_PSM_TAGGED_RMA
              The RMA functions are implemented on top of the PSM Active Message functions.   The
              Active  Message  functions  have  limit on the size of data can be transferred in a
              single message.  Large transfers can be divided into  small  chunks  and  be  pipe-
              lined.  However, the bandwidth is sub-optimal by doing this way.

       The  psm provider use PSM tag-matching message queue functions to achieve higher bandwidth
       for large size RMA.  For this purpose, a bit is reserved from the tag  space  to  separate
       the RMA traffic from the regular tagged message queue.

       The option is on by default.  To turn it off set the variable to 0.

       FI_PSM_AM_MSG
              The  psm provider implements the non-tagged message queue over the PSM tag-matching
              message queue.  One tag bit is reserved for this purpose.  Alternatively, the  non-
              tagged  message  queue  can  be implemented over Active Message.  This experimental
              feature has slightly larger latency.

       This option is off by default.  To turn it on set the variable to 1.

       FI_PSM_DELAY
              Time (seconds) to sleep before closing PSM endpoints.  This is a workaround  for  a
              bug in some versions of PSM library.

       The default setting is 1.

       FI_PSM_TIMEOUT
              Timeout  (seconds)  for gracefully closing PSM endpoints.  A forced closing will be
              issued if timeout expires.

       The default setting is 5.

       FI_PSM_PROG_INTERVAL
              When auto progress is enabled (asked via  the  hints  to  fi_getinfo),  a  progress
              thread  is  created  to make progress calls from time to time.  This option set the
              interval (microseconds) between progress calls.

       The default setting is 1 if affinity is set, or 1000 if not.  See FI_PSM_PROG_AFFINITY.

       FI_PSM_PROG_AFFINITY
              When set, specify the set of CPU cores to set the progress thread affinity to.  The
              format   is  <start>[:<end>[:<stride>]][,<start>[:<end>[:<stride>]]]*,  where  each
              triplet <start>:<end>:<stride> defines a block of core_ids.  Both <start> and <end>
              can be either the core_id (when >=0) or core_id - num_cores (when <0).

       By default affinity is not set.

SEE ALSO

       fabric(7), fi_provider(7), fi_psm2(7), fi_psm3(7),

AUTHORS

       OpenFabrics.