Provided by: libfabric-dev_1.17.0-3ubuntu1_amd64

NAME

       fi_shm - The SHM Fabric Provider

OVERVIEW

       The  SHM  provider  is a complete provider that can be used on Linux systems supporting shared memory and
       process_vm_readv/process_vm_writev  calls.   The  provider  is  intended  to   provide   high-performance
       communication between processes on the same system.

SUPPORTED FEATURES

       This release contains an initial implementation of the SHM provider that offers the following support:

       Endpoint types
              The provider supports only endpoint type FI_EP_RDM.

       Endpoint capabilities
              Endpoints can support any combination of the following data transfer capabilities: FI_MSG,
              FI_TAGGED, FI_RMA, and FI_ATOMICS.  These capabilities can be further refined by FI_SEND, FI_RECV,
              FI_READ, FI_WRITE, FI_REMOTE_READ, and FI_REMOTE_WRITE to limit the direction of operations.

       Modes  The provider does not require the use of any mode bits.

       Progress
              The  SHM provider supports FI_PROGRESS_MANUAL.  Receive side data buffers are not modified outside
              of completion processing routines.  The provider processes messages using three different methods,
              based  on  the  size  of  the  message.   For messages smaller than 4096 bytes, tx completions are
              generated immediately after the send.  For larger messages, tx completions are not generated until
              the receiving side has processed the message.

       Address Format
              The  SHM  provider  uses  the address format FI_ADDR_STR, which follows the general format pattern
              “[prefix]://[addr]”.  The application can provide addresses through the node or  hints  parameter.
              As long as the address is in a valid FI_ADDR_STR format (contains “://”), the address will be used
              as is.  If the application input is incorrectly formatted  or  no  input  was  provided,  the  SHM
              provider will resolve it according to the following SHM provider standards:

       (flags & FI_SOURCE) ? src_addr : dest_addr =
              - if (node && service)   : “fi_ns://node:service”
              - if (service)           : “fi_ns://service”
              - if (node && !service)  : “fi_shm://node”
              - if (!node && !service) : “fi_shm://PID”

       !(flags & FI_SOURCE)
              - src_addr = “fi_shm://PID”

       In other words, if the application provides a source and/or destination address in an acceptable
       FI_ADDR_STR format (contains “://”), the call to util_getinfo will successfully fill in src_addr and
       dest_addr with the provided input.  If the input is not in FI_ADDR_STR format, the shared memory provider
       will create a proper FI_ADDR_STR address with either the “fi_ns://” (node/service format) or
       “fi_shm://” (shm format) prefix.  The prefix signals whether the address is already unique or
       needs an extra endpoint name identifier appended in order to make it unique.

       For the shared memory provider, we assume that the service (with or without a node) is enough to make
       the address unique, but a node alone is not sufficient.  If only a node is provided, the “fi_shm://”
       prefix is used to signify that it is not a unique address.  If neither a node nor a service is provided
       (and in the case of setting the src address without FI_SOURCE and no hints), the process ID will be used
       as a default address.

       On endpoint creation, if the src_addr has the “fi_shm://” prefix, the provider will append
       “:[uid]:[ep_idx]” as a unique endpoint name (essentially, in place of a service).  In the case of the
       “fi_ns://” prefix (or any other prefix if one was provided by the application), no supplemental
       information is required to make it unique and it will remain with only the application-defined address.

       Note that the actual endpoint name will not include the FI_ADDR_STR “*://” prefix since it cannot be
       included in any shared memory region names.  The provider will strip off the prefix before setting the
       endpoint name.  As a result, the addresses “fi_prefix1://my_node:my_service” and
       “fi_prefix2://my_node:my_service” would result in endpoints and regions of the same name.  The
       application can also override the endpoint name after creating an endpoint using fi_setname() without
       any address format restrictions.
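
       The resolution rules above can be sketched in plain C.  This is only an illustration and not the
       provider's code: shm_resolve_addr() and shm_ep_name() are hypothetical helpers that do not exist in
       libfabric, the sketch models only the node and service inputs (not hints), and the uid and ep_idx
       values are supplied by the caller.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a libfabric function): build an FI_ADDR_STR
 * style address from node/service following the rules listed above.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int shm_resolve_addr(const char *node, const char *service,
                            char *out, size_t len)
{
    int n;

    if (node && strstr(node, "://"))
        n = snprintf(out, len, "%s", node);          /* already formatted: used as is */
    else if (node && service)
        n = snprintf(out, len, "fi_ns://%s:%s", node, service);
    else if (service)
        n = snprintf(out, len, "fi_ns://%s", service);
    else if (node)
        n = snprintf(out, len, "fi_shm://%s", node); /* node alone is not unique */
    else
        n = snprintf(out, len, "fi_shm://%ld", (long)getpid());

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: derive the endpoint name from a resolved address.
 * The "*://" prefix is stripped; for "fi_shm://" addresses the suffix
 * ":[uid]:[ep_idx]" is appended to make the name unique. */
static int shm_ep_name(const char *addr, unsigned int uid, int ep_idx,
                       char *out, size_t len)
{
    const char *sep = strstr(addr, "://");
    const char *name = sep ? sep + 3 : addr;
    int n;

    if (strncmp(addr, "fi_shm://", 9) == 0)
        n = snprintf(out, len, "%s:%u:%d", name, uid, ep_idx);
    else
        n = snprintf(out, len, "%s", name);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

       Under these assumptions, node “host” with service “svc” resolves to “fi_ns://host:svc” and yields
       the endpoint name “host:svc”, while node “host” alone resolves to “fi_shm://host” and, with uid 1000
       and endpoint index 0, yields “host:1000:0”.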

       Msg flags
              The provider currently only supports the FI_REMOTE_CQ_DATA msg flag.

       MR registration mode
              The provider implements FI_MR_VIRT_ADDR memory mode.

       Atomic operations
              The provider supports all combinations of datatype and operations as long as the
              message is less than 4096 bytes (or 2048 for compare operations).
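
       The size limits for atomic operations translate into a bound on the element count.  The sketch
       below is a hypothetical helper, not a libfabric function; it reads “less than” strictly, which is
       the conservative interpretation of the limits stated above.  An application would normally ask
       the provider for the actual limit with fi_query_atomic() instead.

```c
#include <stddef.h>

/* Limits taken from the text above. */
#define SHM_ATOMIC_LIMIT         4096
#define SHM_ATOMIC_COMPARE_LIMIT 2048

/* Hypothetical helper: returns 1 if count elements of datatype_size bytes
 * stay strictly below the limit stated above, 0 otherwise. */
static int shm_atomic_fits(size_t count, size_t datatype_size, int is_compare)
{
    size_t limit = is_compare ? SHM_ATOMIC_COMPARE_LIMIT : SHM_ATOMIC_LIMIT;

    if (datatype_size == 0)
        return 0;
    /* count * datatype_size < limit, written without the multiplication
     * so that large counts cannot overflow. */
    return count <= (limit - 1) / datatype_size;
}
```

       For 8-byte operands, this accepts up to 511 elements for regular atomics and up to 255 elements
       for compare operations.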

DSA

       Intel Data Streaming Accelerator (DSA) is an integrated accelerator in Intel Xeon processors starting
       with the Sapphire Rapids generation.  One of the capabilities of DSA is to offload memory copy operations
       from the CPU.  A system may have one or more DSA devices.  Each DSA device may have one or more work
       queues.  See the Intel DSA architecture specification for details.

       The SAR protocol of the SHM provider can use DSA to offload memory copy operations
       into and out of SAR buffers in shared memory regions.  To take full advantage of the DSA offload
       capability, memory copy operations are performed asynchronously.  The copy initiator thread constructs
       the DSA commands and submits them to work queues.  A copy operation may consist of more than one DSA
       command.  In that case, the commands are spread across all available work queues in round-robin
       fashion.  The progress thread checks for DSA command completions.  If a copy command completes
       successfully, the progress thread notifies the peer to consume the data.  If DSA encounters a page
       fault during command execution, the page fault is reported via completion records.  In that case, the
       progress thread accesses the page to resolve the page fault and resubmits the command after adjusting
       for partial completions.  One benefit of making memory copy operations asynchronous is that data
       transfers between different target endpoints can be initiated in parallel.  Use of Intel DSA in the
       SAR protocol is disabled by default and can be enabled
       using an environment variable.  Note that CMA must be disabled, e.g. FI_SHM_DISABLE_CMA=1, in order for
       DSA to be used.  See the RUNTIME PARAMETERS section.
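
       The command distribution and the page-fault adjustment described above can be sketched without any
       DSA hardware.  The names below (struct copy_cmd, split_round_robin(), adjust_for_partial()) and the
       per-command size limit max_cmd_len are assumptions made for illustration; they are not taken from
       the provider or from the accel-config library.

```c
#include <stddef.h>

/* Hypothetical description of one copy command. */
struct copy_cmd {
    size_t offset;  /* start of this chunk within the whole copy */
    size_t len;     /* bytes this command still has to copy */
    int wq;         /* index of the work queue it is submitted to */
};

/* Split a copy of total bytes into commands of at most max_cmd_len bytes
 * and assign them to work queues in round-robin order.  *next_wq carries
 * the rotation across calls.  Returns the number of commands written to
 * cmds, or -1 on invalid input or if max_cmds is too small. */
static int split_round_robin(size_t total, size_t max_cmd_len, int num_wqs,
                             int *next_wq, struct copy_cmd *cmds, int max_cmds)
{
    size_t off = 0;
    int n = 0;

    if (max_cmd_len == 0 || num_wqs <= 0)
        return -1;

    while (off < total) {
        size_t len = total - off < max_cmd_len ? total - off : max_cmd_len;

        if (n == max_cmds)
            return -1;
        cmds[n].offset = off;
        cmds[n].len = len;
        cmds[n].wq = *next_wq;
        *next_wq = (*next_wq + 1) % num_wqs;
        off += len;
        n++;
    }
    return n;
}

/* After a page fault, skip the bytes that were already copied so that the
 * resubmitted command covers only the remainder. */
static void adjust_for_partial(struct copy_cmd *cmd, size_t bytes_completed)
{
    if (bytes_completed > cmd->len)
        bytes_completed = cmd->len;
    cmd->offset += bytes_completed;
    cmd->len -= bytes_completed;
}
```

       With two work queues and an assumed limit of 4096 bytes per command, a 10000-byte copy becomes three
       commands on queues 0, 1, and 0.  If the second command faults after 1000 bytes, it is resubmitted
       for the remaining 3096 bytes starting at offset 5096.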

       Compiling with DSA capabilities depends on the accel-config library
       (https://github.com/intel/idxd-config).  Running with DSA requires Linux kernel 5.19.0-rc3 or later.

       DSA devices need to be set up just once before runtime.  This configuration file
       (https://github.com/intel/idxd-config/blob/stable/contrib/configs/os_profile.conf) can be used as a
       template with the accel-config utility to configure the DSA devices.
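
       Once the devices are configured, DSA offload can be requested from inside the application by setting
       the two environment variables before the first libfabric call.  This is a minimal sketch:
       shm_enable_dsa_env() is a hypothetical helper, setenv() is the POSIX call, and the value “1” is
       assumed to be accepted as a true value for these boolean parameters.

```c
#include <stdlib.h>

/* Hypothetical helper: request DSA offload for the SAR protocol.
 * Must run before libfabric is first used in the process, on the
 * assumption that the provider reads its parameters at initialization.
 * Returns 0 on success, -1 if an environment variable cannot be set. */
static int shm_enable_dsa_env(void)
{
    /* CMA has to be disabled for DSA to be used. */
    if (setenv("FI_SHM_DISABLE_CMA", "1", 1) != 0)
        return -1;
    /* Enable memory copy offload to DSA in the SAR protocol. */
    if (setenv("FI_SHM_USE_DSA_SAR", "1", 1) != 0)
        return -1;
    return 0;
}
```

       Exporting the same variables in the shell that launches the application has the same effect and
       avoids changing the program.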

LIMITATIONS

       The SHM provider has hard-coded maximums for supported queue sizes and data transfers.  These values  are
       reflected in the related fabric attribute structures.

       EPs must be bound to both RX and TX CQs.

       No support for counters.

RUNTIME PARAMETERS

       The shm provider checks for the following environment variables:

       FI_SHM_SAR_THRESHOLD
              Maximum message size to use segmentation protocol before switching to mmap (only valid when CMA is
              not available).  Default: SIZE_MAX (18446744073709551615)

       FI_SHM_TX_SIZE
              Maximum number of outstanding tx operations.  Default 1024

       FI_SHM_RX_SIZE
              Maximum number of outstanding rx operations.  Default 1024

       FI_SHM_DISABLE_CMA
              Manually disables CMA.  Default false

       FI_SHM_USE_DSA_SAR
              Enables memory copy offload to Intel DSA in SAR protocol.  Default false

       FI_SHM_ENABLE_DSA_PAGE_TOUCH
              Enables CPU touching of memory pages in a DSA command descriptor when the page fault is  reported,
              so  that  there  is  valid  address  translation for the remaining addresses in the command.  This
              minimizes DSA page faults.  Default false

SEE ALSO

       fabric(7), fi_provider(7), fi_getinfo(3)

AUTHORS

       OpenFabrics.