Provided by: libfabric-dev_1.6.2-3ubuntu0.1_amd64

NAME

       fi_endpoint - Fabric endpoint operations

       fi_endpoint / fi_scalable_ep / fi_passive_ep / fi_close
              Allocate or close an endpoint.

       fi_ep_bind
              Associate  an endpoint with hardware resources, such as event queues, completion queues, counters,
              address vectors, or shared transmit/receive contexts.

       fi_scalable_ep_bind
              Associate a scalable endpoint with an address vector.

       fi_pep_bind
              Associate a passive endpoint with an event queue.

       fi_enable
              Transitions an active endpoint into an enabled state.

       fi_cancel
              Cancel a pending asynchronous data transfer.

       fi_ep_alias
              Create an alias to the endpoint.

       fi_control
              Control endpoint operation.

       fi_getopt / fi_setopt
              Get or set endpoint options.

       fi_rx_context / fi_tx_context / fi_srx_context / fi_stx_context
              Open a transmit or receive context.

       fi_rx_size_left / fi_tx_size_left (DEPRECATED)
              Query the lower bound on how many RX/TX operations may be posted without  an  operation  returning
              -FI_EAGAIN.  These functions have been deprecated and will be removed in a future version of the
              library.

SYNOPSIS

              #include <rdma/fabric.h>

              #include <rdma/fi_endpoint.h>

              int fi_endpoint(struct fid_domain *domain, struct fi_info *info,
                  struct fid_ep **ep, void *context);

              int fi_scalable_ep(struct fid_domain *domain, struct fi_info *info,
                  struct fid_ep **sep, void *context);

              int fi_passive_ep(struct fid_fabric *fabric, struct fi_info *info,
                  struct fid_pep **pep, void *context);

              int fi_tx_context(struct fid_ep *sep, int index,
                  struct fi_tx_attr *attr, struct fid_ep **tx_ep,
                  void *context);

              int fi_rx_context(struct fid_ep *sep, int index,
                  struct fi_rx_attr *attr, struct fid_ep **rx_ep,
                  void *context);

              int fi_stx_context(struct fid_domain *domain,
                  struct fi_tx_attr *attr, struct fid_stx **stx,
                  void *context);

              int fi_srx_context(struct fid_domain *domain,
                  struct fi_rx_attr *attr, struct fid_ep **rx_ep,
                  void *context);

              int fi_close(struct fid *ep);

              int fi_ep_bind(struct fid_ep *ep, struct fid *fid, uint64_t flags);

              int fi_scalable_ep_bind(struct fid_ep *sep, struct fid *fid, uint64_t flags);

              int fi_pep_bind(struct fid_pep *pep, struct fid *fid, uint64_t flags);

              int fi_enable(struct fid_ep *ep);

              int fi_cancel(struct fid_ep *ep, void *context);

              int fi_ep_alias(struct fid_ep *ep, struct fid_ep **alias_ep, uint64_t flags);

              int fi_control(struct fid *ep, int command, void *arg);

              int fi_getopt(struct fid *ep, int level, int optname,
                  void *optval, size_t *optlen);

              int fi_setopt(struct fid *ep, int level, int optname,
                  const void *optval, size_t optlen);

              DEPRECATED ssize_t fi_rx_size_left(struct fid_ep *ep);

              DEPRECATED ssize_t fi_tx_size_left(struct fid_ep *ep);

ARGUMENTS

       fid : On creation, specifies a fabric or access domain.  On bind, identifies the event queue,  completion
       queue,  counter,  or address vector to bind to the endpoint.  In other cases, it's a fabric identifier of
       an associated resource.

       info : Details about the fabric interface endpoint to be opened, obtained from fi_getinfo.

       ep : A fabric endpoint.

       sep : A scalable fabric endpoint.

       pep : A passive fabric endpoint.

       context : Context associated with the endpoint or asynchronous operation.

       index : Index to retrieve a specific transmit/receive context.

       attr : Transmit or receive context attributes.

       flags : Additional flags to apply to the operation.

       command : Command of control operation to perform on endpoint.

       arg : Optional control argument.

       level : Protocol level at which the desired option resides.

       optname : The protocol option to read or set.

        optval : The option value read or to be set.

       optlen : The size of the optval buffer.

DESCRIPTION

       Endpoints are transport level communication portals.  There  are  two  types  of  endpoints:  active  and
       passive.   Passive  endpoints  belong  to  a fabric domain and are most often used to listen for incoming
       connection requests.  However, a passive endpoint may be used to reserve a fabric  address  that  can  be
       granted to an active endpoint.  Active endpoints belong to access domains and can perform data transfers.

       Active  endpoints  may  be  connection-oriented or connectionless, and may provide data reliability.  The
       data transfer interfaces -- messages (fi_msg), tagged messages (fi_tagged),  RMA  (fi_rma),  and  atomics
       (fi_atomic)  --  are  associated  with active endpoints.  In basic configurations, an active endpoint has
       transmit and receive queues.  In general, operations that generate traffic on the fabric  are  posted  to
       the  transmit  queue.   This  includes  all  RMA and atomic operations, along with sent messages and sent
       tagged messages.  Operations that post buffers for receiving incoming data are submitted to  the  receive
       queue.

       Active  endpoints  are  created in the disabled state.  They must transition into an enabled state before
       accepting data transfer operations, including posting of receive buffers.  The fi_enable call is used  to
       transition  an  active  endpoint  into  an  enabled  state.  The fi_connect and fi_accept calls will also
        transition an endpoint into the enabled state, if it is not already enabled.

       In order to transition an endpoint into an enabled state,  it  must  be  bound  to  one  or  more  fabric
       resources.   An  endpoint  that  will  generate  asynchronous  completions,  either through data transfer
       operations or communication establishment events, must be bound to the appropriate completion  queues  or
       event  queues, respectively, before being enabled.  Additionally, endpoints that use manual progress must
       be associated with relevant completion queues or event queues in order to drive progress.  For  endpoints
       that  are  only  used  as  the  target  of RMA or atomic operations, this means binding the endpoint to a
       completion queue associated with receive processing.  Unconnected endpoints must be bound to  an  address
       vector.
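
        For example, a typical setup sequence for a connectionless endpoint is sketched below.  The sketch
        assumes that a domain, an fi_info structure returned by fi_getinfo, a completion queue (cq), and an
        address vector (av) have already been opened (see fi_domain(3), fi_cq(3), and fi_av(3)); error checking
        is abbreviated.

               struct fid_ep *ep;
               int ret;

               /* Allocate the active endpoint; it starts in the disabled state. */
               ret = fi_endpoint(domain, info, &ep, NULL);

               /* Bind a CQ capable of reporting both transmit and receive completions. */
               ret = fi_ep_bind(ep, &cq->fid, FI_TRANSMIT | FI_RECV);

               /* Connectionless endpoints must also be bound to an address vector. */
               ret = fi_ep_bind(ep, &av->fid, 0);

               /* Transition the endpoint into the enabled state. */
               ret = fi_enable(ep);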

       Once an endpoint has been activated, it may be associated with an address vector.  Receive buffers may be
       posted  to  it  and calls may be made to connection establishment routines.  Connectionless endpoints may
       also perform data transfers.

       The behavior of an endpoint may be adjusted by setting its  control  data  and  protocol  options.   This
       allows  the  underlying  provider  to  redirect  function  calls to implementations optimized to meet the
       desired application behavior.

       If an endpoint experiences a critical error, it will transition back into  a  disabled  state.   Critical
       errors  are  reported  through  the  event  queue  associated  with the EP.  In certain cases, a disabled
       endpoint may be re-enabled.  The ability to transition back into an enabled state  is  provider  specific
       and depends on the type of error that the endpoint experienced.  When an endpoint is disabled as a result
       of a critical error, all pending operations are discarded.

   fi_endpoint / fi_passive_ep / fi_scalable_ep
       fi_endpoint   allocates  a  new  active  endpoint.   fi_passive_ep  allocates  a  new  passive  endpoint.
       fi_scalable_ep allocates a scalable endpoint.  The properties and behavior of the  endpoint  are  defined
       based  on  the provided struct fi_info.  See fi_getinfo for additional details on fi_info.  fi_info flags
       that control the operation of an endpoint are defined below.  See section SCALABLE ENDPOINTS.

       If an active endpoint is allocated in order to accept a connection request, the fi_info parameter must be
       the same as the fi_info structure provided with the connection request (FI_CONNREQ) event.

       An active endpoint may acquire the properties of a passive endpoint by setting the fi_info  handle  field
       to  the  passive  endpoint  fabric  descriptor.  This is useful for applications that need to reserve the
       fabric address of an endpoint prior to knowing if the endpoint will be used on the active or passive side
       of a connection.  For example, this feature is useful for simulating socket semantics.   Once  an  active
       endpoint  acquires  the  properties of a passive endpoint, the passive endpoint is no longer bound to any
       fabric resources and must no longer be used.  The user is expected to close the  passive  endpoint  after
       opening the active endpoint in order to free up any lingering resources that had been used.
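
        The handle transfer described above might look as follows.  This is only a sketch; pep is assumed to
        have been allocated with fi_passive_ep, and info is the fi_info structure used to open the active
        endpoint.

               /* Hand the reserved fabric address from the passive endpoint to
                * the active endpoint by pointing info->handle at the pep. */
               info->handle = &pep->fid;
               ret = fi_endpoint(domain, info, &ep, NULL);

               /* The passive endpoint may no longer be used; close it to free
                * any lingering resources. */
               if (!ret)
                   fi_close(&pep->fid);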

   fi_close
        Closes an endpoint and releases all resources associated with it.

        When closing a scalable endpoint, there must be no open transmit or receive contexts associated with it.
        If resources are still associated with the scalable endpoint when attempting to close, the call will
        return -FI_EBUSY.

       Outstanding operations posted to the endpoint when fi_close  is  called  will  be  discarded.   Discarded
       operations  will silently be dropped, with no completions reported.  Additionally, a provider may discard
       previously completed operations from  the  associated  completion  queue(s).   The  behavior  to  discard
       completed operations is provider specific.

   fi_ep_bind
       fi_ep_bind  is used to associate an endpoint with hardware resources.  The common use of fi_ep_bind is to
       direct asynchronous operations associated with an endpoint to a completion queue.  An  endpoint  must  be
       bound with CQs capable of reporting completions for any asynchronous operation initiated on the endpoint.
       This  is  true  even for endpoints which are configured to suppress successful completions, in order that
       operations that complete in error may be reported to the user.   For  passive  endpoints,  this  requires
       binding the endpoint with an EQ that supports the communication management (CM) domain.

       An  active endpoint may direct asynchronous completions to different CQs, based on the type of operation.
       This is specified using fi_ep_bind flags.  The following flags may be used separately or  OR'ed  together
       when binding an endpoint to a completion domain CQ.

       FI_TRANSMIT  :  Directs  the  completion  of  outbound data transfer requests to the specified completion
       queue.  This includes send message, RMA, and atomic operations.

       FI_RECV : Directs the notification of inbound data transfers to the  specified  completion  queue.   This
       includes  received  messages.   This binding automatically includes FI_REMOTE_WRITE, if applicable to the
       endpoint.

       FI_SELECTIVE_COMPLETION : By default,  data  transfer  operations  generate  completion  entries  into  a
       completion  queue  after  they  have  successfully  completed.   Applications  can  use this bind flag to
       selectively enable when  completions  are  generated.   If  FI_SELECTIVE_COMPLETION  is  specified,  data
       transfer  operations  will not generate entries for successful completions unless FI_COMPLETION is set as
       an operational flag for the given operation.  FI_SELECTIVE_COMPLETION  must  be  OR'ed  with  FI_TRANSMIT
       and/or FI_RECV flags.

       When  FI_SELECTIVE_COMPLETION  is  set,  the  user  must  determine  when  a  request  that does NOT have
       FI_COMPLETION set has completed indirectly, usually based on the completion of  a  subsequent  operation.
       Use of this flag may improve performance by allowing the provider to avoid writing a completion entry for
       every operation.

       Example:  An  application  can  selectively  generate  send  completions  by  using the following general
       approach:

                fi_tx_attr::op_flags = 0; // default - no completion
                fi_ep_bind(ep, cq, FI_TRANSMIT | FI_SELECTIVE_COMPLETION);
                fi_send(ep, ...);                   // no completion
                fi_sendv(ep, ...);                  // no completion
                fi_sendmsg(ep, ..., FI_COMPLETION); // completion!
                fi_inject(ep, ...);                 // no completion

       Example: An application can selectively disable send completions by modifying the operational flags:

                fi_tx_attr::op_flags = FI_COMPLETION; // default - completion
                fi_ep_bind(ep, cq, FI_TRANSMIT | FI_SELECTIVE_COMPLETION);
                fi_send(ep, ...);       // completion
                fi_sendv(ep, ...);      // completion
                fi_sendmsg(ep, ..., 0); // no completion!
                fi_inject(ep, ...);     // no completion!

       Example: Omitting FI_SELECTIVE_COMPLETION when binding will generate completions  for  all  non-fi_inject
       calls:

                fi_tx_attr::op_flags = 0;
                fi_ep_bind(ep, cq, FI_TRANSMIT);    // default - completion
                fi_send(ep, ...);                   // completion
                fi_sendv(ep, ...);                  // completion
                fi_sendmsg(ep, ..., 0);             // completion!
                fi_sendmsg(ep, ..., FI_COMPLETION); // completion
                fi_sendmsg(ep, ..., FI_INJECT|FI_COMPLETION); // completion!
                fi_inject(ep, ...);                 // no completion!

       An  endpoint may also be bound to a fabric counter.  When binding an endpoint to a counter, the following
       flags may be specified.

       FI_SEND : Increments the specified counter whenever a message transfer initiated over  the  endpoint  has
       completed successfully or in error.  Sent messages include both tagged and normal message operations.

       FI_RECV  :  Increments  the specified counter whenever a message is received over the endpoint.  Received
       messages include both tagged and normal message operations.

       FI_READ : Increments the specified counter whenever an RMA read or atomic fetch operation initiated  from
       the endpoint has completed successfully or in error.

       FI_WRITE  : Increments the specified counter whenever an RMA write or atomic operation initiated from the
       endpoint has completed successfully or in error.

       FI_REMOTE_READ : Increments the specified counter whenever an RMA  read  or  atomic  fetch  operation  is
       initiated  from  a  remote  endpoint that targets the given endpoint.  Use of this flag requires that the
       endpoint be created using FI_RMA_EVENT.

       FI_REMOTE_WRITE : Increments the specified counter whenever an RMA write or atomic operation is initiated
       from a remote endpoint that targets the given endpoint.  Use of this flag requires that the  endpoint  be
       created using FI_RMA_EVENT.

        An endpoint may only be bound to a single CQ or counter for a given type of operation.  For example, an EP
       may  not  bind  to  two  counters  both  using FI_WRITE.  Furthermore, providers may limit CQ and counter
       bindings to endpoints of the same endpoint type (DGRAM, MSG, RDM, etc.).

       Connectionless endpoints must be bound to a single address vector.

       If an endpoint is using a shared transmit and/or receive context, the shared contexts must  be  bound  to
       the endpoint.  CQs, counters, AV, and shared contexts must be bound to endpoints before they are enabled.
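
        As an illustration, an endpoint might be bound to two counters as sketched below, with send_cntr and
        recv_cntr assumed to have been opened with fi_cntr_open (see fi_cntr(3)):

               /* One counter tracks all locally initiated transfers (sends,
                * RMA/atomic writes, and RMA/atomic reads). */
               ret = fi_ep_bind(ep, &send_cntr->fid, FI_SEND | FI_WRITE | FI_READ);

               /* A second counter tracks received messages. */
               ret = fi_ep_bind(ep, &recv_cntr->fid, FI_RECV);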

   fi_scalable_ep_bind
       fi_scalable_ep_bind  is  used  to  associate  a scalable endpoint with an address vector.  See section on
       SCALABLE ENDPOINTS.  A scalable endpoint has a single transport level address and  can  support  multiple
       transmit  and  receive  contexts.   The  transmit and receive contexts share the transport-level address.
       Address vectors that are bound to scalable endpoints are implicitly bound  to  any  transmit  or  receive
       contexts created using the scalable endpoint.

   fi_enable
       This  call  transitions the endpoint into an enabled state.  An endpoint must be enabled before it may be
       used to perform data transfers.  Enabling an endpoint  typically  results  in  hardware  resources  being
       assigned  to  it.   Endpoints  making  use  of  completion queues, counters, event queues, and/or address
       vectors must be bound to them before being enabled.

        Calling connect or accept on an endpoint will implicitly enable the endpoint if it has not already been
        enabled.

       fi_enable  may also be used to re-enable an endpoint that has been disabled as a result of experiencing a
       critical error.  Applications should check the return value from fi_enable to see if a disabled  endpoint
        has successfully been re-enabled.

   fi_cancel
       fi_cancel  attempts  to  cancel an outstanding asynchronous operation.  Canceling an operation causes the
       fabric provider to search for the operation and, if it is still  pending,  complete  it  as  having  been
       canceled.   An  error  queue  entry  will  be  available  in  the  associated error queue with error code
       FI_ECANCELED.  On the other hand, if the operation completed before  the  call  to  fi_cancel,  then  the
       completion  status  of  that operation will be available in the associated completion queue.  No specific
       entry related to fi_cancel itself will be posted.

       Cancel uses the context parameter associated with  an  operation  to  identify  the  request  to  cancel.
       Operations  posted  without  a valid context parameter -- either no context parameter is specified or the
       context value was ignored by the provider -- cannot be  canceled.   If  multiple  outstanding  operations
       match the context parameter, only one will be canceled.  In this case, the operation which is canceled is
       provider  specific.   The  cancel operation is asynchronous, but will complete within a bounded period of
       time.
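
        A sketch of canceling a posted receive and retrieving the resulting FI_ECANCELED entry is shown below;
        ep is assumed to be enabled and bound to cq for FI_RECV completions.

               char buf[256];
               struct fi_context recv_ctx;
               struct fi_cq_err_entry err;
               ssize_t ret;

               /* The context pointer passed at post time identifies the request. */
               ret = fi_recv(ep, buf, sizeof(buf), NULL, FI_ADDR_UNSPEC, &recv_ctx);

               /* Attempt to cancel it using the same context value. */
               ret = fi_cancel(&ep->fid, &recv_ctx);

               /* If the receive was still pending, fi_cq_read will eventually
                * return -FI_EAVAIL; the error entry then carries FI_ECANCELED. */
               if (fi_cq_readerr(cq, &err, 0) > 0 && err.err == FI_ECANCELED)
                   ;   /* err.op_context equals &recv_ctx */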

   fi_ep_alias
       This call creates an alias to the specified  endpoint.   Conceptually,  an  endpoint  alias  provides  an
       alternate  software  path  from the application to the underlying provider hardware.  An alias EP differs
       from its parent endpoint only by its default data transfer flags.   For  example,  an  alias  EP  may  be
       configured  to  use a different completion mode.  By default, an alias EP inherits the same data transfer
       flags as the parent endpoint.  An application can use fi_control  to  modify  the  alias  EP  operational
       flags.

       When  allocating an alias, an application may configure either the transmit or receive operational flags.
       This avoids needing a separate call to fi_control to set those flags.  The flags  passed  to  fi_ep_alias
       must include FI_TRANSMIT or FI_RECV (not both) with other operational flags OR'ed in.  This will override
       the  transmit  or  receive  flags,  respectively,  for operations posted through the alias endpoint.  All
       allocated aliases must be closed for the underlying endpoint to be released.
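
        For example, an application that normally suppresses successful transmit completions might create an
        alias that reports them, as sketched below:

               struct fid_ep *alias_ep;
               int ret;

               /* Transmits posted through the alias default to FI_COMPLETION,
                * while the parent endpoint keeps its original flags. */
               ret = fi_ep_alias(ep, &alias_ep, FI_TRANSMIT | FI_COMPLETION);

               /* All aliases must be closed before the parent endpoint is released. */
               fi_close(&alias_ep->fid);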

   fi_control
       The control operation is used to adjust the default behavior of an endpoint.  It  allows  the  underlying
       provider  to  redirect  function  calls  to  implementations  optimized  to  meet the desired application
        behavior.  As a result, calls to fi_control must be serialized against all other calls to an endpoint.

       The base operation of an endpoint is selected  during  creation  using  struct  fi_info.   The  following
       control commands and arguments may be assigned to an endpoint.

        FI_GETOPSFLAG - uint64_t *flags : Used to retrieve the current value of flags associated with the data
        transfer operations initiated on the endpoint.  The control argument must include FI_TRANSMIT or FI_RECV
        (not both) to indicate the type of data transfer flags to be returned.  See below for a list of control
        flags.

        FI_SETOPSFLAG - uint64_t *flags : Used to change the data transfer operation flags associated with an
        endpoint.  The control argument must include FI_TRANSMIT or FI_RECV (not both) to indicate the type of
        data transfer that the flags should apply to, with other flags OR'ed in.  The given flags will override
        the previous transmit and receive attributes that were set when the endpoint was created.  Valid control
        flags are defined below.

        FI_BACKLOG - int *value : This option only applies to passive endpoints.  It is used to set the
        connection request backlog for listening endpoints.

        FI_GETWAIT - void ** : This command allows the user to retrieve the file descriptor associated with a
        socket endpoint.  The fi_control arg parameter should be an address where a pointer to the returned file
        descriptor will be written.  See fi_eq.3 for additional details on using fi_control with FI_GETWAIT.  The
        file descriptor may be used for notification that the endpoint is ready to send or receive data.
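
        A sketch of reading and updating the default transmit operation flags with fi_control is shown below:

               uint64_t flags;
               int ret;

               /* Retrieve the current default transmit flags. */
               flags = FI_TRANSMIT;
               ret = fi_control(&ep->fid, FI_GETOPSFLAG, &flags);

               /* Generate a completion by default for every transmit operation. */
               flags = FI_TRANSMIT | FI_COMPLETION;
               ret = fi_control(&ep->fid, FI_SETOPSFLAG, &flags);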

   fi_getopt / fi_setopt
        Endpoint protocol operations may be retrieved using fi_getopt or set using fi_setopt.  Applications
        specify the level at which a desired option exists, identify the option, and provide input/output buffers
        to get or set the option.  fi_setopt provides an application a way to adjust low-level protocol and
        implementation specific details of an endpoint.

       The following option levels and option names and parameters are defined.

        FI_OPT_ENDPOINT

        • FI_OPT_MIN_MULTI_RECV - size_t : Defines the minimum receive buffer space available when the receive
          buffer is released by the provider (see FI_MULTI_RECV).  Modifying this value is only guaranteed to set
          the minimum buffer space needed on receives posted after the value has been changed.  It is recommended
          that applications that want to override the default MIN_MULTI_RECV value set this option before
          enabling the corresponding endpoint.

       • FI_OPT_CM_DATA_SIZE  -  size_t  :  Defines  the size of available space in CM messages for user-defined
         data.  This value limits the amount of data that applications can exchange between peer endpoints using
         the fi_connect, fi_accept,  and  fi_reject  operations.   The  size  returned  is  dependent  upon  the
         properties  of  the  endpoint,  except in the case of passive endpoints, in which the size reflects the
         maximum size of the data that may be present as part of a connection request  event.   This  option  is
         read only.
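
        A sketch of using these options is shown below; the minimum multi-receive value is illustrative and
        should be set before the endpoint is enabled.

               size_t min_space = 1024;
               size_t cm_size, len = sizeof(cm_size);
               int ret;

               /* Keep at least min_space bytes available before the provider
                * releases a multi-recv buffer. */
               ret = fi_setopt(&ep->fid, FI_OPT_ENDPOINT, FI_OPT_MIN_MULTI_RECV,
                               &min_space, sizeof(min_space));

               /* Query how much user data may accompany CM operations. */
               ret = fi_getopt(&ep->fid, FI_OPT_ENDPOINT, FI_OPT_CM_DATA_SIZE,
                               &cm_size, &len);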

   fi_rx_size_left (DEPRECATED)
       This  function has been deprecated and will be removed in a future version of the library.  It may not be
       supported by all providers.

       The fi_rx_size_left call returns a lower bound on the number of receive operations that may be posted  to
       the given endpoint without that operation returning -FI_EAGAIN.  Depending on the specific details of the
       subsequently  posted  receive  operations (e.g., number of iov entries, which receive function is called,
       etc.), it may be possible to post more receive operations than originally indicated by fi_rx_size_left.

   fi_tx_size_left (DEPRECATED)
       This function has been deprecated and will be removed in a future version of the library.  It may not  be
       supported by all providers.

       The fi_tx_size_left call returns a lower bound on the number of transmit operations that may be posted to
       the given endpoint without that operation returning -FI_EAGAIN.  Depending on the specific details of the
       subsequently  posted transmit operations (e.g., number of iov entries, which transmit function is called,
       etc.), it may be possible to post more transmit operations than originally indicated by fi_tx_size_left.

ENDPOINT ATTRIBUTES

       The fi_ep_attr structure defines the set of attributes associated with an endpoint.  Endpoint  attributes
       may be further refined using the transmit and receive context attributes as shown below.

              struct fi_ep_attr {
                  enum fi_ep_type type;
                  uint32_t        protocol;
                  uint32_t        protocol_version;
                  size_t          max_msg_size;
                  size_t          msg_prefix_size;
                  size_t          max_order_raw_size;
                  size_t          max_order_war_size;
                  size_t          max_order_waw_size;
                  uint64_t        mem_tag_format;
                  size_t          tx_ctx_cnt;
                  size_t          rx_ctx_cnt;
                  size_t          auth_key_size;
                  uint8_t         *auth_key;
              };

   type - Endpoint Type
       If specified, indicates the type of fabric interface communication desired.  Supported types are:

       FI_EP_UNSPEC  :  The  type  of  endpoint is not specified.  This is usually provided as input, with other
       attributes of the endpoint or the provider selecting the type.

       FI_EP_MSG : Provides a reliable,  connection-oriented  data  transfer  service  with  flow  control  that
       maintains message boundaries.

       FI_EP_DGRAM  :  Supports  a  connectionless,  unreliable  datagram communication.  Message boundaries are
       maintained, but the maximum message size may  be  limited  to  the  fabric  MTU.   Flow  control  is  not
       guaranteed.

       FI_EP_RDM  : Reliable datagram message.  Provides a reliable, unconnected data transfer service with flow
       control that maintains message boundaries.

       FI_EP_SOCK_STREAM : Data streaming  endpoint  with  TCP  socket-like  semantics.   Provides  a  reliable,
       connection-oriented  data  transfer service that does not maintain message boundaries.  FI_EP_SOCK_STREAM
       is most useful for applications designed around using TCP sockets.  See the SOCKET ENDPOINT  section  for
       additional details and restrictions that apply to stream endpoints.

       FI_EP_SOCK_DGRAM  :  A  connectionless,  unreliable  datagram  endpoint  with  UDP socket-like semantics.
       FI_EP_SOCK_DGRAM is most useful for applications designed around  using  UDP  sockets.   See  the  SOCKET
       ENDPOINT section for additional details and restrictions that apply to datagram socket endpoints.
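
        The endpoint type is typically requested through the hints passed to fi_getinfo(3), as sketched below
        for a reliable, unconnected endpoint; the requested capabilities are illustrative.

               struct fi_info *hints, *info;
               int ret;

               hints = fi_allocinfo();
               hints->ep_attr->type = FI_EP_RDM;   /* reliable, unconnected */
               hints->caps = FI_MSG | FI_RMA;

               ret = fi_getinfo(FI_VERSION(1, 6), NULL, NULL, 0, hints, &info);
               fi_freeinfo(hints);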

   Protocol
       Specifies  the  low-level end to end protocol employed by the provider.  A matching protocol must be used
       by communicating endpoints to ensure  interoperability.   The  following  protocol  values  are  defined.
       Provider  specific  protocols  are also allowed.  Provider specific protocols will be indicated by having
       the upper bit of the protocol value set to one.

        FI_PROTO_UNSPEC : The protocol is not specified.  This is usually provided as input, with other
        attributes of the endpoint or the provider selecting the actual protocol.

        FI_PROTO_RDMA_CM_IB_RC : The protocol runs over InfiniBand reliable-connected queue pairs, using the RDMA
       CM protocol for connection establishment.

       FI_PROTO_IWARP : The protocol runs over the Internet wide area RDMA protocol transport.

        FI_PROTO_IB_UD : The protocol runs over InfiniBand unreliable datagram queue pairs.

       FI_PROTO_PSMX  :  The protocol is based on an Intel proprietary protocol known as PSM, performance scaled
       messaging.  PSMX is an extended version of the PSM protocol to support the libfabric interfaces.

       FI_PROTO_UDP :  The  protocol  sends  and  receives  UDP  datagrams.   For  example,  an  endpoint  using
       FI_PROTO_UDP  will  be  able  to communicate with a remote peer that is using Berkeley SOCK_DGRAM sockets
       using IPPROTO_UDP.

       FI_PROTO_SOCK_TCP : The protocol is layered over TCP packets.

       FI_PROTO_IWARP_RDM : Reliable-datagram protocol implemented over iWarp reliable-connected queue pairs.

       FI_PROTO_IB_RDM : Reliable-datagram protocol implemented over InfiniBand reliable-connected queue pairs.

       FI_PROTO_GNI : Protocol runs over Cray GNI low-level interface.

       FI_PROTO_RXM : Reliable-datagram protocol implemented over message endpoints.  RXM is a libfabric utility
       component that adds RDM endpoint semantics over MSG endpoint semantics.

       FI_PROTO_RXD : Reliable-datagram protocol implemented  over  datagram  endpoints.   RXD  is  a  libfabric
       utility component that adds RDM endpoint semantics over DGRAM endpoint semantics.

        FI_PROTO_NETWORKDIRECT : Protocol runs over the Microsoft NetworkDirect service provider interface.  This
        adds reliable-datagram semantics over the NetworkDirect connection-oriented endpoint semantics.

       FI_PROTO_PSMX2 : The protocol is based on an Intel proprietary protocol known as PSM2, performance scaled
       messaging version 2.  PSMX2 is an extended  version  of  the  PSM2  protocol  to  support  the  libfabric
       interfaces.

   protocol_version - Protocol Version
        Identifies which version of the protocol is employed by the provider.  The protocol version allows
        providers to extend an existing protocol, for example by adding support for additional features or
        functionality, in a backward compatible manner.  Providers that support different versions of the same
        protocol should inter-operate, but only when using the capabilities defined for the lesser version.

   max_msg_size - Max Message Size
       Defines the maximum size for an application data transfer as a single operation.

   msg_prefix_size - Message Prefix Size
       Specifies  the  size  of  any  required  message  prefix  buffer  space.  This field will be 0 unless the
        FI_MSG_PREFIX mode is enabled.  If msg_prefix_size is > 0, the specified value will be a multiple of
        8 bytes.

   Max RMA Ordered Size
       The  maximum  ordered  size specifies the delivery order of transport data into target memory for RMA and
       atomic operations.  Data ordering is separate, but dependent on message ordering (defined  below).   Data
       ordering is unspecified where message order is not defined.

       Data ordering refers to the access of target memory by subsequent operations.  When back to back RMA read
       or  write  operations  access  the  same  registered memory location, data ordering indicates whether the
       second operation reads or writes the target memory after the first operation has completed.  Because  RMA
       ordering  applies  between two operations, and not within a single data transfer, ordering is defined per
       byte-addressable memory location.  I.e.  ordering specifies whether location X is accessed by the  second
       operation  after  the  first  operation.   Nothing is implied about the completion of the first operation
       before the second operation is initiated.

       In order to support large data transfers being broken into multiple packets and sent using multiple paths
       through the fabric, data ordering may be limited to transfers of a  specific  size  or  less.   Providers
       specify  when  data ordering is maintained through the following values.  Note that even if data ordering
       is not maintained, message ordering may be.

       max_order_raw_size : Read after write size.  If set, an RMA or atomic read operation issued after an  RMA
       or  atomic  write  operation, both of which are smaller than the size, will be ordered.  Where the target
       memory locations overlap, the RMA or atomic read operation will see the results of the  previous  RMA  or
       atomic write.

       max_order_war_size : Write after read size.  If set, an RMA or atomic write operation issued after an RMA
       or  atomic  read  operation, both of which are smaller than the size, will be ordered.  The RMA or atomic
       read operation will see the initial value of the target memory location before a subsequent RMA or atomic
       write updates the value.

       max_order_waw_size : Write after write size.  If set, an RMA or atomic write operation  issued  after  an
       RMA  or  atomic  write  operation,  both of which are smaller than the size, will be ordered.  The target
       memory location will reflect the results of the second RMA or atomic write.

       An order size value of 0 indicates that ordering is not guaranteed.  A value of  -1  guarantees  ordering
       for any data size.

   mem_tag_format - Memory Tag Format
       The  memory  tag  format is a bit array used to convey the number of tagged bits supported by a provider.
       Additionally, it may be used to divide the bit array into separate fields.  The mem_tag_format optionally
       begins with a series of bits set to 0, to signify bits which are ignored by the provider.  Following  the
       initial  prefix  of  ignored bits, the array will consist of alternating groups of bits set to all 1's or
       all 0's.  Each group of bits corresponds to a tagged field.  The implication of defining a  tagged  field
       is  that when a mask is applied to the tagged bit array, all bits belonging to a single field will either
       be set to 1 or 0, collectively.

       For example, a mem_tag_format of 0x30FF indicates support for 14 tagged bits, separated  into  3  fields.
       The first field consists of 2-bits, the second field 4-bits, and the final field 8-bits.  Valid masks for
       such  a  tagged  field would be a bitwise OR'ing of zero or more of the following values: 0x3000, 0x0F00,
       and 0x00FF.

       By identifying fields within a tag, a provider may  be  able  to  optimize  their  search  routines.   An
       application  which requests tag fields must provide tag masks that either set all mask bits corresponding
       to a field to all 0 or all 1.  When negotiating tag fields, an application can request a specific  number
       of  fields  of  a  given size.  A provider must return a tag format that supports the requested number of
       fields, with each field being at least the size requested, or fail the request.  A provider may  increase
       the  size of the fields.  When reporting completions (see FI_CQ_FORMAT_TAGGED), the provider must provide
       the exact value of the received tag, clearing out any unsupported tag bits.

       It is recommended that field sizes be ordered from smallest to largest.  A generic, unstructured tag  and
       mask can be achieved by requesting a bit array consisting of alternating 1's and 0's.
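
        Continuing the 0x30FF example, a tag composed of three application-defined fields, together with an
        ignore mask that disregards the low 8-bit field when matching (see fi_tagged(3)), might be built as
        sketched below; the field names are purely illustrative.

               /* mem_tag_format 0x30FF: a 2-bit, a 4-bit, and an 8-bit field.
                * src, chan, and msg_id are application-provided values. */
               uint64_t tag = ((uint64_t)(src    & 0x3)  << 12) |  /* 2-bit field */
                              ((uint64_t)(chan   & 0xF)  <<  8) |  /* 4-bit field */
                               (uint64_t)(msg_id & 0xFF);          /* 8-bit field */

               /* Match on the first two fields only; ignore the 8-bit field. */
               uint64_t ignore = 0x00FF;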

   tx_ctx_cnt - Transmit Context Count
       Number  of  transmit  contexts  to  associate with the endpoint.  If not specified (0), 1 context will be
       assigned if the endpoint supports outbound transfers.  Transmit contexts are independent transmit  queues
       that  may be separately configured.  Each transmit context may be bound to a separate CQ, and no ordering
       is defined between contexts.  Additionally, no synchronization  is  needed  when  accessing  contexts  in
       parallel.

       If  the  count  is  set  to  the value FI_SHARED_CONTEXT, the endpoint will be configured to use a shared
       transmit context, if supported by the provider.  Providers that do not support shared  transmit  contexts
       will fail the request.

       See the scalable endpoint and shared contexts sections for additional details.

   rx_ctx_cnt - Receive Context Count
       Number  of receive contexts to associate with the endpoint.  If not specified, 1 context will be assigned
       if the endpoint supports inbound transfers.  Receive contexts are independent processing queues that  may
       be separately configured.  Each receive context may be bound to a separate CQ, and no ordering is defined
       between contexts.  Additionally, no synchronization is needed when accessing contexts in parallel.

       If  the  count  is  set  to  the value FI_SHARED_CONTEXT, the endpoint will be configured to use a shared
       receive context, if supported by the provider.  Providers that do not  support  shared  receive  contexts
       will fail the request.

       See the scalable endpoint and shared contexts sections for additional details.

   auth_key_size - Authorization Key Length
       The  length  of  the  authorization  key  in  bytes.   This field will be 0 if authorization keys are not
       available or used.  This field is ignored unless the fabric is opened with API version 1.5 or greater.

   auth_key - Authorization Key
       If supported by the fabric, an authorization key (a.k.a.  job key) to associate with  the  endpoint.   An
       authorization  key  is  used  to  limit  communication  between  endpoints.  Only peer endpoints that are
       programmed to use the same authorization key may communicate.   Authorization  keys  are  often  used  to
       implement job keys, to ensure that processes running in different jobs do not accidentally cross traffic.
       The domain authorization key will be used if auth_key_size is set to 0.  This field is ignored unless the
       fabric is opened with API version 1.5 or greater.

TRANSMIT CONTEXT ATTRIBUTES

       Attributes specific to the transmit capabilities of an endpoint are specified using struct fi_tx_attr.

              struct fi_tx_attr {
                  uint64_t  caps;
                  uint64_t  mode;
                  uint64_t  op_flags;
                  uint64_t  msg_order;
                  uint64_t  comp_order;
                  size_t    inject_size;
                  size_t    size;
                  size_t    iov_limit;
                  size_t    rma_iov_limit;
              };

   caps - Capabilities
       The  requested  capabilities of the context.  The capabilities must be a subset of those requested of the
       associated endpoint.  See the CAPABILITIES section of fi_getinfo(3) for capability details.  If the  caps
       field is 0 on input to fi_getinfo(3), the caps value from the fi_info structure will be used.

   mode
       The  operational  mode  bits of the context.  The mode bits will be a subset of those associated with the
       endpoint.  See the MODE section of fi_getinfo(3) for details.  A mode value of 0 will be ignored on input
       to fi_getinfo(3),  with  the  mode  value  of  the  fi_info  structure  used  instead.   On  return  from
       fi_getinfo(3), the mode will be set only to those constraints specific to transmit operations.

   op_flags - Default transmit operation flags
        Flags that control the behavior of operations submitted against the context.  Applicable flags are
        listed in the Operation Flags section.

   msg_order - Message Ordering
       Message ordering refers to the order in which transport layer headers (as viewed by the application)  are
       processed.   Relaxed message order enables data transfers to be sent and received out of order, which may
       improve performance by utilizing multiple paths through the fabric from  the  initiating  endpoint  to  a
       target  endpoint.   Message  order  applies  only  between a single source and destination endpoint pair.
       Ordering between different target endpoints is not defined.

       Message order is determined using a set of ordering bits.   Each  set  bit  indicates  that  ordering  is
       maintained  between  data  transfers of the specified type.  Message order is defined for [read | write |
       send] operations submitted by an application after [read | write | send] operations.

        Message ordering only applies to the end to end transmission of transport headers.  Message ordering is
        necessary for, but does not by itself guarantee, the order in which message data is sent or received by
        the transport layer.  Message ordering requires matching ordering semantics on the receiving side of a
        data transfer operation in order to guarantee that ordering is met.

       FI_ORDER_NONE : No ordering is specified.  This value may be used as input in order to obtain the default
       message order supported by the provider.  FI_ORDER_NONE is an alias for the value 0.

       FI_ORDER_RAR  :  Read  after  read.   If set, RMA and atomic read operations are transmitted in the order
       submitted relative to other RMA and atomic read operations.  If not set, RMA  and  atomic  reads  may  be
       transmitted out of order from their submission.

       FI_ORDER_RAW  :  Read  after  write.  If set, RMA and atomic read operations are transmitted in the order
       submitted relative to RMA and atomic write  operations.   If  not  set,  RMA  and  atomic  reads  may  be
       transmitted ahead of RMA and atomic writes.

       FI_ORDER_RAS  :  Read  after  send.   If set, RMA and atomic read operations are transmitted in the order
       submitted relative to message send operations, including tagged sends.  If not set, RMA and atomic  reads
       may be transmitted ahead of sends.

       FI_ORDER_WAR  :  Write  after read.  If set, RMA and atomic write operations are transmitted in the order
       submitted relative to RMA and atomic read  operations.   If  not  set,  RMA  and  atomic  writes  may  be
       transmitted ahead of RMA and atomic reads.

       FI_ORDER_WAW  :  Write after write.  If set, RMA and atomic write operations are transmitted in the order
       submitted relative to other RMA and atomic write operations.  If not set, RMA and atomic  writes  may  be
       transmitted out of order from their submission.

       FI_ORDER_WAS  :  Write  after send.  If set, RMA and atomic write operations are transmitted in the order
       submitted relative to message send operations, including tagged sends.  If not set, RMA and atomic writes
       may be transmitted ahead of sends.

       FI_ORDER_SAR : Send after read.  If set, message send operations, including tagged sends, are transmitted
       in order submitted relative to RMA and atomic  read  operations.   If  not  set,  message  sends  may  be
       transmitted ahead of RMA and atomic reads.

       FI_ORDER_SAW  :  Send  after  write.   If  set,  message  send  operations,  including  tagged sends, are
       transmitted in order submitted relative to RMA and atomic write operations.  If not  set,  message  sends
       may be transmitted ahead of RMA and atomic writes.

        FI_ORDER_SAS : Send after send.  If set, message send operations, including tagged sends, are transmitted
        in the order submitted relative to other message send operations.  If not set, message sends may be
        transmitted out of order from their submission.

   comp_order - Completion Ordering
       Completion ordering refers to the order in which completed  requests  are  written  into  the  completion
       queue.   Completion  ordering  is  similar  to message order.  Relaxed completion order may enable faster
       reporting of completed transfers, allow acknowledgments to be  sent  over  different  fabric  paths,  and
       support  more sophisticated retry mechanisms.  This can result in lower-latency completions, particularly
       when using unconnected endpoints.  Strict completion ordering may require that providers queue  completed
       operations or limit available optimizations.

       For  transmit  requests,  completion ordering depends on the endpoint communication type.  For unreliable
       communication, completion ordering applies to all data transfer requests submitted to an  endpoint.   For
       reliable  communication,  completion  ordering  only applies to requests that target a single destination
       endpoint.  Completion ordering of requests that target different endpoints over a reliable  transport  is
       not defined.

       Applications  should  specify  the  completion  ordering  that they support or require.  Providers should
       return the completion order that they actually provide, with the constraint that the returned ordering is
       stricter than that specified by the application.  Supported completion order values are:

       FI_ORDER_NONE : No ordering is defined for completed operations.   Requests  submitted  to  the  transmit
       context may complete in any order.

       FI_ORDER_STRICT : Requests complete in the order in which they are submitted to the transmit context.

   inject_size
       The  requested inject operation size (see the FI_INJECT flag) that the context will support.  This is the
       maximum size data transfer that can be associated with an inject operation (such as fi_inject) or may  be
       used with the FI_INJECT data transfer flag.

   size
       The  size of the context.  The size is specified as the minimum number of transmit operations that may be
       posted to the endpoint without the operation returning -FI_EAGAIN.

   iov_limit
       This is the maximum number of IO vectors (scatter-gather elements) that a  single  posted  operation  may
       reference.

   rma_iov_limit
       This  is  the  maximum number of RMA IO vectors (scatter-gather elements) that an RMA or atomic operation
       may reference.  The rma_iov_limit corresponds to the rma_iov_count values in RMA and  atomic  operations.
       See struct fi_msg_rma and struct fi_msg_atomic in fi_rma.3 and fi_atomic.3, for additional details.  This
       limit  applies  to  both  the number of RMA IO vectors that may be specified when initiating an operation
       from the local endpoint, as well as the maximum number of IO vectors that may  be  carried  in  a  single
       request from a remote endpoint.

RECEIVE CONTEXT ATTRIBUTES

       Attributes specific to the receive capabilities of an endpoint are specified using struct fi_rx_attr.

              struct fi_rx_attr {
                  uint64_t  caps;
                  uint64_t  mode;
                  uint64_t  op_flags;
                  uint64_t  msg_order;
                  uint64_t  comp_order;
                  size_t    total_buffered_recv;
                  size_t    size;
                  size_t    iov_limit;
              };

   caps - Capabilities
       The  requested  capabilities of the context.  The capabilities must be a subset of those requested of the
        associated endpoint.  See the CAPABILITIES section of fi_getinfo(3) for capability details.  If the caps
       field is 0 on input to fi_getinfo(3), the caps value from the fi_info structure will be used.

   mode
       The  operational  mode  bits of the context.  The mode bits will be a subset of those associated with the
       endpoint.  See the MODE section of fi_getinfo(3) for details.  A mode value of 0 will be ignored on input
       to fi_getinfo(3),  with  the  mode  value  of  the  fi_info  structure  used  instead.   On  return  from
       fi_getinfo(3), the mode will be set only to those constraints specific to receive operations.

   op_flags - Default receive operation flags
        Flags that control the behavior of operations submitted against the context.  Applicable flags are
        listed in the Operation Flags section.

   msg_order - Message Ordering
       For a description of message ordering, see the msg_order field in the Transmit Context Attribute section.
       Receive context message ordering defines the order  in  which  received  transport  message  headers  are
       processed when received by an endpoint.

       The  following ordering flags, as defined for transmit ordering, also apply to the processing of received
       operations:  FI_ORDER_NONE,  FI_ORDER_RAR,  FI_ORDER_RAW,   FI_ORDER_RAS,   FI_ORDER_WAR,   FI_ORDER_WAW,
       FI_ORDER_WAS, FI_ORDER_SAR, FI_ORDER_SAW, and FI_ORDER_SAS.

   comp_order - Completion Ordering
       For  a  description  of  completion  ordering, see the comp_order field in the Transmit Context Attribute
       section.

       FI_ORDER_NONE : No ordering is defined for completed operations.  Receive operations may complete in  any
       order, regardless of their submission order.

       FI_ORDER_STRICT  :  Receive  operations  complete in the order in which they are processed by the receive
       context, based on the receive side msg_order attribute.

       FI_ORDER_DATA : When set, this bit indicates that received data is written into memory  in  order.   Data
       ordering  applies  to  memory  accessed  as  part of a single operation and between operations if message
       ordering is guaranteed.

   total_buffered_recv
       This field is supported for backwards compatibility purposes.  It is a hint to the provider of the  total
       available  space  that  may be needed to buffer messages that are received for which there is no matching
       receive operation.  The provider may adjust or ignore this value.  The  allocation  of  internal  network
        buffering among received messages is provider specific.  For instance, a provider may limit the size of
       messages which can be buffered or the amount of buffering allocated to a single message.

       If receive side buffering is disabled (total_buffered_recv = 0) and a message is received by an endpoint,
        then the behavior is dependent on whether resource management has been enabled (FI_RM_ENABLED has been
        set or not).  See the Resource Management section of fi_domain.3 for further clarification.  It is
       recommended that  applications  enable  resource  management  if  they  anticipate  receiving  unexpected
       messages, rather than modifying this value.

   size
       The  size  of the context.  The size is specified as the minimum number of receive operations that may be
       posted to the endpoint without the operation returning -FI_EAGAIN.

   iov_limit
        This is the maximum number of IO vectors (scatter-gather elements) that a single posted operation may
        reference.

SCALABLE ENDPOINTS

       A  scalable  endpoint  is  a  communication  portal that supports multiple transmit and receive contexts.
       Scalable endpoints are loosely modeled after the networking concept  of  transmit/receive  side  scaling,
       also  known  as  multi-queue.  Support for scalable endpoints is domain specific.  Scalable endpoints may
       improve the performance of multi-threaded and  parallel  applications,  by  allowing  threads  to  access
       independent transmit and receive queues.  A scalable endpoint has a single transport level address, which
       can  reduce  the  memory  requirements  needed  to  store  remote  addressing data, versus using standard
       endpoints.  Scalable endpoints cannot be used directly for  communication  operations,  and  require  the
       application to explicitly create transmit and receive contexts as described below.
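
        One plausible setup flow is sketched below; the fi_tx_context and fi_rx_context calls are described in
        the following subsections.  The per-context completion queues (tx_cq, rx_cq), the address vector (av),
        and the context count NCTX are assumed to have been created elsewhere, and error checking is
        abbreviated.

               struct fid_ep *sep, *tx[NCTX], *rx[NCTX];
               int i, ret;

               /* Request the desired number of contexts before allocation. */
               info->ep_attr->tx_ctx_cnt = NCTX;
               info->ep_attr->rx_ctx_cnt = NCTX;

               ret = fi_scalable_ep(domain, info, &sep, NULL);
               ret = fi_scalable_ep_bind(sep, &av->fid, 0);

               for (i = 0; i < NCTX; i++) {
                   ret = fi_tx_context(sep, i, NULL, &tx[i], NULL);
                   ret = fi_rx_context(sep, i, NULL, &rx[i], NULL);
                   /* Each context is bound to its own CQ and enabled separately. */
                   ret = fi_ep_bind(tx[i], &tx_cq[i]->fid, FI_TRANSMIT);
                   ret = fi_ep_bind(rx[i], &rx_cq[i]->fid, FI_RECV);
                   ret = fi_enable(tx[i]);
                   ret = fi_enable(rx[i]);
               }

               /* The scalable endpoint itself is also enabled. */
               ret = fi_enable(sep);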

   fi_tx_context
       Transmit contexts are independent transmit queues.  Ordering and synchronization between contexts are not
       defined.   Conceptually  a  transmit context behaves similar to a send-only endpoint.  A transmit context
       may be configured with fewer capabilities than the base endpoint and with different attributes  (such  as
       ordering  requirements  and  inject size) than other contexts associated with the same scalable endpoint.
       Each transmit context has its own completion queue.  The number of transmit contexts associated  with  an
       endpoint is specified during endpoint creation.

       The  fi_tx_context  call  is  used  to retrieve a specific context, identified by an index (see above for
       details on transmit context attributes).  Providers may dynamically allocate contexts when  fi_tx_context
       is  called,  or  may  statically create all contexts when fi_endpoint is invoked.  By default, a transmit
       context inherits the properties of its associated endpoint.  However, applications  may  request  context
       specific  attributes through the attr parameter.  Support for per transmit context attributes is provider
       specific and not guaranteed.  Providers will return the actual attributes assigned to the context through
       the attr parameter, if provided.

   fi_rx_context
       Receive  contexts  are  independent  receive  queues  for  receiving   incoming   data.    Ordering   and
       synchronization between contexts are not guaranteed.  Conceptually a receive context behaves similar to a
       receive-only  endpoint.   A  receive  context  may  be  configured  with fewer capabilities than the base
       endpoint and with different attributes (such  as  ordering  requirements  and  inject  size)  than  other
       contexts  associated with the same scalable endpoint.  Each receive context has its own completion queue.
       The number of receive contexts associated with an endpoint is specified during endpoint creation.

        Receive contexts are often associated with steering flows that specify which incoming packets targeting
        a scalable endpoint they process.  However, receive contexts may be targeted directly by the initiator,
        if supported by the underlying protocol.  Such contexts are referred to as 'named'.  Support for named
        contexts must be indicated by setting the FI_NAMED_RX_CTX capability when the corresponding endpoint is
        created.  Support for named receive contexts is coordinated with address vectors.  See fi_av(3) and
        fi_rx_addr(3).

       The fi_rx_context call is used to retrieve a specific context, identified by  an  index  (see  above  for
       details  on  receive context attributes).  Providers may dynamically allocate contexts when fi_rx_context
       is called, or may statically create all contexts when fi_endpoint is  invoked.   By  default,  a  receive
       context  inherits  the  properties of its associated endpoint.  However, applications may request context
       specific attributes through the attr parameter.  Support for per receive context attributes  is  provider
       specific and not guaranteed.  Providers will return the actual attributes assigned to the context through
       the attr parameter, if provided.

SHARED CONTEXTS

       Shared  contexts  are  transmit  and  receive  contexts explicitly shared among one or more endpoints.  A
       shareable context allows an application to use  a  single  dedicated  provider  resource  among  multiple
       transport  addressable  endpoints.   This can greatly reduce the resources needed to manage communication
       over multiple endpoints by multiplexing transmit and/or receive processing, with the  potential  cost  of
       serializing access across multiple endpoints.  Support for shareable contexts is domain specific.

       Conceptually,  shareable  transmit  contexts  are transmit queues that may be accessed by many endpoints.
       The use of a shared transmit context is mostly opaque to an application.  Applications must allocate  and
       bind  shared  transmit contexts to endpoints, but operations are posted directly to the endpoint.  Shared
       transmit contexts are not associated with completion queues or counters.  Completed operations are posted
       to the CQs bound to the endpoint.  An endpoint may only be  associated  with  a  single  shared  transmit
       context.

       Unlike shared transmit contexts, applications interact directly with shared receive contexts.  Users post
       receive  buffers  directly  to a shared receive context, with the buffers usable by any endpoint bound to
       the shared receive context.  Shared receive  contexts  are  not  associated  with  completion  queues  or
       counters.   Completed  receive  operations  are posted to the CQs bound to the endpoint.  An endpoint may
       only be associated with a single receive context, and all  connectionless  endpoints  associated  with  a
       shared receive context must also share the same address vector.

       Endpoints associated with a shared transmit context may use dedicated receive contexts, and vice-versa, or
       an endpoint may use shared transmit and receive contexts.  There is no requirement that the same group of
       endpoints sharing a context of one type also share a context of the other type.  Furthermore, an endpoint
       may use a shared context of one type, but a scalable set of contexts of the alternate type.

   fi_stx_context
       This  call  is  used  to open a shareable transmit context (see above for details on the transmit context
       attributes).  Endpoints associated with a shared transmit context must  use  a  subset  of  the  transmit
       context's  attributes.   Note  that  this  is  the  reverse  of the requirement for transmit contexts for
       scalable endpoints.
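
       A hedged sketch of shared transmit context usage follows; the NULL attribute argument, the zero bind
       flags, and the omission of error handling are assumptions made for brevity.

              struct fid_stx *stx;
              int ret;

              /* Open a shared transmit context on the domain. */
              ret = fi_stx_context(domain, NULL, &stx, NULL);

              /* Bind the shared context to two endpoints.  Operations are
               * still posted to ep1/ep2, and completions are reported to
               * the CQs bound to each endpoint. */
              ret = fi_ep_bind(ep1, &stx->fid, 0);
              ret = fi_ep_bind(ep2, &stx->fid, 0);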

   fi_srx_context
       This allocates a shareable receive context (see above for details on  the  receive  context  attributes).
       Endpoints associated with a shared receive context must use a subset of the receive context's attributes.
       Note that this is the reverse of the requirement for receive contexts for scalable endpoints.
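
       Similarly, a sketch of shared receive context usage is shown below; the NULL attribute and desc arguments
       are assumptions, and any endpoint bound to the context may consume the posted buffer.

              struct fid_ep *srx;
              ssize_t rc;
              int ret;

              /* Open a shared receive context and bind it to both endpoints. */
              ret = fi_srx_context(domain, NULL, &srx, NULL);
              ret = fi_ep_bind(ep1, &srx->fid, 0);
              ret = fi_ep_bind(ep2, &srx->fid, 0);

              /* Receive buffers are posted to the shared context itself. */
              rc = fi_recv(srx, buf, len, NULL, FI_ADDR_UNSPEC, NULL);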

SOCKET ENDPOINTS

       The  following  feature and description should be considered experimental.  Until the experimental tag is
       removed, the interfaces, semantics, and data structures  associated  with  socket  endpoints  may  change
       between library versions.

       This section applies to endpoints of type FI_EP_SOCK_STREAM and FI_EP_SOCK_DGRAM, commonly referred to as
       socket endpoints.

       Socket  endpoints  are  defined  with  semantics  that allow them to more easily be adopted by developers
       familiar with the UNIX socket API, or by middleware that exposes  the  socket  API,  while  still  taking
       advantage of high-performance hardware features.

       The key difference between socket endpoints and other active endpoints is that socket endpoints use
       synchronous data transfers.  Buffers passed into send and receive operations revert to the control of the
       application upon returning from the function call.   As  a  result,  no  data  transfer  completions  are
       reported to the application, and socket endpoints are not associated with completion queues or counters.

       Socket  endpoints  support  a  subset  of  message  operations:  fi_send,  fi_sendv, fi_sendmsg, fi_recv,
       fi_recvv, fi_recvmsg, and fi_inject.  Because data transfers are synchronous, the return value from send
       and receive operations indicates the number of bytes transferred on success, or a negative value on error,
       including  -FI_EAGAIN  if  the  endpoint cannot send or receive any data because of full or empty queues,
       respectively.
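
       The following sketch illustrates the synchronous semantics described above; ep, buf, len, and dest_addr
       are placeholders, and the busy-wait retry on -FI_EAGAIN is shown only for brevity.

              ssize_t n;

              /* The call returns the number of bytes transferred, or a
               * negative fabric errno such as -FI_EAGAIN when the
               * transmit queue is full. */
              do {
                      n = fi_send(ep, buf, len, NULL, dest_addr, NULL);
              } while (n == -FI_EAGAIN);

              if (n < 0) {
                      /* handle error */
              }
              /* On success, buf is already back under application control. */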

       Socket endpoints are associated with event queues and address vectors, and process connection  management
       events asynchronously, similar to other endpoints.  Unlike UNIX sockets, socket endpoints must still be
       declared as either active or passive.

       Socket endpoints behave like non-blocking sockets.  In order to support select and poll semantics, active
       socket endpoints are associated with a file descriptor that is signaled whenever the endpoint is ready to
       send and/or receive data.  The file descriptor may be retrieved using fi_control.
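
       As a sketch only, the file descriptor might be retrieved as shown below; the use of the FI_GETWAIT command
       for this purpose is an assumption and may differ by provider.

              int fd, ret;

              /* Assumed command: FI_GETWAIT retrieves the native wait
               * object, here a file descriptor usable with poll() or
               * select(). */
              ret = fi_control(&ep->fid, FI_GETWAIT, &fd);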

OPERATION FLAGS

       Operation flags are specified by OR-ing the following flags together.  Operation flags define the default
       flags applied to an endpoint's data transfer operations when no flags parameter is available.  Data
       transfer operations that take flags as input override the op_flags value of the transmit or receive
       context attributes of an endpoint.
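
       For illustration, default operation flags may be set through the transmit and receive attributes passed at
       endpoint creation; the specific flag values below are examples only, and error handling is omitted.

              struct fi_info *fi;   /* returned by fi_getinfo() */
              int ret;

              /* Default flags applied to operations issued without an
               * explicit flags argument. */
              fi->tx_attr->op_flags = FI_INJECT | FI_COMPLETION;
              fi->rx_attr->op_flags = FI_COMPLETION;

              ret = fi_endpoint(domain, fi, &ep, NULL);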

       FI_INJECT : Indicates that all outbound data buffers should be returned to the user's control immediately
       after  a  data  transfer call returns, even if the operation is handled asynchronously.  This may require
       that the provider copy the data into a local buffer and transfer out of  that  buffer.   A  provider  can
       limit  the  total  amount of send data that may be buffered and/or the size of a single send that can use
       this flag.  This limit is indicated using inject_size (see inject_size above).

       FI_MULTI_RECV : Applies to posted receive operations.  This flag allows the user to post a single  buffer
       that  will  receive multiple incoming messages.  Received messages will be packed into the receive buffer
       until the buffer has been consumed.  Use of this flag may cause a  single  posted  receive  operation  to
       generate multiple completions as messages are placed into the buffer.  The placement of received data
       into the buffer may be subject to provider-specific alignment restrictions.  The buffer will be
       released  by  the  provider  when  the  available  buffer  space  falls  below the specified minimum (see
       FI_OPT_MIN_MULTI_RECV).
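
       A sketch of multi-receive usage follows; the buffer, its size, the threshold value, and the omitted desc
       field are illustrative assumptions.

              size_t min_free = 1024;
              ssize_t rc;
              int ret;
              struct iovec iov = {
                      .iov_base = big_buf,
                      .iov_len  = big_len,
              };
              struct fi_msg msg = {
                      .msg_iov   = &iov,
                      .iov_count = 1,
                      .addr      = FI_ADDR_UNSPEC,
              };

              /* Set the minimum free space required before the provider
               * releases the buffer back to the application. */
              ret = fi_setopt(&ep->fid, FI_OPT_ENDPOINT, FI_OPT_MIN_MULTI_RECV,
                              &min_free, sizeof(min_free));

              /* A single posted buffer may generate multiple completions. */
              rc = fi_recvmsg(ep, &msg, FI_MULTI_RECV);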

       FI_COMPLETION : Indicates that a completion entry should be generated for data transfer operations.  This
       flag only applies to operations issued on endpoints  that  were  bound  to  a  CQ  or  counter  with  the
       FI_SELECTIVE_COMPLETION flag.  See the fi_ep_bind section above for more detail.
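
       As an illustrative sketch, with a CQ bound using FI_SELECTIVE_COMPLETION, only operations that explicitly
       carry FI_COMPLETION generate an entry; the msg structure and other variables are placeholders.

              ssize_t rc;
              int ret;

              /* Bind the transmit CQ with selective completion enabled. */
              ret = fi_ep_bind(ep, &cq->fid, FI_TRANSMIT | FI_SELECTIVE_COMPLETION);
              ret = fi_enable(ep);

              /* Completes silently on success (no CQ entry generated). */
              rc = fi_send(ep, buf, len, NULL, dest_addr, NULL);

              /* Generates a completion entry because FI_COMPLETION is set. */
              rc = fi_sendmsg(ep, &msg, FI_COMPLETION);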

       FI_INJECT_COMPLETE  :  Indicates  that  a completion should be generated when the source buffer(s) may be
       reused.  A completion guarantees that the buffers will not be read from again  and  the  application  may
       reclaim them.  No other guarantees are made with respect to the state of the operation.

       Note:  This flag is used to control when a completion entry is inserted into a completion queue.  It does
       not apply to operations that do not generate a completion queue entry, such as the  fi_inject  operation,
       and is not subject to the inject_size message limit restriction.

       FI_TRANSMIT_COMPLETE  :  Indicates  that a completion should be generated when the transmit operation has
       completed relative to the local provider.  The exact behavior is dependent on the endpoint type.

       For reliable endpoints:

       Indicates that a completion should be generated when  the  operation  has  been  delivered  to  the  peer
       endpoint.   A  completion  guarantees  that  the  operation is no longer dependent on the fabric or local
       resources.  The state of the operation at the peer endpoint is not defined.

       For unreliable endpoints:

       Indicates that a completion should be generated when the operation has been delivered to the  fabric.   A
       completion  guarantees  that  the  operation is no longer dependent on local resources.  The state of the
       operation within the fabric is not defined.

       FI_DELIVERY_COMPLETE : Indicates that a completion should not be generated until an  operation  has  been
       processed  by  the  destination endpoint(s).  A completion guarantees that the result of the operation is
       available.

       This completion mode applies only to  reliable  endpoints.   For  operations  that  return  data  to  the
       initiator,  such  as  RMA  read  or  atomic-fetch,  the  source endpoint is also considered a destination
       endpoint.  This is the default completion mode for such operations.

       FI_COMMIT_COMPLETE : Indicates that a completion should not be generated (locally or at the peer) until
       the result of an operation has been made persistent.  A completion guarantees that the result is both
       available and durable, even in the case of power failure.

       This completion mode applies only to operations that  target  persistent  memory  regions  over  reliable
       endpoints.  This completion mode is experimental.

       FI_MULTICAST  :  Indicates that data transfers will target multicast addresses by default.  Any fi_addr_t
       passed into a data transfer operation will be treated as a multicast address.

NOTES

       Users should call fi_close to release all resources allocated to the fabric endpoint.

       Endpoints allocated with the FI_CONTEXT mode set must typically provide struct fi_context  as  their  per
       operation context parameter.  (See fi_getinfo(3) for details.)  However, when FI_SELECTIVE_COMPLETION is
       enabled to suppress completion entries, and an operation is initiated without the FI_COMPLETION flag set,
       then the context parameter is ignored.  An application does not need to pass in a valid struct fi_context
       into such data transfers.

       Operations that complete in error and are not associated with a valid operational context will use the
       endpoint context in any error reporting structures.

       Although applications typically  associate  individual  completions  with  either  completion  queues  or
       counters,  an  endpoint can be attached to both a counter and completion queue.  When combined with using
       selective completions, this allows an application to use counters to track successful completions, with a
       CQ used to report errors.  Operations that complete  with  an  error  increment  the  error  counter  and
       generate  a  completion  event.   The  generation of entries going to the CQ can then be controlled using
       FI_SELECTIVE_COMPLETION.

       As mentioned in fi_getinfo(3), the ep_attr structure can be used to query providers that support  various
       endpoint  attributes.  fi_getinfo can return provider info structures that can support the minimal set of
       requirements (such that the application maintains correctness).  However, it  can  also  return  provider
       info  structures that exceed application requirements.  As an example, consider an application requesting
       msg_order as FI_ORDER_NONE.  The resulting output from fi_getinfo may have all  the  ordering  bits  set.
       The  application  can  reset  the  ordering  bits  it does not require before creating the endpoint.  The
       provider is free to implement a stricter ordering than is required by the application.
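
       For example, an application might mask unneeded ordering bits before creating the endpoint, as sketched
       below; the chosen bits, the fi_getinfo arguments, and the lack of error handling are illustrative.

              struct fi_info *hints, *fi;
              int ret;

              hints = fi_allocinfo();
              hints->tx_attr->msg_order = FI_ORDER_NONE;

              ret = fi_getinfo(FI_VERSION(1, 6), NULL, NULL, 0, hints, &fi);

              /* The provider may advertise stricter ordering than requested;
               * clear the bits the application does not rely on. */
              fi->tx_attr->msg_order &= ~(FI_ORDER_RAW | FI_ORDER_WAW);

              ret = fi_endpoint(domain, fi, &ep, NULL);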

RETURN VALUES

       Returns 0 on success.  On error, a negative  value  corresponding  to  fabric  errno  is  returned.   For
       fi_cancel, a return value of 0 indicates that the cancel request was submitted for processing.

       Fabric errno values are defined in rdma/fi_errno.h.

ERRORS

       -FI_EDOMAIN  :  A  resource  domain was not bound to the endpoint or an attempt was made to bind multiple
       domains.

       -FI_ENOCQ : The endpoint has not been configured with the necessary completion queue.

       -FI_EOPBADSTATE : The endpoint's state does not permit the requested operation.

SEE ALSO

       fi_getinfo(3), fi_domain(3), fi_msg(3), fi_tagged(3), fi_rma(3)

AUTHORS

       OpenFabrics.

Libfabric Programmer's Manual                      2018-02-13                                     fi_endpoint(3)