Provided by: libfabric-dev_1.6.2-3ubuntu0.1_amd64

NAME

       fi_eq - Event queue operations

       fi_eq_open / fi_close : Open/close an event queue

       fi_control : Control operation of EQ

       fi_eq_read / fi_eq_readerr : Read an event from an event queue

       fi_eq_write : Writes an event to an event queue

       fi_eq_sread : A synchronous (blocking) read of an event queue

       fi_eq_strerror : Converts provider specific error information into a printable string

SYNOPSIS

              #include <rdma/fi_domain.h>

              int fi_eq_open(struct fid_fabric *fabric, struct fi_eq_attr *attr,
                  struct fid_eq **eq, void *context);

              int fi_close(struct fid *eq);

              int fi_control(struct fid *eq, int command, void *arg);

              ssize_t fi_eq_read(struct fid_eq *eq, uint32_t *event,
                  void *buf, size_t len, uint64_t flags);

              ssize_t fi_eq_readerr(struct fid_eq *eq, struct fi_eq_err_entry *buf,
                  uint64_t flags);

              ssize_t fi_eq_write(struct fid_eq *eq, uint32_t event,
                  const void *buf, size_t len, uint64_t flags);

              ssize_t fi_eq_sread(struct fid_eq *eq, uint32_t *event,
                  void *buf, size_t len, int timeout, uint64_t flags);

              const char * fi_eq_strerror(struct fid_eq *eq, int prov_errno,
                    const void *err_data, char *buf, size_t len);

ARGUMENTS

       fabric : Opened fabric descriptor

       eq : Event queue

       attr : Event queue attributes

       context : User specified context associated with the event queue.

       event : Reported event

       buf : For read calls, the data buffer to write events into.  For write calls, an event to insert into the
       event queue.  For fi_eq_strerror, an optional buffer that receives printable error information.

       len : Length of data buffer

       flags : Additional flags to apply to the operation

       command : Command of control operation to perform on EQ.

       arg : Optional control argument

       prov_errno : Provider specific error value

       err_data : Provider specific error data related to a completion

       timeout : Timeout specified in milliseconds

DESCRIPTION

       Event  queues  are  used  to  report events associated with control operations.  They are associated with
       memory registration, address  vectors,  connection  management,  and  fabric  and  domain  level  events.
       Reported events are either associated with a requested operation or affiliated with a call that registers
       for specific types of events, such as listening for connection requests.

   fi_eq_open
       fi_eq_open allocates a new event queue.

       The properties and behavior of an event queue are defined by struct fi_eq_attr.

              struct fi_eq_attr {
                  size_t               size;      /* # entries for EQ */
                  uint64_t             flags;     /* operation flags */
                  enum fi_wait_obj     wait_obj;  /* requested wait object */
                  int                  signaling_vector; /* interrupt affinity */
                  struct fid_wait     *wait_set;  /* optional wait set */
              };

       size : Specifies the minimum size of an event queue.

       flags : Flags that control the configuration of the EQ.

       • FI_WRITE  :  Indicates that the application requires support for inserting user events into the EQ.  If
         this flag is set, then the fi_eq_write operation must be supported by the provider.   If  the  FI_WRITE
         flag is not set, then the application may not invoke fi_eq_write.

       • FI_AFFINITY : Indicates that the signaling_vector field (see below) is valid.

       wait_obj  : EQs may be associated with a specific wait object.  Wait objects allow applications to block
       until the wait object is signaled, indicating that an event is available  to  be  read.   Users  may  use
       fi_control  to  retrieve  the  underlying  wait object associated with an EQ, in order to use it in other
       system calls.  The following values may be used to specify the type of wait object associated with an EQ:

       • FI_WAIT_NONE : Used to indicate that the user will not  block  (wait)  for  events  on  the  EQ.   When
         FI_WAIT_NONE  is  specified,  the application may not call fi_eq_sread.  This is the default if no wait
         object is specified.

       • FI_WAIT_UNSPEC : Specifies that the user will only wait on the EQ using fabric interface calls, such as
         fi_eq_sread.  In this case, the  underlying  provider  may  select  the  most  appropriate  or  highest
         performing  wait  object  available,  including  custom  wait  mechanisms.   Applications  that  select
         FI_WAIT_UNSPEC are not guaranteed to retrieve the underlying wait object.

       • FI_WAIT_SET : Indicates that the event queue should use a wait set  object  to  wait  for  events.   If
         specified, the wait_set field must reference an existing wait set object.

       • FI_WAIT_FD  :  Indicates  that  the  EQ  should  use  a  file descriptor as its wait mechanism.  A file
         descriptor wait object must be usable in select, poll, and epoll routines.   However,  a  provider  may
         signal an FD wait object by marking it as readable or with an error.

       • FI_WAIT_MUTEX_COND  :  Specifies  that  the  EQ  should use a pthread mutex and cond variable as a wait
         object.

       • FI_WAIT_CRITSEC_COND : Windows specific.  Specifies that the EQ  should  use  a  critical  section  and
         condition variable as a wait object.

       signaling_vector : If the FI_AFFINITY flag is set, this indicates the logical cpu number (0..max cpu - 1)
       that  interrupts  associated  with  the  EQ should target.  This field should be treated as a hint to the
       provider and may be ignored if the provider does not support interrupt affinity.

       wait_set : If wait_obj is FI_WAIT_SET, this field references a wait  object  to  which  the  event  queue
       should  attach.   When  an  event  is  inserted  into the event queue, the corresponding wait set will be
       signaled if all necessary conditions are met.  The use of a  wait_set  enables  an  optimized  method  of
       waiting for events across multiple event queues.  This field is ignored if wait_obj is not FI_WAIT_SET.
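
       As a rough illustration of the attribute rules above, the sketch below fills in a struct fi_eq_attr
       and checks it against constraints stated on this page (including the fi_eq_sread restriction described
       under fi_eq_sread).  The declarations are stand-ins copied from this page so that the fragment compiles
       without libfabric; the enum ordering and the value given to FI_WRITE are assumptions, and real code
       would include <rdma/fi_eq.h> instead.  All helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring this page; real code includes <rdma/fi_eq.h>. */
struct fid_wait;                       /* opaque wait set */
enum fi_wait_obj {                     /* ordering is an assumption */
    FI_WAIT_NONE, FI_WAIT_UNSPEC, FI_WAIT_SET,
    FI_WAIT_FD, FI_WAIT_MUTEX_COND, FI_WAIT_CRITSEC_COND
};
#define FI_WRITE (1ULL << 9)           /* placeholder value, assumed */

struct fi_eq_attr {
    size_t            size;
    uint64_t          flags;
    enum fi_wait_obj  wait_obj;
    int               signaling_vector;
    struct fid_wait  *wait_set;
};

/* FI_WAIT_SET requires wait_set to reference an existing wait set. */
static bool eq_attr_consistent(const struct fi_eq_attr *attr)
{
    return attr->wait_obj != FI_WAIT_SET || attr->wait_set != NULL;
}

/* fi_eq_sread is invalid with FI_WAIT_NONE or FI_WAIT_SET. */
static bool eq_allows_sread(const struct fi_eq_attr *attr)
{
    return attr->wait_obj != FI_WAIT_NONE && attr->wait_obj != FI_WAIT_SET;
}

/* fi_eq_write may only be invoked if FI_WRITE was requested. */
static bool eq_allows_write(const struct fi_eq_attr *attr)
{
    return (attr->flags & FI_WRITE) != 0;
}

/* Example: a 64-entry EQ waited on through a file descriptor. */
static struct fi_eq_attr example_fd_attr(void)
{
    struct fi_eq_attr attr = {0};      /* zero the fields we do not use */
    attr.size = 64;
    attr.wait_obj = FI_WAIT_FD;
    return attr;
}
```

       An attribute structure prepared this way would then be passed to fi_eq_open.  Zero-initializing it
       first leaves flags clear, so signaling_vector is not treated as valid unless FI_AFFINITY is set
       explicitly.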

   fi_close
       The  fi_close call releases all resources associated with an event queue.  Any events which remain on the
       EQ when it is closed are lost.

       The EQ must not be bound to any other objects prior to being  closed,  otherwise  the  call  will  return
       -FI_EBUSY.

   fi_control
       The  fi_control  call  is  used to access provider or implementation specific details of the event queue.
       Access to the EQ should be serialized across all calls when fi_control is invoked, as it may redirect the
       implementation of EQ operations.  The following control commands are usable with an EQ.

       FI_GETWAIT (void **) : This command allows the user to retrieve the low-level wait object associated with
       the EQ.  The format of the wait-object is specified during EQ creation, through the EQ  attributes.   The
       fi_control  arg  parameter  should  be  an  address  where  a pointer to the returned wait object will be
       written.  This should be an 'int *' for FI_WAIT_FD, or 'struct fi_mutex_cond' for FI_WAIT_MUTEX_COND.

              struct fi_mutex_cond {
                  pthread_mutex_t     *mutex;
                  pthread_cond_t      *cond;
              };
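
       Once a file descriptor has been retrieved with FI_GETWAIT, it can be handed to poll.  The sketch below
       covers only that second step: wait_fd_ready is a hypothetical helper, and the descriptor is assumed to
       come from a call such as fi_control(&eq->fid, FI_GETWAIT, &fd) on an EQ opened with FI_WAIT_FD.  Because
       a provider may signal the descriptor either as readable or with an error, both are treated as "ready".

```c
#include <errno.h>
#include <poll.h>

/*
 * Block until fd is signaled or timeout_ms elapses (negative = forever).
 * Returns 1 if the descriptor was signaled, 0 on timeout, -1 on failure.
 */
static int wait_fd_ready(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ret;

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);   /* retry if interrupted */

    if (ret <= 0)
        return ret;                        /* 0: timeout, -1: error */

    /* Readable or error state both mean "go read the EQ". */
    return (pfd.revents & (POLLIN | POLLERR | POLLHUP)) ? 1 : -1;
}
```

       After wait_fd_ready returns 1, the application would call fi_eq_read (or fi_eq_readerr) to retrieve the
       event.  Note that the retry after EINTR restarts the full timeout, which is a simplification.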

   fi_eq_read
       The fi_eq_read operation performs a non-blocking read of event data from the  EQ.   The  format  of  the
       event  data  is  based on the type of event retrieved from the EQ, with all events starting with a struct
       fi_eq_entry header.  At most one event will be returned per EQ  read  operation.   The  number  of  bytes
       successfully  read  from the EQ is returned from the read.  The FI_PEEK flag may be used to indicate that
       event data should be read from the EQ without being consumed.  A subsequent read without the FI_PEEK flag
       would then remove the event from the EQ.

       The following types of events may be reported to an EQ,  along  with  information  regarding  the  format
       associated with each event.

       Asynchronous  Control Operations : Asynchronous control operations are basic requests that simply need to
       generate an event to indicate that they have completed.  These include the  following  types  of  events:
       memory registration, address vector resolution, and multicast joins.

       Control requests report their completion by inserting a struct fi_eq_entry into the EQ.  The format of
       this structure is:

              struct fi_eq_entry {
                  fid_t            fid;        /* fid associated with request */
                  void            *context;    /* operation context */
                  uint64_t         data;       /* completion-specific data */
              };

       For the completion of basic asynchronous  control  operations,  the  returned  event  will  indicate  the
       operation that has completed, and the fid will reference the fabric descriptor associated with the event.
       For  memory  registration,  this will be an FI_MR_COMPLETE event and the fid_mr.  Address resolution will
       reference an FI_AV_COMPLETE event and fid_av.   Multicast  joins  will  report  an  FI_JOIN_COMPLETE  and
       fid_mc.   The  context field will be set to the context specified as part of the operation, if available,
       otherwise the context will be associated with the fabric descriptor.  The  data  field  will  be  set  as
       described  in the man page for the corresponding object type (e.g., see fi_av(3) for a description of how
       asynchronous address vector insertions are completed).

       Connection Notification : Connection notifications are connection management notifications used to set up
       or tear down connections between endpoints.  There are three connection notification events: FI_CONNREQ,
       FI_CONNECTED, and FI_SHUTDOWN.  Connection notifications are reported using struct fi_eq_cm_entry:

              struct fi_eq_cm_entry {
                  fid_t            fid;        /* fid associated with request */
                  struct fi_info  *info;       /* endpoint information */
                  uint8_t          data[];     /* app connection data */
              };

       A connection request (FI_CONNREQ) event indicates that a  remote  endpoint  wishes  to  establish  a  new
       connection to a listening, or passive, endpoint.  The fid is the passive endpoint.  Information regarding
       the requested, active endpoint's capabilities and attributes is available from the info field.  The
       application is responsible for freeing this structure by calling fi_freeinfo when it is no longer needed.
       The fi_info connreq field will reference the connection request associated with this event.  To accept  a
       connection,  an  endpoint  must first be created by passing an fi_info structure referencing this connreq
       field to fi_endpoint().  This endpoint is then passed to fi_accept() to complete the  acceptance  of  the
       connection attempt.  Creating the endpoint is most easily accomplished by passing the fi_info returned as
       part  of  the CM event into fi_endpoint().  If the connection is to be rejected, the connreq is passed to
       fi_reject().

       Any application data exchanged as part of the connection request  is  placed  beyond  the  fi_eq_cm_entry
       structure.   The  amount  of  data  available  is  application  dependent and limited to the buffer space
       provided by the application when fi_eq_read is called.  The amount of returned  data  may  be  calculated
       using the return value to fi_eq_read.  Note that the amount of returned data is limited by the underlying
       connection  protocol, and the length of any data returned may include protocol padding.  As a result, the
       returned length may be larger than that specified by the connecting peer.
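
       The length calculation described above can be sketched as follows.  The structure is a stand-in copied
       from this page so that the fragment compiles without libfabric (real code would include <rdma/fi_eq.h>),
       and cm_data_len is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>   /* ssize_t */

/* Stand-ins mirroring this page; real code includes <rdma/fi_eq.h>. */
struct fid;
struct fi_info;
typedef struct fid *fid_t;

struct fi_eq_cm_entry {
    fid_t            fid;
    struct fi_info  *info;
    uint8_t          data[];
};

/*
 * Given the return value of fi_eq_read or fi_eq_sread for a CM event,
 * return how many bytes of connection data follow the fixed header.
 * Returns 0 if the read failed or returned no more than the header.
 */
static size_t cm_data_len(ssize_t nread)
{
    if (nread < 0 || (size_t)nread <= sizeof(struct fi_eq_cm_entry))
        return 0;
    return (size_t)nread - sizeof(struct fi_eq_cm_entry);
}
```

       Since the result may include protocol padding, an application that needs the exact size of the peer's
       data would have to carry that size inside the data itself.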

       If a connection request has been accepted, an FI_CONNECTED event will be generated on both sides  of  the
       connection.   The  active  side  --  one that called fi_connect() -- may receive user data as part of the
       FI_CONNECTED event.  The user data is passed to the connection manager on the passive  side  through  the
       fi_accept  call.   User  data  is  not  provided  with an FI_CONNECTED event on the listening side of the
       connection.

       Notification that a remote peer has disconnected from an active endpoint is done through the  FI_SHUTDOWN
       event.  Shutdown notification uses struct fi_eq_cm_entry as declared above.  The fid field for a shutdown
       notification refers to the active endpoint's fid_ep.
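
       Because the entry format depends on the event, a reader typically switches on the reported event before
       interpreting the buffer.  The sketch below maps each event named on this page to the fixed size of its
       entry.  The declarations are stand-ins so that the fragment compiles without libfabric; the numeric
       event values are placeholders, and eq_entry_header_size is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring this page; real code includes <rdma/fi_eq.h>. */
struct fid;
struct fi_info;
typedef struct fid *fid_t;

enum {                                 /* numeric values are assumptions */
    FI_CONNREQ = 1, FI_CONNECTED, FI_SHUTDOWN,
    FI_MR_COMPLETE, FI_AV_COMPLETE, FI_JOIN_COMPLETE
};

struct fi_eq_entry    { fid_t fid; void *context; uint64_t data; };
struct fi_eq_cm_entry { fid_t fid; struct fi_info *info; uint8_t data[]; };

/*
 * Size of the fixed part of the entry reported for an event, i.e. the
 * smallest buffer that can hold it.  Returns 0 for events this sketch
 * does not know about.
 */
static size_t eq_entry_header_size(uint32_t event)
{
    switch (event) {
    case FI_CONNREQ:
    case FI_CONNECTED:
    case FI_SHUTDOWN:
        return sizeof(struct fi_eq_cm_entry);   /* plus optional data */
    case FI_MR_COMPLETE:
    case FI_AV_COMPLETE:
    case FI_JOIN_COMPLETE:
        return sizeof(struct fi_eq_entry);
    default:
        return 0;
    }
}
```

       A buffer passed to fi_eq_read should be at least as large as the largest entry the application expects,
       plus room for any connection data.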

       Asynchronous  Error Notification : Asynchronous errors are used to report problems with fabric resources.
       Reported errors may be fatal or transient, based on the  error,  and  result  in  the  resource  becoming
       disabled.   Disabled  resources  will  fail  operations  submitted against them until they are explicitly
       re-enabled by the application.

       Asynchronous errors may be reported for completion queues and endpoints of  all  types.   CQ  errors  can
       result  when  resource  management  has  been  disabled,  and  the provider has detected a queue overrun.
       Endpoint errors may be the result of numerous actions, but are often associated with a failed operation.
       Operations may fail because of buffer overruns, invalid permissions, incorrect memory access keys,
       network routing failures, network reachability issues, etc.

       Asynchronous errors are reported using struct fi_eq_err_entry, as defined below.  The  fabric  descriptor
       (fid)  associated  with the error is provided as part of the error data.  An error code is also available
       to determine the cause of the error.

   fi_eq_sread
       The fi_eq_sread call is the blocking (or synchronous) equivalent to fi_eq_read.  It behaves similarly to
       the non-blocking call, except that the call will not return until either an event has been read from the
       EQ or an error or timeout occurs.  Specifying a negative timeout means an infinite timeout.

       It  is invalid for applications to call this function if the EQ has been configured with a wait object of
       FI_WAIT_NONE or FI_WAIT_SET.

   fi_eq_readerr
       The read error function, fi_eq_readerr, retrieves information regarding any asynchronous operation  which
       has  completed  with  an  unexpected  error.  fi_eq_readerr is a non-blocking call, returning immediately
       whether an error completion was found or not.

       EQs are optimized to report operations which have completed  successfully.   Operations  which  fail  are
       reported  'out  of  band'.   Such  operations  are  retrieved  using the fi_eq_readerr function.  When an
       operation that completes with an unexpected error is inserted into an EQ, it is placed into  a  temporary
       error  queue.   Attempting to read from an EQ while an item is in the error queue results in an FI_EAVAIL
       failure.  Applications may use this return code to determine when to call fi_eq_readerr.
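
       One way to act on these return codes is to classify the value returned by fi_eq_read or fi_eq_sread
       before touching the buffer.  The sketch below is illustrative only: the enum and eq_classify_read are
       hypothetical names, and the fallback definitions of FI_EAGAIN and FI_EAVAIL are placeholders for the
       values that real code would take from <rdma/fi_errno.h>.

```c
#include <errno.h>
#include <sys/types.h>   /* ssize_t */

/* Stand-ins; real code takes these from <rdma/fi_errno.h>. */
#ifndef FI_EAGAIN
#define FI_EAGAIN EAGAIN
#endif
#ifndef FI_EAVAIL
#define FI_EAVAIL 259    /* placeholder value, assumed */
#endif

enum eq_read_action {
    EQ_HAVE_EVENT,   /* ret > 0: an event of ret bytes was read    */
    EQ_EMPTY,        /* -FI_EAGAIN: nothing to read right now      */
    EQ_READ_ERROR,   /* -FI_EAVAIL: call fi_eq_readerr for details */
    EQ_FAILED        /* anything else: the read itself failed      */
};

static enum eq_read_action eq_classify_read(ssize_t ret)
{
    if (ret > 0)
        return EQ_HAVE_EVENT;
    if (ret == -FI_EAGAIN)
        return EQ_EMPTY;
    if (ret == -FI_EAVAIL)
        return EQ_READ_ERROR;
    return EQ_FAILED;
}
```

       In a polling loop, EQ_EMPTY would normally lead to another read attempt later, while EQ_READ_ERROR
       would be followed by a call to fi_eq_readerr to retrieve the error entry.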

       Error information is reported to the user through struct fi_eq_err_entry.  The format of  this  structure
       is defined below.

              struct fi_eq_err_entry {
                  fid_t            fid;        /* fid associated with error */
                  void            *context;    /* operation context */
                  uint64_t         data;       /* completion-specific data */
                  int              err;        /* positive error code */
                  int              prov_errno; /* provider error code */
                  void            *err_data;   /* additional error data */
                  size_t           err_data_size; /* size of err_data */
              };

       The  fid  will  reference the fabric descriptor associated with the event.  For memory registration, this
       will be the fid_mr, address resolution will reference a fid_av, and CM events will  refer  to  a  fid_ep.
       The context field will be set to the context specified as part of the operation.

       The  data  field  will  be  set as described in the man page for the corresponding object type (e.g., see
       fi_av(3) for a description of how asynchronous address vector insertions are completed).

       The general reason for the error is provided through the err field.   Provider  or  operational  specific
       error  information  may  also  be  available  through the prov_errno and err_data fields.  Users may call
       fi_eq_strerror to convert provider specific error information  into  a  printable  string  for  debugging
       purposes.

       On  input,  err_data_size  indicates  the size of the err_data buffer in bytes.  On output, err_data_size
       will be set to the number of bytes copied to the err_data buffer.  The err_data information is  typically
       used with fi_eq_strerror to provide details about the type of error that occurred.

       For  compatibility purposes, if err_data_size is 0 on input, or the fabric was opened with release < 1.5,
       err_data will be set to a data buffer owned by the provider.  The contents  of  the  buffer  will  remain
       valid  until  a  subsequent  read call against the EQ.  Applications must serialize access to the EQ when
       processing errors to ensure that the buffer referenced by err_data does not change.
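
       The input side of this contract can be sketched as follows: the application points err_data at its own
       buffer and sets err_data_size before calling fi_eq_readerr.  The structure is a stand-in copied from
       this page so that the fragment compiles without libfabric, and eq_err_entry_init is a hypothetical
       helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins mirroring this page; real code includes <rdma/fi_eq.h>. */
struct fid;
typedef struct fid *fid_t;

struct fi_eq_err_entry {
    fid_t     fid;
    void     *context;
    uint64_t  data;
    int       err;
    int       prov_errno;
    void     *err_data;
    size_t    err_data_size;
};

/*
 * Prepare an entry for fi_eq_readerr so that provider error data is
 * copied into a buffer owned by the application.  Passing len == 0
 * selects the compatibility behavior (provider-owned buffer).
 */
static void eq_err_entry_init(struct fi_eq_err_entry *entry,
                              void *buf, size_t len)
{
    memset(entry, 0, sizeof(*entry));
    entry->err_data = len ? buf : NULL;
    entry->err_data_size = len;
}
```

       After fi_eq_readerr returns, err_data_size holds the number of bytes copied, and prov_errno and err_data
       can be passed to fi_eq_strerror.  With an application-owned buffer, the error data stays valid across
       later reads of the EQ.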

EVENT FIELDS

       The EQ entry data structures share many of the same fields.  The meanings are the same or similar for all
       EQ structure formats, with specific details described below.

       fid : This corresponds to the fabric descriptor associated with the event.  The type of  fid  depends  on
       the event being reported.  For FI_CONNREQ this will be the fid of the passive endpoint.  FI_CONNECTED and
       FI_SHUTDOWN  will  reference the active endpoint.  FI_MR_COMPLETE and FI_AV_COMPLETE will refer to the MR
       or AV fabric descriptor, respectively.  FI_JOIN_COMPLETE will point to the multicast descriptor  returned
       as  part  of  the  join  operation.   Applications  can  use  fid->context  value to retrieve the context
       associated with the fabric descriptor.

       context : The context value is set to the context parameter specified with the operation  that  generated
       the event.  If no context parameter is associated with the operation, this field will be NULL.

       data  :  Data is an operation specific value or set of bytes.  For connection events, data is application
       data exchanged as part of the connection protocol.

       err : This err code is a positive fabric errno associated with an event.  The  err  value  indicates  the
       general reason for an error, if one occurred.  See fi_errno.3 for a list of possible error codes.

       prov_errno  :  On an error, prov_errno may contain a provider specific error code.  The use of this field
       and its meaning is provider specific.  It is intended to be used as a debugging aid.  See  fi_eq_strerror
       for additional details on converting this error value into a human readable string.

       err_data  :  On  an  error,  err_data may reference a provider specific amount of data associated with an
       error.  The use of this field and its meaning is provider specific.  It is  intended  to  be  used  as  a
       debugging  aid.   See  fi_eq_strerror  for  additional details on converting this error data into a human
       readable string.

       err_data_size : On input, err_data_size indicates the size of the err_data buffer in bytes.   On  output,
       err_data_size will be set to the number of bytes copied to the err_data buffer.  The err_data information
       is typically used with fi_eq_strerror to provide details about the type of error that occurred.

       For  compatibility purposes, if err_data_size is 0 on input, or the fabric was opened with release < 1.5,
       err_data will be set to a data buffer owned by the provider.  The contents  of  the  buffer  will  remain
       valid  until  a  subsequent  read call against the EQ.  Applications must serialize access to the EQ when
       processing errors to ensure that the buffer referenced by err_data does not change.

NOTES

       If an event queue has been overrun, it will be placed into an 'overrun' state.  Write operations  against
       an  overrun  EQ  will  fail  with  -FI_EOVERRUN.   Read  operations  will  continue  to return any valid,
       non-corrupted events, if available.  After all valid events have been retrieved, any attempt to read  the
       EQ will result in it returning an FI_EOVERRUN error event.  Overrun event queues are considered fatal and
       may not be used to report additional events once the overrun occurs.

RETURN VALUES

       fi_eq_open : Returns 0 on success.  On error, a negative value corresponding to fabric errno is returned.

       fi_eq_read  /  fi_eq_readerr  / fi_eq_sread : On success, returns the number of bytes read from the event
       queue.  On error, a negative value corresponding to fabric errno is returned.  If no data is available to
       be read from the event queue, -FI_EAGAIN is returned.

       fi_eq_write : On success, returns the number of bytes written to the event queue.  On error,  a  negative
       value corresponding to fabric errno is returned.

       fi_eq_strerror : Returns a character string interpretation of the provider specific error returned with a
       completion.

       Fabric errno values are defined in rdma/fi_errno.h.

SEE ALSO

       fi_getinfo(3), fi_endpoint(3), fi_domain(3), fi_cntr(3), fi_poll(3)

AUTHORS

       OpenFabrics.

Libfabric Programmer's Manual                      2017-12-01                                           fi_eq(3)