Provided by: libfabric-dev_1.5.3-1_amd64 bug

NAME

       fi_cq - Completion queue operations

       fi_cq_open / fi_close : Open/close a completion queue

       fi_control : Control CQ operation or attributes.

       fi_cq_read / fi_cq_readfrom / fi_cq_readerr : Read a completion from a completion queue

       fi_cq_sread  / fi_cq_sreadfrom : A synchronous (blocking) read that waits until a specified condition has
       been met before reading a completion from a completion queue.

       fi_cq_signal : Unblock any thread waiting in fi_cq_sread or fi_cq_sreadfrom.

       fi_cq_strerror : Converts provider specific error information into a printable string

SYNOPSIS

              #include <rdma/fi_domain.h>

              int fi_cq_open(struct fid_domain *domain, struct fi_cq_attr *attr,
                  struct fid_cq **cq, void *context);

              int fi_close(struct fid *cq);

              int fi_control(struct fid *cq, int command, void *arg);

              ssize_t fi_cq_read(struct fid_cq *cq, void *buf, size_t count);

              ssize_t fi_cq_readfrom(struct fid_cq *cq, void *buf, size_t count,
                  fi_addr_t *src_addr);

              ssize_t fi_cq_readerr(struct fid_cq *cq, struct fi_cq_err_entry *buf,
                  uint64_t flags);

              ssize_t fi_cq_sread(struct fid_cq *cq, void *buf, size_t count,
                  const void *cond, int timeout);

              ssize_t fi_cq_sreadfrom(struct fid_cq *cq, void *buf, size_t count,
                  fi_addr_t *src_addr, const void *cond, int timeout);

              int fi_cq_signal(struct fid_cq *cq);

              const char * fi_cq_strerror(struct fid_cq *cq, int prov_errno,
                    const void *err_data, char *buf, size_t len);

ARGUMENTS

       domain : Open resource domain

       cq : Completion queue

       attr : Completion queue attributes

       context : User specified context associated with the completion queue.

       buf : For read calls, the data buffer to write completions into.  For write calls, a completion to insert
       into  the  completion  queue.   For  fi_cq_strerror,  an  optional  buffer  that receives printable error
       information.

       count : Number of CQ entries.

       len : Length of data buffer

       src_addr : Source address of a completed receive operation

       flags : Additional flags to apply to the operation

       command : Command of control operation to perform on CQ.

       arg : Optional control argument

       cond : Condition that must be met before a completion is generated

       timeout : Time in milliseconds to wait.  A negative value indicates infinite timeout.

       prov_errno : Provider specific error value

       err_data : Provider specific error data related to a completion

DESCRIPTION

       Completion queues are used to report events associated with data transfers.   They  are  associated  with
       message  sends  and  receives,  RMA,  atomic, tagged messages, and triggered events.  Reported events are
       usually associated with a fabric endpoint, but may also refer to memory regions used as the target of  an
       RMA or atomic operation.

   fi_cq_open
       fi_cq_open  allocates a new completion queue.  Unlike event queues, completion queues are associated with
       a resource domain and may be offloaded entirely in provider hardware.

       The properties and behavior of a completion queue are defined by struct fi_cq_attr.

              struct fi_cq_attr {
                  size_t               size;      /* # entries for CQ */
                  uint64_t             flags;     /* operation flags */
                  enum fi_cq_format    format;    /* completion format */
                  enum fi_wait_obj     wait_obj;  /* requested wait object */
                  int                  signaling_vector; /* interrupt affinity */
                  enum fi_cq_wait_cond wait_cond; /* wait condition format */
                  struct fid_wait     *wait_set;  /* optional wait set */
              };

       size : Specifies the minimum size of a completion queue.  A value of 0 indicates that  the  provider  may
       choose a default value.

       flags : Flags that control the configuration of the CQ.

       • FI_AFFINITY : Indicates that the signaling_vector field (see below) is valid.

       format  :  Completion  queues allow the application to select the amount of detail that it must store and
       report.  The format attribute allows the  application  to  select  one  of  several  completion  formats,
       indicating  the  structure  of  the  data  that  the completion queue should return when read.  Supported
       formats and the structures that correspond to each are listed below.  The meaning of the CQ entry  fields
       are defined in the Completion Fields section.

       • FI_CQ_FORMAT_UNSPEC  :  If an unspecified format is requested, then the CQ will use a provider selected
         default format.

       • FI_CQ_FORMAT_CONTEXT : Provides only user specified context that was associated with the completion.

         struct fi_cq_entry {
             void     *op_context; /* operation context */
         };

       • FI_CQ_FORMAT_MSG : Provides  minimal  data  for  processing  completions,  with  expanded  support  for
         reporting information about received messages.

         struct fi_cq_msg_entry {
             void     *op_context; /* operation context */
             uint64_t flags;       /* completion flags */
             size_t   len;         /* size of received data */
         };

       • FI_CQ_FORMAT_DATA  : Provides data associated with a completion.  Includes support for received message
         length, remote CQ data, and multi-receive buffers.

         struct fi_cq_data_entry {
             void     *op_context; /* operation context */
             uint64_t flags;       /* completion flags */
             size_t   len;         /* size of received data */
             void     *buf;        /* receive data buffer */
             uint64_t data;        /* completion data */
         };

       • FI_CQ_FORMAT_TAGGED : Expands completion data to include support for the tagged message interfaces.

         struct fi_cq_tagged_entry {
             void     *op_context; /* operation context */
             uint64_t flags;       /* completion flags */
             size_t   len;         /* size of received data */
             void     *buf;        /* receive data buffer */
             uint64_t data;        /* completion data */
             uint64_t tag;         /* received tag */
         };

       wait_obj : CQ's may be associated with a specific wait object.  Wait objects allow applications to  block
       until  the  wait object is signaled, indicating that a completion is available to be read.  Users may use
       fi_control to retrieve the underlying wait object associated with a CQ, in  order  to  use  it  in  other
       system  calls.  The following values may be used to specify the type of wait object associated with a CQ:
       FI_WAIT_NONE,  FI_WAIT_UNSPEC,  FI_WAIT_SET,  FI_WAIT_FD,  and  FI_WAIT_MUTEX_COND.    The   default   is
       FI_WAIT_NONE.

       • FI_WAIT_NONE  :  Used  to indicate that the user will not block (wait) for completions on the CQ.  When
         FI_WAIT_NONE is specified, the application may not call fi_cq_sread or fi_cq_sreadfrom.

       • FI_WAIT_UNSPEC : Specifies that the user will only wait on the CQ using fabric interface calls, such as
         fi_cq_sread  or fi_cq_sreadfrom.  In this case, the underlying provider may select the most appropriate
         or highest performing wait object available,  including  custom  wait  mechanisms.   Applications  that
         select FI_WAIT_UNSPEC are not guaranteed to retrieve the underlying wait object.

       • FI_WAIT_SET : Indicates that the completion queue should use a wait set object to wait for completions.
         If specified, the wait_set field must reference an existing wait set object.

       • FI_WAIT_FD : Indicates that the CQ should use  a  file  descriptor  as  its  wait  mechanism.   A  file
         descriptor  wait  object  must  be usable in select, poll, and epoll routines.  However, a provider may
         signal an FD wait object by marking it as readable, writable, or with an error.

       • FI_WAIT_MUTEX_COND : Specifies that the CQ should use a pthread mutex  and  cond  variable  as  a  wait
         object.

       • FI_WAIT_CRITSEC_COND  :  Windows  specific.   Specifies  that  the CQ should use a critical section and
         condition variable as a wait object.

       signaling_vector : If the FI_AFFINITY flag is set, this indicates the logical cpu number (0..max cpu - 1)
       that  interrupts  associated  with  the  CQ should target.  This field should be treated as a hint to the
       provider and may be ignored if the provider does not support interrupt affinity.

       wait_cond : By  default,  when  a  completion  is  inserted  into  a  CQ  that  supports  blocking  reads
       (fi_cq_sread/fi_cq_sreadfrom),  the corresponding wait object is signaled.  Users may specify a condition
       that must first be met before the wait is satisfied.   This  field  indicates  how  the  provider  should
       interpret the cond field, which describes the condition needed to signal the wait object.

       A  wait  condition  should  be  treated  as  an  optimization.   Providers  are  not required to meet the
       requirements of the condition before signaling the wait object.  Applications  should  not  rely  on  the
       condition necessarily being true when a blocking read call returns.

       If wait_cond is set to FI_CQ_COND_NONE, then no additional conditions are applied to the signaling of the
       CQ wait object, and the insertion of any new entry will trigger the wait condition.  If wait_cond is  set
       to  FI_CQ_COND_THRESHOLD,  then the cond field is interpreted as a size_t threshold value.  The threshold
       indicates the number of entries that are to be queued before at the CQ before the wait is satisfied.

       This field is ignored if wait_obj is set to FI_WAIT_NONE.

       wait_set : If wait_obj is FI_WAIT_SET, this field references a wait object to which the completion  queue
       should  attach.   When an event is inserted into the completion queue, the corresponding wait set will be
       signaled if all necessary conditions are met.  The use of a  wait_set  enables  an  optimized  method  of
       waiting for events across multiple event and completion queues.  This field is ignored if wait_obj is not
       FI_WAIT_SET.

   fi_close
       The fi_close call releases all resources associated with  a  completion  queue.   Any  completions  which
       remain on the CQ when it is closed are lost.

       When closing the CQ, there must be no opened endpoints, transmit contexts, or receive contexts associated
       with the CQ.  If resources are still associated with the CQ when  attempting  to  close,  the  call  will
       return -FI_EBUSY.

   fi_control
       The  fi_control  call  is  used  to  access provider or implementation specific details of the completion
       queue.  Access to the CQ should be serialized across all calls when fi_control  is  invoked,  as  it  may
       redirect the implementation of CQ operations.  The following control commands are usable with a CQ.

       FI_GETWAIT (void **) : This command allows the user to retrieve the low-level wait object associated with
       the CQ.  The format of the wait-object is specified during CQ creation, through the CQ  attributes.   The
       fi_control  arg  parameter  should  be  an  address  where  a pointer to the returned wait object will be
       written.  See fi_eq.3 for addition details using fi_control with FI_GETWAIT.

   fi_cq_read
       The fi_cq_read operation performs a non-blocking read of completion data from the CQ.  The format of  the
       completion  event  is determined using the fi_cq_format option that was specified when the CQ was opened.
       Multiple completions may be retrieved from a CQ in a single call.   The  maximum  number  of  entries  to
       return is limited to the specified count parameter, with the number of entries successfully read from the
       CQ returned by the call.  (See return values section below.)

       CQs are optimized to report operations which have completed  successfully.   Operations  which  fail  are
       reported  'out  of  band'.   Such  operations  are  retrieved  using the fi_cq_readerr function.  When an
       operation that has completed with an unexpected error is encountered, it is placed into a temporary error
       queue.   Attempting  to  read from a CQ while an item is in the error queue results in fi_cq_read failing
       with a return code of -FI_EAVAIL.  Applications may use this  return  code  to  determine  when  to  call
       fi_cq_readerr.

   fi_cq_readfrom
       The  fi_cq_readfrom  call  behaves  identical  to fi_cq_read, with the exception that it allows the CQ to
       return source address information to the user for  any  received  data.   Source  address  data  is  only
       available  for  those  endpoints configured with FI_SOURCE capability.  If fi_cq_readfrom is called on an
       endpoint for which source  addressing  data  is  not  available,  the  source  address  will  be  set  to
       FI_ADDR_NOTAVAIL.  The number of input src_addr entries must the the same as the count parameter.

       Returned  source  addressing data is converted from the native address used by the underlying fabric into
       an fi_addr_t, which may be used in transmit operations.  Typically, returning fi_addr_t requires that the
       source address be inserted into the address vector associated with the receiving endpoint.  For endpoints
       allocated using the FI_SOURCE_ERR capability, if the source  address  has  not  been  inserted  into  the
       address  vector,  fi_cq_readfrom  will  return  -FI_EAVAIL.  The completion will then be reported through
       fi_cq_readerr with error code -FI_EADDRNOTAVAIL.  See fi_cq_readerr for details.

       If FI_SOURCE is specified without FI_SOURCE_ERR, source addresses which  cannot  be  mapped  to  a  local
       fi_addr_t  will be reported as FI_ADDR_NOTAVAIL.  The behavior is dependent on the type of address vector
       in use.  For AVs of type FI_AV_MAP, source addresses may be mapped directly to an fi_addr_t  value,  even
       if  the source address were not inserted into the AV.  This allows the provider to optimize the reporting
       of the source fi_addr_t without the overhead of verifying whether the address is  in  the  AV.   If  full
       address validation is necessary, FI_SOURCE_ERR must be used.

   fi_cq_sread / fi_cq_sreadfrom
       The  fi_cq_sread  and  fi_cq_sreadfrom  calls  are  the  blocking equivalent operations to fi_cq_read and
       fi_cq_readfrom.  Their behavior is similar to the non-blocking calls, with the exception that  the  calls
       will not return until either a completion has been read from the CQ or an error or timeout occurs.

       It  is  invalid for applications to call these functions if the CQ has been configured with a wait object
       of FI_WAIT_NONE or FI_WAIT_SET.

   fi_cq_readerr
       The read error function, fi_cq_readerr, retrieves information regarding any asynchronous operation  which
       has  completed  with  an  unexpected  error.  fi_cq_readerr is a non-blocking call, returning immediately
       whether an error completion was found or not.

       Error information is reported to the user through struct fi_cq_err_entry.  The format of  this  structure
       is defined below.

              struct fi_cq_err_entry {
                  void     *op_context; /* operation context */
                  uint64_t flags;       /* completion flags */
                  size_t   len;         /* size of received data */
                  void     *buf;        /* receive data buffer */
                  uint64_t data;        /* completion data */
                  uint64_t tag;         /* message tag */
                  size_t   olen;        /* overflow length */
                  int      err;         /* positive error code */
                  int      prov_errno;  /* provider error code */
                  void    *err_data;    /*  error data */
                  size_t   err_data_size; /* size of err_data */
              };

       The  general reason for the error is provided through the err field.  Provider specific error information
       may also be available through the prov_errno and err_data fields.   The  err_data  field,  if  set,  will
       reference an internal buffer owned by the provider.  The contents of the buffer will remain valid until a
       subsequent read call against the CQ.  Users may call fi_cq_strerror to convert  provider  specific  error
       information into a printable string for debugging purposes.

       Notable completion error codes are given below.

       FI_EADDRNOTAVAIL : This error code is used by CQs configured with FI_SOURCE_ERR to report completions for
       which a matching fi_addr_t source address  could  not  be  found.   An  error  code  of  FI_EADDRNOTAVAIL
       indicates that the data transfer was successfully received and processed, with the fi_cq_err_entry fields
       containing information about the completion.  The err_data field will be set to the source address  data.
       The  source address will be in the same format as specified through the fi_info addr_format field for the
       opened domain.  This may be pass directly into an fi_av_insert call to add  the  source  address  to  the
       address vector.

   fi_cq_signal
       The  fi_cq_signal  call  will  unblock any thread waiting in fi_cq_sread or fi_cq_sreadfrom.  This may be
       used to wake-up a thread that is blocked waiting  to  read  a  completion  operation.   The  fi_cq_signal
       operation is only available if the CQ was configured with a wait object.

COMPLETION FIELDS

       The  CQ  entry  data structures share many of the same fields.  The meanings of these fields are the same
       for all CQ entry structure formats.

       op_context : The operation context is the application specified context value that was provided  with  an
       asynchronous operation.  The op_context field is valid for all completions.

       flags : This specifies flags associated with the completed operation.  The Completion Flags section below
       lists valid flag values.  Flags are set for all relevant completions.

       len : This len field only applies to completed receive operations (e.g.  fi_recv,  fi_trecv,  etc.).   It
       indicates  the size of received message data -- i.e.  how many data bytes were placed into the associated
       receive buffer by a corresponding fi_send/fi_tsend/et al call.  If an endpoint has been  configured  with
       the FI_MSG_PREFIX mode, the len also reflects the size of the prefix buffer.

       buf  :  The  buf  field is only valid for completed receive operations, and only applies when the receive
       buffer was posted with the FI_MULTI_RECV flag.  In this case, buf points to the starting  location  where
       the receive data was placed.

       data  : The data field is only valid if the FI_REMOTE_CQ_DATA completion flag is set, and only applies to
       receive completions.  If FI_REMOTE_CQ_DATA is set, this field will contain the completion  data  provided
       by the peer as part of their transmit request.  The completion data will be given in host byte order.

       tag  :  A  tag  applies  only  to  received  messages that occur using the tagged interfaces.  This field
       contains the tag that was included with the received message.  The tag will be in host byte order.

       olen : The olen field applies to received messages.  It is used to indicate that a received  message  has
       overrun  the  available  buffer space and has been truncated.  The olen specifies the amount of data that
       did not fit into the available receive buffer and was discarded.

       err : This err code is a positive fabric errno associated with a completion.  The err value indicates the
       general reason for an error, if one occurred.  See fi_errno.3 for a list of possible error codes.

       prov_errno  :  On an error, prov_errno may contain a provider specific error code.  The use of this field
       and its meaning is provider specific.  It is intended to be used as a debugging aid.  See  fi_cq_strerror
       for additional details on converting this error value into a human readable string.

       err_data  :  On  an  error,  err_data may reference a provider specific amount of data associated with an
       error.  The use of this field and its meaning is provider specific.  It is  intended  to  be  used  as  a
       debugging  aid.   See  fi_cq_strerror  for  additional details on converting this error data into a human
       readable string.

       err_data_size : On input, err_data_size indicates the size of the err_data buffer in bytes.   On  output,
       err_data_size will be set to the number of bytes copied to the err_data buffer.  The err_data information
       is typically used with fi_cq_strerror to provide details about the type of error that occurred.

       For compatibility purposes, if err_data_size is 0 on input, or the fabric was opened with release <  1.5,
       err_data  will  be  set  to  a data buffer owned by the provider.  The contents of the buffer will remain
       valid until a subsequent read call against the CQ.  Applications must serialize access  to  the  CQ  when
       processing errors to ensure that the buffer referenced by err_data does no change.

COMPLETION FLAGS

       Completion  flags provide additional details regarding the completed operation.  The following completion
       flags are defined.

       FI_SEND : Indicates that the completion was for a send operation.  This flag  may  be  combined  with  an
       FI_MSG or FI_TAGGED flag.

       FI_RECV  :  Indicates that the completion was for a receive operation.  This flag may be combined with an
       FI_MSG or FI_TAGGED flag.

       FI_RMA : Indicates that an RMA operation completed.  This flag may be combined with an FI_READ, FI_WRITE,
       FI_REMOTE_READ, or FI_REMOTE_WRITE flag.

       FI_ATOMIC  :  Indicates  that  an atomic operation completed.  This flag may be combined with an FI_READ,
       FI_WRITE, FI_REMOTE_READ, or FI_REMOTE_WRITE flag.

       FI_MSG : Indicates that a message-based operation completed.  This flag may be combined with  an  FI_SEND
       or FI_RECV flag.

       FI_TAGGED  :  Indicates  that  a  tagged  message operation completed.  This flag may be combined with an
       FI_SEND or FI_RECV flag.

       FI_MULTICAST : Indicates that a multicast operation completed.  This flag may be combined with FI_MSG and
       relevant  flags.  This flag is only guaranteed to be valid for received messages if the endpoint has been
       configured with FI_SOURCE.

       FI_READ : Indicates that a locally initiated RMA or atomic read operation has completed.  This  flag  may
       be combined with an FI_RMA or FI_ATOMIC flag.

       FI_WRITE : Indicates that a locally initiated RMA or atomic write operation has completed.  This flag may
       be combined with an FI_RMA or FI_ATOMIC flag.

       FI_REMOTE_READ : Indicates that a remotely initiated RMA or atomic read operation  has  completed.   This
       flag may be combined with an FI_RMA or FI_ATOMIC flag.

       FI_REMOTE_WRITE  : Indicates that a remotely initiated RMA or atomic write operation has completed.  This
       flag may be combined with an FI_RMA or FI_ATOMIC flag.

       FI_REMOTE_CQ_DATA : This indicates that remote CQ data is available as part of the completion.

       FI_MULTI_RECV : This flag applies to receive buffers that were posted with the  FI_MULTI_RECV  flag  set.
       This  completion  flag  indicates  that the original receive buffer referenced by the completion has been
       consumed and was released by the provider.  Providers may set this flag  on  the  last  message  that  is
       received  into  the  multi-  recv  buffer,  or may generate a separate completion that indicates that the
       buffer has been released.

       Applications can distinguish between these two cases by examining the completion entry flags  field.   If
       additional  flags,  such  as  FI_RECV, are set, the completion is associated with a received message.  In
       this case, the buf field will reference the location where the  received  message  was  placed  into  the
       multi-recv  buffer.   Other  fields  in  the  completion  entry  will be determined based on the received
       message.  If other flag bits are zero, the provider is reporting that  the  multi-recv  buffer  has  been
       released, and the completion entry is not associated with a received message.

NOTES

       A  completion  queue  must  be  bound  to  at  least  one  enabled  endpoint before any operation such as
       fi_cq_read, fi_cq_readfrom, fi_cq_sread, fi_cq_sreadfrom etc.  can be called on it.

       Completion flags may be suppressed if the FI_NOTIFY_FLAGS_ONLY mode bit has been set.  When enabled, only
       the  following  flags are guaranteed to be set in completion data when they are valid: FI_REMOTE_READ and
       FI_REMOTE_WRITE (when FI_RMA_EVENT capability bit has been set), FI_REMOTE_CQ_DATA, and FI_MULTI_RECV.

       If a completion queue has been overrun, it will be placed into an 'overrun' state.  Read operations  will
       continue  to return any valid, non-corrupted completions, if available.  After all valid completions have
       been retrieved, any attempt to read the CQ will result  in  it  returning  an  FI_EOVERRUN  error  event.
       Overrun  completion queues are considered fatal and may not be used to report additional completions once
       the overrun occurs.

RETURN VALUES

       fi_cq_open / fi_cq_signal : Returns 0 on success.  On error, a negative  value  corresponding  to  fabric
       errno is returned.

       fi_cq_read  /  fi_cq_readfrom  /  fi_cq_readerr  fi_cq_sread  / fi_cq_sreadfrom : On success, returns the
       number  of  completion  events  retrieved  from  the  completion  queue.   On  error,  a  negative  value
       corresponding  to  fabric  errno  is  returned.   If  no completions are available to return from the CQ,
       -FI_EAGAIN will be returned.

       fi_cq_strerror : Returns a character string interpretation of the provider specific error returned with a
       completion.

       Fabric errno values are defined in rdma/fi_errno.h.

SEE ALSO

       fi_getinfo(3), fi_endpoint(3), fi_domain(3), fi_eq(3), fi_cntr(3), fi_poll(3)

AUTHORS

       OpenFabrics.