plucky (3) fi_wait.3.gz

Provided by: libfabric-dev_1.17.0-3.1_amd64 bug

NAME

       fi_poll - Polling and wait set operations

       fi_poll_open / fi_close
              Open/close a polling set

       fi_poll_add / fi_poll_del
              Add/remove a completion queue or counter to/from a poll set.

       fi_poll
              Poll for progress and events across multiple completion queues and counters.

       fi_wait_open / fi_close
              Open/close a wait set

       fi_wait
              Waits for one or more wait objects in a set to be signaled.

       fi_trywait
              Indicate when it is safe to block on wait objects using native OS calls.

       fi_control
              Control wait set operation or attributes.

SYNOPSIS

              #include <rdma/fi_domain.h>

              int fi_poll_open(struct fid_domain *domain, struct fi_poll_attr *attr,
                  struct fid_poll **pollset);

              int fi_close(struct fid *pollset);

              int fi_poll_add(struct fid_poll *pollset, struct fid *event_fid,
                  uint64_t flags);

              int fi_poll_del(struct fid_poll *pollset, struct fid *event_fid,
                  uint64_t flags);

              int fi_poll(struct fid_poll *pollset, void **context, int count);

              int fi_wait_open(struct fid_fabric *fabric, struct fi_wait_attr *attr,
                  struct fid_wait **waitset);

              int fi_close(struct fid *waitset);

              int fi_wait(struct fid_wait *waitset, int timeout);

              int fi_trywait(struct fid_fabric *fabric, struct fid **fids, size_t count);

              int fi_control(struct fid *waitset, int command, void *arg);

ARGUMENTS

       fabric Fabric provider

       domain Resource domain

       pollset
              Event poll set

       waitset
              Wait object set

       attr   Poll or wait set attributes

       context
              On success, an array of user context values associated with completion queues or counters.

       fids   An array of fabric descriptors, each one associated with a native wait object.

       count  Number of entries in context or fids array.

       timeout
              Time to wait for a signal, in milliseconds.

       command
              Command of control operation to perform on the wait set.

       arg    Optional control argument.

DESCRIPTION

   fi_poll_open
       fi_poll_open  creates  a  new  polling  set.   A  poll  set  enables  an optimized method for progressing
       asynchronous  operations  across  multiple  completion  queues  and  counters  and  checking  for   their
       completions.

       A poll set is defined with the following attributes.

              struct fi_poll_attr {
                  uint64_t             flags;     /* operation flags */
              };

       flags  Flags  that set the default operation of the poll set.  The use of this field is reserved and must
              be set to 0 by the caller.

   fi_close
       The fi_close call releases all resources associated with a poll set.  The poll set must not be associated
       with any other resources prior to being closed, otherwise the call will return -FI_EBUSY.

   fi_poll_add
       Associates a completion queue or counter with a poll set.

   fi_poll_del
       Removes a completion queue or counter from a poll set.

   fi_poll
       Progresses  all  completion  queues  and  counters  associated with a poll set and checks for events.  If
       events might have occurred, contexts associated with the completion queues and/or counters are  returned.
       Completion queues will return their context if they are not empty.  The context associated with a counter
       will be returned if the counter’s success value or error value have changed since the last time  fi_poll,
       fi_cntr_set,  or  fi_cntr_add  were called.  The number of contexts is limited to the size of the context
       array, indicated by the count parameter.

       Note that fi_poll only indicates that events might be available.  In some cases,  providers  may  consume
       such  events  internally,  to  drive  progress,  for example.  This can result in fi_poll returning false
       positives.  Applications should drive their progress based on  the  results  of  reading  events  from  a
       completion  queue  or  reading  counter  values.   The fi_poll function will always return all completion
       queues and counters that do have new events.

   fi_wait_open
       fi_wait_open allocates a new wait set.  A wait set enables an optimized  method  of  waiting  for  events
       across multiple completion queues and counters.  Where possible, a wait set uses a single underlying wait
       object that is signaled when a specified condition occurs on an associated completion queue or counter.

       The properties and behavior of a wait set are defined by struct fi_wait_attr.

              struct fi_wait_attr {
                  enum fi_wait_obj     wait_obj;  /* requested wait object */
                  uint64_t             flags;     /* operation flags */
              };

       wait_obj
              Wait sets are associated with specific wait object(s).  Wait objects allow applications  to  block
              until  the  wait  object  is  signaled,  indicating  that  an  event is available to be read.  The
              following values may be used to specify the type of  wait  object  associated  with  a  wait  set:
              FI_WAIT_UNSPEC, FI_WAIT_FD, FI_WAIT_MUTEX_COND, and FI_WAIT_YIELD.

       - FI_WAIT_UNSPEC
              Specifies  that  the  user  will  only  wait on the wait set using fabric interface calls, such as
              fi_wait.  In this case, the underlying  provider  may  select  the  most  appropriate  or  highest
              performing  wait  object  available,  including  custom wait mechanisms.  Applications that select
              FI_WAIT_UNSPEC are not guaranteed to retrieve the underlying wait object.

       - FI_WAIT_FD
              Indicates that the wait set should use a single file descriptor as its wait mechanism, as  exposed
              to  the application.  Internally, this may require the use of epoll in order to support waiting on
              a single file descriptor.  File descriptor wait objects must be usable in the POSIX select(2)  and
              poll(2), and Linux epoll(7) routines (if available).  Provider signal an FD wait object by marking
              it as readable or with an error.

       - FI_WAIT_MUTEX_COND
              Specifies that the wait set should use a pthread mutex and cond variable as a wait object.

       - FI_WAIT_POLLFD
              This option is similar to  FI_WAIT_FD,  but  allows  the  wait  mechanism  to  use  multiple  file
              descriptors  as  its  wait mechanism, as viewed by the application.  The use of FI_WAIT_POLLFD can
              eliminate the need to use epoll to abstract away needing to check multiple file  descriptors  when
              waiting  for  events.   The  file  descriptors  must  be usable in the POSIX select(2) and poll(2)
              routines, and match directly to being used with poll.  See the NOTES section below for details  on
              using pollfd.

       - FI_WAIT_YIELD
              Indicates that the wait set will wait without a wait object but instead yield on every wait.

       flags  Flags  that set the default operation of the wait set.  The use of this field is reserved and must
              be set to 0 by the caller.

   fi_close
       The fi_close call releases all resources associated with a wait set.  The wait set must not be  bound  to
       any other opened resources prior to being closed, otherwise the call will return -FI_EBUSY.

   fi_wait
       Waits on a wait set until one or more of its underlying wait objects is signaled.

   fi_trywait
       The  fi_trywait  call was introduced in libfabric version 1.3.  The behavior of using native wait objects
       without the use of fi_trywait is provider specific and should be considered non-deterministic.

       The fi_trywait() call is used in conjunction with native operating system calls to block on wait objects,
       such  as  file descriptors.  The application must call fi_trywait and obtain a return value of FI_SUCCESS
       prior to blocking on a native wait object.  Failure to do so may result in  the  wait  object  not  being
       signaled,  and  the application not observing the desired events.  The following pseudo-code demonstrates
       the use of fi_trywait in conjunction with the OS select(2) call.

              fi_control(&cq->fid, FI_GETWAIT, (void *) &fd);
              FD_ZERO(&fds);
              FD_SET(fd, &fds);

              while (1) {
                  if (fi_trywait(&cq, 1) == FI_SUCCESS)
                      select(fd + 1, &fds, NULL, &fds, &timeout);

                  do {
                      ret = fi_cq_read(cq, &comp, 1);
                  } while (ret > 0);
              }

       fi_trywait() will return FI_SUCCESS if it is safe to block on the wait  object(s)  corresponding  to  the
       fabric  descriptor(s),  or  -FI_EAGAIN if there are events queued on the fabric descriptor or if blocking
       could hang the application.

       The call takes an array of fabric descriptors.  For each wait object that will be passed  to  the  native
       wait  routine,  the  corresponding  fabric  descriptor  should first be passed to fi_trywait.  All fabric
       descriptors passed into a single fi_trywait call must make use of the same underlying wait object type.

       The following types of fabric descriptors may be passed into fi_trywait: event queues, completion queues,
       counters,  and  wait  sets.   Applications that wish to use native wait calls should select specific wait
       objects when allocating such resources.  For example, by setting the item’s creation  attribute  wait_obj
       value to FI_WAIT_FD.

       In  the  case the wait object to check belongs to a wait set, only the wait set itself needs to be passed
       into fi_trywait.  The fabric resources associated with the wait set do not.

       On receiving a return value of  -FI_EAGAIN  from  fi_trywait,  an  application  should  read  all  queued
       completions  and events, and call fi_trywait again before attempting to block.  Applications can make use
       of a fabric poll set to identify completion queues and counters that may require processing.

   fi_control
       The fi_control call is used to access provider or implementation specific details of a fids that  support
       blocking calls, such as wait sets, completion queues, counters, and event queues.  Access to the wait set
       or fid should be serialized across all  calls  when  fi_control  is  invoked,  as  it  may  redirect  the
       implementation of wait set operations.  The following control commands are usable with a wait set or fid.

       FI_GETWAIT (void **)
              This  command  allows the user to retrieve the low-level wait object associated with a wait set or
              fid.  The format of the wait set is specified during wait  set  creation,  through  the  wait  set
              attributes.   The  fi_control  arg  parameter should be an address where a pointer to the returned
              wait object will be written.  This should be an ’int *’ for FI_WAIT_FD, `struct fi_mutex_cond' for
              FI_WAIT_MUTEX_COND,  or  `struct  fi_wait_pollfd'  for  FI_WAIT_POLLFD.  Support for FI_GETWAIT is
              provider specific.

       FI_GETWAITOBJ (enum fi_wait_obj *)
              This command returns the type of wait object associated with a wait set or fid.

RETURN VALUES

       Returns FI_SUCCESS on success.  On error, a negative value corresponding to fabric errno is returned.

       Fabric errno values are defined in rdma/fi_errno.h.

       fi_poll
              On success, if events are available, returns the number of entries written to the context array.

NOTES

       In many situations, blocking calls may need to wait on signals sent to a number of file descriptors.  For
       example,  this  is the case for socket based providers, such as tcp and udp, as well as utility providers
       such as multi-rail.  For simplicity, when epoll is available, it can be used to limit the number of  file
       descriptors  that an application must monitor.  The use of epoll may also be required in order to support
       FI_WAIT_FD.

       However, in order to support waiting on multiple file descriptors on systems where epoll support  is  not
       available,  or  where  epoll  performance may negatively impact performance, FI_WAIT_POLLFD provides this
       mechanism.   A  significant  different  between  using  POLLFD  versus  FD  wait  objects  is  that  with
       FI_WAIT_POLLFD,  the  file  descriptors  may  change  dynamically.   As  an example, the file descriptors
       associated with a completion queues’ wait set may change as endpoint associations with the CQ  are  added
       and removed.

       Struct  fi_wait_pollfd  is used to retrieve all file descriptors for fids using FI_WAIT_POLLFD to support
       blocking calls.

              struct fi_wait_pollfd {
                  uint64_t      change_index;
                  size_t        nfds;
                  struct pollfd *fd;
              };

       change_index
              The change_index may be used to determine if there have been any changes to  the  file  descriptor
              list.   Anytime  a  file  descriptor  is  added, removed, or its events are updated, this field is
              incremented by the provider.  Applications wishing to wait on  file  descriptors  directly  should
              cache  the  change_index  value.   Before  blocking  on file descriptor events, the app should use
              fi_control() to retrieve the current change_index and compare that against its cached  value.   If
              the values differ, then the app should update its file descriptor list prior to blocking.

       nfds   On  input  to fi_control(), this indicates the number of entries in the struct pollfd * array.  On
              output, this will be set to the number of entries needed to  store  the  current  number  of  file
              descriptors.   If  the  input value is smaller than the output value, fi_control() will return the
              error -FI_ETOOSMALL.  Note that setting  nfds  =  0  allows  an  efficient  way  of  checking  the
              change_index.

       fd     This  points to an array of struct pollfd entries.  The number of entries is specified through the
              nfds field.  If the number of needed entries is less than  or  equal  to  the  number  of  entries
              available,  the  struct  pollfd  array  will  be  filled  out  with a list of file descriptors and
              corresponding events that can be used in the select(2) and poll(2) calls.

       The change_index is updated only when the file descriptors  associated  with  the  pollfd  file  set  has
       changed.   Checking  the  change_index is an additional step needed when working with FI_WAIT_POLLFD wait
       objects directly.  The use of the fi_trywait() function is  still  required  if  accessing  wait  objects
       directly.

SEE ALSO

       fi_getinfo(3), fi_domain(3), fi_cntr(3), fi_eq(3)

AUTHORS

       OpenFabrics.