Provided by: liburing-dev_2.6-1_amd64

NAME

       io_uring_register - register files or user buffers for asynchronous I/O

SYNOPSIS

       #include <liburing.h>

       int io_uring_register(unsigned int fd, unsigned int opcode,
                             void *arg, unsigned int nr_args);

DESCRIPTION

       The io_uring_register(2) system call registers resources (e.g. user buffers, files, eventfd, personality,
       restrictions) for use in an io_uring(7) instance referenced by fd.  Registering  files  or  user  buffers
       allows  the  kernel to take long term references to internal data structures or create long term mappings
       of application memory, greatly reducing per-I/O overhead.

       fd  is  the  file  descriptor  returned  by  a  call  to  io_uring_setup(2).   If  opcode  has  the  flag
       IORING_REGISTER_USE_REGISTERED_RING ORed into it, fd is instead the index of a registered ring fd.

       opcode can be one of:

       IORING_REGISTER_BUFFERS
              arg  points  to  a  struct iovec array of nr_args entries.  The buffers associated with the iovecs
              will be locked in memory and charged  against  the  user's  RLIMIT_MEMLOCK  resource  limit.   See
              getrlimit(2)  for  more  information.   Additionally,  there  is  a size limit of 1GiB per buffer.
              Currently, the buffers must be  anonymous,  non-file-backed  memory,  such  as  that  returned  by
              malloc(3) or mmap(2) with the MAP_ANONYMOUS flag set.  It is expected that this limitation will be
              lifted in the future. Huge pages are supported as well. Note that the entire  huge  page  will  be
              pinned in the kernel, even if only a portion of it is used.

              After a successful call, the supplied buffers are mapped into the kernel and eligible for I/O.  To
              make use of them, the application must specify the IORING_OP_READ_FIXED  or  IORING_OP_WRITE_FIXED
              opcodes   in   the   submission   queue   entry   (see   the  struct  io_uring_sqe  definition  in
              io_uring_enter(2)), and set the buf_index field to the desired buffer  index.   The  memory  range
              described by the submission queue entry's addr and len fields must fall within the indexed buffer.

              It is perfectly valid to set up a large buffer and then only use part of it for an I/O, as long as
              the range is within the originally mapped region.

              An application can increase or decrease  the  size  or  number  of  registered  buffers  by  first
              unregistering  the  existing buffers, and then issuing a new call to io_uring_register(2) with the
              new buffers.

              Note that before 5.13, registering buffers would wait for the ring to idle: if the application had
              requests in flight, the registration waited for those to finish before proceeding.

              An application need not unregister buffers explicitly before shutting down the io_uring  instance.
              Note,  however, that shutdown processing may run asynchronously within the kernel. As a result, it
              is not guaranteed that pages are immediately unpinned in this case. Available since 5.1.
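
              As a rough illustration of the limits above, the sketch below checks an iovec array before it is
              passed as arg: it rejects an empty array and any entry larger than 1GiB, sums the bytes that
              would be pinned, and compares that sum with the soft RLIMIT_MEMLOCK limit. The helper names
              fixed_buf_total() and fits_memlock() are hypothetical, and the check ignores memory that is
              already locked, so it is only a hint and not a guarantee that registration will succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>
#include <sys/uio.h>

/* Per-buffer cap stated in the text above: 1 GiB. */
#define FIXED_BUF_MAX_BYTES ((uint64_t)1 << 30)

/*
 * Hypothetical helper: sums the lengths of nr iovec entries into *total.
 * Returns 0 on success, -1 if the array is empty or any entry exceeds 1 GiB.
 */
static int fixed_buf_total(const struct iovec *iov, unsigned int nr,
                           uint64_t *total)
{
    uint64_t sum = 0;

    assert(total != NULL);
    if (iov == NULL || nr == 0)
        return -1;
    for (unsigned int i = 0; i < nr; i++) {
        if ((uint64_t)iov[i].iov_len > FIXED_BUF_MAX_BYTES)
            return -1;
        sum += iov[i].iov_len;
    }
    *total = sum;
    return 0;
}

/*
 * Hypothetical helper: returns 1 if total bytes fit under the soft
 * RLIMIT_MEMLOCK limit, 0 if they do not, -1 if the limit cannot be read.
 */
static int fits_memlock(uint64_t total)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return total <= rl.rlim_cur ? 1 : 0;
}
```

              If both checks pass, the same array would be handed to IORING_REGISTER_BUFFERS with nr_args set
              to the number of entries.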

       IORING_REGISTER_BUFFERS2
              Register buffers for I/O. Similar to IORING_REGISTER_BUFFERS but aims to have  a  more  extensible
              ABI.

              arg points to a struct io_uring_rsrc_register, and nr_args should be set to the number of bytes in
              the structure.

               struct io_uring_rsrc_register {
                   __u32 nr;
                   __u32 resv;
                   __u64 resv2;
                   __aligned_u64 data;
                   __aligned_u64 tags;
               };

               The data field contains a pointer to a struct iovec array of nr entries. The tags field should
               either be 0, in which case tagging is disabled, or point to an array of nr "tags" (unsigned 64 bit
               integers). If a tag is zero, then tagging for this particular resource (a buffer in this case) is
               disabled. Otherwise, after the resource has been unregistered and is no longer in use, a CQE
               will be posted with user_data set to the specified tag and all other fields zeroed.

               Note that resource updates, e.g. IORING_REGISTER_BUFFERS_UPDATE, don't necessarily deallocate
               resources by the time they return; the resources might be kept alive until all requests using
               them complete.

               Available since 5.13.
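
               A minimal sketch of filling in the registration structure follows. To stay self-contained it
               uses a local mirror of the layout listed above (struct rsrc_register_sketch) and a hypothetical
               helper, fill_rsrc_register(); real code would take struct io_uring_rsrc_register from the
               kernel or liburing headers instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Local mirror of struct io_uring_rsrc_register as listed above. */
struct rsrc_register_sketch {
    uint32_t nr;
    uint32_t resv;
    uint64_t resv2;
    uint64_t data;   /* user pointer to the struct iovec array */
    uint64_t tags;   /* user pointer to nr tags, or 0 to disable tagging */
};

static_assert(sizeof(struct rsrc_register_sketch) == 32,
              "mirror should have the same size as the five listed fields");

/*
 * Hypothetical helper: zeroes the whole structure (including the reserved
 * fields) and stores both arrays as 64-bit integers.
 */
static void fill_rsrc_register(struct rsrc_register_sketch *reg,
                               const struct iovec *iov,
                               const uint64_t *tags, uint32_t nr)
{
    memset(reg, 0, sizeof(*reg));
    reg->nr = nr;
    reg->data = (uint64_t)(uintptr_t)iov;
    reg->tags = (uint64_t)(uintptr_t)tags;  /* NULL becomes 0: no tagging */
}
```

               The filled structure would then be passed as arg, with nr_args set to its size in bytes as
               described above.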

       IORING_REGISTER_BUFFERS_UPDATE
              Updates registered buffers with new ones, either turning a  sparse  entry  into  a  real  one,  or
              replacing an existing entry.

              arg must contain a pointer to a struct io_uring_rsrc_update2, which contains an offset on which to
              start the update, and an array of struct iovec.  tags points to an array of tags.  nr must contain
              the  number of descriptors in the passed in arrays.  See IORING_REGISTER_BUFFERS2 for the resource
              tagging description.

               struct io_uring_rsrc_update2 {
                   __u32 offset;
                   __u32 resv;
                   __aligned_u64 data;
                   __aligned_u64 tags;
                   __u32 nr;
                   __u32 resv2;
               };

               Available since 5.13.

       IORING_UNREGISTER_BUFFERS
              This operation takes no argument, and arg must be  passed  as  NULL.   All  previously  registered
              buffers associated with the io_uring instance will be released synchronously. Available since 5.1.

       IORING_REGISTER_FILES
              Register files for I/O.  arg contains a pointer to an array of nr_args file descriptors (signed 32
              bit integers).

              To make use of the registered files, the IOSQE_FIXED_FILE flag must be set in the flags member  of
              the  struct io_uring_sqe, and the fd member is set to the index of the file in the file descriptor
              array.

              The file set may be sparse, meaning that the fd field  in  the  array  may  be  set  to  -1.   See
              IORING_REGISTER_FILES_UPDATE for how to update files in place.

              Note that before 5.13, registering files would wait for the ring to idle: if the application had
              requests in flight, the registration waited for those to finish before proceeding. See
              IORING_REGISTER_FILES_UPDATE for how to update an existing set without that limitation.

              Files are automatically unregistered when the io_uring instance is torn down. An application needs
              only unregister if it wishes to register a new set of fds. Available since 5.1.

       IORING_REGISTER_FILES2
              Register files for I/O. Similar to IORING_REGISTER_FILES.

              arg points to a struct io_uring_rsrc_register, and nr_args should be set to the number of bytes in
              the structure.

              The data field contains a pointer to an array of nr file descriptors (signed 32 bit integers).
              The tags field should either be 0 or point to an array of nr "tags" (unsigned 64 bit integers). See
              IORING_REGISTER_BUFFERS2 for more info on resource tagging.

              Note that resource updates, e.g. IORING_REGISTER_FILES_UPDATE, don't necessarily deallocate
              resources; they might be held until all requests using that resource complete.

              Available since 5.13.

       IORING_REGISTER_FILES_UPDATE
              This operation replaces existing files in the registered file set with new ones, either turning a
              sparse entry (one where fd is equal to -1) into a real one, removing an existing entry (new one
              is set to -1), or replacing an existing entry with a new one.

              arg must contain a pointer to a struct io_uring_files_update, which contains an offset on which to
              start the update, and an array of file descriptors to use for the update.   nr_args  must  contain
              the number of descriptors in the passed in array. Available since 5.5.

              File  descriptors  can  be  skipped if they are set to IORING_REGISTER_FILES_SKIP.  Skipping an fd
              will not touch the file associated with the previous fd at that index. Available since 5.12.

       IORING_REGISTER_FILES_UPDATE2
              Similar to IORING_REGISTER_FILES_UPDATE, replaces existing files in the registered file set with
              new ones, either turning a sparse entry (one where fd is equal to -1) into a real one, removing
              an existing entry (new one is set to -1), or replacing an existing entry with a new one.

              arg must contain a pointer to a struct io_uring_rsrc_update2, which contains an offset on which to
              start the update, and an array of file descriptors to use for the update  stored  in  data.   tags
              points  to  an  array of tags.  nr must contain the number of descriptors in the passed in arrays.
              See IORING_REGISTER_BUFFERS2 for the resource tagging description.

              Available since 5.13.

       IORING_UNREGISTER_FILES
              This operation requires no argument, and arg must be passed as NULL.   All  previously  registered
              files associated with the io_uring instance will be unregistered. Available since 5.1.

       IORING_REGISTER_EVENTFD
              It's  possible  to use eventfd(2) to get notified of completion events on an io_uring instance. If
              this is desired, an eventfd file descriptor can be registered through this  operation.   arg  must
              contain  a pointer to the eventfd file descriptor, and nr_args must be 1. Note that while io_uring
              generally takes care to avoid spurious events, they can occur. Similarly, batched  completions  of
              CQEs  may  only  trigger  a  single  eventfd  notification  even  if multiple CQEs are posted. The
              application should make no assumptions on  number  of  events  being  available  having  a  direct
              correlation  to eventfd notifications posted. An eventfd notification must thus only be treated as
              a hint to check the CQ ring for completions. Available since 5.2.

              An application can temporarily disable notifications, coming through the  registered  eventfd,  by
              setting  the  IORING_CQ_EVENTFD_DISABLED  bit  in the flags field of the CQ ring.  Available since
              5.8.
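
              The advice to treat a notification only as a hint can be sketched without a ring at all. The
              hypothetical helper eventfd_take_hint() below drains a non-blocking eventfd and reports only
              whether something was pending; it deliberately discards the counter value, since that value
              need not match the number of CQEs. It assumes the eventfd was created with EFD_NONBLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: drains the eventfd counter without blocking.
 * Returns 1 if at least one notification was pending, 0 if none was,
 * and -1 on any other error. The caller should then check the CQ ring
 * itself rather than trust the counter.
 */
static int eventfd_take_hint(int efd)
{
    uint64_t count;
    ssize_t n = read(efd, &count, sizeof(count));

    if (n == (ssize_t)sizeof(count))
        return 1;
    if (n < 0 && errno == EAGAIN)
        return 0;
    return -1;
}
```

              A return value of 1 would be followed by reaping every available CQE, however many
              notifications were folded into that single read.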

       IORING_REGISTER_EVENTFD_ASYNC
              This works just like IORING_REGISTER_EVENTFD, except notifications are only posted for events
              that  complete  in  an  async  manner.  This  means  that  events that complete inline while being
              submitted do not trigger a notification  event.  The  arguments  supplied  are  the  same  as  for
              IORING_REGISTER_EVENTFD.  Available since 5.6.

       IORING_UNREGISTER_EVENTFD
              Unregister  an eventfd file descriptor to stop notifications. Since only one eventfd descriptor is
              currently supported, this operation takes no argument, and arg must be passed as NULL and  nr_args
              must be zero. Available since 5.2.

       IORING_REGISTER_PROBE
              This  operation  returns a structure, io_uring_probe, which contains information about the opcodes
              supported  by  io_uring  on  the  running  kernel.   arg  must  contain  a  pointer  to  a  struct
              io_uring_probe,  and  nr_args must contain the size of the ops array in that probe struct. The ops
              array is of the type io_uring_probe_op, which holds the value of the opcode and a flags field.  If
              the  flags  field  has  IO_URING_OP_SUPPORTED  set,  then  this opcode is supported on the running
              kernel. Available since 5.6.
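
              A sketch of reading the result follows. The two structures are local mirrors whose layout is an
              assumption about the kernel header, the ops array is given a fixed 256 entries here so it can
              live on the stack, and opcode_supported() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct io_uring_probe_op (layout assumed). */
struct probe_op_sketch {
    uint8_t  op;      /* opcode value */
    uint8_t  resv;
    uint16_t flags;   /* supported bit is set when the opcode is available */
    uint32_t resv2;
};

/* Local mirror of struct io_uring_probe with a fixed-size ops array. */
struct probe_sketch {
    uint8_t  last_op;   /* highest opcode the kernel reports */
    uint8_t  ops_len;   /* number of entries filled in ops[] */
    uint16_t resv;
    uint32_t resv2[3];
    struct probe_op_sketch ops[256];
};

/* Assumed value of IO_URING_OP_SUPPORTED. */
#define OP_SUPPORTED_SKETCH (1U << 0)

static_assert(sizeof(struct probe_op_sketch) == 8,
              "each ops entry is expected to be 8 bytes");

/* Returns 1 if op is listed with the supported flag set, 0 otherwise. */
static int opcode_supported(const struct probe_sketch *p, uint8_t op)
{
    for (unsigned int i = 0; i < p->ops_len; i++) {
        if (p->ops[i].op == op)
            return (p->ops[i].flags & OP_SUPPORTED_SKETCH) != 0;
    }
    return 0;  /* opcode not reported at all */
}
```

              With this mirror, the zeroed structure would be passed as arg and nr_args would be 256, the
              size of the ops array.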

       IORING_REGISTER_PERSONALITY
              This operation registers credentials of the running application with io_uring, and returns  an  id
              associated  with  these  credentials.  Applications  wishing  to  share  a  ring  between separate
              users/processes can pass in this credential  id  in  the  sqe  personality  field.  If  set,  that
              particular  sqe  will  be  issued with these credentials. Must be invoked with arg set to NULL and
              nr_args set to zero. Available since 5.6.

       IORING_UNREGISTER_PERSONALITY
              This operation unregisters a previously registered personality with io_uring.  nr_args must be set
              to the id in question, and arg must be set to NULL. Available since 5.6.

       IORING_REGISTER_ENABLE_RINGS
              This  operation  enables an io_uring ring started in a disabled state (IORING_SETUP_R_DISABLED was
              specified in the call to io_uring_setup(2)).  While the io_uring ring is disabled, submissions are
              not allowed and registrations are not restricted.

              After the execution of this operation, the io_uring ring is enabled: submissions and registrations
              are allowed, but they will be validated following the  registered  restrictions  (if  any).   This
              operation  takes  no  argument,  must  be  invoked  with  arg set to NULL and nr_args set to zero.
              Available since 5.10.

       IORING_REGISTER_RESTRICTIONS
              arg points to a struct io_uring_restriction array of nr_args entries.

              With an entry it is possible to allow an io_uring_register(2) opcode, or specify which opcode  and
              flags  of  the submission queue entry are allowed, or require certain flags to be specified (these
              flags must be set on each submission queue entry).

              All the restrictions must be submitted with  a  single  io_uring_register(2)  call  and  they  are
              handled as an allowlist (opcodes and flags that are not registered are not allowed).

              Restrictions   can  be  registered  only  if  the  io_uring  ring  started  in  a  disabled  state
              (IORING_SETUP_R_DISABLED must be specified in the call to io_uring_setup(2)).

              Available since 5.10.

       IORING_REGISTER_IOWQ_AFF
              By default, async workers created by io_uring will inherit the CPU mask of their parent. This is
              usually  all  the  CPUs  in the system, unless the parent is being run with a limited set. If this
              isn't the desired outcome, the application may  explicitly  tell  io_uring  what  CPUs  the  async
              workers may run on.  arg must point to a cpu_set_t mask, and nr_args the byte size of that mask.

              Available since 5.14.

       IORING_UNREGISTER_IOWQ_AFF
              Undoes a CPU mask previously set with IORING_REGISTER_IOWQ_AFF.  Must not have arg or nr_args set.

              Available since 5.14.

       IORING_REGISTER_IOWQ_MAX_WORKERS
              By default, io_uring limits the number of unbounded workers to the maximum process count set by
              RLIMIT_NPROC, while the number of bounded workers is a function of the SQ ring size and the number
              of CPUs in the system. Sometimes this can be excessive (or too little, for bounded), and this
              command provides a way to change the count per ring (per NUMA node) instead.

              arg must be set to an unsigned int pointer to an array of two values, with the values in the array
              being  set  to the maximum count of workers per NUMA node. Index 0 holds the bounded worker count,
              and index 1 holds the unbounded worker count. On successful  return,  the  passed  in  array  will
              contain  the  previous  maximum values for each type. If the count being passed in is 0, then this
              command returns the current maximum values and doesn't modify the current setting.   nr_args  must
              be set to 2, as the command takes two values.

              Available since 5.15.

       IORING_REGISTER_RING_FDS
              Whenever io_uring_enter(2) is called to submit requests or wait for completions, the kernel must
              grab a reference to the file descriptor. If the application using io_uring is threaded,  the  file
              table  is  marked  as  shared, and the reference grab and put of the file descriptor count is more
              expensive than it is for a non-threaded application.

              Similarly to how io_uring allows registration of files, this allows registration of the ring file
              descriptor itself. This reduces the overhead of the io_uring_enter(2) system call.

              arg  must be set to a pointer to an array of type struct io_uring_rsrc_update of nr_args number of
              entries. The data field of this struct must contain an io_uring file descriptor, and the offset
              field  can be either -1 or an explicit offset desired for the registered file descriptor value. If
              -1 is used, then upon successful return of this system call, the field will contain the  value  of
              the registered file descriptor to be used for future io_uring_enter(2) system calls.

              On successful completion of this request, the returned descriptors may be used instead of the real
              file descriptor for io_uring_enter(2), provided that IORING_ENTER_REGISTERED_RING is  set  in  the
              flags  for the system call. This flag tells the kernel that a registered descriptor is used rather
              than a real file descriptor.

              Each thread or process using a ring must register the file descriptor  directly  by  issuing  this
              request.

              The maximum number of supported registered ring descriptors is currently limited to 16.

              Available since 5.18.
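
              The entry layout can be sketched as follows. struct rsrc_update_sketch is a local mirror whose
              layout is an assumption about struct io_uring_rsrc_update, and fill_ring_fd_update() is a
              hypothetical helper. Note that data carries the descriptor value itself, not a pointer to it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct io_uring_rsrc_update (layout assumed). */
struct rsrc_update_sketch {
    uint32_t offset;  /* requested slot, or all ones (-1) to let the kernel pick */
    uint32_t resv;
    uint64_t data;    /* the ring file descriptor value */
};

static_assert(sizeof(struct rsrc_update_sketch) == 16,
              "mirror is expected to be 16 bytes");

/*
 * Hypothetical helper: prepares one entry for IORING_REGISTER_RING_FDS.
 * A negative want_index asks the kernel to choose the slot.
 */
static void fill_ring_fd_update(struct rsrc_update_sketch *up, int ring_fd,
                                int32_t want_index)
{
    assert(ring_fd >= 0);
    memset(up, 0, sizeof(*up));
    up->offset = want_index < 0 ? UINT32_MAX : (uint32_t)want_index;
    up->data = (uint64_t)(uint32_t)ring_fd;
}
```

              After a successful call with offset set to -1, the offset field would be read back to learn
              which registered index to use with IORING_ENTER_REGISTERED_RING.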

       IORING_UNREGISTER_RING_FDS
              Unregister descriptors previously registered with IORING_REGISTER_RING_FDS.

              arg  must be set to a pointer to an array of type struct io_uring_rsrc_update of nr_args number of
              entries. Only the offset field should be set in the  structure,  containing  the  registered  file
              descriptor offset previously returned from IORING_REGISTER_RING_FDS that the application wishes to
              unregister.

              Note that this isn't done automatically on ring exit if the thread or task that previously
              registered  a  ring  file  descriptor  isn't exiting. It is recommended to manually unregister any
              previously registered ring descriptors if the ring is closed and the task persists. This will free
              up a registration slot, making it available for future use.

              Available since 5.18.

       IORING_REGISTER_PBUF_RING
              Registers  a  shared  buffer ring to be used with provided buffers. This is a newer alternative to
              using IORING_OP_PROVIDE_BUFFERS which is more efficient,  to  be  used  with  request  types  that
              support the IOSQE_BUFFER_SELECT flag.

              The arg argument must be filled in with the appropriate information. It looks as follows:

                   struct io_uring_buf_reg {
                       __u64 ring_addr;
                       __u32 ring_entries;
                       __u16 bgid;
                       __u16 pad;
                       __u64 resv[3];
                   };

               The  ring_addr  field  must  contain  the  address to the memory allocated to fit this ring.  The
               memory must be page aligned and hence allocated appropriately using e.g. posix_memalign(3) or
               similar. The size of the ring is the product of ring_entries and the size of struct io_uring_buf.
               ring_entries is the desired size of the ring, and must be a power-of-2 in size. The maximum  size
               allowed is 2^15 (32768).  bgid is the buffer group ID associated with this ring. SQEs that select
               a buffer have a buffer group associated with them in their buf_group field,  and  the  associated
               CQEs  will  have  IORING_CQE_F_BUFFER  set  in  their  flags  member, which will also contain the
               specific ID of the buffer selected. The rest of the fields are reserved and must  be  cleared  to
               zero.

               nr_args must be set to 1.

               Also see io_uring_register_buf_ring(3) for more details. Available since 5.19.
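
               The sizing rules above can be sketched as follows. struct buf_sketch is a local stand-in for
               struct io_uring_buf whose 16-byte layout is an assumption, and pbuf_ring_bytes() and
               pbuf_ring_alloc() are hypothetical helper names. The sketch only validates and allocates; it
               does not perform the registration itself.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Local stand-in for struct io_uring_buf (layout assumed). */
struct buf_sketch {
    uint64_t addr;
    uint32_t len;
    uint16_t bid;
    uint16_t resv;
};

static_assert(sizeof(struct buf_sketch) == 16,
              "each ring entry is expected to be 16 bytes");

/* Maximum ring size stated in the text above: 2^15 entries. */
#define PBUF_RING_MAX_ENTRIES 32768u

/*
 * Returns the ring size in bytes, or 0 if entries is not a power of two
 * in the range 1..32768.
 */
static size_t pbuf_ring_bytes(uint32_t entries)
{
    if (entries == 0 || entries > PBUF_RING_MAX_ENTRIES ||
        (entries & (entries - 1)) != 0)
        return 0;
    return (size_t)entries * sizeof(struct buf_sketch);
}

/*
 * Allocates zeroed, page-aligned memory for the ring, or returns NULL.
 * The caller releases it with free().
 */
static void *pbuf_ring_alloc(uint32_t entries)
{
    size_t bytes = pbuf_ring_bytes(entries);
    long page = sysconf(_SC_PAGESIZE);
    void *mem = NULL;

    if (bytes == 0 || page <= 0)
        return NULL;
    if (posix_memalign(&mem, (size_t)page, bytes) != 0)
        return NULL;
    memset(mem, 0, bytes);
    return mem;
}
```

               The returned address would go into ring_addr and the same entry count into ring_entries, with
               the reserved fields left at zero.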

       IORING_UNREGISTER_PBUF_RING
              Unregister  a  previously  registered  provided  buffer ring.  arg must be set to the address of a
              struct io_uring_buf_reg, with just the bgid field set to the buffer group  ID  of  the  previously
              registered provided buffer group.  nr_args must be set to 1. Also see IORING_REGISTER_PBUF_RING.

              Available since 5.19.

       IORING_REGISTER_SYNC_CANCEL
              Performs   a   synchronous   cancelation   request,   which   works   in   a  similar  fashion  to
              IORING_OP_ASYNC_CANCEL except it  completes  inline.  This  can  be  useful  for  scenarios  where
              cancelations  should  happen  synchronously,  rather  than  needing  to  issue an SQE and wait for
              completion of that specific CQE.

              arg must be set to a pointer to a struct  io_uring_sync_cancel_reg  structure,  with  the  details
              filled  in for what request(s) to target for cancelation. See io_uring_register_sync_cancel(3) for
              details on that. The return values are the same, except they are passed back synchronously  rather
              than through the CQE res field.  nr_args must be set to 1.

              Available since 6.0.

       IORING_REGISTER_FILE_ALLOC_RANGE
              Sets the allowable range for fixed file index allocations within the kernel. When requests that
              can instantiate a new fixed file are used with IORING_FILE_INDEX_ALLOC, the application is asking
              the  kernel  to allocate a new fixed file descriptor rather than pass in a specific value for one.
              By default, the kernel will pick any available fixed file descriptor within the  range  available.
              This  effectively  allows  the application to set aside a range just for dynamic allocations, with
              the remainder being used for specific values.

              nr_args must be set to 1 and arg must be set to a pointer to a struct io_uring_file_index_range:

                   struct io_uring_file_index_range {
                       __u32 off;
                       __u32 len;
                       __u64 resv;
                   };

               with off being set to the starting value for the range, and  len  being  set  to  the  number  of
               descriptors. The reserved resv field must be cleared to zero.

               The application must have registered a file table first.

               Available since 6.0.

RETURN VALUE

       On  success, io_uring_register(2) returns either 0 or a positive value, depending on the opcode used.  On
       error, a negative error value is returned. The caller should not rely on the errno variable.
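
       A minimal sketch of that convention, assuming the return value is handled directly and not through
       errno. The helper name register_result_str() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: describes a return value that follows the
 * convention above. Zero and positive values are success; negative
 * values are negated errno codes.
 */
static const char *register_result_str(int ret)
{
    if (ret >= 0)
        return "success";
    return strerror(-ret);
}
```

       For example, a return value of -EBUSY would be reported with the same text that strerror(EBUSY)
       produces.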

ERRORS

       EACCES The opcode field is not allowed due to registered restrictions.

       EBADF  One or more fds in the fd array are invalid.

       EBADFD IORING_REGISTER_ENABLE_RINGS or IORING_REGISTER_RESTRICTIONS was specified, but the io_uring  ring
              is not disabled.

       EBUSY  IORING_REGISTER_BUFFERS  or  IORING_REGISTER_FILES  or IORING_REGISTER_RESTRICTIONS was specified,
              but there were already buffers, files, or restrictions registered.

       EEXIST The thread performing the registration is invalid.

       EFAULT The buffer is outside of the process' accessible address space, or iov_len is greater than 1GiB.

       EINVAL IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES was specified, but nr_args is 0.

       EINVAL IORING_REGISTER_BUFFERS was specified, but nr_args exceeds UIO_MAXIOV.

       EINVAL IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was specified, and nr_args is non-zero or arg
              is non-NULL.

       EINVAL IORING_REGISTER_RESTRICTIONS  was  specified,  but  nr_args  exceeds the maximum allowed number of
              restrictions or restriction opcode is invalid.

       EMFILE IORING_REGISTER_FILES was specified and nr_args exceeds the maximum allowed number of files  in  a
              fixed file set.

       EMFILE IORING_REGISTER_FILES  was  specified  and adding nr_args file references would exceed the maximum
              allowed number of files the user is allowed to have according to the RLIMIT_NOFILE resource  limit
              and  the caller does not have CAP_SYS_RESOURCE capability. Note that this is a per user limit, not
              per process.

       ENOMEM Insufficient kernel resources are available, or the caller  had  a  non-zero  RLIMIT_MEMLOCK  soft
              resource  limit,  but  tried  to  lock  more  memory  than the limit permitted.  This limit is not
              enforced if the process is privileged (CAP_IPC_LOCK).

       ENXIO  IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was specified, but there were no  buffers  or
              files registered.

       ENXIO  Attempt  to  register  files or buffers on an io_uring instance that is already undergoing file or
              buffer registration, or is being torn down.

       EOPNOTSUPP
              User buffers point to file-backed memory.

       EFAULT User buffers point to file-backed memory (newer kernels).