Provided by: manpages_2.77-1_all


NAME
       capabilities - overview of Linux capabilities


DESCRIPTION
       For  the  purpose  of  performing  permission  checks, traditional Unix
       implementations distinguish two  categories  of  processes:  privileged
       processes  (whose  effective  user ID is 0, referred to as superuser or
       root), and unprivileged processes (whose  effective  UID  is  nonzero).
       Privileged   processes  bypass  all  kernel  permission  checks,  while
       unprivileged processes are subject to full permission checking based on
       the  process’s  credentials (usually: effective UID, effective GID, and
       supplementary group list).

       Starting with kernel 2.2, Linux divides  the  privileges  traditionally
       associated  with  superuser into distinct units, known as capabilities,
       which can be independently enabled and disabled.   Capabilities  are  a
       per-thread attribute.

   Capabilities List
       As at Linux 2.6.14, the following capabilities are implemented:

       CAP_AUDIT_CONTROL (since Linux 2.6.11)
              Enable  and  disable  kernel  auditing;  change  auditing filter
              rules; retrieve auditing status and filtering rules.

       CAP_AUDIT_WRITE (since Linux 2.6.11)
              Allow records to be written to kernel auditing log.

       CAP_CHOWN
              Allow arbitrary changes to file UIDs and GIDs (see chown(2)).

       CAP_DAC_OVERRIDE
              Bypass file read, write, and execute permission checks.  (DAC  =
              "discretionary access control".)

       CAP_DAC_READ_SEARCH
              Bypass  file  read  permission  checks  and  directory  read and
              execute permission checks.

       CAP_FOWNER
              Bypass permission checks on operations that normally require the
              file system UID of the process to match the UID of the file
              (e.g., chmod(2), utime(2)), excluding those operations covered
              by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH; set extended
              file attributes (see chattr(1)) on arbitrary files; set Access
              Control Lists (ACLs) on arbitrary files; ignore directory sticky
              bit on file deletion; specify O_NOATIME for arbitrary files in
              open(2) and fcntl(2).

       CAP_FSETID
              Don’t clear set-user-ID and set-group-ID bits when a file is
              modified; permit setting of the set-group-ID bit for a file
              whose GID does not match the file system GID or any of the
              supplementary GIDs of the calling process.

       CAP_IPC_LOCK
              Permit   memory   locking   (mlock(2),   mlockall(2),   mmap(2),
              shmctl(2)).

       CAP_IPC_OWNER
              Bypass permission checks for operations on System V IPC objects.

       CAP_KILL
              Bypass permission checks  for  sending  signals  (see  kill(2)).
              This includes use of the KDSIGACCEPT ioctl.

       CAP_LEASE
              (Linux  2.4  onwards)   Allow  file  leases to be established on
              arbitrary files (see fcntl(2)).

       CAP_LINUX_IMMUTABLE
              Allow  setting  of  the  EXT2_APPEND_FL  and   EXT2_IMMUTABLE_FL
              extended file attributes (see chattr(1)).

       CAP_MKNOD
              (Linux  2.4  onwards)  Allow  creation  of  special  files using
              mknod(2).

       CAP_NET_ADMIN
              Allow  various   network-related   operations   (e.g.,   setting
              privileged  socket  options,  enabling  multicasting,  interface
              configuration, modifying routing tables).

       CAP_NET_BIND_SERVICE
              Allow binding to Internet domain  reserved  socket  ports  (port
              numbers less than 1024).

       CAP_NET_BROADCAST
              (Unused)  Allow socket broadcasting, and listening to multicasts.

       CAP_NET_RAW
              Permit use of RAW and PACKET sockets.

       CAP_SETGID
              Allow  arbitrary manipulations of process GIDs and supplementary
              GID list; allow forged GID when passing socket  credentials  via
              Unix domain sockets.

       CAP_SETPCAP
              Grant  or  remove  any  capability  in  the  caller’s  permitted
              capability set to or from any other process.

       CAP_SETUID
              Allow  arbitrary  manipulations  of  process  UIDs   (setuid(2),
              setreuid(2),  setresuid(2),  setfsuid(2)); allow forged UID when
              passing socket credentials via Unix domain sockets.

       CAP_SYS_ADMIN
              Permit a range of system administration operations including:
              quotactl(2), mount(2), umount(2), swapon(2), swapoff(2),
              sethostname(2), setdomainname(2), IPC_SET and IPC_RMID
              operations on arbitrary System V IPC objects; perform operations
              on trusted and security Extended Attributes (see attr(5)); call
              lookup_dcookie(2); use ioprio_set(2) to assign IOPRIO_CLASS_RT
              and IOPRIO_CLASS_IDLE I/O scheduling classes; perform keyctl(2)
              KEYCTL_CHOWN and KEYCTL_SETPERM operations; allow forged UID
              when passing socket credentials; exceed /proc/sys/fs/file-max,
              the system-wide limit on the number of open files, in system
              calls that open files (e.g., accept(2), execve(2), open(2),
              pipe(2); without this capability these system calls will fail
              with the error ENFILE if this limit is encountered); employ
              the CLONE_NEWNS flag with clone(2) and unshare(2).

       CAP_SYS_BOOT
              Permit calls to reboot(2) and kexec_load(2).

       CAP_SYS_CHROOT
              Permit calls to chroot(2).

       CAP_SYS_MODULE
              Allow  loading  and   unloading   of   kernel   modules;   allow
              modifications to capability bounding set (see init_module(2) and
              delete_module(2)).

       CAP_SYS_NICE
              Allow raising process nice value (nice(2),  setpriority(2))  and
              changing  of  the  nice  value  for  arbitrary  processes; allow
              setting of real-time scheduling policies  for  calling  process,
              and  setting  scheduling  policies  and priorities for arbitrary
              processes (sched_setscheduler(2),  sched_setparam(2));  set  CPU
              affinity for arbitrary processes (sched_setaffinity(2)); set I/O
              scheduling  class   and   priority   for   arbitrary   processes
              (ioprio_set(2));   allow   migrate_pages(2)  to  be  applied  to
              arbitrary processes  and  allow  processes  to  be  migrated  to
              arbitrary  nodes; allow move_pages(2) to be applied to arbitrary
              processes; use  the  MPOL_MF_MOVE_ALL  flag  with  mbind(2)  and
              move_pages(2).

       CAP_SYS_PACCT
              Permit calls to acct(2).

       CAP_SYS_PTRACE
              Allow arbitrary processes to be traced using ptrace(2).

       CAP_SYS_RAWIO
              Permit  I/O  port  operations  (iopl(2)  and  ioperm(2)); access
              /proc/kcore.

       CAP_SYS_RESOURCE
              Permit: use of reserved space on ext2 file systems; ioctl(2)
              calls controlling ext3 journaling; disk quota limits to be
              overridden; resource limits to be increased (see setrlimit(2));
              RLIMIT_NPROC resource limit to be overridden; msg_qbytes limit
              for a message queue to be raised above the limit in
              /proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2)).

       CAP_SYS_TIME
              Allow  modification  of system clock (settimeofday(2), stime(2),
              adjtimex(2)); allow modification of real-time (hardware) clock.

       CAP_SYS_TTY_CONFIG
              Permit calls to vhangup(2).

   Capability Sets
       Each thread has three capability sets containing zero or  more  of  the
       above capabilities:

       Effective:
              the capabilities used by the kernel to perform permission checks
              for the thread.

       Permitted:
              the capabilities that the thread may assume (i.e., a limiting
              superset for the effective and inheritable sets).  If a thread
              drops a capability from its permitted set, it can never
              re-acquire that capability (unless it execve(2)s a
              set-user-ID-root program).

       Inheritable:
              the capabilities preserved across an execve(2).

       A child created via fork(2) inherits copies of its parent’s  capability
       sets.   See  below  for  a  discussion of the treatment of capabilities
       during execve(2).
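
       The relationship between the three sets can be illustrated with a
       small model.  The following Python sketch is a toy simulation, not an
       interface to capset(2); the class ThreadCaps and its method names are
       hypothetical.  It encodes only what is stated above: the permitted
       set bounds the effective and inheritable sets, a capability dropped
       from the permitted set cannot be regained, and a child created via
       fork(2) receives copies of its parent's sets.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadCaps:
    """Toy model of a thread's three capability sets (hypothetical class)."""
    permitted: set = field(default_factory=set)
    effective: set = field(default_factory=set)
    inheritable: set = field(default_factory=set)

    def raise_effective(self, cap):
        # A capability can become effective only if it is permitted.
        if cap not in self.permitted:
            raise PermissionError(f"{cap} is not in the permitted set")
        self.effective.add(cap)

    def drop_permitted(self, cap):
        # Dropping from the permitted set also removes the capability from
        # the two sets it bounds; the model offers no way to add it back.
        self.permitted.discard(cap)
        self.effective.discard(cap)
        self.inheritable.discard(cap)

    def fork(self):
        # The child gets independent copies of the parent's sets.
        return ThreadCaps(set(self.permitted), set(self.effective),
                          set(self.inheritable))


parent = ThreadCaps(permitted={"CAP_CHOWN", "CAP_KILL"})
parent.raise_effective("CAP_KILL")
child = parent.fork()
child.drop_permitted("CAP_KILL")
print(sorted(parent.effective))   # the parent is unaffected by the child
print(sorted(child.permitted))
```

       In the model, calling child.raise_effective("CAP_KILL") after the
       drop raises PermissionError, which mirrors the rule that a dropped
       capability cannot be re-acquired without an execve(2).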

       Using capset(2), a thread may manipulate its own capability  sets,  or,
       if  it  has  the  CAP_SETPCAP  capability, those of a thread in another
       process.

   Capability Bounding Set
       When a program is execed, the permitted and effective capabilities  are
       ANDed  with the current value of the so-called capability bounding set,
       defined in the file /proc/sys/kernel/cap-bound.  This parameter can  be
       used  to  place  a system-wide limit on the capabilities granted to all
       subsequently executed programs.  (Confusingly, this bit mask  parameter
       is expressed as a signed decimal number in /proc/sys/kernel/cap-bound.)

       Only the init process may set bits  in  the  capability  bounding  set;
       other than that, the superuser may only clear bits in this set.

       On  a  standard system the capability bounding set always masks out the
       CAP_SETPCAP  capability.   To  remove  this  restriction  (dangerous!),
       modify the definition of CAP_INIT_EFF_SET in include/linux/capability.h
       and rebuild the kernel.

       The capability bounding set feature was added to  Linux  starting  with
       kernel version 2.2.11.
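
       Because the bounding set is exposed as a signed decimal number, it
       helps to convert the value back to a bit mask before reading it.  The
       Python sketch below does this for a sample value.  The function name
       decode_cap_bound is hypothetical, the bit numbers are assumed to
       follow include/linux/capability.h for kernels of this era (they are
       not given on this page), and the sample value -257 was chosen because
       it clears only bit 8, which matches the masking of CAP_SETPCAP
       described above.

```python
# Bit numbers are an assumption taken from include/linux/capability.h;
# index in this list == bit position in the mask.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL",
]


def decode_cap_bound(text):
    """Return (32-bit mask, names of the capabilities that are masked out)."""
    # Reinterpret the signed decimal value as an unsigned 32-bit mask.
    mask = int(text) & 0xFFFFFFFF
    cleared = [name for bit, name in enumerate(CAP_NAMES)
               if not mask & (1 << bit)]
    return mask, cleared


mask, cleared = decode_cap_bound("-257")
print(hex(mask), cleared)   # 0xfffffeff ['CAP_SETPCAP']
```

       On a live system of this vintage the argument would be the contents
       of /proc/sys/kernel/cap-bound; the sketch takes a string so that it
       can be tried without that file.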

   Current and Future Implementation
       A full implementation of capabilities requires:

       1.  that  for  all  privileged operations, the kernel check whether the
           thread has the required capability in its effective set.

       2.  that the kernel provide system calls allowing a thread’s capability
           sets to be changed and retrieved.

       3.  file  system  support  for  attaching capabilities to an executable
           file, so that a process gains those capabilities when the  file  is
           executed.

       As at Linux 2.6.14, only the first two of these requirements are met.

       Eventually,  it  should  be possible to associate three capability sets
       with an executable file, which, in conjunction with the capability sets
       of  the  thread,  will  determine the capabilities of a thread after an
       execve(2):

       Inheritable (formerly known as allowed):
              this set is ANDed with the thread’s inheritable set to determine
              which inheritable capabilities are permitted to the thread after
              the execve(2).

       Permitted (formerly known as forced):
              the  capabilities  automatically  permitted   to   the   thread,
              regardless of the thread’s inheritable capabilities.

       Effective:
              the capabilities in the thread’s new permitted set that are also
              to be set in the new effective set.  (F(effective) would
              normally be either all zeroes or all ones.)

       In the meantime, since the current implementation does not support file
       capability sets, during an execve(2):

       1.  All three file capability sets are initially assumed to be cleared.

       2.  If  a set-user-ID-root program is being execed, or the real user ID
           of the process is 0 (root) then the file inheritable and  permitted
           sets are defined to be all ones (i.e., all capabilities enabled).

       3.  If  a  set-user-ID-root  program  is  being executed, then the file
           effective set is defined to be all ones.
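
       The three steps above can be written as a small function.  This is a
       sketch that follows the steps literally, not kernel code; the
       function name, its parameters, and the use of a 32-bit mask for "all
       ones" are assumptions made for illustration.

```python
ALL_CAPS = 0xFFFFFFFF  # "all ones"; assumes a 32-bit capability mask


def file_capability_sets(set_user_id_root, real_uid):
    """Return the (inheritable, permitted, effective) sets assumed for a file."""
    # Step 1: all three file capability sets start out cleared.
    inheritable = permitted = effective = 0
    # Step 2: a set-user-ID-root program, or a real UID of 0, yields full
    # inheritable and permitted sets.
    if set_user_id_root or real_uid == 0:
        inheritable = permitted = ALL_CAPS
    # Step 3: only a set-user-ID-root program yields a full effective set.
    if set_user_id_root:
        effective = ALL_CAPS
    return inheritable, permitted, effective


for suid_root, uid in [(True, 1000), (False, 0), (False, 1000)]:
    sets = file_capability_sets(suid_root, uid)
    print(suid_root, uid, [hex(s) for s in sets])
```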

   Transformation of Capabilities During exec()
       During an execve(2), the kernel calculates the new capabilities of  the
       process using the following algorithm:

           P’(permitted) = (P(inheritable) & F(inheritable)) |
                           (F(permitted) & cap_bset)

           P’(effective) = P’(permitted) & F(effective)

           P’(inheritable) = P(inheritable)    [i.e., unchanged]


       where:

       P         denotes  the  value  of  a  thread  capability set before the
                 execve(2)

       P’        denotes the value of a capability set after the execve(2)

       F         denotes a file capability set

       cap_bset  is the value of the capability bounding set.

       In the current implementation, the upshot of  this  algorithm  is  that
       when a process execve(2)s a set-user-ID-root program, or when a process
       with an  effective  UID  of  0  execve(2)s  a  program,  it  gains  all
       capabilities  in  its  permitted  and effective capability sets, except
       those masked out by the capability bounding  set  (i.e.,  CAP_SETPCAP).
       This  provides  semantics  that  are  the  same  as  those  provided by
       traditional Unix systems.
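
       The algorithm is plain bit arithmetic and can be checked by hand.
       The Python sketch below transcribes the three formulas and evaluates
       them for a set-user-ID-root program, whose file sets are all ones.
       The function name exec_transform and the 32-bit mask width are
       assumptions, as is the bit number used for CAP_SETPCAP (bit 8, taken
       from include/linux/capability.h rather than from this page).

```python
ALL = 0xFFFFFFFF          # assumes a 32-bit capability mask
CAP_SETPCAP = 1 << 8      # bit number is an assumption, see lead-in


def exec_transform(p_inheritable, f_inheritable, f_permitted, f_effective,
                   cap_bset):
    """Return (P'(permitted), P'(effective), P'(inheritable))."""
    new_permitted = (p_inheritable & f_inheritable) | (f_permitted & cap_bset)
    new_effective = new_permitted & f_effective
    new_inheritable = p_inheritable          # unchanged across execve(2)
    return new_permitted, new_effective, new_inheritable


# Bounding set that masks out only CAP_SETPCAP.
cap_bset = ALL & ~CAP_SETPCAP

# Set-user-ID-root program run by a thread with an empty inheritable set.
perm, eff, inh = exec_transform(0, ALL, ALL, ALL, cap_bset)
print(hex(perm), hex(eff), hex(inh))   # 0xfffffeff 0xfffffeff 0x0

# Ordinary program (all file sets cleared): nothing is gained, and the
# thread's inheritable set passes through untouched.
print(exec_transform(0b101, 0, 0, 0, cap_bset))   # (0, 0, 5)
```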

   Effect of User ID Changes on Capabilities
       To preserve the traditional semantics for  transitions  between  0  and
       nonzero  user IDs, the kernel makes the following changes to a thread’s
       capability sets on changes to the thread’s real, effective, saved  set,
       and file system user IDs (using setuid(2), setresuid(2), or similar):

       1.  If  one  or  more  of the real, effective or saved set user IDs was
           previously 0, and as a result of the UID changes all of  these  IDs
           have  a  nonzero  value, then all capabilities are cleared from the
           permitted and effective capability sets.

       2.  If the effective user ID is changed from 0  to  nonzero,  then  all
           capabilities are cleared from the effective set.

       3.  If  the  effective  user  ID is changed from nonzero to 0, then the
           permitted set is copied to the effective set.

       4.  If the file system user ID  is  changed  from  0  to  nonzero  (see
           setfsuid(2))  then  the following capabilities are cleared from the
           effective set:  CAP_CHOWN,  CAP_DAC_OVERRIDE,  CAP_DAC_READ_SEARCH,
           CAP_FOWNER, and CAP_FSETID.  If the file system UID is changed from
           nonzero to 0, then any of these capabilities that  are  enabled  in
           the permitted set are enabled in the effective set.

       If a thread that has a 0 value for one or more of its user IDs wants to
       prevent its permitted capability set being cleared when it  resets  all
       of  its  user  IDs  to  nonzero values, it can do so using the prctl(2)
       PR_SET_KEEPCAPS operation.
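
       The four rules, together with the effect of PR_SET_KEEPCAPS, can be
       simulated as follows.  This is a model for reasoning about the rules,
       not a call into the kernel: the function name apply_uid_change, the
       dictionary keys, and the keepcaps parameter are hypothetical, and the
       model assumes that PR_SET_KEEPCAPS suppresses rule 1 only, so that
       the effective set is still cleared by rule 2.

```python
FS_CAPS = {"CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH",
           "CAP_FOWNER", "CAP_FSETID"}


def apply_uid_change(old, new, permitted, effective, keepcaps=False):
    """Model the capability changes for a UID transition.

    old and new map 'real', 'effective', 'saved' and 'fs' to user IDs.
    Returns the new (permitted, effective) sets.
    """
    permitted, effective = set(permitted), set(effective)
    ids = ("real", "effective", "saved")

    # Rule 1: some ID was 0 before and all are nonzero afterwards.
    if (not keepcaps and any(old[k] == 0 for k in ids)
            and all(new[k] != 0 for k in ids)):
        permitted.clear()
        effective.clear()
    # Rule 2: effective UID goes from 0 to nonzero.
    if old["effective"] == 0 and new["effective"] != 0:
        effective.clear()
    # Rule 3: effective UID goes from nonzero to 0.
    if old["effective"] != 0 and new["effective"] == 0:
        effective = set(permitted)
    # Rule 4: file system UID transitions affect the file-related subset.
    if old["fs"] == 0 and new["fs"] != 0:
        effective -= FS_CAPS
    if old["fs"] != 0 and new["fs"] == 0:
        effective |= permitted & FS_CAPS
    return permitted, effective


root = dict(real=0, effective=0, saved=0, fs=0)
user = dict(real=1000, effective=1000, saved=1000, fs=1000)
caps = {"CAP_CHOWN", "CAP_NET_BIND_SERVICE"}

print(apply_uid_change(root, user, caps, caps))
print(apply_uid_change(root, user, caps, caps, keepcaps=True))
```

       In the second call the permitted set survives the switch to UID 1000
       while the effective set is emptied, so in this model a thread that
       wants to keep using a capability would still have to raise it in its
       effective set again after changing IDs.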


CONFORMING TO
       No  standards   govern   capabilities,   but   the   Linux   capability
       implementation is based on the withdrawn POSIX.1e draft standard.


NOTES
       The libcap package provides a suite of routines for setting and getting
       capabilities that is more comfortable and less likely  to  change  than
       the interface provided by capset(2) and capget(2).


BUGS
       There  is  as  yet  no  file system support allowing capabilities to be
       associated with executable files.


SEE ALSO
       capget(2), prctl(2), setfsuid(2), credentials(7), pthreads(7)


COLOPHON
       This page is part of release 2.77 of the Linux  man-pages  project.   A
       description  of  the project, and information about reporting bugs, can
       be found at