Provided by: manpages_3.01-1_all
capabilities - overview of Linux capabilities
For the purpose of performing permission checks, traditional Unix
implementations distinguish two categories of processes: privileged
processes (whose effective user ID is 0, referred to as superuser or
root), and unprivileged processes (whose effective UID is non-zero).
Privileged processes bypass all kernel permission checks, while
unprivileged processes are subject to full permission checking based on
the process’s credentials (usually: effective UID, effective GID, and
supplementary group list).
Starting with kernel 2.2, Linux divides the privileges traditionally
associated with superuser into distinct units, known as capabilities,
which can be independently enabled and disabled. Capabilities are a
As at Linux 2.6.14, the following capabilities are implemented:
CAP_AUDIT_CONTROL (since Linux 2.6.11)
Enable and disable kernel auditing; change auditing filter
rules; retrieve auditing status and filtering rules.
CAP_AUDIT_WRITE (since Linux 2.6.11)
Allow records to be written to kernel auditing log.
Allow arbitrary changes to file UIDs and GIDs (see chown(2)).
Bypass file read, write, and execute permission checks. (DAC =
"discretionary access control".)
Bypass file read permission checks and directory read and
execute permission checks.
Bypass permission checks on operations that normally require the
file system UID of the process to match the UID of the file
(e.g., chmod(2), utime(2)), excluding those operations covered
by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH; set extended
file attributes (see chattr(1)) on arbitrary files; set Access
Control Lists (ACLs) on arbitrary files; ignore directory sticky
bit on file deletion; specify O_NOATIME for arbitrary files in
open(2) and fcntl(2).
Don’t clear set-user-ID and set-group-ID bits when a file is
modified; permit setting of the set-group-ID bit for a file
whose GID does not match the file system GID or any of the
supplementary GIDs of the calling process.
Permit memory locking (mlock(2), mlockall(2), mmap(2),
Bypass permission checks for operations on System V IPC objects.
Bypass permission checks for sending signals (see kill(2)).
This includes use of the KDSIGACCEPT ioctl.
(Linux 2.4 onwards) Allow file leases to be established on
arbitrary files (see fcntl(2)).
Allow setting of the EXT2_APPEND_FL and EXT2_IMMUTABLE_FL
extended file attributes (see chattr(1)).
(Linux 2.4 onwards) Allow creation of special files using
Allow various network-related operations (e.g., setting
privileged socket options, enabling multicasting, interface
configuration, modifying routing tables).
Allow binding to Internet domain reserved socket ports (port
numbers less than 1024).
(Unused) Allow socket broadcasting, and listening to multicasts.
Permit use of RAW and PACKET sockets.
Allow arbitrary manipulations of process GIDs and supplementary
GID list; allow forged GID when passing socket credentials via
Unix domain sockets.
Grant or remove any capability in the caller’s permitted
capability set to or from any other process.
Allow arbitrary manipulations of process UIDs (setuid(2),
setreuid(2), setresuid(2), setfsuid(2)); allow forged UID when
passing socket credentials via Unix domain sockets.
Permit a range of system administration operations including:
quotactl(2), mount(2), umount(2), swapon(2), swapoff(2),
sethostname(2), setdomainname(2), IPC_SET and IPC_RMID
operations on arbitrary System V IPC objects; perform operations
on trusted and security Extended Attributes (see attr(5)); call
lookup_dcookie(2); use ioprio_set(2) to assign IOPRIO_CLASS_RT
and IOPRIO_CLASS_IDLE I/O scheduling classes; perform keyctl(2)
KEYCTL_CHOWN and KEYCTL_SETPERM operations; allow forged UID
when passing socket credentials; exceed /proc/sys/fs/file-max,
the system-wide limit on the number of open files, in system
calls that open files (e.g., accept(2), execve(2), open(2),
pipe(2); without this capability these system calls will fail
with the error ENFILE if this limit is encountered); employ
the CLONE_NEWNS flag with clone(2) and unshare(2).
Permit calls to reboot(2) and kexec_load(2).
Permit calls to chroot(2).
Allow loading and unloading of kernel modules; allow
modifications to capability bounding set (see init_module(2) and
Allow raising process nice value (nice(2), setpriority(2)) and
changing of the nice value for arbitrary processes; allow
setting of real-time scheduling policies for calling process,
and setting scheduling policies and priorities for arbitrary
processes (sched_setscheduler(2), sched_setparam(2)); set CPU
affinity for arbitrary processes (sched_setaffinity(2)); set I/O
scheduling class and priority for arbitrary processes
(ioprio_set(2)); allow migrate_pages(2) to be applied to
arbitrary processes and allow processes to be migrated to
arbitrary nodes; allow move_pages(2) to be applied to arbitrary
processes; use the MPOL_MF_MOVE_ALL flag with mbind(2) and
Permit calls to acct(2).
Allow arbitrary processes to be traced using ptrace(2)
Permit I/O port operations (iopl(2) and ioperm(2)); access
Permit: use of reserved space on ext2 file systems; ioctl(2)
calls controlling ext3 journaling; disk quota limits to be
overridden; resource limits to be increased (see setrlimit(2));
RLIMIT_NPROC resource limit to be overridden; msg_qbytes limit
for a message queue to be raised above the limit in
/proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2)).
Allow modification of system clock (settimeofday(2), stime(2),
adjtimex(2)); allow modification of real-time (hardware) clock
Permit calls to vhangup(2).
Each thread has three capability sets containing zero or more of the
above capabilities:
Effective:
the capabilities used by the kernel to perform permission checks
for the thread.
Permitted:
the capabilities that the thread may assume (i.e., a limiting
superset for the effective and inheritable sets). If a thread
drops a capability from its permitted set, it can never
reacquire that capability (unless it execve(2)s a set-user-ID-root
program).
Inheritable:
the capabilities preserved across an execve(2).
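The relationship between the permitted and effective sets can be illustrated with a small toy model. This is only a sketch of the rules stated above, not of the kernel's data structures: the class name ThreadCaps and its methods are hypothetical, capabilities are represented as plain strings, and the inheritable set is stored but not otherwise modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadCaps:
    """Toy model of a thread's three capability sets (hypothetical helper)."""
    permitted: set = field(default_factory=set)
    effective: set = field(default_factory=set)
    inheritable: set = field(default_factory=set)

    def raise_effective(self, cap):
        # A capability can only be made effective if the permitted set allows it.
        if cap not in self.permitted:
            raise PermissionError(f"{cap} is not in the permitted set")
        self.effective.add(cap)

    def drop_permitted(self, cap):
        # Dropping from the permitted set is one-way in this model; the
        # capability is also removed from the effective set so that the
        # effective set stays within the permitted set.
        self.permitted.discard(cap)
        self.effective.discard(cap)


caps = ThreadCaps(permitted={"CAP_AUDIT_WRITE"})
caps.raise_effective("CAP_AUDIT_WRITE")   # allowed: it is permitted
caps.drop_permitted("CAP_AUDIT_WRITE")    # cannot be raised again afterwards
```

After the final call, a further raise_effective("CAP_AUDIT_WRITE") raises PermissionError in this model, mirroring the statement that a dropped capability cannot be reacquired without an execve(2).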
A child created via fork(2) inherits copies of its parent’s capability
sets. See below for a discussion of the treatment of capabilities
Using capset(2), a thread may manipulate its own capability sets, or,
if it has the CAP_SETPCAP capability, those of a thread in another
Capability bounding set
When a program is execed, the permitted and effective capabilities are
ANDed with the current value of the so-called capability bounding set,
defined in the file /proc/sys/kernel/cap-bound. This parameter can be
used to place a system-wide limit on the capabilities granted to all
subsequently executed programs. (Confusingly, this bit mask parameter
is expressed as a signed decimal number in /proc/sys/kernel/cap-bound.)
Only the init process may set bits in the capability bounding set;
other than that, the superuser may only clear bits in this set.
On a standard system the capability bounding set always masks out the
CAP_SETPCAP capability. To remove this restriction (dangerous!),
modify the definition of CAP_INIT_EFF_SET in include/linux/capability.h
and rebuild the kernel.
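The signed decimal representation can be worked through with a short sketch. It assumes a 32-bit mask and that CAP_SETPCAP is bit number 8 (the value used in linux/capability.h); the helper names are hypothetical and only do the arithmetic, they do not read or write /proc/sys/kernel/cap-bound.

```python
CAP_SETPCAP = 8          # assumption: bit number from linux/capability.h
MASK_BITS = 32           # assumption: the bounding set is a 32-bit mask


def mask_to_signed(mask):
    """Render a 32-bit capability mask as a signed decimal number."""
    mask &= (1 << MASK_BITS) - 1
    if mask & (1 << (MASK_BITS - 1)):
        return mask - (1 << MASK_BITS)
    return mask


def signed_to_mask(value):
    """Recover the 32-bit mask from its signed decimal form."""
    return value & ((1 << MASK_BITS) - 1)


# All capabilities enabled except CAP_SETPCAP.
default_bset = 0xFFFFFFFF & ~(1 << CAP_SETPCAP)
print(hex(default_bset), mask_to_signed(default_bset))
```

Under these assumptions, a bounding set with every bit set except CAP_SETPCAP is the mask 0xfffffeff, which appears as -257 when written as a signed decimal.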
The capability bounding set feature was added to Linux starting with
kernel version 2.2.11.
Current and Future Implementation
A full implementation of capabilities requires:
1. that for all privileged operations, the kernel check whether the
thread has the required capability in its effective set.
2. that the kernel provide system calls allowing a thread’s capability
sets to be changed and retrieved.
3. file system support for attaching capabilities to an executable
file, so that a process gains those capabilities when the file is
As at Linux 2.6.14, only the first two of these requirements are met.
Eventually, it should be possible to associate three capability sets
with an executable file, which, in conjunction with the capability sets
of the thread, will determine the capabilities of a thread after an
Inheritable (formerly known as allowed):
this set is ANDed with the thread’s inheritable set to determine
which inheritable capabilities are permitted to the thread after
Permitted (formerly known as forced):
the capabilities automatically permitted to the thread,
regardless of the thread’s inheritable capabilities.
Effective:
those capabilities in the thread’s new permitted set that are also to
be set in the new effective set. (F(effective) would normally
be either all zeros or all ones.)
In the meantime, since the current implementation does not support file
capability sets, during an execve(2):
1. All three file capability sets are initially assumed to be cleared.
2. If a set-user-ID-root program is being execed, or the real user ID
of the process is 0 (root) then the file inheritable and permitted
sets are defined to be all ones (i.e., all capabilities enabled).
3. If a set-user-ID-root program is being executed, then the file
effective set is defined to be all ones.
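The three steps above amount to a small decision procedure. The following sketch follows them literally; the function name, its parameters, and the use of a 32-bit all-ones mask are assumptions made for illustration.

```python
ALL_CAPS = 0xFFFFFFFF    # assumption: "all ones" for a 32-bit capability mask


def file_capability_sets(setuid_root, real_uid):
    """Return (F(inheritable), F(permitted), F(effective)) as bit masks.

    setuid_root: True if the program being executed is set-user-ID-root.
    real_uid:    real user ID of the process calling execve(2).
    """
    # Step 1: all three file sets start out cleared.
    f_inheritable = f_permitted = f_effective = 0
    # Step 2: set-user-ID-root program, or real UID 0.
    if setuid_root or real_uid == 0:
        f_inheritable = f_permitted = ALL_CAPS
    # Step 3: only a set-user-ID-root program gets a full effective set.
    if setuid_root:
        f_effective = ALL_CAPS
    return f_inheritable, f_permitted, f_effective
```

For example, an ordinary program run by an unprivileged user yields three cleared sets, while a set-user-ID-root program yields three all-ones sets regardless of the caller's real user ID.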
Transformation of Capabilities During exec()
During an execve(2), the kernel calculates the new capabilities of the
process using the following algorithm:
    P’(permitted)   = (P(inheritable) & F(inheritable)) |
                      (F(permitted) & cap_bset)
    P’(effective)   = P’(permitted) & F(effective)
    P’(inheritable) = P(inheritable)    [i.e., unchanged]
P denotes the value of a thread capability set before the
P’ denotes the value of a capability set after the execve(2)
F denotes a file capability set
cap_bset is the value of the capability bounding set.
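The algorithm can be written out directly as bit-mask arithmetic. The sketch below is a transcription of the three formulas, not kernel code; the function and parameter names are hypothetical, and the sample bounding set assumes a 32-bit mask with only bit 8 (CAP_SETPCAP) cleared.

```python
def exec_transform(p_inheritable, f_inheritable, f_permitted, f_effective,
                   cap_bset):
    """Return (P'(permitted), P'(effective), P'(inheritable)) after execve(2)."""
    new_permitted = (p_inheritable & f_inheritable) | (f_permitted & cap_bset)
    new_effective = new_permitted & f_effective
    new_inheritable = p_inheritable          # unchanged across the exec
    return new_permitted, new_effective, new_inheritable


ALL_CAPS = 0xFFFFFFFF                        # assumption: 32-bit masks
CAP_BSET = 0xFFFFFEFF                        # assumption: CAP_SETPCAP masked out

# A set-user-ID-root program: all three file sets are all ones.
root_result = exec_transform(0, ALL_CAPS, ALL_CAPS, ALL_CAPS, CAP_BSET)

# An ordinary program run by an unprivileged user: all file sets are cleared.
plain_result = exec_transform(0, 0, 0, 0, CAP_BSET)
```

With these inputs, root_result has permitted and effective sets equal to the bounding set and an empty inheritable set, while plain_result has all three sets empty.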
In the current implementation, the upshot of this algorithm is that
when a process execve(2)s a set-user-ID-root program, or when a process
with an effective UID of 0 execve(2)s a program, it gains all
capabilities in its permitted and effective capability sets, except
those masked out by the capability bounding set (i.e., CAP_SETPCAP).
This provides semantics that are the same as those provided by
traditional Unix systems.
Effect of User ID Changes on Capabilities
To preserve the traditional semantics for transitions between 0 and
non-zero user IDs, the kernel makes the following changes to a thread’s
capability sets on changes to the thread’s real, effective, saved set,
and file system user IDs (using setuid(2), setresuid(2), or similar):
1. If one or more of the real, effective or saved set user IDs was
previously 0, and as a result of the UID changes all of these IDs
have a non-zero value, then all capabilities are cleared from the
permitted and effective capability sets.
2. If the effective user ID is changed from 0 to non-zero, then all
capabilities are cleared from the effective set.
3. If the effective user ID is changed from non-zero to 0, then the
permitted set is copied to the effective set.
4. If the file system user ID is changed from 0 to non-zero (see
setfsuid(2)) then the following capabilities are cleared from the
effective set: CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH,
CAP_FOWNER, and CAP_FSETID. If the file system UID is changed from
non-zero to 0, then any of these capabilities that are enabled in
the permitted set are enabled in the effective set.
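The four rules can be sketched as a function over bit masks. This is an illustrative model only: the names Ids and apply_uid_change are hypothetical, the rules are simply applied in the order listed, and the bit numbers 0 to 4 for CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER, and CAP_FSETID are assumed from linux/capability.h.

```python
from typing import NamedTuple


class Ids(NamedTuple):
    real: int
    effective: int
    saved: int
    fs: int


# Assumption: CAP_CHOWN .. CAP_FSETID occupy bits 0..4.
FS_CAPS = 0b11111


def apply_uid_change(old, new, permitted, effective):
    """Return the (permitted, effective) masks after a change of user IDs."""
    old_res = (old.real, old.effective, old.saved)
    new_res = (new.real, new.effective, new.saved)
    # Rule 1: at least one ID was 0 and now none of them is.
    if 0 in old_res and 0 not in new_res:
        permitted = 0
        effective = 0
    # Rule 2: effective UID goes from 0 to non-zero.
    if old.effective == 0 and new.effective != 0:
        effective = 0
    # Rule 3: effective UID goes from non-zero to 0.
    if old.effective != 0 and new.effective == 0:
        effective = permitted
    # Rule 4: file system UID transitions affect only the file-related caps.
    if old.fs == 0 and new.fs != 0:
        effective &= ~FS_CAPS
    elif old.fs != 0 and new.fs == 0:
        effective |= permitted & FS_CAPS
    return permitted, effective
```

For instance, switching only the effective and file system UIDs away from 0 clears the effective set but leaves the permitted set intact, so a later switch back to effective UID 0 restores the effective set from the permitted set.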
If a thread that has a 0 value for one or more of its user IDs wants to
prevent its permitted capability set from being cleared when it resets all
of its user IDs to non-zero values, it can do so using the prctl(2)
No standards govern capabilities, but the Linux capability
implementation is based on the withdrawn POSIX.1e draft standard.
The libcap package provides a suite of routines for setting and getting
capabilities that is more comfortable and less likely to change than
the interface provided by capset(2) and capget(2).
There is as yet no file system support allowing capabilities to be
associated with executable files.
capget(2), prctl(2), setfsuid(2), credentials(7), pthreads(7)
This page is part of release 3.01 of the Linux man-pages project. A
description of the project, and information about reporting bugs, can
be found at http://www.kernel.org/doc/man-pages/.