Ubuntu Manpage: namespaces - overview of Linux namespaces

NAME

       namespaces - overview of Linux namespaces

DESCRIPTION

       A namespace wraps a global system resource in an abstraction that makes it appear to the processes within
       the  namespace  that they have their own isolated instance of the global resource.  Changes to the global
       resource are visible to other processes that are members of the namespace, but  are  invisible  to  other
       processes.  One use of namespaces is to implement containers.

       Linux provides the following namespaces:
       Namespace   Constant          Isolates
       Cgroup      CLONE_NEWCGROUP   Cgroup root directory
       IPC         CLONE_NEWIPC      System V IPC, POSIX message queues
       Network     CLONE_NEWNET      Network devices, stacks, ports, etc.
       Mount       CLONE_NEWNS       Mount points
       PID         CLONE_NEWPID      Process IDs
       User        CLONE_NEWUSER     User and group IDs
       UTS         CLONE_NEWUTS      Hostname and NIS domain name

       This  page  describes  the various namespaces and the associated /proc files, and summarizes the APIs for
       working with namespaces.

   The namespaces API
       As well as various /proc files described below, the namespaces API includes the following system calls:

       clone(2)
              The clone(2) system call creates a new process.  If the flags argument of the call  specifies  one
              or  more  of the CLONE_NEW* flags listed below, then new namespaces are created for each flag, and
              the child process is made a member of those namespaces.   (This  system  call  also  implements  a
              number of features unrelated to namespaces.)

       setns(2)
              The  setns(2) system call allows the calling process to join an existing namespace.  The namespace
              to join is specified via a file  descriptor  that  refers  to  one  of  the  /proc/[pid]/ns  files
              described below.

       unshare(2)
              The unshare(2) system call moves the calling process to a new namespace.  If the flags argument of
              the  call  specifies  one  or  more  of the CLONE_NEW* flags listed below, then new namespaces are
              created for each flag, and the calling process is made a member of those namespaces.  (This system
              call also implements a number of features unrelated to namespaces.)

       Creation of new namespaces using clone(2)  and  unshare(2)  in  most  cases  requires  the  CAP_SYS_ADMIN
       capability.   User  namespaces  are  the exception: since Linux 3.8, no privilege is required to create a
       user namespace.

   The /proc/[pid]/ns/ directory
       Each process has a /proc/[pid]/ns/ subdirectory containing one entry for  each  namespace  that  supports
       being manipulated by setns(2):

           $ ls -l /proc/$$/ns
           total 0
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 cgroup -> cgroup:[4026531835]
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 ipc -> ipc:[4026531839]
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 mnt -> mnt:[4026531840]
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 net -> net:[4026531969]
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid -> pid:[4026531836]
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid_for_children -> pid:[4026531834]
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 user -> user:[4026531837]
           lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 uts -> uts:[4026531838]

       Bind mounting (see mount(2)) one of the files in this directory to somewhere else in the filesystem keeps
       the  corresponding namespace of the process specified by pid alive even if all processes currently in the
       namespace terminate.

       Opening one of the files in this directory (or a file that is bind mounted to one of these files) returns
       a file handle for the corresponding namespace of the process specified by pid.   As  long  as  this  file
       descriptor  remains  open,  the  namespace  will  remain  alive,  even  if all processes in the namespace
       terminate.  The file descriptor can be passed to setns(2).

       In Linux 3.7 and earlier, these files were visible as hard  links.   Since  Linux  3.8,  they  appear  as
       symbolic  links.   If  two  processes  are  in  the  same  namespace,  then  the  inode  numbers of their
       /proc/[pid]/ns/xxx symbolic links will be the same; an application can check this using  the  stat.st_ino
       field  returned  by stat(2).  The content of this symbolic link is a string containing the namespace type
       and inode number as in the following example:

           $ readlink /proc/$$/ns/uts
           uts:[4026531838]

       The symbolic links in this subdirectory are as follows:

       /proc/[pid]/ns/cgroup (since Linux 4.6)
              This file is a handle for the cgroup namespace of the process.

       /proc/[pid]/ns/ipc (since Linux 3.0)
              This file is a handle for the IPC namespace of the process.

       /proc/[pid]/ns/mnt (since Linux 3.8)
              This file is a handle for the mount namespace of the process.

       /proc/[pid]/ns/net (since Linux 3.0)
              This file is a handle for the network namespace of the process.

       /proc/[pid]/ns/pid (since Linux 3.8)
              This file is a handle for the PID namespace of the process.  This  handle  is  permanent  for  the
              lifetime of the process (i.e., a process's PID namespace membership never changes).

       /proc/[pid]/ns/pid_for_children (since Linux 4.12)
              This  file is a handle for the PID namespace of child processes created by this process.  This can
              change as a consequence of calls to unshare(2) and setns(2) (see pid_namespaces(7)), so  the  file
              may differ from /proc/[pid]/ns/pid.

       /proc/[pid]/ns/user (since Linux 3.8)
              This file is a handle for the user namespace of the process.

       /proc/[pid]/ns/uts (since Linux 3.0)
              This file is a handle for the UTS namespace of the process.

       Permission  to dereference or read (readlink(2)) these symbolic links is governed by a ptrace access mode
       PTRACE_MODE_READ_FSCREDS check; see ptrace(2).

   The /proc/sys/user directory
       The files in the /proc/sys/user directory (which is present since Linux 4.9) expose limits on the  number
       of namespaces of various types that can be created.  The files are as follows:

       max_cgroup_namespaces
              The  value  in  this  file defines a per-user limit on the number of cgroup namespaces that may be
              created in the user namespace.

       max_ipc_namespaces
              The value in this file defines a per-user limit on the  number  of  ipc  namespaces  that  may  be
              created in the user namespace.

       max_mnt_namespaces
              The  value  in  this  file  defines a per-user limit on the number of mount namespaces that may be
              created in the user namespace.

       max_net_namespaces
              The value in this file defines a per-user limit on the number of network namespaces  that  may  be
              created in the user namespace.

       max_pid_namespaces
              The  value  in  this  file  defines  a  per-user limit on the number of pid namespaces that may be
              created in the user namespace.

       max_user_namespaces
              The value in this file defines a per-user limit on the number  of  user  namespaces  that  may  be
              created in the user namespace.

       max_uts_namespaces
              The  value  in  this  file  defines  a per-user limit on the number of user namespaces that may be
              created in the user namespace.

       Note the following details about these files:

       *  The values in these files are modifiable by privileged processes.

       *  The values exposed by these files are the limits for the user namespace in which the  opening  process
          resides.

       *  The limits are per-user.  Each user in the same user namespace can create namespaces up to the defined
          limit.

       *  The limits apply to all users, including UID 0.

       *  These  limits  apply  in  addition  to  any other per-namespace limits (such as those for PID and user
          namespaces) that may be enforced.

       *  Upon encountering these limits, clone(2) and unshare(2) fail with the error ENOSPC.

       *  For the initial user namespace, the default value in each of these files is  half  the  limit  on  the
          number  of  threads  that  may  be  created  (/proc/sys/kernel/threads-max).   In  all descendant user
          namespaces, the default value in each file is MAXINT.

       *  When a namespace is  created,  the  object  is  also  accounted  against  ancestor  namespaces.   More
          precisely:

          +  Each user namespace has a creator UID.

          +  When  a namespace is created, it is accounted against the creator UIDs in each of the ancestor user
             namespaces, and the kernel ensures that the corresponding namespace limit for the  creator  UID  in
             the ancestor namespace is not exceeded.

          +  The  aforementioned  point  ensures that creating a new user namespace cannot be used as a means to
             escape the limits in force in the current user namespace.

   Cgroup namespaces (CLONE_NEWCGROUP)
       See cgroup_namespaces(7).

   IPC namespaces (CLONE_NEWIPC)
       IPC namespaces isolate certain IPC resources, namely, System V IPC  objects  (see  svipc(7))  and  (since
       Linux  2.6.30)  POSIX  message  queues  (see  mq_overview(7)).   The  common  characteristic of these IPC
       mechanisms is that IPC objects are identified by mechanisms other than filesystem pathnames.

       Each IPC namespace has its own set  of  System  V  IPC  identifiers  and  its  own  POSIX  message  queue
       filesystem.   Objects  created in an IPC namespace are visible to all other processes that are members of
       that namespace, but are not visible to processes in other IPC namespaces.

       The following /proc interfaces are distinct in each IPC namespace:

       *  The POSIX message queue interfaces in /proc/sys/fs/mqueue.

       *  The System V IPC interfaces in /proc/sys/kernel, namely: msgmax, msgmnb, msgmni, sem, shmall,  shmmax,
          shmmni, and shm_rmid_forced.

       *  The System V IPC interfaces in /proc/sysvipc.

       When  an  IPC  namespace  is  destroyed  (i.e.,  when  the last process that is a member of the namespace
       terminates), all IPC objects in the namespace are automatically destroyed.

       Use of IPC namespaces requires a kernel that is configured with the CONFIG_IPC_NS option.

   Network namespaces (CLONE_NEWNET)
       See network_namespaces(7).

   Mount namespaces (CLONE_NEWNS)
       See mount_namespaces(7).

   PID namespaces (CLONE_NEWPID)
       See pid_namespaces(7).

   User namespaces (CLONE_NEWUSER)
       See user_namespaces(7).

   UTS namespaces (CLONE_NEWUTS)
       UTS namespaces provide isolation of two system identifiers: the hostname and the NIS domain name.   These
       identifiers  are  set  using  sethostname(2)  and  setdomainname(2), and can be retrieved using uname(2),
       gethostname(2), and getdomainname(2).

       Use of UTS namespaces requires a kernel that is configured with the CONFIG_UTS_NS option.

EXAMPLE

       See clone(2) and user_namespaces(7).

COLOPHON

       This page is part of release 4.15 of  the  Linux  man-pages  project.   A  description  of  the  project,
       information   about   reporting   bugs,   and   the  latest  version  of  this  page,  can  be  found  at
       https://www.kernel.org/doc/man-pages/.

Linux                                              2018-02-02                                      NAMESPACES(7)

NAME

DESCRIPTION

EXAMPLE

SEE ALSO

COLOPHON