

       pid_namespaces - overview of Linux PID namespaces


       For an overview of namespaces, see namespaces(7).

       PID  namespaces  isolate  the  process  ID  number  space, meaning that
       processes in different PID namespaces  can  have  the  same  PID.   PID
       namespaces   allow   containers   to   provide  functionality  such  as
       suspending/resuming the set of processes in the container and migrating
       the  container  to  a new host while the processes inside the container
       maintain the same PIDs.

       PIDs in a new PID namespace start at  1,  somewhat  like  a  standalone
       system,  and  calls  to  fork(2),  vfork(2),  or  clone(2) will produce
       processes with PIDs that are unique within the namespace.
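
       The following is a minimal sketch of this behavior, not a definitive
       implementation.  It assumes the caller has the privilege needed to
       create a PID namespace (typically CAP_SYS_ADMIN); the names
       report_pid() and first_pid_is_one() are hypothetical helpers
       introduced only for illustration.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

/* Runs as the first process in the new PID namespace. */
static int report_pid(void *arg)
{
    (void) arg;
    /* Exit status 0 means "I saw myself as PID 1". */
    return (getpid() == 1) ? 0 : 1;
}

/* Hypothetical helper: returns 1 if the child saw PID 1, 0 if it saw
   some other PID, or -1 if the namespace could not be created (for
   example, EPERM when the caller lacks privilege). */
int first_pid_is_one(void)
{
    int status;
    pid_t child, w;
    char *stack = malloc(STACK_SIZE);

    if (stack == NULL)
        return -1;

    /* The stack grows downward, so pass the top of the buffer. */
    child = clone(report_pid, stack + STACK_SIZE,
                  CLONE_NEWPID | SIGCHLD, NULL);
    if (child == -1) {
        free(stack);
        return -1;
    }

    w = waitpid(child, &status, 0);
    free(stack);
    if (w == -1 || !WIFEXITED(status))
        return -1;

    return (WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

       The PID returned to the parent by clone(2) is the child's PID in the
       parent's namespace; only the child itself sees the value 1.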

       Use of PID namespaces requires a kernel that  is  configured  with  the
       CONFIG_PID_NS option.

   The namespace init process
       The first process created in a new namespace (i.e., the process created
       using clone(2) with the CLONE_NEWPID flag, or the first  child  created
       by  a  process  after a call to unshare(2) using the CLONE_NEWPID flag)
       has the PID 1, and  is  the  "init"  process  for  the  namespace  (see
       init(1)).   A  child process that is orphaned within the namespace will
       be reparented to this process rather than init(1) (unless  one  of  the
       ancestors  of the child in the same PID namespace employed the prctl(2)
       PR_SET_CHILD_SUBREAPER command to mark itself as the reaper of orphaned
       descendant processes).

       If  the  "init"  process  of  a  PID  namespace  terminates, the kernel
       terminates all of the processes in the namespace via a SIGKILL  signal.
       This  behavior  reflects  the fact that the "init" process is essential
       for the correct  operation  of  a  PID  namespace.   In  this  case,  a
       subsequent  fork(2)  into  this  PID namespace will fail with the error
       ENOMEM; it is not possible to create new processes in a PID namespace
       whose  "init"  process  has terminated.  Such scenarios can occur when,
       for  example,  a  process  uses  an  open   file   descriptor   for   a
       /proc/[pid]/ns/pid  file  corresponding  to  a  process  that  was in a
       namespace to setns(2) into that namespace after the "init" process  has
       terminated.   Another  possible  scenario  can  occur  after  a call to
       unshare(2): if the  first  child  subsequently  created  by  a  fork(2)
       terminates, then subsequent calls to fork(2) will fail with ENOMEM.
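
       The unshare(2) scenario can be reproduced with a sketch along the
       following lines.  It assumes sufficient privilege to call
       unshare(CLONE_NEWPID), and it does the work in a throwaway helper
       process so that the caller itself is unaffected.  The function name
       fork_fails_after_init_exit() is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fork(2) failed with ENOMEM after
   the namespace "init" exited, 0 if it did not, or -1 if the
   experiment could not be set up (for example, no privilege). */
int fork_fails_after_init_exit(void)
{
    int status;
    pid_t helper = fork();

    if (helper == -1)
        return -1;

    if (helper == 0) {
        pid_t init, second;

        /* Children of this helper now go into a new PID namespace. */
        if (unshare(CLONE_NEWPID) == -1)
            _exit(2);

        init = fork();
        if (init == -1)
            _exit(2);
        if (init == 0)
            _exit(0);               /* namespace "init" exits at once */
        if (waitpid(init, NULL, 0) == -1)
            _exit(2);

        /* The namespace has lost its "init"; this fork should fail. */
        second = fork();
        if (second == -1 && errno == ENOMEM)
            _exit(0);
        if (second == 0)
            _exit(0);               /* child side of an unexpected success */
        if (second > 0)
            waitpid(second, NULL, 0);
        _exit(1);
    }

    if (waitpid(helper, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```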

       Only  signals  for  which  the  "init" process has established a signal
       handler can be sent to the "init" process by other members of  the  PID
       namespace.   This restriction applies even to privileged processes, and
       prevents other members of the PID namespace from  accidentally  killing
       the "init" process.

       Likewise,  a  process in an ancestor namespace can—subject to the usual
       permission checks described  in  kill(2)—send  signals  to  the  "init"
       process  of  a  child  PID  namespace  only  if  the "init" process has
       established a handler  for  that  signal.   (Within  the  handler,  the
       siginfo_t  si_pid  field  described  in  sigaction(2)  will  be  zero.)
       SIGKILL and SIGSTOP are treated exceptionally: these signals are
       forcibly  delivered  when sent from an ancestor PID namespace.  Neither
       of these signals can be caught by  the  "init"  process,  and  so  will
       result   in   the   usual   actions   associated   with  those  signals
       (respectively, terminating and stopping the process).

       Starting with Linux 3.4, the reboot(2) system call causes a  signal  to
       be sent to the namespace "init" process.  See reboot(2) for more
       details.

   Nesting PID namespaces
       PID namespaces can be nested: each PID namespace has a  parent,  except
       for  the initial ("root") PID namespace.  The parent of a PID namespace
       is the PID namespace of the process that created  the  namespace  using
       clone(2)  or  unshare(2).   PID  namespaces  thus form a tree, with all
       namespaces ultimately tracing their ancestry to the root namespace.

       A process is visible to other processes in its PID  namespace,  and  to
       the  processes  in each direct ancestor PID namespace going back to the
       root PID namespace.  In this context, "visible" means that one  process
       can  be  the target of operations by another process using system calls
       that specify a process ID.  Conversely, the processes in  a  child  PID
       namespace  can't  see  processes  in  the  parent  and  further removed
       ancestor namespaces.  More succinctly: a process can  see  (e.g.,  send
       signals  with  kill(2), set nice values with setpriority(2), etc.) only
       processes contained in its own PID namespace and in descendants of that
       namespace.

       A process has one process ID in each layer of the PID namespace
       hierarchy in which it is visible, starting from its own namespace and
       walking back through each direct ancestor namespace to the root PID
       namespace.  System calls that operate on process IDs always operate
       using the process ID that is visible in the PID namespace of the
       caller.  A call to getpid(2) always returns the PID associated with
       the namespace in which the process was created.

       Some  processes in a PID namespace may have parents that are outside of
       the namespace.  For example, the parent of the initial process  in  the
       namespace  (i.e.,  the  init(1)  process  with PID 1) is necessarily in
       another namespace.  Likewise, the direct children  of  a  process  that
       uses  setns(2)  to  cause its children to join a PID namespace are in a
       different  PID  namespace  from  the  caller  of  setns(2).   Calls  to
       getppid(2) for such processes return 0.
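
       As an illustration, the sketch below checks the first of these cases
       using the unshare(2) route: the first child forked after
       unshare(CLONE_NEWPID) becomes the namespace "init", while its parent
       stays outside the namespace.  The code assumes sufficient privilege
       to create a PID namespace, and init_parent_pid_is_zero() is a
       hypothetical name.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the namespace "init" saw
   getppid() == 0, 0 if it saw another value, or -1 if the namespace
   could not be created. */
int init_parent_pid_is_zero(void)
{
    int status;
    pid_t helper = fork();

    if (helper == -1)
        return -1;

    if (helper == 0) {
        int st;
        pid_t init;

        if (unshare(CLONE_NEWPID) == -1)
            _exit(2);

        init = fork();
        if (init == -1)
            _exit(2);
        if (init == 0) {
            /* This process is PID 1 in the new namespace; its parent
               (the helper) lives in the enclosing namespace. */
            _exit(getppid() == 0 ? 0 : 1);
        }

        if (waitpid(init, &st, 0) == -1 || !WIFEXITED(st))
            _exit(2);
        _exit(WEXITSTATUS(st));     /* relay the result upward */
    }

    if (waitpid(helper, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```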

       While  processes  may  freely  descend into child PID namespaces (e.g.,
       using setns(2) with CLONE_NEWPID), they  may  not  move  in  the  other
       direction.   That  is  to  say,  processes  may  not enter any ancestor
       namespaces (parent, grandparent, etc.).  Changing PID namespaces is a
       one-way operation.

   setns(2) and unshare(2) semantics
       Calls  to  setns(2)  that  specify  a PID namespace file descriptor and
       calls  to  unshare(2)  with  the  CLONE_NEWPID  flag   cause   children
       subsequently  created  by  the  caller  to be placed in a different PID
       namespace from the caller.  These calls do not, however, change the PID
       namespace  of  the  calling  process, because doing so would change the
       caller's idea of its own PID (as reported  by  getpid()),  which  would
       break many applications and libraries.
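
       A small sketch can confirm that the caller's own PID is unaffected.
       It compares getpid(2) before and after unshare(CLONE_NEWPID) inside a
       throwaway helper process.  The function name
       pid_unchanged_by_unshare() is hypothetical, and the code assumes the
       privilege needed for unshare(2) to succeed.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if getpid() is the same before and
   after unshare(CLONE_NEWPID), 0 if it changed, or -1 if unshare(2)
   could not be performed. */
int pid_unchanged_by_unshare(void)
{
    int status;
    pid_t helper = fork();

    if (helper == -1)
        return -1;

    if (helper == 0) {
        pid_t before = getpid();

        if (unshare(CLONE_NEWPID) == -1)
            _exit(2);

        /* Only future children move; the caller keeps its PID. */
        _exit(getpid() == before ? 0 : 1);
    }

    if (waitpid(helper, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```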

       To  put  things  another  way:  a process's PID namespace membership is
       determined  when  the  process  is  created  and  cannot   be   changed
       thereafter.    Among   other  things,  this  means  that  the  parental
       relationship  between  processes  mirrors  the  parental   relationship
       between  PID  namespaces: the parent of a process is either in the same
       namespace or resides in the immediate parent PID namespace.

   Compatibility of CLONE_NEWPID with other CLONE_* flags
       CLONE_NEWPID can't be combined with some other CLONE_* flags:

       *  CLONE_THREAD requires being in the same PID namespace in order  that
          the threads in a process can send signals to each other.  Similarly,
          it must be possible to see all of the threads of a process in the
          proc(5) filesystem.

       *  CLONE_SIGHAND  requires  being  in the same PID namespace; otherwise
          the process ID  of  the  process  sending  a  signal  could  not  be
          meaningfully  encoded  when a signal is sent (see the description of
          the siginfo_t type in  sigaction(2)).   A  signal  queue  shared  by
          processes in multiple PID namespaces will defeat that.

       *  CLONE_VM  requires  all  of  the  threads  to  be  in  the  same PID
          namespace, because, from the point of view of a core  dump,  if  two
          processes  share  the  same  address space then they are threads and
          will be core dumped together.  When a core dump is written, the  PID
          of  each  thread is written into the core dump.  Writing the process
          IDs could not meaningfully succeed if some of the process  IDs  were
          in a parent PID namespace.

       To   summarize:   there   is   a  technical  requirement  for  each  of
       CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM to  share  a  PID  namespace.
       (Note furthermore that clone(2) requires CLONE_VM to be specified if
       CLONE_THREAD or CLONE_SIGHAND is specified.)  Thus, call sequences such
       as the following will fail (with the error EINVAL):

           unshare(CLONE_NEWPID);
           clone(..., CLONE_VM, ...);    /* Fails */

           setns(fd, CLONE_NEWPID);
           clone(..., CLONE_VM, ...);    /* Fails */

           clone(..., CLONE_VM, ...);
           setns(fd, CLONE_NEWPID);      /* Fails */

           clone(..., CLONE_VM, ...);
           unshare(CLONE_NEWPID);        /* Fails */

   /proc and PID namespaces
       A  /proc filesystem shows (in the /proc/PID directories) only processes
       visible in the PID namespace of the process that performed  the  mount,
       even if the /proc filesystem is viewed from processes in other
       namespaces.

       After creating a new PID namespace, it  is  useful  for  the  child  to
       change  its  root directory and mount a new procfs instance at /proc so
       that tools such as ps(1) work correctly.  If a new mount  namespace  is
       simultaneously  created  by including CLONE_NEWNS in the flags argument
       of clone(2) or unshare(2), then it isn't necessary to change  the  root
       directory: a new procfs instance can be mounted directly over /proc.

       From a shell, the command to mount /proc is:

           $ mount -t proc proc /proc
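
       The same mount can be performed from C with mount(2).  The sketch
       below is one possible arrangement, not the only one: it creates the
       PID and mount namespaces together with clone(2), marks all mounts
       private so that the new mount does not propagate back (an added
       precaution not described above), mounts procfs over /proc, and
       checks that /proc/self now names PID 1.  It assumes CAP_SYS_ADMIN
       and an existing /proc directory; the function names are
       hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

/* Runs as PID 1 of the new PID namespace, in a new mount namespace. */
static int mount_proc_and_check(void *arg)
{
    char target[32];
    ssize_t n;

    (void) arg;

    /* Keep the new mount from propagating to the original namespace. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) == -1)
        return 2;

    /* Equivalent of "mount -t proc proc /proc". */
    if (mount("proc", "/proc", "proc", 0, NULL) == -1)
        return 2;

    n = readlink("/proc/self", target, sizeof(target) - 1);
    if (n == -1)
        return 2;
    target[n] = '\0';

    return (strcmp(target, "1") == 0) ? 0 : 1;
}

/* Hypothetical helper: returns 1 if the fresh procfs reports the
   child as PID 1, 0 if not, or -1 if setup failed (no privilege,
   mount refused, and so on). */
int fresh_proc_shows_pid_one(void)
{
    int status;
    pid_t child, w;
    char *stack = malloc(STACK_SIZE);

    if (stack == NULL)
        return -1;

    child = clone(mount_proc_and_check, stack + STACK_SIZE,
                  CLONE_NEWPID | CLONE_NEWNS | SIGCHLD, NULL);
    if (child == -1) {
        free(stack);
        return -1;
    }

    w = waitpid(child, &status, 0);
    free(stack);
    if (w == -1 || !WIFEXITED(status))
        return -1;

    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```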

       Calling readlink(2) on the path /proc/self yields the process ID of the
       caller in the  PID  namespace  of  the  procfs  mount  (i.e.,  the  PID
       namespace  of the process that mounted the procfs).  This can be useful
       for introspection purposes, when a process wants to discover its PID in
       other namespaces.
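
       A minimal sketch of this lookup follows.  The function name
       pid_from_proc_self() is hypothetical; the code assumes only that a
       procfs instance is mounted at /proc.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the caller's PID as reported by the
   procfs instance mounted at /proc, or -1 on failure. */
long pid_from_proc_self(void)
{
    char target[32];
    char *end;
    long pid;
    ssize_t n;

    /* /proc/self is a symbolic link whose target is the PID string. */
    n = readlink("/proc/self", target, sizeof(target) - 1);
    if (n == -1)
        return -1;
    target[n] = '\0';               /* readlink(2) does not terminate */

    pid = strtol(target, &end, 10);
    if (end == target || *end != '\0')
        return -1;

    return pid;
}
```

       When the procfs instance was mounted from the caller's own PID
       namespace, the result equals getpid(2); when it was mounted from an
       ancestor namespace, the result is the caller's PID in that ancestor.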

       When a process ID is passed over a UNIX domain socket to a process in a
       different PID namespace (see  the  description  of  SCM_CREDENTIALS  in
       unix(7)),  it  is  translated  into  the corresponding PID value in the
       receiving process's PID namespace.
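
       The receiving side of such an exchange might look like the sketch
       below.  For brevity, both ends of a socketpair(2) live in the same
       process, so the reported PID is simply that of the caller; with the
       receiver in a different PID namespace, the same code would observe
       the translated value.  The function name
       sender_pid_seen_by_receiver() is hypothetical.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: sends one byte over a UNIX domain socket pair
   and returns the sender PID that the kernel reports to the receiver,
   or -1 on failure. */
pid_t sender_pid_seen_by_receiver(void)
{
    int sv[2];
    int on = 1;
    char data;
    pid_t result = -1;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    struct ucred cred;
    union {                         /* ensures correct alignment */
        char buf[CMSG_SPACE(sizeof(struct ucred))];
        struct cmsghdr align;
    } control;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    /* Ask the kernel to attach sender credentials to received data. */
    if (setsockopt(sv[1], SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)) == 0 &&
        write(sv[0], "x", 1) == 1) {

        iov.iov_base = &data;
        iov.iov_len = sizeof(data);
        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = control.buf;
        msg.msg_controllen = sizeof(control.buf);

        if (recvmsg(sv[1], &msg, 0) != -1) {
            for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
                 cmsg = CMSG_NXTHDR(&msg, cmsg)) {
                if (cmsg->cmsg_level == SOL_SOCKET &&
                    cmsg->cmsg_type == SCM_CREDENTIALS) {
                    memcpy(&cred, CMSG_DATA(cmsg), sizeof(cred));
                    result = cred.pid;
                }
            }
        }
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```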


       Namespaces are a Linux-specific feature.


       For an example, see user_namespaces(7).


       clone(2),    setns(2),     unshare(2),     proc(5),     credentials(7),
       capabilities(7), user_namespaces(7), switch_root(8)


       This  page  is  part of release 4.04 of the Linux man-pages project.  A
       description of the project, information about reporting bugs,  and  the
       latest     version     of     this    page,    can    be    found    at