Provided by: charliecloud-builders_0.27-1_amd64

NAME

       ch-image - Build and manage images; completely unprivileged

SYNOPSIS

          $ ch-image [...] build [-t TAG] [-f DOCKERFILE] [...] CONTEXT
          $ ch-image [...] delete IMAGE_REF
          $ ch-image [...] import PATH IMAGE_REF
          $ ch-image [...] list [IMAGE_REF]
          $ ch-image [...] pull [...] IMAGE_REF [IMAGE_DIR]
          $ ch-image [...] push [--image DIR] IMAGE_REF [DEST_REF]
          $ ch-image [...] reset
          $ ch-image [...] storage-path
          $ ch-image { --help | --version | --dependencies }

DESCRIPTION

ch-image is a tool for building and manipulating container images, but not running them
(for that you want ch-run). It is completely unprivileged, with no setuid/setgid/setcap
helpers. The action to take is specified by a sub-command.

Options that print brief information and then exit:

-h, --help
Print help and exit successfully. If specified before the sub-command, print
general help and list of sub-commands; if after the sub-command, print help
specific to that sub-command.

--dependencies
Report dependency problems on standard output, if any, and exit. If all is well,
there is no output and the exit is successful; in case of problems, the exit is
unsuccessful.

--version
Print version number and exit successfully.

Common options placed before the sub-command:

-a, --arch ARCH
Use ARCH for architecture-aware registry operations, currently pull and pulls
done within build. ARCH can be: (1) yolo, to bypass architecture-aware code and
use the registry’s default architecture; (2) host, to use the host’s
architecture, obtained with the equivalent of uname -m (default if --arch not
specified); or (3) an architecture name. If the specified architecture is not
available, the error message will list which ones are.

Notes:

1. ch-image is limited to one image per image reference in builder storage at a
time, regardless of architecture. For example, if you say ch-image pull
--arch=foo baz and then ch-image pull --arch=bar baz, builder storage will
contain one image called “baz”, with architecture “bar”.

2. Images’ default architecture is usually amd64, so this is usually what you
get with --arch=yolo. Similarly, if a registry image is architecture-unaware,
it will still be pulled with --arch=amd64 and --arch=host on x86-64 hosts
(other host architectures must specify --arch=yolo to pull
architecture-unaware images).

3. uname -m and image registries often use different names for the same
architecture. For example, what uname -m reports as “x86_64” is known to
registries as “amd64”. --arch=host should translate if needed, but it’s
useful to know this is happening. Directly specified architecture names are
passed to the registry without translation.

4. Registries treat architecture as a pair of items, architecture and sometimes
variant (e.g., “arm” and “v7”). Charliecloud treats architecture as a simple
string and converts to/from the registry view transparently.

--no-cache
Download everything needed, ignoring the cache.

--password-many
Re-prompt the user every time a registry password is needed.

-s, --storage DIR
Set the storage directory (see below for important details).

--tls-no-verify
Don’t verify TLS certificates of the repository. (Do not use this option unless
you understand the risks.)

-v, --verbose
Print extra chatter; can be repeated.
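
The architecture handling described under --arch above (notes 3 and 4) can be sketched
roughly as follows. This is an illustrative sketch, not ch-image's actual code: the
function names and the translation table are hypothetical, and only the x86_64 to amd64
pair is stated in the text; the aarch64 entry is an assumption.

```python
# Hypothetical uname -m -> registry name table (not ch-image's real table).
# Only "x86_64" -> "amd64" comes from the text; "aarch64" is an assumption.
UNAME_TO_REGISTRY = {
    "x86_64": "amd64",
    "aarch64": "arm64/v8",
}


def host_arch(uname_m):
    """Translate a `uname -m` string to a registry-style name, if known."""
    return UNAME_TO_REGISTRY.get(uname_m, uname_m)


def split_arch(arch):
    """Split a simple string such as "arm/v7" into (architecture, variant)."""
    name, _, variant = arch.partition("/")
    return name, (variant or None)


def join_arch(name, variant=None):
    """Inverse of split_arch(): rebuild the simple string form."""
    return name if variant is None else f"{name}/{variant}"
```

For the host case, one might call host_arch(platform.machine()); directly specified
names would skip host_arch() entirely, since they are passed without translation.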

AUTHENTICATION

       If the remote repository needs authentication, Charliecloud will prompt you for a username
       and password. Note that some repositories call the secret something other than “password”;
       e.g., GitLab calls it a “personal access token (PAT)”.

       These values are remembered for the life of the process and silently re-offered to the
       registry if needed. One case when this happens is on push to a private registry: many
       registries will first offer a read-only token when ch-image checks if something exists,
       then re-authenticate when upgrading the token to read-write for upload. If your site uses
       one-time passwords such as provided by a security device, you can specify --password-many
       to provide a new secret each time.

       These values are not saved persistently, e.g. in a file. Note that we do use normal Python
       variables for this information, without pinning them into physical RAM with mlock(2) or
       any other special treatment, so we cannot guarantee they will never reach non-volatile
       storage.

       Unlike Docker, there is no separate login subcommand. For non-interactive authentication,
       you can use the environment variables CH_IMAGE_USERNAME and CH_IMAGE_PASSWORD. Only do
       this if you fully understand the implications for your specific use case, because it is
       difficult to securely store secrets in environment variables.
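
       The lookup order described in this section might be sketched as below. This is a
       hypothetical illustration only: get_credentials() and its parameters are invented
       names, and whether ch-image falls back to prompting for just the missing value is an
       assumption.

```python
import getpass
import os


def get_credentials(env=os.environ, ask_user=input, ask_secret=getpass.getpass):
    """Return (username, password), preferring the environment variables.

    Assumption: a value missing from the environment is prompted for
    individually; the real tool may behave differently.
    """
    user = env.get("CH_IMAGE_USERNAME")
    password = env.get("CH_IMAGE_PASSWORD")
    if user is None:
        user = ask_user("Username: ")
    if password is None:
        password = ask_secret("Password: ")  # not echoed to the terminal
    return user, password
```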

STORAGE DIRECTORY

ch-image maintains state using normal files and directories located in its storage
directory; contents include temporary images used for building and various caches.

In descending order of priority, this directory is located at:

-s, --storage DIR
Command line option.

$CH_IMAGE_STORAGE
Environment variable.

/var/tmp/$USER.ch
Default. (Previously, the default was /var/tmp/$USER/ch-image. If a valid
storage directory is found at the old default path, ch-image tries to move it to
the new default path.)

Unlike many container implementations, there is no notion of storage drivers, graph
drivers, etc., to select and/or configure.
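
The priority order above can be expressed as a short lookup. The sketch below is
illustrative rather than the tool's implementation; storage_path() and its parameters are
hypothetical names, and it ignores the migration from the old default path.

```python
import getpass
import os


def storage_path(cli_storage=None, env=os.environ):
    """Resolve the storage directory in descending order of priority."""
    if cli_storage is not None:            # 1. -s, --storage DIR
        return cli_storage
    if "CH_IMAGE_STORAGE" in env:          # 2. $CH_IMAGE_STORAGE
        return env["CH_IMAGE_STORAGE"]
    user = env.get("USER") or getpass.getuser()
    return f"/var/tmp/{user}.ch"           # 3. default
```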

The storage directory can reside on any filesystem. However, it contains lots of small
files and metadata traffic can be intense. For example, the Charliecloud test suite uses
approximately 400,000 files and directories in the storage directory as of this writing.
Place it on a filesystem appropriate for this; tmpfs’es such as /var/tmp are a good choice
if you have enough RAM (/tmp is not recommended because ch-run bind-mounts it into
containers by default).

While you can currently poke around in the storage directory and find unpacked images
runnable with ch-run, this is not a supported use case. The supported workflow uses
ch-convert to obtain a packed image; see the tutorial for details.

The storage directory format changes on no particular schedule. Often ch-image is able to
upgrade the directory; however, downgrading is not supported and sometimes upgrade is not
possible. In these cases, ch-image will refuse to run until you delete and re-initialize
the directory with ch-image reset.

WARNING:
Network filesystems, especially Lustre, are typically bad choices for the storage
directory. This is a site-specific question and your local support will likely have
strong opinions.

BUILD

Build an image from a Dockerfile and put it in the storage directory.

Synopsis
$ ch-image [...] build [-t TAG] [-f DOCKERFILE] [...] CONTEXT

Description
Uses ch-run -w -u0 -g0 --no-home --no-passwd to execute RUN instructions. Note that FROM
implicitly pulls the base image if needed, so you may want to read about the pull
subcommand below as well.

Required argument:

CONTEXT
Path to context directory. This is the root of COPY instructions in the
Dockerfile. If a single hyphen (-) is specified: (a) read the Dockerfile from
standard input, (b) specifying --file is an error, and (c) there is no context,
so COPY will fail. (See --file for how to provide the Dockerfile on standard
input while also having a context.)

Options:

-b, --bind SRC[:DST]
For RUN instructions only, bind-mount SRC at guest DST. The default destination
if not specified is to use the same path as the host; i.e., the default is
equivalent to --bind=SRC:SRC. If DST does not exist, try to create it as an
empty directory, though images do have ten directories /mnt/[0-9] already
available as mount points. Can be repeated.

Note: See documentation for ch-run --bind for important caveats and gotchas.

Note: Other instructions that modify the image filesystem, e.g. COPY, can only
access host files from the context directory, regardless of this option.

--build-arg KEY[=VALUE]
Set build-time variable KEY defined by ARG instruction to VALUE. If VALUE not
specified, use the value of environment variable KEY.

-f, --file DOCKERFILE
Use DOCKERFILE instead of CONTEXT/Dockerfile. If a single hyphen (-) is
specified, read the Dockerfile from standard input; like docker build, the
context directory is still available in this case.

--force
Inject the unprivileged build workarounds; see discussion later in this section
for details on what this does and when you might need it. If a build fails and
ch-image thinks --force would help, it will suggest it.

-n, --dry-run
Don’t actually execute any Dockerfile instructions.

--no-force-detect
Don’t try to detect if the workarounds in --force would help.

--parse-only
Stop after parsing the Dockerfile.

-t, --tag TAG
Name of image to create. If not specified, infer the name:

1. If Dockerfile named Dockerfile with an extension: use the extension with
invalid characters stripped, e.g. Dockerfile.@FOO.bar → foo.bar.

2. If Dockerfile has extension dockerfile: use the basename with the same
transformation, e.g. baz.@QUX.dockerfile → baz.qux.

3. If context directory is not /: use its name, i.e. the last component of the
absolute path to the context directory, with the same transformation.

4. Otherwise (context directory is /): use root.

If no colon present in the name, append :latest.
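
The name inference rules for --tag can be sketched as follows. This is a hedged
illustration, not the parser ch-image uses: infer_tag() and clean() are hypothetical, and
the set of valid characters (lower-case letters, digits, underscore, period, hyphen) is
an assumption chosen to reproduce the two examples given above.

```python
import os
import re


def clean(name):
    """Lower-case, then drop characters assumed to be invalid in a name."""
    return re.sub(r"[^a-z0-9_.-]", "", name.lower())


def infer_tag(dockerfile, context, tag=None):
    """Return the image name, inferring it when no tag was given."""
    if tag is None:
        base = os.path.basename(dockerfile)
        ctx = os.path.abspath(context)
        if base.startswith("Dockerfile.") and len(base) > len("Dockerfile."):
            tag = clean(base[len("Dockerfile."):])       # rule 1
        elif base.endswith(".dockerfile") and len(base) > len(".dockerfile"):
            tag = clean(base[:-len(".dockerfile")])      # rule 2
        elif ctx != "/":
            tag = clean(os.path.basename(ctx))           # rule 3
        else:
            tag = "root"                                 # rule 4
    if ":" not in tag:
        tag += ":latest"
    return tag
```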

Privilege model
ch-image is a fully unprivileged image builder. It does not use any setuid or setcap
helper programs, and it does not use configuration files /etc/subuid or /etc/subgid. This
contrasts with the “rootless” or “fakeroot” modes of some competing builders, which do
require privileged supporting code or utilities.

This approach does yield some quirks. We provide built-in workarounds that should mostly
work (i.e., --force), but it can be helpful to understand what is going on.

ch-image executes all instructions as the normal user who invokes it. For RUN, this is
accomplished with ch-run -w --uid=0 --gid=0 (and some other arguments), i.e., your host
EUID and EGID both mapped to zero inside the container, and only one UID (zero) and GID
(zero) are available inside the container. Under this arrangement, processes running in
the container for each RUN appear to be running as root, but many privileged system calls
will fail without the workarounds described below. This affects any fully unprivileged
container build, not just Charliecloud.

The most common time to see this is installing packages. For example, here is RPM failing
to chown(2) a file, which makes the package update fail:

Updating : 1:dbus-1.10.24-13.el7_6.x86_64 2/4
Error unpacking rpm package 1:dbus-1.10.24-13.el7_6.x86_64
error: unpacking of archive failed on file /usr/libexec/dbus-1/dbus-daemon-launch-helper;5cffd726: cpio: chown
Cleanup : 1:dbus-libs-1.10.24-12.el7.x86_64 3/4
error: dbus-1:1.10.24-13.el7_6.x86_64: install failed

This one is (ironically) apt-get failing to drop privileges:

E: setgroups 65534 failed - setgroups (1: Operation not permitted)
E: setegid 65534 failed - setegid (22: Invalid argument)
E: seteuid 100 failed - seteuid (22: Invalid argument)
E: setgroups 0 failed - setgroups (1: Operation not permitted)

By default, nothing is done to avoid these problems, though ch-image does try to detect if
the workarounds could help. --force activates the workarounds: ch-image injects extra
commands to intercept these system calls and fake a successful result, using fakeroot(1).
There are three basic steps:

1. After FROM, analyze the image to see what distribution it contains, which determines
the specific workarounds.

2. Before the user command in the first RUN instruction where the injection seems
needed, install fakeroot(1) in the image, if one is not already installed, as well
as any other necessary initialization commands. For example, we turn off the apt
sandbox (for Debian Buster) and configure EPEL but leave it disabled (for
CentOS/RHEL).

3. Prepend fakeroot to RUN instructions that seem to need it, e.g. ones that contain
apt, apt-get, dpkg for Debian derivatives and dnf, rpm, or yum for RPM-based
distributions.

The details are specific to each distribution. ch-image analyzes image content (e.g.,
grepping /etc/debian_version) to select a configuration; see lib/fakeroot.py for details.
ch-image prints exactly what it is doing.
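
Step 3 above, deciding which RUN instructions get fakeroot prepended, might look roughly
like the sketch below. It is not taken from lib/fakeroot.py: the function names, the
whole-word matching, and the /bin/sh -c wrapping are assumptions; only the keyword lists
come from the text.

```python
import re

# Keywords from step 3 above, grouped by a hypothetical family label.
KEYWORDS = {
    "debian": ("apt", "apt-get", "dpkg"),
    "rpm": ("dnf", "rpm", "yum"),
}


def needs_fakeroot(command, family):
    """True if any keyword for the family appears as a whole shell word."""
    words = re.split(r"[\s;&|()]+", command)
    return any(word in KEYWORDS[family] for word in words)


def wrap(command, family):
    """Return the argument list to execute for one RUN instruction."""
    argv = ["/bin/sh", "-c", command]
    if needs_fakeroot(command, family):
        argv = ["fakeroot"] + argv
    return argv
```

Note that this simple matcher would miss commands invoked by full path (for example
/usr/bin/apt-get); a real detector would need to be more careful.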

Compatibility with other Dockerfile interpreters
ch-image is an independent implementation and shares no code with other Dockerfile
interpreters. It uses a formal Dockerfile parsing grammar developed from the Dockerfile
reference documentation and miscellaneous other sources, which you can examine in the
source code.

We believe this independence is valuable for several reasons. First, it helps the
community examine Dockerfile syntax and semantics critically, think rigorously about what
is really needed, and build a more robust standard. Second, it yields disjoint sets of
bugs (note that Podman, Buildah, and Docker all share the same Dockerfile parser). Third,
because it is a much smaller code base, it illustrates how Dockerfiles work more clearly.
Finally, it allows straightforward extensions if needed to support scientific computing.

ch-image tries hard to be compatible with Docker and other interpreters, though as an
independent implementation, it is not bug-compatible.

The following subsections describe differences from the Dockerfile reference that we
expect to be approximately permanent. For not-yet-implemented features and bugs in this
area, see related issues on GitHub.

None of these are set in stone. We are very interested in feedback on our assessments and
open questions. This helps us prioritize new features and revise our thinking about what
is needed for HPC containers.

Context directory
The context directory is bind-mounted into the build, rather than copied like Docker.
Thus, the size of the context is immaterial, and the build reads directly from storage
like any other local process would. However, you still can’t access anything outside the
context directory.

Variable substitution
Variable substitution happens for all instructions, not just the ones listed in the
Dockerfile reference.

ARG and ENV cause cache misses upon definition, in contrast with Docker where these
variables miss upon use, except for certain cache-excluded variables that never cause
misses, listed below.

ch-image passes the following proxy environment variables in to the build. Changes to
these variables do not cause a cache miss. They do not require an ARG instruction, as
documented in the Dockerfile reference. Unlike Docker, they are available if the
same-named environment variable is defined; --build-arg is not required.

HTTP_PROXY
http_proxy
HTTPS_PROXY
https_proxy
FTP_PROXY
ftp_proxy
NO_PROXY
no_proxy

In addition to those listed in the Dockerfile reference, these environment variables are
passed through in the same way:

SSH_AUTH_SOCK
USER

Finally, these variables are also pre-defined but are unrelated to the host environment:

PATH=/ch/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
TAR_OPTIONS=--no-same-owner

Note that ARG and ENV have different syntax despite very similar semantics.
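
Putting the three groups of variables together, the environment seen by the build could
be assembled along these lines. This is only a sketch of the behavior described above;
build_env() is a hypothetical name, and the order in which the real tool applies these
values is an assumption.

```python
# Variables passed through from the host when defined there.
PASS_THROUGH = (
    "HTTP_PROXY", "http_proxy", "HTTPS_PROXY", "https_proxy",
    "FTP_PROXY", "ftp_proxy", "NO_PROXY", "no_proxy",
    "SSH_AUTH_SOCK", "USER",
)

# Variables pre-defined regardless of the host environment.
PREDEFINED = {
    "PATH": "/ch/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
    "TAR_OPTIONS": "--no-same-owner",
}


def build_env(host_env):
    """Return the initial build environment derived from host_env."""
    env = {key: host_env[key] for key in PASS_THROUGH if key in host_env}
    env.update(PREDEFINED)  # host PATH is not inherited
    return env
```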

COPY
Especially for people used to UNIX cp(1), the semantics of the Dockerfile COPY instruction
can be confusing.

Most notably, when a source of the copy is a directory, the contents of that directory,
not the directory itself, are copied. This is documented, but it’s a real gotcha because
that’s not what cp(1) does, and it means that many things you can do in one cp(1) command
require multiple COPY instructions.
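
To make the difference from cp(1) concrete, the sketch below mimics only the rule just
described: a directory source contributes its contents, not itself. The helper names are
hypothetical, and the sketch does not attempt the other COPY behaviors in this section.

```python
import os
import shutil
import tempfile


def copy_dir_contents(src, dst):
    """Copy the contents of directory src into dst, as COPY does.

    By contrast, `cp -r src dst` with an existing dst would create dst/src.
    """
    shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=True)


def demo():
    """Build a tiny tree, copy it, and return what lands in the destination."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "foo")
        os.makedirs(os.path.join(src, "sub"))
        open(os.path.join(src, "a.txt"), "w").close()
        dst = os.path.join(tmp, "dst")
        os.makedirs(dst)
        copy_dir_contents(src, dst)
        return sorted(os.listdir(dst))
```

Here demo() lists a.txt and sub directly under the destination; there is no foo
directory there, which is what surprises people used to cp(1).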

Also, the reference documentation is incomplete. In our experience, Docker also behaves as
follows; ch-image does the same in an attempt to be bug-compatible.

1. You can use absolute paths in the source; the root is the context directory.

2. Destination directories are created if they don’t exist in the following situations:

   (a) If the destination path ends in slash. (Documented.)

   (b) If the number of sources is greater than 1, either by wildcard or explicitly,
       regardless of whether the destination ends in slash. (Not documented.)

   (c) If there is a single source and it is a directory. (Not documented.)

3. Symbolic links behave differently depending on how deep in the copied tree they are.
   (Not documented.)

   (a) Symlinks at the top level (i.e., named as the destination or the source, either
       explicitly or by wildcards) are dereferenced. They are followed, and whatever they
       point to is used as the destination or source, respectively.

   (b) Symlinks at deeper levels are not dereferenced, i.e., the symlink itself is copied.

4. If a directory appears at the same path in source and destination, and is at the 2nd
   level or deeper, the source directory’s metadata (e.g., permissions) is copied to the
   destination directory. (Not documented.)

5. If an object appears in both the source and destination, and is at the 2nd level or
   deeper, and is of different types in the source and destination, then the source object
   will overwrite the destination object. (Not documented.) For example, if /tmp/foo/bar
   is a regular file, and /tmp is the context directory, then the following Dockerfile
   snippet will result in a file in the container at /foo/bar (copied from /tmp/foo/bar);
   the directory /foo/bar created by RUN and all its contents will be lost.

       RUN mkdir -p /foo/bar && touch /foo/bar/baz
       COPY foo /foo

We expect the following differences to be permanent:

• Wildcards use Python glob semantics, not the Go semantics.

• COPY --chown is ignored, because it doesn’t make sense in an unprivileged build.

Features we do not plan to support
• Parser directives are not supported. We have not identified a need for any of them.

• EXPOSE: Charliecloud does not use the network namespace, so containerized processes can
simply listen on a host port like other unprivileged processes.

• HEALTHCHECK: This instruction’s main use case is monitoring server processes rather than
applications. Also, implementing it requires a container supervisor daemon, which we
have no plans to add.

• MAINTAINER is deprecated.

• STOPSIGNAL requires a container supervisor daemon process, which we have no plans to
add.

• USER does not make sense for unprivileged builds.

• VOLUME: This instruction is not currently supported. Charliecloud has good support for
bind mounts; we anticipate that it will continue to focus on that and will not introduce
the volume management features that Docker has.

Examples
Build image bar using ./foo/bar/Dockerfile and context directory ./foo/bar:

$ ch-image build -t bar -f ./foo/bar/Dockerfile ./foo/bar
[...]
grown in 4 instructions: bar

Same, but infer the image name and Dockerfile from the context directory path:

$ ch-image build ./foo/bar
[...]
grown in 4 instructions: bar

Build using humongous vendor compilers you want to bind-mount instead of installing into
the image:

$ ch-image build --bind /opt/bigvendor:/opt .
$ cat Dockerfile
FROM centos:7

RUN /opt/bin/cc hello.c
#COPY /opt/lib/*.so /usr/local/lib # fail: COPY doesn't bind mount
RUN cp /opt/lib/*.so /usr/local/lib # possible workaround
RUN ldconfig

DELETE

          $ ch-image [...] delete IMAGE_REF

       Delete the image described by the image reference IMAGE_REF from the storage directory.

LIST

       Print information about images. If no argument given, list the images in builder storage.

   Synopsis
          $ ch-image [...] list [IMAGE_REF]

   Description
       Optional argument:

          IMAGE_REF
                 Print details of what’s known about IMAGE_REF, both locally and in the remote
                 registry, if any.

   Examples
       List images in builder storage:

          $ ch-image list
          alpine:3.9 (amd64)
          alpine:latest (amd64)
          debian:buster (amd64)

       Print details about Debian Buster image:

          $ ch-image list debian:buster
          details of image:    debian:buster
          in local storage:    no
          full remote ref:     registry-1.docker.io:443/library/debian:buster
          available remotely:  yes
          remote arch-aware:   yes
          host architecture:   amd64
          archs available:     386 amd64 arm/v5 arm/v7 arm64/v8 mips64le ppc64le s390x

IMPORT

          $ ch-image [...] import PATH IMAGE_REF

       Copy the image at PATH into builder storage with name IMAGE_REF. PATH can be:

       • an image directory

       • a tarball with no top-level directory (a.k.a. a “tarbomb”)

       • a standard tarball with one top-level directory

       If the imported image contains Charliecloud metadata, that will be imported unchanged,
       i.e., images exported from ch-image builder storage will be functionally identical when
       re-imported.

PULL

       Pull the image described by the image reference IMAGE_REF from a repository to the local
       filesystem.

   Synopsis
          $ ch-image [...] pull [...] IMAGE_REF [IMAGE_DIR]

       See the FAQ for the gory details on specifying image references.

   Description
       Destination:

          IMAGE_DIR
                 If specified, place the unpacked image at this path; it is then ready for use by
                 ch-run or other tools. The storage directory will not contain a copy of the
                 image, i.e., it is only unpacked once.

       Options:

          --last-layer N
                 Unpack only N layers, leaving an incomplete image. This option is intended for
                 debugging.

          --parse-only
                 Parse IMAGE_REF, print a parse report, and exit successfully without talking to
                 the internet or touching the storage directory.

       This script does a fair amount of validation and fixing of the layer tarballs before
       flattening in order to support unprivileged use despite image problems we frequently see
       in the wild. For example, device files are ignored, and directory and file permissions
       are increased to a minimum of rwx------ and rw------- respectively. Note, however, that
       symlinks pointing outside the image are permitted, because they are not resolved until
       runtime within a container.
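
       The kind of fixing described here can be sketched with Python's tarfile module. This
       is an illustration under assumptions, not ch-image's code: sanitize() is a
       hypothetical name, and it assumes directories are raised to at least rwx------ and
       regular files to at least rw-------.

```python
import tarfile


def sanitize(members):
    """Return the members to extract, with permissions fixed in place."""
    kept = []
    for member in members:
        if member.ischr() or member.isblk():
            continue                 # device files are ignored
        if member.isdir():
            member.mode |= 0o700     # owner needs rwx to traverse and modify
        elif member.isfile():
            member.mode |= 0o600     # owner needs rw on regular files
        kept.append(member)          # symlinks etc. pass through unchanged
    return kept
```

       A caller could pass the result as the members argument of TarFile.extractall().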

       The following metadata in the pulled image is retained; all other metadata is currently
       ignored. (If you have a need for additional metadata, please let us know!)

          • Current working directory set with WORKDIR is effective in downstream Dockerfiles.

          • Environment variables set with ENV are effective in downstream Dockerfiles and also
            written to /ch/environment for use in ch-run --set-env.

          • Mount point directories specified with VOLUME are created in the image if they don’t
            exist, but no other action is taken.

       Note that some images (e.g., those with a “version 1 manifest”) do not contain metadata. A
       warning is printed in this case.

   Examples
       Download the Debian Buster image matching the host’s architecture and place it in the
       storage directory:

          $ uname -m
          aarch64
          $ ch-image pull debian:buster
          pulling image:    debian:buster
          requesting arch:  arm64/v8
          manifest list: downloading
          manifest: downloading
          config: downloading
          layer 1/1: c54d940: downloading
          flattening image
          layer 1/1: c54d940: listing
          validating tarball members
          resolving whiteouts
          layer 1/1: c54d940: extracting
          image arch: arm64
          done

       Same, specifying the architecture explicitly:

          $ ch-image --arch=arm/v7 pull debian:buster
          pulling image:    debian:buster
          requesting arch:  arm/v7
          manifest list: downloading
          manifest: downloading
          config: downloading
          layer 1/1: 8947560: downloading
          flattening image
          layer 1/1: 8947560: listing
          validating tarball members
          resolving whiteouts
          layer 1/1: 8947560: extracting
          image arch: arm (may not match host arm64/v8)

       Download the same image and place it in /tmp/buster:

          $ ch-image pull debian:buster /tmp/buster
          [...]
          $ ls /tmp/buster
          bin   dev  home  lib64  mnt  proc  run   srv  tmp  var
          boot  etc  lib   media  opt  root  sbin  sys  usr

PUSH

       Push the image described by the image reference IMAGE_REF from the local filesystem to a
       repository.

   Synopsis
          $ ch-image [...] push [--image DIR] IMAGE_REF [DEST_REF]

       See the FAQ for the gory details on specifying image references.

   Description
       Destination:

          DEST_REF
                 If specified, use this as the destination image reference, rather than
                 IMAGE_REF. This lets you push to a repository without permanently adding a tag
                 to the image.

       Options:

          --image DIR
                 Use the unpacked image located at DIR rather than an image in the storage
                 directory named IMAGE_REF.

       Because Charliecloud is fully unprivileged, the owner and group of files in its images are
       not meaningful in the broader ecosystem. Thus, when pushed, everything in the image is
       flattened to user:group root:root. Also, setuid/setgid bits are removed, to avoid
       surprises if the image is pulled by a privileged container implementation.
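
       The flattening described above maps naturally onto a tarfile filter. The sketch below
       is an assumption about how one might implement it, not the code ch-image uses;
       flatten_owner() is a hypothetical name.

```python
import stat
import tarfile


def flatten_owner(member):
    """Set owner and group to root:root and clear setuid/setgid bits."""
    member.uid = 0
    member.gid = 0
    member.uname = "root"
    member.gname = "root"
    member.mode &= ~(stat.S_ISUID | stat.S_ISGID)
    return member
```

       Such a function can be supplied as the filter argument of TarFile.add() when
       building a layer tarball from an image directory.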

   Examples
       Push a local image to the registry example.com:5000 at path /foo/bar with tag latest. Note
       that in this form, the local image must be named to match that remote reference.

          $ ch-image push example.com:5000/foo/bar:latest
          pushing image:   example.com:5000/foo/bar:latest
          layer 1/1: gathering
          layer 1/1: preparing
          preparing metadata
          starting upload
          layer 1/1: a1664c4: checking if already in repository
          layer 1/1: a1664c4: not present, uploading
          config: 89315a2: checking if already in repository
          config: 89315a2: not present, uploading
          manifest: uploading
          cleaning up
          done

       Same, except use local image alpine:3.9. In this form, the local image name does not have
       to match the destination reference.

          $ ch-image push alpine:3.9 example.com:5000/foo/bar:latest
          pushing image:   alpine:3.9
          destination:     example.com:5000/foo/bar:latest
          layer 1/1: gathering
          layer 1/1: preparing
          preparing metadata
          starting upload
          layer 1/1: a1664c4: checking if already in repository
          layer 1/1: a1664c4: not present, uploading
          config: 89315a2: checking if already in repository
          config: 89315a2: not present, uploading
          manifest: uploading
          cleaning up
          done

       Same, except use unpacked image located at /var/tmp/image rather than an image in ch-image
       storage. (Also, the sole layer is already present in the remote registry, so we don’t
       upload it again.)

          $ ch-image push --image /var/tmp/image example.com:5000/foo/bar:latest
          pushing image:   example.com:5000/foo/bar:latest
          image path:      /var/tmp/image
          layer 1/1: gathering
          layer 1/1: preparing
          preparing metadata
          starting upload
          layer 1/1: 892e38d: checking if already in repository
          layer 1/1: 892e38d: already present
          config: 546f447: checking if already in repository
          config: 546f447: not present, uploading
          manifest: uploading
          cleaning up
          done

RESET

          $ ch-image [...] reset

       Delete all images and cache from ch-image builder storage.

STORAGE-PATH

          $ ch-image [...] storage-path

       Print the storage directory path and exit.

ENVIRONMENT VARIABLES

       CH_IMAGE_USERNAME, CH_IMAGE_PASSWORD
              Username and password for registry authentication. See important caveats in section
              “Authentication” above.

       CH_LOG_FILE
              If set, append log chatter to this file, rather than standard error. This is useful
              for debugging situations where standard error is consumed or lost.

              Also sets verbose mode if not already set (equivalent to --verbose).

       CH_LOG_FESTOON
              If set, prepend PID and timestamp to logged chatter.

REPORTING BUGS

       If Charliecloud was obtained from your Linux distribution, use your distribution’s bug
       reporting procedures.

       Otherwise, report bugs to: https://github.com/hpc/charliecloud/issues

COPYRIGHT

       2014–2022, Triad National Security, LLC and others