Provided by: charliecloud-builders_0.37-1build1_amd64 

NAME
ch-fromhost - Inject files from the host into an image directory, with various magic
SYNOPSIS
$ ch-fromhost [OPTION ...] [FILE_OPTION ...] IMGDIR
DESCRIPTION
NOTE:
This command is experimental. Features may be incomplete and/or buggy. Please report any issues you
find, so we can fix them!
Inject files from the host into the Charliecloud image directory IMGDIR.
The purpose of this command is to inject the host files a container needs in order to access
host-specific resources, usually GPUs or proprietary interconnects. It is not a general copy-to-image tool; see
further discussion on use cases below.
It should be run after ch-convert and before ch-run. After invocation, the image is no longer
portable to other hosts.
Injection is not atomic; if an error occurs partway through injection, the image is left in an undefined
state and should be re-unpacked from storage. Injection is currently implemented using a simple file
copy, but that may change in the future.
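Because a failed injection leaves the image in an undefined state, a script can pair each injection with a
fresh unpack on failure. The following is only a sketch under stated assumptions: the function name
inject_or_reunpack and the CH_FROMHOST and CH_CONVERT override variables are hypothetical (they are not
part of Charliecloud), and the image is assumed to have been unpacked from an archive that ch-convert can
read.

```shell
# Hypothetical wrapper (not part of Charliecloud): try an injection and, if it
# fails, discard the image directory and unpack it again from the archive.
# CH_FROMHOST and CH_CONVERT are illustrative overrides for the two commands.
inject_or_reunpack () {
    archive=$1
    imgdir=$2
    shift 2
    # Remaining arguments are passed to ch-fromhost ahead of IMGDIR.
    if "${CH_FROMHOST:-ch-fromhost}" "$@" "$imgdir"; then
        return 0
    fi
    echo "injection failed; re-unpacking $imgdir from $archive" >&2
    rm -rf -- "$imgdir"
    "${CH_CONVERT:-ch-convert}" "$archive" "$imgdir"
    # Report failure either way: the image is clean again but nothing was injected.
    return 1
}
```

With a hypothetical archive path, this could be called as inject_or_reunpack /var/tmp/ompi.tar.gz /tmp/ompi
--nvidia. A non-zero return means the caller should not proceed to ch-run expecting the injected files.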
Arbitrary file and libfabric injection are handled differently.
Arbitrary files
Arbitrary file paths that contain the strings /bin or /sbin are assumed to be executables and placed in
/usr/bin within the container. Paths that are not loadable libfabric providers and contain the strings
/lib or .so are assumed to be shared libraries and are placed in the first-priority directory reported by
ldconfig (see --lib-path below). Other files are placed in the directory specified by --dest.
If any shared libraries are injected, run ldconfig inside the container (using ch-run -w) after
injection.
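The path heuristic described above can be modeled in a few lines of shell. This is a sketch for reasoning
about where a file is likely to land, not the script's actual code: the function name classify_path is
hypothetical, and the order of the tests is an assumption taken from the order of the prose.

```shell
# Hypothetical model of the destination heuristic described above; the real
# script may test paths differently or in a different order.
classify_path () {
    case $1 in
        */bin*|*/sbin*) echo executable ;;   # placed in /usr/bin
        *-fi.so)        echo provider ;;     # loadable libfabric provider
        */lib*|*.so*)   echo library ;;      # placed in the --print-lib directory
        *)              echo other ;;        # needs an explicit --dest
    esac
}
```

For example, classify_path /etc/quux prints "other", which corresponds to the case where --dest must be
given or the command exits with an error.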
Libfabric
MPI implementations have numerous ways of communicating messages over interconnects. We use libfabric
(OFI), an OpenFabrics framework that exports fabric communication services to applications, to manage
these communications with built-in, or loadable, fabric providers.
• https://ofiwg.github.io/libfabric
• https://ofiwg.github.io/libfabric/v1.14.0/man/fi_provider.3.html
Using OFI, we can (a) uniformly manage fabric communication services for both OpenMPI and MPICH, and (b)
use simplified methods of accessing proprietary host hardware, e.g., Cray’s Gemini/Aries and Slingshot
(CXI).
OFI providers implement the application-facing software interfaces needed to access network-specific
protocols, drivers, and hardware. Loadable providers, i.e., compiled OFI libraries that end in -fi.so,
for example, Cray’s libgnix-fi.so, can be copied into, and used by, an image with an MPI configured
against OFI. Alternatively, the image’s libfabric.so can be overwritten with the host’s. See details and
quirks below.
OPTIONS
To specify which files to inject
-c, --cmd CMD
Inject files listed in the standard output of command CMD.
-f, --file FILE
Inject files listed in the file FILE.
-p, --path PATH
Inject the file at PATH.
--cray-cxi
Inject Cray libfabric for Slingshot. This is equivalent to --path $CH_FROMHOST_OFI_CXI, where
$CH_FROMHOST_OFI_CXI is the path to the Cray host libfabric, libfabric.so.
--cray-gni
Inject the Cray Gemini/Aries GNI libfabric provider, libgnix-fi.so. This is equivalent to
--fi-provider $CH_FROMHOST_OFI_GNI, where $CH_FROMHOST_OFI_GNI is the path to the Cray host uGNI
provider, libgnix-fi.so.
--nvidia
Use nvidia-container-cli list (from libnvidia-container) to find executables and libraries to
inject.
These can be repeated, and at least one must be specified.
To specify the destination within the image
-d, --dest DST
Place files specified later in directory IMGDIR/DST, overriding the inferred destination, if
any. If a file’s destination cannot be inferred and --dest has not been specified, exit with an
error. This can be repeated to place files in varying destinations.
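Because --dest affects only the files named after it, argument order matters. The sketch below models that
rule for --path arguments only; the function name plan_destinations is hypothetical, and it is a simplified
model of the documented behavior rather than the script's own option parser.

```shell
# Hypothetical model: print "FILE -> DEST" for each --path, where DEST is the
# most recent --dest seen so far, or "(inferred)" if none has been given yet.
plan_destinations () {
    dest='(inferred)'
    while [ $# -gt 0 ]; do
        case $1 in
            -d|--dest)
                [ $# -ge 2 ] || return 2   # option is missing its argument
                dest=$2
                shift 2 ;;
            -p|--path)
                [ $# -ge 2 ] || return 2
                echo "$2 -> $dest"
                shift 2 ;;
            *)
                shift ;;                   # ignore everything else, e.g. IMGDIR
        esac
    done
}
```

Calling plan_destinations --path /bin/bar --dest /etc --path /etc/quux /var/tmp/baz shows that only
/etc/quux is affected by --dest /etc; /bin/bar keeps its inferred destination.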
Additional arguments
--print-cray-fi
Print inferred destination for libfabric replacement.
--print-fi
Print the guest destination path for libfabric provider(s).
--print-lib
Print the guest destination path for shared libraries inferred as described above.
--no-ldconfig
Don’t run ldconfig even if we appear to have injected shared libraries.
-h, --help
Print help and exit.
-v, --verbose
Be more verbose about what is going on. Can be repeated.
--version
Print version and exit.
WARNING:
ldconfig often prints scary-looking warnings on stderr even when everything is going well. By default, we
suppress these, but you can see them with sufficient verbosity. For example:
$ ch-fromhost --print-lib /var/tmp/bullseye
/usr/local/lib
$ ch-fromhost -v --print-lib /var/tmp/bullseye
asking ldconfig for inferred shared library destination
inferred shared library destination: /var/tmp/bullseye//usr/local/lib
/usr/local/lib
$ ch-fromhost -v -v --print-lib /var/tmp/bullseye
asking ldconfig for inferred shared library destination
/sbin/ldconfig: Can't stat /usr/local/lib/x86_64-linux-gnu: No such file or directory
/sbin/ldconfig: Path `/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig: Path `/usr/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig: /lib/x86_64-linux-gnu/ld-2.31.so is the dynamic linker, ignoring
inferred shared library destination: /var/tmp/bullseye//usr/local/lib
/usr/local/lib
See issue #732 for an example of how this was confusing for users.
WHEN TO USE CH-FROMHOST
This command does a lot of heuristic magic; while it can copy arbitrary files into an image, this usage
is discouraged and prone to error. Here are some use cases and the recommended approach:
1. I have some files on my build host that I want to include in the image. Use the COPY instruction
within your Dockerfile. Note that it’s OK to build an image that meets your specific needs but isn’t
generally portable, e.g., only runs on specific micro-architectures you’re using.
2. I have an already built image and want to install a program I compiled separately into the image.
Consider whether building a new derived image with a Dockerfile is appropriate. Another good option
is to bind-mount the directory containing your program at run time. A less good option is to cp(1) the
program into your image, because this permanently alters the image in a non-reproducible way.
3. I have some shared libraries that I need in the image for functionality or performance, and they
aren’t available in a place where I can use COPY. This is the intended use case of ch-fromhost. You
can use --cmd, --file, --ofi, and/or --path to put together a custom solution. But, please consider
filing an issue so we can package your functionality with a tidy option like --nvidia.
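For use case 2, the bind-mount alternative can be sketched as follows. The function name run_from_hostdir
and the CH_RUN override variable are hypothetical, and the sketch assumes that /mnt/0 exists in the image
as a mount point and that the program's library dependencies are satisfied inside the container.

```shell
# Hypothetical helper: run a separately built program from a host directory
# without copying it into the image. CH_RUN is an illustrative override for
# the ch-run command; /mnt/0 is assumed to exist in the image.
run_from_hostdir () {
    hostdir=$1
    imgdir=$2
    prog=$3
    shift 3
    # Bind-mount the host directory at /mnt/0 and run the program from there.
    "${CH_RUN:-ch-run}" -b "$hostdir:/mnt/0" "$imgdir" -- "/mnt/0/$prog" "$@"
}
```

Unlike cp(1) or ch-fromhost, this leaves the image unmodified, so the same image can still be used on other
hosts.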
LIBFABRIC USAGE AND QUIRKS
The implementation of libfabric provider injection and replacement is experimental and has a couple
quirks.
1. Containers must have the following software installed:
a. libfabric (https://ofiwg.github.io/libfabric/). See charliecloud/examples/Dockerfile.libfabric.
b. Corresponding open source MPI implementation configured and built against the container libfabric,
e.g., MPICH or OpenMPI. See charliecloud/examples/Dockerfile.mpich and
charliecloud/examples/Dockerfile.openmpi.
2. At run time, a libfabric provider can be specified with the variable FI_PROVIDER. The path to search
for shared providers can be specified with FI_PROVIDER_PATH. These variables can be inherited from the
host or explicitly set with the container’s environment file /ch/environment via --set-env.
To avoid issues and reduce complexity, the inferred injection destination for libfabric providers and
replacement will always be the path in the image where libfabric.so is found.
3. The Cray GNI loadable provider, libgnix-fi.so, will link to compiler(s) in the programming environment
by default. For example, if it is built under the PrgEnv-intel programming environment, it will have
links to files at paths /opt/gcc and /opt/intel that ch-run will not bind automatically.
Managing all possible bind mount paths is untenable. Thus, this experimental implementation injects
libraries linked to a libgnix-fi.so built with the minimal modules necessary to compile, i.e.:
• modules
• craype-network-aries
• eproxy
• slurm
• cray-mpich
• craype-haswell
• craype-hugepages2M
A Cray GNI provider linked against a more complicated PE will still work, assuming 1) the user
explicitly bind-mounts the missing libraries listed in its ldd output, and 2) none of those libraries
conflict with container functionality, e.g., glibc.so, etc.
4. At the time of this writing, a Cray Slingshot optimized provider is not available; however, recent
libfabric source activity indicates there may be one at some point; see:
https://github.com/ofiwg/libfabric/pull/7839.
For now, on Cray systems with Slingshot (CXI), we need to overwrite the container’s libfabric.so with the
host’s using --path. See examples for details.
5. Tested only for C programs compiled with GCC. Additional bind mounts or kludging may be needed for
untested use cases. If you’d like to use another compiler or programming environment, please get in
touch so we can implement the necessary support.
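As a sketch of quirk 2, the two libfabric variables can be written to a small environment file and handed
to ch-run. The function name fi_env and the file name fi.env are hypothetical; the sketch assumes the
plain NAME=VALUE, one-per-line file form accepted by ch-run --set-env.

```shell
# Hypothetical helper: print an environment file that selects a libfabric
# provider. The second argument (provider search path) is optional.
fi_env () {
    printf 'FI_PROVIDER=%s\n' "$1"
    if [ -n "${2:-}" ]; then
        printf 'FI_PROVIDER_PATH=%s\n' "$2"
    fi
}
```

One possible use, assuming the provider was injected into /usr/local/lib/libfabric, is fi_env gni
/usr/local/lib/libfabric > fi.env followed by ch-run --set-env=fi.env /tmp/ompi -- fi_info -p gni.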
Please file a bug if we missed anything above or if you know how to make the code better.
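For quirk 3, the libraries that still need a bind mount can be read from ldd output. The filter below is a
sketch: the function name missing_libs is hypothetical, and it assumes the usual glibc ldd line format
"NAME => not found".

```shell
# Hypothetical filter: read ldd output on stdin and print the names of the
# libraries the loader could not resolve ("NAME => not found").
missing_libs () {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

Assuming ldd is available in the image, something like ch-run /tmp/ompi -- ldd
/usr/local/lib/libfabric/libgnix-fi.so | missing_libs would list the candidates. Each one then has to be
located on the host and its directory bind-mounted explicitly.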
NOTES
Symbolic links are dereferenced, i.e., the files pointed to are injected, not the links themselves.
As a corollary, do not include symlinks to shared libraries. These will be re-created by ldconfig.
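When a file list is generated automatically, symlinks can be filtered out before the list is passed to
--file. This is a minimal sketch; the function name drop_symlinks and the file names in the usage note are
hypothetical.

```shell
# Hypothetical filter: read one path per line on stdin and print only the
# paths that are not symbolic links.
drop_symlinks () {
    while IFS= read -r path; do
        [ -L "$path" ] || printf '%s\n' "$path"
    done
}
```

For example, drop_symlinks < all.txt > qux.txt produces a list suitable for ch-fromhost --file qux.txt
/var/tmp/baz.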
There are two alternate approaches for nVidia GPU libraries:
1. Link libnvidia-container into ch-run and call the library functions directly. However, this would
mean that Charliecloud would either (a) need to be compiled differently on machines with and
without nVidia GPUs or (b) have libnvidia-container available even on machines without nVidia
GPUs. Neither of these is consistent with Charliecloud’s philosophies of simplicity and minimal
dependencies.
2. Use nvidia-container-cli configure to do the injecting. This would require that containers have a
half-started state, where the namespaces are active and everything is mounted but pivot_root(2) has
not been performed. This is not feasible because Charliecloud has no notion of a half-started
container.
Further, while these alternate approaches would simplify or eliminate this script for nVidia GPUs, they
would not solve the problem for other situations.
BUGS
File paths may not contain colons or newlines.
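A script that assembles file lists can check for this limitation before calling ch-fromhost. The function
name path_ok is hypothetical; this is only a sketch of such a pre-check.

```shell
# Hypothetical pre-check: succeed only if the path contains neither a colon
# nor a newline.
path_ok () {
    nl='
'
    case $1 in
        *:*|*"$nl"*) return 1 ;;
        *)           return 0 ;;
    esac
}
```

For example, path_ok /usr/lib64/libfoo.so succeeds, while path_ok /opt/a:b fails.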
ldconfig tends to print stat errors; these are typically non-fatal and occur when trying to probe common
library paths. See issue #732.
EXAMPLES
libfabric
Cray Slingshot CXI injection.
Replace image libfabric, i.e., libfabric.so, with Cray host’s libfabric at host path
/opt/cray-libfabric/lib64/libfabric.so.
$ ch-fromhost -v --path /opt/cray-libfabric/lib64/libfabric.so /tmp/ompi
[ debug ] queueing files
[ debug ] cray libfabric: /opt/cray-libfabric/lib64/libfabric.so
[ debug ] searching image for inferred libfabric destiation
[ debug ] found /tmp/ompi/usr/local/lib/libfabric.so
[ debug ] adding cray libfabric libraries
[ debug ] skipping /lib64/libcom_err.so.2
[...]
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libcxi.so.1
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libcxi.so.1.2.1
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libjson-c.so.3
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libjson-c.so.3.0.1
[...]
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libssh.so.4
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libssh.so.4.7.4
[...]
[ debug ] inferred shared library destination: /tmp/ompi//usr/local/lib
[ debug ] injecting into image: /tmp/ompi/
[ debug ] mkdir -p /tmp/ompi//var/lib/hugetlbfs
[ debug ] mkdir -p /tmp/ompi//var/spool/slurmd
[ debug ] echo '/usr/lib64' >> /tmp/ompi//etc/ld.so.conf.d/ch-ofi.conf
[ debug ] /opt/cray-libfabric/lib64/libfabric.so -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libcxi.so.1 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libcxi.so.1.2.1 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libjson-c.so.3 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libjson-c.so.3.0.1 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libssh.so.4 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libssh.so.4.7.4 -> /usr/local/lib (inferred)
[ debug ] running ldconfig
[ debug ] ch-run -w /tmp/ompi/ -- /sbin/ldconfig
[ debug ] validating ldconfig cache
done
Same as above, except also inject Cray’s fi_info to verify Slingshot provider access.
$ ch-fromhost -v --path /opt/cray/libfabric/1.15.0.0/lib64/libfabric.so \
-d /usr/local/bin \
--path /opt/cray/libfabric/1.15.0.0/lib64/libfabric.so \
/tmp/ompi
[...]
$ ch-run /tmp/ompi/ -- fi_info -p cxi
provider: cxi
fabric: cxi
[...]
type: FI_EP_RDM
protocol: FI_PROTO_CXI
Cray GNI shared provider injection.
Add Cray host built GNI provider libgnix-fi.so to the image and verify with fi_info.
$ ch-fromhost -v --path /home/ofi/libgnix-fi.so /tmp/ompi
[ debug ] queueing files
[ debug ] libfabric shared provider: /home/ofi/libgnix-fi.so
[ debug ] searching /tmp/ompi for libfabric shared provider destination
[ debug ] found: /tmp/ompi/usr/local/lib/libfabric.so
[ debug ] inferred provider destination: //usr/local/lib/libfabric
[ debug ] injecting into image: /tmp/ompi
[ debug ] mkdir -p /tmp/ompi//usr/local/lib/libfabric
[ debug ] mkdir -p /tmp/ompi/var/lib/hugetlbfs
[ debug ] mkdir -p /tmp/ompi/var/opt/cray/alps/spool
[ debug ] mkdir -p /tmp/ompi/opt/cray/wlm_detect
[ debug ] mkdir -p /tmp/ompi/etc/opt/cray/wlm_detect
[ debug ] mkdir -p /tmp/ompi/opt/cray/udreg
[ debug ] mkdir -p /tmp/ompi/opt/cray/xpmem
[ debug ] mkdir -p /tmp/ompi/opt/cray/ugni
[ debug ] mkdir -p /tmp/ompi/opt/cray/alps
[ debug ] echo '/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/alps/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/udreg/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/ugni/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/wlm_detect/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/xpmem/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/usr/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] /home/ofi/libgnix-fi.so -> //usr/local/lib/libfabric (inferred)
[ debug ] running ldconfig
[ debug ] ch-run -w /tmp/ompi -- /sbin/ldconfig
[ debug ] validating ldconfig cache
done
$ ch-run /tmp/ompi -- fi_info -p gni
provider: gni
fabric: gni
[...]
type: FI_EP_RDM
protocol: FI_PROTO_GNI
Arbitrary
Place shared library /usr/lib64/libfoo.so at path /usr/lib/libfoo.so (assuming /usr/lib is the first
directory searched by the dynamic loader in the image), within the image /var/tmp/baz and executable
/bin/bar at path /usr/bin/bar. Then, create appropriate symlinks to libfoo and update the ld.so cache.
$ cat qux.txt
/bin/bar
/usr/lib64/libfoo.so
$ ch-fromhost --file qux.txt /var/tmp/baz
Same as above:
$ ch-fromhost --cmd 'cat qux.txt' /var/tmp/baz
Same as above:
$ ch-fromhost --path /bin/bar --path /usr/lib64/libfoo.so /var/tmp/baz
Same as above, but place the files into /corge instead (and the shared library will not be found by
ldconfig):
$ ch-fromhost --dest /corge --file qux.txt /var/tmp/baz
Same as above, and also place file /etc/quux at /etc/quux within the container:
$ ch-fromhost --file qux.txt --dest /etc --path /etc/quux /var/tmp/baz
Inject the executables and libraries recommended by nVidia into the image, and then run ldconfig:
$ ch-fromhost --nvidia /var/tmp/baz
asking ldconfig for shared library destination
/sbin/ldconfig: Can’t stat /libx32: No such file or directory
/sbin/ldconfig: Can’t stat /usr/libx32: No such file or directory
shared library destination: /usr/lib64//bind9-export
injecting into image: /var/tmp/baz
/usr/bin/nvidia-smi -> /usr/bin (inferred)
/usr/bin/nvidia-debugdump -> /usr/bin (inferred)
/usr/bin/nvidia-persistenced -> /usr/bin (inferred)
/usr/bin/nvidia-cuda-mps-control -> /usr/bin (inferred)
/usr/bin/nvidia-cuda-mps-server -> /usr/bin (inferred)
/usr/lib64/libnvidia-ml.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
/usr/lib64/libnvidia-cfg.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
[...]
/usr/lib64/libGLESv2_nvidia.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
/usr/lib64/libGLESv1_CM_nvidia.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
running ldconfig
ACKNOWLEDGEMENTS
This command was inspired by the similar Shifter feature that allows Shifter containers to use the Cray
Aries network. We particularly appreciate the help provided by Shane Canon and Doug Jacobsen during our
implementation of --cray-mpi.
We appreciate the advice of Ryan Olson at nVidia on implementing --nvidia.
REPORTING BUGS
If Charliecloud was obtained from your Linux distribution, use your distribution’s bug reporting
procedures.
Otherwise, report bugs to: https://github.com/hpc/charliecloud/issues
SEE ALSO
charliecloud(7)
Full documentation at: <https://hpc.github.io/charliecloud>
COPYRIGHT
2014–2023, Triad National Security, LLC and others
0.37 2024-04-01 05:37 UTC CH-FROMHOST(1)