4. Charliecloud command reference

This section is a comprehensive description of the usage and arguments of the Charliecloud commands. Its content is identical to the commands’ man pages.

4.1. ch-build

Wrapper for docker build that works around some of its annoying behaviors.

4.1.1. Synopsis

$ ch-build -t TAG [ARGS ...] CONTEXT

4.1.2. Description

Build a Docker image named TAG from the Dockerfile ./Dockerfile, or from the file given with --file. This is a wrapper for docker build with various enhancements.

Sudo privileges are required to run the docker command.

Arguments:

--file
Dockerfile to use (default: ./Dockerfile)
-t
name (tag) of Docker image to build
--help
print help and exit
--version
print version and exit

Additional arguments are accepted and passed unchanged to docker build.

4.1.3. Improvements over plain docker build

ch-build adds the following features to docker build:

  • If there is a file Dockerfile in the current working directory and -f is not already specified, add -f $PWD/Dockerfile.
  • Pass the HTTP proxy environment variables through with --build-arg.
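The two additions above can be sketched as a shell function that computes the extra arguments for docker build. This is an illustration only, not the actual ch-build source: the function name extra_build_args is hypothetical, and the list of proxy variable names is an assumption, since the text says only "the HTTP proxy environment variables".

```shell
# Print, one per line, the extra arguments a ch-build-like wrapper might
# add to "docker build". The caller's own arguments are passed as "$@".
extra_build_args() {
    have_file=no
    for arg in "$@"; do
        case $arg in
            -f|-f?*|--file|--file=*) have_file=yes ;;
        esac
    done
    # Add -f only if the caller gave none and ./Dockerfile exists.
    if [ "$have_file" = no ] && [ -f "$PWD/Dockerfile" ]; then
        printf '%s\n' -f "$PWD/Dockerfile"
    fi
    # Assumed variable names; pass each non-empty one through.
    for var in http_proxy https_proxy no_proxy HTTP_PROXY HTTPS_PROXY NO_PROXY; do
        eval "val=\${$var-}"
        if [ -n "$val" ]; then
            printf '%s\n' --build-arg "$var=$val"
        fi
    done
}
```

The function only prints arguments and never invokes docker, so it can be tried without Docker installed or sudo privileges.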

Note

The suffix :latest is somewhat misleading, as neither ch-build nor bare docker build will notice if the base FROM image has been updated. Use --no-cache to make sure you have the latest base image, at the cost of rebuilding every layer.

4.1.4. Examples

Create a Docker image tagged foo and specified by the file Dockerfile located in the current working directory. Use /bar as the Docker context directory:

$ ch-build -t foo /bar

Equivalent to above:

$ ch-build -t foo --file=./Dockerfile /bar

Instead, use the Dockerfile /baz/qux.docker:

$ ch-build -t foo --file=/baz/qux.docker /bar

Note that calling your Dockerfile anything other than Dockerfile will confuse people.

4.2. ch-build2dir

Build a Charliecloud image from Dockerfile and unpack it.

4.2.1. Synopsis

$ ch-build2dir CONTEXT DEST [ARGS ...]

4.2.2. Description

Build a Docker image as specified by the file Dockerfile in the current working directory and context directory CONTEXT. Unpack it in DEST.

Sudo privileges are required to run the docker command.

This runs ch-build, ch-docker2tar, and ch-tar2dir in sequence, but provides less flexibility than running the commands individually.

Arguments:

CONTEXT
Docker context directory
DEST
directory in which to place image tarball and directory
ARGS
additional arguments passed to ch-build
--help
print help and exit
--version
print version and exit
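As a rough illustration of the three-step sequence, the following dry-run sketch prints the commands rather than running them. The function name build2dir_plan is hypothetical. How ch-build2dir chooses the image tag is not stated here, so the sketch takes the tag as an explicit first parameter. The tarball name DEST/TAG.tar.gz follows the naming shown in the ch-docker2tar example.

```shell
# Print the commands equivalent to one ch-build2dir run (dry run only).
# Usage: build2dir_plan TAG CONTEXT DEST [ARGS ...]
build2dir_plan() {
    if [ $# -lt 3 ]; then
        echo "usage: build2dir_plan TAG CONTEXT DEST [ARGS ...]" >&2
        return 2
    fi
    tag=$1
    context=$2
    dest=$3
    shift 3
    extra=
    if [ $# -gt 0 ]; then
        extra=" $*"    # additional arguments go to ch-build
    fi
    printf 'ch-build -t %s%s %s\n' "$tag" "$extra" "$context"
    printf 'ch-docker2tar %s %s\n' "$tag" "$dest"
    printf 'ch-tar2dir %s/%s.tar.gz %s\n' "$dest" "$tag" "$dest"
}
```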

4.3. ch-docker2tar

Flatten a Docker image into a Charliecloud image tarball.

4.3.1. Synopsis

$ ch-docker2tar IMAGE OUTDIR

4.3.2. Description

Flattens the Docker image tagged IMAGE into a Charliecloud tarball in directory OUTDIR.

Sudo privileges are required to run docker export.

Additional arguments:

--help
print help and exit
--version
print version and exit

4.3.3. Example

$ ch-docker2tar hello /var/tmp
57M /var/tmp/hello.tar.gz
$ ls -lh /var/tmp
-rw-r-----  1 reidpr reidpr  57M Feb 13 16:14 hello.tar.gz

4.4. ch-docker-run

Run a command in a Docker container.

4.4.1. Synopsis

$ ch-docker-run [-i] [-b HOSTDIR:GUESTDIR ...] TAG CMD [ARGS ...]

4.4.2. Description

Runs the command CMD in a Docker container using the image named TAG.

Sudo privileges are required for docker run.

CMD is run under your user ID. The users and groups inside the container match those on the host.

Note

This command is intended as a convenience for debugging images and Charliecloud. Routine use for running applications is not recommended. Instead, use ch-run.

Arguments:

-i
run interactively with a pseudo-TTY
-b
bind-mount HOSTDIR at GUESTDIR inside the container (can be repeated)
--help
print help and exit
--version
print version and exit

4.5. ch-fromhost

Inject files from the host into an image directory.

4.5.1. Synopsis

$ ch-fromhost [OPTION ...] (-c CMD | -f FILE | --nvidia ...) IMGDIR

4.5.2. Description

Inject files from the host into the Charliecloud image directory IMGDIR.

The purpose of this command is to provide host-specific files, such as GPU libraries, to a container. It should be run after ch-tar2dir and before ch-run. After invocation, the image is no longer portable to other hosts.

Injection is not atomic; if an error occurs partway through injection, the image is left in an undefined state. Injection is currently implemented using a simple file copy, but that may change in the future.

By default, file paths that contain the string /bin are assumed to be executables and are placed in /usr/bin within the container. File paths that contain the strings /lib or .so are assumed to be shared libraries and are placed in the first-priority directory reported by ldconfig. Other files are placed in the directory specified by --dest.

You can see where shared libraries will go with:

$ ch-run $IMG -- ldconfig -v 2>/dev/null | grep -E '^/' | cut -d: -f1 | head -1
/usr/local/lib

If any shared libraries are injected, run ldconfig inside the container (using ch-run -w) after injection.
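The default inference rules described above can be sketched as a small classifier. This is not the ch-fromhost implementation. The function name infer_dest is hypothetical, and the order in which the rules are tried (executables first, then libraries) is an assumption for paths that match more than one rule.

```shell
# Print the in-image destination directory for one host file path.
#   $1  host path
#   $2  library directory, e.g. the first one reported by ldconfig
#   $3  value of --dest, or empty if not given
infer_dest() {
    case $1 in
        */bin*)
            printf '%s\n' /usr/bin ;;      # assumed to be an executable
        */lib*|*.so*)
            printf '%s\n' "$2" ;;          # assumed to be a shared library
        *)
            if [ -n "$3" ]; then
                printf '%s\n' "$3"
            else
                echo "cannot infer destination for $1; --dest required" >&2
                return 1
            fi ;;
    esac
}
```

For example, infer_dest /usr/lib64/libfoo.so /usr/local/lib "" prints /usr/local/lib, while /etc/quux is placed only if a --dest value is supplied.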

4.5.3. Options

To specify which files to inject:

-c, --cmd CMD
Inject files listed in the standard output of command CMD.
-f, --file FILE
Inject files listed in the file FILE.
--nvidia
Use nvidia-container-cli list (from libnvidia-container) to find executables and libraries to inject.

These can be repeated, and at least one must be specified.

Additional arguments:

-d, --dest DST
Place files whose destination cannot be inferred in directory IMGDIR/DST. If such a file is found and this option is not specified, exit with an error.
-h, --help
Print help and exit.
--no-infer
Do not infer the type of any files.
-v, --verbose
List the injected files.
--version
Print version and exit.

4.5.4. Notes

Symbolic links are dereferenced, i.e., the files pointed to are injected, not the links themselves.

As a corollary, do not include symlinks to shared libraries. These will be re-created by ldconfig.

There are two alternate approaches for nVidia GPU libraries:

  1. Link libnvidia-container into ch-run and call the library functions directly. However, this would mean that Charliecloud would either (a) need to be compiled differently on machines with and without nVidia GPUs or (b) have libnvidia-container available even on machines without nVidia GPUs. Neither of these is consistent with Charliecloud’s philosophies of simplicity and minimal dependencies.
  2. Use nvidia-container-cli configure to do the injecting. This would require that containers have a half-started state, where the namespaces are active and everything is mounted but pivot_root(2) has not been performed. This is not feasible because Charliecloud has no notion of a half-started container.

Further, while these alternate approaches would simplify or eliminate this script for nVidia GPUs, they would not solve the problem for other situations.

4.5.5. Bugs

File paths may not contain newlines.

4.5.6. Examples

Place shared library /usr/lib64/libfoo.so at path /usr/lib/libfoo.so within the image /var/tmp/baz and executable /bin/bar at path /usr/bin/bar. Then, create appropriate symlinks to libfoo and update the ld.so cache.

$ cat qux.txt
/bin/bar
/usr/lib64/libfoo.so
$ ch-fromhost --file qux.txt /var/tmp/baz

Same as above:

$ ch-fromhost --cmd 'cat qux.txt' /var/tmp/baz

Same as above, and also place file /etc/quux at /etc/quux within the container:

$ cat corge.txt
/bin/bar
/etc/quux
/usr/lib64/libfoo.so
$ ch-fromhost --file corge.txt --dest /etc /var/tmp/baz

Inject the executables and libraries recommended by nVidia into the image, and then run ldconfig:

$ ch-fromhost --nvidia /var/tmp/baz

4.6. ch-run

Run a command in a Charliecloud container.

4.6.1. Synopsis

$ ch-run [OPTION...] NEWROOT CMD [ARG...]

4.6.2. Description

Run command CMD in a Charliecloud container using the flattened and unpacked image directory located at NEWROOT.

-b, --bind=SRC[:DST]
mount SRC at guest DST (default /mnt/0, /mnt/1, etc.)
-c, --cd=DIR
initial working directory in container
-g, --gid=GID
run as group GID within container
-j, --join
use the same container (namespaces) as peer ch-run invocations
--join-ct=N
number of ch-run peers (implies --join; default: see below)
--join-tag=TAG
label for ch-run peer group (implies --join; default: see below)
--no-home
do not bind-mount your home directory (by default, your home directory is mounted at /home/$USER in the container)
-t, --private-tmp
use container-private /tmp (by default, /tmp is shared with the host)
-u, --uid=UID
run as user UID within container
-v, --verbose
be more verbose (debug if repeated)
-w, --write
mount image read-write (by default, the image is mounted read-only)
-?, --help
print help and exit
--usage
print a short usage message and exit
-V, --version
print version and exit
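To illustrate the SRC[:DST] form accepted by --bind, the sketch below resolves one bind specification to a source and destination. The function name bind_target is hypothetical. The numbering rule is an assumption: the sketch takes N in the default /mnt/N to be the zero-based position of the --bind option on the command line, which the text above does not state explicitly.

```shell
# Print "SRC DST" for one --bind specification.
#   $1  the SRC[:DST] argument
#   $2  zero-based position of this --bind on the command line (assumed rule)
bind_target() {
    case $1 in
        *:*) printf '%s %s\n' "${1%%:*}" "${1#*:}" ;;   # explicit destination
        *)   printf '%s %s\n' "$1" "/mnt/$2" ;;          # default destination
    esac
}
```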

4.6.3. Host files and directories available in container via bind mounts

In addition to any directories specified by the user with --bind, ch-run has standard host files and directories that are bind-mounted in as well.

The following host files and directories are bind-mounted at the same location in the container. These cannot be disabled.

  • /dev
  • /etc/passwd
  • /etc/group
  • /etc/hosts
  • /etc/resolv.conf
  • /proc
  • /sys

Three additional bind mounts can be disabled by the user:

  • Your home directory (i.e., $HOME) is mounted at guest /home/$USER by default. This is accomplished by mounting a new tmpfs at /home, which hides any image content under that path. If --no-home is specified, neither of these things happens and the image’s /home is exposed unaltered.
  • /tmp is shared with the host by default. If --private-tmp is specified, a new tmpfs is mounted on the guest’s /tmp instead.
  • If file /usr/bin/ch-ssh is present in the image, it is over-mounted with the ch-ssh binary in the same directory as ch-run.

4.6.4. Multiple processes in the same container with --join

By default, different ch-run invocations use different user and mount namespaces (i.e., different containers). While this has no impact on sharing most resources between invocations, there are a few important exceptions. These include:

  1. ptrace(2), used by debuggers and related tools. One can attach a debugger to processes in descendant namespaces, but not sibling namespaces. The practical effect of this is that (without --join), you can’t run a command with ch-run and then attach to it with a debugger also run with ch-run.
  2. Cross-memory attach (CMA) is used by cooperating processes to communicate by simply reading and writing one another’s memory. This is also not permitted between sibling namespaces. This affects various MPI implementations that use CMA to pass messages between ranks on the same node, because it’s faster than traditional shared memory.

--join is designed to address this by placing related ch-run commands (the “peer group”) in the same container. This is done by one of the peers creating the namespaces with unshare(2) and the others joining with setns(2).

To do so, we need to know the number of peers and a name for the group. These are specified by additional arguments that can (hopefully) be left at default values in most cases:

  • --join-ct sets the number of peers. The default is the value of the first of the following environment variables that is defined: OMPI_COMM_WORLD_LOCAL_SIZE, SLURM_STEP_TASKS_PER_NODE, SLURM_CPUS_ON_NODE.
  • --join-tag sets the tag that names the peer group. The default is environment variable SLURM_STEP_ID, if defined; otherwise, the PID of ch-run’s parent. Tags can be re-used for peer groups that start at different times, i.e., once all peer ch-run have replaced themselves with the user command, the tag can be re-used.
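The default selection described in the two items above can be sketched in shell as follows. The function names are hypothetical, and the sketch returns the variable's raw value without interpreting it. What ch-run does when none of the three variables is defined is not stated here, so the sketch simply returns a non-zero status in that case. $PPID stands in for the PID of ch-run's parent.

```shell
# Print the default peer count: the first of these variables that is defined.
default_join_ct() {
    for var in OMPI_COMM_WORLD_LOCAL_SIZE SLURM_STEP_TASKS_PER_NODE SLURM_CPUS_ON_NODE; do
        if eval "[ -n \"\${$var+set}\" ]"; then
            eval "printf '%s\n' \"\$$var\""
            return 0
        fi
    done
    return 1    # nothing defined; behavior here is an assumption
}

# Print the default peer group tag: SLURM_STEP_ID if defined, else parent PID.
default_join_tag() {
    if [ -n "${SLURM_STEP_ID+set}" ]; then
        printf '%s\n' "$SLURM_STEP_ID"
    else
        printf '%s\n' "$PPID"
    fi
}
```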

Caveats:

  • One cannot currently add peers after the group has started, for example, to attach a debugger that was not planned for at launch. (This is only required for code with bugs and is thus an unusual use case.)
  • ch-run instances race. The winner of this race sets up the namespaces, and the other peers use the winner to find the namespaces to join. Therefore, if the user command of the winner exits, any remaining peers will not be able to join the namespaces, even if they are still active. There is currently no general way to specify which ch-run should be the winner.
  • IPC resources such as semaphores and shared memory segments will be leaked if --join-ct is too high, if the winning ch-run’s user command exits before all peers join, or if ch-run itself crashes. These appear as files in /dev/shm/ and can be removed with rm(1).
  • Many of the arguments given to the race losers, such as the image path and --bind, will be ignored in favor of what was given to the winner.

4.6.5. Environment variables

ch-run generally tries to leave environment variables unchanged, but in some cases, guests can be significantly broken unless environment variables are tweaked. This section lists those changes.

  • $HOME: If the path to your home directory is not /home/$USER on the host, then an inherited $HOME will be incorrect inside the guest. This confuses some software, such as Spack.

    Thus, we change $HOME to /home/$USER, unless --no-home is specified, in which case it is left unchanged.

  • $PATH: Newer Linux distributions replace some root-level directories, such as /bin, with symlinks to their counterparts in /usr.

    Some of these distributions (e.g., Fedora 24) have also dropped /bin from the default $PATH. This is a problem when the guest OS does not have a merged /usr (e.g., Debian 8 “Jessie”). Thus, we add /bin to $PATH if it’s not already present.
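The two adjustments in this section can be sketched as shell functions. These are illustrations, not ch-run's implementation (ch-run is not a shell script). The function names are hypothetical, and appending /bin to the end of $PATH, rather than prepending it, is an assumption, since the text says only that /bin is added.

```shell
# Print the given PATH value with /bin added if it is not already a component.
path_with_bin() {
    case ":$1:" in
        ::)        printf '%s\n' /bin ;;        # empty PATH
        *:/bin:*)  printf '%s\n' "$1" ;;        # already present
        *)         printf '%s\n' "$1:/bin" ;;   # append (assumed position)
    esac
}

# Print the HOME value for the guest.
#   $1  user name   $2  inherited HOME   $3  "yes" if --no-home was given
guest_home() {
    if [ "$3" = yes ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "/home/$1"
    fi
}
```

Note that the check is per component: a $PATH of /usr/bin contains the substring /bin but not the directory /bin, so /bin is still added.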

4.6.6. Examples

Run the command echo hello inside a Charliecloud container using the unpacked image at /data/foo:

$ ch-run /data/foo -- echo hello
hello

Run an MPI job that can use CMA to communicate:

$ srun ch-run --join /data/foo -- bar

4.7. ch-ssh

Run a remote command in a Charliecloud container.

4.7.1. Synopsis

$ CH_RUN_ARGS="NEWROOT [ARG...]"
$ ch-ssh [OPTION...] HOST CMD [ARG...]

4.7.2. Description

Runs command CMD in a Charliecloud container on remote host HOST. Use the content of environment variable CH_RUN_ARGS as the arguments to ch-run on the remote host.

Note

Words in CH_RUN_ARGS are delimited by spaces only; it is not shell syntax.
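The practical consequence is that quotes do not group words. The sketch below mimics space-only splitting in shell to show the effect. The function name split_words is hypothetical, and collapsing runs of consecutive spaces is an assumption about ch-ssh's behavior.

```shell
# Print each space-delimited word of $1 in brackets, on a single line.
# The subshell body keeps the IFS and globbing changes local.
split_words() (
    IFS=' '
    set -f                      # no filename expansion on the words
    for w in $1; do
        printf '[%s]' "$w"
    done
    printf '\n'
)
```

With the value --cd '/my dir' /data/foo, this yields four words, [--cd]['/my][dir'][/data/foo], not three: the quote characters are passed through literally. Paths containing spaces therefore cannot be expressed in CH_RUN_ARGS.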

4.7.3. Example

On host bar.example.com, run the command hostname inside a Charliecloud container using the unpacked image at /data/foo with starting directory /baz:

$ hostname
foo
$ export CH_RUN_ARGS='--cd /baz /data/foo'
$ ch-ssh bar.example.com -- hostname
bar

4.8. ch-tar2dir

Unpack an image tarball into a directory.

4.8.1. Synopsis

$ ch-tar2dir TARBALL DIR

4.8.2. Description

Extract the tarball TARBALL into a subdirectory of DIR. TARBALL must contain a Linux filesystem image, e.g. as created by ch-docker2tar.

Inside DIR, a subdirectory will be created whose name corresponds to the name of the tarball with the .tar.gz suffix removed. If such a directory exists already and appears to be a Charliecloud container image, it is removed and replaced. If the existing directory doesn’t appear to be a container image, the script aborts with an error.
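The naming rule above can be sketched with basename(1). The function name image_dir is hypothetical, and the sketch covers only the name computation, not the check for an existing image or the unpacking itself.

```shell
# Print the directory an image tarball would be unpacked into.
#   $1  path to TARBALL (name ending in .tar.gz)
#   $2  DIR
image_dir() {
    # basename strips the leading directories and the .tar.gz suffix.
    printf '%s/%s\n' "${2%/}" "$(basename "$1" .tar.gz)"
}
```

For example, image_dir /var/tmp/hello.tar.gz /var/tmp prints /var/tmp/hello.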

Additional arguments:

--help
print help and exit
--verbose
be more verbose
--version
print version and exit

Warning

Placing DIR on a shared file system can cause significant metadata load on the file system servers. This can result in poor performance for you and all your colleagues who use the same file system. Please consult your site admin for a suitable location.

4.8.3. Example

$ ls -lh /var/tmp
total 57M
-rw-r-----  1 reidpr reidpr  57M Feb 13 16:14 hello.tar.gz
$ ch-tar2dir /var/tmp/hello.tar.gz /var/tmp
creating new image /var/tmp/hello
/var/tmp/hello unpacked ok
$ ls -lh /var/tmp
total 57M
drwxr-x--- 22 reidpr reidpr 4.0K Feb 13 16:29 hello
-rw-r-----  1 reidpr reidpr  57M Feb 13 16:14 hello.tar.gz