Docker

Introduction

Docker is a company producing an open-source "software container platform", but the word is also used is also used to refer to the platform itself. Docker allows a user to collect all the code needed to run an application, in our case a grid simulation job, into an image. This image will contain everything from the OS-level utilities and libraries, to the lab UPS products, to the collaboration code, to our grid scripts and environmental variables. This image starts with a OS image, usually provided by the OS experts, and uploaded to the Docker repository. The user then adds layers of software to this base image. The computing division has added the art software layer to an OS image and uploaded this new, larger image to the repository. Mu2e can download the art image and add our own layers. This final mu2e image can be passed around and run on almost any platform with x86_64 architecture.

The Docker image can only be run on a machine with Docker software installed and the Docker daemon running. When the user issues a docker command, the daemon starts the image. The user's command can run in the image's environment, exit, and return control back to the user. The user may also run the image in an interactive mode, gain the image's prompt and issue commands like any linux command line. In effect, the user can fire up their own pseudo-virtual machine. The user's process will have a virtual working directory which can be destroyed, or saved, on exit from the container. File systems mounted on the host machine (your home directory, for example) can be mounted in the pseudo-VM and you can read and write permanently to the file system. This pseudo-VM is called a container because it is isolated in software from other copies of image that may be running.

The main point of the image/container model is that the user brings their entire software stack to the CPU, therefore the execution of the application is very reliable and reproducible. This model differs from the model of running several full virtual machines on the host in an important way. If we start a virtual machine, it requires a separate memory space and will load the shared libraries once in each virtual machine, quickly using up the memory. The Docker daemon knows if the users are running multiple containers for one image and shares as much memory as possible, such as the shared libraries, while keeping each container's data space strictly isolated. If you are running on a machine with 64 GB of memory and 64 cores, you may be able to run 64 containers of one Docker image, but you can't run 64 full virtual machines. Docker is also more CPU efficient than a full VM.

Installing Docker

The Docker software can be installed on Mac, Windows and Linux. SL6 has an older kernel which cannot implement all the features of Docker and limits the size of a Docker image to 10GB. Windows (>=10), MacOS (>=10.11), and SL7 are all suitable to run Docker. The code needs to be installed by admin/root because it will need a higher level of access to the kernel than a typical user application, and will run a system daemon. A few collaborators with access to the computing division machine called woof can access a docker server there. From 4/2019, Mu2e has its own docker build machine called mu2eimagegpvm01. This machine is for a few experts to build images - the intention is that most users will only be downloading those images to a laptop or other machine. New users must be added by hand, ask offline management for access. In 11/2020 the docker cache size was increased to 200 GB.

Installation on Mac OSX

Docker provides installation instructions. There is also a link to further documentation. Once installed, you should see the docker icon on your screen menubar. To test that docker is working, start up the Terminal app, and execute the following from inside a local window:

> docker run hello-world

You should get a welcome message with more ideas for testing.

To mount a local disk (ie make it available to the container), do the following

add the directory you want to mount to the docker import list.
- Select the 'File Sharing' tab in the docker Preferences menu
- Select add ("+"), and add the directory (eg /data)
- Save and relaunch docker
Run docker with your mount.
- > docker -v 'native_directory:local_directory ...

As an example, to mount the directory '/data' as '/data' in an interactive shell inside docker:

> docker run -it -v /data:/data bash
bash-5.0# ls /data
bash-5.0# exit
>

You should see the local contents of the directory from the docker shell. (to stop the interactive shell, type 'exit')

To get root to present native graphics when run inside a docker container you must allow network connections to XQuartz from docker. I found this to be a useful guide.

make sure you have a recent version of XQuartz installed
Enable network connections within XQuartz:
- start XQuartz
- select Preferences from the X11 menu
- select Security tab
- select both 'Authenticate connections' and 'Allow connections from network clients'
- restart XQuartz
To whitelist your local IP address, execute the following from inside a local Terminal screen:

> export ip=`ifconfig en0 | grep inet | awk '$1=="inet" {print $2}'` (you may need to tweak this line..)
> xhost + $ip

export the display when starting the docker image

> docker run -e DISPLAY=$ip:0 ...

Notes from Dave Brown: After that I was able to start root and get it to process and display a largish (50MB) .root file living in /data using a standard macro that's in Offline. I timed how long it took to run the macro native on my mac vs through the docker container (the results on the screen were indistinguishable):

docker: 3.42 seconds
native: 1.79 seconds

Finder says the docker image is 64GBytes on my hard drive. Docker claims it only uses 19GBytes By default docker will only use 2 CPUs on your machine. With hyperthreading there are often more available. You may get better performance by increasing the number of CPUs used by docker. This is done under the Advanced tab of the docker Preferences menu.

A note from Adam Lyon: I do development on my Mac with a Vagrant (Virtual Box) VM and not Docker. Docker i/o performance when accessing the Mac disk volume is atrocious. It's unusable - Docker knows about this problem but can't seem to fix it. With Vagrant, you can mount your Mac disk into the VM with nfs (it sets it all up for you) and it works great. Very fast i/o. You can keep your code and stuff on the Mac side and access it (eg compile) with the VM and it just works. If you try that with docker, it is so slow that you will hate it. I have a VM configuration at https://github.com/lyon-fnal/centos-gm2-dev/blob/master/README.md that I've been using for a long time. With VNC (described in the Readme) it is a very performant and nice to use.

Another note (Pengfei): The slowness of disk access in Mac is caused by osx fuse or osxfs. Turning on “cached” or “delegated" option for the bind-mount might help. The old work around for the slow disk access is to use NFS (e.g. docker-machine + VirtualBox + NFS). But you may not need to do that any more. Since March 2018, Docker for mac starts to support NFS. I personally haven’t tried that though. link

Intallation on Linux

Ubuntu

Docker has installation instructions here, which will take you through to running the hello world example.

There are also some post-installation tips here, that has some advice for e.g. not having to "sudo" every docker command.

From Andy: I had to do the following to get a TBrowser showing and to have files persist after I exit the container:

mkdir /path/to/new/data/
xhost +local:root
XSOCK=/tmp/.X11-unix ; sudo docker run --rm -it -v /path/to/your/machine/working_area:/home/working_area -v $XSOCK:$XSOCK -e DISPLAY=unix$DISPLAY mu2e/user:tutorial_1-00

Once finished I then do

xhost -local:root

to keep things secure

I found some useful info here

Other Linux

From Pasha: Linux command which includes mounting cvmfs and setting up xwindows

XSOCK=/tmp/.X11-unix/X0 ; docker run -it -v /projects:/projects -v /cvmfs:/cvmfs:shared -v $XSOCK:$XSOCK -u `id -u`:`id -u` -e TZ=America/Chicago mu2e

the setup allows to simply setup the executable mu2e and use it
`id -u`:`id -u` assumes that your id in the container is the same as on the host machine, which is very handy
confirmed working on a host running X11 (up to Fedora 35) , but not on Fedora Wayland. Wayland needs a different forwarding mechanism

Installation on Windows

The basic instructions are provided by Docker windows. There are requirements on Windows versions (Windows Home version won't work) and memory. You will need admin access. You will need an account on docker.com in order to download images.

After the installation, you can run run powershell or command as admin, then docker should be in your path. Do a "docker login" to access images on docker.com. (I found it would not take email/password, only username/password). At this point you can download and run images. There seems to be no easy way to avoid running as admin, but there were some not-easy ways available as web recipes. If you reboot, you will need to restart the Docker App, running it as admin. It may take a few minutes to start.

To run a GUI application, you have to set DISPLAY. Run command, and type "ipconfig". Record your I.P. address, (it looks like the from the "DockerNAT" stanza works) and use it to set DISPLAY:

set-variable -name DISPLAY -value 10.11.128.118:0.0
docker run -ti --rm -e DISPLAY=$DISPLAY <image>

After rebooting, I found that I got an error:

No protocol specified
Error: can't open display <ipaddress>

I am using the Xming x server and checked the Xming log by right clicking on the Xming icon in the tray and selecting "show log". I saw errors like :

rejecting request from IP <ipaddress>

I added the ip address to

C: \Program File (x86)\Xming\X0.hosts

In the Xming launch script you can also allow all connections, but that is considered a security risk.

To mount a local disk, you need to turn on sharing. Then, for example, to mount C:\Users\rlc\docker\data, add to the run command

-v c:/Users/rlc/docker/data:/data

I was not able to mount a local disk when starting a container, I suspect it failed for me because DOE has disabled disk sharing on my machine.

References

A cautionary tail

From Yongyi Wu, 6/25/19

A little background: I am running a 64-bit Windows 10 Pro build 14393.2214 (yes, I realized later that the newest Docker for Windows requires build 15063 or later, but this should only prevent an upgrade to the newest version after installation). I have a VMWare Workstaion installed that I use for work and Vt-d enabled.

After I first installed Docker, a prompt showed up and said hyper-V needs to be enabled for Docker to work, and a reboot is required. When I rebooted the computer, a blue screen error page showed up during the staring process. The computer automatically rebooted twice after that and each time accompanied by a blue screen error. I managed to boot the computer within the safe mode. Suspecting that there was something scrambled during the installation process I uninstalled Docker and rebooted. This time it rebooted normally.

Thinking it was just a momentary bad luck, I tried to install Docker again. The program was successfully installed again. No prompt for reboot appeared. However when I opened cmd/powershell and tried to verify docker version, the command was not recognized. I tried to reboot again, suspecting some virtualization functions need to be initialized at boot up. One more blue screen showed up but miraculously the next reboot initiated by Windows automatic recover succeeded. I opened powershell and was able to verify the success installation of Docker.

Little did I know that was not yet the end of the problem. When I reached a later part of the tutorial, I realized no standalone X Server was set up on my machine (I have a MobaXterm for daily quick SSH tasks but it does not provide Xterm for powershell). Thinking back on it now, if other people do not see the other defects I see later, trying to fire up Docker in the MobaXterm shell might be worth trying). I installed Xming, created its configuration file and configured the ip address for display. But the attempts to run ROOT TBrowser failed. I tried to fix it, trying things ranging from running powershell in admin mode to switching to VcXsrv for X Server. Nothing worked. Even worse was that I noticed the "blue screen of death (BSOD)"(Windows Log reports scores of event 7001/10005 within few seconds) persisted upon each boot, and my virtual machines in VMWare Workstation could not be opened any more due to an "Incompatibility with Device/Credential Guard". I got the sense that the BSOD was related to the Docker. At that time I did not found much information on it and I decided to give up on Docker, being afraid that it may damage other data in the computer.

BSOD disappeared with the uninstallation, but I freaked out when I found out the virtual machines were still unable to boot, even using the backup images. Microsoft has instructions on "Disable Windows Defender Credential Guard", which turns the feature off in computer configuration. I followed the instructions and even manually deleted the related registry keys, yet the problem persisted. In my search for solution, I came across an explanation that the issue was actually caused by VM's incompatibility with hyper-V. The solution was to use " bcdedit /set hypervisorlaunchtype off" in a command prompt in admin mode, and reboot afterwards. A new combination of search words "hyper-V" and "BSOD" also revealed the cause of the repeated BSOD's. Multiple incidences of BSOD related to hyper-V were reported. Some suggested the issue is tied to the fast boot feature of the system, but the actual cause remains vague. I spent the rest of the session working together on he tutorial with a colleague.

This is the end of my story, and here are the takeaways: 1) If your work involves using VMWare Workstation or any other software that is incompatible with hyper-V, using Docker is not a good idea, unless you are willing to switch on/off hyper-V each time you switch your working environment. 2) Even if you do not have the problem in 1), issues caused by hyper-V on Windows are quite common. And this issue is hard to avoid, since hyper-V is the key to run Docker. Although there is the work-around of using the Docker Toolbox (which is essentially a Docker VM in an Oracle VM) or installing Docker in a virtual Linux/MacOS, such solution is ugly and may cause severe delay/unstability. BSOD's related to these two methods also exist. I would suggest Windows users to ssh to a Linux machine that runs Docker whenever possible.

Installation From toolkit

These are old scratch notes..

On a windows 10 laptop. The basic download only works for Windows pro, which I didn't have. Switched to downloading Docker toolkit. You need to run as an admin in the command window. Running the first example

docker run hello-world

resulted in errors

docker: error during connect:  ...

I performed a cleanup found on blogs

docker-machine rm default
docker-machine create --driver virtualbox default
docker-machine env --shell cmd default
 (cut and paste printed commands)

and then the hello-world example worked.

docker run -it ubuntu bash

Running a Docker image

There are two basic ways to run a docker image. The first is create a container, run a single command and exit. An example is

docker run --rm <image> <command>
docker run --rm rlcfnal/test:v0 ls; pwd
docker run --rm rlcfnal/test:v0 bash -l -c "ups list -aK+"

The "--rm" says do not save the container after running the command - you want this for any simple single commands or the working disk space will fill up with old non-running containers. If the command prints anything, it will appear on your screen. A switch "-d" will run the container in the background. There seem to be many ways to use docker. One way is to save a container, then "commit" the container, and any new contents, as a new image. You can connect to a container running in the background, or before other actions on it.

You can also run an image interactively.

docker run -it <image>
docker run --rm -it -v /scratch/rlc/docker/v6_2_1-0:/builddir rlcfnal/test:v6_2_1-0

The "--rm" switch says do not save the container after exit. The "-it" says start the interactive container and connects it the terminal. The "-v" option is of the form /localdir:/imagedir and causes the /localdir to be mounted in the container as /imagedir. There doesn't seem to be an x connection since root -l fails to display.

Useful Docker Commands

See space left available for docker server

 docker system df

How to see and delete old containers:

docker ps -a
docker rm $(docker ps -a -q)
docker rm $(docker ps --filter=status=exited --filter=status=created -q)

Remove untagged images

 docker rmi $(docker images | awk '{if($2=="<none>") print $3}' )

Connect to a container that was exited, but not deleted. (Files created in the container will still be there.)

docker ps -a
docker start -ai <container hash or name>

set-variable -name DISPLAY -value 10.11.128.118:0.0
docker run -ti --rm -e DISPLAY=$DISPLAY <image>

Creating a Docker image

A Docker image is a file, like a sophisticated tarball, which contains all the code needed to run our executables. This image will contain everything from the OS-level utilities and libraries, to the lab UPS products, to the collaboration code, to our grid scripts. Anything you would use to run the job interactively needs to be provided in the image. The image might be up to 30 GB in size or it might be as small as a few GB if it is carefully pruned.

One could start an image from scratch, but it we will always start with an operating system image, such as an SL6 or CentOS image, provided by experts, and add to that base image. In fact, the computing division has provided an OS image with an added layer of the mu art software manifest. We then have to add a mu2e release and some odds and ends such as the mu2egrid product.

To download the base image, you will need an account on hub.docker.com. Searching for Mu2e, FNAL, or fnalart will show the existing images. Mu2e has an account with username "mu2e" and linked to mu2e-production@fnal.gov. All hub images are public by default, so can be uploaded and shared from any account. The web page will show how to download the image, like:

docker pull fnalart/mu:v2_06_02-s47-e10-prof

Theoretically, the cvmfs client could be added to the image and run in the container to mount and read a standard cvmfs repository. We have heard this has been done, but we don't know of local expertise and haven't tried it yet. Until this is available, we need to include all the software directly in the image, or install cmvfs on the host machine, then mount the cvmfs directory when you start the container, as if it were a simple local disk. (This is how FermiGrid works.)

To build a useful image, you will need to pick a base image (usually a version of linux) then collect all the files you want to add to the image. The files to be added need to all be under a single working directory. There is a control file called Dockerfile, in the workig directory, which defines how to add your files to the base image to create a new image. Here is an example

FROM mu2e/base:grid-00
ADD  setupmu2e-art.sh /setupmu2e-art.sh
ADD  products     /products
ADD  artexternals /artexternals
ADD  DataFiles    /DataFiles
ADD  Offline      /Offline
COPY graphicslibs /graphicslibs
ADD  etc_profile /etc/profile.d/custom.sh
ADD  etc_bashrc   /etc_bashrc
RUN  cat /etc_bashrc >> /etc/bashrc ; rm /etc_bashrc
ENV  UPS_OVERRIDE "-H Linux64bit+3.10-2.17"
ENV  MU2E_DOCKER_IMAGE v6_2_1-00
CMD ["/bin/bash","-l"]

The FROM command defines the base image which you previously "pulled". Some useful existing images are

scientificlinux/sl7  (base SL7)
fermilab/fnal-wn-sl7  (FermiGrid worker node Dockerfile )
centos  (specific versions available as tags)
ubuntu  (specific versions available as tags)

The ADD command move files into the image. It has two arguments - the first is the local, source directory relative to the location of the Dockerfile and the second is the target directory in the image. The ENV command causes an environmental to be defined every time the image is run. CMD defines what is the default command for the image. This is what is run when you run it interactively. The CMD in the highest layer over-rides any CMD from previous layers.

The UPS_OVERRIDE line tell UPS to always use this as the UPS flavor (a.k.a. the kernel version). This is necessary because UPS will fail to extract a useful flavor from the CentOS kernel.

The pattern docker expects is to put your files to be added in a directory along with the Dockerfile, then tell it to build.

docker build -t <user/repo:tag> <directory with Dockerfile>
docker build -t mu2e/prd:v6_2_1-00 /scratch/rlc/docker/v6_2_1-00

The image itself goes into the docker cache directory which was created when docker product and its server were installed. You generally don't need to know where it is. You will push the image to hub.docker.com when it is ready to use.

If you build many images off of one base image, docker is smart enough to re-use base image and not waste any disk space.

To upload the image to docker hub, which is recommended whenever they need to be shared, you will need an account on [docker.com docker.com]. Then login with a docker.com account username and password, and push the image.

docker login  
docker push mu2e/prd:v6_2_1-00

I couldn't get the the username and password switches so I let it prompt me. Also it seems to require the username, not the user email.

A procedure on Jenkins

wiki describing a framework for building Docker containers on Jenkins for Continuous Integration. Not sure who is doing this..

Singularity

Singularity is an alternative virtualization product, orginally developed for HPC centers. Singularity can take a Docker container and modify it in important ways:

convert it to run in user space, not inside the root server like Docker does. This solves many permissions and security problems.
exposes more of the OS, such as other running process, and mounts some disks by default

docs old docs FIFE docs Our Docker build machine can also build Singularity. SL7 machines can install Singularity and run images. At the lab, they only install Singularity in non-privilged mode, which means you can only run image made with the "sandbox" setting, which look like a simple directory containing what you would see in the root directory ("/") in a normal linux machine.

On mu2eimagegpvm01 (the docker build node), you can create a singularity image from a docker image:

singularity build --sandbox --tmpdir /scratch/mu2e/rlc/tmp \
   /containers/userbase:sl7-00  docker://mu2e/userbase:sl7-00

and on any of our SL7 nodes, run it:

singularity [run|exec] /containers/userbase:sl7-00

add -B/cvmfs to mount the cvmfs disk.

As of 12/2022, here is how to start the latest singularity container we use on the grid, as it is started on the grid. Username will not be correct

singularity exec --pid --ipc --contain --bind /cvmfs --bind /etc/hosts --home ./:/srv --pwd /srv /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-wn-sl7:latest /bin/bash

by default FermiGrid runs container ecf-gco/gpgrid-sl7

Singularity on the grid

FermiGrid and some OSG sites (SU-ITS and FNAL as of 10/2019) can run custom singularity containers. A custom container can be built on our Mu2e image node (see above). It is recommended that you create you image as a Docker image and upload it to dockerhub. Since the grid software has to run in your image, if has to have capabilities for handling certificates and data transfers. The easiest way to get this functionality is to start with the Femrigrid standard worker node image:

fermilab/fnal-wn-sl7

After that, request that it be converted to singularity and loaded on

/cvmfs/singularity.opensciencegrid.org

You can make the request by sending mail to support@osgconnect.net or reading more about how it works on OSG. When you submit your job, add these options:

--singularity-image '/full/path/to/image'

Your image will replace the standard worker node images. It will be run with switches

--pid --ipc --contain

and mount the following disks by default:

 /cvmfs
 /etc/hosts (for reverse look ups)

On Femrigrid, debugging output can be added with

  --lines='+FermiGrid_SINGULARITY_DEBUG = True'

It is very highly recommended that users run a small number of jobs first to make sure there aren't any unexpected issues with their container images. If there are any issues, jobs are going to fail on startup and submitting (and exiting) a lot of jobs without appropriate testing will increase the duty cycle on the schedd side.

You can run the lab singularity images, including mounting cvmfs with a command like

singularity run -B/cvmfs -B/mu2e/data /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-wn-sl7:latest df -h

This command will just run the "df" command and exit. To go into an interactive shell, replace "run" with "shell".

Start-up scripts

When a docker image is run on a HPC center system, it is often converted to singularity container. This adds security features.

In these tests, based on the CentOS base image, we have two start-up files for all users:

/etc/bashrc
/etc/profile.d/custom.sh

We added setting environmentals to these files so I can tell if they run. We created two test files:

/home/test1 which contains "printenv"
/home/test2 which contains "#!/bin/bash \n printenv"

shifter does not ever run bashrc, even when running interactively. docker does run bashrc, but only when running interactively
both shifter and docker run profile if their command is "/bin/bash" "-l"
if the command is "/home/test1", docker fails to run it, shifter assumes a bash shell and runs it, but does not run profile
if the command is "/bin/bash" "/home/test1" or "/bin/bash" "/home/test2" or "/home/test2", shifter and docker can run it, but do not run profile
is the command is "bash" "-l" "/home/test1" or "/bin/bash" "-l" "/home/test2", docker and shifter run it after running profile

Build a singularity container

Create al9_image.def with the missing libraries

bootstrap: docker
From: fermilab/fnal-wn-el9

%post
    yum install -y perl-autodie
    yum install -y perl-English

Build a container on mu2eimagegpvm02:

apptainer build al9_image.simg al9_image.def