Running Jobs on Theta at ALCF
Login
Once you have an allocation on Theta, or if you are using an existing allocation, you can reference the Onboarding Guide for answers to most of your questions about how to get started.
To log in to Theta from a terminal:
ssh <username>@theta.alcf.anl.gov
At the password prompt, enter your 4-digit PIN (given to you by ALCF when you receive an account) followed immediately by the one-time 8-digit cryptocard password, with no space between the two. You then land in your home directory, /home/<username>, on a login node running the bash shell. Login nodes run a SUSE Enterprise Linux-based full CLE OS. You can change your login shell on your ALCF account web page.
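For illustration, a login session looks roughly like the sketch below; the host name in the prompt and the example PIN/token are made up.
<pre>
$ ssh <username>@theta.alcf.anl.gov
Password:              # type the 4-digit PIN immediately followed by the 8-digit token, e.g. 123487654321
thetalogin5:~> pwd     # prompt/host name illustrative
/home/<username>
</pre>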
Filesystems
There are two filesystems on Theta: the GPFS filesystem, which houses the /home/<username> directories (in /gpfs/mira-home), and the Lustre filesystem, which houses the /project/<projectname> directories (in /lus/theta-fs0/projects). The /home directories are backed up and have a default quota of 50 GiB. The /project directories are NOT backed up and have a default quota of 1 TiB. A /project directory is visible to all members of the project, so common code and files should be placed there.
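As a quick sketch of working with the two filesystems: 'lfs quota' is the standard Lustre client command and its availability on the login nodes is an assumption, and 'my_shared_configs' is a placeholder directory name.
<pre>
# Home directory: GPFS, backed up, 50 GiB default quota
ls /gpfs/mira-home/<username>

# Project directory: Lustre, NOT backed up, 1 TiB default quota, visible to the whole project
ls /lus/theta-fs0/projects/<projectname>

# Check Lustre usage/quota for your user (standard Lustre command; assumed available)
lfs quota -u $USER /lus/theta-fs0

# Put common code and files in the project space so all members can use them
cp -r my_shared_configs /lus/theta-fs0/projects/<projectname>/
</pre>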
Environment
Your environment is controlled via 'modules'. There is a default set of modules set up for all users. Run
module list
to see what is loaded at any given time. For the work done on Theta to date (as of May 2019), users have not needed to modify their environment. The output of 'module list' for the default environment at that time is
<pre>
Currently Loaded Modulefiles:
  1) modules/3.2.11.1
  2) intel/18.0.0.128
  3) craype-network-aries
  4) craype/2.5.15
  5) cray-libsci/18.07.1
  6) udreg/2.3.2-6.0.7.1_5.13__g5196236.ari
  7) ugni/6.0.14.0-6.0.7.1_3.13__gea11d3d.ari
  8) pmi/5.0.14
  9) dmapp/7.1.1-6.0.7.1_5.45__g5a674e0.ari
 10) gni-headers/5.0.12.0-6.0.7.1_3.11__g3b1768f.ari
 11) xpmem/2.2.15-6.0.7.1_5.11__g7549d06.ari
 12) job/2.2.3-6.0.7.1_5.43__g6c4e934.ari
 13) dvs/2.7_2.2.118-6.0.7.1_10.1__g58b37a2
 14) alps/6.6.43-6.0.7.1_5.45__ga796da32.ari
 15) rca/2.2.18-6.0.7.1_5.47__g2aa4f39.ari
 16) atp/2.1.3
 17) perftools-base/7.0.4
 18) PrgEnv-intel/6.0.4
 19) craype-mic-knl
 20) cray-mpich/7.7.3
 21) nompirun/nompirun
 22) darshan/3.1.5
 23) trackdeps
 24) xalt
</pre>
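If you ever do need to adjust the environment, the standard Environment Modules commands apply; the module names below are only illustrative.
<pre>
module avail                          # list every module available on the system
module load cray-hdf5                 # load an extra module (name illustrative)
module swap PrgEnv-intel PrgEnv-gnu   # switch to a different programming environment
module unload darshan                 # drop a module from the current environment
module list                           # confirm what is now loaded
</pre>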
Containers
The easiest way to run the Mu2e Offline code on Theta is to run it in a container. Docker is a common container platform, but because of security issues, ALCF does not allow users to run Docker containers on their systems. Singularity is another container platform that does not have the same security issues as Docker, and can be run on Theta. Singularity is capable of building containers from Docker images, so the Mu2e Offline code can be containerized as a Docker or Singularity container for use on Theta.
We built a Docker container of the Offline code and put it on Docker Hub. To pull a container to Theta and turn it into a Singularity container, run the command
singularity pull docker://username/image_name:image_version
You will then have a container named 'image_name-image_version.simg' in the current directory.
For example, for the March 2019 jobs on Theta we used 'singularity pull docker://goodenou/mu2emt:v7_2_0-7.7.6' to create a container called mu2emt-v7_2_0-7.7.6.simg. We placed the container in /projects/mu2e_CRY so that all project members can access it. For more information on using Singularity containers on Theta, see the ALCF tutorial.
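Once the .simg file is in place, the usual Singularity commands work with it; for example (the command run inside the container and the bind-mount target are just placeholders):
<pre>
# Open an interactive shell inside the container
singularity shell /projects/mu2e_CRY/mu2emt-v7_2_0-7.7.6.simg

# Run a single command inside the container
singularity exec /projects/mu2e_CRY/mu2emt-v7_2_0-7.7.6.simg ls /

# Bind-mount a host directory into the container so jobs can read and write project files
singularity exec -B /projects/mu2e_CRY:/data /projects/mu2e_CRY/mu2emt-v7_2_0-7.7.6.simg ls /data
</pre>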
Running Jobs
The ALCF has a detailed webpage on running jobs on Theta. Theta uses the Cobalt batch scheduler: jobs are submitted with 'qsub', and applications are launched onto the compute nodes with the 'aprun' command.
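For production running, work is normally submitted to Cobalt as a batch script. The sketch below is only illustrative; the queue, node and rank counts, and the executable path are placeholders, so consult the ALCF running-jobs page for the authoritative options.
<pre>
#!/bin/bash
#COBALT -A <projectname>   # allocation to charge
#COBALT -t 60              # walltime in minutes
#COBALT -n 2               # number of nodes
#COBALT -q default         # queue (placeholder)

# Launch 128 ranks in total, 64 per node, on the allocated compute nodes
aprun -n 128 -N 64 /path/to/executable
</pre>
Such a script would be submitted with 'qsub myscript.sh'.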
As a first test of any code, it is good practice to run an interactive job. To get one node for 15 minutes in the debug-cache-quad queue for interactive use, run the following
qsub -A <projectname> -t 15 -q debug-cache-quad -n 1 -I
This will put you on a service node, from which you launch your job onto the allocated compute node with aprun.
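As a hedged example of running the containerized Offline code on the single interactive node (the rank counts, the executable invocation, and the fcl file path are placeholders for whatever command your container provides):
<pre>
# 64 ranks on the one allocated KNL node, each running inside the Singularity container
aprun -n 64 -N 64 singularity exec /projects/mu2e_CRY/mu2emt-v7_2_0-7.7.6.simg \
    mu2e -c /path/to/your/config.fcl
</pre>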