Gridexport
Revision as of 17:46, 29 January 2018
Introduction
The ups package gridexport will export a build of Mu2e Offline or a build of a satellite release for use on a grid worker node.
Until January 2018, the bluearc disk /mu2e/app was mounted on worker nodes on Fermigrid; it was possible to build code on that disk and run it on Fermigrid. That is no longer possible. If you have a build of Mu2e code on a computer that can see /pnfs/mu2e/scratch, then you can use gridexport to produce a bzipped tar file of your build area. You can then submit a grid job that copies the bzipped tar file, unpacks it and runs the code that it finds inside. In particular you can use gridexport on any of the mu2egpvm* machines to export a build that is located on /mu2e/app or in your home area.
When you have exported code in this way, you can submit jobs to sites in OSG; previously, if you were running a build on /mu2e/app you were restricted to running on Fermigrid, the only site that mounted that disk on its worker nodes.
Cheat Sheet
1. setup mu2e
2. setup gridexport
3. cd to a valid build of Mu2e Offline or a valid Mu2e satellite release and set up that code
4. gridexport - and capture the absolute path to the Code.tar.bz file that is printed at the end of the command
5. setup mu2eprodsys
6. mu2eprodsys --code=<absolute path to the tar.bz file> [other arguments ...]
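Steps 4 to 6 can be scripted so that the path does not have to be copied by hand. The following is a minimal sketch, not part of gridexport: last_line_of is a hypothetical helper name, and it assumes that gridexport prints the path on standard output.

```shell
# Sketch: run a command and keep only the last line of its output.
# last_line_of is a hypothetical helper, not something gridexport provides.
# Assumption: the final line (the path to Code.tar.bz) goes to standard output.
last_line_of() {
    "$@" | tail -n 1
}

# Intended use, after steps 1 to 3 of the cheat sheet:
#   CODE_TARBALL=$(last_line_of gridexport)
#   setup mu2eprodsys
#   mu2eprodsys --code="$CODE_TARBALL" [other arguments ...]
```

Note that the helper hides the rest of the gridexport output; run gridexport directly if you want to see it.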
Details
In step 2 you do not need to provide a version number for gridexport; you will automatically get the version that has been declared current. If you need to choose a different version, you can specify it using the usual ups syntax.
Step 3 is important; gridexport will only work if your current working directory is the root directory of a build of Mu2e Offline or the root directory of a satellite release. In addition, gridexport requires that the build be set up before you run it. This is required because gridexport uses some environment variables that are created by setting up the build; the alternative is for gridexport to parse setup.sh, which seems fragile.
Step 4 will create a bzipped tar file, named /pnfs/mu2e/scratch/users/<your username>/gridexport/tmp.xxxxxxxx/Code.tar.bz, where xxxxxxxx is a random string. The random string ensures that a run of gridexport will not step on the output of a previous run. The last line of output from the command is the absolute path to the Code.tar.bz file; save this because you need it in step 6.
In step 5 be sure that the version of mu2eprodsys is v5_00_00 or greater. If it is not, set up the highest available version explicitly.
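The version requirement in step 5 can be checked in a script. This sketch compares two ups-style version strings; version_key and version_ge are hypothetical helper names, and the code assumes versions of the form vM_mm_pp with three numeric fields. How you obtain the version string of the mu2eprodsys that is set up is left open here.

```shell
# Sketch: compare ups-style versions such as v5_00_00.
# Assumption: exactly three numeric fields separated by underscores.
version_key() {
    # v5_01_02 -> 5001002, so versions can be compared as integers
    echo "${1#v}" | awk -F_ '{ printf "%d%03d%03d\n", $1, $2, $3 }'
}

# version_ge A B succeeds when version A is greater than or equal to B.
version_ge() {
    [ "$(version_key "$1")" -ge "$(version_key "$2")" ]
}

# Example use, with a version string obtained by whatever means you prefer:
#   if ! version_ge "$my_mu2eprodsys_version" v5_00_00; then
#       echo "mu2eprodsys is too old for --code" >&2
#   fi
```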
In step 6 the --code option replaces the --setup option; the two are mutually exclusive. The --code option first appeared in mu2eprodsys v5_00_00.
The default set of exclude patterns is found in:
$GRIDEXPORT_DIR/etc/OfflineExcludePatterns.txt
If you wish to give additional exclude patterns to gridexport, you can give the option:
gridexport --append-exclude-from=FILE
where FILE is a text file that contains the additional exclude patterns. You can abbreviate this to:
gridexport -A FILE
If you wish to override the default set of exclude patterns, you can use the syntax:
gridexport --exclude-from=FILE
gridexport -E FILE
where FILE is a text file that contains the exclude patterns.
If you specify both -E and -A, the two files will be concatenated.
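To see how concatenated pattern files behave, the following sketch uses plain GNU tar in a scratch directory; it does not call gridexport. The file names and patterns are made up for illustration, and the archive is left uncompressed so that the demonstration depends only on tar.

```shell
# Sketch: two pattern files concatenated and handed to tar --exclude-from.
work=$(mktemp -d)
mkdir -p "$work/build/src" "$work/build/lib"
touch "$work/build/src/a.cc" "$work/build/lib/libA.so" \
      "$work/build/run.log" "$work/build/notes.txt"

# Stand-in for the file given with -E (replaces the defaults).
printf '%s\n' '*.cc' '*.log*' > "$work/base_patterns.txt"
# Stand-in for the file given with -A (additional patterns).
printf '%s\n' 'notes.txt' > "$work/extra_patterns.txt"

# Concatenate the two files into a single exclude list.
cat "$work/base_patterns.txt" "$work/extra_patterns.txt" > "$work/exclude.txt"

tar -C "$work" --exclude-from="$work/exclude.txt" -cf "$work/Code.tar" build
archived=$(tar -tf "$work/Code.tar")
echo "$archived"
```

In the listing, libA.so survives while a.cc, run.log and notes.txt are left out.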
You can see the other options of gridexport with either of:
gridexport --help
gridexport -h
Implementation Details
gridexport will make a temporary directory with a name like:
/mu2e/app/users/<your username>/gridexport/tmp.xxxxxxxx
where xxxxxxxx is a randomly chosen string. It is the same randomly chosen string as is used for the path to the Code.tar.bz in /pnfs space.
Inside this directory gridexport will make:
- Code - a subdirectory
- exclude.txt - a file containing exclude patterns for tar
Under the Code subdirectory there will be a file named setup.sh and one or two symbolic links. In all cases there will be a symbolic link to the directory from which you ran gridexport; the name of that symbolic link will be the name of the directory. If you ran gridexport from within a satellite release and if the base release upon which the satellite release is based is NOT located on cvmfs, there will also be a symbolic link to the root directory of the base release.
When you source setup.sh, it forwards to the setup.sh file in the appropriate subdirectory; if there is only one symbolic link then setup.sh forwards to setup.sh in that one subdirectory. If there are two symbolic links, it forwards to the setup.sh in the satellite release; in this case it also instructs the setup.sh in the satellite release to look for its base release in the correct place on the worker node.
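The one-link case can be mimicked in a scratch directory. This is only an illustration of the layout described above, not the setup.sh that gridexport writes: the directory name MyBuild, the marker variable, and the use of a path relative to the directory that contains Code are all assumptions.

```shell
# Sketch of the one-link layout: Code/setup.sh forwards to the build's setup.sh.
work=$(mktemp -d)

# A stand-in for a build area; its setup.sh only sets a marker variable.
mkdir -p "$work/MyBuild"
echo 'DEMO_BUILD_IS_SETUP=yes' > "$work/MyBuild/setup.sh"

# The temporary area: Code/ holds a link named after the build directory ...
mkdir -p "$work/tmp.demo/Code"
ln -s "$work/MyBuild" "$work/tmp.demo/Code/MyBuild"
# ... and a top-level setup.sh that forwards to the one behind the link.
echo '. Code/MyBuild/setup.sh' > "$work/tmp.demo/Code/setup.sh"

# Source Code/setup.sh from the directory that contains Code/.
cd "$work/tmp.demo"
. Code/setup.sh
echo "DEMO_BUILD_IS_SETUP=$DEMO_BUILD_IS_SETUP"
```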
gridexport will use tar to archive the subdirectory:
/mu2e/app/users/kutschke/gridexport/tmp.xxxxxxxx/Code/
and put it in:
/pnfs/mu2e/scratch/users/kutschke/gridexport/tmp.xxxxxxxx/Code.tar.bz
It tells tar to follow symbolic links. In this way it captures the content of the directory tree from which gridexport was run; if there is a corresponding base release on a non-cvmfs disk, it too is captured in the bzipped tar file.
The tar command is told to exclude files that are not needed on the grid worker node, such as source code, object files, SConscript files, Makefiles, the .git subdirectory, and so on. It also excludes any files ending in .art or matching the pattern *.log* . The mechanism is described in the Details section above.
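The effect of following symbolic links can be seen with plain tar. This sketch archives the same directory twice, once without and once with the -h (dereference) option; the names are made up and the archives are left uncompressed to keep the demonstration small.

```shell
# Sketch: tar without and with -h on a directory that contains a symbolic link.
work=$(mktemp -d)
mkdir -p "$work/MyBuild/lib" "$work/stage/Code"
echo 'payload' > "$work/MyBuild/lib/libDemo.so"
ln -s "$work/MyBuild" "$work/stage/Code/MyBuild"

# Without -h tar stores the link itself.
tar -C "$work/stage" -cf "$work/links.tar" Code
# With -h tar stores the files that the link points to.
tar -C "$work/stage" -chf "$work/deref.tar" Code

links_listing=$(tar -tf "$work/links.tar")
deref_listing=$(tar -tf "$work/deref.tar")
echo "$deref_listing"
```

Only the second archive contains libDemo.so, which is why the export would be useless on a worker node without this option.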
As of January 2018 it typically takes 3 to 5 minutes to produce the bzipped tar file of a full build of Mu2e Offline. It typically takes seconds to produce the bzipped tar file of a small satellite release that uses a base release in cvmfs; if dCache is heavily loaded it may take up to 30 seconds. For satellite releases that use a base release that is NOT on cvmfs, the base release is also copied into the bzipped tar file; this typically takes 3 to 5 minutes.
For a full build of Mu2e Offline, as of January 2018, the bzipped tar file has a size of about 360 MB. Tests showed that bzip slightly outperformed gzip for both size and CPU time.
Why is the temporary space on /mu2e/app and not on /pnfs? Because there are sometimes delays of up to 30 seconds when creating small files on /pnfs.
Why not put the temporary space under the current working directory? Because it seemed fragile. The temporary directory contains a symbolic link back to the current working directory. The tar command has to be told to follow symbolic links; this will result in a recursive tar, which will eventually stop when the symbolic link count has been exceeded. One can solve this by excluding the temporary directory from the tar command. However, we provide users with the ability to specify their own exclude patterns; if they forget to exclude the temporary directory, the recursion problem will bite. That is what is fragile.
I experimented with writing the bzipped tar file to a bluearc disk or /tmp and then copying it to pnfs. Both were slower than writing directly to pnfs.
Interaction with mu2eprodsys
Starting with v5_00_00, mu2eprodsys supports the --code argument. This option and the --setup option are mutually exclusive.
mu2eprodsys will use ifdh to transfer the specified tar.bz file to its current working directory. It will extract the content, which results in a subdirectory named Code in the current working directory. Then it will:
source Code/setup.sh
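The sequence on the worker node can be sketched as follows. This is not the mu2eprodsys code: fetch_unpack_setup is a hypothetical name, plain cp stands in for the ifdh transfer, and the demonstration tar file is uncompressed and contains only a toy setup.sh.

```shell
# Sketch: copy the archive into the working directory, unpack it, source Code/setup.sh.
fetch_unpack_setup() {
    tarball=$1
    cp "$tarball" .                     # stand-in for the ifdh transfer
    tar -xf "$(basename "$tarball")"    # GNU tar detects compression when reading
    . Code/setup.sh
}

# Self-contained demonstration with a tiny fake export.
demo=$(mktemp -d)
mkdir -p "$demo/export/Code" "$demo/worker"
echo 'DEMO_CODE_READY=yes' > "$demo/export/Code/setup.sh"
tar -C "$demo/export" -cf "$demo/Code.tar" Code

cd "$demo/worker"
fetch_unpack_setup "$demo/Code.tar"
echo "DEMO_CODE_READY=$DEMO_CODE_READY"
```

Both the copied archive and the unpacked Code directory sit in the working directory, so both count against the disk space of the job.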
As of January 2018, an exported build of Offline occupies about 1.5 GB of disk space; the bzipped tar file occupies about another 350 to 400 MB. So plan on this code occupying about 2 GB of the available disk space on the worker node. Include this in your accounting of the disk space required for your workflow.
To Do List
gridexport does NOT clean up after itself. Should gridexport clean up its temporary space automatically after the tar command completes? If so, it should be retained if the verbose option is set; we can also add a --keep-tmp option.
An alternative is to periodically clean up the temporary area on /mu2e/app. Should we automate this? Perhaps we should move the temporary directories to /mu2e/data/gridexport/users/ and have cron jobs that expire files older than a month? Also, files on /pnfs will expire from cache but directories do not; so should we think about a cron job to delete empty directories?
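One possible shape for such a cron job is sketched below. It is a proposal only: expire_old_tmpdirs is a hypothetical name, the 30 day default is taken from the "older than a month" suggestion, and the selection is by the modification time of each tmp.* directory, which may or may not be the right criterion.

```shell
# Sketch: remove tmp.* directories directly under AREA that are older than DAYS days.
expire_old_tmpdirs() {
    area=$1
    days=${2:-30}
    # -maxdepth 1 keeps find from descending into directories it is deleting.
    find "$area" -mindepth 1 -maxdepth 1 -type d -name 'tmp.*' \
        -mtime +"$days" -exec rm -rf {} +
}

# Example use for one user's area, run from cron:
#   expire_old_tmpdirs "/mu2e/app/users/<your username>/gridexport" 30
```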
Is there a good algorithm to make more meaningful temporary directory names? We still need to guarantee uniqueness. Maybe
project-name-yyyy-mm-dd-hh-mm-ss-random
where project-name can be supplied as an argument or defaults to the path to the build area with / turned into _ ? Is the random string really necessary to guarantee uniqueness in this model?
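The proposed naming scheme could look like the sketch below. make_named_tmpdir is a hypothetical name. It keeps the random suffix and lets mktemp create the directory, so two runs started in the same second still cannot collide.

```shell
# Sketch: create PARENT/project-name-yyyy-mm-dd-hh-mm-ss-random and print its path.
make_named_tmpdir() {
    parent=$1
    # Default project name: the current directory with / turned into _
    project=${2:-$(pwd | tr '/' '_')}
    stamp=$(date +%Y-%m-%d-%H-%M-%S)
    # mktemp replaces the trailing X characters and creates the directory atomically.
    mktemp -d "$parent/${project}-${stamp}-XXXXXXXX"
}

# Example use:
#   tmpdir=$(make_named_tmpdir "/mu2e/app/users/<your username>/gridexport" myproject)
```

Without the random suffix, uniqueness would rest on the timestamp alone, which fails when two exports of the same project start within one second.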
Do we like the capitalized options or should they be lower case? I made the short form uppercase because if you have -a and --append-exclude-from, and then forget the double dash on the long form, it results in a difficult to understand error message.
gridexport should check that the files specified by -A and -E exist and are files, not directories. Should it check that they are not empty? It should give understandable error messages if the tests fail.
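The checks proposed in the last item might be written as follows. This is a sketch of the idea, not existing gridexport code; check_pattern_file is a hypothetical name, and treating an empty file as a warning rather than an error is one possible answer to the open question.

```shell
# Sketch: validate a pattern file given with -A or -E.
# Returns 0 if the file is usable, 1 otherwise; messages go to standard error.
check_pattern_file() {
    if [ ! -e "$1" ]; then
        echo "gridexport: exclude file '$1' does not exist" >&2
        return 1
    fi
    if [ ! -f "$1" ]; then
        echo "gridexport: exclude file '$1' is not a regular file" >&2
        return 1
    fi
    if [ ! -s "$1" ]; then
        echo "gridexport: warning: exclude file '$1' is empty" >&2
    fi
    return 0
}
```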