Muse: Difference between revisions
Revision as of 19:05, 30 June 2021
Introduction
As Mu2e approaches data-taking, many more ntuples, calibration procedures, and analysis code will be developed. Including this user code in the main Offline repo would be unwieldy and hard to maintain. The main Offline repo should remain minimal, including only what is needed for simulation and reconstruction jobs, for simple maintenance and faster builds. Calibration, ntuple, and analysis code that is maintained and used by individuals and small groups should be stored in smaller repos controlled by the appropriate group. These smaller repos could be standalone, or could contain modules and utilities that depend on code and functionality in other small repos or in Offline. To make this work, we need a system to build the smaller repos against an existing Offline build, or together with Offline, sharing include files, fcl, and cross-linking. The system should support common use cases and be flexible. The Muse system, developed inside Mu2e, provides this functionality. It is a UPS product containing a set of scripts, with support for scons code builds at the core.
In this page we will use Ana1 and Ana2 as examples of smaller, non-Offline code repos. WorkDir will represent the path to a user project area on /mu2e/app, or possibly another disk where you can build code. Muse is designed around this workDir which will contain the repos to be built. If building Offline, workDir contains the Offline directory. AnaDir will represent an area where analysis files are kept : scripts, root scripts, ntuples, histograms, and notes. If the user's work is mostly concerned with code development and commits, then it may be convenient to use WorkDir as AnaDir, or, if the work is mostly using a static build, then they may be different. Or there may be several versions of AnaDir, or AnaDir's with different goals, all using the same build in workDir.
Muse maintenance operations for experts are on MuseMaintenance.
Quick Start
Always required:
setup mu2e
setup muse
Built-in help:
muse -h
muse <command> -h
muse status
Muse can only be setup once in a process and can't be unsetup. To change setup choices, start a new process. When building on mu2egpvm01-06, use "-j 4"; when building on mu2ebuild01, use "-j 20" (see Machines). See GitHubWorkflow for more details on using git. See mgit for more information on using partial checkout (building only parts of Offline).
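As an illustration only (this helper is not part of Muse), the "-j" advice above could be captured in a small shell function. The function name and the fallback value for any other machine are assumptions.

```shell
# Hypothetical helper: suggest a "-j" value from the node name.
# The values 4 and 20 come from the text above; the fallback of 1
# for any other machine is an assumption.
muse_jobs_for_host() {
    case "$1" in
        mu2egpvm0[1-6]*) echo 4 ;;
        mu2ebuild01*)    echo 20 ;;
        *)               echo 1 ;;
    esac
}

# Print (not run) the build command for the current node.
echo "muse build -j $(muse_jobs_for_host "$(uname -n)")"
```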
Muse scripts are focused on the workDir which provides the code to be built, a place to build the code, and persistency of what the user is building and using. There can only be one workDir in a process. After muse is UPS setup ("setup muse"), the workDir is usually configured by cd'ing to the workDir, checking out repos
cd workDir
git clone ...
or linking in pre-built backing releases.
cd workDir
muse link ...
once the code is organized, then muse is setup
cd workDir
muse setup
or from any directory:
muse setup workDir
Once "muse setup" has been run, then the list of repos which will be built, and the build options, are locked in for the process. If you add or delete repos, or change a link to a backing build, then you must run "muse setup" again in a new process. The status report:
muse status
can be run before or after "muse setup". If run before, it assumes the default directory is the workDir. If run after, it can be run from anywhere.
Builds can be removed:
muse build -c
or simply remove the build directory content:
rm -rf workDir/build
Nothing is built in the repo git area; everything built is under the build directory.
Build Offline
Build the Offline repo locally:
cd WorkDir
git clone https://github.com/Mu2e/Offline
# or: git clone git@github.com:<your GitHub username>/Offline
# optionally, add other repos:
git clone git@github.com:<your GitHub username>/Ana1
muse setup
# edit files, as needed..
muse build -j 20 --mu2eCompactPrint
# run:
mu2e -c ..
Returning later
To re-setup, later, in another process
cd WorkDir
muse setup
or
cd AnaDir
muse setup WorkDir
Setup options
In one process/window:
cd WorkDir
muse setup
muse build -j 20
mu2e -c ...
In another process:
cd WorkDir
muse setup -q debug
muse build -j 20
gdb --args mu2e -c ...
Setup published releases
When Offline is tagged, it is built and published on cvmfs. You can setup and use these releases.
See what is there
setup muse
muse link
setup a version:
muse setup Offline v10_00_00
setup what is current
muse setup Offline
or link it locally:
muse link Offline v10_00_00
Backing build
Before "muse setup", link in a pre-built Offline:
cd WorkDir
# see options for backing builds:
muse link
# link in a release:
muse link master/be905d45
# optionally, add smaller repos:
git clone git@github.com:<your GitHub username>/Ana1
muse setup
muse build -j 20
Partial checkout with backing build
Before "muse setup", link in a pre-built Offline:
cd WorkDir
# see options for backing builds:
muse link
# link in a release:
muse link master/be905d45
# optionally, add smaller repos:
git clone git@github.com:<your GitHub username>/Ana1
# add Offline partial build
mgit init
cd Offline
mgit add HelloWorld
# edits..
mgit status
cd ..
muse setup
muse build -j 20
See more about mgit below.
The build command can be run from any directory after "muse setup". The developer might find it useful to have two windows open, both setup to the same workDir: one sitting in workDir, where code editing is done and the build command is issued, and a second window sitting in AnaDir, where code is run and output is kept.
Tarball
After preparing and testing a build area,
cd WorkDir
muse setup    # if not already done
muse tarball
The tarball location will be printed. This can be sent to the grid with --code=/path/tarball in mu2eprodsys.
Redirecting build output
It may be convenient for a user to keep code in one place, on one disk system, and write the build output (libraries) to another area. A typical use case might be keeping the source code on /mu2e/app and writing the output to /mu2e/data. All the output from a build is under the "build" subdirectory in workDir. To redirect the output:
ln -s /mu2e/data/users/$USER/myproject build
which makes the "build" subdirectory into a link. This link should be made while organizing the code in workDir ("git clone", "muse link") and must be before "muse setup", which might create "build" as a subdirectory.
The use of a subdirectory, here called "myproject", is very important. This directory name should reflect the location of the source code, and be unique to the source code. If you link the output of multiple source code workDir's to one output directory, they will overwrite each other. Every source code area needs its own build area.
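The precautions above can be sketched as a small wrapper around the "ln -s" command. This is only a sketch under stated assumptions: the function name redirect_build is hypothetical, and the checks (refusing an existing "build" and refusing a non-empty output area) are one possible way to enforce "every source code area needs its own build area", not something Muse does for you.

```shell
# Hypothetical helper: make workDir/build a link to a per-project output area.
# Usage: redirect_build <workDir> <outputDir>
redirect_build() (
    work_dir=$1
    out_dir=$2
    # The link must be made before "muse setup" creates a real build directory.
    if [ -e "$work_dir/build" ] || [ -L "$work_dir/build" ]; then
        echo "build already exists in $work_dir" >&2
        exit 1
    fi
    # A non-empty output area may already belong to another workDir.
    if [ -d "$out_dir" ] && [ -n "$(ls -A "$out_dir")" ]; then
        echo "$out_dir is not empty; choose a unique directory" >&2
        exit 1
    fi
    mkdir -p "$out_dir" && ln -s "$out_dir" "$work_dir/build"
)

# Demonstration in a scratch area (all paths are placeholders).
demo=$(mktemp -d)
mkdir -p "$demo/workDir"
redirect_build "$demo/workDir" "$demo/data/myproject"
ls -l "$demo/workDir"
```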
If you have local builds and want to move them off the local area, you can just move them, then link that area:
mv build /mu2e/data/users/$USER/myproject
ln -s /mu2e/data/users/$USER/myproject build
Muse product
The Muse UPS product is created out of the Mu2e/Muse repo. When the product is setup,
setup mu2e
setup muse
it will do the following.
- add $MUSE_DIR/bin to the path
- add $MUSE_DIR/python to PYTHONPATH
- add
alias=source $MUSE_DIR/bin/muse
- export MUSE_ENVSET_DIR=/cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse
The product can be packaged from a tag by checking out the tag in the repo, then
Muse/bin/museInstall.sh <version> <path to output product area>
muse setup
The command
muse setup
sets link, fhicl, and other paths, and sets up scons and art. Typically, the user would first organize their repos in the workDir, or link backing release builds, before running "muse setup". This command is also typically run when returning to work on the code in workDir in a later process.
The command must be run at the bash command line so that the alias is available, causing muse to be sourced so it can alter your environment. All other commands are run in subshells and will not affect the environment.
The setup command has two forms:
cd workDir
muse setup
or from anywhere, if workDir is provided:
muse setup workDir
Options can be provided:
muse setup <workDir> -q <options>
There is a small set of allowed options, such as prof/debug and the geant visualization build options. See
muse setup -h
for the list of options. Examples (can only run one in a process)
muse setup
muse setup -q debug
muse setup workDir -q e21
muse setup workDir -q e21:trigger:debug
The effects of running setup are the following
- sets environmentals used by Muse, such as the workDir (MUSE_WORK_DIR)
- uses UPS to find the machine flavor
- parses the options
- determines which set of UPS art products to setup, and does the setup
- analyzes available information and determines a repo link order
- adds the workDir area to the bin, link, include and other paths
- for linked backing builds, links in their build to the workDir build area
and you will see the following (or something like it):
Build: prof Core: sl7 e20 p008 Options:
prof is prof or debug; sl7 will be replaced with centos; p008 is the envset, like specifying a version of setup.sh. p008 is (p)ermanent version 8 and you can expect that to change. You will also see that p008 in Offline/.muse.
The resulting environment can be examined with
muse status
muse -v status
The UPS version of art, and all the other associated products, including a few that are not always needed but are convenient, like valgrind, are setup as one "environmental set" or "envset". In implementation, an envset is a script to be sourced as part of "muse setup". This script chooses some defaults (if the user doesn't force a choice), then does the UPS art setups. Envsets are files named pNNN if published permanently, or uNNN if customized by the user, where N represents a digit.
The "muse setup" command uses the following priority list to find which envset to use, and therefore which version of art to setup.
- the user has forced an explicit choice by a qualifier like "-q p000"
- an Offline repo is a local checkout, a mgit partial checkout, or a link, and has a .muse, then use the recommendation in there, if any
- any other local package with a .muse file with a recommendation
- if $MUSE_WORK_DIR/muse/uNNN, where N represents a digit, exists, then take the highest number there
- use the highest number from $MUSE_ENVSET_DIR
There is a way to make your own custom local envset, see that section.
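The priority list above can be sketched as a shell function. This is not the Muse implementation: the function names are hypothetical, the lookup of a linked Offline under link/Offline is an assumption based on the "muse link" layout described on this page, and the envset directories are passed as arguments so the sketch can be tried in a scratch area.

```shell
# Print the "ENVSET <name>" recommendation from a .muse file, if any.
envset_from_file() {
    if [ -f "$1" ]; then
        awk '$1 == "ENVSET" { print $2; exit }' "$1"
    fi
}

# Print the highest-numbered file named <prefix>NNN in a directory, if any.
highest_envset() {
    ls "$1" 2>/dev/null | grep -E "^$2[0-9][0-9][0-9]\$" | sort | tail -n 1
}

# Usage: pick_envset <qualifier or ""> <workDir> <published envset dir>
pick_envset() (
    qual=$1; work=$2; pub=$3
    # 1. explicit choice by the user, such as "-q p000"
    if [ -n "$qual" ]; then echo "$qual"; exit 0; fi
    # 2. recommendation from a local or linked Offline
    for f in "$work/Offline/.muse" "$work/link/Offline/.muse"; do
        rec=$(envset_from_file "$f")
        if [ -n "$rec" ]; then echo "$rec"; exit 0; fi
    done
    # 3. recommendation from any other local repo
    for f in "$work"/*/.muse; do
        rec=$(envset_from_file "$f")
        if [ -n "$rec" ]; then echo "$rec"; exit 0; fi
    done
    # 4. highest local uNNN
    rec=$(highest_envset "$work/muse" u)
    if [ -n "$rec" ]; then echo "$rec"; exit 0; fi
    # 5. highest published pNNN
    highest_envset "$pub" p
)

# Demonstration in a scratch area.
demo=$(mktemp -d)
mkdir -p "$demo/work/Offline" "$demo/published"
touch "$demo/published/p000" "$demo/published/p001"
echo "ENVSET p000" > "$demo/work/Offline/.muse"
pick_envset ""   "$demo/work" "$demo/published"   # Offline/.muse recommendation
pick_envset u001 "$demo/work" "$demo/published"   # explicit qualifier
```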
muse link
The "muse link" command is used to link (in the bash soft link sense) a repo, that was built earlier, elsewhere, to the local workDir. This is sometimes called a "backing build".
The linked repo must have been built by Muse, in its own workDir. The link will allow Muse to include from and link to (in the gcc sense) the other build. Muse will not attempt to build, or alter in any way, the linked repo.
The linked repo can be on a local disk, or on cvmfs. Running "muse link" with no argument gives a list of available cvmfs builds. There are two groups: first, a list of recent tagged and built releases, and second, a list of continuous integration (CI) builds. CI builds are made automatically every time code is merged in the main branches in the Offline repo.
The "muse link" commands, to see available builds and make links, are typically done while the default dir is workDir, while organizing code in that area, before "muse setup" is run.
muse link
Recent published releases:
v09_14_00 v09_13_00
Recent CI builds
2021-03-29 11:05 master/be905d45
2021-03-25 00:05 master/d3cc52b5
2021-03-24 22:05 master/9fe21bab
The choice is then linked:
muse link v09_14_00
# or
muse link master/be905d45
Some examples follow.
A user has an analysis repo they want to build against a recent Offline build. The user would clone the analysis repo in their workDir then link a backing build.
cd WorkDir
muse link
muse link master/be905d45
git clone git@github.com:<your GitHub username>/Ana1
muse setup
muse build -j 20
A user wants to link a backing build, then do a partial checkout of Offline for a small local change, or a quick-fix development.
cd WorkDir
muse link
muse link master/be905d45
git clone git@github.com:<your GitHub username>/Ana1
mgit init
cd Offline
mgit add HelloWorld
# edits..
cd ..
muse setup
muse build -j 20
Multiple users are sharing a build of an Offline and an ntuple. In the Offline builder's area, call it workDir0,
cd WorkDir0
git clone https://github.com/Mu2e/Offline
git clone git@github.com:<your GitHub username>/MyNtuple
muse setup
muse build -j 20
then in the user's separate workDir
cd workDir
muse link workDir0/Offline
muse link workDir0/MyNtuple
git clone git@github.com:<your GitHub username>/Ana1
muse setup
muse build -j 20
The command makes links in the "link" directory, which is in workDir. When "muse setup" is run, it creates links from the local build area to the linked code build area.
cd workDir
muse link master/d3cc52b5
ls -l link
Offline -> /cvmfs/mu2e-development.opensciencegrid.org/museCIBuild/master/d3cc52b5/Offline
muse setup
ls -l build/sl7-prof-e20-p000/link
Offline -> /cvmfs/mu2e-development.opensciencegrid.org/museCIBuild/master/d3cc52b5/build/sl7-prof-e20-p000/Offline
Note the link command does not select prof/debug or other possible qualifier choice - those are set in the "muse setup" command. If the requested linked build area does not exist, you will get a warning during "setup".
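To see ahead of time whether a linked repo actually has a build for the qualifiers you intend to use, a check like the following could be run. It is a sketch only: the function name check_links is hypothetical, and it assumes the layout shown in the listing above, where the backing build sits at build/<stub>/<repo> next to the linked repo directory.

```shell
# Hypothetical check: does each linked repo have a build for this stub?
# Usage: check_links <workDir> <stub>, for example sl7-prof-e20-p000
check_links() (
    work=$1; stub=$2; status=0
    for l in "$work"/link/*; do
        [ -L "$l" ] || continue
        repo=$(basename "$l")
        target=$(readlink "$l")
        # Assumed layout: <area>/<repo> and <area>/build/<stub>/<repo>
        backing="$(dirname "$target")/build/$stub/$repo"
        if [ ! -d "$backing" ]; then
            echo "warning: no $stub build for linked $repo at $backing" >&2
            status=1
        fi
    done
    exit $status
)

# Demonstration with a fake backing area holding only a prof build.
demo=$(mktemp -d)
mkdir -p "$demo/backing/Offline" "$demo/backing/build/sl7-prof-e20-p000/Offline" "$demo/work/link"
ln -s "$demo/backing/Offline" "$demo/work/link/Offline"
check_links "$demo/work" sl7-prof-e20-p000 && echo "prof backing build found"
check_links "$demo/work" sl7-debug-e20-p000 || echo "debug backing build missing"
```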
If you link a repo name which has already been linked, the link will be updated, and a warning issued. After changing a link, the user must wipe out the current build
muse build -c -j 20
# or
rm -rf build
and re-run "muse setup" in a new process to finish the change.
muse build
This command is how the code in workDir is compiled and linked. This command can only be run after "muse setup". All build products go under the build subdirectory in workDir. Repos linked in the link subdirectory by the "muse link" command will have links to their build products under the build directory, created by "muse setup".
All arguments to "muse build" are passed to the scons command.
muse setup workDir
muse build -j 20
scons features should all work. To remove a build:
muse build -c
to build a target
muse build -j 20 --mu2eCompactPrint build/sl7-prof-e20-p000/Offline/lib/libmu2e_HelloWorld_HelloWorld_module.so
scons: Reading SConscript files ...
Linking build/sl7-prof-e20-p000/Offline/lib/libmu2e_HelloWorld_HelloWorld_module.so
or
muse build GDML
scons: Reading SConscript files ...
...
[Tue Mar 30 16:34:15 CDT 2021] procs.sh starting GDML
dir build/$MUSE_STUB/Offline/gen/gdml
-rw-r--r-- 1 rlc mu2e 5.2M Mar 30 16:34 mu2e.gdml
The python driving the scons build is in $MUSE_DIR/python.
The scons command itself is run in workDir.
muse status
The "muse status" command attempts to tell you useful things about how muse is setup in this process, or what setups are available. If run before "muse setup", then the command will assume the default directory is a workDir. It will analyze what's there and what builds have been created.
cd workDir
muse status
existing builds:
sl7-prof-e20-p000 Build times: 03/28/21 17:24:29 to 17:27:35
sl7-debug-e20-p000 Build times: 03/26/21 10:20:19 to scons error 23
muse setup
muse status
existing builds:
sl7-prof-e20-p000 ** this is your current setup ** Build times: 03/28/21 17:24:29 to 17:27:35
sl7-debug-e20-p000 Build times: 03/26/21 10:20:19 to scons error 23
MUSE_WORK_DIR = /mu2e/app/users/rlc/muse/phase2/test4
MUSE_REPOS = Offline link/Offline
...
The current setup is flagged. The build command will record its latest start and stop times. If the stop time is not present, then the build is still running or failed.
muse tarball
The "muse tarball" command makes a tarball of the build area, ready to be submitted to the grid.
muse tarball
Tarball: /mu2e/data/users/rlc/museTarball/tmp.Sh0737kx1M/Code.tar.bz2
which can be submitted:
setup mu2egrid
mu2eprodsys ... --code=/mu2e/data/users/rlc/museTarball/tmp.Sh0737kx1M/Code.tar.bz2
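In a script, the printed location can be captured instead of copied by hand. The sketch below assumes the output line has exactly the "Tarball: <path>" form shown above; the helper name tarball_path is hypothetical, and a sample line stands in for the real command so the snippet can be tried anywhere.

```shell
# Hypothetical filter: extract the path from a "Tarball: <path>" line.
tarball_path() {
    sed -n 's/^Tarball: *//p' | tail -n 1
}

# Sample line copied from the example above. In a real session this
# would be: code=$(muse tarball | tarball_path)
sample='Tarball: /mu2e/data/users/rlc/museTarball/tmp.Sh0737kx1M/Code.tar.bz2'
code=$(printf '%s\n' "$sample" | tarball_path)

# Print (not run) the submission option that would use it.
echo "mu2eprodsys ... --code=$code"
```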
additional files can be included:
muse tarball *.txt
If workDir contains repos linked from cvmfs, the tarball will contain a link to the cvmfs build. If the linked repo is on a local disk, it will be included in the tarball.
If there are files named setup_pre.sh or setup_post.sh in workDir, they will be included in the tarball, and sourced before (pre) and after (post) the "muse setup" command.
mgit partial build
On occasion, a project requires only a slight modification to Offline to be built and run. Some examples of this need are simple fixes, adding a debug print, or modifying a few lines of fcl. In this case, it seems a waste of time to build all of the Offline repo locally. mgit allows you to build only one, or a small number of subdirectories, from the Offline repo. mgit is only available for the Offline repo.
mgit is Muse's successor to pgit, which is deprecated.
Here is a typical scenario where the user wants their analysis built together with a small piece of Offline, locally modified, and the rest of Offline coming from a backing build:
cd WorkDir
muse link
muse link master/be905d45
git clone git@github.com:<your GitHub username>/Ana1
# checkout Offline, and create mu2e and origin remotes
mgit init
cd Offline
# add and build only this subdirectory locally, everything else from the linked build
mgit add HelloWorld
# edits..
cd ..
muse setup
muse build -j 20
Linking order
In order to preserve the one-pass linking Mu2e has established by policy, the repos being built in Muse need to have a specified link order. There is a default link order which will be centrally maintained and will include all repos known to the collaboration generally. When you ask Muse to build your personal analysis repos, or anything not in the central link order, Muse will insert the unknown repos in the beginning of the link order by default. If you have multiple local repos which are not being linked in the right order, you can tell Muse the link order. You do this by creating a new "linkOrder" file locally.
cd workDir
mkdir -p muse
echo "Ana1 Ana2 " > muse/linkOrder
cat $MUSE_ENVSET_DIR/linkOrder >> muse/linkOrder
cat muse/linkOrder
Ana1 Ana2 Tutorial Offline
muse setup
muse status
...
MUSE_LINK_ORDER = Ana1 Ana2 Tutorial Offline
and the repo library directories will be searched in that order.
Linked (in the bash soft link sense) repos come after their local counterparts, i.e. an Offline in workDir will be linked before the backing Offline that link/Offline points to. This is the mechanism that allows mgit to work with a backing build.
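The default behaviour described in this section (repos unknown to the central link order go first) can be sketched as follows. This is an illustration, not the Muse code: the function name link_order is hypothetical, the central file is assumed to be a whitespace-separated list of repo names, and the relative order of several unknown repos is assumed to follow the order in which they are given.

```shell
# Usage: link_order <central linkOrder file> <repo>...
# Prints unknown repos first, followed by the central order.
link_order() (
    central=$(cat "$1"); shift
    order=""
    for repo in "$@"; do
        case " $(echo $central) " in
            *" $repo "*) ;;                  # known centrally, keeps its place
            *) order="$order $repo" ;;       # unknown, goes to the front
        esac
    done
    echo $order $central
)

# Demonstration with a stand-in for $MUSE_ENVSET_DIR/linkOrder.
demo=$(mktemp -d)
printf 'Tutorial\nOffline\n' > "$demo/linkOrder"
link_order "$demo/linkOrder" Offline Ana1 Ana2   # prints: Ana1 Ana2 Tutorial Offline
```

If this default is not what you need, writing muse/linkOrder by hand, as shown above, overrides it.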
Customizing setups
This procedure allows the user to provide their own envset. This might be necessary to test code with a new compiler, to test a new set of art libraries, or to test a new version of geant. This procedure will generally only be needed by collaborators doing code management.
To create a custom envset, start with the most recent published envset. This is probably most reliably seen in the Offline/.muse file from the head of Offline.
cd workDir
git clone https://github.com/Mu2e/Offline
grep ENVSET Offline/.muse
ENVSET p000
mkdir -p muse
cp /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse/p000 muse/u001
# edit muse/u001 as needed
muse setup -q u001
muse build -j 20
The user can build the same code in the same area both with the default envset p000 and the custom envset u001. The two builds will appear in different directories under the build directory. The two setup/builds will need to be done in separate processes. The two processes will produce the appropriate two different tarballs.
All local envsets must be named uNNN, to make sure they are not confused with copies of the published envsets pNNN. The "u" is for "user", the "p" is for "published".
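When several custom envsets accumulate, the next free uNNN name can be computed rather than guessed. The helper below is a sketch; the name next_user_envset is hypothetical, and starting the sequence at u001 is an assumption taken from the example above.

```shell
# Usage: next_user_envset <dir holding uNNN files, e.g. workDir/muse>
next_user_envset() (
    last=$(ls "$1" 2>/dev/null | grep -E '^u[0-9][0-9][0-9]$' | sort | tail -n 1)
    if [ -z "$last" ]; then echo u001; exit 0; fi
    # Strip the "u" and leading zeros so the number is not read as octal.
    n=$(echo "${last#u}" | sed 's/^0*//')
    printf 'u%03d\n' $(( ${n:-0} + 1 ))
)

# Demonstration in a scratch area; published pNNN copies are ignored.
demo=$(mktemp -d)
touch "$demo/u001" "$demo/u002" "$demo/p000"
next_user_envset "$demo"    # prints: u003
```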
Typically, once the new envset is validated,
- u001 would be renamed "p001" as the successor to "p000"
- p001 is committed to the Muse repo, config directory, to archive it. This is done with the standard git PR procedures in the Muse repo.
- p001 is uploaded to the /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse directory. This is done with the standard CVMFS upload procedure.
- when the code to work with the new products is merged, the ENVSET line in Offline/.muse should be updated to p001 in the same PR
One of the techniques to determine the envset without user or repo guidance is to accept the pNNN set with the highest NNN. So while it is possible to publish arbitrary envsets for arbitrary reasons, the highest-numbered one may be picked up by inexperienced users even if it was not intended for general use.
If you have modified the PRODUCTS path to pick up a locally built UPS product that is in the envset (KinKal or BTrk are the prime examples), then "muse tarball" will sense that and include the local UPS area in the tarball.
Converting an existing build area
The traditional build of Offline (setup.sh and scons commands) can co-exist with the Muse build since the two sets of build products are in different places. If you want, you can build both, and select which build you run by which setup commands you run. This could be used to compare results, or to confirm that Muse is working before switching to Muse.
If you have an existing Offline checked out in a directory, that directory can become your Muse working directory. Simply, cd to that directory and "muse setup". All functionality is the same as if you created the area specifically for Muse. Secondary repos that are Muse-ready can be added to this working directory. Secondary repos that are not Muse-ready will probably have to be built by whatever methods are currently in use. We recommend making all secondary repos Muse-ready.
Making a repo Muse-ready
It is anticipated that ntuples, calibration and user analysis repos will be built together in the required combinations using Muse. This implies that the user analysis repos will be organized in a way that Muse can operate on them.
The first step is to get the repo recognized by Muse. This is accomplished by adding a file .muse to the top level of the repo. When Muse goes to build a workDir, it will ignore any subdirectory without a .muse.
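A quick way to see which subdirectories of a workDir would be recognized under this rule is to look for the marker file. The following is a sketch of that rule only, with a hypothetical function name, and does not reproduce anything else Muse does when it scans a workDir.

```shell
# Usage: muse_repos <workDir>
# Prints the subdirectories that contain a top-level .muse file.
muse_repos() (
    cd "$1" || exit 1
    for d in */; do
        d=${d%/}
        if [ -f "$d/.muse" ]; then echo "$d"; fi
    done
)

# Demonstration: "notes" has no .muse and is skipped.
demo=$(mktemp -d)
mkdir "$demo/Offline" "$demo/Ana1" "$demo/notes"
touch "$demo/Offline/.muse" "$demo/Ana1/.muse"
muse_repos "$demo"
```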
The second step is to drive the building of libraries, modules and bins in the scons build. This can be achieved by following the patterns in the Offline or Tutorial subdirectories.
The .muse file can also help configure the Muse build. Please see the algorithm that Muse uses to find the envset (essentially the preferred art products version). One of the prioritized steps is to look in the repos' .muse files for a suggested envset, which will look like:
ENVSET p000
If your repo has an envset preference, you can include it here. It will most likely be used only if there is no Offline repo involved in the workDir, since Offline will overrule your repo's suggestion. You can also add directories to the PATH and PYTHONPATH, as in this example from Offline:
PYTHONPATH Trigger/python
PATH bin
The paths are relative to the top level of the repo, but will become full paths in the setup.
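How relative entries become full paths can be illustrated with a short awk filter. This is a sketch of the idea, not the code Muse runs; the function name muse_paths is hypothetical, and the repo directory is assumed to be given as an absolute path.

```shell
# Usage: muse_paths <absolute repo dir> <PATH|PYTHONPATH>
# Prints each matching .muse entry prefixed with the repo directory.
muse_paths() {
    awk -v key="$2" -v top="$1" '$1 == key { print top "/" $2 }' "$1/.muse"
}

# Demonstration with the example entries from Offline.
demo=$(mktemp -d)
printf 'ENVSET p000\nPYTHONPATH Trigger/python\nPATH bin\n' > "$demo/.muse"
muse_paths "$demo" PATH
muse_paths "$demo" PYTHONPATH
```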