Muse: Difference between revisions

From Mu2eWiki
Revision as of 19:11, 30 June 2021

Introduction

As Mu2e approaches data-taking, many more ntuples, calibration procedures, and analysis code will be developed. Including this user code in the main Offline repo would be unwieldy and hard to maintain. The main Offline repo should remain minimal, including only what is needed for simulation and reconstruction jobs, for simple maintenance and faster builds. Calibration, ntuple, and analysis code that is maintained and used by individuals and small groups should be stored in smaller repos controlled by the appropriate group. These smaller repos could be standalone, or could contain modules and utilities that depend on code and functionality in other small repos or in Offline. To make this work, we need a system to build the smaller repos against an existing Offline build, or together with Offline, sharing include files, fcl, and cross-linking. The system should support common use cases and be flexible. The Muse system, developed inside Mu2e, provides this functionality. It is a UPS product containing a set of scripts, with support for scons code builds at the core.

On this page we will use Ana1 and Ana2 as examples of smaller, non-Offline code repos. WorkDir (also written workDir) will represent the path to a user project area on /mu2e/app, or possibly another disk where you can build code. Muse is designed around this workDir, which will contain the repos to be built. If building Offline, workDir contains the Offline directory. AnaDir will represent an area where analysis files are kept: scripts, root scripts, ntuples, histograms, and notes. If the user's work is mostly concerned with code development and commits, then it may be convenient to use WorkDir as AnaDir. If the work is mostly using a static build, then they may be different. There may also be several versions of AnaDir, or several AnaDirs with different goals, all using the same build in workDir.

Muse maintenance operations for experts are on MuseMaintenance.

Quick Start

Always required:

 setup mu2e
 setup muse

Built-in help:

 muse -h
 muse <command> -h
 muse status

Muse can only be set up once in a process and the setup cannot be undone. To change setup choices, start a new process. When building on mu2egpvm01-06, use "-j 4"; when building on mu2ebuild01, use "-j 20" (see Machines). See GitHubWorkflow for more details on using git. See mgit for more information on using partial checkout (building only parts of Offline).

Muse scripts are focused on the workDir which provides the code to be built, a place to build the code, and persistency of what the user is building and using. There can only be one workDir in a process. After muse is UPS setup ("setup muse"), the workDir is usually configured by cd'ing to the workDir, checking out repos

 cd workDir
 git clone ...

or linking in pre-built backing releases.

 cd workDir
 muse link ...

Once the code is organized, run "muse setup":

 cd workDir
 muse setup

or from any directory:

 muse setup workDir

Once "muse setup" has been run, then the list of repos which will be built, and the build options, are locked in for the process. If you add or delete repos, or change a link to a backing build, then you must run "muse setup" again in a new process. The status report:

 muse status

can be run before or after "muse setup". If run before, it assumes the default directory is the workDir. If run after, it can be run from anywhere.

Builds can be removed:

 muse build -c

or simply remove the build directory content:

 rm -rf workDir/build

Nothing is built in the repo git area; everything built is under the build directory.

Build Offline

Build the Offline repo locally:

 cd WorkDir
 git clone https://github.com/Mu2e/Offline
    or
 git clone git@github.com:<your GitHub username>/Offline
    optionally, add other repos:
 git clone git@github.com:<your GitHub username>/Ana1
 muse setup
   edit files, as needed..
 muse build -j 20 --mu2eCompactPrint
   run
 mu2e -c ..

Returning later

To re-setup, later, in another process

 cd WorkDir
 muse setup

or

 cd AnaDir
 muse setup WorkDir

Setup options

In one process/window:

 cd WorkDir
 muse setup
 muse build -j 20
 mu2e -c ...

In another process:

 cd WorkDir
 muse setup -q debug
 muse build -j 20
 gdb --args  mu2e -c ...

Setup published releases

When Offline is tagged, it is built and published on cvmfs. You can set up and use these releases.

See what is there

setup muse
muse link

setup a version:

muse setup Offline v10_00_00

setup what is current

muse setup Offline

or link it locally:

muse link Offline v10_00_00

Backing build

Before "muse setup", link in a pre-built Offline:

 cd WorkDir
    see options for backing builds:
 muse link
    link in a release:
 muse link master/be905d45
    optionally, add smaller repos:
 git clone git@github.com:<your GitHub username>/Ana1
 muse setup
 muse build -j 20

Partial checkout with backing build

Before "muse setup", link in a pre-built Offline:

 cd WorkDir
    see options for backing builds:
 muse link
    link in a release:
 muse link master/be905d45
    optionally, add smaller repos:
 git clone git@github.com:<your GitHub username>/Ana1
    add Offline partial build
 mgit init
 cd Offline
 mgit add HelloWorld
    edits..
 mgit status
 cd ..
 muse setup
 muse build -j 20

See more about mgit below.

The build command can be run from any directory after "muse setup". The developer might find it useful to have two windows open, both set up to the same workDir: one sitting in workDir, where code editing is done and the build command is issued, and a second window sitting in AnaDir, where code is run and output is kept.

Tarball

After preparing and testing a build area,

 cd WorkDir
 muse setup (if not already done)
 muse tarball

The tarball location will be printed. This can be sent to the grid with --code=/path/tarball in mu2eprodsys.

Redirecting build output

It may be convenient for a user to keep code in one place, on one disk system, and write the build output (libraries) to another area. A typical use case might be keeping the source code on /mu2e/app and writing the output to /mu2e/data. All the output from a build is under the "build" subdirectory in workDir. To redirect the output:

ln -s /mu2e/data/users/$USER/myproject build

which makes the "build" subdirectory into a link. This link should be made while organizing the code in workDir ("git clone", "muse link") and must be before "muse setup", which might create "build" as a subdirectory.

The use of a subdirectory, here called "myproject", is very important. This directory name should reflect the location of the source code, and be unique to the source code. If you link the output of multiple source code workDir's to one output directory, they will overwrite each other. Every source code area needs its own build area.

If you have local builds and want to move them off the local area, you can just move them, then link that area:

mv build /mu2e/data/users/$USER/myproject
ln -s /mu2e/data/users/$USER/myproject build
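
The two rules in this section (make the link before "build" exists, and give every workDir its own output directory) can be checked before linking. The following is a minimal sketch, not part of Muse: redirect_build is a hypothetical helper, and it only covers the fresh-link case, not moving an existing build.

```shell
# Sketch only: redirect_build is a hypothetical helper, not a Muse command.
# It covers the fresh-link case; moving an existing build is not handled.
redirect_build() {
    # $1 = workDir, $2 = output directory that "build" should point to
    local work_dir="$1" target="$2"
    # "build" must not exist yet, either as a directory or as a link
    if [ -e "$work_dir/build" ] || [ -L "$work_dir/build" ]; then
        echo "error: $work_dir/build already exists" >&2
        return 1
    fi
    # a non-empty target probably holds the output of another workDir
    if [ -d "$target" ] && [ -n "$(ls -A "$target")" ]; then
        echo "error: $target is not empty" >&2
        return 1
    fi
    mkdir -p "$target" && ln -s "$target" "$work_dir/build"
}
```

From workDir, this would be called as redirect_build "$PWD" /mu2e/data/users/$USER/myproject, before running "muse setup".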

Muse product

The Muse UPS product is created out of the Mu2e/Muse repo. When the product is setup,

 setup mu2e
 setup muse

it will do the following.

  1. add $MUSE_DIR/bin to the path
  2. add $MUSE_DIR/python to PYTHONPATH
  3. add the alias muse="source $MUSE_DIR/bin/muse"
  4. export MUSE_ENVSET_DIR=/cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse

The product can be packaged from a tag by checking out the tag in the repo, then

Muse/bin/museInstall.sh <version> <path to output product area>

muse setup

The command

muse setup

sets link, fhicl, and other paths, and sets up scons and art. Typically, the user would first organize their repos in the workDir, or link backing release builds, before running "muse setup". This command is also typically run when returning to work on the code in workDir in a later process.

The command must be run at the bash command line so that the alias is available, causing muse to be sourced so it can alter your environment. All other commands are run in subshells and will not affect the environment.

The setup command has two forms:

 cd workDir
 muse setup

or from anywhere, if workDir is provided:

 muse setup workDir

Options can be provided:

 muse setup <workDir> -q <options>

There is a small set of allowed options, such as prof/debug and the geant visualization build options. See

 muse setup -h

for the list of options. Examples (can only run one in a process)

 muse setup
 muse setup -q debug
 muse setup workDir -q e21
 muse setup workDir -q e21:trigger:debug


The effects of running setup are the following

  1. sets environmentals used by Muse, such as the workDir (MUSE_WORK_DIR)
  2. uses UPS to find the machine flavor
  3. parses the options
  4. determines which set of UPS art products to setup, and does the setup
  5. analyzes available information and determines a repo link order
  6. adds the workDir area to the bin, link, include and other paths
  7. for linked backing builds, links in their build to the workDir build area

and you will see the following (or something like it):

Build: prof     Core: sl7 e20 p008    Options:

prof is prof or debug; sl7 will be replaced with centos; p008 is the envset, like specifying a version of setup.sh. p008 is (p)ermanent version 8 and you can expect that to change. You will also see that p008 in Offline/.muse.

The resulting environment can be examined with

muse status
muse -v status

The UPS version of art, and all the other associated products, including a few that are not always needed but are convenient, like valgrind, are set up as one "environmental set" or "envset". In implementation, an envset is a script to be sourced as part of "muse setup". This script chooses some defaults (if the user doesn't force a choice), then does the UPS art setups. Envsets are files named pNNN if published permanently, or uNNN if customized by the user, where N represents a digit.


The "muse setup" command uses the following priority list to find which envset to use, and therefore which version of art to setup.

  1. the user has forced an explicit choice by a qualifier like "-q p000"
  2. if an Offline repo is present as a local checkout, an mgit partial checkout, or a link, and it has a .muse file, use the recommendation in there, if any
  3. any other local package with a .muse file with a recommendation
  4. if $MUSE_WORK_DIR/muse/uNNN exists, where N represents a digit, take the highest number there
  5. use highest number from $MUSE_ENVSET_DIR
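
The last two steps of the list above can be sketched in shell. This is a minimal illustration under stated assumptions, not Muse's actual code: highest_envset and choose_envset are hypothetical helpers, steps 1 to 3 are omitted, and envset files are assumed to be named with exactly three digits.

```shell
# Sketch only: highest_envset and choose_envset are hypothetical helpers.
highest_envset() {
    # $1 = directory, $2 = prefix letter ("p" or "u")
    local f best=""
    for f in "$1"/"$2"[0-9][0-9][0-9]; do
        # glob results are sorted and the names have a fixed width,
        # so the last regular file seen has the highest number
        if [ -f "$f" ]; then best="${f##*/}"; fi
    done
    if [ -n "$best" ]; then echo "$best"; else return 1; fi
}

choose_envset() {
    # $1 = workDir, $2 = published envset directory (the role of $MUSE_ENVSET_DIR)
    # step 4: a local uNNN wins; step 5: otherwise the highest published pNNN
    highest_envset "$1/muse" u || highest_envset "$2" p
}
```

With the names used on this page, the call would be choose_envset "$MUSE_WORK_DIR" "$MUSE_ENVSET_DIR".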

There is a way to make your own custom local envset, see that section.

muse link

The "muse link" command is used to link (in the bash soft link sense) a repo, that was built earlier, elsewhere, to the local workDir. This is sometimes called a "backing build".

The linked repo must have been built by Muse, in its own workDir.  The link will allow Muse to include from and link to (in the gcc sense) the other build. Muse will not attempt to build, or alter in any way, the linked repo.

The linked repo can be on a local disk, or on cvmfs. Running "muse link" with no argument gives a list of available cvmfs builds. There are two groups: first, a list of recent tagged and built releases, and second, a list of continuous integration (CI) builds. CI builds are made automatically every time code is merged in the main branches in the Offline repo.

The "muse link" commands, to see available builds and make links, are typically done while the default dir is workDir, while organizing code in that area, before "muse setup" is run.

muse link
  Recent published releases:
  v09_14_00
  v09_13_00
  Recent CI builds
  2021-03-29 11:05 master/be905d45
  2021-03-25 00:05 master/d3cc52b5
  2021-03-24 22:05 master/9fe21bab

The choice is then linked:

  muse link v09_14_00
     or
  muse link master/be905d45 


Some examples follow.

A user has an analysis repo they want to build against a recent Offline build. The user would clone the analysis repo in their workDir then link a backing build.

 cd WorkDir
 muse link
 muse link master/be905d45
 git clone git@github.com:<your GitHub username>/Ana1
 muse setup
 muse build -j 20

A user wants to link a backing build, then do a partial checkout of Offline for a small local change, or a quick-fix development.

 cd WorkDir
 muse link
 muse link master/be905d45
 git clone git@github.com:<your GitHub username>/Ana1
 mgit init
 cd Offline
 mgit add HelloWorld
    edits..
 cd ..
 muse setup
 muse build -j 20

Multiple users are sharing a build of an Offline and an ntuple. In the Offline builder's area, call it workDir0,

 cd WorkDir0
 git clone https://github.com/Mu2e/Offline
 git clone git@github.com:<your GitHub username>/MyNtuple
 muse setup
 muse build -j 20

then in the user's separate workDir

 cd workDir
 muse link workDir0/Offline
 muse link workDir0/MyNtuple
 git clone git@github.com:<your GitHub username>/Ana1
 muse setup
 muse build -j 20


The "muse link" command makes links in the "link" directory, which is in workDir. When "muse setup" is run, it creates links from the local build area to the linked code build area.

cd workDir
muse link master/d3cc52b5
ls -l link
   Offline -> /cvmfs/mu2e-development.opensciencegrid.org/museCIBuild/master/d3cc52b5/Offline
muse setup
ls -l build/sl7-prof-e20-p000/link
   Offline -> /cvmfs/mu2e-development.opensciencegrid.org/museCIBuild/master/d3cc52b5/build/sl7-prof-e20-p000/Offline

Note the link command does not select prof/debug or other possible qualifier choice - those are set in the "muse setup" command. If the requested linked build area does not exist, you will get a warning during "setup".

If you link a repo name which has already been linked, the link will be updated, and a warning issued. After changing a link, the user must wipe out the current build

 muse build -c -j 20
    or
 rm -rf build

and re-run "muse setup" in a new process to finish the change.


muse build

This command is how the code in workDir is compiled and linked. This command can only be run after "muse setup". All build products go under the build subdirectory in workDir. Repos linked in the link subdirectory by the "muse link" command, will have links to their build products under the build directory, created by "muse setup".

All arguments to "muse build" are passed to the scons command.

 muse setup workDir
 muse build -j 20

scons features should all work. To remove a build:

 muse build -c

to build a target

 muse build -j 20 --mu2eCompactPrint build/sl7-prof-e20-p000/Offline/lib/libmu2e_HelloWorld_HelloWorld_module.so
   scons: Reading SConscript files ...
   Linking build/sl7-prof-e20-p000/Offline/lib/libmu2e_HelloWorld_HelloWorld_module.so

or

 muse build GDML
   scons: Reading SConscript files ...
   ... 
   [Tue Mar 30 16:34:15 CDT 2021] procs.sh starting GDML
 dir build/$MUSE_STUB/Offline/gen/gdml 
   -rw-r--r-- 1 rlc mu2e 5.2M Mar 30 16:34 mu2e.gdml

The python driving the scons build is in $MUSE_DIR/python.

The scons command itself is run in workDir.

muse status

The "muse status" command attempts to tell you useful things about how muse is set up in this process, or what setups are available. If run before "muse setup", the command will assume the default directory is a workDir. It will analyze what's there and what builds have been created.

cd workDir
muse status
  existing builds:
    sl7-prof-e20-p000
         Build times: 03/28/21 17:24:29 to 17:27:35
    sl7-debug-e20-p000
         Build times: 03/26/21 10:20:19 to scons error 23
muse setup
muse status
  existing builds:
    sl7-prof-e20-p000         ** this is your current setup **
         Build times: 03/28/21 17:24:29 to 17:27:35
    sl7-debug-e20-p000
         Build times: 03/26/21 10:20:19 to scons error 23
 MUSE_WORK_DIR = /mu2e/app/users/rlc/muse/phase2/test4 
 MUSE_REPOS =  Offline link/Offline
 ...

The current setup is flagged. The build command will record its latest start and stop times. If the stop time is not present, then the build is still running or failed.
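
The build names in the listing above appear to encode the machine flavor, the build type, the compiler qualifier, and the envset, in that order. Under that assumption, which is inferred from the examples on this page rather than from the Muse code, a name can be split into its fields as follows. parse_build_name is a hypothetical helper, and names with additional qualifiers are rejected rather than interpreted.

```shell
# Sketch only: parse_build_name is a hypothetical helper. The four-field layout
# is inferred from names such as sl7-prof-e20-p000 shown on this page.
parse_build_name() {
    local flavor build compiler envset extra
    IFS=- read -r flavor build compiler envset extra <<EOF
$1
EOF
    # require exactly four dash-separated fields
    if [ -z "$envset" ] || [ -n "$extra" ]; then
        echo "unexpected build name: $1" >&2
        return 1
    fi
    printf 'flavor=%s build=%s compiler=%s envset=%s\n' \
        "$flavor" "$build" "$compiler" "$envset"
}

parse_build_name sl7-debug-e20-p000
# prints: flavor=sl7 build=debug compiler=e20 envset=p000
```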


muse tarball

The "muse tarball" command makes a tarball of the build area, ready to be submitted to the grid.

 muse tarball
  Tarball: /mu2e/data/users/rlc/museTarball/tmp.Sh0737kx1M/Code.tar.bz2

which can be submitted:

 setup mu2egrid
 mu2eprodsys ... --code=/mu2e/data/users/rlc/museTarball/tmp.Sh0737kx1M/Code.tar.bz2

additional files can be included:

 muse tarball *.txt

If workDir contains repos linked from cvmfs, the tarball will contain a link to the cvmfs build. If the linked repo is on a local disk, it will be included in the tarball.
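
To see in advance which linked repos would be referenced and which would be copied, the link targets can be inspected. This sketch assumes that a repo counts as "on cvmfs" when its link target starts with /cvmfs/; how "muse tarball" actually decides is not stated here, and link_kind is a hypothetical helper.

```shell
# Sketch only: link_kind is a hypothetical helper, not a Muse command.
# Assumption: a link whose target starts with /cvmfs/ is a cvmfs build.
link_kind() {
    # $1 = path of a soft link, for example workDir/link/Offline
    local target
    target=$(readlink "$1") || return 1    # fails if $1 is not a soft link
    case "$target" in
        /cvmfs/*) echo cvmfs ;;
        *)        echo local ;;
    esac
}
```

For example, link_kind link/Offline run in workDir reports which case applies to the linked Offline.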

If there are files named setup_pre.sh or setup_post.sh in workDir, they will be included in the tarball, and sourced before (pre) and after (post) the "muse setup" command.

mgit partial build

On occasion, a project only requires slight modification to Offline to be built and run. Some examples of this need are simple fixes, adding a debug print, or modifying a few lines of fcl. In this case, it seems a waste of time to build all of the Offline repo locally. mgit allows you to build only one, or a small number of subdirectories, from the Offline repo. mgit is only available for the Offline repo.

mgit is Muse's successor to pgit, which is deprecated.


Here is a typical scenario where the user wants their analysis built together with a small piece of Offline, locally modified, and the rest of Offline coming from a backing build:

cd WorkDir
muse link
muse link master/be905d45
git clone git@github.com:<your GitHub username>/Ana1
# checkout Offline, and create mu2e and origin remotes
mgit init
cd Offline
# add and build only this subdirectory locally, everything else from the linked build
mgit add HelloWorld
   edits..
cd ..
muse setup
muse build -j 20

Linking order

In order to preserve the one-pass linking Mu2e has established by policy, the repos being built in Muse need to have a specified link order. There is a default link order which will be centrally maintained and will include all repos known to the collaboration generally. When you ask Muse to build your personal analysis repos, or anything not in the central link order, Muse will insert the unknown repos in the beginning of the link order by default. If you have multiple local repos which are not being linked in the right order, you can tell Muse the link order. You do this by creating a new "linkOrder" file locally.

cd workDir
mkdir -p muse
echo "Ana1 Ana2 " > muse/linkOrder
cat $MUSE_ENVSET_DIR/linkOrder >> muse/linkOrder
cat muse/linkOrder
   Ana1 Ana2 Tutorial Offline
muse setup
muse status
     ...
     MUSE_LINK_ORDER =  Ana1 Ana2 Tutorial Offline

and the repo library directories will be searched in that order.

Linked (in the bash soft link sense) repos come after their local counterparts, i.e. an Offline in workDir will be linked before the backing Offline that link/Offline points to. This is the mechanism that allows mgit to work with a backing build.
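
The default behavior described in this section, where repos missing from the central link order are placed at the beginning, can be sketched as follows. make_link_order is a hypothetical helper and not Muse's implementation; the default order "Tutorial Offline" is taken from the example above.

```shell
# Sketch only: make_link_order is a hypothetical helper.
make_link_order() {
    # $1 = default link order (space separated), remaining args = repos in workDir
    local default_order="$1"; shift
    local repo unknown=""
    for repo in "$@"; do
        case " $default_order " in
            *" $repo "*) ;;                        # already in the default order
            *) unknown="$unknown$repo " ;;         # unknown repos go to the front
        esac
    done
    printf '%s%s\n' "$unknown" "$default_order"
}

make_link_order "Tutorial Offline" Ana1 Offline Ana2
# prints: Ana1 Ana2 Tutorial Offline
```

The unknown repos keep the order in which they are given, which is why a local linkOrder file is needed when Ana2 must come before Ana1.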

Customizing setups

This procedure allows the user to provide their own envset. This might be necessary to test code with a new compiler, to test a new set of art libraries, or to test a new version of geant. This procedure will generally only be needed by collaborators doing code management.

To create a custom envset, start with the most recent published envset. This is probably most reliably seen in the Offline/.muse file from the head of Offline.

cd workDir
git clone https://github.com/Mu2e/Offline
grep ENVSET Offline/.muse
   ENVSET p000
mkdir -p muse
cp /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse/p000 muse/u001
   edit muse/u001 as needed
muse setup -q u001
muse build -j 20

The user can build the same code in the same area both with the default envset p000 and the custom envset u001. The two builds will appear in different directories under the build directory. The two setup/builds will need to be done in separate processes. The two processes will produce the appropriate two different tarballs.

All local envsets must be named uNNN, to make sure they are not confused with copies of the published envsets pNNN. The "u" is for "user", the "p" is for "published".
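
The naming rule can be checked with a shell pattern before a new envset file is used. This is a small sketch; envset_kind is a hypothetical helper, and it assumes exactly three digits, as in the pNNN and uNNN names on this page.

```shell
# Sketch only: envset_kind is a hypothetical helper.
envset_kind() {
    # $1 = envset file name, without directory
    case "$1" in
        p[0-9][0-9][0-9]) echo published ;;
        u[0-9][0-9][0-9]) echo user ;;
        *) echo invalid; return 1 ;;
    esac
}

envset_kind u001
# prints: user
```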

Typically, once the new envset is validated,

  1. u001 would be renamed "p001" as the successor to "p000"
  2. p001 is uploaded to the /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse directory, using the standard CVMFS upload procedure
  3. when the Offline code to work with the new products is merged, the ENVSET line in Offline/.muse should be updated to p001 in the same PR
  4. p001 is committed to the Muse repo, config directory. This is done with the standard git PR procedures in the Muse repo. Since it is only for archive purposes, exactly when this is done doesn't matter.

One of the techniques to divine the envset without user or repo guidance is to accept the pNNN set with the highest NNN. So while it is possible to publish arbitrary envsets for arbitrary reasons, the highest number may be picked up by some inexperienced users, though not intended for general use.

If you have modified the PRODUCTS path to pick up a locally built UPS product that is in the envset (KinKal or BTrk are the prime examples), then "muse tarball" will sense that and include the local UPS area in the tarball.

Converting an existing build area

The traditional build of Offline (setup.sh and scons commands) can co-exist with the Muse build since the two sets of build products are in different places. If you want, you can build both, and select which build you run by which setup commands you run. This could be used to compare results, or to confirm that Muse is working before switching to Muse.

If you have an existing Offline checked out in a directory, that directory can become your Muse working directory. Simply, cd to that directory and "muse setup". All functionality is the same as if you created the area specifically for Muse. Secondary repos that are Muse-ready can be added to this working directory. Secondary repos that are not Muse-ready will probably have to be built by whatever methods are currently in use. We recommend making all secondary repos Muse-ready.

Making a repo Muse-ready

It is anticipated that ntuples, calibration and user analysis repos will be built together in the required combinations using Muse. This implies that the user analysis repos will be organized in a way that Muse can operate on them.

The first step is to get the repo recognized by Muse - this is accomplished by adding a file .muse to the top level of the repo. When Muse goes to build a workDir, it will ignore any subdirectory without a .muse.
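
To check which subdirectories of a workDir would be recognized under this rule, one can look for the .muse file directly. This is a minimal sketch, not a Muse command: muse_ready_repos is a hypothetical helper, and it only looks at the top level of workDir, not at repos under the link directory.

```shell
# Sketch only: muse_ready_repos is a hypothetical helper.
muse_ready_repos() {
    # $1 = workDir; prints the name of each subdirectory that has a .muse file
    local d
    for d in "$1"/*/; do
        # $d ends in "/", so "$d.muse" is <subdirectory>/.muse
        if [ -f "$d.muse" ]; then
            basename "$d"
        fi
    done
    return 0
}
```

Running muse_ready_repos . in workDir lists the repos that would not be ignored.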

The second step is to drive the building of libraries, modules and bins in the scons build. This can be achieved by following the patterns in the Offline or Tutorial subdirectories.

The .muse file can also help configure the Muse build. Please see the algorithm that Muse uses to find the envset (essentially the preferred art products version). One of the prioritized steps is to look in the repos' .muse files for a suggested envset, which will look like:

ENVSET p000

If your repo has an envset preference, you can include it here. It will most likely be used only if there is no Offline repo involved in the workDir, since Offline will overrule your repo's suggestion. You can also add directories to the PATH and PYTHONPATH, as in these examples from Offline:

PYTHONPATH Trigger/python
PATH bin

The paths are relative to the top level of the repo, but will become full paths in the setup.
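
As an illustration of the file format, the following sketch reads a .muse file and prints the envset suggestion and the full paths that the PATH and PYTHONPATH lines would contribute. It only handles the three keywords shown above, and read_muse_file is a hypothetical helper, not the parser Muse uses.

```shell
# Sketch only: read_muse_file is a hypothetical helper.
read_muse_file() {
    # $1 = top-level directory of a repo that contains a .muse file
    local repo_dir="$1" key value
    # the "|| [ -n "$key" ]" part keeps a last line that has no trailing newline
    while read -r key value || [ -n "$key" ]; do
        case "$key" in
            ENVSET)          echo "envset $value" ;;
            PATH|PYTHONPATH) echo "$key $repo_dir/$value" ;;   # relative path made full
            *)               ;;                                 # anything else is ignored here
        esac
    done < "$repo_dir/.muse"
}
```

For the Offline example above, read_muse_file "$PWD/Offline" would report Trigger/python and bin as full paths under the Offline directory.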