Muse

From Mu2eWiki
Revision as of 22:13, 19 July 2024 by Kutschke (talk | contribs)

Introduction

As Mu2e approaches data-taking, many more ntuples, calibration procedures, and analysis code will be developed. Including this user code in the main Offline repo would be unwieldy and hard to maintain. The main Offline repo should remain minimal, including only what is needed for simulation and reconstruction jobs, for simple maintenance and faster builds. Calibration, ntuple, and analysis code that is maintained and used by individuals and small groups should be stored in smaller repos controlled by the appropriate group. These smaller repos could be standalone, or could contain modules and utilities that depend on code and functionality in other small repos or in Offline. To make this work, we need a system to build the smaller repos against an existing Offline build, or together with Offline, sharing include files, fcl, and cross-linking. The system should support common use cases and be flexible. Muse is a set of scripts developed inside Mu2e that provides this functionality. It is a UPS product containing a set of scripts, with support for scons code builds at the core.

In this page we will use Ana1 and Ana2 as examples of smaller, non-Offline code repos. WorkDir (also written workDir) will represent the path to a user project area on /mu2e/app, or possibly another disk where you can build code. Muse is designed around this workDir, which contains the repos to be built. If building Offline, workDir contains the Offline directory. AnaDir will represent an area where analysis files are kept: scripts, root scripts, ntuples, histograms, and notes. If the user's work is mostly concerned with code development and commits, it may be convenient to use workDir as AnaDir; if the work mostly uses a static build, the two may be different. There may also be several versions of AnaDir, or AnaDirs with different goals, all using the same build in workDir.

Muse maintenance operations for experts are on MuseMaintenance.

Quick Start

Always required:

 mu2einit

This will setup the muse product too. Built-in help:

 muse -h
 muse <command> -h
 muse status

Muse setup can only be run once in a process and can't be reversed. To change setup choices, start a new process. When building on mu2egpvm01-07, use "-j 4"; when building on mu2ebuild02, use "-j 20" (see Machines). See GitHubWorkflow for more details on using git. See mgit for more information on using partial checkout (building only parts of Offline).
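
The "-j" recommendations above can be captured in a small helper. This is only a sketch: the function name muse_jobs is hypothetical and not part of Muse, and the fallback value of 1 for machines not listed above is an assumption.

```shell
# Hypothetical helper (not part of Muse): pick a "-j" value from the host
# name, following the recommendations above.
muse_jobs() {
    host=${1:-$(hostname -s)}
    case "$host" in
        mu2egpvm0[1-7]) echo 4 ;;    # interactive nodes mu2egpvm01-07
        mu2ebuild02)    echo 20 ;;   # build node
        *)              echo 1 ;;    # assumption: stay conservative elsewhere
    esac
}

# Usage, after "muse setup":
#   muse build -j "$(muse_jobs)"
```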

Muse scripts are focused on the workDir which provides the code to be built, a place to build the code, and persistency of what the user is building and using. There can only be one workDir in a process. After muse is UPS setup (mu2einit), the workDir is usually configured by cd'ing to the workDir, checking out repos

 cd workDir
 git clone ...

or pointing to pre-built backing releases:

 cd workDir
 muse backing ...

Once the code is organized, run the muse setup:

 cd workDir
 muse setup

or from any directory:

 muse setup workDir

Once "muse setup" has been run, then the list of repos which will be built, and the build options, are locked in for the process. If you add or delete repos, or change the backing build, then you must run "muse setup" again in a new process. The status report:

 muse status

can be run before or after "muse setup". If run before, it assumes the default directory is the workDir. If run after, it can be run from anywhere.

Builds can be removed:

 muse build -c

or simply remove the build directory content:

 rm -rf workDir/build

Nothing is built in the repo git area; everything built is under the build directory, which can always be recreated by a build command.

Build Offline

Build the Offline repo locally:

 cd WorkDir
 git clone https://github.com/Mu2e/Offline
    or
 git clone git@github.com:<your GitHub username>/Offline
    optionally, add other repos:
 git clone git@github.com:<your GitHub username>/Ana1
 muse setup
   edit files, as needed..
 muse build -j 20 --mu2eCompactPrint
   run
 mu2e -c ..

Returning later

To re-setup, later, in another process

 cd WorkDir
 muse setup

or

 cd AnaDir
 muse setup WorkDir

Setup options

In one process/window:

 cd WorkDir
 muse setup
 muse build -j 20
 mu2e -c ...

In another process:

 cd WorkDir
 muse setup -q debug
 muse build -j 20
 gdb --args  mu2e -c ...

Setup published releases

When Offline is tagged, it is built and published on cvmfs. You can set up and use these releases.

See what is there

muse backing

or

muse list

setup a version:

muse setup Offline v10_15_00

setup whatever version is current

muse setup Offline

Backing build

Before "muse setup", link in a pre-built Offline:

 cd WorkDir
    see options for backing builds:
 muse backing
    point to a backing build:
 muse backing HEAD
    optionally, add smaller repos:
 git clone git@github.com:<your GitHub username>/Ana1
 muse setup
 muse build -j 20

Partial checkout with backing build

Before "muse setup", link in a pre-built Offline:

 cd WorkDir
    see options for backing builds:
 muse backing
    point to a backing build:
 muse backing HEAD
    optionally, add smaller repos:
 git clone git@github.com:<your GitHub username>/Ana1
    add Offline partial build
 mgit init
 cd Offline
 mgit add HelloWorld
    edits..
 mgit status
 cd ..
 muse setup
 muse build -j 20

See more about mgit below.

The build command can be run from any directory after "muse setup". The developer might find it useful to have two windows open, both set up to the same workDir: one sitting in workDir, where code editing is done and the build command is issued, and a second window sitting in AnaDir, where code is run and output is kept.

Tarball

After preparing and testing a build area,

 cd WorkDir
 muse setup (if not already done)
 muse tarball

The tarball location will be printed. This can be sent to the grid with --code=/path/to/tarball in mu2eprodsys.

If using jobsub directly with the switch --tar_file_name, the tarball will be unrolled in the default directory, and the command

muse setup Code
  or
source Code/setup.sh

should work.

Redirecting build output

It may be convenient for a user to keep code in one place, on one disk system, and write the build output (the libraries) to another area. A typical use case might be keeping the source code on /mu2e/app and writing the output to /mu2e/data. All the output from a build is under the "build" subdirectory in workDir. To redirect the output:

ln -s /mu2e/data/users/$USER/myproject build

which makes the "build" subdirectory into a link. This link should be made while organizing the code in workDir ("git clone", "muse backing") and must be before "muse setup".

The use of a subdirectory, here called "myproject", is very important. This directory name should reflect the location of the source code, and be unique to the source code. If you link the output of multiple source code workDir's to one output directory, they will overwrite each other. Every source code area needs its own build area.

If you have local builds and want to move them off the local area, you can just move them, then link that area:

mv build /mu2e/data/users/$USER/myproject
ln -s /mu2e/data/users/$USER/myproject build
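
The rules in this section (make the link before "muse setup", and give every source area its own output directory) can be enforced with a small wrapper. This is a sketch under stated assumptions: redirect_build is a hypothetical name, not a Muse command, and refusing to reuse an existing target directory is one possible way to avoid two workDirs overwriting each other.

```shell
# Hypothetical helper (not part of Muse): turn workDir/build into a link to
# a dedicated output directory. Run it before "muse setup".
# $1: workDir, $2: new output directory (must not exist yet)
redirect_build() {
    workdir=$1
    target=$2
    if [ -L "$workdir/build" ]; then
        echo "build is already a link: $(readlink "$workdir/build")" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        # Every source area needs its own build area, so never reuse one.
        echo "refusing to reuse existing directory $target" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")"
    if [ -d "$workdir/build" ]; then
        mv "$workdir/build" "$target"     # keep existing local builds
    else
        mkdir "$target"
    fi
    ln -s "$target" "$workdir/build"
}

# Usage:
#   redirect_build "$PWD" /mu2e/data/users/$USER/myproject
```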

Muse product

The Muse UPS product is created out of the Mu2e/Muse repo. When the product is setup,

 mu2einit

inside of the mu2einit is

 setup muse

and it will do the following.

  1. add $MUSE_DIR/bin to the path
  2. add $MUSE_DIR/python to PYTHONPATH
  3. add the alias muse="source $MUSE_DIR/bin/muse"
  4. export MUSE_ENVSET_DIR=/cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse

The product can be packaged from a tag by checking out the tag in the repo, then

Muse/bin/museInstall.sh <version> <path to output product area>

muse setup

The command

muse setup

sets link, fhicl, and other paths, and sets up scons and art. Typically, the user would first organize their repos in the workDir, or link backing release builds, before running "muse setup". This command is also typically run when returning to work on the code in workDir in a later process.

The command must be run at the bash command line so that the alias is available, causing muse to be sourced so it can alter your environment. All other commands are run in subshells and will not affect the environment.
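
The difference between sourcing a script and running it in a subshell can be seen with a small experiment that does not involve Muse at all. The variable name DEMO_WORK_DIR and its value are made up for the illustration.

```shell
# Illustration only: a script can change the caller's environment when it is
# sourced, but not when it is executed as a child process.
demo=$(mktemp)
echo 'export DEMO_WORK_DIR=/some/workDir' > "$demo"

unset DEMO_WORK_DIR
sh "$demo"                              # executed: runs in a child process
after_exec=${DEMO_WORK_DIR:-unset}

. "$demo"                               # sourced: runs in this shell
after_source=${DEMO_WORK_DIR:-unset}

rm -f "$demo"
echo "executed: $after_exec, sourced: $after_source"
# prints "executed: unset, sourced: /some/workDir"
```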

The setup command has two forms:

 cd workDir
 muse setup

or from anywhere, if workDir is provided:

 muse setup workDir

Options can be provided:

 muse setup <workDir> -q <options>

There is a small set of allowed options, such as prof/debug and the Geant visualization build options. See

 muse setup -h

for the list of options. Examples (can only run one in a process)

 muse setup
 muse setup -q debug
 muse setup workDir -q e21
 muse setup workDir -q e21:trigger:debug


The effects of running setup are the following

  1. sets environmentals used by Muse, such as the workDir (MUSE_WORK_DIR)
  2. uses UPS to find the machine flavor
  3. parses the options
  4. determines which set of UPS art products to setup, and does the setup
  5. analyzes available information and determines a repo link order
  6. adds the workDir area to the bin, link, include and other paths
  7. if the area has a backing build, includes those build areas as needed
  8. if the area has an artexternals directory add that to the UPS PRODUCTS path

and you will see the following (or something like it):

Build: prof     Core: sl7 e20 p008    Options:

prof is prof or debug; sl7 will be replaced with centos; p008 is the envset, like specifying a version of setup.sh. p008 is (p)ermanent version 8 and you can expect that to change. You will also see that p008 in Offline/.muse.

The resulting environment can be examined with

muse status
muse -v status

The UPS version of art, and all the other associated products, including a few that are not always needed but are convenient, like valgrind, are set up as one "environmental set" or "envset". In implementation, an envset is a script to be sourced as part of "muse setup". This script chooses some defaults (if the user doesn't force a choice), then does the UPS art setups. Envsets are files named pNNN if published permanently, or uNNN if customized by the user, where N represents a digit.


The "muse setup" command uses the following priority list to find which envset to use, and therefore which version of art to setup.

  1. the user has forced an explicit choice by a qualifier like "-q p000"
  2. there is a file $MUSE_WORK_DIR/.muse and it contains an envset pointer (since v2_12_00)
  3. an Offline repo is a local checkout, a mgit partial checkout, or a backing link, and has a .muse, then use the recommendation in there, if any
  4. any other local package with a .muse file with a recommendation
  5. $MUSE_WORK_DIR/muse/uNNN, where N represents a digit, exists, then take highest number there
  6. use highest number from $MUSE_ENVSET_DIR
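
The last two rules in the list above both reduce to "take the highest number in a directory". A minimal sketch of that rule follows; it is not Muse's actual implementation, and the function name highest_envset is hypothetical.

```shell
# Sketch of the "highest number wins" rule (not Muse's actual code).
# $1: directory to search
# $2: prefix letter, "p" (published) or "u" (user)
highest_envset() {
    best=
    for f in "$1"/"$2"[0-9][0-9][0-9]; do
        [ -f "$f" ] || continue     # skips the pattern itself if nothing matched
        best=${f##*/}               # globs expand in sorted order, last is highest
    done
    echo "$best"
}

# Usage:
#   highest_envset "$MUSE_WORK_DIR/muse" u
#   highest_envset "$MUSE_ENVSET_DIR" p
```

An empty result means the directory holds no envset of that kind, in which case the next rule in the list would apply.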

There is a way to make your own custom local envset, see that section.

When running setup on a linux OS (such as CentOS) that is related to our available Scientific Linux builds, and may be bit-compatible, it will probably be necessary to do a UPS override.

Musings (published muse builds)

Muse allows a build area to be installed on cvmfs or other long-term location. The user can then setup this area, and effectively make this published area their workDir. Typically, this area would be on cvmfs, so the user cannot build anything, but can use all the content. The idea of a published Muse build area is called a "musing". A musing may contain the builds of one or more repos.

At this writing, the available musings are

  • Offline - just the Offline repo
  • Production - just the Production repo
  • SimJob - Production, backed by Offline
  • TrkAna - the TrkAna ntuple, backed by Production and Offline

The user can see the available musings with

muse list

and setup these musings with a command, for example

 muse setup Offline v10_06_00
 muse setup Offline  (will use the "current" version)
 muse setup SimJob MDC2020k

after these setup commands, "muse status" will show what was setup. Since these areas have been built, the libraries are ready to go, for example

 muse setup SimJob
 mu2e -n 100 -c Production/Validation/ceSimReco.fcl

It is also possible to submit grid jobs using these builds.

It is also possible to setup the continuous integration builds and use them directly (as opposed to linking them to your local build area). The most recent can be referred to as the head:

 muse setup head
 mu2e -n 100 -c Production/Validation/ceSimReco.fcl

muse backing

The "muse backing" command is used to link (in the bash soft link sense) another build area, that was built earlier, elsewhere, into the local working directory. This is sometimes called a "backing build". The linked backing build area must have been built by Muse, in its own workDir. The link will allow Muse to include from and link to (in the gcc sense) the other build. Muse will not attempt to build, or alter in any way, the backing build. A common use case is to build a small analysis repo locally while pointing to a full Offline build on cvmfs as the backing build.

The linked backing repo can be on a local disk, or on cvmfs. Running "muse backing" with no argument gives a list of available cvmfs builds. There are two groups: first, a list of recent tagged and built releases, and second, a list of continuous integration (CI) builds. CI builds are made automatically every time code is merged in the main branches in the Offline repo.

The "muse backing" commands, to see available builds and make links, are typically done while the default dir is workDir, while organizing code in that area, before "muse setup" is run.

> muse backing
 Recent published Offline releases:
v10_13_00
v10_14_00
v10_15_00   (current)
 Recent Offline CI builds
2022-04-19 18:11 main/23266677
2022-03-16 20:15 main/71adc776
2022-03-16 18:15 main/dfcf4282


The choice is then created as a link:

  muse backing Offline v10_15_00
     or
  muse backing SimJob   (current SimJob version is assumed)
     or
  muse backing main/23266677    (for a specific CI build)
     or
  muse backing HEAD   (for the latest CI build)


Some examples follow.

A user has an analysis repo they want to build against the current Offline build. The user would clone the analysis repo in their workDir then point to a backing build.

 cd WorkDir
 muse backing HEAD
 git clone git@github.com:<your GitHub username>/Ana1
 muse setup
 muse build -j 20

A user wants to point to a backing build, then do a partial checkout of Offline for a small local change, or a quick-fix development.

 cd WorkDir
 muse backing HEAD
 git clone git@github.com:<your GitHub username>/Ana0
 mgit init
 cd Offline
 mgit add HelloWorld
    edits..
 cd ..
 muse setup
 muse build -j 20

Multiple users are sharing a build of an Offline and an ntuple. In the Offline builder's area, call it workDir0,

 cd WorkDir0
 git clone https://github.com/Mu2e/Offline
 git clone git@github.com:<your GitHub username>/MyNtuple
 muse setup
 muse build -j 20

then in the user's separate workDir

 cd workDir
 muse backing workDir0
 git clone git@github.com:<your GitHub username>/Ana0
 muse setup
 muse build -j 20


The command makes a unix soft link in the working directory, workDir. When "muse setup" is run, it looks into the backing build (and any backing to backing build, etc) and creates the correct paths for compiling and linking.

cd workDir
muse backing HEAD
ls -l backing
   backing -> /cvmfs/mu2e-development.opensciencegrid.org/museCIBuild/main/d3cc52b5
muse setup


Note that the backing command does not select prof/debug or any other qualifier choice; those are determined as part of the "muse setup" command. If the requested backing build does not exist with the qualifiers you specify, you will get an error during "setup".

If you issue the backing command while there is already a backing link in the workDir, you will get a warning and the link will be updated. In this case, if you have already "muse setup", you will need to setup again in a new process because the paths have to be changed. You can also rm backing if desired, but again you will need to setup in a new process.

muse build

This command is how the code in workDir is compiled and linked. This command can only be run after "muse setup". All build products go under the build subdirectory in workDir. Builds represented by the backing link will not be processed.

All arguments to "muse build" are passed to the scons command.

 muse setup workDir
 muse build -j 20

scons features should all work. To remove a build:

 muse build -c

to build a target

 muse build -j 20 --mu2eCompactPrint build/sl7-prof-e20-p000/Offline/lib/libmu2e_HelloWorld_HelloWorld_module.so
   scons: Reading SConscript files ...
   Linking build/sl7-prof-e20-p000/Offline/lib/libmu2e_HelloWorld_HelloWorld_module.so

or

 muse build GDML
   scons: Reading SConscript files ...
   ... 
   [Tue Mar 30 16:34:15 CDT 2021] procs.sh starting GDML
 dir build/$MUSE_STUB/Offline/gen/gdml 
   -rw-r--r-- 1 rlc mu2e 5.2M Mar 30 16:34 mu2e.gdml

To also build python wrappers for select c++ classes

muse build --mu2ePyWrap

The python driving the scons build is in $MUSE_DIR/python.

The scons command itself is run in workDir.

muse status

The "muse status" command attempts to tell you useful things about how muse is set up in this process, or what setups are available. If run before "muse setup", the command will assume the default directory is a workDir. It will analyze what's there and what builds have been created.

 cd workDir
 muse status
   existing builds:
     sl7-prof-e20-p000
          Build times: 03/28/21 17:24:29 to 17:27:35
     sl7-debug-e20-p000
          Build times: 03/26/21 10:20:19 to scons error 23
 muse setup
 muse status
   existing builds:
     sl7-prof-e20-p000         ** this is your current setup **
          Build times: 03/28/21 17:24:29 to 17:27:35
     sl7-debug-e20-p000
          Build times: 03/26/21 10:20:19 to scons error 23

  MUSE_WORK_DIR = /mu2e/app/users/rlc/muse/phase2/test4 
  MUSE_REPOS =  Offline Production
  ...

The current setup is flagged. The build command will record its latest start and stop times. If the stop time is not present, then the build is still running or failed.
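
The build directory names in the status report can be split into their parts in a script. This is a sketch only: the four-field layout and the field names (flavor, build, qualifier, envset) are inferred from the examples on this page, and builds made with extra options may carry additional fields.

```shell
# Sketch: split a build directory name such as "sl7-prof-e20-p000".
# The function body runs in a subshell so IFS and globbing settings
# do not leak into the caller.
parse_stub() (
    set -f             # no filename expansion on the unquoted $1 below
    IFS=-
    set -- $1          # split on "-" into $1 $2 $3 $4
    echo "flavor=$1 build=$2 qualifier=$3 envset=$4"
)

parse_stub sl7-prof-e20-p000
# prints "flavor=sl7 build=prof qualifier=e20 envset=p000"
```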

muse tarball

The "muse tarball" command makes a tarball of the build area, ready to be submitted to the grid.

 muse tarball
  Tarball: /mu2e/data/users/rlc/museTarball/tmp.Sh0737kx1M/Code.tar.bz2

which can be submitted:

 setup mu2egrid
 mu2eprodsys ... --code=/mu2e/data/users/rlc/museTarball/tmp.Sh0737kx1M/Code.tar.bz2

additional files can be included:

 muse tarball *.txt

If workDir contains a backing link to cvmfs, the tarball will contain a link to the cvmfs build. If the linked build is on a local disk, it will be included in the tarball.
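
Before making a tarball, it can be useful to check which of the two cases applies to your workDir. The helper below is a sketch; backing_kind is a hypothetical name, and treating any target under /cvmfs/ as "on cvmfs" is an assumption about how the distinction is made.

```shell
# Sketch: classify the "backing" link of a workDir.
# Prints "none", "cvmfs", or "local".
backing_kind() {
    [ -L "$1/backing" ] || { echo none; return; }
    case "$(readlink "$1/backing")" in
        /cvmfs/*) echo cvmfs ;;    # tarball would carry only the link
        *)        echo local ;;    # tarball would include the linked build
    esac
}

# Usage:
#   backing_kind "$MUSE_WORK_DIR"
```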

If there are files named setup_pre.sh or setup_post.sh in workDir, they will be included in the tarball, and sourced before (pre) and after (post) the "muse setup" command.

mgit partial build

On occasion, a project requires only a slight modification to Offline to be built and run. Some examples of this need are simple fixes, adding a debug print, or modifying a few lines of fcl. In this case, it seems a waste of time to build all of the Offline repo locally. mgit allows you to build only one, or a small number of subdirectories, from the Offline repo. mgit is only available for the Offline repo.

mgit is Muse's successor to pgit, which is deprecated.


Here is a typical scenario where the user wants their analysis build together with a small piece of Offline, locally modified, and the rest of Offline coming from a backing build:

cd WorkDir
muse backing Offline
git clone git@github.com:<your GitHub username>/Ana1
# checkout Offline, and create mu2e and origin remotes
mgit init
cd Offline
# add and build only this subdirectory locally, everything else from the linked build
mgit add HelloWorld
   edits..
cd ..
muse setup
muse build -j 20

Linking order

In order to preserve the one-pass linking Mu2e has established by policy, the repos being built in Muse need to have a specified link order. There is a default link order which will be centrally maintained and will include all repos known to the collaboration generally. When you ask Muse to build your personal analysis repos, or anything not in the central link order, Muse will insert the unknown repos in the beginning of the link order by default. If you have multiple local repos which are not being linked in the right order, you can tell Muse the link order. You do this by creating a new "linkOrder" file locally.

cd workDir
mkdir -p muse
echo "Ana1 Ana2 " > muse/linkOrder
cat $MUSE_ENVSET_DIR/linkOrder >> muse/linkOrder
cat muse/linkOrder
   Ana1 Ana2 Tutorial Offline
muse setup
muse status
     ...
     MUSE_LINK_ORDER =  Ana1 Ana2 Tutorial Offline

and the repo library directories will be searched in that order.

Linked (in the bash soft link sense) backing builds come after their local counterparts, i.e. an Offline in workDir will be linked before the Offline the backing link points to. This is the mechanism that allows mgit to work with a backing build.
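
The default rule described above, where repos unknown to the central link order are placed at the front, can be sketched as follows. This is not Muse's actual implementation; default_link_order is a hypothetical name, and the order among several unknown repos (here, the order in which they are given) is an assumption.

```shell
# Sketch of the default rule (not Muse's actual code).
# $1: central link order, space separated
# remaining arguments: repos found in workDir
default_link_order() {
    central=$1
    shift
    unknown=
    for repo in "$@"; do
        case " $central " in
            *" $repo "*) ;;                    # already in the central order
            *) unknown="$unknown$repo " ;;     # unknown: goes to the front
        esac
    done
    echo "$unknown$central"
}

default_link_order "Tutorial Offline" Offline Ana1 Ana2
# prints "Ana1 Ana2 Tutorial Offline"
```

If the resulting order is wrong for your repos, the local muse/linkOrder file shown earlier overrides it.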

Customizing envsets

This procedure allows the user to provide their own envset. This might be necessary to test code with a new compiler, to test a new set of art libraries, or to test a new version of geant or another UPS product dependency. This procedure will generally only be needed by collaborators doing code management.

To create a custom envset, start with the most recent published envset. This is probably most reliably seen in the Offline/.muse file from the head of Offline.

cd workDir
git clone https://github.com/Mu2e/Offline
grep ENVSET Offline/.muse
   ENVSET p000
mkdir -p muse
cp /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse/p000 muse/u001
   edit muse/u001 as needed
muse setup -q u001
muse build -j 20

The user can build the same code in the same area both with the default envset p000 and the custom envset u001. The two builds will appear in different directories under the build directory. The two setup/builds will need to be done in separate processes. The two processes will produce the appropriate two different tarballs.

All local envsets must be named uNNN, to make sure they are not confused with copies of the published envsets pNNN. The "u" is for "user", the "p" is for "published".


Typically, once the new envset is validated, and if it is being published to general use,

  1. u001 would be renamed "p001" as the successor to "p000"
  2. p001 is uploaded to the /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse directory, using the standard CVMFS upload procedure
  3. when the Offline code to work with the new products is merged (if any changes are needed), the ENVSET line in Offline/.muse should be updated to p001 in the same PR. If no code changes are required, just make an Offline PR to change the envset number.
  4. p001 is committed to the MuseConfig repo, envset directory. This is done with the standard git PR procedures in the MuseConfig repo. Since it is only for archive purposes, exactly when this is done doesn't matter.

One of the techniques to divine the envset without user or repo guidance is to accept the pNNN set with the highest NNN. So while it is possible to publish arbitrary envsets for arbitrary reasons, the highest number may be picked up by some inexperienced users, though not intended for general use.

If you have modified the PRODUCTS path to pick up a locally built UPS product that is in the envset (KinKal or BTrk are the prime examples), then "muse tarball" will sense that and include the local UPS area in the tarball.

Maintaining a Customized Setup

If you are using your own muse/uNNN file, and if you want to merge the head of Mu2e/Offline/main into your working branch, then in case the envset used by the head has changed, you need to maintain your muse/uNNN file, as follows:

  1. Before you merge, note the preferred envset set in Offline/.muse (the pNNN file).
    1. Note the differences between your uNNN and the pNNN file.
  2. Do the merge and resolve conflicts.
  3. Check Offline/.muse to see if the preferred envset file has changed.
  4. If it has not changed, you are finished. If it has changed, continue with the steps below.
  5. Make a new uNNN file by copying the new preferred envset from /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse.
  6. If your old uNNN file sets up UPS products that need to be updated to match the new envset, then update those products and add them to your own local artexternals directory. If you are not sure what needs to be done, consult the owner of those products.
  7. Edit your new uNNN file to bring in the differences noted in step 1, updating versions and qualifiers to match those created in step 6.
  8. Start a new terminal session.
  9. Do a muse setup -q uNNN, where uNNN is your new envset
  10. Build and test.
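
Noting the differences between your uNNN file and the pNNN file it was copied from can be done with diff. The wrapper below is a sketch; envset_changes is a hypothetical name, and saving the result to a text file is just one way to keep the notes until they are needed again.

```shell
# Sketch: list the lines that differ between a published envset and the
# user envset derived from it.
# $1: published envset file (pNNN), $2: user envset file (uNNN)
envset_changes() {
    # diff exits with status 1 when the files differ, which is the
    # expected case here; only the changed lines are kept.
    diff "$1" "$2" | grep '^[<>]'
    return 0
}

# Usage, before the merge (replace pNNN and uNNN with the real names):
#   envset_changes /cvmfs/mu2e.opensciencegrid.org/DataFiles/Muse/pNNN muse/uNNN > envset_notes.txt
```

Lines starting with "<" come from the published file and lines starting with ">" come from your file.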

Using locally built UPS products

One development pattern is to not just use a new UPS product version from cvmfs, but to build that product locally and test it with Offline. In this case, the user should build the product in a directory unrelated to muse. Since UPS products are only accessible in their UPS format, the UPS product must then be installed in its UPS format in a temporary local directory. By convention, that local UPS directory should be workDir/artexternals. This directory must be added to the UPS PRODUCTS path after muse setup. Typically, the local installation of the product would have a unique version number and the local envset would be modified to setup this version. If this convention is followed, then when muse makes a tarball, it will discover, tar, and employ this directory content.

Because it is a relatively common need, this process has been scripted for building a local artdaq_core_mu2e, which contains definitions of the raw data format.

cd workDir
muse setup
setup codetools
museADCM.sh install

Once artdaq_core_mu2e is built locally and installed in workDir/artexternals, you must re-setup to get this local build in your path.

cd workDir
muse setup

This will also work to return to work after logging out. The UPS PRODUCTS path will automatically include workDir/artexternals if it is present. The muse tarball command will also automatically include this product in the tarball.

To modify the ADCM code, edit the clone under the adcm directory and re-run museADCM.sh install. This will run the ADCM build and install the code in workDir/artexternals, in your path.

Making a repo Muse-ready

It is anticipated that ntuples, calibration and user analysis repos will be built together in the required combinations using Muse. This implies that the user analysis repos will be organized in a way that Muse can operate on them.

The first step is to get the repo recognized by Muse - this is accomplished by adding a file .muse to the top level of the repo. When Muse goes to build a workDir, it will ignore any subdirectory without a .muse.
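
To check which subdirectories of a workDir would be recognized under this rule, one can look for the .muse files directly. This is a sketch, not how Muse itself scans the directory; muse_repos is a hypothetical name.

```shell
# Sketch: list the subdirectories of a workDir that carry a .muse file.
# $1: workDir
muse_repos() {
    for d in "$1"/*/; do
        # $d ends in "/", so "$d.muse" is the .muse file at the repo top level
        [ -f "$d.muse" ] && basename "$d"
    done
    return 0
}

# Usage:
#   muse_repos "$PWD"
```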

The second step is to drive the building of libraries, modules and bins in the scons build. This can be achieved by following the patterns in the Offline or Tutorial subdirectories.

The .muse file can also help configure the Muse build. Please see the algorithm that Muse uses to find the envset (essentially the preferred art products version). One of the prioritized steps is to look in the repos' .muse files for a suggested envset, which will look like:

ENVSET p000

If your repo has an envset preference, you can include it here. It will most likely be used only if there is no Offline repo involved in the workDir, since Offline will overrule your repo's suggestion. You can also add directories to the PATH and PYTHONPATH, as in these examples from Offline:

PYTHONPATH Trigger/python
PATH bin

The paths are relative to the top level of the repo, but will become full paths in the setup.
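
The three kinds of .muse lines shown above can be read with a short loop. This is a sketch of the idea, not Muse's actual parser; read_dot_muse is a hypothetical name, and the output format is made up for the illustration.

```shell
# Sketch: print what a repo's .muse file would contribute.
# $1: top-level directory of the repo
read_dot_muse() {
    repo=$1
    while read -r key value; do
        case "$key" in
            ENVSET)     echo "envset $value" ;;
            PATH)       echo "path $repo/$value" ;;        # relative becomes full
            PYTHONPATH) echo "pythonpath $repo/$value" ;;
        esac
    done < "$repo/.muse"
}

# Usage:
#   read_dot_muse "$MUSE_WORK_DIR/Offline"
```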