User:Kutschke/Mu2eEnvironments
Introduction
This note provides the big picture for the transition in Mu2e environment management and build management that started in the spring of 2024 and is ongoing. The target audience is people who are new to Mu2e software (or have been away from it for a while) but who are experienced with the concepts of environment and build management.
UPS
Until recently Mu2e used a Fermilab-developed system called UPS to manage external software products such as g++, art, root, geant4 etc. When we build Mu2e Offline we use about 60 external software products. Most of the UPS products were curated by CSAID, then built and packaged for distribution as tar.bz2 files. Here curated means that CSAID ensured that
- the set of products chosen was internally consistent (i.e. version A of product 1 is known to work with version B of product 2; if products 1 and 2 both depend on product 3, the version of product 3 chosen is known to work with both.)
- all products were compiled and linked with the same compiler and similar enough compiler options
- the build options for each product were selected appropriately. For example xrootd can be built with support for authentication by proxies, tokens, both or neither. And root can be built with and without xrootd support.
- on rare occasions, bug-fix patches were applied to releases tagged by the package authors, and the version number was decorated to indicate this.
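The consistency requirement in the first bullet can be illustrated with a minimal Python sketch. This is not how CSAID actually curates products; the product names, version labels and the `known_good` table are all invented for illustration, mirroring the "product 1 / product 2 / product 3" wording above.

```python
# Hypothetical curated set: product name -> chosen version (all names invented).
curated = {"product1": "vA", "product2": "vB", "product3": "vC"}

# Hypothetical compatibility table:
# (product, version) -> {dependency: versions of that dependency known to work}
known_good = {
    ("product1", "vA"): {"product2": {"vB"}, "product3": {"vC", "vD"}},
    ("product2", "vB"): {"product3": {"vC"}},
}


def find_conflicts(curated, known_good):
    """Return (product, dependency, chosen_version) for every dependency
    whose chosen version is not known to work with the product."""
    conflicts = []
    for (product, version), deps in known_good.items():
        if curated.get(product) != version:
            continue  # this table entry describes a version we did not choose
        for dep, ok_versions in deps.items():
            chosen = curated.get(dep)
            if chosen not in ok_versions:
                conflicts.append((product, dep, chosen))
    return conflicts


if __name__ == "__main__":
    print(find_conflicts(curated, known_good))
```

In this toy model, choosing "vD" for product3 would satisfy product1 but not product2, which is the kind of conflict that curation is meant to catch before a set is distributed.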
Spack
CSAID has decided to discontinue support for UPS and to migrate to a new system for environment management named Spack. Spack also supports the build and distribution of a curated set of software packages (a UPS "product" means the same as a spack "package"). The reasons for making this change are that
- UPS is 30+ years old and the people who have experience supporting it are near retirement.
- Rather than invest new resources in an old technology we should choose an industry standard solution.
- Spack is the industry standard for US supercomputer centers. By adopting it we hope to streamline our onboarding into the supercomputer centers.
- Curation of UPS products takes a lot of CSAID resources and spack promises to reduce the amount of effort required to do this. (Personal comment: this has not yet proven to be the case; at this stage in the learning curve it is quite the opposite. Perhaps we will get there.)
Spack is a big powerful tool and users can make many choices about how to use it. So it has a steep, long learning curve. CSAID, Mu2e and other Fermilab based experiments are still working through the learning curve. You can expect that policies and procedures for how to use Spack will evolve over the next year or so.
Transition to Spack
CSAID decided that, for the AL9 operating system, CSAID supplied software will only be distributed via Spack. On June 30, 2024, the SL7 operating system reached end-of-support-life. All Mu2e interactive machines now run AL9. We can still work in SL7 docker containers on our AL9 machines; see Legacy Access to SL7. So we still have all of our UPS distributions of CSAID software. We expect that new CSAID supplied software will be supplied only for AL9 machines and only via Spack.
Some UPS products are "script only" (i.e. they contain no compiled code) and these continue to work on AL9 machines just as they worked on SL7 machines. Mu2e uses some such products in our AL9 environment. They will be transitioned to spack as we better understand how to use spack. In UPS-speak, script only products are called "NULL flavored".
Because we are still using UPS in this limited way, you will still see instructions of the form "setup abc". Be a little suspicious of these instructions; please check with Mu2e computing management to see if that instruction is still current.
Offline vs Online Usage
UPS Era Usage
The Mu2e online and offline communities have made different choices about environment management and build systems.
The offline community wrote a thin scripting layer named Muse that manages the environment by setting up a curated set of UPS products and defining some additional environment variables and bash functions. When you do a build using the muse build command, it delegates the build to the scons build system, which can build multiple repositories concurrently.
Each source directory contains a file named SConscript that contains instructions for scons.
The online community chose to use a Fermilab supplied build system named MRB (Multi Repository Build) to manage their environment and oversee coordinated builds of many repositories. The underlying build system is cmake. They also use UPS to provide external software such as the compiler, art, root, clhep etc.
MRB is the product that CSAID uses to build all of their UPS supplied software products. The online group chose to use MRB because it is the official CSAID supported environment and build management system. The offline group already had a mature environment when the online group made their choice, and a cost-benefit analysis led to the decision that the offline group would stay with Muse/scons. The driver was that MRB required a higher level of expertise than did Muse/scons, so Muse/scons was better suited to novice users.
Most of the repositories managed by the online group are independent of the offline software so these worlds co-existed independently for a long time. There are two points of contact.
- The repository artdaq_core_mu2e contains the raw data formats. It is used by both online and offline. It is owned and maintained by the online community.
- At run time, the trigger loads art modules that are found in shared libraries created by the offline build system. This code is managed by the offline community and is developed in the offline world.
When it is necessary to update artdaq_core_mu2e for use in the offline world, an expert from the offline community builds it and installs it in UPS using MRB. Once it is in UPS, it's just another UPS product like art, root, geant4 and so on.
The offline community provides a tool to package a build of the Offline repository as a UPS product. When the online community needed a new version of Offline, they built it using the native offline tools and installed it in UPS. The online world sees Offline as just another UPS product.
This worked well throughout the UPS era.
The online and offline worlds maintain separate collections of UPS products. The reason is that each world needs to be able to operate when the other is down.
Spack Era
In the spack era, CSAID developed a spack based system for environment and build management. They have completely retired UPS. The Mu2e online world has adopted this system. The build system component is cmake based and is the same as for MRB. To be a bit more specific, there is a component of the build system named cetmodules that was adapted from UPS awareness to spack awareness. Modern versions of cetmodules work with either UPS or spack. At some time in the future, support for UPS will be withdrawn. Online end users need to learn a new sequence of commands to use the new spack based environment.
The offline community adapted the backend of Muse so that it has a UPS aware backend and a spack aware backend. The default is that it chooses the UPS aware backend when it is run on an SL7 machine and the spack aware backend when it is run on an AL9 machine. From an end user viewpoint, nothing changed.
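The default backend choice described above can be sketched in a few lines of Python. This is an illustration only, not Muse's actual implementation: the function name, the platform labels "sl7" and "al9", and the `override` parameter are assumptions made for this example.

```python
# Default mapping described in the text: SL7 -> UPS backend, AL9 -> spack backend.
DEFAULT_BACKEND = {"sl7": "ups", "al9": "spack"}
VALID_BACKENDS = ("ups", "spack")


def choose_backend(platform, override=None):
    """Pick a backend name for a platform label.

    `platform` is a hypothetical label such as "sl7" or "al9".
    `override` is a hypothetical way to bypass the default choice.
    """
    if override is not None:
        if override not in VALID_BACKENDS:
            raise ValueError(f"unknown backend: {override!r}")
        return override
    try:
        return DEFAULT_BACKEND[platform.lower()]
    except KeyError:
        raise ValueError(f"no default backend for platform: {platform!r}")


if __name__ == "__main__":
    for label in ("sl7", "al9"):
        print(label, "->", choose_backend(label))
```

Keeping the choice behind one function is what lets the end user commands stay the same while the backend changes underneath.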
As with UPS, the online and offline worlds maintain separate collections of spack packages. The reason is the same as above.
When a new version of artdaq_core_mu2e is needed in the offline environment, an offline expert builds it natively and installs it as a spack package in the offline spack area.
To get access to modules from the Mu2e Offline repo, the online team populated the Mu2e Offline repository with CMakeLists.txt files. They then build Offline natively in the online environment and use it in that environment.
It is possible to build Offline using cmake and spack in the offline environment. The instructions can be found at Building_Offline_with_cmake#Using_Spack.
The 3 Phase Plan
When the offline community first started to think about using spack we made the following 3 phase plan:
- Receive CSAID supplied external products via spack. Build KinKal, BTrk and artdaq_core_mu2e against the CSAID supplied spack packages but install them in UPS as before. Update the Muse backend to use the CSAID supplied spack packages and to get KinKal, BTrk and artdaq_core_mu2e from UPS.
- As Phase 1 but build and distribute KinKal, BTrk and artdaq_core_mu2e using spack. Update the Muse backend and continue to use it.
- Move 100% to spack. Prerequisites for this include:
  - Usage of spack by both CSAID and Mu2e is stable, robust and easy enough for novice users.
  - All functionality currently provided by Muse is made available in the new environment.
As it happened, we skipped Phase 1 and went directly to Phase 2. At this writing we are at Phase 2. Mu2e computing management plans to stay at Phase 2 until the prerequisites for Phase 3 are achieved.
Offline Details
For Offline software, Mu2e created a software product named Muse to manage the Mu2e software environment. It's a thin layer that collects commonly used sequences of commands into a single muse command. Muse defines the concept of an envset, short for "environmental set", which is a script that creates a complete Mu2e environment by choosing a curated set of UPS products and defining some additional environment variables. Envsets are versioned with names like p001, p002 ... p057, which is the current one as of this writing. Any version of the Mu2e software is guaranteed to work with at least one envset and will often work with nearby versions.
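As a small illustration of the envset naming scheme, the following Python sketch parses names of the form p001 ... p057 and picks the highest-numbered one. It assumes the pattern is always "p" followed by three digits, as in the examples above; the helper names are invented and are not part of Muse.

```python
import re

# Assumed name pattern, based on the examples p001, p002 ... p057.
_ENVSET_NAME = re.compile(r"p(\d{3})")


def envset_number(name):
    """Return the integer part of an envset name such as 'p057'."""
    match = _ENVSET_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not an envset name: {name!r}")
    return int(match.group(1))


def latest_envset(names):
    """Return the envset name with the highest number."""
    return max(names, key=envset_number)


if __name__ == "__main__":
    print(latest_envset(["p001", "p002", "p057"]))
```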
Muse delegates building code to scons, which Mu2e has used since its early days.
When it was time to upgrade to a new version of art, we
- unwound a curated set of tar files onto /cvmfs/mu2e.opensciencegrid.org/artexternals/
- built and tested a few Mu2e software repositories using the new products and installed them into /cvmfs/mu2e.opensciencegrid.org/artexternals/ (BTrk, KinKal, artdaq_core_mu2e).
- created a new Muse envset to choose the new set of products
- tested building and running Offline and some other Mu2e repositories against the products in the new envset, updating our code as necessary
- at an appropriate time, updated the default envset to the new one.
Users can choose to continue to use an old envset or migrate to the new default. All builds are labelled with the envset used for the build.