User:Kutschke/Mu2eEnvironments

From Mu2eWiki

Introduction

This note provides the big picture for the transition of Mu2e computing environment management and build management that started in the spring of 2024. As of August 2024 this transition is ongoing. The target audience for this page is people who are new to Mu2e software (or have been away from it for a while) but who are experienced with the concepts of environment and build management.

UPS

Since its beginning, Mu2e used a Fermilab-developed system called UPS for management of external software products such as g++, art, root, geant4, etc. When we build Mu2e Offline we use about 60 external software products. Most of the UPS products were curated by CSAID, which built them and distributed them on cvmfs. Here curated means that CSAID ensured that

  1. the set of products chosen was internally consistent (i.e. version A of product 1 is known to work with version B of product 2; if products 1 and 2 both depend on product 3, the chosen version of product 3 is known to work with both).
  2. all products were compiled and linked with the same compiler and similar enough compiler options
  3. the build options for each product were selected appropriately. For example xrootd can be built with support for authentication by proxies, tokens, both or neither. And root can be built with and without xrootd support.
  4. on rare occasions, bug-fix patches were applied to releases tagged by the package authors, and the version number was decorated to indicate this.
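
The internal-consistency requirement in item 1 can be illustrated with a small sketch. This is not how CSAID or UPS actually perform curation; the product names, versions and the `find_conflicts` helper are all hypothetical and exist only to make the idea concrete.

```python
# Hypothetical compatibility records: every name and version here is invented.
# Each (product, version) maps to the dependency versions it is known to work with.
KNOWN_GOOD = {
    ("product1", "A"): {"product3": {"v1", "v2"}},
    ("product2", "B"): {"product3": {"v2", "v3"}},
    ("product3", "v1"): {},
    ("product3", "v2"): {},
    ("product3", "v3"): {},
}


def find_conflicts(selection, known_good):
    """Return a list of problems found in a chosen {product: version} set."""
    problems = []
    for name, version in selection.items():
        requirements = known_good.get((name, version))
        if requirements is None:
            # We have no record that this version was ever validated.
            problems.append(f"{name} {version}: no compatibility record")
            continue
        for dep, allowed in requirements.items():
            chosen = selection.get(dep)
            if chosen is None:
                problems.append(f"{name} {version}: missing dependency {dep}")
            elif chosen not in allowed:
                problems.append(
                    f"{name} {version}: {dep} {chosen} not in {sorted(allowed)}"
                )
    return problems


# product3 v2 is the only version known to work with both product1 A and product2 B.
consistent = {"product1": "A", "product2": "B", "product3": "v2"}
inconsistent = {"product1": "A", "product2": "B", "product3": "v1"}
print(find_conflicts(consistent, KNOWN_GOOD))
print(find_conflicts(inconsistent, KNOWN_GOOD))
```

In this toy model a curated set is simply a selection for which the list of problems is empty; real curation also covers compilers, compiler options and build options, as items 2 and 3 describe.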

Spack

CSAID has decided to discontinue support for UPS and to migrate to a new system for environment management named Spack. Like UPS, Spack also supports the build and distribution of a curated set of software packages (UPS "product" means the same as spack "package"). The reasons for making this change are that

  1. UPS is 30+ years old and the people who have experience supporting it are near retirement.
  2. Rather than invest new resources in an old technology we should choose an industry standard solution.
  3. Spack is the industry standard for US supercomputer centers. By adopting it we hope to streamline our onboarding into the supercomputer centers.
  4. Curation of UPS products takes a lot of CSAID resources and spack promises to reduce the amount of effort required to do this. (Personal comment: this has not yet proven to be the case; at this stage in the learning curve it is quite the opposite. Perhaps we will get there).

Spack is a big powerful tool and users can make many choices about how to use it. So it has a steep, long learning curve. As of August 2024, CSAID, Mu2e and other Fermilab based experiments are still working through the learning curve. You can expect that policies and procedures for how to use Spack will evolve over the next year or so.

Transition to Spack

CSAID decided that, for the AL9 operating system, CSAID supplied software will only be distributed via Spack. On June 30, 2024, the SL7 operating system reached end-of-support-life. All Mu2e interactive machines now run AL9. We can still work in SL7 docker containers on our AL9 machines; see Legacy Access to SL7. So we still have all of our UPS distributions of CSAID software. We expect that new CSAID supplied software will be supplied only for AL9 machines and only via Spack.

Some UPS products are "script only" (i.e. they contain no compiled code) and these continue to work on AL9 machines just as they worked on SL7 machines. Mu2e uses some such products in our AL9 environment. They will be transitioned to spack as we better understand how to use spack. In UPS-speak, script-only products are called "NULL flavored".

Because we are still using UPS in this limited way, you will still see instructions that say "setup abc". Be a little suspicious of these instructions; please check with Mu2e computing management to see if that instruction is still current.

Offline vs Online Usage

UPS Era Usage

The Mu2e online and offline communities have made different choices about environment management and build systems.

The offline community wrote a thin scripting layer named Muse that manages the environment by setting up a curated set of UPS products and defining some additional environment variables and bash functions. When you do a build using the muse build command it delegates the build to the scons build system, which can build multiple repositories concurrently. Each source directory contains a file named SConscript that contains instructions for scons.

The online community chose to use a Fermilab supplied build system named MRB (Multi Repository Build) to manage their environment and oversee coordinated builds of many repositories. The underlying build system is cmake. They also use UPS to provide external software such as the compiler, art, root, clhep etc.

MRB is the product that CSAID uses to build all of their UPS supplied software products. The online group chose to use MRB because it is the official CSAID supported environment and build management system. The offline group already had a mature environment when the online group made their choice, and a cost-benefit analysis concluded that offline should stay with Muse/scons. The driver was that MRB required a higher level of expertise than did Muse/scons, so Muse/scons was better suited to novice users.

Most of the repositories managed by the online group are independent of the offline software so these worlds co-existed independently for a long time. There are two points of contact.

  1. The repository artdaq_core_mu2e contains the raw data formats. It is used by both online and offline. It is owned and maintained by the online community.
  2. At run time, the trigger loads art modules that are found in shared libraries created by the offline build system. This code is managed by the offline community and is developed in the offline world.

When it is necessary to update artdaq_core_mu2e for use in the offline world, an expert from the offline community builds it and installs it in UPS using MRB. Once it is in UPS, it is just another UPS product and is used like any other.

The offline community provides a tool to package a build of the Offline repository as a UPS product. When the online community needed a new version of Offline, they built it using the native offline tools and installed it in UPS. The online world sees Offline as just another UPS product.

This worked well throughout the UPS era.

The online and offline worlds maintain separate collections of UPS products. The reason is that each world needs to be able to operate when the other is down.

Spack Era

CSAID has developed a spack based system for environment and build management. This is now their standard system and they no longer use UPS. The Mu2e online world has adopted the CSAID spack based system. The build system component is cmake based and is the same as for MRB. To be a bit more specific, there is a component of the build system named cetmodules that was adapted from UPS awareness to spack awareness. Modern versions of cetmodules work with either UPS or spack. At some time in the future, support for UPS will be withdrawn. Online end users need to learn a new sequence of commands to use the new spack based environment.

The offline community adapted the backend of Muse so that it has a UPS aware backend and a spack aware backend. The default is that it chooses the UPS aware backend when it is run on an SL7 machine and the spack aware backend when it is run on an AL9 machine. From an end user viewpoint, nothing changed.
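
The default backend choice described above can be pictured with a short sketch. This is not Muse's actual code: the function name `choose_backend`, the OS labels used as dictionary keys, and the `override` argument are all assumptions made for illustration. The source only says that UPS is the default on SL7 and spack is the default on AL9.

```python
# Hypothetical mapping from an OS label to the default backend.
DEFAULT_BACKEND = {"SL7": "ups", "AL9": "spack"}
VALID_BACKENDS = ("ups", "spack")


def choose_backend(os_name, override=None):
    """Pick the environment backend for a given OS label.

    `override` is an assumed escape hatch for choosing a non-default backend.
    """
    if override is not None:
        if override not in VALID_BACKENDS:
            raise ValueError(f"unknown backend {override!r}")
        return override
    try:
        return DEFAULT_BACKEND[os_name]
    except KeyError:
        raise ValueError(f"no default backend for OS {os_name!r}") from None


print(choose_backend("SL7"))
print(choose_backend("AL9"))
```

Keeping the choice in one place like this is what lets the user-facing commands stay the same while the machinery behind them changes.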

As with UPS, the online and offline worlds maintain separate collections of spack packages. The reason is the same as above.

When a new version of artdaq_core_mu2e is needed in the offline environment, an offline expert builds it natively and installs it as a spack package in the offline spack area.

To get access to modules from the Mu2e Offline repo, the online team populated the Mu2e Offline repository with CMakeLists.txt files. They then build Offline natively in the online environment and use it in that environment.

It is possible to build Offline using cmake and spack in the offline environment. The instructions can be found at Building_Offline_with_cmake#Using_Spack. Two caveats:

  1. As our understanding of how to best use spack evolves, these instructions will evolve. If things suddenly stop working, check for updated instructions. If things still do not work, let Mu2e computing management know.
  2. As of August 2024, code that is built this way cannot be captured in a tarball and submitted to run on the grid. This will be addressed in the future.

Phased Adoption of Spack

When the offline community first started to think about using spack we developed a phased plan for adopting spack. It is described in Spack#Introduction. As of August 2024 we are at Phase 2. The linked page describes the prerequisites for moving to Phase 3.