User:Kutschke/Draft SL7 EOL: Difference between revisions

From Mu2eWiki
No edit summary
 
(6 intermediate revisions by the same user not shown)
Line 3: Line 3:
==Executive Summary==
==Executive Summary==


Starting sometime in June 2024, exact date TBA, the last of the Mu2e interactive machines will be converted from SL7 to AL9.
As of May 30, 2024, some of our interactive machines run the old Scientific Linux 7 (SL7) operating system and some run the new Alma Linux 9 (AL9) operating system.  Sometime in June 2024, exact date TBA, the last of the Mu2e interactive machines will be converted from SL7 to AL9.  The purpose of this page is to discuss the transition to AL9.  There are lots of corner cases so it's long.


# All code built on SL7 machines will no longer be directly runnable interactively.
If you are logged into a Mu2e interactive machine you can check which OS it runs with:
## It will be possible to run interactively using an SL7 container but our strong preference is that you not do this and instead port your work to AL9 as discussed on this page. We will only support it for well motivated use cases.
 
/etc/redhat-release
 
You will see one of:
  Scientific Linux release 7.9 (Nitrogen)
  AlmaLinux release 9.4 (Seafoam Ocelot)
 
As of May 30, 2024 there are only 4 machines are still running SL7: see [[ComputingLogin#Machines]].
 
Below is a summary of what to expect:
 
# Code built on one type of machine will not run on the other.  So you will need to rebuild your code when you start to work to work on an AL9 machine.
## It will be possible to run SL7-built code interactively on an AL9 machine by working within a SL7 container but our strong preference is that you not do this and instead port your work to AL9 as discussed on this page. We will support use of interactive containers only for very for well motivated use cases.
# All existing SL7 builds, including grid tarballs, will be runnable on the grid for a long time into the future.  Grid jobs run in containers and SL7 containers remain supported.
# All existing SL7 builds, including grid tarballs, will be runnable on the grid for a long time into the future.  Grid jobs run in containers and SL7 containers remain supported.
# All files written on SL7 will be readable on AL9; this includes .art and .root files.
# All files written on SL7 will be readable on AL9; this includes .art and .root files.
# root based analysis of TrkAna and Stnutple files will continue to work, regardless of where the input files were produced.
# root based analysis of TrkAna and Stnutple files will continue to work, regardless of where the input files were produced.
# You can already build and run your Mu2e code on our AL9 machines.  Just log into an AL9 machine and try.  The procedure is exactly the same as on SL7. To learn which machines are SL7 and which AL9 see [[ComputingLogin#Machines]].  
# You can already build and run your Mu2e code on our AL9 machines.  Just log into an AL9 machine and try.  The procedure is exactly the same as on SL7. Details below.
# There are both SL7 and AL9 builds in the musings for Offline v10_29_00, MDC2020ae and recent git CI builds, for example main/4b7e9fcb .  Please start to use them.
# There are both SL7 and AL9 builds in the musings for Offline v10_29_00, MDC2020ae and recent git CI builds, for example main/4b7e9fcb .  Please start to use them.
# Future musings will always have an AL9 build. The next musings Offline v10_30_00 and MDC2020af will have both SL7 and AL9 builds. We will soon stop producing SL7 builds for musings.  Watch for an announcement.
# Future musings will always have an AL9 build. The next musings Offline v10_30_00 and MDC2020af will have both SL7 and AL9 builds. We will soon stop producing SL7 builds for musings.  Watch for an announcement.
# Our strong preference is that you switch to using one of the AL9 interactive machines at your earliest convenience. See below for details. If you have issues please use the slack channel is_it_me_or_a_bug.
# Our strong preference is that you switch to using one of the AL9 interactive machines at your earliest convenience. See below for details. If you have issues please use the slack channel is_it_me_or_a_bug.


===Minimal Test===
The sections below give more details.
Here is a procedure for a minimal test for an existing development area:
 
==Introduction==
The rest of this page discusses work that you will need to do to transition from working on a SL7 machine to an AL9 machine.  If you have questions about how to do this work, please do one the following.  The preferred option is to post on the is_it_me_or_a_bug channel in Mu2e Slack.  You can also
contact any of Rob, Ray and Dave.
 
==Muse Working Area with a Recent Offline==
 
If you have a Muse working area with a recent version of Offline, either a local clone or via a backing release, follow the procedure below:
 
# Do a clean login into an AL9 node.
# Do a clean login into an AL9 node.
# cd to a muse working area that contains a known working SL7 build
# cd to a muse working area that contains a working SL7 build
# Check that the ENVSET line Offline/.muse is p056 or higher.  If not, go to the section [[Muse Working Area with an Older Offline]].
# mu2einit
# mu2einit
# mu2eQuota  # make sure that you have at least 3.5 GB free on /exp/mu2e/app; if not, clean up.
# mu2eQuota  # make sure that you have at least 3.5 GB free on /exp/mu2e/app; if not, clean up.
Line 25: Line 46:
# Run a test job
# Run a test job


If you try to run before doing the build you will see a message that says that the command "mu2e" was not found.  That's because it's looking for the command mu2e in the not-yet-built al9 build area.
If you try to run mu2e before doing the build you will see a message that says that the command "mu2e" was not found.  That's because it's looking for the command mu2e in the not-yet-built al9 build area.


The sections below give more details.
After you have done the "muse build", look in the build subdirectory of your Muse working area.  You will see that it contains subdirectories with names like "al9-prof-e28-p056" or "sl7-prof-e28-p056".  You may recursively delete the any of the sl7-* directories.  


==Introduction==
==Muse Working Area with an Older Offline==
Most of the Mu2e interactive machines have already been upgraded to AL9. For a list of which machines are running which OS see [[ComputingLogin#Machines]].  Sometime before June 30, likely as early as mid June, the remaining machines will be upgraded to AL9.  Ray or I will send email to the computing and software spack channel when the date for the transition is known.


The rest of this page discusses work that you may have to do to prepare for the last SL7 nodes going awayIf you have questions about how to do this work, please do one the followingThe preferred option is to post on the is_it_me_or_a_bug channel in Mu2e Slack.  You can also
As of May 30, on AL9 we have art suite binaries only for art suite v3_14_03.  We expect to have binaries for art suite v3_15_00 during the week of May 3, 2024.  We expect to have another new art built with root v6_32_00 not too long after thatWe do not, and will not, have older versions of the art suite available for AL9Therefore we cannot support versions of Offline on AL9 that require older versions of art.
contact any of Rob, Ray and Dave.


==Muse Working Area with a Recent Offline==
There are two cases to consider:
If you have a Muse working area with a recent version of Offline, either a local clone or via a backing release, follow the procedure given in [[#Minimal Test]].  You will be able to continue working as before.


After you have done the "muse build" look in the build subdirectory of your Muse working area.  You will see that it contains subdirectories with names like "al9-prof-e28-p056" or "sl7-prof-e28-p056"You can delete the tree rooted at and subdirectory that starts with sl7. Depending on your disk quota situation you may need to delete the sl7 subdirectory before you do the buildSee [[Disks#Quotas]].
#  If you use a working clone of Offline that you build yourself, and if uses an envset older than p055, then your working code will not build on AL9To check, look in your clone of Offline at the file .muse and look for the line that begins ENVSET.
# If you use a backing musing of check the version.  Do do "ls -l backing" and see what the symlink points toThe versions that are know to work on AL9 are Offline v10_28_00, v10_29_00, SimJob/MDC2020ac, and SimJob/MDC2020ae.


==Muse Working Area with an Older Offline==
In either case, if you are using an older version you will need to advance your work to one of the known-to work versions.  It's hard to give an punchlist to do the upgrade because different people use git in slightly different ways.  The concept is:


As of May 30, on AL9 we only have art suite binaries for art v3_14_03. We expect to have binaries for art suite v3_15_00 during the week of May 3, 2024.   We expect to have another new art built with root v6_32_00 not too long after that. We will not have older versions of the art suite available on AL9.  Therefore we cannot support versions of Offline that require older versions of art.
# Save your current work on branch that is NOT named "main" and push it to your GitHub fork.
#* If you have been working on main, "checkout -b new_branch_name main" will do what you need.
# Make sure that your working clone contains the https://github.com/Mu2e/Offline as a remote.
# Start a new working branch from the head of the main branch of Mu2e/Offline.
# Merge your old working branch into your new working branch, resolve conflicts, build and test.
# Push your new working branch to your GitHub fork.


If your have a clone of Offline that you build yourself, and if uses an envset p055 or older, then your working code will not build on AL9.  To check, look in your clone of Offline at the file .muse and look for the line that begins ENVSET.  The upgrade to this envset was done with SHA 281f08ed60e, PR #1225, which was merged on March 28, 2024.


If you are in this situation, please port your code to work from the current head of Offline.  Please do so as soon as possible so that you are not caught when the remaining machines are upgraded to AL9.  If you don't know how to do this, ask an expert.
Please do so as soon as possible so that you are not caught when the remaining machines are upgraded to AL9.  If you don't know how to do this, ask an expert.
 
If you are using a backing musing of Offline older than v10_29_00, it will not work an AL9.  If you are using a SimJob Musing older than SimJob/MDC2020ae, it will not work on AL9.  If you are in either of these situations, please port your code to work from the musing Offline/v10_29_00 or later ( or SimJob/MDC2020ae or later).


==Running Grid Jobs==
==Running Grid Jobs==


Grid jobs run in containers.  The jobsub default is to choose the most recent Fermilab supplied AL9 container.  The mu2eprodsys default is to choose the most recent Fermilab supplied AL9 container.  To tell mu2eprodsys to omit the override of the jobsub default and to let the jobsub default stand, use the mu2eprodsys option:
Grid jobs run in containers.  The jobsub default is to choose the most recent Fermilab supplied AL9 container.  The mu2eprodsys default is to choose the most recent Fermilab supplied SL7 container.  To tell mu2eprodsys to let the jobsub default stand, use the mu2eprodsys option:


   --predefined-args=none
   --predefined-args=none
Line 65: Line 86:
   --setup=/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020ae/setup.sh
   --setup=/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020ae/setup.sh


If the Musing contains both SL7 and AL9 builds, this will work on either type of node.  If it contains only one build, you must be sure to chose the OS that matches the available build.
If the Musing contains both SL7 and AL9 builds, this will work on either type of container.  If it contains only one build, you must be sure to use --predefined-args to chose the container type that matches the available build.


==Existing Grid Tarballs==
==Existing Grid Tarballs==
Line 73: Line 94:
I believe, but have not tested, that in the AL9 only era you will be able to make grid tarball from and existing SL7 build and it will work on the grid.
I believe, but have not tested, that in the AL9 only era you will be able to make grid tarball from and existing SL7 build and it will work on the grid.


==Legacy codes: SU2020, MDC2018 etc==
==Legacy Offline builds: SU2020, MDC2018 etc==


The older art suite versions necessary to support legacy Mu2e codes will not be available on AL9.  If there are vital features in these older codes that are not available in the current Offline, the right answer is to port these features to current Offline.
The older art suite versions necessary to support legacy Mu2e codes will not be available on AL9.  If there are vital features in these older codes that are not available in the current Offline, the right answer is to port these features to current Offline.
==Stntuple==
As of May 30, 2024, the Stntuple build scripts have not been updated to be complient with spack.  Therefore they will not work on AL9.  The author of Stnutple has been notified.

Latest revision as of 16:44, 3 June 2024

Countdown to SL7 End of Life

Executive Summary

As of May 30, 2024, some of our interactive machines run the old Scientific Linux 7 (SL7) operating system and some run the new Alma Linux 9 (AL9) operating system. Sometime in June 2024, exact date TBA, the last of the Mu2e interactive machines will be converted from SL7 to AL9. The purpose of this page is to discuss the transition to AL9. There are lots of corner cases so it's long.

If you are logged into a Mu2e interactive machine you can check which OS it runs with:

 cat /etc/redhat-release

You will see one of:

 Scientific Linux release 7.9 (Nitrogen)
 AlmaLinux release 9.4 (Seafoam Ocelot)
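
If you want to make this check in a script, a minimal sketch follows. The function name classify_release is hypothetical, and the patterns assume the release strings look like the two examples above.

```shell
# Classify a release string as sl7, al9 or unknown.
# classify_release is a hypothetical helper, not an existing Mu2e tool.
classify_release() {
  case "$1" in
    "Scientific Linux release 7."*) echo sl7 ;;
    "AlmaLinux release 9."*)        echo al9 ;;
    *)                              echo unknown ;;
  esac
}

# On an interactive node; prints "unknown" if the file is missing.
classify_release "$(cat /etc/redhat-release 2>/dev/null)"
```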

As of May 30, 2024, only 4 machines are still running SL7: see ComputingLogin#Machines.

Below is a summary of what to expect:

  1. Code built on one type of machine will not run on the other. So you will need to rebuild your code when you start to work on an AL9 machine.
    1. It will be possible to run SL7-built code interactively on an AL9 machine by working within an SL7 container, but our strong preference is that you not do this and instead port your work to AL9 as discussed on this page. We will support use of interactive containers only for well-motivated use cases.
  2. All existing SL7 builds, including grid tarballs, will be runnable on the grid for a long time into the future. Grid jobs run in containers and SL7 containers remain supported.
  3. All files written on SL7 will be readable on AL9; this includes .art and .root files.
  4. ROOT-based analysis of TrkAna and Stntuple files will continue to work, regardless of where the input files were produced.
  5. You can already build and run your Mu2e code on our AL9 machines. Just log into an AL9 machine and try. The procedure is exactly the same as on SL7. Details below.
  6. There are both SL7 and AL9 builds in the musings for Offline v10_29_00, MDC2020ae and recent git CI builds, for example main/4b7e9fcb. Please start to use them.
  7. Future musings will always have an AL9 build. The next musings Offline v10_30_00 and MDC2020af will have both SL7 and AL9 builds. We will soon stop producing SL7 builds for musings. Watch for an announcement.
  8. Our strong preference is that you switch to using one of the AL9 interactive machines at your earliest convenience. See below for details. If you have issues, please use the Slack channel is_it_me_or_a_bug.

The sections below give more details.

Introduction

The rest of this page discusses work that you will need to do to transition from working on an SL7 machine to an AL9 machine. If you have questions about how to do this work, please do one of the following. The preferred option is to post on the is_it_me_or_a_bug channel in Mu2e Slack. You can also contact any of Rob, Ray and Dave.

Muse Working Area with a Recent Offline

If you have a Muse working area with a recent version of Offline, either a local clone or via a backing release, follow the procedure below:

  1. Do a clean login into an AL9 node.
  2. cd to a Muse working area that contains a working SL7 build.
  3. Check that the ENVSET line in Offline/.muse is p056 or higher. If not, go to the section Muse Working Area with an Older Offline.
  4. mu2einit
  5. mu2eQuota # make sure that you have at least 3.5 GB free on /exp/mu2e/app; if not, clean up.
  6. muse setup
  7. muse build -j N # where N is 6 for otherwise empty mu2egpvm machines; N=24 is good for mu2ebuild02.
  8. Run a test job
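
The ENVSET check in step 3 can be scripted. The sketch below is only an illustration: the helper names envset_of and envset_is_recent are hypothetical, and it assumes that the line in .muse has the form "ENVSET p056", with the tag as the second whitespace-separated field.

```shell
# Print the envset tag from a .muse file.
# Assumed line format: "ENVSET p056".
envset_of() {
  awk '$1 == "ENVSET" { print $2; exit }' "$1"
}

# Succeed if the envset is p056 or higher.
# Returns 1 if it is older, 2 if the tag is missing or has an unexpected form.
envset_is_recent() {
  tag=$(envset_of "$1")
  num=${tag#p}
  case "$num" in
    ''|*[!0-9]*) return 2 ;;
  esac
  # Strip leading zeros so the value is always read as decimal.
  num=$(echo "$num" | sed 's/^0*//')
  [ "${num:-0}" -ge 56 ]
}

# Example use from the top of a Muse working area with a local clone.
if [ -f Offline/.muse ]; then
  envset_is_recent Offline/.muse && echo "envset is recent enough for AL9"
fi
```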

If you try to run mu2e before doing the build you will see a message that says that the command "mu2e" was not found. That's because it's looking for the command mu2e in the not-yet-built al9 build area.

After you have done the "muse build", look in the build subdirectory of your Muse working area. You will see that it contains subdirectories with names like "al9-prof-e28-p056" or "sl7-prof-e28-p056". You may recursively delete any of the sl7-* directories.
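
Before deleting anything it can help to list the candidates. A minimal sketch, assuming the directory layout described above; list_sl7_builds is a hypothetical helper name.

```shell
# List the sl7-* build directories directly under a build area.
# The argument defaults to "build" in the current directory.
list_sl7_builds() {
  find "${1:-build}" -mindepth 1 -maxdepth 1 -type d -name 'sl7-*' | sort
}

# Review the list first. Then, if it looks right, delete by hand, e.g.:
#   rm -rf build/sl7-prof-e28-p056
```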

Muse Working Area with an Older Offline

As of May 30, on AL9 we have art suite binaries only for art suite v3_14_03. We expect to have binaries for art suite v3_15_00 during the week of May 3, 2024. We expect to have another new art built with root v6_32_00 not too long after that. We do not, and will not, have older versions of the art suite available for AL9. Therefore we cannot support versions of Offline on AL9 that require older versions of art.

There are two cases to consider:

  1. If you use a working clone of Offline that you build yourself, and if it uses an envset older than p055, then your working code will not build on AL9. To check, look in your clone of Offline at the file .muse and look for the line that begins ENVSET.
  2. If you use a backing musing, check the version. Do "ls -l backing" and see what the symlink points to. The versions that are known to work on AL9 are Offline v10_28_00, v10_29_00, SimJob/MDC2020ac, and SimJob/MDC2020ae.
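
For the backing-musing case, the symlink check can be sketched as below. The function name is_known_al9_musing is hypothetical, and the patterns assume that the link target contains path components such as Offline/v10_29_00 or SimJob/MDC2020ae; the list holds only the versions named above.

```shell
# Succeed if a backing link target names a musing known to work on AL9.
is_known_al9_musing() {
  case "$1" in
    */Offline/v10_28_00*|*/Offline/v10_29_00*) return 0 ;;
    */SimJob/MDC2020ac*|*/SimJob/MDC2020ae*)   return 0 ;;
    *)                                         return 1 ;;
  esac
}

# Example use from the top of a Muse working area.
if [ -L backing ]; then
  target=$(readlink backing)
  if is_known_al9_musing "$target"; then
    echo "backing musing is known to work on AL9: $target"
  else
    echo "backing musing is not on the known-good list: $target"
  fi
fi
```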

In either case, if you are using an older version you will need to advance your work to one of the known-to-work versions. It's hard to give a punch list for the upgrade because different people use git in slightly different ways. The concept is:

  1. Save your current work on a branch that is NOT named "main" and push it to your GitHub fork.
    • If you have been working on main, "git checkout -b new_branch_name main" will do what you need.
  2. Make sure that your working clone has https://github.com/Mu2e/Offline as a remote.
  3. Start a new working branch from the head of the main branch of Mu2e/Offline.
  4. Merge your old working branch into your new working branch, resolve conflicts, build and test.
  5. Push your new working branch to your GitHub fork.
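
One possible way to carry out this concept with plain git is sketched below. It is not the only workflow. The remote names "origin" (your fork) and "mu2e" (Mu2e/Offline), the branch names, and the helper name port_branch are all assumptions made for illustration.

```shell
# Steps 1, 2 and 5 are ordinary git commands, shown here as comments:
#   git checkout -b old_work main       # step 1, if you were working on main
#   git push origin old_work            # step 1, fork assumed to be "origin"
#   git remote add mu2e https://github.com/Mu2e/Offline   # step 2, if missing
#   git push origin new_work            # step 5

# Steps 3 and 4: start <new_branch> from the head of <remote>/main and
# merge <old_branch> into it. The chain stops at the first failing
# command, for example when the merge reports conflicts that you must
# resolve by hand. Building and testing are left to you.
port_branch() {
  old_branch=$1
  new_branch=$2
  remote=${3:-mu2e}
  git fetch "$remote" &&
    git checkout -b "$new_branch" "$remote/main" &&
    git merge --no-edit "$old_branch"
}

# Example: port_branch old_work new_work mu2e
```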


Please do so as soon as possible so that you are not caught when the remaining machines are upgraded to AL9. If you don't know how to do this, ask an expert.

Running Grid Jobs

Grid jobs run in containers. The jobsub default is to choose the most recent Fermilab supplied AL9 container. The mu2eprodsys default is to choose the most recent Fermilab supplied SL7 container. To tell mu2eprodsys to let the jobsub default stand, use the mu2eprodsys option:

 --predefined-args=none

Sometime soon we will update mu2eprodsys to leave AL9 as the default. Watch for an announcement. After this, you will be able to select an SL7 container with the mu2eprodsys option:

 --predefined-args=sl7

If you are submitting a grid tarball be sure to choose the OS that matches the OS on which the tarball was built.
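
The choice of option can be summarized in a small lookup. This is a sketch of the rules stated above, not part of mu2eprodsys; container_option is a hypothetical helper. It assumes that --predefined-args=sl7 becomes available only after the announced update, and that no option is needed when the mu2eprodsys default already matches the build.

```shell
# Print the mu2eprodsys option needed for a build, or nothing if the
# default already matches.
#   $1: OS of the build or tarball: sl7 or al9
#   $2: current mu2eprodsys default container: sl7 (as of this page)
#       or al9 (after the announced update)
container_option() {
  case "$1:$2" in
    al9:sl7)         echo "--predefined-args=none" ;;  # let the jobsub AL9 default stand
    sl7:al9)         echo "--predefined-args=sl7" ;;   # request an SL7 container
    al9:al9|sl7:sl7) echo "" ;;                        # default already matches
    *)               return 1 ;;                       # unrecognized input
  esac
}

# Example: an AL9 tarball while the mu2eprodsys default is still SL7.
container_option al9 sl7
```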

If you submit a grid job that runs on an existing musing, you use a mu2eprodsys option like:

 --setup=/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020ae/setup.sh

If the Musing contains both SL7 and AL9 builds, this will work on either type of container. If it contains only one build, you must be sure to use --predefined-args to choose the container type that matches the available build.

Existing Grid Tarballs

If you have existing grid tarballs built for SL7, you will still be able to run them on the grid for a long time to come. To do this, request an SL7 container when you submit your job; see #Running Grid Jobs.

I believe, but have not tested, that in the AL9-only era you will be able to make a grid tarball from an existing SL7 build and it will work on the grid.

Legacy Offline builds: SU2020, MDC2018 etc

The older art suite versions necessary to support legacy Mu2e codes will not be available on AL9. If there are vital features in these older codes that are not available in the current Offline, the right answer is to port these features to current Offline.

Stntuple

As of May 30, 2024, the Stntuple build scripts have not been updated to be compliant with spack. Therefore they will not work on AL9. The author of Stntuple has been notified.