User:Kutschke/Draft SL7 EOL: Difference between revisions

From Mu2eWiki
Latest revision as of 16:44, 3 June 2024

Countdown to SL7 End of Life

Executive Summary

As of May 30, 2024, some of our interactive machines run the old Scientific Linux 7 (SL7) operating system and some run the new Alma Linux 9 (AL9) operating system. Sometime in June 2024, exact date TBA, the last of the Mu2e interactive machines will be converted from SL7 to AL9. The purpose of this page is to discuss the transition to AL9. There are lots of corner cases so it's long.

If you are logged into a Mu2e interactive machine you can check which OS it runs with:

 cat /etc/redhat-release

You will see one of:

 Scientific Linux release 7.9 (Nitrogen)
 AlmaLinux release 9.4 (Seafoam Ocelot)
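
If you want to make this check from a script, the following is a minimal sketch. The helper name os_tag is hypothetical and is not part of any Mu2e tool; it assumes that the release string begins as in the two examples above.

```shell
# Classify a release string such as the contents of /etc/redhat-release.
# os_tag is a hypothetical helper name, not a Mu2e command.
os_tag() {
  case "$1" in
    "Scientific Linux release 7"*) echo sl7 ;;
    "AlmaLinux release 9"*)        echo al9 ;;
    *)                             echo unknown ;;
  esac
}

# On an interactive machine, classify the local OS if the file is readable.
if [ -r /etc/redhat-release ]; then
  os_tag "$(cat /etc/redhat-release)"
fi
```

Anything other than the two release strings shown above is reported as unknown rather than guessed at.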

As of May 30, 2024, only 4 machines are still running SL7: see ComputingLogin#Machines.

Below is a summary of what to expect:

  1. Code built on one type of machine will not run on the other. So you will need to rebuild your code when you start to work on an AL9 machine.
    1. It will be possible to run SL7-built code interactively on an AL9 machine by working within an SL7 container, but our strong preference is that you not do this and instead port your work to AL9 as discussed on this page. We will support use of interactive containers only for very well motivated use cases.
  2. All existing SL7 builds, including grid tarballs, will be runnable on the grid for a long time into the future. Grid jobs run in containers and SL7 containers remain supported.
  3. All files written on SL7 will be readable on AL9; this includes .art and .root files.
  4. ROOT-based analysis of TrkAna and Stntuple files will continue to work, regardless of where the input files were produced.
  5. You can already build and run your Mu2e code on our AL9 machines. Just log into an AL9 machine and try. The procedure is exactly the same as on SL7. Details below.
  6. There are both SL7 and AL9 builds in the musings for Offline v10_29_00, MDC2020ae and recent git CI builds, for example main/4b7e9fcb. Please start to use them.
  7. Future musings will always have an AL9 build. The next musings Offline v10_30_00 and MDC2020af will have both SL7 and AL9 builds. We will soon stop producing SL7 builds for musings. Watch for an announcement.
  8. Our strong preference is that you switch to using one of the AL9 interactive machines at your earliest convenience. See below for details. If you have issues, please use the Slack channel is_it_me_or_a_bug.

The sections below give more details.

Introduction

The rest of this page discusses work that you will need to do to transition from working on an SL7 machine to an AL9 machine. If you have questions about how to do this work, the preferred option is to post on the is_it_me_or_a_bug channel in Mu2e Slack. You can also contact any of Rob, Ray and Dave.

Muse Working Area with a Recent Offline

If you have a Muse working area with a recent version of Offline, either a local clone or via a backing release, follow the procedure below:

  1. Do a clean login into an AL9 node.
  2. cd to a muse working area that contains a working SL7 build
  3. Check that the ENVSET line in Offline/.muse is p056 or higher. If not, go to the section Muse Working Area with an Older Offline.
  4. mu2einit
  5. mu2eQuota # make sure that you have at least 3.5 GB free on /exp/mu2e/app; if not, clean up.
  6. muse setup
  7. muse build -j N # where N is 6 for otherwise empty mu2egpvm machines; N=24 is good for mu2ebuild02.
  8. Run a test job

If you try to run mu2e before doing the build you will see a message that says that the command "mu2e" was not found. That's because it's looking for the command mu2e in the not-yet-built al9 build area.

After you have done the "muse build", look in the build subdirectory of your Muse working area. You will see that it contains subdirectories with names like "al9-prof-e28-p056" or "sl7-prof-e28-p056". You may recursively delete any of the sl7-* directories.
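
To see which SL7 build directories are present before you delete anything, something like the following sketch may help. The helper name list_sl7_builds is hypothetical; the only assumption is the layout described above, in which the build directories sit directly under the build subdirectory of the Muse working area.

```shell
# List the sl7-* build directories of a Muse working area.
# list_sl7_builds is a hypothetical helper; $1 is the path to the working area.
list_sl7_builds() {
  find "$1/build" -mindepth 1 -maxdepth 1 -type d -name 'sl7-*' | sort
}

# Typical use from the top of the working area (shown as comments so that
# nothing is deleted by accident):
#   list_sl7_builds .        # review the list first
#   rm -rf build/sl7-*       # then delete recursively
```

Listing first and deleting second is a deliberate choice: a recursive delete cannot be undone, and the al9-* directories next to them are the ones you want to keep.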

Muse Working Area with an Older Offline

As of May 30, on AL9 we have art suite binaries only for art suite v3_14_03. We expect to have binaries for art suite v3_15_00 during the week of May 3, 2024. We expect to have another new art built with root v6_32_00 not too long after that. We do not, and will not, have older versions of the art suite available for AL9. Therefore we cannot support versions of Offline on AL9 that require older versions of art.

There are two cases to consider:

  1. If you use a working clone of Offline that you build yourself, and if it uses an envset older than p055, then your working code will not build on AL9. To check, look in your clone of Offline at the file .muse and look for the line that begins ENVSET.
  2. If you use a backing musing, check its version. Do "ls -l backing" and see what the symlink points to. The versions that are known to work on AL9 are Offline v10_28_00, v10_29_00, SimJob/MDC2020ac, and SimJob/MDC2020ae.
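
Both checks can be done from the command line. The sketch below assumes that the ENVSET line in .muse has the form "ENVSET p056", that is, the keyword followed by whitespace and the envset name; that format is an assumption, and envset_of is a hypothetical helper name.

```shell
# Print the envset named on the ENVSET line of a .muse file.
# envset_of is a hypothetical helper; $1 is the path to the .muse file.
# Assumes a line of the form "ENVSET <name>".
envset_of() {
  awk '$1 == "ENVSET" { print $2; exit }' "$1"
}

# Typical use from the top of the Muse working area:
#   envset_of Offline/.muse   # case 1: envset of a local clone
#   ls -l backing             # case 2: see what the backing symlink points to
```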

In either case, if you are using an older version you will need to advance your work to one of the known-to-work versions. It's hard to give a punch list for the upgrade because different people use git in slightly different ways. The concept is:

  1. Save your current work on a branch that is NOT named "main" and push it to your GitHub fork.
    • If you have been working on main, "git checkout -b new_branch_name main" will do what you need.
  2. Make sure that your working clone has https://github.com/Mu2e/Offline as a remote.
  3. Start a new working branch from the head of the main branch of Mu2e/Offline.
  4. Merge your old working branch into your new working branch, resolve conflicts, build and test.
  5. Push your new working branch to your GitHub fork.
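
The steps above can be sketched with git commands as follows. This is one possible way to do it, not a prescription: the remote name mu2e, the helper name port_to_upstream_main and the branch names are all hypothetical, and it assumes that your GitHub fork is the remote named origin. Steps 2 to 4 are wrapped in a function; the pushes in steps 1 and 5 are shown as comments.

```shell
# Start a new branch from the head of upstream main and merge old work into it.
# port_to_upstream_main is a hypothetical helper.
#   $1 = upstream URL, $2 = existing branch with your work, $3 = new branch name
port_to_upstream_main() {
  upstream_url=$1
  old_branch=$2
  new_branch=$3
  # Step 2: add the upstream remote (named "mu2e" here) if it is not configured.
  git remote get-url mu2e >/dev/null 2>&1 || git remote add mu2e "$upstream_url"
  git fetch mu2e
  # Step 3: new working branch from the head of upstream main.
  git checkout -b "$new_branch" --no-track mu2e/main
  # Step 4: merge the old work; conflicts, if any, must be resolved by hand.
  git merge --no-edit "$old_branch"
}

# Typical use from inside your clone of Offline:
#   git push origin my_old_work                  # step 1
#   port_to_upstream_main https://github.com/Mu2e/Offline my_old_work my_new_work
#   (resolve conflicts, muse build, run a test job)
#   git push origin my_new_work                  # step 5
```

If the merge stops with conflicts, the function returns with a non-zero status and leaves you on the new branch with the merge in progress, which is where you would resolve them.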


Please do so as soon as possible so that you are not caught when the remaining machines are upgraded to AL9. If you don't know how to do this, ask an expert.

Running Grid Jobs

Grid jobs run in containers. The jobsub default is to choose the most recent Fermilab-supplied AL9 container. The mu2eprodsys default is to choose the most recent Fermilab-supplied SL7 container. To tell mu2eprodsys to let the jobsub default stand, use the mu2eprodsys option:

 --predefined-args=none

Sometime soon we will update mu2eprodsys to leave AL9 as the default. Watch for an announcement. After this, you will be able to select an SL7 container with the mu2eprodsys option:

 --predefined-args=sl7

If you are submitting a grid tarball be sure to choose the OS that matches the OS on which the tarball was built.

If you submit a grid job that runs on an existing musing, you use a mu2eprodsys option like:

 --setup=/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020ae/setup.sh

If the Musing contains both SL7 and AL9 builds, this will work on either type of container. If it contains only one build, you must be sure to use --predefined-args to choose the container type that matches the available build.

Existing Grid Tarballs

If you have existing grid tarballs built for SL7, you will still be able to run them on the grid for a long time to come. To do this, request an SL7 container when you submit your job; see #Running Grid Jobs.

I believe, but have not tested, that in the AL9-only era you will be able to make a grid tarball from an existing SL7 build and it will work on the grid.

Legacy Offline builds: SU2020, MDC2018 etc

The older art suite versions necessary to support legacy Mu2e codes will not be available on AL9. If there are vital features in these older codes that are not available in the current Offline, the right answer is to port these features to current Offline.

Stntuple

As of May 30, 2024, the Stntuple build scripts have not been updated to be compliant with spack. Therefore they will not work on AL9. The author of Stntuple has been notified.