Difference between revisions of "SatelliteRelease"

From Mu2eWiki

Revision as of 21:04, 24 March 2017

Introduction

This page discusses a system called satellite builds or satellite releases that is used to support the following two use cases:

  1. You want to develop a new study for which you will create a few new art modules and, perhaps, a few classes and functions that are used by these modules; then you will run an instance of Mu2e Offline that uses your new module. Your new code may build against Mu2e Offline but no code in Mu2e Offline may build against your new files.
  2. You want to develop modifications to a small part of Mu2e Offline. You would like to rebuild only the parts of Mu2e Offline that you modify, plus all of the parts that depend on your modifications.

In both cases it is convenient to be able to build just your new code without having to clone and build the entirety of Mu2e Offline. Compared to building all of Mu2e Offline, the conveniences are:

  • Each build is faster, particularly the first build.
  • Your working area will occupy much less disk space.
  • Your working space is not littered with files that are not relevant to the task at hand.

This page will discuss some new features of Mu2e Offline that are designed to support these use cases. When we are emphasizing the build system aspects we will refer to satellite builds; when we are emphasizing the release management aspects we will refer to satellite releases.

While we have a prototype for the build system we do not yet have good support for release management. See the section on missing features.

Quick Start

This section gives a cookbook example of creating, building and running a satellite release that will work on any of the mu2egpvm* machines. These instructions work only for Mu2e Offline versions v5_7_7 and higher.

The example below:

  • uses a git repository that contains one module, an SConscript file and a fcl file to run it.
  • builds against a base release of Offline v5_7_7 built for SLF6 with the prof qualifier.
  1. Start a new terminal window.
  2. Choose a clean working directory. Either your home directory or /mu2e/app/users will work:
    mkdir ~/satellite-test
    cd ~/satellite-test
    

    This example shows how to work in your home directory. If you wish to submit grid jobs from this directory, you should put this directory in your space on /mu2e/app/users/<username>.

  3. Clone the satellite release code and cd into it
    git clone http://cdcvs.fnal.gov/projects/mu2eofflinesoftwaremu2eoffline-satellite satellite
    cd satellite
    

    This created four files:

    .gitignore
    Example/fcl/printGens.fcl
    Example/src/PrintGens_module.cc
    Example/src/SConscript
    

    This code+fcl will print the content of a GenParticleCollection data product with the input tag "generate".

  4. Setup the Mu2e environment:
    setup mu2e
    
  5. Create the satellite release management files. This example shows how to use Offline version v5_7_7 as the base release. If you wish to use a different base release, run the bin/createSatelliteRelease script from that base release:
    /cvmfs/mu2e.opensciencegrid.org/Offline/v5_7_7/SLF6/prof/Offline/bin/createSatelliteRelease
    

    If you get a message asking you if you want to continue or abort, see the full description below.

  6. The previous step created one new file and two new symlinks in the current directory:
    ls -la
    ( files you have seen before removed )
    lrwxrwxrwx 1 kutschke 3000   75 Jul 11 16:01 SConstruct -> /cvmfs/mu2e.opensciencegrid.org/Offline/v5_7_7/SLF6/prof/Offline/SConstruct
    lrwxrwxrwx 1 kutschke 3000   54 Jul 11 16:01 out -> /mu2e/data/users/kutschke/satellite-test/satellite/out
    -rw-rw-r-- 1 kutschke 3000  446 Jul 11 16:01 setup.sh
    

    If the base release contains a .buildopts file, this step will also create a symlink in the current directory to that .buildopts file.

  7. If in step 5 you chose to use a version of Mu2e Offline numbered v6_0_0 or greater (this includes the g4v10_validation_* releases), you can skip this step. If you are using a release of Mu2e Offline between v5_7_7 and v5_7_9, inclusive, do the following:
    git checkout -b work v5_7_7     
    

    In this step you are checking out a version of the example code that is matched to these releases of Mu2e Offline.

  8. Setup this satellite release.
    source setup.sh
    
  9. Build and run the example
    scons -j 4
    mu2e -c Example/fcl/printGens.fcl -n 10 \
    -s /pnfs/mu2e/tape/phy-sim/sim/mu2e/cd3-beam-g4s4-detconversion/v566/1d/ab/sim.mu2e.cd3-beam-g4s4-detconversion.v566.004001_00006825.art
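
Whether step 7 applies depends only on the version of the base release. The following is a minimal sketch of that decision as a shell function, assuming version names of the form vMAJOR_MINOR_PATCH. The function name needs_step7_checkout is ours and is not part of Mu2e Offline. Names that do not follow this form, such as the g4v10_validation_* releases, are reported as unrecognized (status 2) and must be handled by hand.

```shell
# Sketch only: returns 0 if the base release is in the range v5_7_7 .. v5_7_9
# (step 7 is needed), 1 if it is a recognized version outside that range,
# and 2 if the name is not of the form vMAJOR_MINOR_PATCH.
needs_step7_checkout() {
    ver=${1#v}
    [ "$ver" != "$1" ] || return 2          # name must start with "v"
    case "$ver" in
        *_*_*) ;;                           # need at least three fields
        *) return 2 ;;
    esac
    major=${ver%%_*}
    rest=${ver#*_}
    minor=${rest%%_*}
    patch=${rest#*_}
    for part in "$major" "$minor" "$patch"; do
        case "$part" in
            ''|*[!0-9]*) return 2 ;;        # every field must be all digits
        esac
    done
    [ "$major" -eq 5 ] && [ "$minor" -eq 7 ] &&
        [ "$patch" -ge 7 ] && [ "$patch" -le 9 ]
}

if needs_step7_checkout v5_7_7; then
    echo "run: git checkout -b work v5_7_7"
fi
```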
    

Comments:

  • Steps 3 and 4 are independent and can be done in either order.
  • You can run the createSatelliteRelease script on an empty directory and create code by hand; remember to provide an SConscript file.
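
Step 9 hard-codes scons -j 4. If you work on machines with different core counts, you can compute the value yourself. This is a user-side sketch, not a feature of Mu2e Offline; the variable name SATELLITE_SCONS_JOBS is hypothetical and is used only here.

```shell
# Sketch only: print a value suitable for "scons -j".
# SATELLITE_SCONS_JOBS is a hypothetical override, not a Mu2e variable.
scons_jobs() {
    if [ -n "${SATELLITE_SCONS_JOBS:-}" ]; then
        echo "$SATELLITE_SCONS_JOBS"
    elif n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) && [ "$n" -ge 1 ] 2>/dev/null; then
        echo "$n"                           # number of online cores
    else
        echo 1                              # safe fallback
    fi
}

# Usage: scons -j "$(scons_jobs)"
```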

Choosing a Base Release

The first step is to choose a base release against which you will build your satellite release; a base release is any complete build of Mu2e Offline. If you are running on a machine that sees the mu2e cvmfs file system, you will normally choose the most recent release of Mu2e Offline on cvmfs. You also need to learn the operating system on which you are running and you need to decide if you want to use a prof or debug build. At this writing (July 2016) all of the mu2egpvm* machines run SLF6 and detsim is SL5. If you are on another machine you can learn which SLF version you are using by:

cat /etc/redhat-release
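
If you want the answer in the form used in the base release paths (for example SLF6), a small helper can extract the major version from that text. This is a sketch: the function name slf_label is ours, the sample string in the usage line is an assumed example of what the file contains on an SLF6 machine, and the function does not check that the distribution really is Scientific Linux Fermi.

```shell
# Sketch only: turn the text of /etc/redhat-release into a label such as SLF6.
# Returns 1 if no "release <number>" is found in the text.
slf_label() {
    major=$(printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p')
    [ -n "$major" ] || return 1
    echo "SLF$major"
}

# On a real machine: slf_label "$(cat /etc/redhat-release)"
slf_label "Scientific Linux Fermi release 6.10 (Ramsey)"
```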

At this writing (July 2016) the most recent SLF6 base release with a prof build is:

/cvmfs/mu2e.opensciencegrid.org/Offline/v5_7_7/SLF6/prof/Offline

Everything on this page uses the above base release as an example.
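
The example path is built from three choices: the Offline version, the operating system and the build qualifier. The sketch below composes a path from those three pieces. It assumes that other releases on cvmfs follow the same layout as the one example above, which this page does not state, so check that the directory exists before using it. The function name base_release_dir is ours.

```shell
# Sketch only: compose a cvmfs base release path.
# $1 = version (e.g. v5_7_7), $2 = OS label (e.g. SLF6), $3 = qualifier (prof or debug)
# The layout is inferred from the single example path on this page.
base_release_dir() {
    echo "/cvmfs/mu2e.opensciencegrid.org/Offline/$1/$2/$3/Offline"
}

base=$(base_release_dir v5_7_7 SLF6 prof)
echo "$base"
# Before using it: [ -d "$base" ] || echo "no such base release: $base"
```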

You may use any complete build of Mu2e Offline as a base release. For example, if you cloned Mu2e Offline into:

/mu2e/app/users/<username>/Offline

and built it, then you can use it as a base release.

Creating a Satellite Release

On any of the mu2egpvm machines, you create a satellite release as follows:

setup mu2e
/cvmfs/mu2e.opensciencegrid.org/Offline/v5_7_7/SLF6/prof/Offline/bin/createSatelliteRelease

Note that the script is located at bin/createSatelliteRelease relative to the base release chosen in the previous section. This script will create the following files:

lrwxrwxrwx 1 kutschke mu2e     75 Jul 10 17:03 SConstruct -> /cvmfs/mu2e.opensciencegrid.org/Offline/v5_7_7/SLF6/prof/Offline/SConstruct
lrwxrwxrwx 1 kutschke mu2e     54 Jul 10 17:03 out -> /mu2e/data/users/kutschke/Development/Tutorial/t01/out
-rw-rw-r-- 1 kutschke mu2e    446 Jul 10 17:03 setup.sh

If the base release contains a .buildopts file, the satellite release directory will also contain a symbolic link to that .buildopts file. A discussion of these files follows shortly.

To use the satellite release there is one more step:

source setup.sh

Look at the setup.sh file. You will see that it sources the setup script from the base release and then adds the local bin and lib directories to the relevant path variables (it's OK that these directories do not yet exist). SConstruct and, if present, .buildopts, are the rules to build your code; this configuration says to use the same rules as those found in the base release.
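
To illustrate what adding a directory to a path variable means, here is a minimal sketch. It is not the content of the generated setup.sh, which you should read directly. The function name prepend_dir and the directory /mu2e/app/users/someone/satellite are hypothetical, and we assume that the relevant variables are colon-separated lists such as PATH and LD_LIBRARY_PATH.

```shell
# Sketch only: prepend a directory to a colon-separated path value.
# $1 = directory to add, $2 = current value of the variable (may be empty)
prepend_dir() {
    if [ -z "$2" ]; then
        echo "$1"                           # avoid a dangling ":" when empty
    else
        echo "$1:$2"
    fi
}

sat=/mu2e/app/users/someone/satellite       # hypothetical satellite release
demo_path=$(prepend_dir "$sat/bin" "/usr/bin:/bin")
demo_ld=$(prepend_dir "$sat/lib" "")
echo "$demo_path"
echo "$demo_ld"
```

Prepending, rather than appending, means that executables and libraries from the satellite release are found before those of the base release.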

You should put all output files, such as event-data files, root files and log files, in the out subdirectory. One of the rules for using Mu2e disks is that code disks, such as /mu2e/app, should contain only source code and binaries built from that source code. You may not put output files such as event-data files, root files and log files on code disks; it's also not a good idea to put them on your home disk. They belong on a data disk such as /mu2e/data/users or /pnfs/mu2e/scratch/users.

To facilitate this, the createSatelliteRelease script creates an output directory in your disk space on /mu2e/data/users and makes a symbolic link from the current working directory to that output directory. The algorithm for choosing the name of the output directory tries to be smart enough that you can have many satellite releases on /mu2e/app and each will have a uniquely named output directory.

If that directory already exists, you will see this message:

The output directory already exists:  /mu2e/data/users/kutschke/satellite-test/satellite/out

You have two choices:
   1) Continue: be aware that new output files may overwrite old files
   2) Abort this script: you can mv or rm that directory and rerun this script

Enter c to continue anything else to abort:

If you are comfortable reusing this directory, then type c and return; createSatelliteRelease will complete its work and will make a symlink named "out" that points to this directory. If you are not comfortable reusing this directory, then type any character except c, and return. This check is made before createSatelliteRelease creates any files. At this time you can take whatever action is necessary to resolve the situation, perhaps renaming the output directory. Then rerun createSatelliteRelease.

If you wish to put output files on a disk other than /mu2e/data, you can specify a new disk with the environment variable MU2E_DATA_USERS. If the createSatelliteRelease script cannot see the designated data disk, it will simply make a directory named out in the local working directory.
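
The fallback logic described above can be sketched as follows. This is not the code of createSatelliteRelease. We assume that MU2E_DATA_USERS replaces /mu2e/data/users as the root, and that the output path mirrors the location of the satellite release below your user area, as the examples on this page suggest. The function name choose_out_dir is ours.

```shell
# Sketch only: decide where the output directory should live.
# $1 = user name, $2 = path of the satellite release relative to the user's area
choose_out_dir() {
    root=${MU2E_DATA_USERS:-/mu2e/data/users}
    if [ -d "$root" ]; then
        echo "$root/$1/$2/out"              # data disk is visible: use it
    else
        echo "out"                          # otherwise a local directory
    fi
}

choose_out_dir kutschke satellite-test/satellite
```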

Logging in again

When you log out and log in again, you do not need to recreate the satellite release. You need only:

setup mu2e
source setup.sh

Safety Features in createSatelliteRelease

createSatelliteRelease has the following safety features:

  • If the environment variable MU2E_BASE_RELEASE exists and if it points to any directory other than the requested base release, then an informational message is printed and the script exits. If the environment variable points to the requested base release, the script continues normally. The script is not smart enough to deal with different logical paths that evaluate to the same physical path.
  • If any of the files SConstruct, .buildopts, or setup.sh exists, then createSatelliteRelease will print an informational message and exit.
  • If the output directory already exists, createSatelliteRelease will notify you and prompt to ask if you wish to continue or abort. If you continue, the directory will not be modified. However, you should be aware that any new files you create may overwrite files that you wrote previously.
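
The first two checks can be sketched as a shell function. This illustrates the behavior described above and is not the script's actual code; the function name preflight and the wording of the messages are ours. The comparison of MU2E_BASE_RELEASE is a plain string comparison, which is why two logical paths to the same physical directory are treated as different.

```shell
# Sketch only: checks made before any files are created.
# $1 = requested base release, $2 = directory that will become a satellite release
preflight() {
    if [ -n "${MU2E_BASE_RELEASE:-}" ] && [ "$MU2E_BASE_RELEASE" != "$1" ]; then
        echo "MU2E_BASE_RELEASE is already set to $MU2E_BASE_RELEASE" >&2
        return 1
    fi
    for f in SConstruct .buildopts setup.sh; do
        # -L also catches symlinks whose target is missing
        if [ -e "$2/$f" ] || [ -L "$2/$f" ]; then
            echo "$f already exists in $2" >&2
            return 1
        fi
    done
    return 0
}

# Usage: preflight /path/to/base/release "$PWD" || echo "not safe to proceed"
```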

Naming of .so Files

In Offline v5_7_7 and earlier, the build system decorates module .so filenames differently depending on whether they were made as part of a base release or as part of a satellite release. In a base release, .so filenames start with "libmu2e_"; in a satellite release they start with "libmu2euser_". Starting on the evening of July 20, the scripts at the HEAD of the git master branch decorate both satellite release and base release .so files with "libmu2e_".

Cleaning a Satellite Release

If you want to disassociate your satellite release from its base release, use the command:

cleanSatelliteRelease

This command will be in your path once you have:

setup mu2e
source setup.sh

When you run cleanSatelliteRelease it will do the following, prompting you before each step:

  • scons -c
  • Remove any of the following files that are present: setup.sh, SConstruct, .sconsign.dblite, .buildopts.

It does not touch the symbolic link (or subdirectory) out. The lib directory remains but will be empty.
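
If you would like to see in advance which of these files are present, a short helper can list them. This is a sketch for inspection only: it removes nothing, it does not run scons -c, and the function name list_clean_targets is ours, not part of Mu2e Offline.

```shell
# Sketch only: list which of the release management files exist in a directory.
# $1 = satellite release directory
list_clean_targets() {
    for f in setup.sh SConstruct .sconsign.dblite .buildopts; do
        # -L also reports symlinks whose target is missing
        if [ -e "$1/$f" ] || [ -L "$1/$f" ]; then
            echo "$f"
        fi
    done
}

# Usage: list_clean_targets .
```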

You can then connect it to a new base release by:

Log out
Log in again
cd to the working directory
setup mu2e
/path/to/a/new/base/release/bin/createSatelliteRelease

Obsolete Scripts

The following files are present in Mu2e Offline as of version v5_7_7 but are obsolete and will be removed.

The following script was intended to do a cleanSatelliteRelease and createSatelliteRelease (pointing at a new base release) in a single step. It does not work, indeed cannot work: it is necessary to log out between the steps or to do the second step in a new terminal window.

Offline/bin/rebindSatelliteRelease

The following three files are left over from a previous attempt at a satellite release system and are now obsolete:

bin/cleanTestRelease
bin/createTestRelease
bin/rebindTestRelease

Missing Features

  • In case 1) from the Introduction, if the results from your code become part of the institutional memory of Mu2e, then your new code must become part of the official code base. For now we suggest that you simply add it to Mu2e Offline when it matures. Be sure to add it to Mu2e Offline before you produce any official results; always report the git SHA or tag that you used to produce any official results. In the future we plan to support some number of "analysis git repositories" which would be separate from Mu2e Offline; these repositories would be a natural spot for this sort of code.
  • In case 2) from the Introduction, your code must be committed back to Mu2e Offline prior to producing any official results; always report the git SHA or tag that you used to produce the official result. In some circumstances it may be appropriate to add your revisions on a git branch.
  • Also in case 2), we do not yet have a system in place to ensure that you correctly identify and rebuild all Mu2e Offline code that depends on the code you are modifying. For now you simply have to know which code that is. We have several ideas; we should put these ideas on the table, do a quick search for other ideas and make a decision.
  • jobsub has support for making a gzipped tar file of a satellite release and distributing it to all worker nodes. Mu2e has not yet added the features to the mu2eart command to support this feature.
  • We need a command line argument or environment variable to control the value given to the scons -j option. We will want to use a higher number on multi-core machines and a smaller number on single-core machines.
  • Make the test for an existing setup release robust against spurious differences due to different logical paths that resolve to the same physical path.
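
For the last item, one possible approach is to compare physical paths rather than the strings as given. The sketch below uses pwd -P, which prints the current directory with all symbolic links resolved. It is an illustration of the idea, not a description of how the scripts work today; the function name same_physical_dir is ours.

```shell
# Sketch only: return 0 if two directory names resolve to the same physical
# directory, 1 if they differ, and 2 if either one cannot be entered.
same_physical_dir() {
    a=$(cd "$1" >/dev/null 2>&1 && pwd -P) || return 2
    b=$(cd "$2" >/dev/null 2>&1 && pwd -P) || return 2
    [ "$a" = "$b" ]
}

# Usage: same_physical_dir "$MU2E_BASE_RELEASE" /path/to/requested/base/release
```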

Computing Computing/Code