SearchPaths

From Mu2eWiki


Introduction

Mu2e Offline supports the concept of a search path for most files that configure either the behaviour of art itself or that of the Mu2e Offline code. This means that you can specify a partial path to such a file and the code will search for that file in an ordered list of places. The code will traverse the list of places and look for the file in each place. As soon as it successfully finds the file it will open that file and continue.

What happens if the file is present at more than one place on the list? The list is ordered and the code declares success on the first match; it never looks to see if there is more than one match.

If the code cannot find a match, it throws an exception.

A search path is specified in an environment variable as a colon-separated list of directories. This is in the same spirit as the well-known environment variables PATH, LD_LIBRARY_PATH and PRODUCTS.

Mu2e has chosen to configure the search algorithm so that, with one exception, absolute paths are forbidden. That exception is for the -c command line argument of the mu2e command. If an absolute path is specified in any other context the code will throw an exception. Mu2e has also chosen that the search algorithm will not look for files relative to the current working directory unless that directory is included in the search path. These are safety features to ensure that production campaigns can only reference configuration files that are source-code controlled.


For those not familiar with the concept of a search path, the following example explains it. Suppose that we wish to specify a file:

/A/B/C/D/E/F.txt

And the setup scripts have defined:

export SEARCH_PATH=/A/B/I/:/A/G/I:/A/B/C

If we search for the file "F.txt" it will not be resolved because the system only looks for the following files: "/A/B/I/F.txt", "/A/G/I/F.txt", "/A/B/C/F.txt". To find the file of interest with the search path above, one must ask for "D/E/F.txt", which will find a match on the last element. Alternatively, if the search path were

export SEARCH_PATH=/A/B/I/:/A/G/I:/A/B/C/D/E

then "F.txt" would match with the last element.
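The first-match behaviour described above can be sketched in a few lines of shell. This is only an illustration, not the code that art or Mu2e Offline actually runs: the function name find_in_path and the variable DEMO_SEARCH_PATH are hypothetical, and the directory tree is created under a temporary directory so that the example can be run anywhere.

```shell
# Hypothetical helper (not part of Mu2e Offline): print the first match of a
# relative file name in a colon-separated search path.
# Returns 0 on a match, 1 if no element matches, 2 for an absolute path.
find_in_path() {
    fip_rel=$1
    case $fip_rel in
        /*) echo "absolute path not allowed: $fip_rel" >&2; return 2 ;;
    esac
    fip_old_ifs=$IFS
    IFS=:                       # split the search path on colons
    for fip_dir in $2; do
        if [ -f "$fip_dir/$fip_rel" ]; then
            IFS=$fip_old_ifs
            printf '%s\n' "$fip_dir/$fip_rel"
            return 0            # stop at the first match; later elements are never checked
        fi
    done
    IFS=$fip_old_ifs
    return 1
}

# Recreate the example layout under a temporary directory.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/A/B/C/D/E" "$demo_root/A/B/I" "$demo_root/A/G/I"
touch "$demo_root/A/B/C/D/E/F.txt"
DEMO_SEARCH_PATH="$demo_root/A/B/I/:$demo_root/A/G/I:$demo_root/A/B/C"

find_in_path "D/E/F.txt" "$DEMO_SEARCH_PATH"                      # matches on the last element
find_in_path "F.txt" "$DEMO_SEARCH_PATH" || echo "F.txt: no match"
```

The real code throws an exception where this sketch returns a non-zero status.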


Three important classes of files are not covered by this policy: event-data input files, event-data output files and the root output file managed by the TFileService.

FHICL_FILE_PATH

This environment variable is used by art to find .fcl files.

When you run art, it looks for the .fcl file named with the -c argument in the current working directory; if that fails it looks for the file relative to the search path defined by the environment variable FHICL_FILE_PATH. When art looks for files referenced by #include directives it only looks for files relative to FHICL_FILE_PATH; it does NOT look for files relative to the current working directory unless that directory is included in FHICL_FILE_PATH (which it normally is). FHICL_FILE_PATH is defined when you issue the command "muse setup". When you issue that command in a Muse working directory without a backing release and with Offline cloned in your Muse working area, FHICL_FILE_PATH is set to

export FHICL_FILE_PATH=${MUSE_BUILD_DIR}:${MUSE_WORK_DIR}

Why is MUSE_BUILD_DIR included? This allows for complex fcl files to be built by scripts that are run during "muse build". Such files are located in the build area. It is uncommon to have a Muse working directory with neither a backing release nor Offline; if you have such a working area, FHICL_FILE_PATH contains only ${MUSE_WORK_DIR}.


When you do "muse setup" in a working area that contains a backing release, muse defines FHICL_FILE_PATH by the following rule:

  1. The first element is the muse working directory
  2. The remaining elements of the path are those that would be present if you did "muse setup" in the backing directory.
  3. For nested backing releases, the previous rule is applied recursively. The spirit is that the higher a directory is in the backing hierarchy, the earlier it is in FHICL_FILE_PATH.
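Because the first match wins, a file in the working directory shadows a file with the same relative path in a backing release. The sketch below lists every match in path order so that such shadowing is visible. The helper name list_matches, the file name fcl/example.fcl and the work/backing layout are all hypothetical; they stand in for a real Muse working area and its backing release.

```shell
# Hypothetical helper: print every copy of a relative file name found along a
# colon-separated search path, in path order. The first line printed is the
# copy that a first-match search would use.
list_matches() {
    lm_rel=$1
    printf '%s\n' "$2" | tr ':' '\n' | while IFS= read -r lm_dir; do
        if [ -n "$lm_dir" ] && [ -f "$lm_dir/$lm_rel" ]; then
            printf '%s\n' "$lm_dir/$lm_rel"
        fi
    done
}

# Stand-in layout: a working directory followed by its backing release.
lm_root=$(mktemp -d)
mkdir -p "$lm_root/work/fcl" "$lm_root/backing/fcl"
touch "$lm_root/work/fcl/example.fcl" "$lm_root/backing/fcl/example.fcl"
DEMO_FHICL_PATH="$lm_root/work:$lm_root/backing"

list_matches "fcl/example.fcl" "$DEMO_FHICL_PATH"
```

In a real session the same function could be called with "$FHICL_FILE_PATH" as its second argument.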

FHiCL is configured to allow an absolute path for the .fcl file specified on the command line but, for paths to included files, FHiCL only allows paths relative to FHICL_FILE_PATH. This is a safety feature to ensure that production campaigns only use .fcl files that are source code controlled.
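As a quick check before submitting a job, one could scan a .fcl file for #include directives that use an absolute path, which FHiCL would reject. The following is a rough sketch only: the helper name absolute_includes and the absolute path in the demonstration file are hypothetical, and the pattern assumes the simple form #include "path" with each directive on its own line.

```shell
# Hypothetical helper: print the path of every #include directive in a .fcl
# file whose quoted path starts with "/" (an absolute path).
absolute_includes() {
    sed -n 's|^[[:space:]]*#include[[:space:]]*"\(/[^"]*\)".*|\1|p' "$1"
}

# Demonstration file with one relative and one (hypothetical) absolute include.
ai_file=$(mktemp)
cat > "$ai_file" <<'EOF'
#include "fcl/minimalMessageService.fcl"
#include "/hypothetical/abs/path/custom.fcl"
EOF

absolute_includes "$ai_file"    # prints only the absolute include path
```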

There are two FHiCL prolog files that are included in many .fcl files:

#include "fcl/minimalMessageService.fcl"
#include "fcl/standardProducers.fcl"

As time goes on other such files may be defined.


As we get experience with FHiCL we will consider adjustments to these policies. The policies will be as open as possible during our development phase, with the constraint that we understand how to ensure a strict audit trail when the time comes for large scale production.

MU2E_SEARCH_PATH

Mu2e code uses this environment variable to search for auxiliary files. The environment variable is defined when you run "muse setup". The variable is defined to be

export MU2E_SEARCH_PATH=${FHICL_FILE_PATH}:${MU2E_DATA_PATH}

where MU2E_DATA_PATH normally points to /cvmfs/mu2e.opensciencegrid.org/DataFiles/ but can be configured to other values if /cvmfs is not visible.

Auxiliary files include:

  1. Any file that is parsed by SimpleConfig. This includes:
    • The geometry file, read by the GeometryService.
    • Some of the old style event generator run-time configuration files; newer event generator modules use fcl to get their configuration.
    • Old style conditions data that is still read with the ConditionsService. It will eventually be migrated to the ProditionsService and these files will be removed.
    • Any files included into the above three files using #include.
  2. The magnetic field maps, read by the BFieldManagerMaker.
  3. The particle data table files, read by the ParticleDataList class.
  4. The G4 macro file optionally read by G4_plugin.cc
  5. Some probability distribution functions represented as binned data.
  6. Configuration data used by AI/ML inference code.
  7. Input particles in G4beamline format, read using EventGenerator/inc/FromG4BLFile.hh


Most run-time configuration files should be found in one of the Mu2e repositories so that they are under source code management and can evolve along with the code that reads them. Large files that are not tied to a particular code version, such as magnetic field maps, should be found under $MU2E_DATA_PATH.

CET_PLUGIN_PATH

art uses CET_PLUGIN_PATH to look for libraries that contain either art plugins or root dictionaries. art plugins include modules, services and tools. On startup art scans CET_PLUGIN_PATH to find all such files. It loads the dictionaries and it records where to find plugin libraries for use when the job fcl requests to load a plugin.

Very early versions of art used LD_LIBRARY_PATH for this purpose. The art team changed this to use a new environment variable, CET_PLUGIN_PATH, for two reasons:

  1. To reduce startup time by reducing the number of directories that had to be searched
  2. In preparation for builds using RPATH, for which LD_LIBRARY_PATH is not necessary.
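To get a feel for what art has to scan at startup, one can count the module libraries in each directory of the path. The sketch below assumes that module libraries are named with the suffix _module.so; that naming convention, the helper name count_modules and the demonstration files are assumptions made for illustration and are not stated on this page.

```shell
# Hypothetical helper: for each existing directory in a colon-separated path,
# print the directory and the number of files named *_module.so directly in it.
# The *_module.so naming convention is an assumption.
count_modules() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r cm_dir; do
        if [ -d "$cm_dir" ]; then
            cm_n=$(find "$cm_dir" -maxdepth 1 -name '*_module.so' | wc -l | tr -d ' ')
            printf '%s %s\n' "$cm_dir" "$cm_n"
        fi
    done
}

# Stand-in library directory with two module libraries and one other library.
cm_root=$(mktemp -d)
touch "$cm_root/libdemo_A_module.so" "$cm_root/libdemo_B_module.so" "$cm_root/libdemo_other.so"

count_modules "$cm_root"
```

In a real session the argument would be "$CET_PLUGIN_PATH".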

Hint for Looking at Search Paths

If you want to look at the definition of a search path you can just echo the environment variable. However this prints everything in one line, which is often difficult to read. You can make a more readable view using, for example:

echo $FHICL_FILE_PATH | tr : \\n

which replaces the colons in the path with newline characters. See man tr.
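Since the order of the elements determines which file wins, it can also help to number the elements and to flag entries that do not exist. The following is a small sketch along those lines; the function name show_path is hypothetical and the demonstration uses a temporary directory in place of a real search path.

```shell
# Hypothetical helper: print one search path element per line, numbered in
# search order, and mark elements that are not existing directories.
show_path() {
    printf '%s\n' "$1" | tr ':' '\n' | {
        sp_i=0
        while IFS= read -r sp_dir; do
            sp_i=$((sp_i + 1))
            if [ -d "$sp_dir" ]; then sp_note=""; else sp_note=" (missing)"; fi
            printf '%d %s%s\n' "$sp_i" "$sp_dir" "$sp_note"
        done
    }
}

# Demonstration: one existing directory followed by one that does not exist.
sp_root=$(mktemp -d)
show_path "$sp_root:$sp_root/absent"
```

For a real path, call it as show_path "$FHICL_FILE_PATH" or show_path "$MU2E_SEARCH_PATH".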