SearchPaths: Difference between revisions

Revision as of 16:00, 8 September 2023


Introduction

Mu2e Offline supports the concept of a search path for most files that configure either the behaviour of art itself or that of the Mu2e Offline code. This means that you can specify a partial path to such a file and the code will search for that file in an ordered list of places. The code will traverse the list of places and look for the file in each place. As soon as it successfully finds the file it will open that file and continue.

What happens if the file is present at more than one place on the list? The list is ordered and the code declares success on the first match; it never looks to see if there is more than one match.

If the code cannot find a match, it throws an exception.

A search path is specified in an environment variable as a colon-separated list of directories. This is in the same spirit as the well-known environment variables PATH, LD_LIBRARY_PATH and PRODUCTS.

Mu2e has chosen to configure the search algorithm so that, with one exception, absolute paths are forbidden. That exception is for the -c command line argument of the mu2e command. If an absolute path is specified in any other context the code will throw an exception. Mu2e has also chosen that the search algorithm will not look for files relative to the current working directory unless that directory is included in the search path. These are safety features to ensure that production campaigns can only reference configuration files that are source-code controlled.
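
The lookup rules described so far can be illustrated with a minimal Python sketch. This is not the Mu2e implementation, which is written in C++; the function name find_in_search_path and the exception types are assumptions chosen for illustration only.

```python
import os
import tempfile


def find_in_search_path(relative_path, search_path):
    """Return the first existing file found by joining relative_path onto
    each element of a colon-separated search path (hypothetical helper)."""
    # Absolute paths are rejected outright, mirroring the policy above.
    if os.path.isabs(relative_path):
        raise ValueError(f"absolute paths are not allowed: {relative_path}")
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty elements; the current directory is never implied
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate  # first match wins; later elements are not checked
    raise FileNotFoundError(f"{relative_path} not found in {search_path}")


# Demonstration: two temporary directories both contain "geom.txt".
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    for directory in (first, second):
        with open(os.path.join(directory, "geom.txt"), "w") as handle:
            handle.write("dummy\n")
    found = find_in_search_path("geom.txt", f"{first}:{second}")
    found_in_first = os.path.dirname(found) == first
```

In the demonstration the file exists in both directories, and the copy in the first directory is the one returned. The second copy is never examined.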


For those not familiar with the concept of a search path, the following example explains it. Suppose that we wish to specify a file:

/A/B/C/D/E/F.txt

And the setup scripts have defined:

export SEARCH_PATH=/A/B/I/:/A/G/I:/A/B/C

If we search for the file "F.txt" it will not be resolved because the system only looks for the following files: "/A/B/I/F.txt", "/A/G/I/F.txt", "/A/B/C/F.txt". To find the file of interest with the path above, one must ask for "D/E/F.txt", which will find a match on the last element. Alternatively, if the search path were

export SEARCH_PATH=/A/B/I/:/A/G/I:/A/B/C/D/E

then "F.txt" would match with the last element.
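
The example can be reproduced with a few lines of Python that list, in order, the full paths that would be tried. The helper name candidates is hypothetical; the sketch only manipulates strings and does not touch the file system.

```python
import posixpath


def candidates(requested, search_path):
    """List the full paths that would be tried, in order (hypothetical helper)."""
    return [posixpath.join(element, requested)
            for element in search_path.split(":") if element]


# The three cases discussed in the example above.
print(candidates("F.txt", "/A/B/I/:/A/G/I:/A/B/C"))
print(candidates("D/E/F.txt", "/A/B/I/:/A/G/I:/A/B/C"))
print(candidates("F.txt", "/A/B/I/:/A/G/I:/A/B/C/D/E"))
```

Only the last candidate of the second and third calls is /A/B/C/D/E/F.txt; none of the candidates from the first call is.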


Three important classes of files are not covered by this policy: event-data input files, event-data output files and the root output file managed by the TFileService.

FHICL_FILE_PATH

This environment variable is used by art to find .fcl files.

When you run art, it looks for the .fcl file named with the -c argument in the current working directory; if that fails it looks for the file relative to the search path defined by the environment variable FHICL_FILE_PATH. When art looks for files referenced by #include directives it only looks for files relative to FHICL_FILE_PATH; it does NOT look for files relative to the current working directory unless that directory is included in FHICL_FILE_PATH (which it normally is). FHICL_FILE_PATH is defined when you issue the command "muse setup". When you issue that command in a Mu2e working directory without a backing release it is set to

export FHICL_FILE_PATH=${MUSE_BUILD_DIR}:${MUSE_WORK_DIR}

Why is MUSE_BUILD_DIR included? This allows for complex fcl files to be built by scripts that are run during "muse build". Such files are located in the build area.

When you do "muse setup" in a working area that contains a backing release, muse defines FHICL_FILE_PATH by the following rule:

  1. The first element is the muse working directory
  2. The remaining elements of the path are those that would be present if you did "muse setup" in the backing directory.
  3. For nested backing releases, the previous rule is applied recursively. The spirit is, the higher a directory is in the backing hierarchy, the earlier it is in FHICL_FILE_PATH.
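
The recursive rule can be sketched as follows. This is an illustration, not the muse code: the WorkArea class, the function name and all directory names are hypothetical, and the sketch follows the three rules literally. In particular, the text does not say whether the build directory of an area that has a backing release is also added, so it is omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkArea:
    """Hypothetical description of a Muse working area."""
    work_dir: str
    build_dir: str
    backing: Optional["WorkArea"] = None


def fhicl_file_path(area):
    """Build the FHICL_FILE_PATH elements by following the rules literally."""
    if area.backing is None:
        # No backing release: build directory first, then working directory.
        return [area.build_dir, area.work_dir]
    # Rule 1: the working directory comes first.
    # Rules 2 and 3: append what the backing area would produce, recursively.
    return [area.work_dir] + fhicl_file_path(area.backing)


# Hypothetical three-level chain: top is backed by middle, middle by base.
base = WorkArea("/releases/base", "/releases/base/build")
middle = WorkArea("/releases/middle", "/releases/middle/build", backing=base)
top = WorkArea("/home/user/work", "/home/user/work/build", backing=middle)
path = ":".join(fhicl_file_path(top))
```

For this chain the result is /home/user/work:/releases/middle:/releases/base/build:/releases/base, so a file in the top-level working directory shadows a file of the same relative name in any backing release.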

FHiCL is configured to allow an absolute path for the .fcl file specified on the command line but, for paths to included files, FHiCL only allows paths relative to FHICL_FILE_PATH. This is a safety feature to ensure that production campaigns only use .fcl files that are source code controlled.
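
The difference between the two lookup modes can be summarised in a short Python sketch. It is an illustration of the behaviour described above, not the art or FHiCL implementation; the function name resolve_fcl and its from_command_line parameter are hypothetical.

```python
import os
import tempfile


def resolve_fcl(name, fhicl_file_path, from_command_line):
    """Hypothetical sketch of the two lookup modes described above."""
    if from_command_line:
        # The -c argument may be absolute or relative to the current directory.
        if os.path.isfile(name):
            return os.path.abspath(name)
    elif os.path.isabs(name):
        # Included files must be given relative to FHICL_FILE_PATH.
        raise ValueError(f"absolute path not allowed in #include: {name}")
    # In both modes, fall through to the search path; first match wins.
    for directory in filter(None, fhicl_file_path.split(":")):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(name)


# Demonstration with a temporary area holding fcl/prolog.fcl.
with tempfile.TemporaryDirectory() as area:
    os.makedirs(os.path.join(area, "fcl"))
    prolog = os.path.join(area, "fcl", "prolog.fcl")
    open(prolog, "w").close()
    included = resolve_fcl("fcl/prolog.fcl", area, from_command_line=False)
    top_level = resolve_fcl(prolog, area, from_command_line=True)
    include_ok = included == prolog
    command_line_ok = top_level == prolog
```

The same absolute path that is accepted for the command-line file would raise an error if it appeared in an #include directive.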

There are two FHiCL prolog files that are included in many .fcl files:

#include "fcl/minimalMessageService.fcl"
#include "fcl/standardProducers.fcl"

As time goes on other such files may be defined.


As we get experience with FHiCL we will consider adjustments to these policies. The policies will be as open as possible during our development phase, with the constraint that we understand how to ensure a strict audit trail when the time comes for large scale production.

MU2E_SEARCH_PATH

Mu2e code uses this environment variable to search for auxiliary files. The environment variable is defined when you run "mu2e setup". The variable is defined to be

export MU2E_SEARCH_PATH=${FHICL_FILE_PATH}:${MU2E_DATA_PATH};

where MU2E_DATA_PATH normally points to /cvmfs/mu2e.opensciencegrid.org/DataFiles/ but can be configured to other values if /cvmfs is not visible.

Auxiliary files include:

  1. Any file that is parsed by SimpleConfig. This includes:
    • The geometry file, read by the GeometryService.
    • Some of the old style event generator run-time configuration files; newer event generator modules use fcl to get their configuration.
    • Old style conditions data that is still read with the ConditionsService. It will eventually be migrated to the ProductionsService and these files will be removed.
    • Any files included into the above three files using #include.
  2. The magnetic field maps, read by the BFieldManagerMaker.
  3. The particle data table files, read by the ParticleDataList class.
  4. The G4 macro file optionally read by G4_plugin.cc
  5. Some probability distribution functions represented as binned data.
  6. Configuration data used by AI/ML inference code.
  7. Input particles in G4beamline format, read using EventGenerator/inc/FromG4BLFile.hh


Most run-time configuration files should be found in one of the Mu2e repositories so that they are under source code management and can evolve along with the code that reads them. Large files that are not tied to a particular code version, such as magnetic field maps, should be found under $MU2E_DATA_PATH.