FclPaths

Introduction

art no longer supports reconstruction on demand, which is also called unscheduled reconstruction. This page has been left here so that it can be scrubbed for useful information that should be moved to a new home.

This page discusses the two methods of reconstruction in art, scheduled reconstruction, which uses trigger paths, and unscheduled reconstruction, which is usually called reconstruction on demand. It is presumed that the reader is reasonably familiar with the art run-time configuration system.


  • Scheduled Reconstruction, Using Trigger Paths
  • Unscheduled Reconstruction (Reconstruction On Demand)

You may also want to look at the art wiki page on art framework parameters.


Scheduled Reconstruction, Using Trigger Paths

On this page, the use of paths is illustrated with an example: how to use filters to write one subset of events to one output file and a different subset to a second output file. The two subsets may be disjoint or they may contain events in common; it is legal for some events to be written to no output file. The example shows the case of writing two output files, but art allows one to write many output files in one job.


Consider the following problem. You wish to run a job that has:

  1. Two producers MakeA_module.cc and MakeB_module.cc. You want to run both producers on all events.
  2. One analyzer module that you want to run on all events, CheckAll_module.cc.
  3. One filter module, Filter1_module.cc, that has two modes; the mode can be selected at run time via the parameter set.
  4. You wish to write all events that pass mode 0 of the filter to the file file0.root, and all events that pass mode 1 to the file file1.root.


process_name: filter1

source: {
   # Configure the input source here.
}

physics: {

 producers : {
    aProducer: { module_type: MakeA }
    bProducer: { module_type: MakeB }
  }

 analyzers : {
    checkAll: { module_type: CheckAll }
  }

 filters : {
    selectMode0: {
      module_type: Filter1
      mode: 0
    }
    selectMode1: {
      module_type: Filter1
      mode: 1
    }
  }

  mode0: [ aProducer, bProducer, selectMode0 ]
  mode1: [ aProducer, bProducer, selectMode1 ]
  analyzermods: [ checkAll  ]
  outputFiles:  [ out0, out1 ]

  trigger_paths : [ mode0, mode1 ]
  end_paths : [ analyzermods, outputFiles ]
}

outputs: {
  out0: {
   module_type: RootOutput
   fileName: "file0.root"
   SelectEvents: [ mode0 ]
  }

  out1: {
   module_type: RootOutput
   fileName: "file1.root"
   SelectEvents: [ mode1 ]
  }

}

The following names are identifiers reserved to art: process_name, source, physics, producers, analyzers, filters, trigger_paths, end_paths, outputs. To be a little more precise, FHiCL names obey scoping rules similar to C++; therefore the identifier process_name is really only reserved to art within the outermost scope, but it would seem needlessly confusing to use process_name as the name of a parameter within some other scope. The names trigger_paths and end_paths are artifacts of the first use of the CMS framework, to simulate the several hundred parallel paths within the CMS trigger; their meaning should become clear after reading the remainder of this page.

The following are module labels: aProducer, bProducer, checkAll, selectMode0, selectMode1, out0, out1. For a module label you may choose any name so long as it is unique within a job and is not one of the names reserved to art.

The following are names of paths: mode0, mode1, outputFiles, analyzermods. For the name of a path you may choose any name so long as it is unique within a job and is not one of the names reserved to art. Any name that is a top level name inside of the physics parameter set is either a reserved name or it is the name of a path.

When reading a FHiCL document it is important to recognize which identifiers are module labels and which are path names. It is also important to recognize that paths are lists of module labels, while the two reserved names, trigger_paths and end_paths, are lists of paths. Finally, it is important to distinguish between a class that is a module and instances of that module class, each uniquely identified by a module label.
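A minimal sketch of the distinction, using hypothetical names that are not part of the example above:

physics: {
  producers: {
    myLabel: { module_type: MyModule }  # "myLabel" is a module label
  }
  myPath: [ myLabel ]         # "myPath" is a path: a sequence of module labels
  trigger_paths: [ myPath ]   # trigger_paths is a sequence of paths
}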

Art has several rules that were recommended practices in the old framework but which were not strictly enforced by that framework. Art enforces some of these rules and will, soon, enforce all of them:

  • A path may go into either the trigger_paths list or into the end_paths list but not both.
  • A path that is in the trigger_paths list may only contain the module labels of producer modules and filter modules.
  • A path that is in the end_paths list may only contain the module labels of analyzer modules and output modules.

This example happens to separate the analyzer modules and the output modules into separate paths; that may sometimes be convenient, but it is not necessary. One would also get the same behaviour from:

xxx: [ checkAll, out0, out1 ]

end_paths : [ xxx ]

On the other hand, keeping trigger paths separate has real meaning.

Art's scheduling strategy is described below. Some of the details are remnants of compromises and conflicting interests with CMS. One of the top level rules in the scheduler is that all producers and filters should be run first, using the ordering rules specified below. After that, all analyzer and output modules will be run. Moreover, analyzer modules and output modules may not modify the event and may not have side effects that influence the behaviour of other analyzer or output modules. Therefore art is free to run all analyzer and output modules in any order. The full description of the scheduler strategy is given below:

  1. If a module label appears in the definition of a path but is not found among the list of defined module labels, FHiCL will issue an error.
  2. On each event, before executing any of the paths, execute the source module.
  3. On each event, execute all of the paths listed in the trigger_paths.
    • Within one path, the order of modules listed in the path is followed strictly; at present there is one exception to this: see the discussion of remaining issues below.
    • Art can identify module labels that are common to several trigger paths and will execute them only once per event. In the above example, aProducer and bProducer are executed only once per event.
    • The various paths within the trigger_paths may be executed in any order, subject to the above constraints.
    • If a path contains a filter, and if the filter returns false, then the remainder of the path is skipped.
    • The module label of a filter can be negated in a path using "!module_label"; in this case the path will continue if the filter returns false and will be aborted if the filter returns true. A negated module label must be quoted because an unquoted FHiCL string must start with a letter or an underscore; a sketch is given after this list.
    • If the module label of a filter appears in two paths, negated in one path and not negated in the other, art will run that filter module instance only once and will use the result in both places.
    • If a module in a trigger path throws, the default behaviour of art is to stop all processing and to shut down the job as gracefully as possible. Art can be configured, at run time, so that, for selected exceptions, it behaves differently. For example it can be configured to continue with the current trigger path, skip to the next trigger path, skip to the next event, and so on.
  4. On each event, execute all of the paths listed in the end_paths.
    • The module labels listed in end_paths are executed exactly once per event, regardless of how many paths there are in the trigger_paths and regardless of any filters that failed.
    • If a module label appears multiple times among the end paths, it is executed only once. No warning message is given.
    • Even if all trigger_paths have filters that fail, all module labels in the end path will be run.
    • Art is free to execute the modules in the end paths in any order.
    • If a module in the end path throws, the default response of art is to make a best effort to complete all other modules in the end path and then to shut down the job in an orderly fashion. This behaviour can be changed at run time by adding the appropriate parameter set to the top level .fcl file.
  5. One can ask that an output module be run only for events that pass a given trigger_path; this is done using the SelectEvents parameter set, as illustrated above. I believe, but am not certain, that SelectEvents allows some simple boolean logic on the pass/fail status of several paths; I have not found the documentation for this.
  6. At present there is no syntax to ask that an analyzer module be run only for events that pass or fail some of the trigger paths. A planned improvement to art is to give analyzer modules a SelectEvents parameter that behaves as it does for output modules.
  7. If a path appears in neither the trigger_paths nor the end_paths, there is no warning given.
  8. If a module label appears in no path, a warning will be given.
  9. A filter result may be inverted by adding "!" in front of its name in the path, and the return value of the filter will be ignored if a "-" is added to the front of the name in the path.
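A minimal sketch of filter negation, reusing the module labels from the example above (the path name notMode0 is hypothetical):

# Keep events for which selectMode0 returns false; the "!" forces the
# label to be quoted, because an unquoted FHiCL string must start with
# a letter or an underscore.
physics.notMode0 : [ aProducer, bProducer, "!selectMode0" ]
physics.trigger_paths : [ mode0, notMode0 ]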

In the above there is a lot of focus on which groups of modules are free to be run in an arbitrary order. This is laying the groundwork for module-parallel execution: art is capable of identifying which modules may be run in parallel and, on a multi-core machine, art could start a separate thread for each such module. At present neither ROOT nor G4 is thread-safe, so this is not of immediate interest. But there are efforts underway to make both of these thread-safe and we may one day care about module-parallel execution; our interest in this will depend a great deal on the future evolution of the relative costs of memory and CPU.

For simple cases, in which there is one trigger path with only a few modules in the path, and one end path with only a few modules in the path, the extra level of bookkeeping is just extra typing with no obvious benefit. The benefit comes when many work groups wish to run their modules on the same events during one art job; perhaps this is a job skimming off many different calibration samples or perhaps it is a job selecting many different streams of interesting Monte Carlo events. In such a case, each work group needs only to define their own trigger path and their own end path, without regard for the requirements of other work groups; each work group also needs to ensure that their paths are added to the end_paths and trigger_paths variables. Art will then automatically, and correctly, schedule the work without doing any work twice and without skipping work that must be done. A sketch of this pattern is given below. This feature came for free with art and, while it imposes a small burden on novice users doing simple jobs, it provides an enormously powerful feature for advanced users. Therefore it was retained in art when some other features were removed.
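A minimal sketch of this pattern, with hypothetical module labels and path names:

physics: {
  # ... producers, filters, analyzers, and output modules configured as usual ...

  # Work group 1: a calibration skim (hypothetical labels).
  calibPath : [ makeCalibHits, calibFilter ]
  calibOut  : [ calibWriter ]

  # Work group 2: a Monte Carlo selection (hypothetical labels).
  mcPath : [ makeMcSkim, mcFilter ]
  mcOut  : [ mcWriter ]

  trigger_paths : [ calibPath, mcPath ]
  end_paths     : [ calibOut, mcOut ]
}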

Some Remaining Issues

The above material presents the intended behaviour of art. This section discusses a few places in which art currently falls short; it is our intention to fix these cases.

  1. Suppose that in the path mode1 the positions of aProducer and bProducer were reversed but that they were left unchanged in the path mode0, as sketched after this list. In this case there would be an ambiguity about which must be run first. In such a case art could issue a diagnostic and stop. However it does not; instead it uses the ordering specified by whichever path it decided to process first. When it is time to process the second path, art discovers that both modules have already been executed and takes no further action. Clearly this is not a problem if the order of execution of aProducer and bProducer does not matter. Clearly it is a problem if the order of execution does matter; in almost all cases this can be diagnosed quickly because, by definition, expected output will be missing. This choice of behaviour is left over from CMS: in the CMS trigger studies there might have been several hundred trigger paths in one job, each maintained by different groups. Typically each path contained many modules for which order of execution did not matter; their judgement was that forcing people to fix the many apparent inconsistencies was harder than finding and fixing the few true inconsistencies. Mu2e will likely decide that we have a sufficiently smaller problem that it makes sense to make the opposite choice.
  2. Art will let you define the parameter set for an EDProducer within the physics.analyzers parameter set and vice versa. This mistake is legal FHiCL, which is unaware of the meaning of the configuration, but it should be an illegal art run-time configuration.
  3. Art will let you put an EDAnalyzer on a path that is in the trigger_paths.
  4. If multiple people define their own path names, there is no tool to detect name collisions and FHiCL's substitution/replacement rules will operate.
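The ambiguity described in item 1, sketched using the paths from the example above:

mode0: [ aProducer, bProducer, selectMode0 ]   # aProducer before bProducer
mode1: [ bProducer, aProducer, selectMode1 ]   # bProducer before aProducer: ambiguous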

It is not necessary to use the keywords trigger_paths and end_paths.

  1. The variables trigger_paths and end_paths are no longer required. If they are present they will be honored. The remaining comments all presume that these variables are absent.
  2. All fcl variables defined in the physics parameter set that are not in the "known" list (producers, analyzers, filters) are presumed to be paths.
  3. The variables defined in 2 are inspected to see if they are valid trigger paths or valid end paths. All valid paths will be executed, with all trigger paths run before all end paths; all modules within a trigger path will be executed in the specified order. There are no other guarantees about order and we must not count on an order that we happen to observe today. A sketch of such a configuration is given after this list.
  4. If no end paths are present I *think* that all analyzers and output modules that are configured will be executed.
  5. We have not looked to see what happens if no trigger paths are present - my guess is that no producers or filters are run but that could be wrong.
  6. We are not sure what art does if a variable found in 2 is not a valid path. Valid: it is a fcl sequence and all entries are module labels; trigger paths may only contain the labels of producers and filters; end paths may only contain the labels of analyzers and output modules.
  7. We are not sure if there is a plan to remove trigger_paths and/or end_paths from the grammar.
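A minimal sketch of a configuration that omits both keywords, assuming the behaviour described in items 2 and 3 above:

physics: {
  producers: { aProducer: { module_type: MakeA } }
  analyzers: { checkAll:  { module_type: CheckAll } }

  p1: [ aProducer ]   # presumed to be a path; valid as a trigger path
  e1: [ checkAll ]    # presumed to be a path; valid as an end path
}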

Reconstruction On Demand

The art operating mode described above is known as scheduled reconstruction, in which the order of modules is given by a user supplied schedule, the trigger_paths. There is a second operating mode, reconstruction on demand, also called unscheduled reconstruction. This mode is not currently used by Mu2e but we might decide to use it at a later date. Some of the features of reconstruction on demand are critical for dealing with some of the practical problems faced by a large experiment like CMS. It is not yet clear if the tradeoffs will result in the same decision for a much smaller experiment such as Mu2e.

In reconstruction on demand, it is not necessary to provide any trigger_paths. It is necessary to provide:

  1. All of the module and service configuration, just as before.
  2. The end_paths information, just as before.
  3. The following parameter must be set in the services block of the run-time configuration:
services.scheduler.allowUnscheduled: true

In this mode art behaves as follows:

  1. At job startup it instantiates all of the modules, just as before.
  2. It makes a list of which products can be produced by which EDProducer module; this information is obtained from the calls to produces() in the constructor of each EDProducer.
  3. Art then starts to execute the modules in the end_paths, exactly as it did before.
  4. As art executes modules in the end_paths, those modules will make calls to get data products from the event. Art uses the following strategy to satisfy those requests; once the request is satisfied, art does not consider the remaining alternatives.
    • Art will check if it can satisfy the request from among the products already produced by the current art job.
    • It will look at the list made in 2 to see if there is an EDProducer that can satisfy the request; if so, art will run that EDProducer.
    • Art will look to see if there is an appropriate data product in the input event.
    • If all of the above fail, art will return an empty handle or an empty collection of handles.
  5. Some comments on the order given in the previous rule:
    • The first rule ensures that, if several modules in the end_paths ask for the same data product, and if an EDProducer is available to make that product, then that EDProducer will be run only once.
    • The ordering of the second and third rules means that if an appropriate data product both exists in the input event and can be produced by a registered EDProducer, art will run the EDProducer! Many will find this behaviour counterintuitive; it is discussed further below.
  6. In this mode, if a module asks for the fitted tracks, this can trigger a call to the track fitting code, which can trigger a call to the pattern recognition code, which can trigger a call to the straw cluster making code, which can trigger a call to the raw hit unpack code, which will finally get the raw hits from the input file. Each precursor data product is only made as it is needed.
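A minimal sketch of an on-demand configuration for the chain in item 6, with hypothetical module labels; note that no trigger paths are given:

services.scheduler.allowUnscheduled : true

physics: {
  producers: {
    unpackHits   : { module_type: UnpackRawHits }       # hypothetical
    makeClusters : { module_type: MakeStrawClusters }   # hypothetical
    patRec       : { module_type: PatternRecognition }  # hypothetical
    fitTracks    : { module_type: FitTracks }           # hypothetical
  }
  analyzers: {
    trackPlots : { module_type: TrackPlots }            # hypothetical
  }

  # Only an end path is given; each producer runs when its product is requested.
  e1: [ trackPlots ]
  end_paths: [ e1 ]
}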

One of the big advantages of reconstruction on demand is that the end user never needs to declare the required order of producers; art can figure it out on its own. Therefore the concern about two trigger paths having an inconsistent order of modules is moot. On a small experiment such as Mu2e, which might only ever have a handful of trigger paths within one art job, this is a small win. For large experiments that may have many trigger paths within one job, it is a big win.

If we take one step further, and ask that all EDAnalyzers and all output modules declare in their constructors what data products they require as inputs, then it is possible for reconstruction on demand to identify opportunities for parallel evaluation of EDProducers. If sub-event multithreading evolves into a useful feature, this extension to reconstruction on demand would allow it naturally.

Consider again the case that a requested data product both exists in the input event and can be produced by a registered EDProducer. After the producer has run, both data products will be present in the event, but they are easily distinguished because art labels each data product with a four part data product ID (see DataProducts.shtml#identifiers), and one part of this ID is the process_name. If an art process reads an input file, and if any of the data products from that input file have an ID with a process_name field that matches the process_name of the current process, then art will throw an exception. Therefore the two data products in question are guaranteed to have IDs that differ at least in their process_name field.

The CLEO III experiment also had a reconstruction on demand system, and their system also had the rule that if an appropriate EDProducer was available, it would be run to supersede data already in the input file. This way of thinking is essentially: if you don't want an EDProducer to run, don't configure it into the producer set. If you do put it in the producer set, it will take priority over existing data products.

CMS experience has shown that, in order for reconstruction on demand to work well, it is important that modules ask for their input data products using a well qualified name. Normally this is the getByLabel method of art::Event. Consider the case of writing a cluster finder for the calorimeter system; the input to this system will be a list of hit Avalanche Photo Diodes (APDs). There might be several different modules that can produce a list of hit APDs; perhaps one module makes simulated APD hits from MC truth information while another module unpacks raw data to produce APD hits. Perhaps there might be several standard configurations of this last module, one with tight pedestal cuts and one with loose pedestal cuts. The same cluster finder module can run on all of these different inputs, without recompilation, using the following pattern in the cluster finder module:

// In the member data.
std::string _caloReadoutModuleLabel;

// In the initializer list of the constructor:
_caloReadoutModuleLabel(pset.get<std::string>("caloReadoutModuleLabel","CaloReadoutHitsMaker")),

// In the analyze method
art::Handle<CaloHitCollection> caloHits;
event.getByLabel(_caloReadoutModuleLabel,caloHits);

The default behaviour of this code fragment is to get the following object from the event: an object of type CaloHitCollection that was produced by a module whose label is "CaloReadoutHitsMaker". In this pattern the label of the EDProducer module is run-time configurable so exactly the same module can be run on different inputs. The art metadata system will store the information about the source of the input hits that were used by the cluster finder.
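For example, a minimal sketch of pointing the cluster finder at a different hit maker at run time (the module label clusterFinder and the alternate producer label are hypothetical):

physics.analyzers.clusterFinder.caloReadoutModuleLabel : "CaloReadoutHitsMakerLoose"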

Some Remaining Issues

This note needs to be extended to include the use of filters when reconstruction on demand is enabled. It also needs to discuss the meaning of "keep *_*_*_*" in an output module when reconstruction on demand is enabled: this will trigger running all of the registered producers, and all of their data products will be written to the output file.
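As a reminder of the keep/drop syntax itself, a minimal sketch for the out0 module of the example above, assuming the outputCommands parameter of RootOutput (the pattern fields are class name, module label, instance name, and process name):

outputs.out0.outputCommands : [
  "drop *_*_*_*",        # start by dropping everything
  "keep *_aProducer_*_*" # keep all products made by the module labeled aProducer
]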

For a discussion of the keep/drop syntax when using Scheduled Reconstruction, see the discussion of configuring output modules on the IOModules page.