{{Expert}}
==Introduction==
Mu2e [[Simulation|simulation]] often uses a technique called staging, where an event is generated and simulated only partially, then the simulation of that event is stopped and the event is written out.  The next stage reads in the result of the first stage, continues the simulation further, then stops and writes that out.  Several stages may be used to complete a simulation.  A stage may stop when all particles have reached a certain plane or volume.  A typical volume to stop a simulation on would be the DS region containing the target and the detector, or a volume that contains the tracker.  Another example of a stage boundary is the point when all muons have stopped in the target but have not yet decayed.
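The sketch below illustrates the idea in plain Python; it is not Mu2e code, and all names, the file format, and the toy physics are invented for illustration.  Stage 1 propagates particles only up to a boundary plane and writes out the survivors; stage 2 reads that file back and continues the propagation.
<syntaxhighlight lang="python">
# Toy staging sketch (not Mu2e code): all names, the JSON file format, and the
# "physics" are invented for illustration only.
import json
import random

BOUNDARY_Z = 100.0   # hypothetical plane where stage 1 stops the simulation
DETECTOR_Z = 200.0   # hypothetical end of the stage 2 simulation

def propagate(particle, z_stop):
    """Step a particle forward until it reaches z_stop or is absorbed."""
    while particle["alive"] and particle["z"] < z_stop:
        particle["z"] += random.uniform(0.5, 1.5)   # toy step length
        if random.random() < 0.01:                  # toy absorption probability per step
            particle["alive"] = False
    return particle

def stage1(n_events, outfile):
    """Generate particles, simulate up to the boundary, and write out survivors."""
    kept = []
    for i in range(n_events):
        p = propagate({"id": i, "z": 0.0, "alive": True}, BOUNDARY_Z)
        if p["alive"]:                   # filter: drop events with nothing interesting
            kept.append(p)
    with open(outfile, "w") as f:
        json.dump(kept, f)

def stage2(infile, z_stop):
    """Read the stage 1 output and continue the simulation where it stopped."""
    with open(infile) as f:
        particles = json.load(f)
    return [propagate(p, z_stop) for p in particles]

stage1(1000, "stage1_out.json")
survivors = [p for p in stage2("stage1_out.json", DETECTOR_Z) if p["alive"]]
print(len(survivors), "particles survived through stage 2")
</syntaxhighlight>
In a real production the stage boundary, filtering, and bookkeeping are handled by the framework; the point here is only the write-out/read-back pattern.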
The motivations for staging are listed below.  They are especially important when the early stages dominate the CPU time, which is usually the case in Mu2e.
* A piece of the detector can be defined in a later stage.  For example, all the computationally-expensive protons on target can be simulated in stage 1, which might take months, and the simulation is stopped outside the detectors.  During this time, the tracker geometry and algorithms can continue to develop.  When the tracker code is ready, the simulation can be continued into the tracker volume.
* Variations in design can be studied efficiently.  If a simulation stage is stopped outside of a detector, then the simulation can be continued using different designs of the detector.  Each design can be tested quickly because all the simulation up to the detector doesn't have to be repeated.
* Staging can make error recovery much more efficient - it is a kind of checkpointing.  If a mistake is discovered in stage 3, then only stage 3 needs to be repeated, starting from the output of stage 2.  When most of the CPU time is spent in the early stages, this can be a huge savings.
* There are often limits on how long a job can run or how large an output file can be, and these can be handled efficiently with staging.  Sometimes it helps to compress and drop information before continuing, to filter events for more efficient storage, or to concatenate files.  See [[JobPlan|job planning]] for more discussion of these considerations.
Staging also enables two other important simulation techniques, [[Mixing|mixing]] and resampling; see [[Simulation]].
==Examples==
Stages are often abbreviated as '''s1''', '''s2''', etc.  The following examples reflect the cd3 staging plan.  These stages are designed to meet many needs of the collaboration.  Users might typically start their individual analysis work with one of the samples from the higher stages, depending on the analysis goals.
===Beam===
* '''s1''' - generate primary protons, simulate up to TS2.  This is the majority of the CPU. 
** write particles that are in the transport channel to the "mubeam" dataset
** write particles that are approaching the outside of the CRV to the "dsregion" dataset
** write particles that are approaching extmon to their own dataset
** filter out events with no interesting particles (~90%)
** allows variations in the CRV and TS3 foils, but locks in any TS1 foils.
* '''s2''' - continue the s1 mubeam simulation through TS4, stop on the boundary of TS5. 
** write out result to s2 mubeam dataset
** allows the variation of TS5 materials and anything downstream - the target, the detector, etc.
* '''s3''' - continue the s2 mubeam simulation to the outside of the tracker and calorimeter
** for muons stopped in the target, write them to an ntuple and remove them from the simulation
** for muons stopped anywhere else, stop their simulation and write them to the "ootstops" dataset
** write everything else to the "mothers" dataset (a toy sketch of this splitting follows this list)
*  '''s4''' - finally simulate the detector.  This is a series of different jobs, creating many datasets which will be brought back together in [[Mixing|mixing]].
** read s3 mothers, simulate the detector, write to "flash" dataset
** read s3 ootstops, simulate the detector, write to "detoot" dataset
** read target-stopped muon ntuple, force decay to DIO, simulate the detector, write to "detdio" dataset
** read target-stopped muon ntuple, force muon capture and production of a photon, neutron, proton, or deuteron, simulating each species separately with a custom model, and write each species to its own dataset
** generate conversion electrons from scratch (these will be mixed later), write to "detconversion" dataset
** generate conversion electrons from scratch, with flat momentum spectrum, write to "flate" dataset
These many datasets are typically [[Mixing|mixed]] for different purposes.
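As a concrete illustration of the splitting described for '''s3''' above, here is a toy Python sketch; it is not Mu2e code.  The dataset names are taken from the text, while the particle records and the selection rules are invented.
<syntaxhighlight lang="python">
# Toy routing sketch (not Mu2e code): dataset names come from the text above;
# the particle records and selection rules are invented for illustration.
sample = [
    {"pdg": 13, "stopped": True,  "volume": "target"},  # muon stopped in the target
    {"pdg": 13, "stopped": True,  "volume": "ts5"},     # muon stopped somewhere else
    {"pdg": 22, "stopped": False, "volume": "ds"},      # photon still in flight
]

def route(particle):
    """Pick the output dataset for a particle, mirroring the s3 splitting."""
    if particle["stopped"] and particle["volume"] == "target" and abs(particle["pdg"]) == 13:
        return "target_stops"    # written to an ntuple and removed from the simulation
    if particle["stopped"]:
        return "ootstops"        # out-of-target stops
    return "mothers"             # everything still being tracked

datasets = {"target_stops": [], "ootstops": [], "mothers": []}
for particle in sample:
    datasets[route(particle)].append(particle)

for name, contents in datasets.items():
    print(name, len(contents))
</syntaxhighlight>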
===Cosmics===
Cosmic rays can cause real and fake 105 MeV electrons in the conversion electron signal region, so cosmics are generated to study these rates.  Unlike the [[#Beam|beam]] jobs, each of these stages writes a single output dataset.  Also unlike the beam jobs, some of the stages are defined by reconstruction results, not just simulation results.  Filtering is more important here because cosmic rays rarely produce good tracks.  The primary purposes of the stages are blinding parts of the detector or reconstruction, and filtering, with each stage providing a filtering factor of up to several hundred (a toy sketch of such chained filtering follows the list below).
* '''s1''' - generate cosmic rays approaching the detector
** write out events with a particle with 45 MeV entering the tracker or calorimeter volumes
** locks in CRV geometry but not other detectors
* '''s2''' - complete simulation
** write out events with 15 StepPointMC's in the tracker
** locks in tracker geometry
* '''s3''' - perform track reconstruction
** write out events with a track with momentum 50<p<200 MeV
** locks in tracking
* '''s4''' -  do PID
**write out events with a high-quality track passing PID, with momentum 100<p<110 MeV
* '''s5''' - simulate the CRV, and write an ntuple
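The following toy Python sketch, again not Mu2e code, shows how per-stage filters like those above chain into a large overall rejection.  The pass fractions are invented, and only the stage labels echo the list.
<syntaxhighlight lang="python">
# Toy chained-filter sketch (not Mu2e code): pass fractions are invented;
# only the stage labels echo the cosmic-ray staging above.
import random

def stage_filter(events, keep_fraction, label):
    """Keep a random subset of events, standing in for one stage's selection."""
    kept = [e for e in events if random.random() < keep_fraction]
    factor = len(events) / max(len(kept), 1)
    print(f"{label}: {len(events)} -> {len(kept)} (rejection ~{factor:.0f}x)")
    return kept

events = list(range(1_000_000))   # generated cosmic-ray events
events = stage_filter(events, 1/100, "s1: particle reaches tracker/calorimeter")
events = stage_filter(events, 1/50,  "s2: enough tracker hits")
events = stage_filter(events, 1/20,  "s3: reconstructed track in momentum window")
events = stage_filter(events, 1/10,  "s4: high-quality track passing PID")
</syntaxhighlight>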
==Other Staging==
In the above examples, the stages were defined by particles reaching physical boundaries, or by the reconstruction passing certain cuts.  The staging concept can also be applied to other domains.  For example, the ''time'' of a stopped muon capture or decay is often set to zero and simulated at whatever time is most convenient; see [[TimeSim|time simulation]].  Another example is turning off decays of muons, which allows the muons to be decayed in a later stage, perhaps with constraints on the decay time or position.  The muons may also be re-sampled.
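A minimal Python sketch of these two ideas, re-sampling a recorded muon stop and deferring its decay to a later stage, is shown below.  It is not Mu2e code: the record layout is invented, and the lifetime is only an approximate value for a muon bound in aluminum.
<syntaxhighlight lang="python">
# Toy resampling sketch (not Mu2e code): the record layout is invented and the
# lifetime is only an approximate value for a muon bound in aluminum.
import random

BOUND_MUON_LIFETIME_NS = 864.0   # approximate muonic-Al lifetime

# A stopped muon recorded by an earlier stage with its decay turned off;
# its time has been reset to zero at the stop.
stopped_muon = {"x": (1.0, 2.0, 3.0), "t": 0.0}

def resample_decays(stop, n_copies):
    """Reuse one stop n_copies times, drawing an independent decay time for each."""
    copies = []
    for _ in range(n_copies):
        decay_time = random.expovariate(1.0 / BOUND_MUON_LIFETIME_NS)
        copies.append({**stop, "t": stop["t"] + decay_time})
    return copies

for copy in resample_decays(stopped_muon, 5):
    print(f"decay at t = {copy['t']:.1f} ns")
</syntaxhighlight>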
 
 
 
 
[[Category:Computing]]
[[Category:Code]]
