Mixing and Resampling
Revision as of 17:03, 25 May 2017
This page is a draft, please help complete it!
This page needs expert review!
Introduction
In Mu2e simulation, mixing is a procedure where different parts of an event are generated separately and written out, then, sometime later, mixed together to create the final event. It is an approximate, practical approach, driven by limitations on computing. It is helpful to be familiar with the experiment, simulation, and staging before continuing.
In the real experiment, one event is 900 ns of time-dependent detector data recording the result of one microbunch (40M protons on target, or POT). In one event, there is always a lot of background activity (hits) in the detector and, sometimes, there will be one track of interest, such as a conversion electron. If we were to naively simulate this event, we would spend thousands of hours generating the protons on target to simulate the background activity for one event, and then add the one conversion electron to the event. When you need thousands or millions of conversion electrons to test resolutions or efficiencies, this approach becomes impossible.
There are two things we can do, and we take both approaches depending on the needs of the study. The first approach is to forget the background activity and simply generate conversion electrons as they pop out of muons stopped in the target. This runs very fast and produces compact output files. The downside is that we do not have the background activity, and we know this activity is important if you are trying to make accurate estimates of efficiency or resolution.
The second approach is to generate as many full microbunch events as we can. We then overlay conversion electrons on top of these microbunches to make realistic signal events. It turns out we can't generate more than one hundred or so microbunches, so that's how many realistic events we end up with. To generate high statistics, we perform one more step. We continue to generate more unique conversion electrons, and re-use, or resample, the background events over and over. Since each conversion electron passes through a different region of the detector, the effect of the background activity on the electron is statistically independent, even though we are using the same background activity event many times. The combination of the background event and the conversion electron event is the mixing.
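The resampling idea can be illustrated with a minimal sketch. This is not the Mu2e mixing code: the function name `resample_mix`, the string stand-ins for events, and the uniform random choice of a frame are all assumptions made for illustration. It only shows that every signal event is unique while the small pool of background frames is reused.

```python
import random

def resample_mix(signal_events, background_frames, seed=0):
    """Pair each unique signal event with a background frame drawn with replacement.

    Hypothetical helper: the uniform random draw is an assumption, not
    necessarily how the production mixing chooses a frame.
    """
    rng = random.Random(seed)
    mixed = []
    for signal in signal_events:
        frame = rng.choice(background_frames)  # the same frame may be picked many times
        mixed.append({"signal": signal, "background": frame})
    return mixed

# Placeholder data: 1000 unique conversion electrons, 100 background frames
signals = [f"ce_{i}" for i in range(1000)]
frames = [f"frame_{i}" for i in range(100)]
mixed = resample_mix(signals, frames)

n_unique_signals = len({m["signal"] for m in mixed})
n_unique_frames = len({m["background"] for m in mixed})
print(n_unique_signals, "unique signals over", n_unique_frames, "distinct frames")
```

With ten times more signal events than frames, each frame is used about ten times on average, which is the re-use described above.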
In the next section we briefly discuss some simulation issues relevant to all mixing. There are many ways that parts of an event might be generated separately, then mixed, but we will focus only on the highest, and most common, level of mixing in the following sections.
Simulation mixing
Events are mixed at the StepPointMC level. This essentially overlays lists of particles in the event. You can't mix after creating digis, since by then you can no longer correctly add the charge depositions and include non-linear effects.
Mixing and Geometry
The signal and the background frame may have been simulated with different geometry. This will still work to first order since the StepPointMC is a list of space points and particle IDs. The digis can still be made. If a piece of the detector moved between the two geometries, then the StepPointMC may be in the wrong place and the simulation won't make sense.
Story of CompressPV...
Background frames
In this section we mix many sources of physics together to make the event that represents the simulation of all Standard Model physics in a microbunch event. This is also called a "background event", a "background frame" or "detmix". Usually one expert prepares these frames for the collaboration.
At the end of stage 4 of the staged production of protons on target, we are left with the following datasets. At this point, each art event is based on one proton on target, and may be filtered, saving only events with particles in the detector region.
- an ntuple containing the position of muons that stopped in the target (target stops)
- art files containing the following particle species, produced using custom physics models, originating from target stops
- DIO electrons
- protons
- neutrons
- photons
- deuterons
- art files with all the particles caused by muons that stopped outside the target (oot)
- art files with all the other particles caused by protons on target (flash)
At this point, we have all the Standard Model physics from protons on target. The five particle species represent all the products of muons stopped in the target, the OOT represents all muons stopped outside the target, and the flash represents everything else. We only need to mix them together in the right proportions to make realistic events. The mixing factors will depend on the simulation of the POT, the number of stopped muons, the probability of a muon capture, and the probability for a muon capture to produce a particle of a certain species.
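As a rough illustration of how such a mixing factor could be assembled, the sketch below multiplies the quantities listed above. It assumes the factor is a simple product, which may omit terms used in production (filter efficiencies, for example). The function name and every number except the 40M POT per microbunch are hypothetical placeholders, not Mu2e values.

```python
def mean_particles_per_microbunch(pot_per_microbunch, stops_per_pot,
                                  capture_probability, yield_per_capture):
    """Mean number of particles of one species expected in one microbunch.

    Assumes a simple product of the factors; a real calculation may need
    further terms that are not shown here.
    """
    stopped_muons = pot_per_microbunch * stops_per_pot   # muons stopped in the target
    captures = stopped_muons * capture_probability       # stopped muons that are captured
    return captures * yield_per_capture                  # particles of this species

# 40M POT per microbunch comes from the text; all other values are placeholders.
mean_protons = mean_particles_per_microbunch(
    pot_per_microbunch=40e6,
    stops_per_pot=1e-3,        # placeholder
    capture_probability=0.6,   # placeholder
    yield_per_capture=0.05,    # placeholder
)
print(f"mean protons per microbunch: {mean_protons:.1f}")
```

The resulting mean would then set how many events from the corresponding single-species dataset are mixed into each background frame.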
Mixing signal
In this section we discuss how to mix a signal, such as a conversion electron, on top of a background frame.
There are generally two cases at this point:
- the conversion electrons have been generated previously and are stored, one per event, in an art file, and this dataset is mixed with the background. In this case the input module reads conversion electrons one at a time, and the background frame is added in.
- the conversion electron is generated in the same job in which it is mixed. In this case the input module is a generator and the background frame is mixed in after the generation.
Mixing and
Mixing fcl
The mixing fractions for cd3 samples are recorded in doc 6273.
See also GenerateFcl
BEGIN_PROLOG
draEventMixing: {
  producers: { @table::EventMixing.producers }
  filters: { @table::EventMixing.filters }
  CD3Mixers: [ @sequence::EventMixing.CD3Mixers ]
}
END_PROLOG
#include "JobConfig/cd3/common/prolog.fcl"
process_name: dram
@table::draTopLevelDefs
physics.filters.flashMixer.fileNames : @local::bgHitFiles
physics.filters.ootMixer.fileNames : @local::bgHitFiles
physics.filters.dioMixer.fileNames : @local::bgHitFiles
physics.filters.neutronMixer.fileNames : @local::bgHitFiles
physics.filters.photonMixer.fileNames : @local::bgHitFiles
physics.filters.protonMixer.fileNames : @local::bgHitFiles
physics.filters.deuteronMixer.fileNames : @local::bgHitFiles
Particle Numbering
In mixing, the input files might have GenParticles and SimParticles with the same ID numbers, but these need to be combined into one coherent collection.
There are two types of input files. The first is a single-particle file, such as protons ejected by stopped muons in a single-stage simulation. These will have SimParticles numbered starting with 1, and running up to, perhaps, 1000. When mixing needs to mix in several events from one of these datasets, they are renumbered as follows. The first event keeps its numbers 1 through N1. The second event will have its numbers shifted to N1+1 through N1+N2, and so on.
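The shift-and-append renumbering can be sketched as follows. This is an illustration, not the actual mixing implementation: the function name `renumber_and_append` is hypothetical, and shifting each event by the highest ID seen so far is an assumption that reproduces the N1+1 through N1+N2 pattern when the input numbering is dense.

```python
def renumber_and_append(events):
    """Merge per-event lists of SimParticle IDs into one collision-free list.

    events: list of lists of IDs, each list numbered from 1.
    Returns the merged IDs and the offset applied to each event.
    """
    merged = []
    offsets = []
    offset = 0
    for ids in events:
        offsets.append(offset)
        merged.extend(i + offset for i in ids)
        # Assumption: advance by the largest ID in this event,
        # which equals N_i when the IDs run densely from 1 to N_i.
        offset += max(ids, default=0)
    return merged, offsets

# Three events with N1=3, N2=2, N3=4 particles
merged, offsets = renumber_and_append([[1, 2, 3], [1, 2], [1, 2, 3, 4]])
print(offsets)  # [0, 3, 5]
print(merged)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Any reference from one particle to another, such as a parent ID, would need the same per-event offset so that the links stay consistent.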
The flash files have been through several stages of simulation, and the new particles in each stage (in the 2016 cd3 scheme) have an offset of (N-1)*100,000 for stage N, so after M stages they might have SimParticles sparsely numbered in the range 1 through M*100,000. If mixing needs to read in several of these events, they are appended as described above for the single-stage simulation files. This results in numbers ranging from 1 to K*M*100,000, where K is the number of mixed events, which may run into the thousands.
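The arithmetic for the multi-stage numbering can be checked with a short sketch. The constant and function names are hypothetical; the values M=4 and K=1000 are example inputs, not numbers from a specific production.

```python
STAGE_STRIDE = 100_000  # offset step between stages, as described in the text

def stage_offset(stage):
    """Offset added to new SimParticle IDs created in stage N (1-based)."""
    return (stage - 1) * STAGE_STRIDE

def max_id_after_mixing(n_stages, n_events):
    """Upper bound on the SimParticle ID after appending K events of M stages each."""
    return n_events * n_stages * STAGE_STRIDE

print(stage_offset(1))               # 0
print(stage_offset(3))               # 200000
print(max_id_after_mixing(4, 1000))  # 400000000
```

The numbering is sparse, so the largest ID can be far greater than the number of particles actually stored.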
Mixing must not only mix several events from one dataset, but also mix the events from several datasets. The same pattern of shifting and appending numbers, as needed, is applied to the collections from different files. The same algorithm is applied to the GenParticle and McTrajectory collections. A summary of the mixing process is in the MixingSummary art product.