Mock Data (MDS): Difference between revisions


Revision as of 19:01, 19 May 2024

MDC 2024: Mock Data samples

Introduction

Mock data samples can be helpful in two ways:

  • to help prepare physics analysis efforts;
  • to help us understand the size of our data .art files and ntuples.

Combining primaries

To create Mock Data the process is as follows:

  • Each primary (DIOtail, CE, Cosmics etc.) is simulated separately. The number of events simulated must correspond to a livetime equal to or greater than that of the Mock Data sample (we do not resample);
  • The run_si.py script is run with several input arguments:
 stdpath = the output path, which is also where the filelists for each input are located
 BB = 1, 2 or averaged booster batch mode
 livetime = livetime in seconds
 prc = the list of processes to be included
 rmue = signal branching rate
  • run_si finds the expected number of events for each input process given the chosen BB mode, livetime and Rmue values.
  • run_si effectively constructs .fcl files which use the SamplingInput module from art. This allows sampling from a single file per process, with an overall weight determined by the previously listed parameters.
  • the "events-per-sub run" factor in run_si can be used to split the total number of sampled events across several .fcl files, thus allowing parallelization.
  • TODO: need to write about how this would be run on the grid
  • each .fcl file generates a combined primary sample which contains a mixture of primaries, weighted according to the chosen livetime, rmue, BB etc.
  • this combined primary can be mixed with pile-up and reconstructed as any other primary using standard Production workflows.
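
The weighting and splitting steps above can be sketched in Python. This is a minimal illustration and not the run_si.py code: the function names, the assumption that each weight is proportional to the expected number of events for that process, and the example counts are all hypothetical.

```python
def sampling_weights(expected_events):
    """Turn expected event counts per process into relative weights.

    Assumption: the weight of a process is proportional to its expected
    number of events. run_si.py may define the weights differently.
    """
    total = sum(expected_events.values())
    if total <= 0:
        raise ValueError("total expected events must be positive")
    return {name: n / total for name, n in expected_events.items()}


def split_into_jobs(total_events, events_per_job):
    """Split total_events into chunks of at most events_per_job events."""
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    full, rest = divmod(total_events, events_per_job)
    return [events_per_job] * full + ([rest] if rest else [])


# Placeholder counts for illustration only, not real Mu2e expectations.
expected = {"DIOtail": 900, "CE": 55, "Cosmics": 45}
weights = sampling_weights(expected)
jobs = split_into_jobs(sum(expected.values()), 300)
```

Each entry in jobs would correspond to one .fcl file, so the files can be processed in parallel.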

Diagram

Components

DIO tail

The DIO tail is simulated from stopped muons using the SingleProcessGenerator defined in the Offline EventGenerator directory. The DIOGenerator tool is used to provide the correct momentum distribution based on the 5-8 polynomial derived by Czarnecki et al.
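
For reference, the "5-8 polynomial" is usually understood to be the near-endpoint expansion for aluminium in powers 5 to 8. The form below is a sketch written from the published parametrization as commonly quoted; the coefficients a_5 to a_8 are left symbolic on purpose, and the values actually used by DIOGenerator should be taken from Offline.

```latex
\frac{1}{\Gamma_0}\frac{d\Gamma}{dE_e}
  \approx a_5\,\delta^5 + a_6\,\delta^6 + a_7\,\delta^7 + a_8\,\delta^8,
\qquad
\delta = E_\mu - E_e - \frac{E_e^2}{2\,m_{\mathrm{Al}}}
```

Here E_\mu is the energy of the bound muon, E_e the electron energy, and m_Al the mass of the aluminium nucleus.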

A filter called GenFilter is used to remove events unlikely to produce viable events in the reconstruction. The filter improves the time performance by 40% with no loss of efficiency.

Two DIO tail samples are included as primaries in two sets of samples for MDC2024: one has a cut at p > 95 MeV/c (a fraction of 3.64e-11 of the entire DIO momentum spectrum) and another has a lower cut, below the trigger threshold, of p > 75 MeV/c (a fraction of 4.19e-7 of the entire DIO spectrum).
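
To give a feel for what the two cuts imply, the short sketch below compares the two fractions quoted above. The helper tail_events and its argument names are hypothetical and only illustrate the arithmetic; the total number of DIO decays is an input that is not taken from this page.

```python
# Fractions of the full DIO momentum spectrum quoted in the text above.
FRACTION_ABOVE_95 = 3.64e-11  # p > 95 MeV/c
FRACTION_ABOVE_75 = 4.19e-7   # p > 75 MeV/c


def tail_events(n_dio_decays, tail_fraction):
    """Expected number of DIO electrons above the cut (hypothetical helper)."""
    return n_dio_decays * tail_fraction


# How many times larger the 75 MeV/c sample is for the same livetime.
ratio = FRACTION_ABOVE_75 / FRACTION_ABOVE_95
print(f"75 MeV/c cut keeps {ratio:.3g} times more DIO events than 95 MeV/c")
```

For the same livetime, the lower cut therefore yields a sample roughly 1.15e4 times larger, which matters for the file sizes the Mock Data samples are meant to probe.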

In previous simulation studies, DIOs of all momenta were included in the pile-up stream and not as primaries. Including them as primaries has the advantage of giving us a large sample of events and therefore increased realism.

Conversion and Conversion Leading Log

CeEndpoints are a standard part of production. The Leading Log campaign includes the leading log corrections calculated by Szafron. This results in about 10% of electrons being in a lower momentum tail (as opposed to all being at 104.97 MeV/c in the case of the CeEndpoint).

Cosmics

As part of SU2020 a campaign that used the CORSIKA generator was built and exercised, providing 1.1e7 s of cosmic events to be sampled from. A campaign of similar size using the CRY generator is also available.

The CRY sample is used for pass 0, but the CORSIKA one is used for the later campaigns.

Pile-up

For pass 0 the existing pile-up streams were used. These were mixed with the combined primary sample as if it were any other primary sample.

This introduces some inaccuracy, as we are mixing in two DIO samples (one as a primary for p > 95 MeV/c and one which is part of the MuStopPileup sample and covers all momentum ranges up to the endpoint). This could introduce some double counting, but it is unlikely to significantly affect the outcome of any physics analysis applied to these samples.

For future passes, custom pile-up samples will be combined as primaries in the same way we have done the DIO tails.

RPC

TODO

pass 0

The pass 0 samples all include DIO tail events with the 95 MeV/c cut. Two sample sizes are chosen: 1 week livetime and 1 month livetime.

All components except the RPC are included. Two Rmue values are used, one at 1e-13, which is just below the present upper limit (7e-13) and gives around 55 true CE events for the 1 week sample and 222 true CE events for the 1 month sample (before any selection or reconstruction efficiency is factored in).
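
The quoted CE counts can be rescaled to other settings under the assumption that the expected number of true CE events is linear in both Rmue and livetime at fixed beam conditions. The sketch below uses the 1 week figure above as its reference point; the function name and the choice of weeks as the livetime unit are illustrative only.

```python
REF_RMUE = 1e-13       # Rmue value quoted in the text
REF_CE_PER_WEEK = 55   # approximate true CE events for 1 week at REF_RMUE


def expected_ce(rmue, livetime_weeks):
    """Scale the reference CE count linearly in Rmue and livetime.

    Assumption: same booster batch mode and beam conditions as the
    reference sample; no selection or reconstruction efficiency applied.
    """
    return REF_CE_PER_WEEK * (rmue / REF_RMUE) * livetime_weeks


one_week = expected_ce(1e-13, 1)
four_weeks = expected_ce(1e-13, 4)
at_limit = expected_ce(7e-13, 1)
```

Four weeks at 1e-13 gives about 220 events with this scaling, close to the 222 quoted for the 1 month sample; the small difference is consistent with rounding of the 1 week number or with the exact definition of a month.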

The samples available are listed below:

pass 1