Mock Data (MDS)

Revision as of 14:14, 25 October 2024

MDC2020 ensembles

MDC2024 was the original name of this campaign, which is now part of MDC2020. It refers specifically to the ensembles (mock data) made using MDC2020 stops and primaries.

Introduction

Mock data samples can be helpful in two ways:

to help prepare physics analysis efforts;
to help us understand the size of our data .art files and ntuples.

Streams

The two purposes above (physics studies, trigger studies) will result in different samples, with differing complexity.

Physics stream

Includes all major components and pile-up. The DIOtail momentum cut can be higher (nominally p > 95 MeV/c will be used as a starting point). Three samples will be made: a signal just below the current limit (Rmue = 1e-13), a closed sample (random signal choice), and no signal.

Trigger stream

Here all backgrounds and pile-up will be included but no signal. The DIOtail cut is reduced below the trigger threshold to p > 75 MeV/c.

Inputs

There are several assumptions made when we choose a livetime:

Booster Batch Mode

The Booster Batch (BB) mode describes the incoming operational mode of the booster which feeds the beam through our delivery ring which in turn passes protons to Mu2e.

There are two "run modes" in Mu2e: 1BB and 2BB. In the low-intensity running mode (1BB) the mean intensity is 1.6E7 protons/pulse, and in the higher-intensity mode (2BB) this becomes 3.9E7 protons/pulse.

Batch	Time (s)	T Cycle (s)	T Spill (s)	Spills	frac	On Spill Time (s)	N-cycles	POT per cycle
`1`	`9.52E+06`	`1.33`	`1.07E-01`	`4`	`0.323`	`3.07E+06`	`7.16E+06`	`4.00E+12`
`2`	`1.58E+06`	`1.4`	`4.31E-02`	`8`	`0.246`	`3.89E+05`	`1.13E+06`	`8.00E+12`
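
The derived columns of this table follow from the cycle structure. The sketch below assumes that "frac" is the on-spill duty factor (Spills x T Spill / T Cycle) and that the POT scales with the number of cycles. Under that assumption it reproduces the tabulated values to rounding. The class and method names are illustrative and are not part of any Mu2e package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchMode:
    """Cycle parameters for one booster batch mode (values from the table)."""
    name: str
    t_cycle_s: float        # length of one cycle
    t_spill_s: float        # length of one spill
    spills_per_cycle: int
    pot_per_cycle: float

    @property
    def duty_factor(self) -> float:
        # Assumed meaning of the "frac" column: fraction of a cycle on spill.
        return self.spills_per_cycle * self.t_spill_s / self.t_cycle_s

    def n_cycles(self, livetime_s: float) -> float:
        return livetime_s / self.t_cycle_s

    def on_spill_time_s(self, livetime_s: float) -> float:
        return livetime_s * self.duty_factor

    def pot(self, livetime_s: float) -> float:
        return self.n_cycles(livetime_s) * self.pot_per_cycle


ONE_BB = BatchMode("1BB", 1.33, 1.07e-1, 4, 4.00e12)
TWO_BB = BatchMode("2BB", 1.4, 4.31e-2, 8, 8.00e12)

if __name__ == "__main__":
    for mode, time_s in ((ONE_BB, 9.52e6), (TWO_BB, 1.58e6)):
        print(f"{mode.name}: frac={mode.duty_factor:.3f} "
              f"on-spill={mode.on_spill_time_s(time_s):.3g} s "
              f"cycles={mode.n_cycles(time_s):.3g} "
              f"POT={mode.pot(time_s):.3g}")
```

For example, ONE_BB.pot(3600) gives the POT for one hour of 1BB running.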

Expected DIOs

The expected number of muon stops per POT is 1.56E-3 muons/POT (from MDC2020p). The decay : capture ratio for Al is 0.39:0.61.

In our simulation, we tend to focus on simulating the higher momentum tail. A cut of p > 75 MeV/c samples a fraction of 4.19E-07 of the entire DIO spectrum, and a cut of p > 95 MeV/c samples a fraction of 3.64E-11 of the entire DIO spectrum.

livetime	BB	POT	Stopped Muons	DIOs (p>75MeV/c)	DIOs (p>95MeV/c)
`1 hour`	`1BB`	`1.08E16`	`1.67e+13`	`7.10e6`	`609`
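
The arithmetic behind this table can be sketched as follows. The constants are the ones quoted above. The DIO columns of the table are close to (stopped muons) x (tail fraction); whether the 0.39 decay fraction should also be applied depends on how the tail fractions are normalized, which is not stated here, so the sketch exposes it as an explicit argument. Function names are illustrative only.

```python
STOPS_PER_POT = 1.56e-3                 # muon stops per POT (MDC2020p)
DIO_TAIL_FRACTION = {75: 4.19e-7,       # fraction of DIO spectrum with p > 75 MeV/c
                     95: 3.64e-11}      # fraction of DIO spectrum with p > 95 MeV/c


def stopped_muons(pot: float) -> float:
    """Expected number of stopped muons for a given number of POT."""
    return pot * STOPS_PER_POT


def dio_tail_events(pot: float, p_min_mev: int,
                    decay_fraction: float = 1.0) -> float:
    """Expected DIO events above the momentum cut.

    decay_fraction=1.0 matches the table above to within about 1-2%;
    pass 0.39 to apply the Al decay fraction explicitly (an assumption
    about normalization, see the lead-in).
    """
    return stopped_muons(pot) * decay_fraction * DIO_TAIL_FRACTION[p_min_mev]


if __name__ == "__main__":
    pot_one_hour_1bb = 1.08e16
    print(f"stopped muons: {stopped_muons(pot_one_hour_1bb):.3g}")
    for cut in (75, 95):
        print(f"DIO p>{cut} MeV/c: {dio_tail_events(pot_one_hour_1bb, cut):.3g}")
```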

Sim Efficiencies and Databases

One major improvement over the previous MDC2018 is that we now have a database implementation in Mu2e. This includes a "SimEfficiency" database which can be used to extract the expected rates from the simulation, without the need for hardcoding. Here is an example of how to extract the number of stopped muons per POT:

# get stopped rates from DB
import DbService  # Mu2e Offline python binding, needed for DbService.DbTool

dbtool = DbService.DbTool()
dbtool.init()
args = ["print-run", "--purpose", "MDC2020_best", "--version", "v1_1",
        "--run", "1200", "--table", "SimEfficiencies2", "--content"]
dbtool.setArgs(args)
dbtool.run()
rr = dbtool.getResult()

# get number of target muon stops: multiply together the rates
# (column index 3) of the MuminusStopsCat and MuBeamCat rows
target_stopped_mu_per_POT = 1.0
rate = 1.0
lines = rr.split("\n")
for line in lines:
    words = line.split(",")
    if words[0] == "MuminusStopsCat" or words[0] == "MuBeamCat":
        #print(f"Including {words[0]} with rate {words[3]}")
        rate = rate * float(words[3])
        # the factor of 1000 is kept from the original script
        target_stopped_mu_per_POT = rate * 1000
print(f"Final stops rate muon {target_stopped_mu_per_POT}")
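
The parsing step can be tried without database access. The sketch below is self-contained and works on a made-up text block: the rows, their column layout beyond "name in column 0, rate in column 3", and the numerical values are all hypothetical and chosen only for illustration. The factor of 1000 is copied from the snippet above; its origin is not explained on this page, so it is left as a parameter.

```python
def stops_per_pot_from_table(text: str,
                             stages=("MuminusStopsCat", "MuBeamCat"),
                             scale: float = 1000.0) -> float:
    """Multiply the rates (column index 3) of the selected rows.

    Rows with fewer than four comma-separated fields are skipped, so
    header or blank lines in the printout do not raise an IndexError.
    """
    rate = 1.0
    for line in text.splitlines():
        words = [w.strip() for w in line.split(",")]
        if len(words) > 3 and words[0] in stages:
            rate *= float(words[3])
    return rate * scale


# Hypothetical printout, NOT real database content.
SAMPLE = """\
TABLE SimEfficiencies2
MuBeamCat,100,50,0.5
MuminusStopsCat,1000,2,0.002
SomeOtherStage,10,5,0.5
"""

if __name__ == "__main__":
    print(stops_per_pot_from_table(SAMPLE))
```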

Production Scripts

To automate parts of the process a number of scripts have been written. These reside in the Production repo: Production/ensembles.

There are a number of helper scripts available:

ensemble python scripts

normalization.py - calculates the normalization for each sample based on user assumptions
maketemplatefcl.py - makes the SamplingInput fcl for the provided data set
calculateEvents.py - prints number of a specific process events for chosen user inputs

ensemble shell scripts

Stage1: the S1 script tells you the minimum number of events needed to match a chosen cosmic sample for each input process. The number of jobs for each input process must match the number of chosen cosmic files. You could simulate more than this to allow for failure modes, but only use that number of jobs in the eventual ensembling.

Stage2: this script combines the input samples and the template fcl and passes these to the grid to run the parallelized ensembling using the standard Mu2e grid tools.

validation: this script loops over the grid log files and accesses the specific part telling how many of each input were in fact sampled. It sums those up and prints the total number of dts events from each process (this includes the dts efficiency, so it is only a validation).

Components

DIO tail

The DIO tail is simulated from stopped muons using the SingleProcessGenerator defined in the Offline EventGenerator directory. The DIOGenerator tool is used to provide the correct momentum distribution based on the 5-8 polynomial derived by Czarnecki et al.

A filter called GenFilter is used to remove events unlikely to produce viable events in the reconstruction. The filter improves the time performance by 40% with no loss of efficiency.

Two DIO tail samples are included as primaries in two sets of samples for MDC2024: one has a cut at p > 95 MeV/c (a fraction of 3.64e-11 of the entire DIO momentum spectrum) and another has a lower cut, below the trigger threshold, of p > 75 MeV/c (a fraction of 4.19e-7 of the entire DIO spectrum).

In previous simulation studies, DIOs of all momenta were included in the pile-up stream and not as primaries. Including them as primaries has the advantage of giving us a large sample of events and therefore increased realism.

Conversion and Conversion Leading Log

CeEndpoints are a standard part of production. The Leading Log campaign includes the leading log corrections calculated by Szafron. This results in about 10% of electrons being in a lower momentum tail (as opposed to all being at 104.97 MeV/c in the case of the CeEndpoint).

Cosmics

As part of SU2020 a campaign that used the CORSIKA generator was built and exercised, providing 1.1e7 s of cosmic events to be sampled from. A campaign of similar size using the CRY generator is also available.

The CRY sample is used for pass 0, but the CORSIKA one is used for the later campaigns.

Pile-up

For pass 0 the existing pile-up streams were used. These were mixed with the combined primary sample as if it were any other primary sample.

This will introduce some inaccuracies, as we are mixing in two DIO samples (one as a primary for p > 95 MeV/c and one which is part of the MuStopPileup sample and covers all momentum ranges up to the endpoint). This could introduce some double counting, but it is unlikely to significantly affect the outcomes of any physics analysis applied to these samples.

For future passes, custom pile-up samples will be combined as primaries in the same way we have done the DIO tails.

RPC

RPC is simulated using the RPCGun generator. Both internal and external RPC can be simulated using the same generator.

A timing filter on the arrival proper time of the stopped pions is used to improve the performance of the simulation. This must be factored in when normalizing the samples.

Input Samples

Here is a list of the relevant samples used as part of the current Mock Data effort.

The exact samples input into each MDS are listed below.

process	campaign	generated	reconstructed	livetime (s) (1BB)	POT equiv	eff
`CosmicCORSIKA`	`MDC2020ae`		`9568173`	`1.1e7`
`DIOtailp95MeVc`	`MDC2024a_4`	`401760000`	`192315262`	`6032089213`	`1.81416E+22`	`0.478`
`DIOtailp75MeVc`	`MDC2024a_3`	`160000000`	`28887387`	`2.03E+05`	`6.11958E+17`	`0.180`
`DIOtail (95)`	`MDC2020ad`	`3267000`	`1534077`	`127960436.9`	`3.84843E+20`	`0.469`
`DIOtail (75)`	`MDC2020ad_sm0`	`170150000`	`30357435`	`2219329`	`6.67465E+17`	`0.178`
`CeEndpoint`	`MDC2020ac`	`100000`	`54280`			`0.542`
`CeMLeadingLog`	`MDC2024a_sm4`	`22400`	`11687`			`0.522`
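
The "eff", "POT equiv" and "livetime" columns can be cross-checked from the generated and reconstructed counts. The sketch below assumes eff = reconstructed / generated, POT equiv = generated / (stops per POT x decay fraction x tail fraction), and a 1BB livetime of POT / (POT per cycle) x T cycle, using the rounded constants quoted earlier on this page. With these assumptions it reproduces the DIOtailp95MeVc row; the function names are illustrative only.

```python
STOPS_PER_POT = 1.56e-3     # muon stops per POT
DECAY_FRACTION = 0.39       # decay fraction for Al
T_CYCLE_S = 1.33            # 1BB cycle length
POT_PER_CYCLE = 4.0e12      # 1BB POT per cycle


def reco_efficiency(generated: int, reconstructed: int) -> float:
    return reconstructed / generated


def equivalent_pot(generated: int, tail_fraction: float) -> float:
    """POT needed to produce `generated` DIO tail events above the cut."""
    return generated / (STOPS_PER_POT * DECAY_FRACTION * tail_fraction)


def equivalent_livetime_s(pot: float) -> float:
    """1BB livetime corresponding to a given number of POT."""
    return pot / POT_PER_CYCLE * T_CYCLE_S


if __name__ == "__main__":
    # DIOtailp95MeVc row: generated, reconstructed, tail fraction for p > 95 MeV/c
    gen, reco, frac = 401_760_000, 192_315_262, 3.64e-11
    pot = equivalent_pot(gen, frac)
    print(f"eff       = {reco_efficiency(gen, reco):.3f}")
    print(f"POT equiv = {pot:.5e}")
    print(f"livetime  = {equivalent_livetime_s(pot):.4e} s")
```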

DIO 75MeV/c short tests

A set of very short samples with a p > 75 MeV/c cut on the DIO tail were generated to get a feel for the size of this sample and the time taken to generate it. These are available for anyone testing the trigger, but are not useful for physics studies. The larger DIOtail-only sample can also be used, since in all cases the resulting events were all DIOtails.

Tag	Processes	BB	equiv. time	equiv. POT	Rmue	conditions	Comments	sam name
`testa`	`CE+DIO(75MeV/c)`	`1BB`	`17s`	`5.22e13`	`0`	`perfect`	`dts,dig,mcs`	`ensemble-1BB-CEDIO-60s-p75MeVc`
`testb`	`CE+DIO(75MeV/c)+CRY`	`1BB`	`17s`	`5.22e13`	`0`	`perfect`	`dts,dig,mcs`	`ensemble-1BB-CEDIOCRYCosmic-60s-p75MeVc`
`testc`	`CE+DIO(75MeV/c)`	`2BB`	`13s`	`7.7e13`	`0`	`perfect`	`dts,dig,mcs`	`ensemble-2BB-CEDIO-60s-p75MeVc`
`testd`	`CE+DIO(75MeV/c)+CRY`	`2BB`	`13s`	`7.7e13`	`0`	`perfect`	`dts,dig,mcs`	`ensemble-2BB-CEDIOCRYCosmic-60s-p75MeVc`
`teste`	`CE+DIO(75MeV/c)+CRY`	`1BB`			`0`	`perfect`	`dts only`	`ensemble-1BB-CEDIOCRYCosmic-3600s-p75MeVc`
`testf`	`CE+DIO(75MeV/c)+CRY+PU`	`1BB`			`0`	`perfect`	`dts only`	`ensembles-1BB-CEDIOCRYCosmic-60s-p75MeVc-OnSpillMix1BBTriggered`

Mock-Dataset-0 (MDS0) (95 MeV/c)

The MDS0 samples all include DIO tail events with the 95 MeV/c cut.

All components except the RPC are included. Two Rmue values are used. One is 1e-13, which is just below the present upper limit (7e-13) and gives around 55 generated CE events for the 1 week sample and 222 generated CE events for the 1 month livetime (before any selection or reconstruction efficiency is factored in).

The samples available are listed below:

Tag	Processes	BB	equiv POT	equiv livetime	Rmue	conditions	sam name	Comments
`MDS0a`	`CE+DIO(95MeV/c)`	`1BB`	`1.50e18`	`4.89E+05 (5.7 days)`	`9.39E-14`	`best,perfect`	`nts.mu2e.MDS0a.MDC2020ad_perfect_v1_2.root`
`MDS0b`	`CE+DIO(95MeV/c)+CRY`	`1BB`	`5.61E+17`	`1.86E+05 (2.1 days)`	`1e-13`	`best,perfect`	`nts.mu2e.MDS0b.MDC2020ad_perfect_v1_3.root`
`MDS0c`	`CE+DIO(95MeV/c)+CRY`	`1BB`	`2.31E+18`	`7.70E+05 (8.9 days)`	`8.88E-14`	`best,perfect`	`ensemble-1BB-CEDIOCRYCosmic-2400000s-p95MeVc-Trigger-`
`MDS0d`	`CE+DIO(95MeV/c)+CRY+PU`	`1BB`	`5.84E+17`	`1.94E+05 (2.2 days)`	`1.35E-13`	`perfect`	`ensemble-1BB-CEDIOCRYCosmic-600000s-p95MeVcMix1BBTriggered`	`normalization somewhat handwavy here, expect similar to pass0b`
`MDS0e`	`CE+DIO(95MeV/c)+CRY`	`1BB`			`1e-13`	`perfect`	`dts only: ensemble-1BB-CEDIOCRYCosmic-31000000s-p95MeVc`	`largest simple sample`

The dts, digi, mcs and TrkAna ntuples are available in the usual locations. In most cases the digi and reco stages were run with perfect and best conditions.

The component samples which went into these streams are listed here:

process	tag	Comments
`CeEndpoint`	`MDC2020ac`
`DIOtail (95MeV/c)`	`MDC2020ad`
`DIOtail (75MeV/c)`	`MDC2020ad_sm0`
`CRY Cosmic`	`MDC2020s`	`1 year sample, signal stream`
`pile-up/stops`	`MDC2020p`	`most recently made mu beam sample`

Yields

Before any cuts are applied the true values for each process can be derived using the Process Code:

tag	DIOtail	CE	Cosmics
`MDS0a`	`9966`	`49`	`N/A`
`MDS0b`	`3801`	`20`	`4684`
`MDS0c`	`15691`	`73`	`19101`
`MDS0d`	`3956`	`28`	`4681`

Once a standard set of cuts is defined we will apply those and update these yields.

Effects of Pile-up

Samples pass0b and pass0d (MDS0b and MDS0d in the table above) are essentially the same in terms of physics contributions. However, pass0d contains standard pile-up. Here is a list of the processes reconstructed in each file:

Pass0b:

[(38, 'cosmicCRY', 4684)]
process code counts:
[(12, 'compt', 71), 
(13, 'conv', 882), 
(14, 'Decay', 181), 
(17, 'eIoni', 24), 
(31, 'muIoni', 2269), 
(34, 'muPairProd', 143), 
(56, 'mu2ePrimary', 4684), 
(97, 'neutronInelastic', 2), 
(99, 'pi_PlusInelastic', 1), 
(114, 'DIO', 7),
(116, 'muonNuclear', 4), 
(166, 'mu2eMuonDecayAtRest', 3801), 
(167, 'mu2eCeMinusEndpoint', 20)]

Pass0d:

[(38, 'cosmicCRY', 4681)]
process code counts:
[(12, 'compt', 176),
(13, 'conv', 892),
(14, 'Decay', 470),
(17, 'eIoni', 39),
(31, 'muIoni', 2285),
(34, 'muPairProd', 139),
(56, 'mu2ePrimary', 4681),
(97, 'neutronInelastic', 1),
(100, 'pi_MinusInelastic', 1),
(114, 'DIO', 456),
(116, 'muonNuclear', 4),
(133, 'RadioactiveDecayBase', 1),
(165, 'mu2eMuonCaptureAtRest', 4),
(166, 'mu2eMuonDecayAtRest', 3956),
(167, 'mu2eCeMinusEndpoint', 28)]

As you can see, the total number of mu2eMuonDecayAtRest is similar, with a slight increase in 0d. This could be a result of decays below the chosen momentum cut, but there is also some chance of double counting in the current PU model. One of the goals moving forward is to remove that possibility. There are also a few more CeMinusEndpoints; whether these would pass selection cuts is not known. We see an increase in Compton events, Decay events and DIO (which includes Michel decays and DIFs), and a small number of capture backgrounds (4).
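
The comparison of the two process-code lists can be made explicit with a short script. This is only an illustration using the counts quoted above; the dictionary and function names are not part of any Mu2e tool.

```python
# Process-code counts copied from the Pass0b and Pass0d lists above.
PASS0B = {
    "compt": 71, "conv": 882, "Decay": 181, "eIoni": 24, "muIoni": 2269,
    "muPairProd": 143, "mu2ePrimary": 4684, "neutronInelastic": 2,
    "pi_PlusInelastic": 1, "DIO": 7, "muonNuclear": 4,
    "mu2eMuonDecayAtRest": 3801, "mu2eCeMinusEndpoint": 20,
}
PASS0D = {
    "compt": 176, "conv": 892, "Decay": 470, "eIoni": 39, "muIoni": 2285,
    "muPairProd": 139, "mu2ePrimary": 4681, "neutronInelastic": 1,
    "pi_MinusInelastic": 1, "DIO": 456, "muonNuclear": 4,
    "RadioactiveDecayBase": 1, "mu2eMuonCaptureAtRest": 4,
    "mu2eMuonDecayAtRest": 3956, "mu2eCeMinusEndpoint": 28,
}


def count_changes(before: dict, after: dict) -> dict:
    """Return after - before for every process seen in either sample."""
    names = sorted(set(before) | set(after))
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


if __name__ == "__main__":
    changes = count_changes(PASS0B, PASS0D)
    # Largest absolute changes first; unchanged processes are skipped.
    for name, delta in sorted(changes.items(), key=lambda kv: -abs(kv[1])):
        if delta:
            print(f"{name:24s} {delta:+d}")
```

The largest changes are in DIO, Decay, mu2eMuonDecayAtRest and compt, in line with the discussion above.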

Mock Dataset 1 (MDS1)

MDS1 will inherit from the MDC2020ae (Cosmics) and MDC2020ai primaries.

Several updates are made for MDS1:

CeEndpoint now including the leading log too;
DIO tail momentum cut moved to 75 MeV/c for triggered stream only;
CORSIKA generator used for cosmics;
PU streams upgraded (might move to pass2).

All will assume 1BB:

Tag	Processes	equiv livetime	Rmue
`MDS1a`	`CELL+DIO(95MeV/c)+CORSIKA`	`~1 week`	`1e-13`
`MDS1b`	`CELL+DIO(75MeV/c)+CORSIKA`	`~1 day`	`1e-13`
`MDS1c`	`CELL+DIO(95MeV/c)+CORSIKA`	`~1 month`	`0`
`MDS1d`	`CELL+DIO(95MeV/c)+CORSIKA`	`~ 4 months`	`1e-13`
`MDS1e`	`CELL+DIO(75MeV/c)+CORSIKA+PU`	`~ 1 day`	`0`
`MDS1f`	`CELL+DIO(95MeV/c)+CORSIKA+PU`	`~1 week`	`1e-13`
`MDS1g`	`CELL+DIO(95MeV/c)+CORSIKA+RPC`	`~1 week`	`1e-13`

Additional signals will follow

process	tag	events
`CeMLeadingLog`	`MDC2024a_sm4`	`800K`
`DIOtail (95MeV/c)`	`MDC2024a_sm4`	`very large`
`DIOtail_95 (95MeV/c)`	`MDC2020ai`	`1 year`
`DIOtail_75 (75MeV/c)`	`MDC2020ai`	`1 year`
`DIOtail (75MeV/c)`	`MDC2024a_sm3`	`1 week`
`CORSIKA`	`MDC2020ae`
`pile-up/stops`	`MDC2020p`	`-`

MDS1a

The config is as follows:

njobs= 50
rmue= 1e-13
dem_emin= 95
stops=  MDC2020p
cosmics= MDC2020ae
current= MDC2020ai
inputprimary= MDC2020ai
livetime= 623819
BB= 1BB
Tcycle= 1.33
POT_per_cycle= 4000000000000.0
onspilltime= 201493.537
NPOT= 1.8761473684210527e+18
CEMLL= 177.97751169106488
DIOfrac= 3.6370937564509995e-11
DIO= 41560.339992620764
CORSIKA= 201493.537
Mixed= No
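
The derived entries of this config can be cross-checked from the livetime. The sketch below assumes NPOT = livetime / Tcycle x POT_per_cycle, onspilltime = livetime x 0.323 (the 1BB on-spill fraction), CEMLL = NPOT x stops/POT x capture fraction x rmue and DIO = NPOT x stops/POT x decay fraction x DIOfrac. It uses the rounded constants quoted earlier on this page (1.56e-3, 0.39, 0.61), so NPOT and onspilltime agree with the config exactly while CEMLL and DIO agree to better than 1%; the production scripts presumably take the rates from the database. The function name is illustrative and this is not the actual normalization.py.

```python
def ensemble_normalization(livetime_s: float, rmue: float, dio_fraction: float,
                           t_cycle_s: float = 1.33,
                           pot_per_cycle: float = 4.0e12,
                           duty_factor: float = 0.323,
                           stops_per_pot: float = 1.56e-3,
                           decay_fraction: float = 0.39,
                           capture_fraction: float = 0.61) -> dict:
    """Expected counts for one ensemble, under the assumptions in the lead-in."""
    npot = livetime_s / t_cycle_s * pot_per_cycle
    stops = npot * stops_per_pot
    return {
        "NPOT": npot,
        "onspilltime": livetime_s * duty_factor,
        "CEMLL": stops * capture_fraction * rmue,
        "DIO": stops * decay_fraction * dio_fraction,
    }


if __name__ == "__main__":
    mds1a = ensemble_normalization(livetime_s=623819, rmue=1e-13,
                                   dio_fraction=3.6370937564509995e-11)
    for key, value in mds1a.items():
        print(f"{key}= {value:.6g}")
```

In the config above the CORSIKA entry equals onspilltime, i.e. the cosmic livetime to be sampled is the on-spill time.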

The values extracted from the ensemble log files are as follows (factoring in random sampling and efficiencies):

CeMLL 147
DIO 19913 (47% efficiency)
Cosmic 184863

Samples:

family	location
`dts`	`/pnfs/mu2e/tape/usr-sim/dts/sophie/ensembleMDS1a/MDC2020ai/art/`
`ensemble logs`	`/pnfs/mu2e/tape/usr-etc/cnf/sophie/ensembleMDS1a/MDC2020ai_logs/tar/`
`dig`	`/pnfs/mu2e/tape/phy-sim/dig/mu2e/ensembleMDS1aOnSpillTriggered/MDC2020ai_perfect_v1_3/art`
`mcs`	`/pnfs/mu2e/tape/phy-sim/mcs/mu2e/ensembleMDS1aOnSpillTriggered/MDC2020ai_perfect_v1_3/art`
`nts`	`/pnfs/mu2e/tape/phy-nts/nts/mu2e/ensembleMDS1aOnSpillTriggered/MDC2020ai_perfect_v1_3/root/1e/7e/nts.mu2e.ensembleMDS1aOnSpillTriggered.MDC2020ai_perfect_v1_3.0.root (v6)`

MDS1b

This is the first large trigger sample. The config is:

njobs= 700
cosmicjob= MDC2020ag
primaries= MDC2020ai
rmue= 1e-13
dem_emin= 75
stops=  MDC2020p
livetime= 87263.8
BB= 1BB
Tcycle= 1.33
POT_per_cycle= 4000000000000.0
onspilltime= 28186.207400000003
NPOT= 2.6244751879699248e+17
CEMLL= 24.89663505713476
DIOfrac= 4.188075916508229e-07
DIO= 66944463.209833
CORSIKA= 28186.207400000003
Mixed= No

The true numbers of events sampled into the dts files (after factoring in efficiency) are quoted below; the sampling introduces some randomness, so these differ from the expected values:

CeMLL : 0
DIO : 1.22e7 (efficiency = 18%)
Cosmic : 28245 (livetime)

Samples:

family	location
`dts`	`/pnfs/mu2e/tape/usr-sim/dts/sophie/ensembleMDS1b/MDC2020ai/`
`dig`	`/pnfs/mu2e/tape/phy-sim/dig/mu2e/ensembleMDS1bOnSpillTriggered or Triggerable /MDC2020ai_perfect_v1_3/art/ (56/70 files complete --> factor this into your studies)`
`mcs`	`/pnfs/mu2e/tape/phy-sim/mcs/mu2e/ensembleMDS1bOnSpillTriggered/MDC2020ai_perfect_v1_3/art/`

MDS1c

configuration:

njobs= 200
cosmicjob= MDC2020ae
primaries= MDC2020ai
rmue= 0
dem_emin= 95
stops=  MDC2020p
livetime= 2.49376e+06
BB= 1BB
Tcycle= 1.33
POT_per_cycle= 4000000000000.0
onspilltime= 805484.48
NPOT= 7.50003007518797e+18
CEMLL= 0
DIOfrac= 3.6370937564509995e-11
DIO= 166140.36036093475
CORSIKA= 805484.48
Mixed = No

The values extracted from the ensemble log files are as follows (factoring in random sampling and efficiencies):

DIO  81079
CeMLL  0
cosmic  746665

Samples:

family	location
`dts`	`/pnfs/mu2e/tape/usr-sim/dts/sophie/ensembleMDS1c/MDC2020ai/art/`
`dig`	`/pnfs/mu2e/tape/phy-sim/dig/mu2e/ensembleMDS1cOnSpillTriggered/art/`
`mcs`	`/pnfs/mu2e/tape/phy-sim/mcs/mu2e/ensembleMDS1cOnSpillTriggered/MDC2020ai_perfect_v1_3/art/`
`nts`	`/pnfs/mu2e/tape/phy-nts/nts/mu2e/ensembleMDS1cOnSpillTriggered/MDC2020ai_perfect_v1_3/root/47/ce/nts.mu2e.ensembleMDS1cOnSpillTriggered.MDC2020ai_perfect_v1_3.0.root`

MDS1d

Config:

njobs= 800
cosmicjob= MDC2020ae
primaries= MDC2020ai
rmue= 1e-13
dem_emin= 95
stops=  MDC2020p
livetime= 9.96697e+06
BB= 1BB
Tcycle= 1.33
POT_per_cycle= 4000000000000.0
onspilltime= 3219331.31
NPOT= 2.997584962406015e+19
CEMLL= 2840.3607712653018
DIOfrac= 3.6370937564509995e-11
DIO= 664023.7984034654
CORSIKA= 3219331.31
Mixed= No

The values extracted from the ensemble log files are as follows (factoring in random sampling and efficiencies):

total DIO  323889
total CeMLL  1592
total cosmic  2982244

Samples:

family	location
`dts`	`/pnfs/mu2e/tape/usr-sim/dts/sophie/ensembleMDS1d/MDC2020ai/art`
`dig`	`/pnfs/mu2e/tape/phy-sim/dig/mu2e/ensembleMDS1dOnSpillTriggered or Triggerable /MDC2020ai_perfect_v1_3/art/`
`mcs`	`/pnfs/mu2e/tape/phy-sim/mcs/mu2e/ensembleMDS1dOnSpillTriggered/MDC2020ai_perfect_v1_3/art/`
`nts`	`/pnfs/mu2e/tape/phy-nts/nts/mu2e/ensembleMDS1dOnSpillTriggered/MDC2020ai_perfect_v1_3/root/d3/6f/ (v6)`

MDS1e

This sample is MDS1b with standard pile-up mixed in. Only 640/700 files made it through the mixing due to memory limits. Factor this into your final understanding of the POT.
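
One simple way to account for the lost files is to scale the nominal POT by the fraction of files that survived. This is a sketch that assumes every file carries an equal share of the POT, which is an approximation; the function name is illustrative.

```python
def surviving_pot(nominal_pot: float, files_done: int, files_total: int) -> float:
    """Scale the nominal POT by the fraction of files that completed.

    Assumes each file represents the same number of POT.
    """
    if files_total <= 0 or not 0 <= files_done <= files_total:
        raise ValueError("need 0 <= files_done <= files_total and files_total > 0")
    return nominal_pot * files_done / files_total


if __name__ == "__main__":
    # NPOT from the MDS1b config, with 640 of 700 files mixed successfully.
    print(f"{surviving_pot(2.6244751879699248e17, 640, 700):.4e}")
```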

files:

family	location
`dts`	`see MDS1b`
`dig`	`/mu2e/tape/phy-sim/dig/mu2e/ensembleMDS1eMix1BBTriggered/MDC2020ai_perfect_v1_3/ and Triggerable`
`mcs`
`nts`

MDS1f

This is MDS1a + Mixing. There was a loss of files at the mixing stage so the results are expected to represent 90% of MDS1a.

family	location
`dts`	`see MDS1a`
`dig`	`/pnfs/mu2e/tape/phy-sim/dig/mu2e/ensembleMDS1fMix1BBTriggered/MDC2020ai_perfect_v1_3/art/ and Triggerable`
`mcs`	`/pnfs/mu2e/tape/phy-sim/mcs/mu2e/ensembleMDS1fMix1BBTriggered/MDC2020ai_perfect_v1_3/art/`
`nts`

Mock Dataset 2 (MDS2)

Here we add in the RPC/RMC streams and also provide positron samples ... TBC
