MDC2018Ensembles

Revision as of 16:53, 5 September 2019

Introduction

Fake data ensembles were assembled to mimic what a real dataset from the experiment would look like in order to allow tests of analyses. This means that background and signals are mixed together in a single file, and all events are unweighted. For more information on the construction of the ensembles, please refer to docdb 27037, 26271, 24376, 22693, 28381.

The scripts used to produce the simulations for the ensembles are all in the JobConfig/ensembles directory in Offline. The simulation was run using Offline v7_4_0, but the ensemble scripts come from a more recent version. The scripts used to build the mixed ensemble files are in the same directory: JobConfig/ensembles/genEnsemble.py randomly selects Rue, Rup, etc. for a given ensemble, and JobConfig/ensembles/run_si.py then creates and runs the fcl that mixes the various signals and backgrounds into a single art file. JobConfig/ensembles/normalizations.py contains the code used to calculate the normalization of the signal and backgrounds.

Analyses currently using the ensembles can be listed at https://docs.google.com/spreadsheets/d/1So88Z1RYXwGCEGFcapGpdnmrWGR8TfpurxMdsADZ7do/edit

Ensemble Datasets

The full set of datasets generated for MDC2018 is listed on the MDC2018 page (https://mu2ewiki.fnal.gov/wiki/MDC2018). An independent set of simulations for each signal was generated from which to build the ensembles (although the regular MDC2018 background frames were reused for mixing).

Currently, a single ensemble representing an approximately 1 week dataset is available. Art files are available in 'reco' format, with "ensemble-Data" files containing only the output of the reconstruction algorithms, and with "ensemble-MC" files containing reconstruction algorithm output as well as corresponding Monte Carlo truth information.

Files are located on tape at

 /pnfs/mu2e/tape/phy-sim/mcs/mu2e/ensemble-Data/MDC2018i/art

and

 /pnfs/mu2e/tape/phy-sim/mcs/mu2e/ensemble-MC/MDC2018i/art

(dataset names are mcs.mu2e.ensemble-Data.MDC2018i.art or mcs.mu2e.ensemble-MC.MDC2018i.art). Additionally, a tarball containing the randomized parameter values, settings, and scripts used to generate the final mixed ensemble file is uploaded as etc.mcs.ensemble-MC.MDC2018i.tgz.
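
The two tape locations above mirror the dataset names: each dot-separated field of the name becomes one directory level. A minimal Python sketch of that mapping follows. It is inferred only from the two examples on this page; the helper name tape_dir is hypothetical, and the /pnfs/mu2e/tape/phy-sim prefix is assumed to hold only for these two datasets.

```python
# Prefix copied from the two paths listed above; other datasets may live elsewhere.
TAPE_PREFIX = "/pnfs/mu2e/tape/phy-sim"


def tape_dir(dataset):
    """Map a five-field dataset name to its tape directory (hypothetical helper)."""
    fields = dataset.split(".")
    if len(fields) != 5:
        raise ValueError(f"expected 5 dot-separated fields, got {len(fields)}: {dataset!r}")
    # Each field of the dataset name becomes one directory level under the prefix.
    return "/".join([TAPE_PREFIX, *fields])


print(tape_dir("mcs.mu2e.ensemble-Data.MDC2018i.art"))
print(tape_dir("mcs.mu2e.ensemble-MC.MDC2018i.art"))
```

The length check is there so that a truncated or misspelled dataset name fails loudly instead of producing a plausible-looking path.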

Currently there are open and closed ensembles available. For the open ensemble, the true values of the randomized parameters (Rue, Rup, kMax, effective mean PBI) are all available, as are the ensemble-MC art files. For the closed ensembles, these parameters are hidden and only the ensemble-Data art files are currently available.

Although only one open ensemble was simulated, the backgrounds were reused to create 7 separate sets of files with different values of Rue and Rup. Because of the random sampling, the exact number of background events varies slightly from file to file, but the large majority of background events are identical between them. The different values are labelled by run number as follows:

Run #   Rue            Rup            kMax        Livetime (s)   Eff. Mean PBI
1       9.418019e-14   2.014332e-14   89.461158   410400         4.032551e7
2       0              0              89.461158   410400         4.032551e7
3       1e-14          1e-14          89.461158   410400         4.032551e7
4       2e-14          2e-14          89.461158   410400         4.032551e7
5       4e-14          4e-14          89.461158   410400         4.032551e7
6       8e-14          8e-14          89.461158   410400         4.032551e7
7       1.6e-13        1.6e-13        89.461158   410400         4.032551e7
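
For scripted comparisons between runs, it can help to hold the open-ensemble table as a lookup keyed by run number. The sketch below only transcribes the values listed above. The names OPEN_RUNS and signal_rates are hypothetical and are not part of Offline.

```python
# Parameters shared by all seven open-ensemble runs, transcribed from the table above.
KMAX = 89.461158
LIVETIME_S = 410400
EFF_MEAN_PBI = 4.032551e7

# run number -> (Rue, Rup)
OPEN_RUNS = {
    1: (9.418019e-14, 2.014332e-14),
    2: (0.0, 0.0),
    3: (1e-14, 1e-14),
    4: (2e-14, 2e-14),
    5: (4e-14, 4e-14),
    6: (8e-14, 8e-14),
    7: (1.6e-13, 1.6e-13),
}


def signal_rates(run):
    """Return (Rue, Rup) for an open-ensemble run; raises KeyError for any other run."""
    return OPEN_RUNS[run]


# Runs in which both signal rates are zero, i.e. background-only files.
background_only = [run for run, (rue, rup) in OPEN_RUNS.items() if rue == 0 and rup == 0]
print(background_only)
```

Only run 2 has both rates set to zero, so it can serve as the background-only reference when comparing against the other six runs.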

Additionally, two closed ensembles are available: a one-week sample and a one-month sample. These samples are statistically independent of the open samples above.

Run #   Rue      Rup      kMax     Livetime (s)   Eff. Mean PBI
1001    hidden   hidden   hidden   410400         hidden
1004    hidden   hidden   hidden   1641600        hidden
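
As a quick check of the livetimes in the table above, the conversion from seconds to days can be done as follows. This is plain arithmetic on the tabulated values. How livetime relates to calendar running time is not stated on this page, so no duty factor is assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# Closed-ensemble livetimes in seconds, transcribed from the table above.
livetimes_s = {1001: 410400, 1004: 1641600}

for run, seconds in livetimes_s.items():
    print(run, seconds / SECONDS_PER_DAY, "days of livetime")

# Ratio of the longer sample's livetime to the shorter one's.
ratio = livetimes_s[1004] / livetimes_s[1001]
print(ratio)
```

Run 1001 corresponds to 4.75 days of livetime and run 1004 to 19 days, a factor of exactly four. Run 1001 therefore has the same livetime as the open ensemble runs.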

Encrypting and decrypting closed ensemble data

An RSA public/private key pair was created on the Fermilab machines in the mu2epro account using

 gpg --gen-key

The public key id is 6827CEA8 and the private key id is C8268954. The recipient was set to "Richie Bonventre <rbonventre@lbl.gov>", and the private key was password protected with the default mu2e docdb password.

The private key was exported using

 gpg --export-secret-keys C8268954 > mu2eSecretKey.asc

Afterwards, the secret key was deleted from the keyring using

 gpg --delete-secret-key C8268954

The public key remains and so any file can be encrypted from the mu2epro account using

 gpg --output myfile.enc --encrypt --recipient 6827CEA8 myfile

The file mu2eSecretKey.asc has been saved on several USB drives. To decrypt the files, copy mu2eSecretKey.asc to a Fermilab machine, then:

 gpg --import --no-default-keyring --secret-keyring temporary mu2eSecretKey.asc
 gpg --no-default-keyring --secret-keyring temporary --trust-model always --output myfile --decrypt myfile.enc
 rm ~/.gnupg/temporary