Data Products and Processing Tutorial
Tutorial Session Goal
This tutorial explores the data products used in Mu2e and the modules and algorithms that create them. It is part of the June 2019 Computing and Software tutorial.
Session Prerequisites and Advance Preparation
This tutorial assumes knowledge of art and the Mu2e detector. You will need to understand basic principles of how modules and event processing function in art. You will need to understand C++ data structures and fundamental types. You should have completed the following tutorials:
- Mu2e detector overview
- Mu2e_Offline_Tutorial
- Running_Art_Tutorial
Session Introduction
The information content of Mu2e is stored in the form of art data products. There are several levels of information:
- Monte Carlo generator information
- Geant4 information
- Digitized detector data, or digis (Offline format)
- Reconstructed data
We will explore a few of these, and the algorithms which create them.
Exercises
General
On your machine, set up an area for this tutorial and launch Docker interactively with a link to it. The docker command below is for macOS; replace /Users/brownd with your own home directory, and modify the DISPLAY setting as needed for Windows or Linux (see Docker for instructions).
> cd $HOME
> mkdir Tutorials
> mkdir Tutorials/DataExploration
> docker run -it --rm -v /Users/brownd/Tutorials/DataExploration:/home/DataExploration -e DISPLAY=$ip:0 mu2e/user:tutorial_1-02
Inside the docker window, set up a satellite release for the data exploration exercises:
[root@80c41be82418 home]# source /Tutorials_2019/setup_container.sh
[root@80c41be82418 home]# cp -r /Tutorials_2019/DataExploration/* /home/DataExploration/
[root@80c41be82418 home]# cd /home/DataExploration/
[root@80c41be82418 DataExploration]# $TUTORIAL_OFFLINE/v7_4_1/SLF6/prof/Offline/bin/createSatelliteRelease --directory .
[root@80c41be82418 DataExploration]# ls
[root@80c41be82418 DataExploration]# source setup.sh
[root@80c41be82418 DataExploration]# scons
Monte Carlo Generators
- Mu2e generators and GenParticle class
Geant4 and Detector Simulation
- The G4 Mu2e Detector description text files
- Examine the SimParticle and StepPointMC classes
- Virtual detectors
Digitized signals
The term 'digi' refers to the digitized detector data stored during Mu2e operations by the Data Acquisition (DAQ) system.
Exercise 1: Tracker digis
Histogram the number of digis in a pure μ- → e- conversion sample:
> mu2e -c Examples/fcl/Ex01.fcl $TUTORIAL_DATA/dig.mu2e.CeEndpoint.MDC2018b.001002_00000001.art
> root -l Ex01_CeE.root
root [1] DE->Get("NStrawDigis")->Draw();
You should see ~40 StrawDigis/event on average. Now try with a μ- → e- conversion sample with beam backgrounds mixed in:
> mu2e -c Examples/fcl/Ex01.fcl $TUTORIAL_DATA/dig.mu2e.CeEndpoint-mix.MDC2018d.001002_00000000.art
You should see around 2300 StrawDigis/event. The signal-to-noise ratio for raw data is below 2%! This is why we need background rejection and pattern recognition.
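The figure below 2% follows directly from the two averages quoted above: 40 / 2300 is roughly 1.7%. A minimal sketch of that arithmetic, where the function name signalFraction is a hypothetical helper and not part of Offline:

```cpp
#include <cassert>

// Fraction of digis that come from the conversion electron, given the
// average digi counts per event (about 40 signal-only, about 2300 mixed).
// signalFraction is a hypothetical helper for illustration only.
double signalFraction(double nSignalDigis, double nTotalDigis) {
  assert(nTotalDigis > 0.0);  // guard against division by zero
  return nSignalDigis / nTotalDigis;
}
```

Calling signalFraction(40.0, 2300.0) returns a value just above 0.017.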
Question: what is the format of the data collection file name, what are its fields, and what do they mean? Hint: use the Mu2e wiki!
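As a starting point, the file names used above can be split on the dots. The sketch below assumes a six-field layout (tier, owner, description, configuration, sequencer, format); these field names are an assumption to be checked against the wiki, and Mu2eFileName and parseFileName are hypothetical names used only here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Field names are an assumption; verify them against the Mu2e wiki.
struct Mu2eFileName {
  std::string tier, owner, description, configuration, sequencer, format;
};

// Split a file name on '.' and map the pieces onto the six assumed fields.
// Returns false if the name does not have exactly six dot-separated fields.
bool parseFileName(const std::string& name, Mu2eFileName& out) {
  std::vector<std::string> fields;
  std::stringstream ss(name);
  std::string field;
  while (std::getline(ss, field, '.')) fields.push_back(field);
  if (fields.size() != 6) return false;
  out = {fields[0], fields[1], fields[2], fields[3], fields[4], fields[5]};
  return true;
}
```

For dig.mu2e.CeEndpoint.MDC2018b.001002_00000001.art this yields dig, mu2e, CeEndpoint, MDC2018b, 001002_00000001 and art; what each field means is left as the exercise.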
Now look at the TDC and ADC spectra:
> root -l Ex01_CeE.root
root [] DE->Get("tdc")->Draw();
root [] DE->Get("deltatdc")->Draw();
root [] DE->Get("tot")->Draw();
root [] DE->Get("adc")->Draw();
The histograms will not have the correct range. By looking at Mu2e doc 4914, figure out what the ranges should be and correct the histograms marked with FIXME!. You can see how the values are accessed from the data product. Use your favorite editor (vim, emacs, ...) to edit the file.
> vim Examples/src/DataExplorer_module.cc ...
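Once the bit width of each field is known from the packet definition, the histogram range follows from it: an unsigned field of n bits takes values from 0 to 2^n - 1. The sketch below is a generic helper under that assumption; HistRange and rangeForBits are hypothetical names, and the bit widths themselves must come from doc 4914, not from this example.

```cpp
#include <cassert>
#include <cstdint>

// Axis description for a 1D histogram (hypothetical helper type).
struct HistRange {
  double low;
  double high;
  int nbins;
};

// Range covering every value of an unsigned field that is `bits` wide,
// with one bin per count and bin edges at half-integers.
HistRange rangeForBits(unsigned bits) {
  assert(bits > 0 && bits < 31);  // keep the bin count inside an int
  const std::uint32_t nvalues = std::uint32_t{1} << bits;
  return {-0.5, static_cast<double>(nvalues) - 0.5, static_cast<int>(nvalues)};
}
```

For a placeholder width of 8 bits, rangeForBits(8) gives 256 bins from -0.5 to 255.5. For wide fields you may prefer fewer bins over the same range.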
Questions: what is the physical meaning of deltatdc? tot? cal and hv?
Exercise 2: Calo digis
In this exercise, we'll explore calo crystal and calo cluster digis. Supporting slides are available in doc-db 26766 [1].
We will start with a few questions:
1) Which data product contains crystal hits? How can you find the time / energy of a hit?
Answer: The data product is CaloCrystalHit, described in RecoDataProduct/inc/CaloCrystalHit.hh. The member functions time() and energy() give the corresponding information.
2) Which modules produce calorimeter clusters? What is the difference between them? Which data product should you use?
Answer: CaloProtoClusterFromCrystalHits and CaloClusterFromProtoCluster. CaloProtoClusterFromCrystalHits forms simply connected clusters from calorimeter hits. CaloClusterFromProtoCluster combines proto clusters close in time / distance into final clusters. You should use CaloClusters, unless you want to study how the proto-clusters are merged together.
3) Which data member indicates whether a cluster contains several proto-clusters or a single one?
Answer: The boolean variable isSplit is true if the cluster contains several proto-clusters.
4) How do I access the list of crystal hits contained in a cluster?
Answer: The caloCrystalHitsPtrVector is a vector containing a list of art::Ptr to the CaloCrystalHits.
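The answers above can be tied together in a small standalone sketch. It does not use the Offline classes: CrystalHit and Cluster are simplified stand-ins for CaloCrystalHit and CaloCluster, std::shared_ptr plays the role of art::Ptr, and mostEnergeticHit is a hypothetical helper. Only time(), energy(), isSplit and the idea of a vector of hit pointers come from the answers.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for CaloCrystalHit: only the two accessors named above.
class CrystalHit {
public:
  CrystalHit(double time, double energy) : time_(time), energy_(energy) {}
  double time() const { return time_; }
  double energy() const { return energy_; }
private:
  double time_;
  double energy_;
};

// std::shared_ptr stands in for art::Ptr in this sketch.
using HitPtr = std::shared_ptr<const CrystalHit>;

// Stand-in for CaloCluster.
struct Cluster {
  std::vector<HitPtr> hits;  // plays the role of caloCrystalHitsPtrVector
  bool isSplit = false;      // true if built from several proto-clusters
};

// Return the most energetic hit in the cluster, or a null pointer if the
// cluster has no hits.
HitPtr mostEnergeticHit(const Cluster& cluster) {
  auto it = std::max_element(
      cluster.hits.begin(), cluster.hits.end(),
      [](const HitPtr& a, const HitPtr& b) { return a->energy() < b->energy(); });
  return it == cluster.hits.end() ? HitPtr{} : *it;
}
```

In a real module the same loop would run over the art::Ptr entries of the cluster; the pointer is dereferenced with -> in the same way.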
Now that you are all warmed up, we'll make a few plots (I know, this is getting so exciting!). First run the following snippet to produce the required data, then load the TTree in memory. The TTree name is DumpCaloDigis/Calo.
> mu2e -c Examples/fcl/Ex02.fcl $TUTORIAL_DATA/dig.mu2e.CeEndpoint.MDC2018b.001002_00000001.art
> root -l ExploreCaloDigis.root
> TTree *calo = (TTree*) _file0->Get("DumpCaloDigis/Calo")
What is available in this ntuple?
Hint: Look at the file Examples/src/ExploreCaloDigis_module.cc and the corresponding data products. Most of the names are self-explanatory, but a few others are more cryptic!
Tip: you can use the TBrowser to inspect the content of a file. Simply type:
> TBrowser tb;
Now histogram the energy of the crystal hits (switching to log scale is a good idea here):
> calo->Draw("cryEnergyDep")
You should see a rapidly falling distribution, as most hits are low energy. Now let's plot the crystal hits only in the second disk (first disk ID=0, second disk ID=1):
> calo->Draw("cryEnergyDep","cryDiskId==1")
Can you histogram the position of each crystal hit?
> calo->Draw("cryPosY:cryPosX","","box")
You should see 674 boxes... the bigger the box, the larger the number of hits in that crystal. As expected, there are more hits in the central region.
You should be on fire at this point, so we'll look at the clusters. Let's plot the number of crystals in each cluster:
> calo->Draw("cluNumCrystals")
Next, draw the energy of all clusters whose center-of-gravity radius is greater than 400 mm, then of those whose radius is less than 400 mm:
> calo->Draw("cluEnergy","sqrt(cluCogX**2+cluCogY**2)>400")
> calo->Draw("cluEnergy","sqrt(cluCogX**2+cluCogY**2)<400")
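The same radial selection can be written in C++, for instance inside an analysis module. This is a minimal sketch under stated assumptions: ClusterSummary, cogRadius and selectEnergies are hypothetical names, and the three fields mirror the cluCogX, cluCogY and cluEnergy branches used in the selection.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-cluster record mirroring cluCogX, cluCogY and cluEnergy.
struct ClusterSummary {
  double cogX;    // center-of-gravity x, in mm
  double cogY;    // center-of-gravity y, in mm
  double energy;  // cluster energy
};

// Transverse radius of the cluster center of gravity.
double cogRadius(const ClusterSummary& c) { return std::hypot(c.cogX, c.cogY); }

// Energies of clusters with radius strictly above rCut (outer = true) or
// strictly below rCut (outer = false), matching the two strict cuts.
std::vector<double> selectEnergies(const std::vector<ClusterSummary>& clusters,
                                   double rCut, bool outer) {
  std::vector<double> energies;
  for (const auto& c : clusters) {
    const double r = cogRadius(c);
    if (outer ? r > rCut : r < rCut) energies.push_back(c.energy);
  }
  return energies;
}
```

A cluster sitting exactly at the cut radius is excluded from both selections, as it is with the two strict inequalities.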
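There is a lot of noise below 400 mm! What about clusters in disk 0 and disk 1? Can we plot both on the same plot? Keep in mind that cluIsSplit distinguishes clusters built from a single proto-cluster from those built from several; to compare disks, select on the cluster disk branch instead (calo->Print() lists the available branches).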
> calo->Draw("cluEnergy","cluIsSplit==0")
> calo->Draw("cluEnergy","cluIsSplit==1","same")
As expected, disk 1 is cleaner!
Bonus: If you feel audacious, try to write an analysis module to do the following:
- Plot the energy and time of all crystal hits in the microbunch.
- Plot the energy of all clusters with a radial location greater than 400 mm.
- Plot the energy of clusters containing a single proto-cluster or several proto-clusters in two separate histograms.
- Plot the energy of the most energetic hit in the cluster.
An implementation is shown in Examples/src/ExploreCaloDigis_module.cc.
Hit Reconstruction
- Track reconstruction algorithms and data products
- Hit Reconstruction
- Time Clusters
- Helices
- Kalman Fit
- Calorimeter reconstruction algorithms and data products
- CRV reconstruction algorithms and data products
Reference Materials
- Mu2e doc 4914 Packet Definition
- Mu2e doc 22693 Mock Data Challenge 2018
Glossary of Raw and Reconstructed Data Products
class | description | contents |
---|---|---|
StrawDigi | Offline format of a single Tracker hit | TDC and TOT from both straw ends, ADC waveform |
ComboHit | Calibrated Tracker hit, or an aggregate of several hits | position in space, time, and time differences |
TimeCluster | Collection of ComboHits nearby in time and (roughly) space | average time and error |
HelixSeed | Helix interpretation of a subset of hits in a TimeCluster | Helix parameters, t0, ComboHits with position along the helix |
KalRep | Full Kalman filter fit result: not persistable | Complete set of weight and parameter matrices and vectors used in the fit |
KalSeed | Compact summary of the Kalman filter fit result | Sampled fit segments, associated straw hits and straws |
KalSegment | KalSeed component: local fit result | Fit parameters and covariance at a particular point |
TrkStrawHitSeed | KalSeed component: straw hit as used in fit | hit position, residual, time, drift radius, errors, ... |
TrkStraw | KalSeed component: straw intersected by the fit | strawID, DOCA to wire, radiation length, energy loss, ... |
CaloCluster | Cluster of calorimeter crystal energy deposits | Total energy, center of gravity (COG), energy moments |
CrvCoincidenceCluster | Cluster of adjacent CRV reco pulses | position, PE count, start and end times |
Glossary of Principal Reconstruction Modules
module | category | description |
---|---|---|
StrawDigisFromStepPointMCs | Simulation | Converts G4 straw energy deposits into StrawDigis |
StrawHitReco | Reconstruction | Converts StrawDigis into single-straw ComboHits |
CombineStrawHits | Reconstruction | Combines adjacent ComboHits in a panel into aggregate ComboHits |
FlagBkgHits | Reconstruction | Identify (flag) panel ComboHits likely produced by low-energy Compton or delta-ray electrons |
TimeClusterFinder | Reconstruction | Group time-adjacent panel ComboHits (and calorimeter cluster if available) into a cluster |
RobustHelixFinder | Reconstruction | Fit a cluster of panel ComboHits to a simple helix using space-point positions |
CalTimePeakFinder | Reconstruction | Group panel ComboHits near a calorimeter cluster in time into a cluster |
CalHelixFinder | Reconstruction | Fit the calorimeter cluster position, target position and panel ComboHits to a simple helix |
KalSeedFit | Reconstruction | Fit single-straw transverse wire positions to a helix, using a simple helix as starting point |
KalFinalFit | Reconstruction | Kalman filter fit of single-straw drift ellipses, constrained with calorimeter cluster time (if present) |