TrkAnaTutorial

From Mu2eWiki
Revision as of 00:06, 30 May 2019 by Edmonds (talk | contribs)

Under Construction!

This tutorial is currently being written

Tutorial Session Goal

A TrkAna tree is a ROOT TTree where each entry in the tree represents a single track. The TrkAna tree is created by the TrackAnalysisReco module of Mu2e Offline, which runs over a KalSeedCollection.

In this tutorial you will:

  • create TrkAna trees using the Mu2e Offline software; and,
  • analyze them using the ROOT command line and ROOT macros.

Session Prerequisites

This tutorial is aimed at anyone starting out with ntuple analysis.

Before starting this tutorial you should:

  • know about the physics of Mu2e;
  • have the appropriate docker container set up; and,
  • know how to run the Mu2e Offline software and ROOT.

Basic Exercises

Exercise 1: Creating a simple TrkAna tree

In this exercise, we will create a simple TrkAna tree and investigate it with the ROOT command line.

  1. First, run mu2e on a single CeEndpoint-mix reco art file:

       > mu2e -c $TUTORIAL_BASE/TrkAna/fcl/TrkAnaTutEx01.fcl -S $TUTORIAL_BASE/TrkAna/filelists/mcs.mu2e.CeEndpoint-mix-cat.MDC2018h.1-file.lst

  2. Now let's have a look at the TrkAna tree with the ROOT command line:

       > root -l trkana-ex01.root
       root[n]: TrkAnaEx01->cd()
       root[n]: trkana->Print()

     You will see the TrkAna tree structure. Here is a brief description of the branches:
    1. evtinfo: event level information (e.g. event ID of the event this track is from)
    2. hcnt: hit count of different types of hit (e.g. number that pass certain collections)
    3. tcnt: track count of different track types
    4. trk: global fit information for the track (e.g. fit status, ranges of validity, number of hits, track quality)
    5. trk(ent/mid/xit): local fit information for the track at the entrance of the tracker, the middle of the tracker and the exit of the tracker (e.g. fit momentum, pitch angle)
    6. trktch: calorimeter hit information for the calorimeter cluster associated to the track (tch = TrkCaloHit)
    7. crvinfo: information on associated hits in the CRV

     Note that the "trk" parts of the branch names are configurable; you will see this in a minute.

  3. Now we can plot some simple things:
    1. the track momentum at the tracker entrance:

         root[n]: trkana->Draw("trkent.mom")

    2. the calorimeter cluster energy:

         root[n]: trkana->Draw("trktch.edep")

     With this last command you will see some entries at -1000. This means that there is no associated calorimeter cluster for this track. To exclude these we want to add a cut on the trktch.active flag (0 = there is no TrkCaloHit, 1 = there is a TrkCaloHit):

         root[n]: trkana->Draw("trktch.edep", "trktch.active==1")

  4. Let's take a quick look at the fcl file to see how the TrackAnalysisReco module has been configured. Open it up in your favourite text editor and look at these important lines:

       TrkAnaEx01 : { @table::TrackAnalysisReco }
       physics.analyzers.TrkAnaEx01.candidate.input : "KFFDeM"
       physics.analyzers.TrkAnaEx01.candidate.branch : "trk"
       physics.analyzers.TrkAnaEx01.diagLevel : 0
       physics.analyzers.TrkAnaEx01.FillMCInfo : false

     In order, these lines:
    1. import an example TrkAna module configuration (you can find it in $MU2E_BASE_RELEASE/TrkDiag/fcl/prolog.fcl);
    2. define the input KalSeedCollection that we want a TrkAna tree for (KFFDeM = KalFinalFit Downstream eMinus);
    3. configure the name of the output branches;
    4. set TrkAna to use the lowest diagnostic level (0 = simple list of tracks, 1 = hit level diagnostics); and,
    5. make sure we are not touching the MC truth.
  5. (Optional): Run on a CeplusEndpoint-mix file ($TUTORIAL_BASE/TrkAna/filelists/mcs.mu2e.CeplusEndpoint-mix-cat.MDC2018h.1-file.lst) and get a list of positively-charged tracks. What is the momentum of these tracks?
  6. (Optional): Create a second instance of the TrackAnalysisReco module. Have one instance set to look at negatively-charged tracks and the other set to look at positively-charged tracks. Run on flatmugamma-mix ($TUTORIAL_BASE/TrkAna/filelists/mcs.mu2e.flatmugamma-mix-cat.MDC2018h.1-file.lst) and count how many tracks of each type are found.

Exercise 2: Calculating the Ce efficiency

Now that we can create a TrkAna tree, let's calculate how efficient we are at reconstructing conversion electrons with some signal cuts. Before starting this exercise, here are some quick notes about event counting, event weighting, and track quality.

Event Counting

In the simulation, we generate a certain number of events. However, in order to save space, we only write out events that will produce a reconstructed track. We have various ways of filtering events but the result is the same: the number of events in the output art files does not correspond to the number of events that were generated, which is what we need to calculate absolute efficiencies. To account for this, we keep track of the number of events that were generated by creating a GenEventCount object. Then we can run the genCountLogger module to read the actual number of generated events.
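As a minimal sketch of why the generated count matters, the snippet below compares the two possible denominators. The struct and function names are hypothetical and are not part of Mu2e Offline; the counts would come from genCountLogger and from your own analysis.

```cpp
#include <cassert>

// Hypothetical bookkeeping for one simulated sample (illustrative names only).
struct SampleCounts {
  double nGenerated;  // events generated, as recorded via GenEventCount
  double nWritten;    // events that survived filtering and were written out
  double nSelected;   // tracks that pass the analysis cuts
};

// Absolute efficiency: normalise to the number of generated events.
double absoluteEfficiency(const SampleCounts& c) {
  assert(c.nGenerated > 0);
  return c.nSelected / c.nGenerated;
}

// Efficiency relative to the filtered output only. Because nWritten is
// never larger than nGenerated, this overestimates the absolute efficiency.
double filteredEfficiency(const SampleCounts& c) {
  assert(c.nWritten > 0);
  return c.nSelected / c.nWritten;
}
```

Dividing by the number of events found in the art file gives the second quantity, which is why the tutorial reads the generated count back instead.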

Event Weighting

In each "mixed" event, we add a single "primary" particle onto a set of "background frames", which represent the background hits from other processes. We want to simulate the variable intensity of the proton beam at the production target and so we scale the number of background hits when we create the mixed event. However, we still only add a single primary particle and so we record the scale factor used in a ProtonBunchIntensity object for use later. In this exercise, we will add a new module to the trigger path (PBIWeight), which translates the scale factor used for the proton bunch intensity into an EventWeight object. The TrackAnalysisReco module then writes out these event weight values to a new branch (evtwt.PBIWeight). Event weighting is explored in more detail in Advanced Exercise #3.
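When a TTree::Draw selection evaluates to a number other than 0 or 1, ROOT uses that number as the weight of the entry. The following stdlib-only sketch shows the equivalent arithmetic for a weighted track count; the struct and function are hypothetical stand-ins, not TrkAna or ROOT types.

```cpp
#include <vector>

// One TrkAna entry reduced to what matters here (hypothetical struct).
struct WeightedTrack {
  double pbiWeight;  // stands in for the evtwt.PBIWeight branch value
  bool passesCuts;   // outcome of the selection for this track
};

// Sum of weight * (cuts) over all tracks: a track that fails the cuts
// contributes 0, a track that passes contributes its event weight.
double weightedYield(const std::vector<WeightedTrack>& tracks) {
  double sum = 0.0;
  for (const auto& t : tracks) {
    sum += t.pbiWeight * (t.passesCuts ? 1.0 : 0.0);
  }
  return sum;
}
```

The weighted yield, rather than the raw number of entries, is what a histogram integral returns once the weight is included in the selection.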

Track Quality

We want a simple way to determine how well-reconstructed the tracks are. We use an artificial neural network (ANN) called TrkQual that takes various properties of the track and is trained to give each track a trkqual value between 0 (poorly-reconstructed) and 1 (well-reconstructed). In this exercise, we add the TrkQual modules to the trigger path and TrkAna writes out the output value to trk.trkqual. There is one TrkQualCollection for each KalSeedCollection. TrkQual is explored in more detail in Advanced Exercise #2.

The Exercise

Now onto the exercise:

  1. Create a TrkAna tree with CeEndpoint-mix tracks and include the genCountLogger, PBIWeight and TrkQual modules:

       > mu2e -c $TUTORIAL_BASE/TrkAna/fcl/TrkAnaTutEx02.fcl -S $TUTORIAL_BASE/TrkAna/filelists/mcs.mu2e.CeEndpoint-mix-cat.MDC2018h.1-file.lst

     You can read the file to see how the modules are added (note: some of this is hidden away in $MU2E_BASE_RELEASE/TrkDiag/fcl/prolog.fcl).
  2. Run this example ROOT macro that plots the track momentum onto a histogram with 0.5 MeV wide bins:

       > root -l $TUTORIAL_BASE/TrkAna/scripts/TrkAnaTutEx02.C

  3. Change the bin width to 0.05 MeV.
  4. Add the following signal cuts to the Draw function:
    1. the fit is successful (trk.status > 0)
    2. the track is in the time window of 700 ns to 1695 ns (trk.t0)
    3. the tan-dip of the track is consistent with coming from the target: 0.577350 to 1.000 (trkent.td)
    4. the impact parameter of the track is consistent with coming from the target: -80 mm to 105 mm (trkent.d0)
    5. the maximum radius of the track is OK: 450 mm to 680 mm (trkent.d0 + 2./trkent.om)
    6. the track is of good quality (trk.trkqual > 0.8)
  5. Because we simulated each event with a different proton bunch intensity, each track should be weighted by the PBIWeight. To do this you will want to modify the cut command to add the event weighting:

       evtwt.PBIWeight*(cuts)

  6. Now we can count the number of tracks that pass all these cuts:

       hRecoMom->Integral()

  7. We can also integrate in the momentum signal region with the same function. Be careful: TH1F::Integral takes bin numbers as its arguments and not x-values. Some hints:
    1. You can find the bin for a given x-value with hist->GetXaxis()->FindBin(x-value).
    2. Make sure you aren't off by one bin by checking the bin low edges and bin high edges with TAxis::GetBinLowEdge() and TAxis::GetBinUpEdge().
  8. To calculate the efficiency you need to know the number of events generated for this simulation. This is stored in the output of the genCountLogger module:

       TH1F* hNumEvents = (TH1F*) file->Get("genCountLogger/numEvents");
       double n_generated_events = hNumEvents->GetBinContent(1);

  9. Now you can calculate the absolute Ce efficiency. What's the answer?
  10. (Optional): Make the plot prettier by:
    1. adding appropriate axis labels
    2. writing the Ce efficiency on the plot with a TLatex
    3. adding dashed lines to show the momentum window with TLines (extra bonus: have the lines move when the momentum window values change)
  11. (Optional): Add a second momentum plot but with a higher track quality cut. Include a TLegend.
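The off-by-one hint in the integration step can be made concrete with a small stand-in for a fixed-width axis. This is a sketch that follows ROOT's bin numbering (0 is underflow, 1 to nbins are the regular bins, nbins+1 is overflow); UniformAxis and lastBinBelow are hypothetical names, not ROOT classes or functions.

```cpp
#include <cassert>

// Minimal stand-in for a fixed-width histogram axis (not ROOT's TAxis).
struct UniformAxis {
  int nbins;
  double xlow;
  double xup;

  double binWidth() const { return (xup - xlow) / nbins; }

  // ROOT-style numbering: 0 = underflow, 1..nbins = regular, nbins+1 = overflow.
  int findBin(double x) const {
    if (x < xlow) return 0;
    if (x >= xup) return nbins + 1;
    return 1 + static_cast<int>(nbins * (x - xlow) / (xup - xlow));
  }

  double binLowEdge(int bin) const { return xlow + (bin - 1) * binWidth(); }
  double binUpEdge(int bin) const { return xlow + bin * binWidth(); }
};

// Last bin to include when integrating up to xmax (assumed to lie inside
// the axis range). If xmax sits exactly on a bin's low edge, findBin
// returns the bin that starts at xmax, so step back by one.
int lastBinBelow(const UniformAxis& axis, double xmax) {
  assert(axis.nbins > 0 && axis.xup > axis.xlow);
  int bin = axis.findBin(xmax);
  if (axis.binLowEdge(bin) >= xmax) --bin;
  return bin;
}
```

For an axis with 20 bins from 100 to 110, findBin(105.0) returns 11, which is the bin that starts at 105.0. Because TH1::Integral includes both of the bins it is given, a window that ends at 105.0 should stop at bin 10, which is what lastBinBelow returns.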

Exercise 3: Adding MC truth

In this exercise, we will add MC truth information to the TrkAna tree and see how closely our reconstruction matches the truth.

  1. Create a TrkAna tree with CeEndpoint-mix tracks and include the MC truth information:

       > mu2e -c $TUTORIAL_BASE/TrkAna/fcl/TrkAnaTutEx03.fcl -S $TUTORIAL_BASE/TrkAna/filelists/mcs.mu2e.CeEndpoint-mix-cat.MDC2018h.1-file.lst

     If you look in the fcl file, you will notice that there is a global switch (fillMCInfo) and a local switch for the candidate track (candidate.fillMC). This will be important when we add supplemental tracks in Exercise 5.
  2. You can open up the file and Print the tree structure. For this exercise, we will focus on the trkmcent branch, which contains the MC information for the step that crossed the entrance into the tracker. The other trkmc* branches will be the focus of Exercise 4.
  3. Run this example ROOT macro that plots the intrinsic tracker momentum resolution for tracks that pass our signal cuts except for the trkqual and momentum window cuts:

       > root -l $TUTORIAL_BASE/TrkAna/scripts/TrkAnaTutEx03.C

  4. Add the following line to the start of the macro:

       gStyle->SetOptStat(111111);

     This will add the "Overflow" and "Underflow" values of the histogram to the stats box. These show the number of events that fall outside of the axis range of the histogram. In ROOT the overflow bin is bin number n_bins+1 and the underflow bin is bin number 0.
  5. Using what you learned in Exercise 2, count the number of events above +3 MeV, including the events in the overflow bin.
  6. Add the trkqual cut and play with it. How does the number of events above +3 MeV change?
  7. (Optional): Do the same comparison at the middle or exit of the tracker.
  8. (Optional): Perform a fit to a double-sided crystal ball function. This function is a Gaussian with polynomial tails. The function can be found here: $TUTORIAL_BASE/TrkAna/scripts/dscb.h. You will need to create your own TF1 and use the TH1::Fit() function. What is the core resolution?
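For readers who have not met the double-sided crystal ball shape before, here is a sketch of one common parameterisation: a Gaussian core joined smoothly to a power-law tail on each side. It is an assumption that dscb.h uses the same form; the struct, the parameter names and their order are hypothetical and may differ from the tutorial's header.

```cpp
#include <cmath>

// Parameters of a double-sided crystal ball shape (hypothetical struct;
// the parameter order in dscb.h may differ).
struct DscbParams {
  double norm;     // peak height
  double mean;     // peak position
  double sigma;    // core width, i.e. the "core resolution"
  double alphaLo;  // > 0, where the low-side tail starts, in units of sigma
  double nLo;      // power of the low-side tail
  double alphaHi;  // > 0, where the high-side tail starts, in units of sigma
  double nHi;      // power of the high-side tail
};

// Power-law tail evaluated at a distance dist >= alpha from the mean (in
// units of sigma). The constants a and b make the value and the slope
// match the Gaussian core at dist == alpha.
double dscbTail(double dist, double alpha, double n) {
  const double a = std::pow(n / alpha, n) * std::exp(-0.5 * alpha * alpha);
  const double b = n / alpha - alpha;
  return a * std::pow(b + dist, -n);
}

double dscb(double x, const DscbParams& p) {
  const double t = (x - p.mean) / p.sigma;
  if (t < -p.alphaLo) return p.norm * dscbTail(-t, p.alphaLo, p.nLo);
  if (t > p.alphaHi) return p.norm * dscbTail(t, p.alphaHi, p.nHi);
  return p.norm * std::exp(-0.5 * t * t);  // Gaussian core
}
```

To use a shape like this in a fit, it would typically be wrapped in a function with the (double* x, double* par) signature that TF1 accepts, with the seven parameters read from par.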

Exercise 4: Following genealogy

With TrkAna, we also have access to important steps in the genealogy of the track (i.e. which particles produced the track). Before looking at the branches themselves, we need to discuss a little about the simulation.

For each event in the simulation we instantiate a GenParticle, which represents the particle we want to start the simulation with. For some of our samples, we create more than one of these per event (e.g. for cosmic rays, we take all the particles created in the shower at a certain altitude). We then pass these GenParticles to the physics simulation and create a SimParticle for each one. We then let Geant4 simulate these SimParticles and create more SimParticles until we end up with StepPointMCs in the detectors. For the tracker, there will be many different StepPointMCs from different SimParticles in every straw. We assign a SimParticle to be responsible for each straw hit if it had the most StepPointMCs in that hit.
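The "most StepPointMCs wins" rule can be sketched as a simple majority count. This is an illustration only, not the Offline implementation: SimParticles are represented by plain integer ids, and the tie-breaking rule (smallest id wins) is an assumption made so that the result is well defined.

```cpp
#include <map>
#include <vector>

// Given the SimParticle id behind each StepPointMC in one straw hit,
// return the id that contributed the most steps.
// Assumptions for this sketch: ids are plain non-negative integers, ties
// go to the smallest id, and -1 is returned for a hit with no steps.
int dominantSimParticle(const std::vector<int>& stepSimIds) {
  std::map<int, int> counts;  // SimParticle id -> number of StepPointMCs
  for (int id : stepSimIds) {
    ++counts[id];
  }

  int bestId = -1;
  int bestCount = 0;
  // std::map iterates in ascending id order, so the strict '>' below
  // keeps the smallest id when two particles have the same count.
  for (const auto& entry : counts) {
    if (entry.second > bestCount) {
      bestId = entry.first;
      bestCount = entry.second;
    }
  }
  return bestId;
}
```

The same counting idea, applied to hits on a track instead of steps in a hit, is how the trkmc branch described next picks the SimParticle that produced the most hits.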

Now here is a description of the branches we have in TrkAna:

trkmc
contains information about the SimParticle that produced the most hits on the track
trkmcpri
contains information about the primary particle that produced the trkmc particle (e.g. for cosmic rays, this will correspond to the cosmic ray that produced the shower (i.e. the proton))
trkmcgen
contains information about the actual GenParticle in our simulation that ultimately produced the track (e.g. for cosmic rays, this will be the particle in the shower that ultimately produced the track)

For most of our samples, these are all the same particle. In order to explore the differences, we will run on a sample of cosmic rays made by the CRY generator.

  1. Create a TrkAna tree with CRY-cosmic-general-mix tracks and include the MC truth information:

       > mu2e -c $TUTORIAL_BASE/TrkAna/fcl/TrkAnaTutEx04.fcl -S $TUTORIAL_BASE/TrkAna/filelists/mcs.mu2e.CRY-cosmic-general-mix-cat.MDC2018h.1-file.lst

  2. Look at the tracks.
  3. Look at trkmc.opos. This shows where the particles were created.
  4. Look at trkmcgen.opos. This is where the simulation started.
  5. Look at trkmcpri.opos. This is where the shower would have started(?).
  6. Find an interesting event to look at.

For any other intermediate steps in the genealogy, you will need to run Offline.

Exercise 5: Adding supplemental tracks

There might be other tracks that are important to your analysis (e.g. upstream-going tracks).

  1. run with supplemental tracks
  2. look for reflected tracks? compare DeM to DmuM?
  3. check for CRV coincidence?

Conclusion

This last exercise created a TrkAna tree that is the same as the one created by TrkAnaReco.fcl.

Advanced Exercises

Exercise 1: Hit level diagnostics?

Exercise 2: TrkQual?

Write out trkqual branch and retrain

Exercise 3: Event weighting?

Run on flateminus-mix with DIO weights and plot that

Exercise 4: Running reconstruction?

Exercise 5: TrkAnaLoop?

Exercise 6: Using InfoStructHelper?

Reference Materials

  • TrkAna wiki page

A Useful Glossary

ROOT
data analysis framework developed at CERN
KalSeed
data product that represents a track
CeEndpoint-mix
dataset name for CeEndpoint (i.e. mono-energetic electrons) with background frames mixed in
CeplusEndpoint-mix
dataset name for CeplusEndpoint (i.e. mono-energetic positrons) with background frames mixed in
flatmugamma-mix
dataset name for flatmugamma (i.e. flat energy photons generated at muon stopping positions) with background frames mixed in
KalFinalFit
the module name for the final stage of the Kalman filter fit for the track
TrkQual
an artificial neural network (ANN) that takes parameters from the track and outputs a value between 0 (poorly reconstructed) and 1 (well-reconstructed)