TrkAnaTutorial
Revision as of 19:06, 20 June 2019
Session Prerequisites
This tutorial is aimed at anyone who wants to use TrkAna trees for analysis.
Before starting the Basic Exercises of this tutorial you should:
- know about the physics of Mu2e;
- have the appropriate docker container set up; and,
- know how to run the Mu2e Offline software and ROOT.
Before starting the Advanced Exercises of this tutorial you should:
- know how to write an art module.
Session Introduction
One of the final outputs of the Mu2e reconstruction is the set of fits to the tracks in the tracker. These are stored as KalSeeds in KalSeedCollections. For each fit hypothesis, we have a different KalSeedCollection (e.g. downstream electrons, downstream muons, upstream positrons).
In order to do analyses with these tracks, we have an art module, called TrkAna, that creates a ROOT TTree from these KalSeeds. Each entry in the tree corresponds to a single track.
In the Basic Exercises you will:
- create TrkAna trees using the Mu2e Offline software and MDC2018 datasets; and,
- analyze them using the ROOT command line and ROOT macros.
In the Advanced Exercises you will:
- retrain an artificial neural network (ANN); and,
- create customised versions of TrkAna.
Basic Exercises
Setup
These exercises are designed to be run on v7_4_1 from $TUTORIAL_BASE/TrkAna
In the docker, we assume that you have a volume mounted on your machine to /home/working_area/ in the container:
> source /Tutorials_2019/setup_container.sh
> cp -r $TUTORIAL_BASE/TrkAna /home/working_area/
> cd /home/working_area/TrkAna
> source $TUTORIAL_OFFLINE/v7_4_1/SLF6/prof/Offline/setup.sh
We also want to create some filelists for the exercises:
> mkdir filelists
> ls $TUTORIAL_BASE/data/mcs.mu2e.CeEndpoint-mix.*.art > filelists/mcs.mu2e.CeEndpoint-mix.MDC2018h.lst
> ls $TUTORIAL_BASE/data/mcs.mu2e.CeplusEndpoint-mix.*.art > filelists/mcs.mu2e.CeplusEndpoint-mix.MDC2018h.lst
> ls $TUTORIAL_BASE/data/mcs.mu2e.flateminus-mix.*.art > filelists/mcs.mu2e.flateminus-mix.MDC2018h.lst
> ls $TUTORIAL_BASE/data/mcs.mu2e.CRY-cosmic-general-mix.*.art > filelists/mcs.mu2e.CRY-cosmic-general-mix.MDC2018h.lst
> ls $TUTORIAL_BASE/data/mcs.mu2e.flatmugamma-mix.*.art > filelists/mcs.mu2e.flatmugamma-mix.MDC2018h.lst
If you are working on mu2egpvm, then the setup instructions will be different. These will be added to this tutorial at a later date.
Exercise 1: Creating the simplest TrkAna tree
In this exercise, we will create the simplest TrkAna tree and investigate it with the ROOT command line.
- First, run mu2e on a single CeEndpoint-mix reco art file:
> mu2e -c Ex01/fcl/TrkAnaEx01.fcl -S filelists/mcs.mu2e.CeEndpoint-mix.MDC2018h.lst
- Now let's have a look at the TrkAna tree with the ROOT command line:
> root -l trkana-ex01.root
root[n]: TrkAnaEx01->cd()
root[n]: trkana->Print()
You will see the TrkAna tree structure. Here is a brief description of the branches:
  - evtinfo: event level information (e.g. event ID of the art::Event this track is from)
  - hcnt: hit count of different types of hit on this track (e.g. number of hits that passed the time selection)
  - tcnt: track count of different track types
  - trk: global fit information for the track (e.g. fit status, ranges of validity, number of hits)
  - trk(ent/mid/xit): local fit information for the track at the entrance of the tracker, the middle of the tracker and the exit of the tracker (e.g. fit momentum, pitch angle)
  - trktch: calorimeter hit information for the calorimeter cluster associated with the track (tch = TrkCaloHit)
  - crvinfo: information of associated hits in the CRV
- Now we can plot some simple things:
  - the track momentum at the tracker entrance:
root[n]: trkana->Draw("trkent.mom")
  - the calorimeter cluster energy:
root[n]: trkana->Draw("trktch.edep")
- With this last command you will see some entries at -1000. This means that there is no associated calorimeter cluster for this track. To exclude these we want to add a cut on the trktch.active flag (0 = there is no TrkCaloHit, 1 = there is a TrkCaloHit):
root[n]: trkana->Draw("trktch.edep", "trktch.active==1")
- Let's take a quick look at the fcl file to see how the TrackAnalysisReco module has been configured. Open it up in your favourite text editor and look at these important lines:
TrkAnaEx01 : { @table::TrackAnalysisReco }
physics.analyzers.TrkAnaEx01.candidate.input : "KFFDeM"
physics.analyzers.TrkAnaEx01.candidate.branch : "trk"
physics.analyzers.TrkAnaEx01.diagLevel : 1
physics.analyzers.TrkAnaEx01.FillMCInfo : false
In order, these lines:
- import an example TrkAna module configuration (you can find it in $MU2E_BASE_RELEASE/TrkDiag/fcl/prolog.fcl);
- define the input KalSeedCollection that we want the TrkAna tree to read from (KFFDeM = KalFinalFit DownstreameMinus);
- configure the name of the output branches;
- set TrkAna to use the lowest diagnostic level (1 = simple list of tracks, 2 = hit level diagnostics); and,
- make sure we are not touching the MC truth.
Note that the "trk" parts of the branch names are configurable -- you will see this in a minute.
That's the end of this exercise -- you can now create a simple TrkAna tree! Try the following optional exercises to explore further:
- (Optional): Modify the module configuration to look at positively-charged tracks and run on CeplusEndpoint-mix (filelists/mcs.mu2e.CeplusEndpoint-mix.MDC2018h.lst)
- (Optional): Create a second instance of the TrackAnalysisReco module. Have one instance configured to look at negatively-charged tracks and the other to look at positively-charged tracks. Run on flatmugamma-mix (filelists/mcs.mu2e.flatmugamma-mix.MDC2018h.lst) and count how many tracks of each type are found
Exercise 2: Calculating the Ce efficiency
Now that we can create a TrkAna tree, let's calculate how efficient our reconstruction is for conversion electrons after some signal cuts. There are a lot of concepts introduced in this exercise, so don't worry if it doesn't all make sense at first. I'll start with some quick notes on event counting, event weighting, and reco quality.
Event Counting
In the simulation, we generate a certain number of events. However, in order to save space, we only write out events that will produce a reconstructed track. We have various ways of filtering events but the result is the same -- the number of events in the output art files does not correspond to the number of events that were generated. We need to know the total number of generated events in order to calculate the absolute efficiency. We keep track of the number of events that were generated by creating a GenEventCount object for each art::SubRun. Then in our TrkAna job, we run the genCountLogger module to read the actual number of generated events.
Event Weighting
In each "mixed" event, we add a single "primary" particle onto a set of "background frames", which represent the background hits from other processes. We want to simulate the variable intensity of the proton beam at the production target and so we scale the number of background hits when we create the mixed event. However, we still only have a single primary particle and so we record the scale factor used in a ProtonBunchIntensity object for use later. In our TrkAna job, we add a new module (PBIWeight), which translates the scale factor used for the proton bunch intensity into an EventWeight object. The TrackAnalysisReco module writes out these proton beam scale factors to a new branch (evtwt.PBIWeight). Event weighting is explored a little bit more in one of the optional exercises.
Reco Quality
We use various algorithms to check the "quality" of a track in some category. Currently there are two we consider:
- the track fit quality (i.e. how well-reconstructed a track is); and
- the particle ID (PID) quality (i.e. how closely a track resembles an electron rather than a muon).
In our TrkAna job, we will use two artificial neural network (ANN) based algorithms to determine these reco qualities (TrkQual and TrkCaloHitPID). In this exercise, we only care that the output of both of these modules is a RecoQualCollection with one RecoQual object per KalSeed. A RecoQual is essentially a float and, in these two algorithms, is between 0 (poorly-reconstructed, muon-like track) and 1 (well-reconstructed, electron-like track). TrkQual and ANNs are explored in more detail in Advanced Exercise 3.
The Exercise
Now onto the exercise:
- Create a TrkAna tree with CeEndpoint-mix tracks and include the genCountLogger, PBIWeight, TrkQual and TrkCaloHitPID modules:
> mu2e -c Ex02/fcl/TrkAnaEx02.fcl -S filelists/mcs.mu2e.CeEndpoint-mix.MDC2018h.lst
If you read the file you will see that we have added:
physics.TrkAnaTrigPath : [ @sequence::TrkAnaReco.TrigSequence ]
which adds a standard set of modules including PBIWeight etc. (you can look in $MU2E_BASE_RELEASE/TrkDiag/fcl/prolog.fcl for more details).
- Also in TrkAnaEx02.fcl, you will see that we have changed the input parameter to just "KFF" and added the suffix parameter:
physics.analyzers.TrkAnaEx02.candidate.input : "KFF"
physics.analyzers.TrkAnaEx02.candidate.branch : "trk"
physics.analyzers.TrkAnaEx02.candidate.suffix : "DeM"
In the Mu2e reconstruction jobs, we keep consistency between modules that run on individual fit hypotheses by using a standard suffix to the module label. If you look in $MU2E_BASE_RELEASE/TrkDiag/fcl/prolog.fcl, you will see that we have TrkQualDeM, TrkQualDeP etc. Defining a suffix in the TrkAna configuration allows us to store only the relevant RecoQual values. You can try running without a suffix parameter and set input back to "KFFDeM" to see what happens.
- If you open up the ROOT file and print the tree structure, you will notice the following new branches:
  - evtwt: stores the values of all EventWeight objects in the art event
  - trkqual: stores the values of all relevant RecoQual objects in the art event (NB "trk" here is our chosen branch prefix for the time being)
- Run this example ROOT macro that plots the track momentum onto a histogram with 0.05 MeV wide bins:
> root -l Ex02/scripts/TrkAnaEx02.C
- Add the following signal cuts to the Draw function:
  - this event triggered ((trigbits&0x208)>0)
  - the fit is successful (trk.status > 0)
  - the track is in the time window of 700 ns -- 1695 ns (trk.t0)
  - the tan-dip of the track is consistent with coming from the target: 0.577350 -- 1.000 (trkent.td)
  - the impact parameter of the track is consistent with coming from the target: -80 mm -- 105 mm (trkent.d0)
  - the maximum radius of the track is OK: 450 mm -- 680 mm (trkent.d0 + 2./trkent.om)
  - the track is of good quality (trkqual.TrkQualDeM > 0.8)
  - there was no hit in the CRV between -50 ns and 150 ns of the track (bestcrv<0||(trk.t0-crvinfo._timeWindowStart[bestcrv]<-50||trk.t0-crvinfo._timeWindowStart[bestcrv]>150.0))
- Because we simulated each event with a different proton bunch intensity, each track should be weighted by the PBIWeight. To do this you will want to modify the cut command to add the event weighting:
evtwt.PBIWeight*(your cuts)
- Now we can count the number of tracks that pass all these cuts:
hRecoMom->Integral()
- We can also integrate in the momentum signal region with the same function. Be careful: TH1F::Integral() takes bin numbers as its arguments and not x-values. Some hints:
  - you can find a bin for a given x-value with hist->GetXaxis()->FindBin(x-value)
  - make sure you aren't off by one bin by checking the bin low edges and bin high edges with TAxis::GetBinLowEdge() and TAxis::GetBinUpEdge().
- To calculate the efficiency you need to know the number of events generated for this simulation. This is stored in the output of the genCountLogger module:
TH1F* hNumEvents = (TH1F*) file->Get("genCountLogger/numEvents");
double n_generated_events = hNumEvents->GetBinContent(1);
- Now you can calculate the absolute Ce efficiency. What's the answer?
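To make the logic of this exercise explicit, here is a minimal sketch in plain C++ (no ROOT needed) of the signal cuts, the PBI weighting and the efficiency calculation. It is not the tutorial macro: the Track struct, its field names and the two helper functions are hypothetical stand-ins for the TrkAna leaves, the CRV veto is left out for brevity, the cut boundaries are assumed to be exclusive, and the momentum window is passed in as a parameter because its values are not given here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical flat copy of the TrkAna leaves used by the signal cuts.
struct Track {
  std::uint32_t trigbits;  // trigger bits (leaf type is an assumption)
  int status;              // trk.status
  double t0;               // trk.t0 [ns]
  double td;               // trkent.td (tan-dip)
  double d0;               // trkent.d0 [mm]
  double om;               // trkent.om [1/mm]
  double trkqual;          // trkqual.TrkQualDeM
  double mom;              // trkent.mom [MeV/c]
  double pbiWeight;        // evtwt.PBIWeight
};

// Applies the signal cuts listed in this exercise (CRV veto omitted).
bool passesSignalCuts(const Track& t) {
  const double rmax = t.d0 + 2.0 / t.om;  // maximum radius of the track
  return (t.trigbits & 0x208u) > 0        // at least one of the mask bits set
      && t.status > 0
      && t.t0 > 700.0 && t.t0 < 1695.0
      && t.td > 0.577350 && t.td < 1.000
      && t.d0 > -80.0 && t.d0 < 105.0
      && rmax > 450.0 && rmax < 680.0
      && t.trkqual > 0.8;
}

// Sum of PBI weights of selected tracks in [momLow, momHigh),
// divided by the number of generated events.
double weightedEfficiency(const std::vector<Track>& tracks,
                          double momLow, double momHigh,
                          double nGenerated) {
  double sumWeights = 0.0;
  for (const Track& t : tracks) {
    if (passesSignalCuts(t) && t.mom >= momLow && t.mom < momHigh) {
      sumWeights += t.pbiWeight;
    }
  }
  return nGenerated > 0.0 ? sumWeights / nGenerated : 0.0;
}
```

In the ROOT macro the same quantity comes from the weighted histogram integral over the signal window divided by n_generated_events; the sketch only spells out what that ratio means track by track.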
Now that you can calculate the Ce efficiency, try some of the following exercises:
- (Optional): Make the plot prettier by:
  - adding appropriate axis labels
  - writing the Ce efficiency on the plot with a TLatex
  - adding dashed lines to show the momentum window with TLines (extra bonus: have the lines move when the momentum window values change)
- (Optional): Add a second momentum plot but with a higher track quality cut and include a TLegend.
- (Optional): Add the following module to your producer block and append it to your trigger path:
dioLLWeight : {
  module_type : BinnedSpectrumWeight
  physics : @local::EventGenerator.producers.dioalll.physics
  genParticleTag : "compressRecoMCs"
  genParticlePdgId : 11
  genParticleGenId : dioTail
  BinCenter : false
}
Run on the flateminus-mix filelist and look at the TrkAna tree. You will notice that the dioLLWeight has been added to the evtwt branch without having to reconfigure TrkAna! Plot the reconstructed momentum of DIOs.
Exercise 3: Adding MC truth
In this exercise, we will add MC truth information to the TrkAna tree and see how closely our reconstruction matches the truth.
- Create a TrkAna tree with CeEndpoint-mix tracks and include the MC truth information:
> mu2e -c Ex03/fcl/TrkAnaEx03.fcl -S filelists/mcs.mu2e.CeEndpoint-mix.MDC2018h.lst
If you look in the fcl file, you will notice that there is a global switch fillMCInfo and a local switch for the candidate track candidate.fillMC. This will be important when we add supplemental tracks in Exercise 5.
- You can open up the file and Print the tree structure. For this exercise, we will focus on the trkmcent branch, which contains the MC information for the step that crossed the entrance into the tracker. The other trkmc* branches will be the focus of Exercise 4.
- Run this example ROOT macro that plots the intrinsic tracker momentum resolution for tracks that pass our signal cuts, except for the trkqual and momentum window cuts:
> root -l Ex03/scripts/TrkAnaEx03.C
- Add the following line to the start of the macro:
gStyle->SetOptStat(111111);
This will add the "Overflow" and "Underflow" values of the histogram to the stats box. These show the number of events that fall outside of the axis range of the histogram. In ROOT the overflow bin is at bin number n_bins+1 and the underflow bin is bin number 0.
- Using what you learned in Exercise 2, count the number of events above +3 MeV, including the events in the overflow bin
- Add the trkqual cut and play with it. How does the number of events above +3 MeV change?
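The bin numbering used in this exercise (underflow in bin 0, regular bins 1 to n_bins, overflow in bin n_bins+1) is an easy place to be off by one. The following is a small sketch of that convention for a fixed-width axis, written in plain C++ so it can be tried without ROOT. FixedAxis, findBin, binLowEdge and integral are hypothetical names that only mimic what TAxis and TH1 do; they are not ROOT classes.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical fixed-width axis following the ROOT bin numbering:
// 0 = underflow, 1..nbins = regular bins, nbins+1 = overflow.
struct FixedAxis {
  int nbins;
  double xmin;
  double xmax;

  double binWidth() const { return (xmax - xmin) / nbins; }

  // Bin number containing x (lower edge inclusive, upper edge exclusive).
  int findBin(double x) const {
    if (x < xmin) return 0;
    if (x >= xmax) return nbins + 1;
    const int bin = 1 + static_cast<int>(std::floor((x - xmin) / binWidth()));
    return std::min(bin, nbins);  // guard against floating-point round-up
  }

  double binLowEdge(int bin) const { return xmin + (bin - 1) * binWidth(); }
};

// Sum of bin contents from firstBin to lastBin, both inclusive.
// contents must have nbins+2 entries (underflow and overflow included).
double integral(const std::vector<double>& contents, int firstBin, int lastBin) {
  double sum = 0.0;
  for (int bin = firstBin; bin <= lastBin; ++bin) sum += contents[bin];
  return sum;
}
```

Counting everything above +3 MeV including the overflow then amounts to integral(contents, axis.findBin(3.0), axis.nbins + 1), after checking with binLowEdge that the first bin really starts at 3.0.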
Now that you've created a TrkAna tree with some MC information in it, try some of the following optional exercises:
- (Optional): Do the same comparison at the middle or exit of the tracker
- (Optional): Perform a fit to a double-sided crystal ball function. This function is a Gaussian with polynomial tails. The function can be found here: scripts/dscb.h. You will need to create your own TF1 and use the TH1::Fit() function. What is the core resolution?
Exercise 4: Following genealogy
With TrkAna, we also have access to important steps in the genealogy of the track. Before looking at the branches themselves, we need to discuss a little about the simulation.
For each event in the simulation we instantiate a GenParticle, which represents the particle we want to start the simulation with. For some of our samples, we create more than one of these per event (e.g. for cosmic rays, we take all the particles created in the shower at a certain altitude). We then pass these GenParticles to the physics simulation and create a SimParticle for each one. We then let Geant4 simulate these SimParticles and create more SimParticles until we end up with StepPointMCs in the detectors.
For the tracker, there will be many different StepPointMCs from different SimParticles in every straw. We assign a SimParticle to be responsible for each straw hit if it had the most StepPointMCs in that straw.
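The "most StepPointMCs wins" rule can be sketched in a few lines of plain C++. This is only an illustration of the counting, not the Offline implementation: responsibleSimParticle is a hypothetical helper, SimParticles are represented by integer IDs, and the tie-break (lowest ID wins) is an assumption.

```cpp
#include <map>
#include <vector>

// Given the SimParticle ID of every step in one straw, return the ID that
// contributed the most steps, or -1 if there are no steps.
// Ties go to the lowest ID (an assumption made for this sketch).
int responsibleSimParticle(const std::vector<int>& stepSimIds) {
  std::map<int, int> stepCounts;  // SimParticle ID -> number of steps
  for (int id : stepSimIds) ++stepCounts[id];

  int bestId = -1;
  int bestCount = 0;
  for (const auto& entry : stepCounts) {
    if (entry.second > bestCount) {
      bestId = entry.first;
      bestCount = entry.second;
    }
  }
  return bestId;
}
```

The trkmc branch described next applies the same idea one level up: it describes the SimParticle that produced the most hits on the track.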
Now here is a description of the branches we have in TrkAna:
- trkmc: contains information about the SimParticle that produced the most hits on the track
- trkmcgen: contains information about the actual GenParticle in our simulation that ultimately produced the track (e.g. for cosmic rays, this will be the particle in the shower that ultimately produced the track)
- trkmcpri: contains information about the primary particle that produced the trkmcgen particle. Note that for some samples, this particle did not appear in the simulation (e.g. for cosmic rays, this will correspond to the cosmic ray proton that produced the shower, which was never simulated)
For most of our samples, these are all the same particle. In order to explore the differences, we will run on a sample of cosmic rays made by the CRY generator.
- Create a TrkAna tree with CRY-cosmic-general-mix tracks and include the MC truth information:
> mu2e -c Ex04/fcl/TrkAnaEx04.fcl -S filelists/mcs.mu2e.CRY-cosmic-general-mix.MDC2018h.lst
- Open the ROOT file and look at the PDG ID codes of the MC particles (defined here but some important ones are 11 = electron, -11 = positron, 13 = muon, -13 = positive muon, 2212 = proton):
> root -l trkana-ex04.root
root[n]: TrkAnaEx04->cd()
root[n]: trkana->Scan("trkmc.pdg:trkmcgen.pdg:trkmcpri.pdg")
Here are the first few rows:
************************************************
*    Row   * trkmc.pdg * trkmcgen. * trkmcpri. *
************************************************
*        0 *        13 *        13 *      2212 *
*        1 *        13 *        13 *      2212 *
*        2 *        11 *        13 *      2212 *
*        3 *        13 *        13 *      2212 *
*        4 *       -13 *       -13 *      2212 *
*        5 *        11 *       -13 *      2212 *
So the "primary" particle is a proton; this is the cosmic ray proton and doesn't actually appear in our simulation. The GenParticles that started our simulation are either positive or negative muons. The particles that are responsible for the tracks are either muons or electrons.
- However, just because we have two particles of the same type at the different stages does not mean that they are the same particle. We need to look at the trkmc.prel, which encodes the relationship between the track particle and the GenParticle:
root[n]: trkana->Scan("trkmc.pdg:trkmc.prel:trkmcgen.pdg:trkmcgen.pdg")
You can look in $MU2E_BASE_RELEASE/MCDataProducts/inc/MCRelationship.hh for the definitions (0 = same, 1 = direct child, -1 = unrelated).
That's it for this exercise. As you can tell, this isn't an exhaustive genealogy tree so if you need to look at something more complex, then you can run in the full Offline framework where we (currently) store every step in the genealogy for saved tracks. However, you can try this optional exercise for another example:
- (Optional): Run on the flatmugamma-mix file list and look at the genealogy of those tracks
Exercise 5: Adding supplemental tracks
From the last exercise, we can see that we get muon tracks. But we were only looking at the result of the electron-hypothesis fit. We can have TrkAna write out the results of "supplement" fits (e.g. downstream muon tracks).
- Create a TrkAna tree with CRY-cosmic-general-mix tracks and include the result of downstream mu-minus fits:
> mu2e -c Ex05/fcl/TrkAnaEx05.fcl -S filelists/mcs.mu2e.CRY-cosmic-general-mix.MDC2018h.lst
- Open up the fcl file and you will see that we have created a couple of blocks to handle branch definitions:
DeM : {
  input : "KFF"
  branch : "de"
  suffix : "DeM"
  fillMC : true
}
DmuM : {
  input : "KFF"
  branch : "dm"
  suffix : "DmuM"
  fillMC : false
}
We have kept the same definition for the downstream electron (although we have changed the branch name to de) and we have added a definition for the downstream mu-minus tracks. These are then used here:
physics.analyzers.TrkAnaEx05.candidate : @local::DeM
physics.analyzers.TrkAnaEx05.supplements : [ @local::DmuM ]
where we are making the DmuM fits a "supplement" track. This means that for each KalSeed in the DeM collection, TrkAna will look in the DmuM collection and write out the track information for the DmuM track that is closest in time to the DeM track.
- Open up the ROOT file and look in the tree. You will see that we now have both de* branches and dm* branches.
- Plot the resolution for DeM like you did in Exercise 3. You will see that it doesn't look great...
> root -l Ex05/scripts/TrkAnaEx05.C
We know from Exercise 4 that some of the actual particles that are creating the track are muons and positrons but we are looking at the negatively-charged electron-hypothesis fit result.
- Add cuts on the true MC particle and compare the resolutions (demc.pdg==11, demc.pdg==-11, demc.pdg==13, and demc.pdg==-13)
- Now let's look at the resolution of the muon-hypothesis fits:
  - re-run TrkAnaEx05.fcl but set the DmuM branch to be filled with MC information
  - re-run your ROOT macro but look at the DmuM branch rather than the DeM branch
You will see that the true mu^{-} histogram is now centered at 0.
- Obviously in the real experiment, we won't know what the true particle is and so we need to use reconstructed quantities to get a handle on the truth. In TrkAna we have TrkQual and TrkPID, so play with cuts on one or both of dequal.TrkQualDeM and dequal.TrkPIDDeM to remove as much of the "wrong" truth while keeping as much of the "correct" truth as possible
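The way a supplement track is picked for each candidate (the one closest in time) can be illustrated with a short, ROOT-independent C++ sketch. The function name closestInTime and the use of bare t0 values are assumptions made for illustration; the real matching happens inside the TrkAna module on KalSeeds.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the index of the supplement track whose t0 is closest to the
// candidate's t0, or -1 if there are no supplement tracks in the event.
int closestInTime(double candidateT0, const std::vector<double>& supplementT0s) {
  int bestIndex = -1;
  double bestDt = 0.0;
  for (std::size_t i = 0; i < supplementT0s.size(); ++i) {
    const double dt = std::fabs(supplementT0s[i] - candidateT0);
    if (bestIndex < 0 || dt < bestDt) {
      bestIndex = static_cast<int>(i);
      bestDt = dt;
    }
  }
  return bestIndex;
}
```

Note that the loop only runs when there is a candidate track to match against, which is why an event containing only supplement-type tracks contributes nothing to the tree.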
You've added supplemental tracks to the TrkAna tree, which might be useful for your analysis. Try some of these optional exercises to explore further:
- (Optional): Add upstream e-minus fits (UeM) as a supplement and plot the number of tracks of each type (branch tcnt)
- (Optional): Swap the candidate (DeM) with one of the supplements (e.g. UeM) and re-run. You will notice that you will have more candidate (UeM) tracks now than when it was a supplement. This is because we were only writing out a supplement track if there was a candidate track in the same event, so we would have missed events that only have the supplement.
Exercise 6: TrkAnaReco
In this exercise, we will run the standard TrkAnaReco.fcl that is available in the Offline repository. This is probably the quickest and easiest way to get a reasonably comprehensive TrkAna tree that will be useful for most samples.
- Run TrkAnaReco.fcl:
> mu2e -c $MU2E_BASE_RELEASE/TrkDiag/fcl/TrkAnaReco.fcl -S filelists/mcs.mu2e.CRY-cosmic-general-mix.MDC2018h.lst
- If you open up the resulting nts.owner.trkana-reco.version.sequencer.root file and look at its contents, you will see that we run two instances of TrkAna: one for negative tracks (TrkAnaNeg) and one for positive tracks (TrkAnaPos), as well as the genCountLogger
- If you cd into one of the TrkAna folders and print the tree structure, it will look very similar to the trees you've made in the previous exercises
- There are two additions though:
  - detrkqual: this contains all the input variables that can be used for training the TrkQual MVA
  - detrkpid: this contains all the input variables that can be used for training the TrkCaloHitPID MVA
Both of these branches are turned on by setting the candidate.trkqual and candidate.trkpid parameters to module labels and by setting fillTrkQual and fillTrkPID to true.
There's not much to this exercise, but you can try to run some of the macros that you've written in the previous exercises on the output file produced here.
Advanced Exercises
Setup
These exercises are designed to be run on a Satellite release of v7_4_1 from $TUTORIAL_BASE/TrkAna. The Satellite release is important for exercises 4, 5 and 6.
In the docker, we assume that you have a volume mounted on your machine to /home/working_area/ in the container:
> source /Tutorials_2019/setup_container.sh
> cp -r $TUTORIAL_BASE/TrkAna /home/working_area/
> cd /home/working_area/TrkAna
> /Offline/v7_4_1/SLF6/prof/Offline/bin/createSatelliteRelease --directory .
> source setup.sh
We also want to create some filelists for the exercises:
> mkdir filelists
> ls $TUTORIAL_BASE/data/mcs.mu2e.CeEndpoint-mix.*.art > filelists/mcs.mu2e.CeEndpoint-mix.MDC2018h.lst
If you are working on mu2egpvm, then the setup instructions will be different. These will be added to this tutorial at a later date.
Exercise 1: Looking at the hits in the track
In this exercise, we will get TrkAna to give us more information about each track by writing out the information for each hit.
- Run the example fcl:
> mu2e -c AdvEx01/fcl/TrkAnaAdvEx01.fcl -S filelists/mcs.mu2e.CeEndpoint-mix.MDC2018h.lst
If you look in the fcl file, you will see that this just #includes TrkAnaReco.fcl but then sets the diagnostic level to 2.
- Look at the tree and you will see a bunch more branches:
  - detsh: information about the straw hits that went into the fit (tsh = TrkStrawHit)
  - detsm: information about each straw the fit passes through (tsm = TrkStrawMaterial)
  - detshmc: truth information about the straw hits that went into the fit (tsh = TrkStrawHit)
We can make a very rudimentary event display with this information.
- Find an "interesting" track (e.g. it has poor resolution or a poor TrkQual value) and find its run ID, subrun ID and event ID:
root[n]: trkana->Scan("evtinfo.runid:evtinfo.subrunid:evtinfo.eventid", "your cuts")
- Now we can plot the Y-X hit distribution:
root[n]: trkana->Draw("detsh._poca.fCoordinates.fY:detsh._poca.fCoordinates.fX", "evtinfo.runid==X && evtinfo.subrunid==Y && evtinfo.eventid==Z")
where X, Y, and Z are the run ID, subrun ID and event ID for the track you are interested in.
- Change the marker attributes so that the hits are easier to see
- We can also check which of these are actually used in the Kalman filter:
root[n]: trkana->Draw("detsh._poca.fCoordinates.fY:detsh._poca.fCoordinates.fX", "evtinfo.runid==X && evtinfo.subrunid==Y && evtinfo.eventid==Z && detsh._active==1")
- And with the MC truth we can see which hits come from the truth particle:
root[n]: trkana->Draw("sqrt(detsh._poca.fCoordinates.fY^2+detsh._poca.fCoordinates.fX^2):detsh._poca.fCoordinates.fZ", "evtinfo.runid==X && evtinfo.subrunid==Y && evtinfo.eventid==Z && detshmc._rel._rel==0", "PSAME")
That's it for this exercise. We've shown that TrkAna can also store hit level information for the tracker, which can be used for a quick, simple event display.
Exercise 2: Analyzing a TrkAna tree with a compiled macro
In this exercise, we will create a compiled ROOT macro that loops through each event and performs a more complex analysis. Simple TTree::Draw commands with cuts are great for quickly prototyping analyses but can cause issues in the long-term.
In the TrkAna tree, each branch is essentially a struct. We call them info structs and they can be found in $MU2E_BASE_RELEASE/TrkDiag/inc/*Info.hh. If you look in those files, you will see each struct and you will recognise the various leaf names. These structs are stored in a shared library so we can use them in compiled ROOT macros.
A TrkAna tree containing the full CeEndpoint-mix dataset can be found in $TUTORIAL_BASE/data/trkana-cem.root.
- To motivate a little more why this is better than a complicated cut command, try to run a cut command with a typo in the leafname e.g.:
root[n]: trkana->Draw("deent.time")
You will see that there are no errors or warnings and that you have the momentum distribution. This is because ROOT will give you the first leaf of the branch, which is deent.mom for this branch.
- Run the example script that will print some information about the first few tracks, but first we need to update the ROOT_INCLUDE_PATH environment variable:
> source AdvEx02/setup_trkana-macro.sh
> root -l
root[n]: .L scripts/TrkAnaTutAdvEx02.C+
root[n]: TrkAnaTutAdvEx02()
- Here are the important lines:
#include "TrkDiag/inc/TrkInfo.hh"
We #include the header file that contains the info structs that we will use.
mu2e::TrkInfo de;
trkana->SetBranchAddress("de", &de);
mu2e::TrkFitInfo deent;
trkana->SetBranchAddress("deent", &deent);
Here we create empty info structs which ROOT will fill for each TTree entry. TrkInfo stores information about the fit in general and TrkFitInfo stores information about the fit locally.
for (int i_entry = 0; i_entry < n_entries; ++i_entry) {
  trkana->GetEntry(i_entry);
  std::cout << "Track #" << i_entry << ": Status = " << de._status << ", p = " << deent._fitmom << " MeV/c" << std::endl;
}
Here we loop through each entry (i.e. each track) in the TrkAna tree and print some info from each one using the info structs directly.
- Add the demid branch by copying what is done for the deent branch
- Add the event weight info branch and print the PBI weight. You will need to include the file TrkDiag/inc/EventWeightInfo.hh and know that, in the current TrkAna tree, the PBI weight is stored in evt._weights[0]. This is a weakness in TrkAna at the moment (you could re-run with different weight modules and _weights[0] will be different). However, it allows us to be more flexible in the number and provenance of EventWeight objects
- Create a "cutflow" plot of the cuts given in Basic Exercise 2 (i.e. for each track, if it passes the first cut add an entry to the first bin. If it then passes the second cut, add an entry to the second bin etc.)
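The cutflow asked for in this exercise boils down to applying the cuts in order and stopping at the first one that fails. Below is a minimal sketch of that bookkeeping in standalone C++; TrackSummary, Cut and cutflow are hypothetical names invented for this illustration, and in the real macro the inputs would be the info structs filled by GetEntry and the counts would go into histogram bins.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical reduced track record; stands in for the TrkAna info structs.
struct TrackSummary {
  int status;      // fit status
  double t0;       // track time [ns]
  double trkqual;  // TrkQual value
};

using Cut = std::function<bool(const TrackSummary&)>;

// counts[i] = number of tracks that pass cuts 0..i applied in sequence.
std::vector<int> cutflow(const std::vector<TrackSummary>& tracks,
                         const std::vector<Cut>& cuts) {
  std::vector<int> counts(cuts.size(), 0);
  for (const TrackSummary& track : tracks) {
    for (std::size_t i = 0; i < cuts.size(); ++i) {
      if (!cuts[i](track)) break;  // stop at the first failed cut
      ++counts[i];
    }
  }
  return counts;
}
```

Because each bin only counts tracks that survived all earlier cuts, the resulting counts can never increase from one bin to the next, which is a quick sanity check on the plot.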
That's it for writing a compiled TrkAna analysis macro. While writing this exercise, I noticed that ROOT also provides a TTreeReader class and an RDataFrame class, which seem to be more modern. A quick test showed that TrkAna trees do not work with these. However, they may be supported in future.
Exercise 3: Retraining TrkQual
In this exercise, we will retrain the artificial neural network (ANN) that determines the quality of the track fit.
- Copy the training script (This is a standard TMVA script with our modifications) cp $MU2E_BASE_RELEASE/TrkDiag/test/TrainTrkQual.C AdvEx03/scripts/TrainTrkQual.C If you look in that file, you will see a lot of standard boilerplate:
Use[]
lines are the TMVA options that are availableTCut
lines are our cuts on what we decide are signal (well-reconstructed tracks) and background (poorly reconstructed tracks)SetBackgroundWeight
lines mean that we can weight the background events differently (i.e. give the worst reconstructed tracks the most importance)AddVariable
lines define the input variables that we use- We will create a new set of weights by removing the max radius input variable so comment it out in the TrainTrkQual.C script
- Now we'll retrain: root -l $TUTORIAL_BASE/data/trkana-cem.root root[n]: TTree* trkana = (TTree*) _file0->Get("TrkAnaNeg/trkana") root[n]: .L AdvEx03/scripts/TrainTrkQual.C root[n]: TrainTrkQual(trkana) This will take a few minutes and create a TrkQual directory and a TrkQual.root file.
- The TrkQual.root file contains important histograms that you can look at with the TMVA GUI: root[n]: TMVA::TMVAGUI gui("TrkQual.root") Important plots are the input variables (option 1a), where you can check that there is good separation in your input variables; the output value distribution (option 4b), which should peak at 0 and 1; and, the ROC curve (option 5b), which show the signal efficiency and background rejection as a function of output value cut.
- In the TrkQual folder, we have the weights XML file that we can now use when running T
- Create a new fcl file with the following contents:
#include "TrkDiag/fcl/TrkAnaReco.fcl"
physics.producers.TrkQualNewDeM : @local::physics.producers.TrkQualDeM
physics.producers.TrkQualNewDeM.TrkQualMVA.MVAWeights : "TrkQual/TrkQualWeights/TMVAClassification_MLP.weights.xml"
physics.TrkAnaTrigPath : [ @sequence::physics.TrkAnaTrigPath, TrkQualNewDeM ]
services.TFileService.fileName : "trkana-adv-ex03.root"
Note that we don't need to reconfigure TrkAna at all, and are simply declaring a new instance of the TrackQuality module with a module label ending in DeM.
- Now run your fcl on the flateminus-mix file:
> mu2e -c solutions/AdvEx03/fcl/TrkAnaAdvEx03.fcl -S filelists/mcs.mu2e.flateminus-mix.MDC2018h.lst
- If you look at the TrkAna tree structure you will see that the result of your new training has been added to the dequal branch:
*............................................................................*
*Br 52 :dequal : nquals/I:TrkQualDeM/F:TrkPIDDeM/F:TrkQualNewDeM/F *
*Entries : 17906 : Total Size= 287964 bytes File Size = 181334 *
*Baskets : 9 : Basket Size= 32000 bytes Compression= 1.58 *
*............................................................................*
- You can Scan or Draw these values to see how they compare
- Plot the momentum resolutions (like in Basic Exercise 3) for cuts on each TrkQual variable.
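To make the ROC curve from the TMVA GUI step above more concrete, here is a minimal, ROOT-free sketch of how a single point on such a curve could be computed from classifier outputs. The names RocPoint and rocPoint are illustrative only and are not part of TMVA or Offline; TMVA produces these curves for you, so this is just a picture of what "signal efficiency" and "background rejection" mean for one cut value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One point on a ROC curve for a given cut on the classifier output.
struct RocPoint {
  double signalEfficiency;     // fraction of signal entries passing the cut
  double backgroundRejection;  // fraction of background entries failing the cut
};

// Hypothetical helper (not a TMVA or Offline function): an entry "passes"
// if its output value is greater than or equal to the cut.
inline RocPoint rocPoint(const std::vector<double>& signalOutputs,
                         const std::vector<double>& backgroundOutputs,
                         double cut) {
  assert(!signalOutputs.empty() && !backgroundOutputs.empty());
  std::size_t sigPass = 0;
  for (double out : signalOutputs) {
    if (out >= cut) ++sigPass;
  }
  std::size_t bkgPass = 0;
  for (double out : backgroundOutputs) {
    if (out >= cut) ++bkgPass;
  }
  RocPoint point;
  point.signalEfficiency = static_cast<double>(sigPass) / signalOutputs.size();
  point.backgroundRejection =
      1.0 - static_cast<double>(bkgPass) / backgroundOutputs.size();
  return point;
}
```

Scanning the cut from 0 to 1 and collecting the resulting points would trace out the curve; a well-trained network keeps both numbers close to 1 over a wide range of cuts.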
In this exercise, you retrained TrkQual with a reduced set of input variables. This has been easy to do because of some underlying code in TrackQuality and MVATool that allows us to mask out missing input variables as defined in the weights XML file. Adding a variable to TrkQual is a little trickier; one would have to:
- add it to the TrkQual object;
- fill it in the TrackQuality module; and,
- add it to the TrainTrkQual.C training script,
which would require a recompilation of Offline.
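The masking idea mentioned above can be pictured with the following standalone sketch. It is not the actual TrackQuality or MVATool code: the function names, the matching by variable name, and the variable names used in any example call are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: for each available input variable, record whether it
// appears in the list of variables the weights file was trained with.
inline std::vector<bool> buildMask(const std::vector<std::string>& available,
                                   const std::vector<std::string>& trained) {
  std::vector<bool> mask;
  mask.reserve(available.size());
  for (const auto& name : available) {
    bool used = std::find(trained.begin(), trained.end(), name) != trained.end();
    mask.push_back(used);
  }
  return mask;
}

// Keep only the values whose mask entry is true, preserving their order, so
// that the network only sees the inputs it was trained on.
inline std::vector<float> applyMask(const std::vector<float>& values,
                                    const std::vector<bool>& mask) {
  assert(values.size() == mask.size());
  std::vector<float> kept;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (mask[i]) kept.push_back(values[i]);
  }
  return kept;
}
```

With a scheme like this, dropping a variable only changes the weights file, which is why no recompilation was needed in this exercise, whereas adding one requires new code to compute and store the value.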
Exercise 4: Adding your own weight module
In this exercise we will create a new event weight module that produces an EventWeight object, which will automatically be read into the TrkAna tree. It might be best if you've already covered the art module writing tutorial.
This module will be based around the WeightModule template that can be found in TrkDiag/inc/WeightModule.hh. We will create a _module.cc file, which will create the module itself, and a Phys.hh file, which will define the event weight calculation. In our example, we will just define a weight as twice the primary particle's x-momentum.
- First create a file in AdvEx04/inc/ called DoubleWeightPhys.hh and fill it with the following:
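#ifndef DoubleWeightPhys_hh_
#define DoubleWeightPhys_hh_
#include "art/Framework/Principal/Event.h"
#include "canvas/Utilities/InputTag.h"
#include "fhiclcpp/ParameterSet.h"
#include "MCDataProducts/inc/PrimaryParticle.hh"
namespace mu2e {
  class DoubleWeightPhys {
  public:
    // Read the tag of the PrimaryParticle data product from the fcl configuration
    DoubleWeightPhys(const fhicl::ParameterSet& pset)
      : _input(pset.get<art::InputTag>("input")) {}
    // Event weight: twice the x-momentum of the primary particle
    double weight(const art::Event& event) {
      const auto& pph = event.getValidHandle<PrimaryParticle>(_input);
      const auto& pp = *pph;
      double wt = 2 * pp.primary().momentum().px();
      return wt;
    }
  private:
    art::InputTag _input;
  };
}
#endif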
This Phys class needs to have a double weight(const art::Event& event) method and a DoubleWeightPhys(const fhicl::ParameterSet& pset) constructor.
- Then create a file in AdvEx04/src/ called DoubleWeight_module.cc with the following contents:
#include "TrkDiag/inc/WeightModule.hh"
#include "inc/DoubleWeightPhys.hh"
namespace mu2e {
  typedef WeightModule<DoubleWeightPhys> DoubleWeight;
}
DEFINE_ART_MODULE(mu2e::DoubleWeight);
By splitting things like this, we can reuse the DoubleWeightPhys class in other places (e.g. an event generator that might want to generate based on a true spectrum).
- Now create a fcl with the following content:
#include "TrkDiag/fcl/TrkAnaReco.fcl"
physics.producers.DoubleWeight : {
  module_type : DoubleWeight
  input : "compressRecoMCs"
}
physics.TrkAnaTrigPath : [ @sequence::physics.TrkAnaTrigPath, DoubleWeight ]
services.TFileService.fileName : "trkana-adv-ex04.root"
Note that we only add an EventWeight module to the trigger path and do not need to reconfigure TrkAna.
- Run it and you will see that the DoubleWeight leaf has been added to the evtwt branch of TrkAna:
*............................................................................*
*Br 40 :evtwt : nwts/I:PBIWeight/F:DoubleWeight/F *
*Entries : 31034 : Total Size= 374042 bytes File Size = 251701 *
*Baskets : 12 : Basket Size= 32000 bytes Compression= 1.48 *
*............................................................................*
- You can Scan evtwt.DoubleWeight and demcpri.momx to see that this worked
This way of writing an EventWeight module still hasn't made its way throughout all of Offline. However, even if you end up creating an event weight module in a different way, as long as your module creates an EventWeight object, it will automatically be written into the TrkAna tree.
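The split between a generic module template and a small Phys class can be sketched without art as follows. This is only an illustration of the pattern, not the contents of TrkDiag/inc/WeightModule.hh: ConfigStub and EventStub are stand-ins for fhicl::ParameterSet and art::Event, and the "scale" parameter is a hypothetical addition (the tutorial's DoubleWeightPhys hard-codes the factor of 2).

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-ins for fhicl::ParameterSet and art::Event (assumptions for this
// sketch only; the real classes have much richer interfaces).
struct ConfigStub {
  std::map<std::string, double> values;
};
struct EventStub {
  double primaryPx;  // x-momentum of the primary particle
};

// A "Phys" class: constructible from the configuration, with a weight() method.
class DoubleWeightPhysSketch {
public:
  explicit DoubleWeightPhysSketch(const ConfigStub& config) {
    assert(config.values.count("scale") == 1);  // hypothetical required parameter
    _scale = config.values.at("scale");
  }
  double weight(const EventStub& event) const { return _scale * event.primaryPx; }
private:
  double _scale;
};

// Generic wrapper in the spirit of WeightModule<Phys>: it owns a Phys object
// and asks it for a weight once per event.
template <typename Phys>
class WeightModuleSketch {
public:
  explicit WeightModuleSketch(const ConfigStub& config) : _phys(config) {}
  double produce(const EventStub& event) { return _phys.weight(event); }
private:
  Phys _phys;
};
```

Any class with the same constructor and weight() method can be plugged into the wrapper, which is what makes the Phys class reusable outside of the module.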
Exercise 5: Adding your own custom branch to TrkAna
In this exercise, we will add a custom branch of a different data product to TrkAna using a new info struct.
First, copy the current TrackAnalysisReco_module.cc into this area since we will need to modify it:
> cp $MU2E_BASE_RELEASE/TrkDiag/src/TrackAnalysisReco_module.cc AdvEx05/src/TrackAnalysisRecoCustom_module.cc
Do a find and replace of "TrackAnalysisReco" to "TrackAnalysisRecoCustom".
Also, we will need our own InfoStructHelper:
> cp $MU2E_BASE_RELEASE/TrkDiag/inc/InfoStructHelper.hh solutions/AdvEx05/inc/InfoStructHelperCustom.hh
> cp $MU2E_BASE_RELEASE/TrkDiag/src/InfoStructHelper.cc solutions/AdvEx05/src/InfoStructHelperCustom.cc
and do a find-and-replace of "InfoStructHelper" to "InfoStructHelperCustom" in all your files.
Compile and make sure things run as expected (there is an example fcl file in AdvEx05/fcl/TrkAnaAdvEx05.fcl)
- For this example, we will add the EventWindowMarker object, which is the start time of the event relative to the proton beam arrival. If you look in DataProducts/inc/EventWindowMarker.hh, you will see that it just contains a time offset
- Let's create an info struct that will be used as the basis of the branch in AdvEx05/inc/:
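#ifndef EWMInfo_hh_
#define EWMInfo_hh_
#include <string>
#include "Rtypes.h"
namespace mu2e {
  struct EWMInfo {
    EWMInfo() { reset(); }
    Float_t _tOffset;  // time offset taken from the EventWindowMarker
    // Clear the contents before the next event is filled
    void reset() { _tOffset = 0.0; }
    // Leaf list used when creating the branch
    static std::string leafnames() {
      static std::string leaves;
      leaves = "tOffset/F";
      return leaves;
    }
  };
}
#endif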
As you can see, to create an info struct, we just need the member variables, a leafnames() function so that we can create the branch, and a reset() function to clear the contents.
- Now let's add a function to InfoStructHelperCustom to fill this in. I will leave this to you but it should have a function signature like so:
void fillEWMInfo(const EventWindowMarker& ewm, EWMInfo& info);
- Now, in TrackAnalysisRecoCustom we need to add the following (in order as we go down through the source code):
  - a fcl parameter to take the art::InputTag of the module that creates the EventWindowMarker object
fhicl::Atom<art::InputTag> ewmTag{Name("EventWindowMarkerTag"), Comment("Tag for EventWindowMarker"), art::InputTag()};
  - a private member variable of type EWMInfo which will temporarily store the information before being written to the tree
EWMInfo _ewmInfo;
  - create the branch (in beginJob)
_trkana->Branch("ewm", &_ewmInfo, EWMInfo::leafnames().c_str());
  - fill the branch (in fillEventInfo)
const auto& ewmH = event.getValidHandle<EventWindowMarker>(_conf.ewmTag());
_infoStructHelper.fillEWMInfo(*ewmH, _ewmInfo);
  - clear the branch (in resetBranches)
_ewmInfo.reset();
- Recompile with scons -j4 and fix any errors
- Now edit your fcl to use the new EventWindowMarkerTag fcl parameter (the value you should use is "SelectRecoMC") and re-run
- If you look in the output ROOT file, there will be a TrkAnaCustom folder, in which is your modified TrkAna tree:
*............................................................................*
*Br 51 :ewm : tOffset/F *
*Entries : 31095 : Total Size= 125157 bytes File Size = 115474 *
*Baskets : 4 : Basket Size= 32000 bytes Compression= 1.08 *
*............................................................................*
- You can Scan or Draw your new branch to see what it looks like
You've just added a new branch to the TrkAna tree. This might be useful in the future, if you have a custom data object that you want outputted alongside all the standard track information that TrkAna provides.
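If your custom data object has more than one value, the same info struct pattern extends naturally. The sketch below is an assumption-laden illustration rather than Offline code: ExampleInfo, its second member and the countLeaves helper are hypothetical, and plain float/int stand in for ROOT's Float_t/Int_t so that it compiles without ROOT. It relies on ROOT's leaf list format, where leaves are written as name/type and separated by colons, in the same order as the member variables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative info struct with two leaves.
struct ExampleInfo {
  float _tOffset;
  int _nHits;  // hypothetical second leaf, not part of the tutorial's EWMInfo

  ExampleInfo() { reset(); }
  // Clear the contents before the next entry is filled
  void reset() {
    _tOffset = 0.0f;
    _nHits = 0;
  }
  // Leaf list in "name/type" format, colon-separated, in member order
  static std::string leafnames() { return "tOffset/F:nHits/I"; }
};

// Hypothetical sanity check: number of leaves declared in a leaf list, to
// compare by hand against the number of member variables in the struct.
inline std::size_t countLeaves(const std::string& leaflist) {
  assert(!leaflist.empty());
  return 1 + static_cast<std::size_t>(
                 std::count(leaflist.begin(), leaflist.end(), ':'));
}
```

Keeping the leaf list and the member declarations in the same order matters because the branch is filled directly from the memory layout of the struct.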
Here is an optional exercise to try:
- (Optional): make the new branch optional, controlled by a fcl parameter, and switched off by default
Exercise 6: Creating your own "TrkAna" tree using InfoStructs
For this exercise, let's say you want a fundamentally different structure to what the TrkAna tree offers (e.g. instead of one row per track, you want one row per calorimeter cluster) but you still want a branch stored with KalSeed information. You can use the infrastructure we've already built around TrkAna to make your life a little easier.
- There's a skeleton module in AdvEx06/src that creates a tree with one calorimeter cluster per row as well as the art event information. Compile and run the example:
> mu2e -c AdvEx06/fcl/TrkAnaAdvEx06.fcl -S filelists/mcs.mu2e.CeEndpoint-mix.MDC2018h.lst
If you look at the structure of this new "CalAna" tree and scan the leaves, you will see that there is one row for each cluster, so the same event id will appear in multiple rows.
- Now let's add a branch for the information from the KalSeed:
  - add a fhicl parameter for the input KalSeedCollection
  - add private member variables of type TrkInfo and InfoStructHelper
  - create the branch (in beginJob())
  - call the updateSubRun function of InfoStructHelper in beginSubRun()
  - grab the KalSeedCollection from the event (in analyze())
  - in this step of the exercise, we will just write out the first KalSeed in the collection:
if (kalSeedColl.size() > 0) {
  const auto& kseed = kalSeedColl.at(0);
  _infoHelper.fillTrkInfo(kseed, _trkInfo);
}
  - reset the TrkInfo struct (in resetBranches())
- Recompile and re-run with the added fhicl parameter (its value should be "KFFDeM")
- Look in the output file and at the new tree and you will see that we have all the information about the global track fit with only a few lines of code
- To make this a little more useful, let's only write out the TrkInfo if the calorimeter cluster was used in the Kalman fit. This means we can see the calorimeter clusters that weren't used. Make the following changes:
  - loop over all the KalSeeds
  - get the calo cluster used in the track fit (KalSeed::caloCluster() returns a Ptr) if it has one (KalSeed::hasCaloCluster() returns a bool)
  - only fill TrkInfo if the energy deposit is the same as the original cluster (NB there should be a better way of identifying the same cluster)
  - make sure to clear the TrkInfo struct before filling it
- Re-compile, re-run and re-Scan the tree and you will see that now only some clusters have TrkInfo associated with them
- Plot the energy spectrum of all energy clusters and those that have an associated track
Now you can create a TTree with your own structure and fill it easily with information that we have in TrkAna.
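The cluster-matching step of this exercise can be sketched in isolation as below. SeedStub is a stand-in for the parts of KalSeed that matter here and findSeedForCluster is a hypothetical helper name; in the real module you would use KalSeed::hasCaloCluster() and KalSeed::caloCluster() instead. The comparison by energy deposit follows the exercise text, which itself notes that a better way of identifying the same cluster should exist.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the parts of KalSeed used here (assumption for this sketch).
struct SeedStub {
  bool hasCluster;       // mirrors KalSeed::hasCaloCluster()
  double clusterEnergy;  // energy deposit of the cluster used in the fit
};

// Return the index of the first seed whose calo cluster has the given energy
// deposit, or -1 if no seed used such a cluster.
inline int findSeedForCluster(const std::vector<SeedStub>& seeds,
                              double clusterEnergy) {
  assert(clusterEnergy >= 0.0);  // energy deposits are non-negative
  for (std::size_t i = 0; i < seeds.size(); ++i) {
    if (seeds[i].hasCluster && seeds[i].clusterEnergy == clusterEnergy) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

In the module, a result of -1 would correspond to leaving the TrkInfo struct in its reset state for that row, which is why it must be cleared before each fill.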
Reference Materials
- TrkAna wiki page
A Useful Glossary
- ROOT
- data analysis framework developed at CERN
- KalSeed
- data product that represents a track
- CeEndpoint-mix
- dataset name for CeEndpoint (i.e. mono-energetic electrons) with background frames mixed in
- CeplusEndpoint-mix
- dataset name for CeplusEndpoint (i.e. mono-energetic positrons) with background frames mixed in
- flatmugamma-mix
- dataset name for flatmugamma (i.e. flat energy photons generated at muon stopping positions) with background frames mixed in
- KalFinalFit
- the module name for the final stage of the Kalman filter fit for the track
- TrkQual
- an artificial neural network (ANN) that takes parameters from the track and outputs a value between 0 (poorly reconstructed) and 1 (well-reconstructed)