TrkAna: Difference between revisions

From Mu2eWiki

Revision as of 01:29, 28 October 2019

Overview

TrkAna is a track-based ROOT TTree that can be used to help with analysis. Each entry in the TTree corresponds to a single fitted track and contains reconstructed information from the tracker, calorimeter and CRV. There is also the option to write out Monte Carlo truth information and additional information for other track types that might be important to an analysis.

There is a tutorial and a hypernews forum you can sign up to.

How To Run

Reco Datasets

There is an example fcl configuration in the Offline repository that runs on reco files (see e.g. MDC2018 reco datasets).

mu2e -c TrkDiag/fcl/TrkAnaReco.fcl -s mcs.art --TFileName trkana.root

This runs two instances of the TrackAnalysisReco module: one for negative tracks (TrkAnaNeg) and one for positive tracks (TrkAnaPos). Each tree has the same structure, as shown below.

Ensemble Datasets

If you are running on the MDC2018 ensemble datasets, then you can use the fcl files:

  • TrkDiag/fcl/TrkAnaRecoEnsemble-Data.fcl, and
  • TrkDiag/fcl/TrkAnaRecoEnsemble-MC.fcl.

Digi Datasets

If you are running on digi files (e.g. MDC2018 digi datasets), then you can try to use:

  • TrkDiag/fcl/TrkAnaDigisReco.fcl

which will also run the reconstruction path. You will need to move the RecoFilter module after the TrkQual and TrkPID modules; otherwise TrkAna will search for the data products created by these modules in events that do not pass the filter.

However, it would be easier to run the reconstruction path separately and save that output. Then you can use the TrkAnaReco.fcl described above.

How To Use

ROOT

You can use the ROOT command line or macros.

TTreeReader: to be written

Python

The best way to open ROOT files in Python is to use the uproot package. It does not require a working installation of ROOT, so it can read a ROOT file on essentially any platform that supports Python (e.g. Colaboratory). It also provides a more pythonic bridge between ROOT and NumPy/pandas.

To open a ROOT TTree with uproot, it is sufficient to write:

import uproot

file = uproot.open("trkana.root")
trkananeg = file["TrkAnaNeg"]["trkana"] # opens the 'trkana' tree in the 'TrkAnaNeg' folder

To store, for example, the reconstructed momentum in a NumPy array, we can do:

deent_mom = trkananeg["deent.mom"].array()
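
Once a branch is held in a NumPy array, selections reduce to boolean masks. The following is a minimal sketch that uses made-up values in place of the real deent.mom branch; the array contents and the 103 to 105 window are chosen only for illustration, and the momentum is assumed to be in MeV/c.

```python
import numpy as np

# Synthetic stand-in for trkananeg["deent.mom"].array(); values are made up.
deent_mom = np.array([98.2, 103.9, 104.6, 101.3, 104.9, 107.5])

# Hypothetical momentum window, chosen only for illustration.
lo, hi = 103.0, 105.0
in_window = (deent_mom > lo) & (deent_mom < hi)  # boolean mask, one entry per track

# Indexing with the mask keeps only the tracks inside the window.
selected = deent_mom[in_window]
print(f"{in_window.sum()} of {deent_mom.size} tracks fall in ({lo}, {hi})")
print("mean momentum of selected tracks:", selected.mean())
```

The same mask can be reused to index any other per-track array read from the same tree, since each entry corresponds to a single fitted track.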

Instead of using NumPy arrays, it is possible to convert a ROOT TTree into a pandas DataFrame with:

df = trkananeg.pandas.df(flatten=False)

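The flatten=False argument is required to handle branches that hold vectors or variable-size arrays (e.g. crvinfo). Note that pandas.df is part of the uproot 3 interface; in uproot 4 and later a DataFrame is requested with trkananeg.arrays(library="pd") instead.

Once your data is stored in a pandas DataFrame or a NumPy array, you can plot it with any of the many plotting libraries available (matplotlib, plot.ly, etc.). This example shows how to plot a histogram with matplotlib: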

import uproot
import matplotlib.pyplot as plt

file = uproot.open("trkana.root")
trkananeg = file["TrkAnaNeg"]["trkana"] 
df = trkananeg.pandas.df(flatten=False)

fig, ax = plt.subplots(1,1)
n, bins, patches = ax.hist(df["deent.mom"],
                           bins=60, 
                           range=(95,110), 
                           label="Reco. momentum")
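
The histogram above has no axis labels or legend and is never displayed or saved. A minimal sketch of those finishing steps follows. It uses randomly generated numbers in place of df["deent.mom"] so that it runs without a TrkAna file; the MeV/c unit and the output file name deent_mom.png are assumptions, not taken from the TrkAna documentation.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Synthetic stand-in for df["deent.mom"]; this is not real TrkAna data.
rng = np.random.default_rng(seed=1)
mom = rng.normal(loc=104.0, scale=1.0, size=5000)

fig, ax = plt.subplots(1, 1)
n, bins, patches = ax.hist(mom, bins=60, range=(95, 110), label="Reco. momentum")

ax.set_xlabel("Momentum at tracker entrance [MeV/c]")  # unit is an assumption
ax.set_ylabel("Tracks per bin")
ax.legend()                   # draws the label passed to ax.hist
fig.savefig("deent_mom.png")  # hypothetical output file name
```

In an interactive session, plt.show() can be used instead of fig.savefig once the backend line is removed.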

Tree Structure

Here is a very rough description of the tree branches and where to find leaf definitions in the repository:

evtinfo
information about the event (TrkDiag/inc/EventInfo.hh)
hcnt
count of various hit types (TrkDiag/inc/HitCount.hh)
tcnt
count of various track types (TrkDiag/inc/TrkCount.hh)
de
global fit information for downstream electron track (TrkDiag/inc/TrkInfo.hh)
deent
local fit information for downstream electron track at tracker entrance (TrkDiag/inc/TrkInfo.hh)
demid
local fit information for downstream electron track at middle of tracker (TrkDiag/inc/TrkInfo.hh)
dexit
local fit information for downstream electron track at tracker exit (TrkDiag/inc/TrkInfo.hh)
detch
calorimeter cluster information for cluster used in downstream electron track fit (tch = TrkCaloHit, TrkDiag/inc/TrkCaloHitInfo.hh)
dequal
the output values of the TrkQual and TrkPID ANNs
uetch
calorimeter cluster information for cluster used in upstream electron track fit (tch = TrkCaloHit, TrkDiag/inc/TrkCaloHitInfo.hh)
ue
global fit information for upstream electron track (TrkDiag/inc/TrkInfo.hh)
dm
global fit information for downstream muon track (TrkDiag/inc/TrkInfo.hh)
trigbits
unsigned int of the triggers
crvinfo
information about CRV coincidences (CRVAnalysis/inc/CrvHitInfoReco.hh)
bestcrv
element in crvinfo array that is the best
demc
MC information about particle that created downstream track (TrkDiag/inc/TrkInfo.hh)
demcgen
MC information about particle that started the simulation (TrkDiag/inc/GenInfo.hh)
demcpri
MC information about particle that would have ultimately created the GenParticle (TrkDiag/inc/GenInfo.hh)
demcent
MC information about step of particle that created downstream track as it enters the tracker (TrkDiag/inc/TrkInfo.hh)
demcmid
MC information about step of particle that created downstream track as it passes the middle of the tracker (TrkDiag/inc/TrkInfo.hh)
demcxit
MC information about step of particle that created downstream track as it leaves the tracker (TrkDiag/inc/TrkInfo.hh)
crvinfomc
MC information about CRV coincidences (CRVAnalysis/inc/CrvHitInfoMC.hh)
detchmc
MC information about calorimeter cluster used in the downstream electron fit (TrkDiag/inc/CaloClusterInfoMC.hh)
uetchmc
MC information about calorimeter cluster used in the upstream electron fit (TrkDiag/inc/CaloClusterInfoMC.hh)
detshmc
MC information about the straw hits used in the downstream electron fit (need diagLevel > 0, TrkDiag/inc/TrkStrawHitInfo.hh)
detrkqual
the input variables and output value of the track quality artificial neural network (TrkDiag/inc/TrkQualInfo.hh)
evtwt
the values of all EventWeight objects that were in the art event (e.g. proton bunch intensity = PBIWeight)
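
The trigbits branch packs the trigger information into a single unsigned integer. Assuming each trigger occupies one bit (the description above does not spell out the layout), individual triggers can be tested with a bit mask. The trigger names and bit positions in this sketch are hypothetical placeholders, not the actual Mu2e trigger table.

```python
# Hypothetical bit assignments; the real mapping must come from the trigger configuration.
TRIGGER_BITS = {
    "trigger_a": 0,
    "trigger_b": 1,
    "trigger_c": 2,
}

def fired_triggers(trigbits, table=TRIGGER_BITS):
    """Return the names in `table` whose bit is set in `trigbits`."""
    return [name for name, bit in table.items() if trigbits & (1 << bit)]

# Example with a made-up trigbits value in which bits 0 and 2 are set.
print(fired_triggers(0b101))
```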

Setting diagLevel to 2 adds hit-level information:

detsh
reconstructed straw hit information for downstream electron track fit (TrkDiag/inc/TrkStrawHitInfo.hh)
detsm
information about straw materials that the downstream electron track fit goes through (TrkDiag/inc/TrkStrawMatInfo.hh)

Future Developments?

  • Support for TTreeReader and RDataFrame?