NtupleTutorial

From Mu2eWiki
Revision as of 17:02, 10 July 2018

Introduction

When you want to perform a physics analysis, you need a way to access and manipulate the data. For Mu2e, this could include evaluating the performance of the tracking chamber or calorimeter or cosmic ray veto detector. You might want to look at "raw" information from the detector, like voltages or waveforms, or you might want higher-level quantities, where hits from the tracking detector have been combined into tracks or energy deposits in the calorimeter have been clustered into total particle energy. Whatever you need to do, you need a way to organize the information that you need.

At its most basic, you can think of an ntuple as a database. It has variables defined that are filled with information. There are many different formats your ntuple can take, depending on what information you want to access. In this tutorial we will work with a basic tracking ntuple. We will learn how to discover what information the ntuple contains and how to access and display that information. Hopefully what you learn here will translate to other ntuples that you face in the future.

Note that much of the work we'll do is related to learning how to work with ROOT, a common plotting/fitting program used across many experiments in particle physics. There are many ROOT tutorials out there that you might find useful before, during, or after going through this tutorial.

Setting up your ROOT environment and accessing the Ntuple

Before we begin looking at the Ntuple, please enter into your mu2e offline directory and make sure the mu2e base release is set up ("source setup.sh" and then "setup mu2e"). There are two Ntuples created for you to explore during this tutorial. One includes signal and background events, while the other just has signal event information. We will focus on the latter.

The files are located at:
/cvmfs/mu2e.opensciencegrid.org/DataFiles/tutorials/trkana_signalAndBkg.root
/cvmfs/mu2e.opensciencegrid.org/DataFiles/tutorials/trkana_signal.root

It is easiest if you copy the root files into your own offline directory using the cp command.

Become Familiar with the Ntuple

To open the ntuple file type: root -l trkana_signal.root (the -l is optional but stops the root logo from popping up when you open root). This opens the ROOT environment and allows you to navigate through the Ntuple. There are two ways to see the structure of the Ntuple: using a TBrowser and using the ROOT command line.

To set the Ntuple as your starting file, type: TFile *f1 = new TFile("trkana_signal.root");

Then to see the first folder, type: f1->ls();

You should see this on your terminal screen:


TFile**		trkana_signal_forENE.root	
 TFile*		trkana_signal_forENE.root	
  KEY: TDirectoryFile	TrkAna;1	TrkAna (TrackAnalysis) folder ==



Browse with the TBrowser

Once ROOT is open, type in "new TBrowser" (all lowercase "new", since ROOT commands are case-sensitive C++). Now you can look at the file structure! At the top of the left panel should be the trkana_signal.root file. From there you can click on the TrkAna folder and the trkana subfolder (called a Tree). Now you have a list of 12 branches in your tree visible to you!

Here is the general info a few of the branches contain:
evtinfo: event level info
dem: results of downstream e minus fit
uem: results of upstream e minus fit
dmm: results of downstream mu minus fit
demc: calorimeter info for downstream e minus
demmc: MC info for downstream e minus
demmcgen: MC generator info for downstream e minus (i.e. the particle that was created)
demmcent: MC info for downstream e minus at entrance to tracker
demmcmid: MC info for downstream e minus at middle of tracker
demmcxit: MC info for downstream e minus at exit of tracker

Remember this Ntuple is filled with information from the tracker that follows a particle's path through the detector. Thus it contains information from the fits for downstream and upstream moving electrons in the straw tracker, information from the calorimeter (located behind the straw tracker), and the Monte Carlo (MC) information, which can be thought of as our "truth" information. Comparing the fit results with the MC information shows us how accurately our tracking reconstructs the simulated event data.

Explore a Single Branch

Click on the dem branch. Here you have 34 different leaves! Have you noticed the Tree, Branch, Leaf structure of the Ntuple yet? This cute naming convention makes it easy to understand the hierarchical structure of the Ntuple. Each leaf is a variable holding information about the downstream electron fit, and clicking on a leaf draws a histogram of its values. We will look at just a few of them, but feel free to explore more on your own.

Click on the status leaf. There are 2 peaks, one at 0 and one at -1000. A value greater than 0 indicates a successful fit, and you can see that about 333 of the 593 entries were a success! You can move the legend box by clicking and dragging on it. Placing your mouse at the top of the bin at 0 will give you the bin contents on the bottom right hand side of the screen.

Status Histogram for dem branch with instructions
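The success rate read off the status histogram can also be computed directly from the values. The sketch below is plain C++ with no ROOT dependency; the function name successFraction is hypothetical and simply applies the "status greater than 0" rule described above to a list of status values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ROOT or TrkAna): fraction of entries whose
// fit status is greater than 0, i.e. counted as a successful fit.
double successFraction(const std::vector<int>& status) {
    assert(!status.empty());  // an empty sample has no defined fraction
    std::size_t nSuccess = 0;
    for (int s : status) {
        if (s > 0) ++nSuccess;  // non-positive values (e.g. -1000) are not successes
    }
    return static_cast<double>(nSuccess) / static_cast<double>(status.size());
}
```

With the counts quoted above, 333 successes out of 593 entries, the same ratio comes out to about 0.56.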


The next leaf is the pdg, or particle id number. A pdg value of 11 is an electron. Then we have nhits: the number of hits on the track. If no track is found, we have 0 hits, but the successful tracks range from 15 to 82 hits.

The nactive leaf shows the number of hits used in the fit. The histogram looks similar to the nhits histogram but is not quite the same. You can see the mean value decreased from 21.88 to 21.44.

The t0 leaf has the time of the track. Our live window for data taking is between 400 and 1700 ns. The next leaf, t0err, is just the error on the t0 values, since the detector has only a finite timing resolution.

Working with ROOT in the command line

When we are looking at the plots in the leaves, we are seeing the cumulative data for all events. What if we wanted to just see the information for one specific event? To get this, type the following at the ROOT prompt in the terminal:

root [0] TFile *f1 = new TFile("trkana_signal.root");  (to establish the Ntuple as your starting point)
root [1] TTree *t = (TTree*)f1->Get("TrkAna/trkana");   (to get the trkana tree from the TrkAna folder in the ntuple)
root [2] t->Show(10)   (to print the information for entry 10 of the tree; the entry number is a position in the tree, not the eventid)

Now you should see a lot of information printed out that starts with:

eventid         = 13
runid           = 4001
subrunid        = 0
evtwt           = 1
beamwt          = 1
genwt           = 1
nprotons        = -1
nsh             = 63
nesel           = 59
nrsel           = 61
...


Build a simple analysis structure

Another way to interact with the Ntuple is by creating a MakeClass analysis loop. This creates a .C file and the corresponding .h header file, which together let you make plots of the variables in a ROOT macro. The header file also shows you all the information included in the Ntuple.

Try running these commands:

root [0] TFile *f1 = new TFile("trkana_signal.root");  (establishes that you are using the Ntuple)
root [1] TTree *t = (TTree*)f1->Get("TrkAna/trkana");   (get the trkana tree from the TrkAna folder in the Ntuple)
root [2] t->MakeClass("TreeAnalysis");     (makes the macros, the words in "" are the name of the macros you create)
Info in <TTreePlayer::MakeClass>: Files: TreeAnalysis.h and TreeAnalysis.C generated from TTree: trkana

You can then browse the .C and .h files using your favorite editor (emacs, vim, etc.)

In the TreeAnalysis.C file you will see that it includes the TreeAnalysis.h file and some basic ROOT headers. Then the TreeAnalysis::Loop() function starts. You put the code in here that you want the macro to run. The commented section in red gives you some more commands you can run in ROOT. I will explain how to add code to the macro below in the Making Plots section.

Now that you know how to browse the Ntuple, let's start making plots!

Making plots

There are two main ways to make plots: interactively using the ROOT terminal and in the macros you created above (TreeAnalysis.C)

Interactively

You can create a separate canvas and plot a histogram of any leaf in the ntuple. You just need to know the name of the leaf you want to plot. Follow these steps:

root [5] TCanvas *myCanvas = new TCanvas()    ---- create a canvas to have your histogram drawn on
root [6] t->Draw("nhits")       --- here nhits is the leaf you are plotting

Nhits histogram with all data values


If you have the TBrowser open, your chosen plot will show up there and not in the canvas.

What if you only want to look at a subrange of the data on the histogram? You can restrict the range by passing a selection as the second argument of the Draw command.

root [7] t->Draw("nhits", "nhits>15");

Nhits histogram with selected range >15

Now we can better see the distribution above 15, because the entries at 0 no longer set the scale of the plot. A more stark example is looking at the momentum leaf. Compare the track momentum distributions that you see in the two different histograms created below:

root [8] t->Draw("mom");   ---  unmodified histogram with peaks at -1000 and 100
root [9] t->Draw("mom", "mom>50");  ---  you can actually see the momentum distributions for the downstream electrons' tracks.
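To see why the selection matters so much for mom, it helps to look at what a cut does to the numbers. The sketch below is plain C++ with no ROOT dependency; selectAbove and sampleMean are hypothetical helper names, and it assumes that the entries at -1000 are placeholder values for entries without a fitted track.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a selection such as "mom>50": keep only the
// values strictly above the threshold.
std::vector<double> selectAbove(const std::vector<double>& values, double threshold) {
    std::vector<double> kept;
    for (double v : values) {
        if (v > threshold) kept.push_back(v);
    }
    return kept;
}

// Arithmetic mean of a non-empty sample.
double sampleMean(const std::vector<double>& values) {
    assert(!values.empty());
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum / static_cast<double>(values.size());
}
```

For a toy sample of {-1000, 104, -1000, 106}, the mean over all entries is -447.5, while the mean after selectAbove(sample, 50) is 105. The placeholder values drag the statistics and the axis range far from the physical momenta, which is what the unmodified histogram shows.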


With a macro

Back to the TreeAnalysis.C macro we created earlier. To create a single histogram, you first need to book it near the start of the Loop() function, then fill it with the information from each entry (in the for loop), and then draw the histogram after the loop. We will make a histogram of the number of degrees of freedom (ndof) of the downstream e minus fit. The histogram will appear in the TBrowser.


 if (fChain == 0) return;
---- Need to book the histogram here 

To declare the histogram use this command:


 TH1F* ndofhistdem = new TH1F ("ndofdem", "Histogram of NDOF for dem", 100, 0, 100); 

where TH1F says it will be a 1-dimensional histogram and ndofhistdem is the name you will use to refer to the histogram in the script. In the parentheses, the first argument is the shorthand name that will appear in the legend, the second is the title of the histogram, and the remaining three are the number of bins, xmin, and xmax.


Long64_t nentries = fChain->GetEntriesFast();
Long64_t nbytes = 0, nb = 0;
for (Long64_t jentry=0; jentry<nentries;jentry++) {
Long64_t ientry = LoadTree(jentry);
if (ientry < 0) break;
nb = fChain->GetEntry(jentry); nbytes += nb;
// if (Cut(ientry) < 0) continue;

---now in the for loop portion of the code you want to fill the histogram


ndofhistdem->Fill(dem__ndof);    ---- you need to specify both the branch and the leaf of the value you want to fill
}

the drawing of the histogram goes here:

ndofhistdem->Draw();
}
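The binning arguments passed to TH1F above (100 bins from 0 to 100) decide which bin each filled value lands in. The sketch below is plain C++ with no ROOT dependency; findBin is a hypothetical helper that follows ROOT's numbering for fixed-width bins, where bin 0 is the underflow, bins 1 to nbins are the visible bins, and bin nbins+1 is the overflow.

```cpp
#include <cassert>

// Hypothetical helper mirroring ROOT's fixed-width bin numbering:
// 0 is underflow, 1..nbins are the regular bins, nbins+1 is overflow.
int findBin(int nbins, double xmin, double xmax, double x) {
    assert(nbins > 0 && xmax > xmin);
    if (x < xmin) return 0;           // below the axis range: underflow
    if (x >= xmax) return nbins + 1;  // upper edge is excluded: overflow
    return 1 + static_cast<int>(nbins * (x - xmin) / (xmax - xmin));
}
```

With 100 bins from 0 to 100, a value of 21.5 falls in bin 22, while a value of -1000 falls in the underflow bin and is not drawn. Choosing xmin and xmax is therefore a simple way to keep out-of-range values off the plot without writing an explicit cut.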


Now to run and fill your histogram, you need to open up root. Then you need to load the code:

.L TreeAnalysis.C

then you need to create an object (a) for the macro:

TreeAnalysis a

lastly you run the loop over the ntuple:

a.Loop()

Making a Stacked Histogram

What if you want to compare the distribution of one variable for two different track types? You can stack the histograms on top of each other and compare their distributions. We are going to compare the number of degrees of freedom (ndof) for the downstream e minus fit and the downstream mu minus fit. Using the same TreeAnalysis.C macro, add a second histogram in the same way as the first, but name it ndofhistdmm and change dem to dmm wherever it appears. To make the distributions easier to differentiate, change the color of the second histogram with

ndofhistdmm->SetLineColor(kRed);  (you can also use kGreen, kMagenta, kYellow, etc.)

The one difference is that you draw the second histogram with the "same" option, after drawing the first, so that it is overlaid on the same axes:

ndofhistdem->Draw();
ndofhistdmm->Draw("same");

You can see that the two types of fits have very similar distributions, but this may not always be the case. Stacked histograms can help us choose where to make selection cuts in our trigger and analysis in order to separate signal from background!

Making a 2D histogram