Ntuples: Difference between revisions

Revision as of 17:57, 2 August 2018

Introduction

Our primary data is stored in the art format. This format uses root I/O, but embeds it in a framework with restrictive rules for data access. These framework rules are important during primary processing in order to precisely track the provenance of the data. At some point later in the analysis process, the dominant problem becomes accessing the data in a convenient way, rather than in a tightly controlled way. The solution is to copy the highest-level parts of the data (such as the number of hits on a track and its momentum) into a smaller and faster format - an ntuple. Once the data is in this format, usually a root tree, the user can make histograms with simple cuts. Since only the high-level data values are stored, the dataset is very small and access is very fast.
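As a rough illustration of the idea, the sketch below stores one flat row per track and fills a momentum histogram after a simple cut on the number of hits. It uses plain Python rather than root so that it is self-contained. The names (TrackRow, histogram), the momentum unit, and all values are made up for illustration and are not taken from any Mu2e ntuple format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackRow:
    """One ntuple row: only high-level quantities for a single track."""
    n_hits: int       # number of hits on the track
    momentum: float   # track momentum (MeV/c assumed here for illustration)


# Made-up rows standing in for values copied out of the full event data.
ROWS = [
    TrackRow(n_hits=12, momentum=98.4),
    TrackRow(n_hits=31, momentum=104.9),
    TrackRow(n_hits=27, momentum=103.2),
    TrackRow(n_hits=8, momentum=45.0),
    TrackRow(n_hits=40, momentum=105.1),
]


def histogram(rows, lo, hi, n_bins, min_hits):
    """Fill a fixed-width momentum histogram from rows passing a hit-count cut."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for row in rows:
        if row.n_hits < min_hits:
            continue  # the "simple cut": drop tracks with too few hits
        if not (lo <= row.momentum < hi):
            continue  # outside the histogram range
        counts[int((row.momentum - lo) / width)] += 1
    return counts


# Five 2-unit-wide bins from 100 to 110, keeping tracks with at least 20 hits.
counts = histogram(ROWS, lo=100.0, hi=110.0, n_bins=5, min_hits=20)
print(counts)
```

Because each row carries only a few numbers, a loop like this touches very little data per track, which is the reason an ntuple is much faster to scan than the full framework format.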

Ideally the collaboration would choose a primary ntuple format and officially support it. The official support would include code and documentation, and priority in support and processing. With one central format, everyone's work in creating datasets and tools could be shared. At this time (8/2018) a primary ntuple format has not been selected, but selecting one is still the plan. Meanwhile, several methods are in use; they are listed in the sections that follow.

Stntuple

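- Stntuple (https://sites.google.com/view/stntuple/home), git url ssh://p-mu2eofflinesoftwarestntuple@cdcvs.fnal.gov/cvs/projects/mu2eofflinesoftwarestntuple/Stntuple.git - one choice of user ntuple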

TrkAna

TrkAna: see docdb 7775 (https://mu2e-docdb.fnal.gov/cgi-bin/private/ShowDocument?docid=7775)

Custom root tree

gallery

Other formats

Tools other than root have been explored at times, but there is no major effort in this direction on Mu2e at this writing.

- Jupyter (http://jupyter.org/)
- HDF5 (https://www.hdfgroup.org/)
- R (https://www.r-project.org/)

