Ntuples: Difference between revisions
(Created page with " TrkAna docdb 7775")
No edit summary
Line 1: | Line 1:
TrkAna docdb 7775 | ==Introduction==
Our primary data is stored in the [[Code|art]] format. This format uses root I/O, but embeds it in a framework with restrictive rules for data access. These framework rules are important during primary processing in order to precisely track the provenance of the data. Later in the analysis process, the dominant problem becomes accessing the data conveniently rather than in a tightly controlled way. The solution is to copy the highest-level parts of the data (such as the number of hits on a track and its momentum) into a smaller and faster format: an ntuple. Once the data is in this format, usually a root tree, the user can make histograms with simple cuts. Since only the high-level data values are stored, the dataset is very small and access is very fast.
Ideally the collaboration would choose a primary ntuple format and officially support it. Official support would include code and documentation, and priority in support and processing. With one central format, everyone's work in creating datasets and tools could be shared. At this time (8/2018) a primary ntuple format has not been selected, but selecting one is still the plan. Meanwhile, there are several methods in use, described in the sections below.
==Stntuple==
** [https://sites.google.com/view/stntuple/home Stntuple] [ssh://p-mu2eofflinesoftwarestntuple@cdcvs.fnal.gov/cvs/projects/mu2eofflinesoftwarestntuple/Stntuple.git git url] - one choice of user ntuple
==TrkAna==
TrkAna [https://mu2e-docdb.fnal.gov/cgi-bin/private/ShowDocument?docid=7775 docdb 7775]
==Custom root tree==
==gallery==
==Other formats==
Tools other than root have been explored at times, but there is no major effort on mu2e at this writing.
** [http://jupyter.org/ Jupyter]
** [https://www.hdfgroup.org/ HDF5]
** [https://www.r-project.org/ R]
Revision as of 17:57, 2 August 2018
Introduction
Our primary data is stored in the art format. This format uses root I/O, but embeds it in a framework with restrictive rules for data access. These framework rules are important during primary processing in order to precisely track the provenance of the data. Later in the analysis process, the dominant problem becomes accessing the data conveniently rather than in a tightly controlled way. The solution is to copy the highest-level parts of the data (such as the number of hits on a track and its momentum) into a smaller and faster format: an ntuple. Once the data is in this format, usually a root tree, the user can make histograms with simple cuts. Since only the high-level data values are stored, the dataset is very small and access is very fast.
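The idea of an ntuple with simple cuts can be illustrated with a minimal sketch in plain Python. This is not Mu2e code and does not use root, art, or any of the ntuple formats listed on this page. The Track record, its field names, the momentum unit, the sample values, and the histogram helper are all hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical ntuple row: one entry per track, holding only high-level values.
# The field names and the momentum unit (MeV/c) are assumptions for this sketch.
Track = namedtuple("Track", ["nhits", "momentum"])

# Made-up values standing in for an ntuple that would normally be read from a file.
ntuple = [
    Track(nhits=12, momentum=98.4),
    Track(nhits=25, momentum=104.9),
    Track(nhits=31, momentum=103.2),
    Track(nhits=18, momentum=101.7),
    Track(nhits=40, momentum=105.1),
]


def histogram(rows, cut, value, lo, hi, nbins):
    """Count value(row) in nbins equal-width bins over [lo, hi) for rows passing cut(row)."""
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for row in rows:
        if not cut(row):
            continue  # the row fails the selection cut
        x = value(row)
        if lo <= x < hi:
            # Clamp the index to guard against floating-point rounding at the upper edge.
            index = min(int((x - lo) / width), nbins - 1)
            counts[index] += 1
    return counts


# Histogram the momentum of tracks with at least 20 hits, in three 2 MeV/c bins.
momentum_hist = histogram(
    ntuple,
    cut=lambda t: t.nhits >= 20,
    value=lambda t: t.momentum,
    lo=100.0,
    hi=106.0,
    nbins=3,
)
```

Because each row carries only a few numbers, the selection and the histogram fill are a single pass over a small flat table, which is the property that makes ntuple access fast.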
Ideally the collaboration would choose a primary ntuple format and officially support it. Official support would include code and documentation, and priority in support and processing. With one central format, everyone's work in creating datasets and tools could be shared. At this time (8/2018) a primary ntuple format has not been selected, but selecting one is still the plan. Meanwhile, there are several methods in use, described in the sections below.
Stntuple
TrkAna
TrkAna docdb 7775
Custom root tree
gallery
Other formats
Tools other than root have been explored at times, but there is no major effort on mu2e at this writing.