AnalysisWorkflow

From Mu2eWiki
Revision as of 17:59, 14 December 2018 by Rlc (talk | contribs) (→‎Running jobs)

Introduction

This workflow is used to access the data in existing art format data files. If you don't know what art files are, please review the basic information at ComputingTutorials. Some major collaboration efforts to make files for analysis were the "cd3" processing in 2015 and the MDC2018 processing in 2018. Individual users may also produce and upload datasets.

Finding data

In a typical scenario, you will be given, or will find, a dataset name such as dig.mu2e.CeEndpoint.MDC2018b.art. Usually this will arise out of discussions with your physics group or mentor. You can also discover dataset names from the listing of MDC 2018 data or the full listing.

Once you have the dataset name, you can see the list of files in the dataset with

setup mu2e
setup mu2efiletools
mu2eDatasetFileList  <dataset name> > fcllist.txt

fcllist.txt will contain one file per line, with the full path to the file, usually in dCache (path starts with "/pnfs"). This list is what will drive your job.
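Before driving a job with the list, it can help to sanity-check it. The following is a minimal shell sketch and not part of the Mu2e tools: the helper names count_files and count_non_pnfs are hypothetical, and the only assumption is the format described above (one full path per line, normally starting with "/pnfs").

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a file list such as fcllist.txt.
# Assumption: the list holds one file path per line.

# Print the number of non-empty lines (files) in the list given as $1.
count_files() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Print the number of entries in list $1 whose path does not start with /pnfs.
count_non_pnfs() {
    awk 'NF && $0 !~ /^\/pnfs/ { n++ } END { print n + 0 }' "$1"
}

# Report on fcllist.txt if it exists in the current directory.
if [ -f fcllist.txt ]; then
    echo "files in dataset: $(count_files fcllist.txt)"
    echo "entries not under /pnfs: $(count_non_pnfs fcllist.txt)"
fi
```

A file count of zero usually means the dataset name was mistyped. Entries outside /pnfs are not necessarily wrong, but they are worth a second look before you submit many jobs.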

Analysis Methods

You have a choice of how to convert the art files into histograms or other summary formats to get at the analysis quantities you need. We have a summary of the options at Ntuples. It is common to work with files that are not fully reconstructed, so you have the option to run additional simulation and reconstruction after you read the files and before you write out an analysis ntuple. Designing these details requires working with an experienced collaborator who is familiar with your goals.

You will need an Offline build to run your jobs in. This can be one of the published releases:

ls /cvmfs/mu2e.opensciencegrid.org/Offline

or you can modify and build the code locally, then make a tarball of that build for grid submission.


Running jobs

To start, you can run on the first file interactively:

mu2e -S fcllist.txt -n 100 -c <your_job_fcl>

Be sure to limit the number of events (the -n option), since the entire dataset is available as input.

A quick way to see the contents of almost all products in an art file is Validation:

mu2e -S fcllist.txt -n 100 -c Validation/fcl/val.fcl

which will write validation.root containing many histograms. Or you can print the products:

mu2e -S fcllist.txt -n 10 -c Print/fcl/print.fcl > products.txt


If you are new to grid jobs, please review Grids, Dcache, DataTransfer, and JobPlan. To submit a grid job you will need to make a set of fcl files, each of which takes your basic interactive fcl and customizes it for different input and output files. The resulting set of fcl files is then submitted to the grid. This is likely to be "example 2" of the SubmitJobs page.
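To make the idea of "one fcl file per job" concrete, here is a minimal shell sketch. It is an illustration only and is not the Mu2e submission tooling described on the SubmitJobs page: the function name make_job_fcls, the job_NNN.fcl naming, and the hist_job_NNN.root output names are all hypothetical. It also assumes that your base fcl can be pulled in with #include and that input files and histogram output are set through source.fileNames and services.TFileService.fileName; check both against your own fcl before relying on it.

```shell
#!/bin/sh
# Hypothetical sketch: split a file list into chunks and write one fcl per chunk.
# Usage: make_job_fcls <file list> <base fcl> <files per job> <output dir>
make_job_fcls() {
    list=$1; base=$2; per_job=$3; outdir=$4
    mkdir -p "$outdir"
    # split writes temporary chunk files named chunk_aa, chunk_ab, ...
    split -l "$per_job" "$list" "$outdir/chunk_"
    job=0
    for chunk in "$outdir"/chunk_*; do
        fcl=$(printf '%s/job_%03d.fcl' "$outdir" "$job")
        {
            # Start from the interactive fcl, then override input and output.
            printf '#include "%s"\n' "$base"
            printf 'services.TFileService.fileName : "hist_job_%03d.root"\n' "$job"
            printf 'source.fileNames : [\n'
            # Quote each path and separate the entries with commas.
            awk 'NF { printf "%s  \"%s\"", sep, $0; sep = ",\n" } END { print "" }' "$chunk"
            printf ']\n'
        } > "$fcl"
        rm "$chunk"
        job=$((job + 1))
    done
}
```

For example, make_job_fcls fcllist.txt my_job.fcl 10 jobs would write jobs/job_000.fcl, jobs/job_001.fcl, and so on, each reading at most ten input files. Here my_job.fcl stands in for your own interactive fcl. For real submissions, prefer the tools and conventions documented on SubmitJobs, which also handle output dataset naming.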