DQM
Introduction
Data Quality Monitoring is the process of running checks on new data as it comes in and recording the results. For example, as pass1 reconstruction is run, we plan to also create and save a set of histograms. To evaluate the status of the data quality, we expect two general approaches. The first is to extract a set of simple numbers (such as the mean number of hits on a track) that would be sensitive to overall detector performance and data quality. These quantities can be saved in a database and plotted as a function of time or run number. The second general approach is to compare the histograms to previous runs, including perhaps a quantitative comparison, such as a chi2. Generally, the offline operations and online shift crews would be responsible for reviewing these monitors to spot unexpected changes.
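As a sketch of the second approach, a per-bin chi2 comparison of two histograms with identical binning could look like the following. This is an illustration only, not official DQM tooling; the function name and the chi2 convention (sum of squared differences over the summed bin contents) are assumptions.

```python
import numpy as np

def hist_chi2(counts_a, counts_b):
    """Compare two histograms (same binning) with a simple chi2.

    Illustrative convention: chi2 = sum_i (a_i - b_i)^2 / (a_i + b_i),
    skipping bins where both histograms are empty.
    Returns (chi2, number of bins used).
    """
    a = np.asarray(counts_a, dtype=float)
    b = np.asarray(counts_b, dtype=float)
    total = a + b
    mask = total > 0                      # ignore bins empty in both
    chi2 = float(np.sum((a[mask] - b[mask]) ** 2 / total[mask]))
    ndf = int(mask.sum())
    return chi2, ndf

# Identical histograms give chi2 = 0
chi2, ndf = hist_chi2([10, 20, 30], [10, 20, 30])
```

A large chi2/ndf relative to earlier runs would flag a histogram for shifter attention; the threshold and normalization conventions would be tuned per histogram.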
File names
DQM file names will have a specific pattern:
ntd.mu2e.DQM_stream.process_aggregation_version.run_subrun.root
- the fields, such as the data_tier, owner, and sequencer, should follow the standard file name pattern
- the DQM string is always first in the description
- stream would typically represent a stream of files written by the DAQ, expected to be named as a dataset and fed into the offline processing. This field is expected to be something like "ele" or "cosmic".
- process is the procedure which produced these DQM plots, say "pass1" or "pass2"
- aggregation allows for the fact that smaller DQM files are likely to be added together, so that, say, the 20 files that go into a stream during a run can be added together and a DQM result recorded for the whole run. Some aggregation keywords might be "file" for a single file, or "run" or "week".
- version is an integer and should reflect the version of the process which produced the files this DQM file represents. For example, we expect pass1 will be stopped, repaired or improved, and then re-run on some subset of data. This will advance the pass1 version number, and the DQM version should follow.
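To make the pattern concrete, a file name following the fields above could be split apart as sketched below. The regex, the helper name, and the example run/subrun values are illustrative assumptions, not official Mu2e tooling, and the sequencer zero-padding conventions are not enforced here.

```python
import re

# Hypothetical parser for the pattern described above:
#   ntd.mu2e.DQM_<stream>.<process>_<aggregation>_<version>.<run>_<subrun>.root
DQM_NAME_RE = re.compile(
    r"^ntd\.mu2e\.DQM_(?P<stream>[A-Za-z0-9]+)\."
    r"(?P<process>[A-Za-z0-9]+)_(?P<aggregation>[A-Za-z0-9]+)_(?P<version>\d+)\."
    r"(?P<run>\d+)_(?P<subrun>\d+)\.root$"
)

def parse_dqm_name(name):
    """Split a DQM file name into its fields; raise if it does not match."""
    m = DQM_NAME_RE.match(name)
    if m is None:
        raise ValueError("not a DQM file name: " + name)
    fields = m.groupdict()
    for key in ("version", "run", "subrun"):
        fields[key] = int(fields[key])
    return fields

# Example with made-up run/subrun numbers: a pass1 DQM file for the "ele"
# stream, aggregated over a whole run, produced by pass1 version 3.
fields = parse_dqm_name("ntd.mu2e.DQM_ele.pass1_run_3.105333_0.root")
```

Keeping the fields machine-parsable like this is what lets aggregation jobs group files by stream, process, and version before summing their histograms.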