Validation
Latest revision as of 03:35, 4 December 2024

Introduction

Validation serves to answer whether our code is functional, what physics the code is producing, and whether it is performing the same in two contexts. It primarily operates by producing standard numbers and histograms, and if needed, comparing the histograms between contexts. Some examples are:

  • Every night on the Jenkins platform we run conversion electron generation, simulation, reconstruction, and standard histogramming. We save the histograms and compare to the previous night's result. If the histograms have changed, alerts are sent by email.
  • If we gain access to a new platform, such as a supercomputer center, we can run simulation and reco on a known platform and the new platform, and compare results.
  • If we release a new code version, we can run standard tests and histograms, and compare to previous releases.
  • If we install a new version of a critical package like Geant4, we can run standard code and histograms, and compare the results from before and after the change.
  • In a simulation or reco production run, we can produce a standard set of histograms. This can serve as a check on the results, and can also be archived as a record of what was produced.

Below we list the automated procedures, the individual procedures we run occasionally, and the validation tools. The tools include code to make a standard set of validation histograms, and code to compare histograms.

Automatic procedures

The automatic procedures run on the Jenkins build platform; there is more information on those pages.

  • Every time a Pull Request is made to our github repository, a procedure on the Jenkins platform is triggered to check that the code still builds and runs simple executables. A report is sent back to the github PR page; more documentation is available.
  • Every night, a series of grid jobs are submitted to check if the physics results of the code base have changed. The system, called valJob, runs jobs for conversion electrons, protons on target, cosmic rays, mixing, pileup, extracted cosmics, stopped muons, and reco only (no simulation involved). The results are posted. More documentation in the next section.
  • Every night, we check the run-ability of a series of short executables.
  • Every night, we check geometry overlaps.
  • Every night, we test build the secondary repos, such as Tutorial, Mu2eEventDisplay, and others.
  • Every night, we run Offline DQM on the valJob reco test, and record the result in the database.
  • Every night, we run a Jenkins job to check that the code builds with a future Geant4 version.
  • Every week, as part of the valJob system, we run tests for floating point errors and GNU sanitizer warnings.

valJob procedure

The valJob jobs run under the mu2epro account, from a cron job on mu2egpvm01. The build is done here:

/mu2e/app/users/mu2epro/nightly

the output goes here:

/pnfs/mu2e/persistent/users/mu2epro/valjob

the web results are written here:

https://mu2e.fnal.gov/atwork/computing/ops/val/valJob2/nightly/nightly.html
/web/sites/mu2e.fnal.gov/htdocs/atwork/computing/ops/val/valJob/nightly (historical archive)

and the scripts are here:

~mu2epro/cron/val

and are also committed to the codetools repo.

The result of each grid job is a set of art files, which are then histogrammed; the histograms are compared to the previous day's, the results are posted, and an email report is sent to a set of code maintainers. In the report, the comparison to the day before, for each type of job, is categorized as follows:

  • PERFECT - no differences
  • UNSTABLE - only one difference in one file is found. This is usually caused by uninitialized memory.
  • OK - for all plots, either the K-S test or the fraction test has greater than 99% agreement
  • FAIL - does not pass "OK" test
  • MISSING - jobs did not produce output
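
As a sketch, the categorization above can be written as a simple decision chain. This is a hypothetical simplification for illustration; the actual criteria live in the valJob cron scripts.

```python
def classify(n_diffs, ks_probs, fr_probs, has_output=True):
    """Sketch of the daily-comparison categories (hypothetical helper;
    the real logic is implemented in the valJob scripts)."""
    if not has_output:
        return "MISSING"    # jobs did not produce output
    if n_diffs == 0:
        return "PERFECT"    # no differences
    if n_diffs == 1:
        return "UNSTABLE"   # usually caused by uninitialized memory
    # OK if, for every plot, the K-S test or the fraction test
    # shows greater than 99% agreement
    if all(ks > 0.99 or fr > 0.99 for ks, fr in zip(ks_probs, fr_probs)):
        return "OK"
    return "FAIL"           # does not pass the "OK" test
```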

Individual Procedures

Reco Validation

To validate reconstruction using input digi files, you can run the following:

mu2e -c Production/Validation/reco.fcl -s artfile

where 'artfile' is a file containing simulated digis (see the MDC2018 page for example collections of those). The output will be an art file that can be used as input to the actual validation described below.

Validation module

The module can be run by itself on any art file:

mu2e -c Offline/Validation/fcl/val.fcl -s artfile

This will produce validation.root (or whatever you specify with -T). This root file will contain a Validation directory and in there will be more directories, one for each art product the module could find and analyze. The directories are named by the product name. If there are many instances of products, such as StepPointMCs, then each instance will get its own directory and set of histograms.

The module can also be run as part of a path:

services : {
# request standard geometry and conditions services
# request TFileService
}
# setup module
physics :{
  analyzers: {
    Validation : {
      module_type : Validation
      validation_level : 1
    }
  }
}
# put Validation in a path..

Using the module as part of your path will let you see all the products in the event, even ones that will get dropped on output. The validation level controls how in-depth the histogramming goes. So far, only level 1 is implemented. This is intended to be quick and robust, just histogramming a few variables from each product. When level 2 is implemented, it might histogram derived quantities, quantities derived from multiple products, or make cuts. To see which products are histogrammed, see the repo.

The set of histograms is appropriate for basic evaluation of a file's contents, and for comparing output from different contexts.

g4test03

This test produces conversion electrons, makes straw hits, and runs calorimeter reconstruction, then runs the Analyses/src/ReadBack_module.cc module, which reads these art products and makes sanity-check histograms and ntuples of basic straw hit and calorimeter quantities. The output is an art file, data_03.root, and a root file, g4test_03.root.

mu2e -n 200 -c Offline/Mu2eG4/fcl/g4test_03.fcl

The histogram file is suitable to be used as a validation result for checks and comparisons and the ntuples can be used for basic investigations of file contents.

Conversion electrons

One executable can generate conversion electrons, simulate them, and reconstruct them. This does not include background frames. This sample is a standard test of tracking efficiency and momentum resolution. Generating 10,000 electrons takes about two hours:

mu2e -n 10000 -c Offline/Validation/fcl/ceSimReco.fcl

The output will be an art file which can then be histogrammed:

mu2e -s mcs.owner.val-ceSimReco.dsconf.seq.art -c Offline/Validation/fcl/val.fcl

The output histogram file validation.root can be inspected.

(needs updating) You can make standard plots of the tracking efficiency and resolution by running this script, which opens genReco.hist and looks at the ntuple:

root -l
root [0] .x TrkDiag/test/ce.C

The outputs will be two pdf files: rcan_ce.pdf, with the resolution fits, and acan_ce.pdf, with the acceptance plots.

Overlaps

If the volumes we define to Geant4 overlap in a non-physical way, which is easy to do in a complex geometry, the simulation code may crash, or may fail in subtle ways. After making changes to geometry, we want to run this overlap check. The check selects random points on surfaces of volumes and asks if they are in another volume. Since the points only have some chance to land in the trouble spot, the more points checked, the better.
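
The value of extra points can be made concrete: if an overlap region covers a fraction p of the sampled surface, the chance that all N random points miss it is (1 - p)^N. A quick sketch, with made-up numbers:

```python
def miss_probability(p: float, n: int) -> float:
    """Chance that n random surface points all miss an overlap region
    covering a fraction p of the sampled surface (illustrative model)."""
    return (1.0 - p) ** n

# A hypothetical overlap covering 0.1% of a volume's surface: with 100
# points it is almost certainly missed; with 10,000 it is almost
# certainly found.
for n in (100, 1_000, 10_000):
    print(n, round(miss_probability(0.001, n), 4))
```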

root method

In this method you generate a gdml file which summarizes the geometry. First edit Mu2eG4/fcl/gdmldump.fcl to make sure it is reading the right geometry; you may want to change this to geom_common_current.txt. Then run it:

mu2e -c Offline/Mu2eG4/fcl/gdmldump.fcl

This produces mu2e.gdml. If mu2e.gdml already exists in your release, remove it before running gdmldump.fcl to make sure the new version is saved. Then run the root script on it:

setup codetools
overlapCheck.sh mu2e.gdml

Here is what an overlap looks like:

Info in <TGeoNodeMatrix::CheckOverlaps>: Number of illegal overlaps/extrusions : 2
=== Overlaps for Default ===
= Overlap ov00000: ProtonBeamDumpFront extruded by: ProtonBeamDumpFront/ProtonBeamDumpFrontSteel0x74af9a0 ovlp=1.27
= Overlap ov00001: ProtonBeamDumpFront extruded by: ProtonBeamDumpFront/ProtonBeamDumpCoreAir0x74aeb10 ovlp=1.27

Geant4 method

A check (which as of 2024 should take under 30min) can be done with an art executable. This method picks many random points on the surface of a volume and checks if they are either outside the method volume or inside a sister volume (a volume with the same mother).

mu2e -c Offline/Mu2eG4/fcl/surfaceCheck.fcl >& surfaceCheck.log

(Note: The geometry file you are testing needs to have some configuration parameters to perform the surface check. See Offline/Mu2eG4/geom/geom_SurfaceCheck.txt for these parameters)

To check the output, first verify that about 10K volumes were checked:

grep 'Checking overlaps for volume' surfaceCheck.log | grep -c OK

Then if the following command produces any lines of output, the check failed, and the text should indicate one of the volumes in question:

grep 'Checking overlaps for volume' surfaceCheck.log | grep -v OK

Here is an example:

> grep 'Checking overlaps for volume' surfaceCheck.log | grep -v OK
Checking overlaps for volume ProtonBeamDumpBackSteel ... 

If there were no overlap, the print would end with "OK".

Geant4 method for subsystems

Edit this file:

Offline/Mu2eG4/test/geom_SurfaceCheck_Select.txt

turning on only the subsystems to be checked. To turn on all subsystems, set the global variable at the top:

bool g4.doSurfaceCheck    = true;

There is at least some capability of taking a line like X.Y.Z and setting X.Y or X to enable larger pieces of the subsystems. WARNING: these flags are not necessarily intuitive; to be sure what they are doing, you have to read the code (search for doSurfaceCheck).

Then

mu2e -c Offline/Mu2eG4/fcl/surfaceCheckSelect.fcl

The output can be analyzed the same way as in the Geant4 method.

Stopped muons

Any change to the generation, simulation, or magnetic fields can affect the stopped muon count and distribution in the target. This is a critical number for the experiment, so there is a procedure to do a standard check. The check generates protons hitting the primary target, and propagates outgoing muons to the stopping target. Low statistics (~100 stopped muons) can be generated interactively in an hour:

mu2e -n 50000 -c Production/JobConfig/validation/stoppedMuonsSingleStage.fcl

To get good statistics, ~18K stops, it is necessary to submit the job to the grid.
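
The statistics targets follow from simple Poisson counting, where the relative uncertainty on N observed stops is roughly 1/sqrt(N). A quick check of the two sample sizes mentioned above:

```python
import math

def relative_uncertainty(n_stops: int) -> float:
    """Poisson counting error: sigma/N = sqrt(N)/N = 1/sqrt(N)."""
    return 1.0 / math.sqrt(n_stops)

print(f"{relative_uncertainty(100):.1%}")     # interactive-scale sample, ~10%
print(f"{relative_uncertainty(18_000):.1%}")  # grid-scale sample, under 1%
```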

Setup the necessary packages. (The exact steps may evolve, please see Workflows for how to submit jobs. Please also check for the latest fcl dataset.)

TAG=stops_`date +%y-%m-%d`
WORKDIR=/mu2e/data/users/$USER
OFFLINE=/cvmfs/mu2e.opensciencegrid.org/Offline/v6_2_4/SLF6/prof/Offline/setup.sh
# this is 200 fcl files of 50K POT each
FCLDS=cnf.mu2e.stoppedMuonsSingleStage.180927.fcl
source $OFFLINE
setup mu2efiletools
setup mu2egrid

mu2eDatasetFileList --disk $FCLDS > stops.fcllist

mu2eprodsys  \
--fcllist=stops.fcllist \
--clustername=00   \
--wfproject=$TAG  \
--setup=$OFFLINE  \
--dsconf=0000 \
--disk=2GB --memory=1950MB --expected-lifetime=2h

After a few hours, the output will appear in /pnfs/mu2e/scratch/users/$USER/workflow/$TAG/outstage. Check the output:

cd /pnfs/mu2e/scratch/users/$USER/workflow/$TAG/outstage
mu2eClusterCheckAndMove *

Create a list of the files containing the stopped muons:

mu2eClusterFileList --dsname=sim.${USER}.stoppedMuonsSingleStage.0000.art \
   /pnfs/mu2e/scratch/users/$USER/workflow/$TAG/good/* \
  > $WORKDIR/${TAG}.input

Now run the module to make an ntuple:

mu2e -S $WORKDIR/${TAG}.input -T $WORKDIR/${TAG}.root -c Production/JobConfig/beam/TGTstops.fcl

Now make plots of the stops from the ntuple.

Validation tools

When you set up any modern Offline release, you have the validation tools in your path. There are two main tools. The first is a module to make a standard set of histograms, which is explained in the section above. The second is an executable, valCompare, to compare two sets of histogram files.


valCompare

valCompare is an executable that uses the custom code to compare two files of histograms and make reports in various ways, including a web page.

The executable is run with two histogram filespecs as arguments. The first histogram file is taken as the standard, and in plots it will be the gray histogram. The second file is taken as the file to be tested and appears as red dots. Only histograms that are in the same directories and have the same names are compared; anything else, such as ntuples, is ignored.

Two criteria are used: a standard root KS test, and a fraction test. When comparing histograms, underflow and overflow bins are included by default, but you can switch this off. The fraction test is done by integrating through each histogram, at each step computing the difference of the two running sums and saving the largest difference found, then dividing by the total entries in the standard histogram and subtracting from one (so when the histograms are similar, the result is near 1). When statistics are very small, the KS test fails but the fraction test is still useful. When statistics are very high and there is any tiny systematic variation (like comparing two data runs), the KS test will fail; in this case the fraction test will tell you what you want to know. Very generally, if a comparison passes one of the two tests, it is "OK".
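
The fraction test can be sketched on raw bin contents. This is an illustrative reimplementation, not the valCompare source:

```python
def fraction_test(standard, test):
    """Accumulate both histograms bin by bin, track the largest
    difference between the running sums, normalize by the standard's
    total entries, and subtract from one, so similar histograms score
    near 1. (Illustrative sketch, not the valCompare source.)"""
    total = sum(standard)
    if total == 0:
        return 1.0
    run_s = run_t = 0.0
    max_diff = 0.0
    for s_bin, t_bin in zip(standard, test):
        run_s += s_bin
        run_t += t_bin
        max_diff = max(max_diff, abs(run_s - run_t))
    return 1.0 - max_diff / total

identical = [5, 10, 20, 10, 5]
shifted   = [5, 10, 10, 20, 5]   # same entries, one bin's contents shifted
print(fraction_test(identical, identical))  # 1.0
print(fraction_test(identical, shifted))    # 0.8
```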

There are two comparison levels, loose and tight, and two modes. The two most common cases are that the files are supposed to be identical (as in nightly validation) or statistically independent (as when comparing two simulation files with different random seeds). If the files are supposed to be identical, the alarm levels for the tests are <0.999 (tight) and <0.99 (loose). If the files are independent, they are set at <0.01 (tight) and <0.001 (loose). The normalization of the two files can be scaled.
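
The alarm thresholds can be summarized in a small helper. This is a hypothetical function for illustration; valCompare applies these cuts internally:

```python
def passes(score: float, expect_identical: bool, tight: bool) -> bool:
    """Apply the quoted alarm thresholds: identical-file comparisons
    alarm below 0.999 (tight) / 0.99 (loose); statistically independent
    samples alarm below 0.01 (tight) / 0.001 (loose).
    (Hypothetical helper, not part of valCompare.)"""
    if expect_identical:
        cut = 0.999 if tight else 0.99
    else:
        cut = 0.01 if tight else 0.001
    return score >= cut
```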


valCompare options:

valCompare -h
    lists all options
valCompare -s file1.root file2.root
    produces a summary of the histograms compared
valCompare -r file1.root file2.root
    produces a line for each histogram and how it compared
valCompare -w dir/result.html file1.root file2.root
    produces the named web page and overlay plots for each histogram also in that dir

The core histogram comparison classes can also be used in root:

root [0] .L lib/libmu2e_Validation_root.so
root [1] TValCompare c;
root [2] c.SetFile1("file1.root");
root [3] c.SetFile2("file2.root");
root [4] c.Analyze();
root [5] c.Summary();
root [6] c.Browse();

They can also be used in compiled code. Histograms are compared with TValHistH, TProfiles with TValHistP, and TEfficiencies with TValHistE:

#include "Validation/inc/root/TValHistH.h"
TH1D* h1 = new TH1D...
TH1D* h2 = new TH1D...
TValHistH comph;
comph.SetHist1(h1);
comph.SetHist2(h2);
comph.Analyze();
comph.Summary();
comph.Report();
comph.GetKsProb();
comph.GetFrProb();
comph.Draw();

Example of Using val.fcl and valCompare

Suppose that you have two art files, f1.art and f2.art and you wish to compare them.

mu2e -c Offline/Validation/fcl/val.fcl -s f1.art -T f1.root
mu2e -c Offline/Validation/fcl/val.fcl -s f2.art -T f2.root
valCompare -s f1.root f2.root
mkdir compare_1_2
valCompare -w compare_1_2/results.html f1.root f2.root

The first valCompare command will produce a short summary. The second valCompare command will produce a web page with plots, which you can see by opening the file compare_1_2/results.html in a web browser. It will look something like this:

https://mu2e.fnal.gov/atwork/computing/ops/val/valJob/nightly/2023_09/16/ceSimReco/result.html

It may be impractically slow to view the results by running a browser on the mu2egpvm machine. You can use scp to copy the files from the mu2egpvm machine to your local machine and view the output there. For example:


scp -r mu2ebuild01:/path/to/compare_1_2 output/directory/on/your/computer

Then open the file results.html in a web browser.