Running Art Tutorial
Tutorial Session Goal
In this tutorial you will learn how to run the Mu2e art framework executable (mu2e), both interactively and on the grid.
Session Prerequisites and Advance Preparation
- Complete the tutorial on setting up Mu2e Offline
Session Introduction
art is a software framework for processing events with modular code that can be extensively configured at run time. art jobs are controlled by scripts written in a dedicated configuration language called FHiCL (.fcl suffix). art uses ROOT I/O to store events.
This tutorial covers how to build and run several kinds of art jobs, how to use the Mu2e job tools to divide large projects into many separate jobs, and how to run those jobs in parallel on Fermigrid or the Open Science Grid (OSG).
Exercises
Exercise 1: Running a simple module (Hello, Tutorial!) and basic FHiCL
In this exercise, we will run a simple module that will print a welcoming message.
- First set up to run Offline
source /setupmu2e-art.sh
source /Offline/v7_3_5/SLF6/prof/Offline/setup.sh
cd $TUTORIAL_BASE/RunningArt
- The executable we use to run Offline is "mu2e". Use the --help option to display all the command line options
mu2e --help
- FHiCL files (*.fcl) tell Offline what to do. We specify the fcl file we want to use every time we run Offline using the "-c" option. We will now run a simple job that prints a hello message using a premade fcl file.
mu2e -c fcl/hello.fcl
This will write "Hello, world" and the full event id for the first 3 events.
- We can now explore the hello.fcl file that configured this Offline job to see how it works.
more fcl/hello.fcl
- In FHiCL, we make definitions using the syntax "variable : value". A group of definitions can be combined into a table by surrounding them with braces {}.
- As in C++, we can pull in FHiCL code from other files by using "#include" statements.
- After the definition of the process name, you will see three main tables: source, services, and physics.
- "source" configures the inputs to the job. If we are making new events from scratch, we use "EmptyEvent". If we are building on top of existing files, we might use "RootInput". You can also see that this job is configured to run 3 events by default.
- "services" configures parts of art that are common to all modules, like the geometry, the detector conditions, or the ability to print out to files.
- "physics" is where we configure the modules that do all the work. There are "producer" modules that create data, and "analyzer" modules that read data and might make things like analysis TTrees. The physics table has several sections. First we declare our producer and analyzer modules, then we define our "paths" (see below), and then we tell art which paths we want to run.
- Just adding a module to physics doesn't mean art will run it. It is like defining a function in C++ without calling it. To make the module run, we must tell art the list of modules and the order we want to run them in. We do this by defining a variable, called a path, whose value is this list of module names. Here there are two paths: p1 (which is empty) and e1. We then tell art which paths to run using the definitions of "trigger_paths" and "end_paths". Producers (and filters) go in trigger paths; analyzers go in end paths.
- You can find a lot more information about fcl in the art workbook and users guide (https://art.fnal.gov/wp-content/uploads/2016/03/art-workbook-v0_91.pdf); start at chapter 9.
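The structure described in this exercise can be sketched as a minimal FHiCL file. This is an illustration, not the actual contents of fcl/hello.fcl: the process name, the module label "hello", and the empty services table are assumptions, while EmptyEvent, the 3-event default, the HelloWorld module, and the paths p1 and e1 come from the description above.

```fhicl
# Sketch of a hello-style job; NOT the shipped fcl/hello.fcl.
process_name : HelloWorld        # assumed name

source : {
  module_type : EmptyEvent       # make new events from scratch
  maxEvents   : 3                # default number of events to process
}

services : {
  # geometry, conditions, message logging, etc. would be configured here
}

physics : {
  analyzers : {
    hello : {                    # module label (assumed)
      module_type : HelloWorld   # module class to load
    }
  }

  p1 : []                        # path for producers and filters (empty here)
  e1 : [ hello ]                 # path for analyzers, listed in run order

  trigger_paths : [ p1 ]         # paths holding producers and filters
  end_paths     : [ e1 ]         # paths holding analyzers
}
```

Removing "hello" from e1 would leave the module declared but never run, which is the "function defined but not called" situation described above.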
Exercise 2: Module configuration with FHiCL
We will now see how to modify FHiCL to run different modules and even configure those modules at run time.
- We have a new fcl file, hello2.fcl; try running that.
mu2e -c fcl/hello2.fcl
We can see we are running a different module, and it has a magic number that we should be able to change. Looking at hello2.fcl, you should see the new module is called HelloWorld2.
- We can look at the source code for HelloWorld2 to see how to change the magic number.
gedit $MU2E_BASE_RELEASE/HelloWorld/src/HelloWorld2_module.cc
In the constructor (lines 29 to 37), you can see this module takes a fhicl::ParameterSet object, and the magic number is initialized with the code
pset.get<int>("magicNumber",-1)
The fhicl::ParameterSet is made up of the table of definitions for that module under the physics table. So this line means that the module looks in its FHiCL configuration for a "variable : value" line whose variable name is "magicNumber" and whose value is an integer; if no such line is present, it uses -1.
- Configure the fcl to set the magic number to 5 by adding a line "magicNumber : 5" under module_type. Run the fcl again to check that it changed.
- You can also add this configuration to the end of the fcl file by using the full parameter location, i.e.
physics.analyzers.hello2.magicNumber : 9
Try adding this to the end of your file and see if the magic number changed.
- Finally, try running both this module and the original HelloWorld module: copy the HelloWorld module declaration from hello.fcl into hello2.fcl and add its label to your end path. If you need help, check $TUTORIAL_BASE/solutions/hello2.fcl
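The two ways of setting magicNumber described in this exercise might look like the fragments below. The module label hello2 and the analyzers table are taken from the full parameter path quoted above; the rest of hello2.fcl is omitted, so treat this as a sketch rather than the file's actual contents.

```fhicl
# Option 1: set the parameter inside the module's own table.
physics : {
  analyzers : {
    hello2 : {
      module_type : HelloWorld2
      magicNumber : 5            # read by pset.get<int>("magicNumber",-1)
    }
  }
  # ... path definitions, trigger_paths and end_paths as in the original file ...
}

# Option 2: override by full parameter path at the end of the file.
# A later definition replaces an earlier one, so the module now sees 9.
physics.analyzers.hello2.magicNumber : 9
```

If magicNumber is not set anywhere, the second argument of pset.get supplies the value, so the module falls back to -1.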
Exercise 3: Using a more realistic Mu2e fcl to simulate an event
- prolog, epilog
- @local, @table, @sequence
- primary, mixing, reco
- debugging config
mu2e -c fcl/CeEndpoint.fcl --debug-config CeEndpoint-debug.fcl
Exercise 4: Exploring Offline outputs
The above exercise should produce two files, dig.owner.CeEndpoint.version.sequencer.art and nts.owner.CeEndpoint.version.sequencer.root (also located in $TUTORIAL_BASE/RunningArt/data). Both are actually ROOT files, but they contain different information. The .root files produced by Offline hold diagnostic histograms and TTrees, and analysis output like TrkAna that can be used in a normal ROOT analysis. The .art files contain the actual C++ objects Offline uses to describe the event (both simulation information and reconstructed information), and so are in general meant to be processed by other Offline jobs.
- Open both files in a ROOT TBrowser to see their contents. The .root file will have a few histograms describing the event generator output. The .art file will have TTrees for Event/subRun/Run level information. If you open the Events TTree, you will see lots of branches with complicated names.
- We can use Offline modules to better understand the .art file contents.
mu2e -c Print/fcl/dumpDataProducts.fcl --source dig.owner.CeEndpoint.version.sequencer.art
The art "dataproducts" are saved into .art files using the naming scheme
className_moduleName_instanceName_processName
Modules are not allowed to modify existing data in art. Instead, a module that needs to change a dataproduct creates a new, modified version of it. Since the saved name always includes the moduleName, future modules or analyses can refer specifically to this modified version.
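As a sketch of how a later module might be pointed at one specific version of a dataproduct: art input tags are written as "moduleName:instanceName:processName", with the trailing fields optional. Every name in the fragment below (exampleAna, ExampleAnalyzer, digiTag, makeDigisFixed) is hypothetical and chosen only for illustration; real modules define their own parameter names.

```fhicl
# Hypothetical module table; it would sit under physics.analyzers.
exampleAna : {
  module_type : ExampleAnalyzer    # hypothetical analyzer module
  # Select the version of the product written by the module labeled
  # "makeDigisFixed" (hypothetical), not the original it was derived from.
  digiTag : "makeDigisFixed"       # hypothetical parameter holding an input tag
}
```

The output of dumpDataProducts.fcl lists the module names that actually appear in a given file, which is where real values for such a tag would come from.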
Exercise 5: Create your own primary production job
- Look in JobConfig/primary
- Look at EventGenerator/fcl/prolog.fcl
- Add seed information
Exercise 6: Running event reconstruction
- FIXME need non mixing script
- Use output of exercise 4 or $TUTORIAL_BASE/RunningArt/data/dig.owner.CeEndpoint.version.sequencer.art
mu2e -c JobConfig/fcl/mcdigis.fcl --source $TUTORIAL_BASE/RunningArt/data/dig.owner.CeEndpoint.version.sequencer.art
- Run dumpDataProducts.fcl on the dig.*.art and the mcs.*.art files and compare
Exercise 7: Running TrkDiag to create TrkAna TTrees
mu2e -c TrkDiag/fcl/TrkAnaReco.fcl --source-list files.txt
Exercise 8: Using generate_fcl to prepare to run jobs on the grid
setup mu2etools
- Look at JobConfig/examples/generate_CeEndpoint.sh
- description / dsconf / dsowner
- Generate grid jobs for digi production
- Look at JobConfig/examples/generate_reco-CeEndpoint-mix.sh
- inputs / merge-factor
- Generate grid jobs for reco production
Exercise 9: Adding backgrounds with event mixing
- Compare JobConfig/primary/CeEndpoint.fcl to JobConfig/mixing/CeEndpoint.fcl
- locate background-cat.txt files
- aux-input
- generate_fcl
- run interactively, note job length
Exercise 10: Submitting grid jobs with mu2eprodsys
setup mu2egrid
- wfproject
- setup vs code tarball
- fcl on pnfs vs tarball
mu2eprodsys --dry-run
Exercise 11: Running the event display
Reference Materials
- art workbook
- various DocDBs that reference production, satellite release, partial checkout, etc.