Running Art Tutorial

From Mu2eWiki
Jump to navigation Jump to search

Tutorial Session Goal

In this Tutorial you will learn how to run the Mu2e 'art' framework executable (mu2e), both interactively and on the grid.

Session Prerequisites and Advance Preparation

Session Introduction

Art is a software framework for processing events with modular code with lots of run-time configurability. Art is controlled by scripts in a dedicated configuration language called fhicl (.fcl suffix). Art uses rootIO to store events.

This tutorial will cover how to build and run several different kinds of art jobs, and how to use the mu2e job tools to divide large projects into many separate jobs, and how to run those jobs in parallel on Fermigrid or the OSG (open science grid).

Exercises

Exercise 1: Running a simple module (Hello, Tutorial!) and basic FHiCL

In this exercise, we will run a simple module that will print a welcoming message.

  1. First set up to run Offline
  2. source /setupmu2e-art.sh source /Offline/v7_3_5/SLF6/prof/Offline/setup.sh cd $TUTORIAL_BASE/RunningArt
  3. The executable we use to run Offline is "mu2e." Use the --help option to display all the command line options
  4. mu2e --help
  5. FHiCL files (*.fcl) tell Offline what to do. We specify the fcl file we want to use every time we run Offline using the "-c" option. We will now run a simple job that prints a hello message using a premade fcl file.
  6. mu2e -c fcl/hello.fcl This will write "Hello, world" and the full event id for the first 3 events.
  7. We can now explore the hello.fcl file that configured this Offline job to see how it works.
  8. gedit fcl/hello.fcl
    1. In FHiCL, we make definitions using the syntax "variable : value". A group of definitions can be combined into a table by surrounding with braces {}.
    2. Like in C++, we can refer to FHiCL code in other files by using "#include" statements
    3. After defining the process name, you will see three main tables: source, services, physics
    4. "source" configures the inputs to the job. If we are making new events from scratch, we use "EmptyEvent". If we are building on top of old files, we might use "RootInput." You can also see that this job is configured to run 3 events by default
    5. "services" configures parts of art that are common to all modules, like the geometry, detector conditions, or ability to print out to files
    6. "physics" is where we configure the modules that do all the work. There are "producer" modules that create data that is added to the event, and "analyzer" modules that read data and might make things like analysis TTrees. There are a couple different sections to the physics table. First we declare our producer and analyzer modules, then we define our "paths" (see below), and then we tell Art which paths we want to run.
    7. To make the module run, we must tell art the list of modules and the order we want to run them in. We do this by defining a variable called a path to be this list of module names. Here there are two paths, p1 (which is empty), and e1. We then tell Art which paths to run using the definitions of "trigger_paths" and "end_paths". Producers (and filters) go in trigger paths, analyzers go in end paths.
  9. You can see more detail about FHiCL at https://mu2ewiki.fnal.gov/wiki/FclIntro or check out the Art workbook and user guide chapter 9 (https://art.fnal.gov/wp-content/uploads/2016/03/art-workbook-v0_91.pdf)

Exercise 2: Module configuration with FHiCL

We will now see how to modify FHiCL to run different modules and even configure those modules at runtime

  1. We have a new fcl file, hello2.fcl, try running that.
  2. mu2e -c fcl/hello2.fcl We can see we are running a different module, and it has some Magic number that we should be able to change. Looking at hello2.fcl, you should see the new module is called HelloWorld2
  3. We can look at the source code for HelloWorld2 to see how we change Magic number.
  4. gedit $MU2E_BASE_RELEASE/HelloWorld/src/HelloWorld2_module.cc In the constructor (line 29 to 37), you can see this module takes a fhicl::ParameterSet object, and magic number is initialized with the code pset.get<int>("magicNumber",-1) The fhicl::ParameterSet is made up of the table of definitions for that module under the physics table. So this line means in the FHiCL configuration of HelloWorld2, it is looking for a variable:value line where the variable name is "magicNumber" and the value is an integer.
  5. Configure fcl to set Magic number to 5 by adding a line "magicNumber : 5" under module_type. Run the fcl again to check that it changed
  6. You can also add this configuration to the end of the fcl file by using the full parameter location, i.e.
  7. physics.analyzers.hello2.magicNumber : 9 Try adding this to the end of your file and see if the magic Number changed
  8. Finally, try running both this module and the original HelloWorld module by adding the module declaration from hello.fcl and adding it to your end_path. If you need help, check $TUTORIAL_BASE/solutions/hello2.fcl

Exercise 3: Using a more realistic Mu2e fcl to simulate an event

To fully simulate an event in Mu2e, we will need to run many more modules and services. Modules can become dependent on output from previous modules and may require certain services to be set up, so the final FHiCL for a functioning Mu2e Offline job ends up being somewhat complex. To help make things easier, we use a few FHiCL tricks.

  1. Lets look at an example script to produce conversion electron events
  2. gedit fcl/CeEndpoint.fcl
  3. At the top you will see a #include line. Lets look at the file it is including
  4. gedit $MU2E_BASE_RELEASE/JobConfig/fcl/prolog.fcl
  5. You can see that this file includes several more files, and then starts with BEGIN_PROLOG. Prolog files are just a bunch of FHiCL definitions that then can be used later. You can see for example it defines a table called Primary.producers
    • Most directories in Offline have a prolog.fcl file that provide standard definitions for their modules and folders
  6. In JobConfig/fcl/prolog.fcl (and in fcl/CeEndpoint.fcl) you will see several definitions using "@local" or "@table". This is how you reference a previously defined value (for example something defined in a prolog).
    • @local references a standard definition (for example line 16 the definition of a module)
    • @table references a table of several definitions but without the curly braces (for example line 18 adds several more module definition name:value pairs to the producers table)
    • @sequence references a list of values separated by commas like for a path (for example line 62, each sequence adds the name of several modules to this path)
    • For more details read https://mu2ewiki.fnal.gov/wiki/FclIntro
  7. Back in fcl/CeEndpoint.fcl, you should see @table::Primary.producers, which we found in JobConfig/fcl/prolog.fcl. See if you can find out where the generate module in this fcl comes from and what it is running.
  8. You can debug a complicated FHiCL with lots of #includes using the Offline option --debug-config. This will fully process the script, and print out to a file the results with all the @local etc. references made explicit. Lets try with our CeEndpoint.fcl
  9. mu2e -c fcl/CeEndpoint.fcl --debug-config CeEndpoint-debug.fcl Look in CeEndpoint-debug.fcl for the generate module definition and check if you had it right.
  10. Finally, run fcl/CeEndpoint.fcl and generate 10 events
  11. mu2e -c fcl/CeEndpoint.fcl --nevts=10

Exercise 4: Exploring Offline outputs

The above exercise should produce two files, dig.owner.CeEndpoint.version.sequencer.art and nts.owner.CeEndpoint.version.sequencer.root (also located in $TUTORIAL_BASE/RunningArt/data). Both are actually root files, but they contain different information. The .root files produced by Offline are used for diagnostic histograms and TTrees, and analysis output like TrkAna that can be used in a normal root analysis. The .art files contain the actual c++ objects Offline uses to describe the event (both simulation information and reconstructed information), and so are in general meant to be processed by other Offline jobs.

  1. Open both files in a root TBrowser to see their contents
  2. The .root file will have a few histograms describing the event generator output. The .art file will have TTrees for Event/subRun/Run level information. If you open the Events TTree, you will see lots of branches with complicated names.
  3. We can use Offline modules to better understand the .art file contents.
  4. mu2e -c Print/fcl/dumpDataProducts.fcl --source dig.owner.CeEndpoint.version.sequencer.art The art "dataproducts" are saved into .art files using the naming scheme className_moduleName_instanceName_processName Modules are not allowed to modify data in Art. Instead, if you want to change a dataproduct, modules will create a new modified version. Since the saved version always includes the moduleName, it is possible to refer to only this modified version in future modules or analyses.
  5. In general, art dataproducts are much more difficult to work with in plain root scripts outside of the framework. The names are complicated, and the way objects reference other objects only works within the framework. Most of the time you will have some analysis module in Offline process the dataproducts .art file for you and produce a simpler TTree that you can run your root analysis on.

Exercise 5: Create your own primary production job

  1. The JobConfig directory has the base scripts for all the production jobs, and so has examples of how to correctly configure most kinds of Offline jobs. If you need to do anything different, you can usually start with a JobConfig script and modify it slightly.
    • JobConfig/primary has scripts to generate primary only (no backgrounds) events without doing reconstruction
    • JobConfig/mixing has scripts to generate primary plus background events
    • JobConfig/reco has scripts to take output of the previous two and run reconstruction
    • These scripts are designed to be used with grid production scripts, and so don't include configuration of the random seed. To run the scripts as is, you will need to add two lines:
    services.SeedService.maxUniqueEngines: 50 services.SeedService.baseSeed: <put some number here>
  2. Lets try to make our own fcl script now. See if you can make and run a script to run primary only events with 100 MeV photons. Start by looking at $MU2E_BASE_RELEASE/EventGenerator/fcl/prolog.fcl, which has the default configuration for all of the event generators. There should be one called photonGun.
    • Tip: start by copying $MU2E_BASE_RELEASE/JobConfig/primary/CeEndpoint.fcl and replace the generator.
    • Don't forget to add the seed configuration!
  3. Now see if you can turn on the StrawDigi diagnostic output. Looking at $MU2E_BASE_RELEASE/TrackerMC/src/StrawDigisFromStepPointMCs_module.cc, you will see a FHiCL parameter called diagLevel. Try increasing it in your script from 0 to 2.
  4. Run your script and check the output .root file, you should see a new TDirectory with the StrawDigi diagnostics.
  • If you need help, check solutions/photongun.fcl

Exercise 6: Running event reconstruction

The mu2e scripts we have run so far have been digi production jobs, so they ended up producing mu2e::StrawDigis, mu2e::CrvDigis, etc. dataproducts. These represent the data as it would look coming off the detector. We need to use a different script to actually run the reconstruction algorithms. We will use the output of the digi production jobs (the .art file) as the input to these new jobs.

  1. Take a look at $MU2E_BASE_RELEASE/JobConfig/reco/mcdigis-trig.fcl . You will see this script uses a RootInput source, which means it processes on top of an existing .art event. We can tell Offline which file to use three ways:
  2. * mu2e --source <path to .art file> * mu2e --source-list <path to text file with one .art file path per line> * add "source.fileNames : ["<path to .art file 1>","<path to .art file 2>",...] to the .fcl file
  3. Since we are starting from a "primary only" art file (more on that later), we will run the mcdigis-primary.fcl script. Try running it now on your CeEndpoint digi output:
  4. mu2e -c $MU2E_BASE_RELEASE/JobConfig/reco/mcdigis_primary.fcl --source data/dig.owner.CeEndpoint.version.sequencer.art
  5. Run Print/fcl/dumpDataProducts.fcl on the new .art file you produced (mcs.owner.RecoMCTrig.version.sequencer.art), and compare to the data products in the original digi art file.
  6. * You will see that the mu2e::StrawDigis etc. dataproducts are still there, but now associated with the SelectRecoMCs module * Additional dataproducts related to the reconstruction like mu2e::TimeClusters, mu2e::HelixSeeds, mu2e::KalSeeds. The module names for the reconstruction follow the pattern something + <D/U> + <e/mu> + <M/P> for downstream/upstream electron/muon minus/plus. "KSF" is the seed fit and "KFF" is the final fit.

Exercise 7: Running TrkDiag to create TrkAna TTrees

mu2e -c TrkDiag/fcl/TrkAnaReco.fcl --source-list files.txt

Exercise 8: Using generate_fcl to prepare to run jobs on the grid

When you want to simulate a large number of events, you will need to split your work up over multiple Offline jobs, and you will often want to run them on the Fermilab grid. There is a series of scripts to help you set up your fcl files and to start and track your jobs. The first problem is creating the fcl scripts. If you want to generate 10000 conversion electron events for example but only want to run 500 in a single job, you will need 20 fcl files that do the same thing - but each will need to have a different random seed, and each will need to output to different file names! The script generate_fcl takes a base fcl and will build these 20 files for it.

  1. The generate_fcl tool is in mu2etools so you will first need to setup that UPS product
  2. setup mu2etools
  3. In the $MU2E_BASE_RELEASE/JobConfig/examples directory, there are scripts to call generate_fcl for each of the fcl files in JobConfig/primary,JobConfig/mixing, and JobConfig/reco. Lets look at $MU2E_BASE_RELEASE/JobConfig/examples/generate_CeEndpoint.sh
    • You can see from the --include option that this script is using JobConfig/primary/CeEndpoint.fcl as the base fcl, and will be used to produce 2000000 events (200 jobs of 10000 events each)
    • The description, dsconf, and dsowner options configure the output filenames. If you are simulating a large production that will be widely used by others, it is important that the output filenames follow specific patterns, which is what generate_fcl tries to enforce. You can read more about the Mu2e filename patterns at https://mu2ewiki.fnal.gov/wiki/FileNames
    • generate_fcl will by default create directories named 000,001,002, etc. and put 1000 fcl files in each. In this script you will see that the 000 folder is automatically renamed to CeEndpoint
    • More detailed documentation for generate_fcl is located at https://mu2ewiki.fnal.gov/wiki/GenerateFcl
  4. Run this script to produce 200 jobs
  5. source $MU2E_BASE_RELEASE/JobConfig/examples/generate_CeEndpoint.fcl
    • There should now be a CeEndpoint directory. If you ls inside it, you will see many .fcl files and .fcl.json files
    • The .json files are used during official productions to keep track of the provenance of every file produced, you can ignore them for now
    • If you look at one of the .fcl files, you will see it just #includes our base fcl, and the only change from file to file is a reconfiguration of the seed and output filenames.
    We can do the same thing for reconstruction jobs. The configuration of generate_fcl is a bit different since we now are telling it how to read in files instead of how many events to produce.
  6. Look at scripts/generate_reco-CeEndpoint-mix.sh (This is a slightly modified version of JobConfig/examples/generate_reco-CeEndpoint-mix.sh). Instead of njobs and events-per-job we now just provide a --inputs argument with the name of a file containing filenames of art files to read from, and a --merge-factor parameter. Normally reconstruction is fast enough that in one job you can process the output of many simulation jobs. The merge-factor configuration tells generate_fcl how many art files each reco job will use.
  7. Run this script and explore the output
  8. source scripts/generate_reco-CeEndpoint-mix.sh
    • Since data/CeEndpoint-mix.txt had 60 entries and we have a merge-factor of 20, we will produce 3 output fcl files.

Exercise 9: Adding backgrounds with event mixing

So far when we've been simulating Mu2e events, we've only included the primary signal. But in the real experiment any signal event will sit on top of a lot of backgrounds from the beam etc. Simulating backgrounds from the beam takes a very long time, so they are done in separate jobs. The outputs of those dedicated background simulations can then be put on top of a signal only simulation to create something more realistic - this is called event "mixing".

  1. First lets look at a production mixing script: open up JobConfig/mixing/CeEndpointMix.fcl and JobConfig/primary/CeEndpoint.fcl side by side and compare. You can see that by using prolog we've kept the scripts looking basically identical.
  2. We want to randomize which background events are mixed on top of each signal event - we can do this using the generate_fcl script. If you ls data/*.txt you will see files for several different types of backgrounds. Each is a list of filenames of background only simulation art files. These files are not included in Offline, you will need to produce a copy of them making sure to use the most up to date background datasets before you run any mixing jobs. Open scripts/generate_CeEndpointMix.sh. It looks similar to the non-mixing version but now has several --aux-input values which point to these text files.
  3. Build a set of mixing jobs:
  4. source scripts/generate_CeEndpointMix.sh If you look at one of the produced fcl files, you will see a bunch of new fileNames parameters at the bottom that have randomly sampled from the .txt files I have not put the actual background .art files in the docker so you will not be able to run a mixing job here. When you do run a mixing job on the Fermilab machines, one thing to note is that they will take much much longer than a normal simulation job. Simulating 500 conversion electrons for example can take hours to a full day. By looking at the JobConfig/example scripts you can get a sense of how many events to run in each kind of job so they dont end up taking too long.

Exercise 10: Submitting grid jobs with mu2eprodsys

Here we will go over the actual process of running a large set of simulation jobs (like the fcl we produced in exercise 8) on the grid. A more detailed version of the whole grid production workflow can be found at https://mu2ewiki.fnal.gov/wiki/MCProdWorkflow , and details on submitting jobs can be found at https://mu2ewiki.fnal.gov/wiki/SubmitJobs

  • We need to make everything that our job needs will be accessible to the grid nodes. We need to make sure they can access the Offline code we are running, any input files, and the fcl scripts we want them to run. For something to be accessible to a grid node, we need to either send it directly to the node along with the job or put it on a shared disk resource. The background .art files we saw in exercise 9 for example, located on the mu2e machines in the /pnfs/mu2e/tape/ directory, are accessible to the nodes. Additionally, the official Offline releases at /cvmfs/mu2e.opensciencegrid.org/Offline/ are accessible. More information on the various disk spaces can be found at https://mu2ewiki.fnal.gov/wiki/Dcache and https://mu2ewiki.fnal.gov/wiki/Cvmfs .


  1. First we need to set up the UPS products for the grid tools
  2. setup mu2egrid
  3. Next, we will create a tarball of our fcl files produced with generate_fcl. For a real production we would put this tarball on a node accessible disk and point the jobs at it.
  4. tar -cvzf fcl.tar.gz reco-CeEndpoint-mix_000/
  5. The command we will run is mu2eprodsys. The options we need to provide it are:
    • --dsconf and --dsowner : should be same as in generate_fcl
    • --wfproject : sets name of directory where output will be placed
    • --setup : path to the setup.sh script for the version of Offline we want to run. This Offline must be on a grid accessible disk
    • --fcl : path to the fcl.tar.gz file we just created
    Since we don't have access to the CVMFS disk from the docker instance, for now point --setup to /Offline/v7_3_5/SLF6/prof/Offline/setup.sh and --fcl to the fcl.tar.gz in the current directory
  6. Once you think you have the full command, try it out with the --dry-run option. You can compare against the script at solutions/mu2eprodsys.sh
  7. If you want to run a non release version of Offline, we will need to provide mu2eprodsys with a tar containing the code to run. Using the example from https://mu2ewiki.fnal.gov/wiki/Mu2e_Offline_Tutorial, build a satellite release with any package you want. Then we will use gridexport https://mu2ewiki.fnal.gov/wiki/Gridexport
  8. setup gridexport cd /path/to/satellite/ gridexport A new .tar.gz file will be produced in the /pnfs/mu2e/resilient/ directory.
  9. Run mu2eprodsys again but replace --setup with --code=/path/to/tar.gz that was just produced


Exercise 11: Running the event display

Reference Materials