Difference between revisions of "Running Art Tutorial"

From Mu2eWiki

Revision as of 20:24, 3 June 2019

Tutorial Session Goal

In this Tutorial you will learn how to run the Mu2e 'art' framework executable (mu2e), both interactively and on the grid.

Session Prerequisites and Advance Preparation

Session Introduction

Art is a software framework for processing events using modular code with extensive run-time configurability. It is controlled by scripts written in a dedicated configuration language called FHiCL (.fcl suffix), and it uses ROOT I/O to store events.

This tutorial covers how to build and run several different kinds of art jobs, how to use the mu2e job tools to divide large projects into many separate jobs, and how to run those jobs in parallel on Fermigrid or the OSG (Open Science Grid).

Exercises

Exercise 1: Running a simple module (Hello, Tutorial!) and basic FHiCL

In this exercise, we will run a simple module that will print a welcoming message.

  1. First set up to run Offline:
       source /setupmu2e-art.sh
       source /Offline/v7_3_5/SLF6/prof/Offline/setup.sh
       cd $TUTORIAL_BASE/RunningArt
  2. The executable we use to run Offline is "mu2e". Use the --help option to display all the command-line options:
       mu2e --help
  3. FHiCL files (*.fcl) tell Offline what to do. We specify the fcl file we want to use every time we run Offline with the "-c" option. We will now run a simple job that prints a hello message using a premade fcl file:
       mu2e -c fcl/hello.fcl
     This will write "Hello, world" and the full event id for the first 3 events.
  4. We can now explore the hello.fcl file that configured this Offline job to see how it works:
       more fcl/hello.fcl
    1. In FHiCL, we make definitions using the syntax "variable : value". A group of definitions can be combined into a table by surrounding them with braces {}.
    2. As in C++, we can refer to FHiCL code in other files by using "#include" statements.
    3. After the definition of the process name, you will see three main tables: source, services, and physics.
    4. "source" configures the inputs to the job. If we are making new events from scratch, we use "EmptyEvent". If we are building on top of old files, we might use "RootInput". You can also see that this job is configured to run 3 events by default.
    5. "services" configures parts of art that are common to all modules, like the geometry, detector conditions, or the ability to print out to files.
    6. "physics" is where we configure the modules that do all the work. There are "producer" modules that create data, and "analyzer" modules that read data and might make things like analysis TTrees. The physics table has a few different sections. First we declare our producer and analyzer modules, then we define our "paths" (see the next item), and then we tell art which paths we want to run.
    7. Just adding a module to physics doesn't mean art will run it. It is like defining a function in C++ without calling it. To make the module run, we must tell art the list of modules and the order we want to run them in. We do this by defining a variable, called a path, to be this list of module names. Here there are two paths, p1 (which is empty) and e1. We then tell art which paths to run using the definitions of "trigger_paths" and "end_paths". Producers (and filters) go in trigger paths; analyzers go in end paths.
  5. You can see more detail about FHiCL at https://mu2ewiki.fnal.gov/wiki/FclIntro or check out chapter 9 of the Art workbook and user guide (https://art.fnal.gov/wp-content/uploads/2016/03/art-workbook-v0_91.pdf)
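
The structure described in this exercise can be sketched as the following FHiCL fragment. It is a minimal illustration only, not a copy of fcl/hello.fcl: the process name, the module label "hello", and the module class name HelloWorld are assumptions, and the real file may contain additional #include lines and service settings.

```fhicl
# Hypothetical sketch of a hello-style job; names are assumptions,
# not taken from fcl/hello.fcl.
process_name : HelloSketch

source : {
  module_type : EmptyEvent   # make new events from scratch
  maxEvents   : 3            # run 3 events by default
}

services : {
  # services shared by all modules would be configured here
}

physics : {
  analyzers : {
    hello : {                   # module label (assumed)
      module_type : HelloWorld  # module class to run (assumed)
    }
  }

  p1 : [ ]         # an empty path
  e1 : [ hello ]   # a path listing the modules to run, in order

  trigger_paths : [ p1 ]   # producers and filters go in trigger paths
  end_paths     : [ e1 ]   # analyzers go in end paths
}
```

Removing "hello" from e1 would leave the module declared but never run, which is the "function defined but not called" situation described above.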

Exercise 2: Module configuration with FHiCL

We will now see how to modify FHiCL to run different modules and even configure those modules at runtime

  1. We have a new fcl file, hello2.fcl. Try running it:
       mu2e -c fcl/hello2.fcl
     We can see that we are running a different module, and that it has a "Magic number" that we should be able to change. Looking at hello2.fcl, you should see that the new module is called HelloWorld2.
  2. We can look at the source code for HelloWorld2 to see how to change the Magic number:
       gedit $MU2E_BASE_RELEASE/HelloWorld/src/HelloWorld2_module.cc
     In the constructor (lines 29 to 37), you can see that this module takes a fhicl::ParameterSet object, and that the magic number is initialized with the code
       pset.get<int>("magicNumber",-1)
     The fhicl::ParameterSet is made up of the table of definitions for that module under the physics table. So this line means that, in the FHiCL configuration of HelloWorld2, the module looks for a "variable : value" line where the variable name is "magicNumber" and the value is an integer. If no such line is present, the default value of -1 is used.
  3. Configure the fcl to set the Magic number to 5 by adding a line "magicNumber : 5" under module_type. Run the fcl again to check that it changed.
  4. You can also add this configuration to the end of the fcl file by using the full parameter location, i.e.
       physics.analyzers.hello2.magicNumber : 9
     Try adding this to the end of your file and see if the Magic number changed.
  5. Finally, try running both this module and the original HelloWorld module by copying the module declaration from hello.fcl and adding it to your end path. If you need help, check $TUTORIAL_BASE/solutions/hello2.fcl
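
The two ways of setting the parameter can be sketched together as below. This is an illustrative fragment, not the contents of hello2.fcl: the module label hello2 and the class HelloWorld2 come from the text above, while the path name e1 is carried over from Exercise 1 as an assumption.

```fhicl
physics : {
  analyzers : {
    hello2 : {
      module_type : HelloWorld2
      magicNumber : 5     # set inside the module's own table
    }
  }

  e1        : [ hello2 ]  # path name assumed, as in Exercise 1
  end_paths : [ e1 ]
}

# Override by full parameter location at the end of the file.
# In FHiCL a later definition replaces an earlier one, so the
# module is configured with 9 rather than 5.
physics.analyzers.hello2.magicNumber : 9
```

The end-of-file form is convenient when the module table comes from an included file that you do not want to edit.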

Exercise 3: Using a more realistic Mu2e fcl to simulate an event

To fully simulate an event in Mu2e, we will need to run many more modules and services. Modules can become dependent on output from previous modules and may require certain services to be set up, so the final FHiCL for a functioning Mu2e Offline job ends up being somewhat complex. To help make things easier, we use a few FHiCL tricks.

  1. Let's look at an example script that produces conversion electron events:
       gedit fcl/CeEndpoint.fcl
  2. At the top you will see a #include line. Let's look at the file it includes:
       gedit $MU2E_BASE_RELEASE/JobConfig/fcl/prolog.fcl
  3. You can see that this file includes several more files, and then starts with BEGIN_PROLOG. Prolog files are just a collection of FHiCL definitions that can be used later. You can see, for example, that it defines a table called Primary.producers.
    • Most directories in Offline have a prolog.fcl file that provides standard definitions for their modules and folders
  4. In JobConfig/fcl/prolog.fcl (and in fcl/CeEndpoint.fcl) you will see several definitions using "@local" or "@table". This is how you reference a previously defined value (for example, something defined in a prolog).
    • @local references a standard definition (for example, on line 16, the definition of a module)
    • @table references a table of several definitions, but without the curly braces (for example, line 18 adds several more module definition name:value pairs to the producers table)
    • @sequence references a list of values separated by commas, as used for a path (for example, on line 62, each sequence adds the names of several modules to this path)
    • For more details read https://mu2ewiki.fnal.gov/wiki/FclIntro
  5. Back in fcl/CeEndpoint.fcl, you should see @table::Primary.producers, which we found in JobConfig/fcl/prolog.fcl. See if you can find out where the generate module in this fcl comes from and what it is running.
  6. You can debug a complicated FHiCL file with lots of #includes using the Offline option --debug-config. This fully processes the script and writes the result to a file, with all the @local etc. references made explicit. Let's try it with our CeEndpoint.fcl:
       mu2e -c fcl/CeEndpoint.fcl --debug-config CeEndpoint-debug.fcl
     Look in CeEndpoint-debug.fcl for the generate module definition and check whether you had it right.
  7. The JobConfig directory has the base scripts for all the production jobs, and so has examples of how to correctly configure most kinds of Offline jobs. If you need to do anything different, you can usually start with a JobConfig script and modify it slightly.
    • JobConfig/primary has scripts to generate primary-only (no backgrounds) events without doing reconstruction
    • JobConfig/mixing has scripts to generate primary plus background events
    • JobConfig/reco has scripts to take the output of the previous two and run reconstruction
    • These scripts are designed to be used with grid production scripts, and so don't include configuration of the random seed. To run the scripts as is, you will need to add two lines:
       services.SeedService.maxUniqueEngines: 50
       services.SeedService.baseSeed: <put some number here>
  8. Let's try to make our own fcl script now. See if you can make and run a script that produces primary-only electron events with a flat momentum spectrum from 100 to 105 MeV/c.
    • Tip: start by copying JobConfig/fcl/flateminus.fcl
  9. Now see if you can turn on the StrawDigi diagnostic output. Looking at $MU2E_BASE_RELEASE/TrackerMC/src/StrawDigisFromStepPointMCs_module.cc, you will see a FHiCL parameter called diagLevel. Try increasing it in your script from 0 to 2.
  10. Run your script and check the output .root file; you should see a new TDirectory with the StrawDigi diagnostics.
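
The prolog and reference mechanisms used in this exercise can be sketched in a small stand-alone fragment. Everything named here (the Demo table, the module label generate, the class ExampleGenerator, the verbosity parameter) is hypothetical and chosen only for illustration; none of it is taken from JobConfig/fcl/prolog.fcl.

```fhicl
BEGIN_PROLOG
# Definitions made in a prolog can be referenced later.
# All names below are hypothetical.
Demo : {
  verbosity : 1
  producers : {
    generate : { module_type : ExampleGenerator }
  }
  path : [ generate ]
}
END_PROLOG

physics : {
  producers : {
    # @table splices in the name:value pairs without the braces
    @table::Demo.producers
  }

  # @sequence splices the list elements into this list
  p1 : [ @sequence::Demo.path ]
  trigger_paths : [ p1 ]
}

# @local substitutes a single previously defined value
physics.producers.generate.verbosity : @local::Demo.verbosity
```

Processing a file like this with --debug-config would be expected to show the references replaced by the values they point to, which is what makes that option useful for tracing where a module definition comes from.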

Exercise 4: Exploring Offline outputs

The above exercise should produce two files, dig.owner.CeEndpoint.version.sequencer.art and nts.owner.CeEndpoint.version.sequencer.root (also located in $TUTORIAL_BASE/RunningArt/data). Both are actually ROOT files, but they contain different information. The .root files produced by Offline hold diagnostic histograms and TTrees, and analysis output like TrkAna that can be used in a normal ROOT analysis. The .art files contain the actual C++ objects Offline uses to describe the event (both simulation information and reconstructed information), and so are in general meant to be processed by other Offline jobs.

  1. Open both files in a ROOT TBrowser to see their contents.
  2. The .root file will have a few histograms describing the event generator output. The .art file will have TTrees for Event/subRun/Run level information. If you open the Events TTree, you will see lots of branches with complicated names.
  3. We can use Offline modules to better understand the .art file contents:
       mu2e -c Print/fcl/dumpDataProducts.fcl --source dig.owner.CeEndpoint.version.sequencer.art
     The art "data products" are saved into .art files using the naming scheme
       className_moduleName_instanceName_processName
     Modules are not allowed to modify data in art. Instead, if you want to change a data product, a module creates a new, modified version. Since the saved name always includes the moduleName, it is possible to refer to only this modified version in future modules or analyses.
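
As a rough sketch of how the naming scheme is used downstream, a later module is typically pointed at one specific data product through a tag string built from the same pieces, in the form "moduleName:instanceName:processName". The fragment below assumes a hypothetical analyzer class ExampleHitReader with a hypothetical parameter hitsTag, reading products made by a hypothetical module labelled makeHits; none of these names come from Offline.

```fhicl
# Branch naming scheme: className_moduleName_instanceName_processName
# Hypothetical branch:  ExampleHits_makeHits__DemoProcess
#   (an empty instance name leaves two underscores in a row)

physics : {
  analyzers : {
    readHits : {
      module_type : ExampleHitReader   # hypothetical class
      # Select the product by the label of the module that made it.
      hitsTag : "makeHits"
      # A fuller form also fixes the instance and process names:
      # hitsTag : "makeHits::DemoProcess"
    }
  }

  e1        : [ readHits ]
  end_paths : [ e1 ]
}
```

If a second module produced a modified copy of the same class, changing the tag to that module's label would be enough to switch the reader to the modified version.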

Exercise 5: Create your own primary production job

  • Look in JobConfig/primary
  • Look at EventGenerator/fcl/prolog.fcl
  • Add seed information

Exercise 6: Running event reconstruction

  • FIXME need non mixing script
  • Use output of exercise 4 or $TUTORIAL_BASE/RunningArt/data/dig.owner.CeEndpoint.version.sequencer.art
mu2e -c JobConfig/fcl/mcdigis.fcl --source $TUTORIAL_BASE/RunningArt/data/dig.owner.CeEndpoint.version.sequencer.art
  • Run dumpDataProducts.fcl on the dig.*.art and the mcs.*.art files and compare

Exercise 7: Running TrkDiag to create TrkAna TTrees

mu2e -c TrkDiag/fcl/TrkAnaReco.fcl --source-list files.txt

Exercise 8: Using generate_fcl to prepare to run jobs on the grid

setup mu2etools

  • Look at JobConfig/examples/generate_CeEndpoint.sh
  • description / dsconf / dsowner
  • Generate grid jobs for digi production
  • Look at JobConfig/examples/generate_reco-CeEndpoint-mix.sh
  • inputs / merge-factor
  • Generate grid jobs for reco production

Exercise 9: Adding backgrounds with event mixing

  • Compare JobConfig/primary/CeEndpoint.fcl to JobConfig/mixing/CeEndpoint.fcl
  • locate background-cat.txt files
  • aux-input
  • generate_fcl
  • run interactively, note job length

Exercise 10: Submitting grid jobs with mu2eprodsys

setup mu2egrid

  • wfproject
  • setup vs code tarball
  • fcl on pnfs vs tarball

mu2eprodsys --dry-run

Exercise 11: Running the event display

Reference Materials