Module Writing Tutorial

Tutorial Session Goal

This tutorial will show how to write an art module for Mu2e. It will explain how to structure the code, define runtime parameters, produce histograms and/or TTrees, and consume and produce data products.

Session Prerequisites and Advance Preparation

This tutorial requires the user to:

This tutorial also assumes that you are somewhat familiar with C++.

Session Introduction

Mu2e uses the art framework to organize its event processing code. The art framework processes events through a configurable, sequential set of modules called a path. Each module expects certain inputs and optionally produces certain outputs. Modules have a standard interface for interacting with the art framework, giving the user access to data at run/subrun/event transitions. Modules also have a standard library for defining configuration parameters that can be changed at runtime.

In this tutorial you will learn how to create an art module from a basic template, and perform basic data operations in that module. You will learn how to configure your module in code and fcl, and how to produce various kinds of output.

Disclaimer! In this tutorial we will be copying old modules to create new modules. In general, this is not a great idea because we can end up copying buggy code. If you do this in the real world, make sure you trust the code you're copying!

Basic Exercises

Exercise 1: Running a simple module (Hello, Tutorial!)

In this exercise, we will run a simple module that will print a welcoming message.

  1. First, create a Satellite release in the ModuleWriting tutorial directory
  2. > setup mu2e
     > cd $TUTORIAL_BASE/ModuleWriting
     > /cvmfs/mu2e.opensciencegrid.org/Offline/v7_4_0/SLF6/prof/Offline/bin/createSatelliteRelease --directory .
     > source setup.sh
  3. This first module has already been written for you so we can compile straight away
  4. > scons -j4
  5. And run
  6. mu2e -c Ex01/fcl/HelloEx01.fcl
     This will write "Hello, world" and the full event id for the first 3 events.

Exercise 2: Adding module configuration (Hello, Fhicl Validation!)

In this exercise, we will add some module configuration parameters that can be set in fcl

  1. We'll create a new directory for this exercise called Ex02/ and add src/ and fcl/ subdirectories
  2. Copy last exercise's .cc file and create a new file
  3. > cp Ex01/src/HelloTutorial_module.cc Ex02/src/HelloFhiclValidation_module.cc
  4. Open the new file in your favourite text editor and make the following changes
    1. find and replace HelloTutorial with HelloFhiclValidation
    2. in the Config struct, below the using commands add the following line to declare a fhicl parameter
    3. fhicl::Atom<int> number{Name("number"), Comment("This module will print this number")};
    4. add a private member variable to the class that will store the value of this parameter
    5. int _number;
    6. fill this member variable in the constructor initialization list:
    7. _number(conf().number())
    8. change the std::cout command to print this number
    9. std::cout << "My _number is..." << _number << std::endl;
  5. Copy the SConscript file from the previous exercise and recompile with scons -j4
  6. Now create a copy of the fcl file, open it and make the following changes:
    1. find and replace HelloTutorial with HelloFhiclValidation
    2. add the new fhicl parameter to the module configuration and assign it a value
    3. hello: { module_type : HelloFhiclValidation number : 5 }
  7. Now everything's ready to run mu2e so do that
  8. mu2e -c Ex02/fcl/HelloEx02.fcl
  9. Try doing the following and note the errors you see:
    1. run without the new parameter in the fcl file
    2. run with a typo in the parameter name
    3. run with the value of the parameter as a float and a string

    This is why fhicl validation is nice. We can catch errors in the module configurations before we waste time running the module. Some modules use an old format for module configuration, which does not catch such errors. These will be slowly converted but all new modules should use fhicl validation.

  10. We can give the parameter a default value by adding this value to the end of the parameter declaration:
  11. fhicl::Atom<int> defaultNumber{Name("defaultNumber"), Comment("This module will print this number"), 100};
  12. Add another member variable for this parameter, fill it in the constructor initialization and add another cout statement to print this value
  13. Recompile and run without changing the fcl file
  14. Now add the defaultNumber parameter to your fcl file and play with different values
  15. There are some occasions where we want a parameter to be optional. This can be achieved by declaring the parameter like so:
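  16. fhicl::OptionalAtom<int> optionalNumber{Name("optionalNumber"), Comment("This module will print this number but it is optional")};
      However, we can't initialise this in the initializer list since it might not exist. Instead, we have to check whether it exists in the analyze function. The call below returns true, and fills _optionalNumber, only if the parameter was set in fcl (the snippet assumes that the module keeps its configuration in a member called _conf and declares an int member called _optionalNumber):
      if (_conf.optionalNumber(_optionalNumber)) {
        std::cout << "My _optionalNumber is..." << _optionalNumber << std::endl;
      }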
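
The three kinds of parameter used in this exercise (required, defaulted and optional) behave differently when the parameter is missing from the fcl file. The sketch below imitates that behaviour with the C++ standard library only. It is not the fhicl API: the Table type and the three helper functions are hypothetical stand-ins invented for illustration, and real fhicl validation also checks types and unknown names.

```cpp
#include <map>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a parsed fcl table: parameter name -> integer value.
using Table = std::map<std::string, int>;

// Required parameter (like an Atom without a default): a missing name is an error.
int requiredInt(const Table& table, const std::string& name) {
  const auto it = table.find(name);
  if (it == table.end()) {
    throw std::runtime_error("missing required parameter: " + name);
  }
  return it->second;
}

// Defaulted parameter (like an Atom with a default): fall back if the name is missing.
int defaultedInt(const Table& table, const std::string& name, int fallback) {
  const auto it = table.find(name);
  return it == table.end() ? fallback : it->second;
}

// Optional parameter (like an OptionalAtom): the caller must check before using it.
std::optional<int> optionalInt(const Table& table, const std::string& name) {
  const auto it = table.find(name);
  if (it == table.end()) {
    return std::nullopt;
  }
  return it->second;
}
```

With a table that only contains number, requiredInt returns its value, defaultedInt falls back to the default, and optionalInt returns an empty optional, which is why the optional parameter has to be checked inside analyze rather than read unconditionally.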

That's it for this exercise, here are some optional things to try:

  1. (Optional): add more parameters of different types (try floats, strings etc.)

Exercise 3: Reading in a mu2e data product (Hello, KalSeed!)

In this exercise, we will be reading the results of the track fit and printing their times to the terminal.

  1. Create a new Ex03/ directory with src/ and fcl/ subdirectories
  2. Copy the first exercise's .cc file to create a new module and make the following changes
  3. cp Ex01/src/HelloTutorial_module.cc Ex03/src/HelloKalSeed_module.cc
    1. find and replace HelloTutorial with HelloKalSeed
    2. add a fhicl parameter of type art::InputTag and a member variable to store the value
    3. fill the member variable and print its value in the constructor
  4. Copy a previous fcl file and edit it so that it runs your new module and uses the new parameter (at the moment, it doesn't matter what the value of this parameter is).
  5. Compile, run and make sure everything works as you expect
  6. mu2e -c Ex03/fcl/HelloEx03.fcl
  7. Now we want to read in an art file. In this step, we will just change the fcl file so that it can start from an art file input
    1. in the source block of your fcl file, change EmptyEvent to RootInput
    2. run your fcl file on this file list to make sure everything works
    3. mu2e -c Ex03/fcl/HelloEx03.fcl -S filelists/mcs.mu2e.CeEndpoint-mix-cat.MDC2018h.1-file.lst
  8. Now let's actually do something with a Mu2e data product. We will be looking at tracks (class KalSeed) from the downstream e-minus fit. Set the value of your fcl parameter to KFFDeM and then make the following changes in your module's analyze() function
    1. #include the file RecoDataProducts/inc/KalSeed.hh
    2. get a valid handle to the KalSeedCollection from the event:
    3. const auto& kalSeedCollectionHandle = event.getValidHandle<KalSeedCollection>(_input);
    4. get the KalSeedCollection itself
    5. const auto& kalSeedCollection = *kalSeedCollectionHandle;
    6. loop through and print the t0 of each KalSeed (you can look in $MU2E_BASE_RELEASE/RecoDataProducts/inc/KalSeed.hh to work out why we need two .t0()s)
    7. for (const auto& i_kalSeed : kalSeedCollection) {
         std::cout << "t0 = " << i_kalSeed.t0().t0() << " ns" << std::endl;
       }

You have just read an already existing Mu2e data product from a Mu2e art file! Here are some optional exercises to explore this topic further:

  1. (Optional): try to print the times of a different KalSeedCollection (e.g. KFFDmuM)
  2. (Optional): try to print some other values from the KalSeed (e.g. the momentum, which can be found in KalSegment)
  3. (Optional): try to read a different type of Mu2e data product (you can find the list of data products in an event with mu2e -c $MU2E_BASE_RELEASE/Print/fcl/dumpDataProducts.fcl -S filelist -n 1)

Exercise 4: Filling a histogram (Hello, Histogram!)

In this exercise, we will create a ROOT histogram of the times that we were printing out to the screen in Exercise 3. We will need to use art's TFileService to create and write ROOT objects to a ROOT file.

  1. Create a new Ex04/ directory with src/ and fcl/ subdirectories
  2. Copy Exercise 3's module and make the following changes:
  3. cp Ex03/src/HelloKalSeed_module.cc Ex04/src/HelloHistogram_module.cc
    1. find and replace HelloKalSeed with HelloHistogram
    2. #include art/Framework/Services/Optional/TFileService.h and TH1F.h
    3. add a new private member variable of type TH1F* (called _hTrackTimes in the code below)
    4. add a new function to your module: void beginJob() (this is a standard art module function that will run at the beginning of your job)
    5. in the beginJob() function add:
    6. art::ServiceHandle<art::TFileService> tfs;
       double min_t0 = 0;
       double max_t0 = 1700;
       double t0_width = 10;
       int n_t0_bins = (max_t0 - min_t0) / t0_width;
       _hTrackTimes = tfs->make<TH1F>("hTrackTimes", "Track t0", n_t0_bins, min_t0, max_t0);
    7. in the analyze() function of your module, fill the histogram with the track times:
    8. _hTrackTimes->Fill(i_kalSeed.t0().t0());
  4. Copy a previous fcl file and change it to make sure it runs the new module (this should just be a change to the module_type)
  5. Compile and run with the following command:
  6. mu2e -c Ex04/fcl/HelloEx04.fcl -S filelists/mcs.mu2e.CeEndpoint-mix-cat.MDC2018h.1-file.lst --TFileName out.root
     Note that, instead of defining the output ROOT filename on the command line, you can set the fcl parameter services.TFileService.fileName in your fcl file.
  7. Open the out.root file and explore it with a TBrowser to find your histogram!
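
The binning arithmetic in beginJob() can be checked outside of art. The sketch below uses only the C++ standard library; the function names numberOfBins and binIndex are hypothetical helpers, not part of ROOT. The bin numbering follows the usual TH1 convention, where bin 0 is the underflow bin, bins 1 to nBins cover the axis range, and bin nBins+1 is the overflow bin.

```cpp
// Number of fixed-width bins between min and max (truncates, like the int conversion in beginJob()).
int numberOfBins(double min, double max, double width) {
  return static_cast<int>((max - min) / width);
}

// Bin number for a value x, using the TH1 numbering convention:
// 0 = underflow, 1..nBins = inside the axis range, nBins + 1 = overflow.
int binIndex(double x, double min, double max, int nBins) {
  if (x < min) {
    return 0;
  }
  if (x >= max) {
    return nBins + 1;
  }
  return 1 + static_cast<int>((x - min) / (max - min) * nBins);
}
```

With the values used in this exercise (0 to 1700 ns in 10 ns bins), numberOfBins gives 170 bins, and a track with t0 = 705 ns lands in bin 71. A track time at or beyond 1700 ns is counted in the overflow bin rather than being drawn.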

Now that you can create a histogram, you can try some of these optional exercises

  1. (Optional): make the histogram parameters (min, max, bin width) fcl parameters
  2. (Optional): create a second instance of your module and read in a different KalSeedCollection (e.g. KFFDmuM)
  3. (Optional): make the plot informative by adding axis labels

Exercise 5: Creating a subset of a data product (Hello, Cool KalSeed!)

In this exercise, we will find some "cool" KalSeeds that arrive fashionably late by cutting on the track times, and put them into their own collection. We will reuse the HelloHistogram module to plot this "cool" collection.

  1. Create a new Ex05/ directory with src/ and fcl/ subdirectories
  2. Copy the module from Exercise 4
  3. cp Ex04/src/HelloHistogram_module.cc Ex05/src/HelloCoolKalSeed_module.cc
  4. The new module will be a producer, rather than an analyzer, so do a find and replace for the following in HelloCoolKalSeed_module.cc
    1. EDAnalyzer to EDProducer
    2. analyze to produce
    3. const art::Event& to art::Event& (because this module needs to be able to modify the event)
    4. remove the call to EDProducer in the constructor initializer list
  5. Also remove the histogram making parts of this module
  6. Before we do any cutting, let's make sure we have a working fcl
  7. Copy the fcl file from Exercise 4 and make the following changes:
    1. add a new producer block in the physics block like so:
    2. producers : { cool : { module_type : HelloCoolKalSeed input : KFFDeM } }
    3. add cool to the p1 path
  8. Compile and run your new fcl file. The output should be the same as in the previous exercise
  9. Now that we have the basis of our job, we can start making changes.
  10. In the HelloCoolKalSeed module, make the following changes:
    1. create a new fhicl parameter that will take a float value for a time cut
    2. in the body of the constructor, we need to tell art that the module will be producing a new KalSeedCollection
    3. produces<KalSeedCollection>();
    4. in the produce() function, we need to:
      • create a new KalSeedCollection to fill (actually we create a pointer)
      std::unique_ptr<KalSeedCollection> outputKalSeeds(new KalSeedCollection());
      • fill the new KalSeedCollection
      if (i_kalSeed.t0().t0() > _cut) { outputKalSeeds->push_back(i_kalSeed); }
      • write the new KalSeedCollection to the art::Event once we're done
      event.put(std::move(outputKalSeeds));
  11. In your fcl file
    1. change the input tag of the histogramming module to cool
  12. Recompile and run, and you should see that the histogram is cut at your cut value!
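
The selection logic of produce() can be tried out without art. In the sketch below, Track and TrackCollection are hypothetical stand-ins for KalSeed and KalSeedCollection, and selectLateTracks is an illustrative helper name. It only shows the pattern of filling a new collection owned by a std::unique_ptr, which the real module would then hand over to the event with std::move.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for mu2e::KalSeed: only the track time is kept.
struct Track {
  double t0;  // track time in ns
};
using TrackCollection = std::vector<Track>;

// Copy every track that is later than the cut into a newly created collection.
std::unique_ptr<TrackCollection> selectLateTracks(const TrackCollection& input, double cut) {
  auto output = std::make_unique<TrackCollection>();
  for (const auto& track : input) {
    if (track.t0 > cut) {
      output->push_back(track);
    }
  }
  return output;
}
```

Because the comparison is strictly greater-than, a track exactly at the cut value is rejected. The input collection is left untouched, which matches the way art treats data products already in the event as read-only.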

Now that you have created a subset of an already existing collection, you can try some of these optional exercises:

  1. (Optional): change the cut value in fcl and make sure the output is as you expect
  2. (Optional): add some cuts on different variables
  3. (Optional): make some of the cuts optional

Exercise 6: Writing an art file (Hello, My Art File!)

In this exercise, we will be splitting the previous exercise into two different jobs to show how to write an art file. There will be no changes to the source code since this can all be done in fcl.

  1. Create a new Ex06/ directory with only an fcl/ subdirectory
  2. Create two copies of your Exercise 5 fcl file. One will be for reading and one will be for writing.
  3. cp Ex05/fcl/HelloEx05.fcl Ex06/fcl/HelloEx06_read.fcl
     cp Ex05/fcl/HelloEx05.fcl Ex06/fcl/HelloEx06_write.fcl
  4. In your reading fcl, delete everything related to your producer module
  5. In your writing fcl, delete everything related to your analyzer module
  6. Also in your writing fcl:
    1. add an output block (note, this is at the same level as the physics block):
    2. outputs : {
         MyOutput : {
           module_type : RootOutput
           SelectEvents : [p1]
           fileName : "my-art-file.art"
           outputCommands : [ "drop *_*_*_*",
                              "keep *_cool_*_*" ]
         }
       }
    3. add MyOutput to the end path e1
  7. Run your writing fcl and then check that the resulting .art file only contains your KalSeeds:
  8. mu2e -c $MU2E_BASE_RELEASE/Print/fcl/dumpDataProducts.fcl -s my-art-file.art
  9. Now run your reading fcl on the art file you just created and you should have a new ROOT file with your time histogram in!

Now that you can write an art file, try some of the following optional exercises:

  1. (Optional): write out some other data products along with your cool KalSeeds

Exercise 7: Filtering events we don't want (Hello, Filter!)

You might notice that the number of events in your output art file is the same as in the input, even though we are only writing out the cool KalSeeds and we are cutting on those. This is because we still write out the art::Event even if its collection is empty. In this exercise, we will add a filter module to remove events that we expect to be empty.

  1. Create a new Ex07/ directory with src/ and fcl/ subdirectories
  2. Copy your previous producer module and make the following changes:
  3. cp Ex05/src/HelloCoolKalSeed_module.cc Ex07/src/HelloFilter_module.cc
    1. HelloCoolKalSeed to HelloFilter
    2. EDProducer to EDFilter
    3. produce to filter
    4. the void return-type of filter() to bool (because we will return true or false to art to say whether the event passes the filter)
    5. delete anything related to cutting or producing a new KalSeedCollection (you want to keep the input KalSeedCollection)
  4. This module will run over our "cool" KalSeedCollection and tell art whether there are any tracks in the collection or not. Add some logic to the filter() function to do this.
  5. Now, let's set up the fcl file to write out only the events that pass this filter. Create copies of your Exercise 6 fcls and make the following changes to the writing one:
    1. add a filters block to the physics block:
    2. filters : { helloFilter : { module_type : HelloFilter input : "cool" } }
    3. add "helloFilter" to the path p1
  6. Compile and run your new writing fcl file and you will see that the number of events in the output file has fallen
  7. You should be able to run your reading fcl on your new art file and the output histogram should be the same
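
One possible form of the filter decision is sketched below with the standard library only. Track and TrackCollection are hypothetical stand-ins for KalSeed and KalSeedCollection, and keepEvent and countKeptEvents are illustrative names. In the real module the same test would be the return value of filter().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for mu2e::KalSeed.
struct Track {
  double t0;  // track time in ns
};
using TrackCollection = std::vector<Track>;

// The filter decision: keep the event only if its "cool" collection is not empty.
bool keepEvent(const TrackCollection& coolTracks) {
  return !coolTracks.empty();
}

// Count how many events in a job would pass the filter (one collection per event).
std::size_t countKeptEvents(const std::vector<TrackCollection>& events) {
  std::size_t kept = 0;
  for (const auto& coolTracks : events) {
    if (keepEvent(coolTracks)) {
      ++kept;
    }
  }
  return kept;
}
```

Since the filter only removes events whose collection is empty, it cannot change the contents of the histogram, which is why the reading job gives the same result as before.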

Exercise 8: Creating a new data product (Hello, My Data Product!)

Let's say we want to create a data product that just contains the track t0 (an oversimplified example)

  1. Create a new Ex08/ directory with src/ and fcl/ subdirectories
  2. Create a new class in a header file:
  3. #ifndef ModuleWriting_TrackTime_hh
     #define ModuleWriting_TrackTime_hh

     #include <vector>
     #include "RecoDataProducts/inc/KalSeed.hh"

     namespace mu2e {
       class TrackTime {
       public:
         // a default constructor is needed so that the class can be read back from an art file
         TrackTime() : _time(0.0) {}
         TrackTime(const KalSeed& kseed) : _time(kseed.t0().t0()) {}

         double time() const { return _time; }

       private:
         double _time; // track t0
       };
       typedef std::vector<TrackTime> TrackTimeCollection;
     }
     #endif
  4. To be able to write this data product out to an art file we need to add two files into the src/ directory:
    1. classes_def.xml
    2. <lcgdict>
         <class name="mu2e::TrackTime"/>
         <class name="art::Wrapper<mu2e::TrackTime>"/>
         <class name="std::vector<mu2e::TrackTime>"/>
         <class name="art::Wrapper<std::vector<mu2e::TrackTime> >"/>
       </lcgdict>
    3. classes.h
    4. //
       // headers needed when genreflex creates root dictionaries
       // for objects written to art files
       //
       #define ENABLE_MU2E_GENREFLEX_HACKS
       #include <vector>
       #include "canvas/Persistency/Common/Wrapper.h"
       #include "inc/TrackTime.hh"
       #undef ENABLE_MU2E_GENREFLEX_HACKS
  5. Create copies of your producer (HelloCoolKalSeed), filter (HelloFilter) and analyzer (HelloHistogram) modules with new names and:
    1. change KalSeedCollection to TrackTimeCollection
    2. include the new TrackTime.hh file rather than KalSeed.hh
    3. be sure to use the correct function to get the track time
  6. Now we need a new producer module to create the TrackTimes from KalSeeds. Copy your original producer module that created a cut KalSeedCollection and make the following changes:
    1. remove anything to do with cutting
    2. change the output to be a TrackTimeCollection and create TrackTimes rather than KalSeeds
    3. add a check that the input KalSeedCollection and the output TrackTime collection are the same size (throw an exception if they are not)
  7. Create copies of your writing and reading fcls from the previous exercise and
    1. update the modules in both fcl files to use your new modules that use TrackTime (should just be changes to each module_type)
    2. add a configuration of your new producer module e.g.
    3. trackTime : { module_type : HelloTrackTimeDataProducts input : "KFFDeM" }
    4. add the new producer to the start of the path p1
    5. change the input of your cutting module to use the new producer
  8. Recompile and run your writing module
  9. Use $MU2E_BASE_RELEASE/Print/fcl/dumpDataProducts.fcl to check that you only have TrackTimeCollections in your art file
  10. Run your reading fcl on your new art file and check that your histogram looks as expected
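
The conversion step of the new producer, including the size check, can be sketched without art as follows. Seed is a hypothetical stand-in for KalSeed, the TrackTime class here is a simplified copy of the one above, and makeTrackTimes is an illustrative helper name. The sketch throws std::runtime_error only to stay self-contained; this is an assumption of the example, and a real module would more likely use the exception type preferred by the framework.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for mu2e::KalSeed.
struct Seed {
  double t0;  // track time in ns
};

// Simplified copy of the TrackTime data product from this exercise.
class TrackTime {
public:
  TrackTime() : _time(0.0) {}
  explicit TrackTime(const Seed& seed) : _time(seed.t0) {}
  double time() const { return _time; }

private:
  double _time;
};
using TrackTimeCollection = std::vector<TrackTime>;

// Build one TrackTime per input seed and verify that nothing was lost or duplicated.
TrackTimeCollection makeTrackTimes(const std::vector<Seed>& seeds) {
  TrackTimeCollection output;
  output.reserve(seeds.size());
  for (const auto& seed : seeds) {
    output.emplace_back(seed);
  }
  if (output.size() != seeds.size()) {
    throw std::runtime_error("TrackTimeCollection and input collection differ in size");
  }
  return output;
}
```

In this simple loop the check can never fail, but it documents the one-to-one assumption and protects against later changes, such as a cut being added to the loop by mistake.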

Exercise 9: Creating and using a pointer to a data product (Hello, Art Ptrs!)

We want some way to go from the track time back to the original KalSeed. We will add a Ptr to the KalSeed in the previous data product. This is, again, a silly example.

  1. Create a new Ex09/ directory with src/ and fcl/ subdirectories
  2. To your TrackTime.hh file, make the following changes:
    1. add a new private member variable of type art::Ptr<KalSeed>
    2. add a getter function (const art::Ptr<KalSeed> kalSeed() const) and a setter function (void setKalSeedPtr(const art::Ptr<KalSeed>& ptr))
  3. In order to create a Ptr, we need to have a handle to the collection and an index into the collection. In your TrackTime creating module, we need to make the following changes:
    1. move to an iterator-based for loop, so that we can work out the index of each KalSeed (i.e. we want something like this now):
    2. for (auto i_kalSeed = kalSeedCollection.begin(); i_kalSeed != kalSeedCollection.end(); ++i_kalSeed) {
    3. create an art::Ptr and set it in the TrackTime class
    4. TrackTime track_time(*i_kalSeed);
       art::Ptr<KalSeed> kseedPtr(kalSeedCollectionHandle, i_kalSeed - kalSeedCollection.begin());
       track_time.setKalSeedPtr(kseedPtr);
  4. The producer module needs to create the Ptr (as in the previous step)
  5. The analyzer module needs to use the art::Ptr (in a second job)
  6. show the limits of Ptrs? e.g. add an additional condensing of the KalSeedCollection and show that we need to remake the TrackTimeCollection to get the Ptrs to be correct?
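
The idea that a Ptr is built from a collection and an index can be illustrated with a simplified stand-in. The sketch below is not how art::Ptr is implemented: SeedRef, Seed and makeRefs are hypothetical names, and a raw pointer to the collection takes the place of the handle. It only shows the index arithmetic from the loop above, and why an index is meaningful only for the collection it was computed from.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for mu2e::KalSeed.
struct Seed {
  double t0;  // track time in ns
};
using SeedCollection = std::vector<Seed>;

// Simplified stand-in for art::Ptr<KalSeed>: which collection, and which element in it.
struct SeedRef {
  const SeedCollection* collection;
  std::size_t index;
  const Seed& get() const { return (*collection)[index]; }
};

// Build one reference per element, computing the index from the iterator position.
std::vector<SeedRef> makeRefs(const SeedCollection& seeds) {
  std::vector<SeedRef> refs;
  refs.reserve(seeds.size());
  for (auto it = seeds.begin(); it != seeds.end(); ++it) {
    const auto index = static_cast<std::size_t>(it - seeds.begin());
    refs.push_back(SeedRef{&seeds, index});
  }
  return refs;
}
```

If a second, condensed collection is made by dropping some seeds, the stored indices no longer line up with it: index 1 in the original collection and index 1 in the condensed one are generally different tracks. This is the reason the references would have to be remade after condensing.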

Exercise 10: Creating and using an assn between two data products (Hello, Art Assns!)

Another option for associating two data products is to use an art::Assns. This adds a third data product to the event, but the association is bidirectional.

  1. Create a new Ex10/ directory with src/ and fcl/ subdirectories
  2. The producer module needs to create the Assns
  3. The analyzer module needs to use the Assns (in a second job)
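
To show what "bidirectional" means here, the sketch below stores an association as a separate list of index pairs and looks it up from either side. This is a conceptual stand-in using the standard library only, not the art::Assns interface. The names Associations, trackTimesForSeed and seedsForTrackTime are hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One entry links an index into the seed collection (first)
// to an index into the track time collection (second).
using IndexPair = std::pair<std::size_t, std::size_t>;
using Associations = std::vector<IndexPair>;

// Look up from the seed side: which track times belong to this seed?
std::vector<std::size_t> trackTimesForSeed(const Associations& assns, std::size_t seedIndex) {
  std::vector<std::size_t> result;
  for (const auto& entry : assns) {
    if (entry.first == seedIndex) {
      result.push_back(entry.second);
    }
  }
  return result;
}

// Look up from the track time side: which seeds belong to this track time?
std::vector<std::size_t> seedsForTrackTime(const Associations& assns, std::size_t trackTimeIndex) {
  std::vector<std::size_t> result;
  for (const auto& entry : assns) {
    if (entry.second == trackTimeIndex) {
      result.push_back(entry.first);
    }
  }
  return result;
}
```

Unlike the Ptr approach of the previous exercise, neither data product class has to be modified to hold the link: the association lives in its own object, which is the "third product" mentioned above.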

Reference Materials

  • art workbook