Module Writing Tutorial

From Mu2eWiki
Revision as of 22:34, 31 May 2019 by Edmonds (talk | contribs)

Tutorial Session Goal

This tutorial will show how to write an art module for Mu2e. It will explain how to structure the code, define runtime parameters, produce histograms and/or TTrees, and consume and produce data products.

Session Prerequisites and Advance Preparation

This tutorial requires the user to:

This tutorial also assumes that you are somewhat familiar with C++.

Session Introduction

Mu2e uses the art framework to organize its event processing code. The art framework processes events through a configurable, sequential set of modules called a path. Each module expects certain inputs and optionally produces certain outputs. Modules have a standard interface for interacting with the art framework, which gives the user access to data at run, subrun, and event transitions. Modules also have a standard library for defining configuration parameters that can be changed at runtime.
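As a rough mental model of a path, here is a toy sketch in plain C++. It does not use the art API: ToyEvent, ToyModule, runPath and makeToyPath are names invented for this illustration, and a real art event holds typed data products rather than a list of strings.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for an event: an id plus the labels of the products made so far.
struct ToyEvent {
  int id = 0;
  std::vector<std::string> products;
};

// Stand-in for a module: a label and a function that may add to the event.
struct ToyModule {
  std::string label;
  std::function<void(ToyEvent&)> process;
};

// Run every module of the path, in order, on every event.
void runPath(const std::vector<ToyModule>& path, std::vector<ToyEvent>& events) {
  for (ToyEvent& event : events) {
    for (const ToyModule& module : path) {
      module.process(event);
    }
  }
}

// Build a two-module path in which the second module relies on the first.
std::vector<ToyModule> makeToyPath() {
  ToyModule maker{"maker", [](ToyEvent& e) { e.products.push_back("tracks"); }};
  ToyModule user{"user", [](ToyEvent& e) {
    // Only produce an output if the expected input is already in the event.
    if (!e.products.empty() && e.products.back() == "tracks") {
      e.products.push_back("selectedTracks");
    }
  }};
  return {maker, user};
}
```

The order of the modules matters in this toy: if "user" ran before "maker" it would find no input and produce nothing.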

In this tutorial you will learn how to create an art module from a basic template, and perform basic data operations in that module. You will learn how to configure your module in code and fcl, and how to produce various kinds of output.

Disclaimer! In this tutorial we will be copying old modules to create new modules. In general, this is not a great idea because we can end up copying buggy code. If you do this in the real world, make sure you trust the code you're copying!

Basic Exercises

Exercise 1: Running a simple module (Hello, Tutorial!)

In this exercise, we will run a simple module that will print a welcoming message.

  1. First create a Satellite release in the ModuleWriting tutorial directory:
       > setup mu2e
       > cd $TUTORIAL_BASE/ModuleWriting
       > /cvmfs/mu2e.opensciencegrid.org/Offline/v7_4_0/SLF6/prof/Offline/bin/createSatelliteRelease --directory .
       > source setup.sh
  2. This first module has already been written for you, so we can compile straight away:
       > scons -j4
  3. And run:
       > mu2e -c fcl/HelloEx01.fcl
     This will write "Hello, world" and the full event id for the first 3 events.

Exercise 2: Adding module configuration (Hello, Fhicl Validation!)

In this exercise, we will add some module configuration parameters that can be set in fcl.

  1. Copy last exercise's .cc file to create a new file:
       > cp src/HelloTutorial_module.cc src/HelloFhiclValidation_module.cc
  2. Open the new file in your favourite text editor and make the following changes:
    1. find and replace "HelloTutorial" with "HelloFhiclValidation"
    2. in the Config struct, below the using declarations, add the following:
         fhicl::Atom<int> number{Name("number"), Comment("This module will print this number")};
    3. add a private member variable to the class:
         int _number;
    4. add to the constructor initialisation list:
         _number(conf().number())
    5. add a second std::cout command to print the number, e.g.
         std::cout << "My number is..." << _number << std::endl;
  3. Recompile with scons -j4 and fix any compilation errors
  4. Now create a copy of the fcl file, open it and make the following changes:
    1. find and replace "HelloTutorial" with "HelloFhiclValidation"
    2. add the new fhicl parameter to the module configuration and assign it a value
  5. Run mu2e with the new fcl file
  6. Try doing the following and note the errors you see:
    1. run without the new parameter in the fcl file
    2. run with a typo in the parameter name
    3. run with the value of the parameter as a float and as a string

    This is why fhicl validation is nice: we can catch errors in the module configuration before we waste time running the module. Some modules use an old format for module configuration, which does not catch such errors. These will be slowly converted, but all new modules should use fhicl validation.

  7. We can add a parameter with a default value by adding the value to the end of the parameter declaration (note that the fcl name must differ from that of the first parameter):
       fhicl::Atom<int> defaultNumber{Name("defaultNumber"), Comment("This module will print this number"), 100};
  8. Add some code to write this parameter out, recompile and play with the fcl configuration again. Note that it may be preferable to define default values in a fcl prolog instead.
  9. We can also add an optional parameter:
       fhicl::OptionalAtom<int> optionalNumber{Name("optionalNumber"), Comment("This module will print this number but it is optional")};
     However, we can't initialise this in the initialiser list. Instead we have to check that the parameter exists. The snippet below assumes that the configuration has been stored in a member called _conf and that _optionalNumber is an int member variable:
       if (_conf.optionalNumber(_optionalNumber)) {
         std::cout << "My _optionalNumber is..." << _optionalNumber << std::endl;
       }
  10. (Optional): add some more parameters (try floats, strings etc.)
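To make the three kinds of parameter concrete, here is a toy validator written in standard C++17. It is only a sketch of the idea and not how fhicl validation is implemented: ToyParameterSet, ToyConfig and validate are hypothetical names, and the toy only handles integer values.

```cpp
#include <map>
#include <optional>
#include <set>
#include <stdexcept>
#include <string>

// Stand-in for a parsed fcl parameter set: parameter name -> integer value.
using ToyParameterSet = std::map<std::string, int>;

// Validated configuration, mirroring the three parameters of the exercise.
struct ToyConfig {
  int number;                         // required, no default
  int defaultNumber;                  // falls back to 100 when absent
  std::optional<int> optionalNumber;  // may legitimately be absent
};

ToyConfig validate(const ToyParameterSet& pset) {
  static const std::set<std::string> known{"number", "defaultNumber", "optionalNumber"};
  // Unknown names (for example typos) are rejected before anything runs.
  for (const auto& entry : pset) {
    if (known.count(entry.first) == 0) {
      throw std::invalid_argument("unsupported parameter: " + entry.first);
    }
  }
  // A required parameter without a default must be present.
  const auto number = pset.find("number");
  if (number == pset.end()) {
    throw std::invalid_argument("missing parameter: number");
  }
  ToyConfig config{number->second, 100, std::nullopt};
  if (const auto it = pset.find("defaultNumber"); it != pset.end()) {
    config.defaultNumber = it->second;
  }
  if (const auto it = pset.find("optionalNumber"); it != pset.end()) {
    config.optionalNumber = it->second;
  }
  return config;
}
```

The toy reproduces two of the failures from the exercise: a missing required parameter and a misspelt parameter name both throw before any event would be processed.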

Exercise 3: Reading in a mu2e data product (Hello, Mu2e Data Product!)

In this exercise, we will be reading the results of the track fit and printing their times to the screen.

  1. Let's get a minimal working example set up:
    1. Copy the first exercise's .cc file to create a new module
    2. Add a fhicl parameter of type art::InputTag and a simple print statement that prints the parameter
    3. Copy a previous fcl file and edit it so that it runs your new module and uses the new parameter (at the moment, it doesn't matter what the value of this parameter is)
    4. Compile, run and make sure everything works as you expect
  2. Now we want to read in an art file. For the time being, we will do nothing with it.
    1. in the source block of your fcl file, change EmptyEvent to RootInput
    2. run your fcl file with this option added on the command line to make sure everything works:
         -S filelists/mcs.mu2e.CeEndpoint-mix-cat.MDC2018h.1-file.lst
  3. Now let's actually do something with a Mu2e data product. We will be looking at tracks (class KalSeed) from the downstream e-minus fit. Set the value of your fcl parameter to KFFDeM and then make the following changes in your module's analyze() function:
    1. #include the file RecoDataProducts/inc/KalSeed.hh (at the top of the source file)
    2. get a valid handle to the KalSeedCollection from the event:
         const auto& kalSeedCollectionHandle = event.getValidHandle<KalSeedCollection>(_input);
    3. get the KalSeedCollection itself:
         const auto& kalSeedCollection = *kalSeedCollectionHandle;
    4. loop through and print the t0 of each KalSeed (you can look in $MU2E_BASE_RELEASE/RecoDataProducts/inc/KalSeed.hh to work out why we need two .t0()):
         for (const auto& i_kalSeed : kalSeedCollection) {
           std::cout << "t0 = " << i_kalSeed.t0().t0() << " ns" << std::endl;
         }
  4. (Optional): try to print the times of a different KalSeedCollection (e.g. KFFDmuM)
  5. (Optional): try to print some other values from the KalSeed (e.g. the momentum, which can be found in KalSegment)
  6. (Optional): try to read a different type of Mu2e data product. You can find the list of data products in an event with:
       mu2e -c Print/fcl/dumpDataProducts.fcl -S filelist -n 1
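The double .t0() is easier to see with stand-in classes. The sketch below assumes that the outer t0() returns a small object that carries the time together with its uncertainty, and that the inner t0() returns the time itself. ToyT0, ToyKalSeed and describeTimes are invented for this illustration, so check KalSeed.hh for the real types.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the object returned by the outer t0(): a time and its error.
class ToyT0 {
public:
  ToyT0(double t0, double t0Err) : _t0(t0), _t0Err(t0Err) {}
  double t0() const { return _t0; }
  double t0Err() const { return _t0Err; }
private:
  double _t0;
  double _t0Err;
};

// Stand-in for the track class, keeping only the t0 accessor.
class ToyKalSeed {
public:
  explicit ToyKalSeed(const ToyT0& t0) : _t0(t0) {}
  const ToyT0& t0() const { return _t0; }
private:
  ToyT0 _t0;
};

using ToyKalSeedCollection = std::vector<ToyKalSeed>;

// Format the t0 of every seed in the same way as the loop in the exercise.
std::string describeTimes(const ToyKalSeedCollection& seeds) {
  std::ostringstream out;
  for (const auto& seed : seeds) {
    out << "t0 = " << seed.t0().t0() << " ns\n";
  }
  return out.str();
}
```

Returning a string rather than printing directly keeps the sketch easy to check; in the module itself you would stream to std::cout as shown in the exercise.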

Exercise 4: Filling a histogram (Hello, Histogram!)

In this exercise, we will create a ROOT histogram of the times that we were printing out to the screen in Exercise 3. We will need to use art's TFileService to create and write ROOT objects to a ROOT file.

  1. To start, you can either copy or edit the module from Exercise 3.
  2. Make the following changes to your module's source code:
    1. #include art/Framework/Services/Optional/TFileService.h and TH1F.h
    2. add a new private member variable of type TH1F* (called _hTrackTimes in the snippets below)
    3. add a new function to your module: void beginJob(). This function will run at the beginning of your job.
    4. in the beginJob() function add:
         art::ServiceHandle<art::TFileService> tfs;
         double min_t0 = 0;
         double max_t0 = 1700;
         double t0_width = 10;
         int n_t0_bins = (max_t0 - min_t0) / t0_width;
         _hTrackTimes = tfs->make<TH1F>("hTrackTimes", "Track t0", n_t0_bins, min_t0, max_t0);
    5. in the analyze() function of your module, fill the histogram with the track times:
         _hTrackTimes->Fill(i_kalSeed.t0().t0());
  3. To run mu2e, use the following command:
       mu2e -c yourEx04.fcl -S filelists/mcs.mu2e.CeEndpoint-mix-cat.MDC2018h.1-file.lst --TFileName out.root
     Note that, instead of defining the output ROOT filename on the command line, you can set the fcl parameter services.TFileService.fileName in your fcl file.
  4. Open the out.root file and explore it with a TBrowser to find your histogram!
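If the binning arithmetic in beginJob() is unclear, the following stand-alone sketch may help. ToyHistogram and numberOfBins are invented names and this is not ROOT's TH1F: the toy numbers its bins from 0 and ignores out-of-range values, whereas TH1F numbers its bins from 1 and keeps underflow and overflow bins.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a fixed-width 1D histogram with zero-based bins.
class ToyHistogram {
public:
  ToyHistogram(int nBins, double min, double max)
    : _min(min), _max(max), _counts(static_cast<std::size_t>(nBins), 0) {}

  // Values outside [min, max) are simply ignored in this toy.
  void fill(double value) {
    if (value < _min || value >= _max) return;
    const double width = (_max - _min) / static_cast<double>(_counts.size());
    std::size_t bin = static_cast<std::size_t>((value - _min) / width);
    if (bin >= _counts.size()) bin = _counts.size() - 1;  // guard against rounding
    ++_counts[bin];
  }

  int nBins() const { return static_cast<int>(_counts.size()); }
  int count(int bin) const { return _counts.at(static_cast<std::size_t>(bin)); }

private:
  double _min;
  double _max;
  std::vector<int> _counts;
};

// Same arithmetic as in beginJob(): the number of bins follows from the width.
int numberOfBins(double min, double max, double width) {
  return static_cast<int>((max - min) / width);
}
```

With the values from the exercise (0 to 1700 ns in steps of 10 ns) this gives 170 bins, so a track at 5 ns lands in the first bin and a track at 1695 ns in the last.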

Now that you can create a histogram, you can try some of these optional exercises:

  1. (Optional): make the histogram parameters (min, max, bin width) fcl parameters
  2. (Optional): create a second instance of your module and read in a different KalSeedCollection (e.g. KFFDmuM)
  3. (Optional): make the plot informative by adding axis labels

Exercise 5: Creating a subset of a data product (Hello, Cool Data Products!)

In this exercise, we will cut on the track times and then run the histogram maker from the previous exercise on the tracks that pass the cut.

  1. Copy the module from the previous exercise to create a new module (don't forget to rename the module itself)
  2. This new module will be a producer, rather than an analyzer like we have used before, so do a find and replace for the following:
    1. EDAnalyzer to EDProducer
    2. analyze to produce
    3. const art::Event& to art::Event& (because this module needs to be able to modify the event)
    4. remove the call to the EDProducer constructor in the constructor initializer list
  3. You can also remove the histogram-making parts of the module
  4. Create a new fcl file and make the following changes:
    1. add a new producers block in the physics block:
         producers : {
           cool : {
             module_type : HelloCoolDataProducts
             input : KFFDeM
           }
         }
    2. add cool to the p1 path
  5. Compile and run your new fcl file. At the moment, there should be no difference in the output compared with the previous exercise.
  6. Now that we have the basis of our job, we can start making changes. In the new producer module:
    1. create a new fhicl parameter that will take a float value for a time cut
    2. in the constructor, tell the module that we will be producing a new KalSeedCollection:
         produces<KalSeedCollection>();
    3. in the produce() function, we need to:
      • create a new KalSeedCollection to fill (actually we create a pointer):
          std::unique_ptr<KalSeedCollection> outputKalSeeds(new KalSeedCollection());
      • fill the new KalSeedCollection (inside the loop over the input KalSeeds):
          if (i_kalSeed.t0().t0() > _cut) {
            outputKalSeeds->push_back(i_kalSeed);
          }
      • write the new KalSeedCollection to the art::Event once we're done:
          event.put(std::move(outputKalSeeds));
  7. In your fcl file, change the input tag of your histogramming module to "cool"
  8. Recompile and run, and you should see that the histogram is cut at your cut value!
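The create, fill and put pattern of produce() can be tried outside of art with stand-in types. In the sketch below, SeedStandIn, OutputEventStandIn and produceLateSeeds are hypothetical names; only the use of std::unique_ptr and std::move mirrors the exercise.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a track: only the time is kept.
struct SeedStandIn {
  double t0;
};
using SeedStandInCollection = std::vector<SeedStandIn>;

// Stand-in for the event: put() takes ownership of the new collection.
struct OutputEventStandIn {
  std::unique_ptr<SeedStandInCollection> product;
  void put(std::unique_ptr<SeedStandInCollection> p) { product = std::move(p); }
};

// Copy the seeds that are later than the cut into a new collection
// and hand that collection over to the event.
void produceLateSeeds(const SeedStandInCollection& input, double cut,
                      OutputEventStandIn& event) {
  auto output = std::make_unique<SeedStandInCollection>();
  for (const auto& seed : input) {
    if (seed.t0 > cut) {
      output->push_back(seed);
    }
  }
  event.put(std::move(output));  // output is empty (null) after this line
}
```

After the call to put() the local pointer no longer owns anything, which is why the collection has to be completely filled before it is handed over.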

Exercise 6: Writing an art file (Hello, My Art File!)

In this exercise, we will be splitting the previous exercise into two different jobs. There is no source code to edit; all of the changes are in fcl files.

  1. First, create two copies of your Exercise 5 fcl file. One will be for reading and one will be for writing.
  2. In your reading fcl, delete everything related to your producer module
  3. In your writing fcl, delete everything related to your analyzer module
  4. Also in your writing fcl:
    1. add an outputs block (note, this is at the same level as the physics block):
         outputs : {
           MyOutput : {
             module_type : RootOutput
             SelectEvents : [p1]
             fileName : "my-art-file.art"
             outputCommands : [ "drop *_*_*_*",
                                "keep *_cool_*_*" ]
           }
         }
    2. add MyOutput to e1
  5. Run your writing fcl and then check that the resulting .art file only contains your KalSeeds:
       mu2e -c $MU2E_BASE_RELEASE/Print/fcl/dumpDataProducts.fcl -s my-art-file.art
  6. Now run your reading fcl on the art file you just created, and you should have a new ROOT file containing your time histogram!
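The outputCommands patterns have four underscore-separated fields: class name, module label, instance name and process name. The toy matcher below sketches how such a list could be evaluated, assuming the commands are applied in order and a later match overrides an earlier one. splitFields and isKept are invented names, only whole-field "*" wildcards are handled, and the product names in the usage note are made up for the illustration.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split "a_b_c_d" into its four underscore-separated fields.
std::array<std::string, 4> splitFields(const std::string& text) {
  std::array<std::string, 4> fields;
  std::istringstream in(text);
  for (auto& field : fields) {
    std::getline(in, field, '_');
  }
  return fields;
}

// Decide whether a product would be written out. Commands are applied in
// order, so a later matching command overrides an earlier one.
bool isKept(const std::string& productName, const std::vector<std::string>& commands) {
  const auto product = splitFields(productName);
  bool kept = true;  // assumption: a product is kept when no command matches
  for (const auto& command : commands) {
    // Each command is assumed to look like "keep <pattern>" or "drop <pattern>".
    const auto space = command.find(' ');
    const std::string action = command.substr(0, space);
    const auto pattern = splitFields(command.substr(space + 1));
    bool matches = true;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
      if (pattern[i] != "*" && pattern[i] != product[i]) {
        matches = false;
      }
    }
    if (matches) {
      kept = (action == "keep");
    }
  }
  return kept;
}
```

With the two commands from the exercise, a made-up product name such as "mu2e::KalSeeds_cool__HelloCool" is kept because the second command matches its module label, while "mu2e::KalSeeds_KFFDeM__reco" only matches the first command and is dropped.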

Exercise 7: Filtering events we don't want (Hello, Filter!)

You might notice that the number of events in your output art file is the same as in the input. That's because, although we dropped all data products except for our cool KalSeeds, we still write out an empty art::Event.

  1. Copy your previous producer module and make the following changes:
    1. EDProducer to EDFilter
    2. produce to filter
    3. the void return type of filter to bool (because we return true or false to tell art whether the event passes the filter)
    4. delete anything related to cutting or producing a new KalSeedCollection (you want to keep the input KalSeedCollection)
  2. This module will run over our "cool" KalSeedCollection and tell art whether there are any tracks in the collection or not. Add some logic to the filter function to do this.
  3. Now, let's set up the fcl file to write out the reduced number of events. Create a copy of your Exercise 6 writing fcl and make the following changes:
    1. add a filters block to the physics block:
         filters : {
           helloFilter : {
             module_type : HelloFilter
             input : "cool"
           }
         }
    2. add "helloFilter" to the path p1
  4. Compile and run your new writing fcl file and you will see that the number of events in the output file has fallen
  5. You should be able to run your Exercise 6 reading fcl with no changes on your new art file
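The filter logic itself is tiny. Here is a stand-alone sketch with stand-in types (TrackStandIn, passesFilter and countPassingEvents are invented names); in the real module the collection would come from the event via the input tag.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for a track and for the "cool" collection of one event.
struct TrackStandIn {
  double t0;
};
using TrackStandInCollection = std::vector<TrackStandIn>;

// Stand-in for the body of filter(): true means "keep this event".
bool passesFilter(const TrackStandInCollection& tracks) {
  return !tracks.empty();
}

// Count how many events would reach the output file.
int countPassingEvents(const std::vector<TrackStandInCollection>& events) {
  return static_cast<int>(std::count_if(events.begin(), events.end(), passesFilter));
}
```

Events whose collection is empty return false and so would not be written, which is the drop in event count that the exercise asks you to look for.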

Exercise 8: Creating a new data product (Hello, My Data Product!)

Let's say we want to create a data product that just contains the track t0 (an oversimplified example).

  1. Create a new class in a header file:
       #ifndef ModuleWriting_TrackTime_hh
       #define ModuleWriting_TrackTime_hh

       #include <vector>
       #include "RecoDataProducts/inc/KalSeed.hh"

       namespace mu2e {
         class TrackTime {
         public:
           TrackTime() : _time(0.0) {}
           TrackTime(const KalSeed& kseed) : _time(kseed.t0().t0()) {}
           double time() const { return _time; }
         private:
           double _time;
         };
         typedef std::vector<TrackTime> TrackTimeCollection;
       }
       #endif
  2. To be able to write this data product out to an art file we need to add two files into the src/ directory:
    1. classes_def.xml
         <lcgdict>
           <class name="mu2e::TrackTime"/>
           <class name="art::Wrapper<mu2e::TrackTime>"/>
           <class name="std::vector<mu2e::TrackTime>"/>
           <class name="art::Wrapper<std::vector<mu2e::TrackTime> >"/>
         </lcgdict>
    2. classes.h
         //
         // headers needed when genreflex creates root dictionaries
         // for objects written to art files
         //
         #define ENABLE_MU2E_GENREFLEX_HACKS
         #include <vector>
         #include "canvas/Persistency/Common/Wrapper.h"
         #include "inc/TrackTime.hh"
         #undef ENABLE_MU2E_GENREFLEX_HACKS
  3. Create copies of your producer (HelloCoolDataProducts), filter (HelloFilter) and analyzer (HelloHistogram) modules with new names and:
    1. change KalSeedCollection to TrackTimeCollection
    2. include the new TrackTime.hh file rather than KalSeed.hh
    3. be sure to use the correct function to get the track time
  4. Now we need a new producer module to create the TrackTimes from KalSeeds. Copy your original producer module that created a cut KalSeedCollection and make the following changes:
    1. remove anything to do with cutting
    2. change the output to be a TrackTimeCollection and create TrackTimes rather than KalSeeds
    3. add a check that the input KalSeedCollection and the output TrackTimeCollection are the same size (throw an exception if they are not)
  5. Create copies of your writing and reading fcls from the previous exercise and:
    1. update the modules in both fcl files to use your new modules that use TrackTime (this should just be a change to each module_type)
    2. add a configuration for your new producer module, e.g.
         trackTime : {
           module_type : HelloTrackTimeDataProducts
           input : "KFFDeM"
         }
    3. add the new producer to the start of the path p1
    4. change the input of your cutting module to use the new producer
  6. Recompile and run your writing fcl
  7. Use $MU2E_BASE_RELEASE/Print/fcl/dumpDataProducts.fcl to check that you only have TrackTimeCollections in your art file
  8. Run your reading fcl on your new art file and check that your histogram looks as expected
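The conversion step with its size check might look roughly like the following stand-alone sketch. MiniKalSeed and MiniTrackTime are stand-ins for the real classes, makeTrackTimes is an invented helper, and std::runtime_error is used only to keep the example self-contained (a real module would more likely throw the framework's own exception type).

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Stand-in for the input track: only the time is kept.
struct MiniKalSeed {
  double t0;
};

// Stand-in for the new data product.
class MiniTrackTime {
public:
  MiniTrackTime() : _time(0.0) {}
  explicit MiniTrackTime(const MiniKalSeed& seed) : _time(seed.t0) {}
  double time() const { return _time; }
private:
  double _time;
};
using MiniTrackTimeCollection = std::vector<MiniTrackTime>;

// Create one MiniTrackTime per input seed and check the sizes agree.
std::unique_ptr<MiniTrackTimeCollection> makeTrackTimes(
    const std::vector<MiniKalSeed>& seeds) {
  auto output = std::make_unique<MiniTrackTimeCollection>();
  output->reserve(seeds.size());
  for (const auto& seed : seeds) {
    output->emplace_back(seed);
  }
  if (output->size() != seeds.size()) {
    throw std::runtime_error("input and output collections differ in size");
  }
  return output;
}
```

In this simple loop the check can never fail, but it documents the one-to-one assumption and would catch a cut that is accidentally left in the copied code.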

Exercise 9: Creating and using a pointer to a data product (Hello, Art Ptrs!)

We want some way to go from the track time back to the original KalSeed. To do this, we will add an art::Ptr to the KalSeed in the data product from the previous exercise. This is, again, a silly example.

  1. To your TrackTime.hh file, make the following changes:
    1. add a new private member variable of type art::Ptr<KalSeed>
    2. add a getter function (const art::Ptr<KalSeed>& kalSeed() const) and a setter function (void setKalSeedPtr(const art::Ptr<KalSeed>& ptr))
  2. In order to create a Ptr, we need to have a handle to the collection and an index into the collection. In your TrackTime creating module, we need to make the following changes:
    1. move to an iterator-based for loop so that we can work out the index of each KalSeed (i.e. we want something like this now):
         for (auto i_kalSeed = kalSeedCollection.begin(); i_kalSeed != kalSeedCollection.end(); ++i_kalSeed) {
    2. create an art::Ptr and set it in the TrackTime class:
         TrackTime track_time(*i_kalSeed);
         art::Ptr<KalSeed> kseedPtr(kalSeedCollectionHandle, i_kalSeed - kalSeedCollection.begin());
         track_time.setKalSeedPtr(kseedPtr);
  3. The producer module needs to make the Ptr
  4. The analyzer module needs to use the art::Ptr (in a second job)
  5. Show the limits of Ptrs? E.g. add an additional condensing of the KalSeedCollection and show that we need to remake the TrackTimeCollection to get the Ptrs to be correct?
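To get a feeling for why a Ptr needs both a collection and an index, and for what goes wrong when the collection is condensed, here is a toy version in plain C++. ToyPtr, IndexedSeed and makePtrs are invented for this sketch; art::Ptr itself identifies its collection through the framework rather than through a raw pointer, so the toy only illustrates the index part of the problem.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a track: only the time is kept.
struct IndexedSeed {
  double t0;
};

// Toy pointer: remembers which collection it refers to and which index.
template <typename T>
class ToyPtr {
public:
  ToyPtr(const std::vector<T>& collection, std::size_t key)
    : _collection(&collection), _key(key) {}
  std::size_t key() const { return _key; }
  const T& operator*() const { return _collection->at(_key); }
  const T* operator->() const { return &_collection->at(_key); }
private:
  const std::vector<T>* _collection;
  std::size_t _key;
};

// Build one ToyPtr per element, computing the index from the iterator
// in the same way as the loop in the exercise.
std::vector<ToyPtr<IndexedSeed>> makePtrs(const std::vector<IndexedSeed>& seeds) {
  std::vector<ToyPtr<IndexedSeed>> ptrs;
  for (auto it = seeds.begin(); it != seeds.end(); ++it) {
    ptrs.emplace_back(seeds, static_cast<std::size_t>(it - seeds.begin()));
  }
  return ptrs;
}
```

If an element is later removed from the toy collection, every ToyPtr with a higher index silently refers to a different seed. That is the kind of problem the last step above hints at, and why the objects holding the pointers would have to be remade after any condensing.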

Exercise 10: Creating and using an assn between two data products (Hello, Art Assns!)

Another option for associating two data products is to use an art::Assns. This adds a third data product to the event, but the association is bidirectional.

  1. The producer module needs to make the Assns
  2. The analyzer module needs to use the Assns (in a second job)
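The idea of a separate, bidirectional association product can be sketched with a list of index pairs. ToyAssns below is a stand-in invented for this illustration and does not reproduce the art::Assns interface; it only shows that neither of the two collections has to store a pointer to the other.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy association product: pairs of (seed index, track-time index).
class ToyAssns {
public:
  void add(std::size_t seedIndex, std::size_t timeIndex) {
    _pairs.emplace_back(seedIndex, timeIndex);
  }

  // Look up in one direction: which track times belong to this seed?
  std::vector<std::size_t> timesForSeed(std::size_t seedIndex) const {
    std::vector<std::size_t> result;
    for (const auto& p : _pairs) {
      if (p.first == seedIndex) result.push_back(p.second);
    }
    return result;
  }

  // Look up in the other direction: which seeds belong to this track time?
  std::vector<std::size_t> seedsForTime(std::size_t timeIndex) const {
    std::vector<std::size_t> result;
    for (const auto& p : _pairs) {
      if (p.second == timeIndex) result.push_back(p.first);
    }
    return result;
  }

  std::size_t size() const { return _pairs.size(); }

private:
  std::vector<std::pair<std::size_t, std::size_t>> _pairs;
};
```

Because the pairs live in their own object, the TrackTime class from Exercise 8 would not need the extra Ptr member from Exercise 9, at the cost of one more product to write out and read back.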

Reference Materials

  • art workbook