MakeProducts: Difference between revisions

From Mu2eWiki

Revision as of 21:57, 23 March 2017


Introduction

A Data Product is anything that you can add to an event or see in an event. Examples include the generated particles, the simulated particles produced by Geant4, the hits produced by Geant4, tracks found by the reconstruction algorithms, clusters found in the calorimeters and so on.

This page contains a short description of how to add a data product to an event and how to define a new type of data product. For more complete information, consult the CMS documentation for making new products.


A Minimal Module

The code fragment below shows a minimal example of an EDProducer module that adds a StrawHitCollection to the event. If you look back through the nested header files, you will see the StrawHitCollection is just a typedef for std::vector<mu2e::StrawHit> and that StrawHit is a very simple class, really nothing more than a simple struct.

#include <memory>

#include "art/Framework/Core/EDProducer.h"
#include "art/Framework/Core/ModuleMacros.h"
#include "art/Framework/Principal/Event.h"
#include "fhiclcpp/ParameterSet.h"
#include "MCDataProducts/inc/StrawHitCollection.hh"

namespace mu2e{

 class MyClass : public art::EDProducer {

  public:
    explicit MyClass(fhicl::ParameterSet const& pSet)
    {
      produces<StrawHitCollection>();
    }
    virtual ~MyClass() { }
    virtual void produce(art::Event& e );
 };

 void MyClass::produce(art::Event& event ) {

  std::unique_ptr<StrawHitCollection> p(new StrawHitCollection);

  // Some sort of loop to fill the collection:
  for ( int i=0; i<10; ++i){
      p->push_back(StrawHit(...));
   }

   event.put(std::move(p));
 }

} // end of namespace mu2e

using mu2e::MyClass;
DEFINE_ART_MODULE(MyClass);

In the above fragment there is a member function of MyClass named produce (singular) and a member function of the base class named produces (plural); the second function is called in the constructor of MyClass. The following text refers to both - so pay attention to which of the two is being discussed. The following pattern describes any producer module:

  1. The class must inherit from art::EDProducer.
  2. The constructor must tell the framework what it produces; it does so via the call to produces<StrawHitCollection>(). This is described in more depth below.
  3. Data products are added to an event inside the produce method. A three step pattern is used:
    1. Create a unique_ptr to an empty object.
    2. Fill the object.
    3. Give the unique_ptr to the event.

    It might be possible to create a fully formed object in step 1; in that case there is no step 2.

  4. The code must invoke the macro DEFINE_ART_MODULE as shown in the last two lines. This line may appear anywhere in the file after the definition of the class. Mu2e has adopted the convention of putting it at the end of the file.
  5. After the call to event.put(...), the variable p no longer points at anything. If you try to use it, you will get a run-time error. Therefore you should run diagnostics and other things that read your data product before the call to event.put(...).

You might try the following: call event.put(...) and then get the data product out of the event using one of the get methods. This will not work. The reason is that a data product is not actually registered with the event until the produce method of the module returns. The logic behind this restriction is that if a module fails, then none of its data products should be available via the get interface; therefore event.put(...) only schedules the data product for addition to the event, and that addition occurs when the module returns from the produce call.
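
The two points above (the pointer is empty after event.put, and the product cannot be retrieved until produce returns) can be imitated with a small toy model that uses only the standard library. This is a sketch under stated assumptions, not art's implementation: ToyEvent, ToyHit, commit, ProduceReport and toyProduce are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins; none of these names are art or Mu2e classes.
struct ToyHit { int channel; };
using ToyHitCollection = std::vector<ToyHit>;

class ToyEvent {
public:
  // Take ownership of the product, but only schedule it for addition.
  void put(std::unique_ptr<ToyHitCollection> product) {
    assert(product != nullptr);  // a null product is a programming error in this toy
    pending_ = std::move(product);
  }

  // Returns nullptr until commit() has been called.
  ToyHitCollection const* get() const { return committed_.get(); }

  // Called by the toy "framework" after produce has returned successfully.
  void commit() {
    if (pending_) committed_ = std::move(pending_);
  }

private:
  std::unique_ptr<ToyHitCollection> pending_;
  std::unique_ptr<ToyHitCollection> committed_;
};

// What the toy produce function observed while it was running.
struct ProduceReport {
  std::size_t sizeSeenBeforePut;
  bool pointerEmptyAfterPut;
  bool visibleInsideProduce;
};

ProduceReport toyProduce(ToyEvent& event) {
  std::unique_ptr<ToyHitCollection> p(new ToyHitCollection);
  for (int i = 0; i < 10; ++i) p->push_back(ToyHit{i});

  ProduceReport report{};
  report.sizeSeenBeforePut = p->size();            // diagnostics belong before put
  event.put(std::move(p));
  report.pointerEmptyAfterPut = (p == nullptr);    // p gave up ownership
  report.visibleInsideProduce = (event.get() != nullptr);
  return report;
}
```

In this toy, a caller that runs toyProduce and then commit sees the collection through get only after commit, which mirrors the ordering described in the text.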


More about produces<T>();

In the constructor, there is a call to a function template produces<T>(). This tells the framework that when the produce method of this class is called, it is expected to add a data product of type T to the event. If the produce method is expected to add more than one data product to the event, then there must be a corresponding call to produces for each data product.

If the produce method tries to add a product for which it did not make a produces<T>() call, then the framework will throw. The default response to this exception is to stop event processing and to shut down as gracefully as possible; normally this means that your histogram files and log files will be flushed and closed properly.

One natural question is "what should I do if this particular event has no StrawHits?" One needs to distinguish two cases here. If it is perfectly normal that some events will produce no StrawHits, then you should put an empty StrawHitCollection into the event. The event data model is perfectly happy to hold empty collections. If it is an error for any event to produce no StrawHits, then you should issue an appropriate error message using the message logger. If it is a sufficiently severe error, then you should throw an appropriate exception.

In an earlier version of this document it was stated that the framework would throw an exception if a module failed to produce one of its data products advertised via produces calls. This is not true and never was true; the older document was wrong. It remains true that, for objects that are collection types, the recommended procedure is to put an empty collection into the event rather than to put nothing into the event; this greatly simplifies code that reads your output.


If you are wondering where the produces function lives, it comes from deep down in an inheritance chain. First look in the header file for the base class, EDProducer. That class inherits from some other class; check its header file. After several levels you will find the base class that defines produces.

If one module wishes to produce two or more data products of the same data type, these can be distinguished using the instance name argument to produces and put:

SampleProducer(fhicl::ParameterSet const& ps){
 produces<SampleCollection>("version1");
 produces<SampleCollection>("version2");
}

void SampleProducer::produce(art::Event& e ){
   std::unique_ptr<SampleCollection> result1(new SampleCollection);
   std::unique_ptr<SampleCollection> result2(new SampleCollection);
   // ... fill the collections ...
   e.put(std::move(result1),"version1");
   e.put(std::move(result2),"version2");
}


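where the text strings must be unique within the module. They must not contain underscores, for the reason given in the section on identifiers of a data product; there are no other requirements.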
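
The bookkeeping implied by produces and put can be sketched with the standard library alone: each declaration is recorded as a (type, instance name) pair, and a put with no matching declaration throws. This is a toy model under stated assumptions, not the art implementation; ToyProducer, accepted and undeclaredPutThrows are hypothetical names, and the exception type is chosen only for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Toy model only: ToyProducer is not an art class.
class ToyProducer {
public:
  // Record that this module intends to put a T with the given instance name.
  template <class T>
  void produces(std::string const& instanceName = "") {
    declared_.insert(makeKey<T>(instanceName));
  }

  // Accept a product only if a matching produces<T>(instanceName) call was made.
  // The toy discards the product; it only counts accepted puts.
  template <class T>
  void put(std::unique_ptr<T> product, std::string const& instanceName = "") {
    assert(product != nullptr);
    if (declared_.count(makeKey<T>(instanceName)) == 0) {
      throw std::logic_error("put of undeclared product, instance name '" +
                             instanceName + "'");
    }
    ++accepted_;
  }

  std::size_t accepted() const { return accepted_; }

private:
  using Key = std::pair<std::type_index, std::string>;

  template <class T>
  static Key makeKey(std::string const& instanceName) {
    return Key(std::type_index(typeid(T)), instanceName);
  }

  std::set<Key> declared_;
  std::size_t accepted_ = 0;
};

// Returns true if putting under an undeclared instance name throws.
bool undeclaredPutThrows() {
  ToyProducer module;
  module.produces<std::vector<int>>("version1");
  try {
    module.put(std::unique_ptr<std::vector<int>>(new std::vector<int>), "version2");
  } catch (std::logic_error const&) {
    return true;
  }
  return false;
}
```

Keying on the pair rather than on the type alone is what lets one module declare several products of the same type, as in the SampleProducer fragment above.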

Declaring new Data Products

In the above description it was presumed that the class to be added to the event was already known to the framework. A class is made known to the framework using the genreflex system from ROOT, as described below.

Declaring a data product to the system uses two files, named classes_def.xml and classes.h. By convention these files are located in the src subdirectory of each data product package, for example RecoDataProducts/src/classes_def.xml and RecoDataProducts/src/classes.h. In principle every cvs module could define its own data products but we have chosen, instead, to segregate the data products in a small number of packages. This enforces the separation of data classes and algorithm classes and makes it possible to load the data product libraries without having to load the much more complex algorithm classes.


If the only data product we had were StrawHitCollection, then classes_def.xml would look like:

<lcgdict>
 <class name="mu2e::StrawHit"/>
 <class name="mu2e::StrawHitCollection"/>
 <class name="art::Wrapper<mu2e::StrawHitCollection>"/>
</lcgdict>

and classes.h would look like:


#include ...

#include "ToyDP/inc/StrawHitCollection.hh"

template class art::Wrapper<mu2e::StrawHitCollection>;

The rule for classes_def.xml is that the Wrapper line must be present for every class that can be given to the event using a call to event.put(...); in this case that is just the StrawHitCollection. The non-wrapper lines must be present for that class, StrawHitCollection, and for all of the classes that are among the persistent data of StrawHitCollection, either directly or indirectly. This applies recursively until only primitive types are found (that is, we do not need lines for int, double, float, char and so on).

There is one exception to the rule that you must recursively declare all classes that are data members of your class. You must not declare them if they are already found in another dictionary that is known to art. For example none of the Mu2e dictionaries includes a reference to CLHEP::Hep3Vector or CLHEP::HepLorentzVector; these are found in $ART_DIR/source/art/art/Persistency/CLHEPDictionaries/classes_def.xml.

Other dictionaries defined in the art source area:

art/Framework/IO/ProductMix/classes_def.xml
art/Persistency/CetlibDictionaries/classes_def.xml
art/Persistency/WrappedStdDictionaries/classes_def.xml
art/Persistency/CLHEPDictionaries/classes_def.xml
art/Persistency/FhiclCppDictionaries/classes_def.xml
art/Persistency/Common/classes_def.xml
art/Persistency/StdDictionaries/classes_def.xml
art/Persistency/Provenance/classes_def.xml


If we had decided that it made sense to add a single StrawHit as a data product, then we would also need to write the wrapper line for StrawHit. Instead we decided that if you would like to store a single StrawHit, you need to store it by creating a collection with only one member and storing that collection.

Every class for which there is a wrapper line in classes_def.xml must also be declared in classes.h; but classes from the non-wrapper lines of classes_def.xml should not be present in classes.h. The appropriate #include must also be present for the header file of the classes that appear in the dictionary section of classes.h.

There is a second class of things that must be present in classes.h. If any data product has a data member that is an instantiation of a templated class, then the templated class must be present in classes.h. Look, for example, at Offline/ToyDP/src/classes.h. The class mu2e::SimParticleCollection has a data member of type std::map<MapVectorKey,mu2e::SimParticle>; that class has a data member of type std::pair<MapVectorKey,mu2e::SimParticle>. Both of these classes must be declared in classes.h.


There is a syntax to make only a subset of the data members of a class persistent. There is also a syntax to tell the framework to make a data product purely transient: that is, it can be added to the event so that other modules may use it, but it will never be written out. For details see the next two sections and see also the CMS documentation for making new products.

Transient Data Products

It is possible to tell the framework that it should allow data products of a certain type to be added to the event but that it should never write out data products of that type. This is useful, for example, for data products that are full of bare pointers. To declare the class MyClass as a transient data product you need to add one line to classes_def.xml

 <class name="MyClass" persistent="false"/>

and one line to classes.h,

#include "MyClass.hh"

One should not provide the lines for the art::Wrapper<MyClass> to either of these files. Moreover it is not necessary to provide lines in classes_def.xml that describe the classes used as data members inside MyClass. When an output module encounters this data product it will not try to persist the data product.

See also the CMS documentation for making new products.


Transient Data Members within a Persistable Class

It is also possible to declare that a data member of a class is transient. This is done in classes_def.xml. Suppose that the class MyClass has a data member with the name _field of type T. The data member can be declared transient using the syntax:

 <class name="MyClass">
    <field name="_field" transient="true"/>
 </class>

In this case the data member _field will not be written to the output file but the remaining data members of MyClass will be. When these objects are read back, the data member _field will be invalid and the user needs to know not to access this data member until it can be properly initialized by some other method. Ideally MyClass should protect against illegal access either by initializing on demand or by throwing.
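
The "initialize on demand" option can be sketched as follows. The class ToyMomentum and all of its members are hypothetical, and the sketch assumes that after read-back the validity flag holds its default value of false (simulateReadBack is only a stand-in for that situation, so that the behavior can be exercised without any I/O). In classes_def.xml both _mag and _magValid would be the members marked transient="true".

```cpp
#include <cassert>
#include <cmath>

// Hypothetical data product class, for illustration only.
class ToyMomentum {
public:
  ToyMomentum() = default;
  ToyMomentum(double px, double py, double pz) : _px(px), _py(py), _pz(pz) {}

  // Accessor that fills the transient cache on first use, so callers never
  // see an uninitialized value.
  double magnitude() const {
    if (!_magValid) {
      _mag = std::sqrt(_px * _px + _py * _py + _pz * _pz);
      _magValid = true;
    }
    return _mag;
  }

  // Stand-in for a read-back in which the transient members carry no
  // useful values (an assumption of this sketch).
  void simulateReadBack() {
    _mag = -1.0;
    _magValid = false;
  }

private:
  // Persistent data members.
  double _px = 0.0;
  double _py = 0.0;
  double _pz = 0.0;

  // Transient data members: a cached value and its validity flag.
  mutable double _mag = 0.0;
  mutable bool _magValid = false;
};

// Usage example: the cached value is rebuilt transparently after read-back.
inline void exampleUsage() {
  ToyMomentum m(3.0, 4.0, 0.0);
  assert(m.magnitude() == 5.0);
  m.simulateReadBack();
  assert(m.magnitude() == 5.0);
}
```

The alternative mentioned in the text, throwing on illegal access, would replace the body of the if block with a throw statement; which of the two is appropriate depends on whether the transient value can be recomputed from the persistent members.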

If there are no persisted objects of type T in any of the data products, then it is not necessary to declare the type T in classes_def.xml.

See also the CMS documentation for making new products.

Identifiers of a Data Product

Each data product within an event is uniquely identified by a four-part identifier, with the parts separated by an underscore character:

 DataType_ModuleLabel_InstanceName_ProcessName
  1. DataType is a "friendly" version of the name of the data type that is stored in the product. The name includes all namespace information. The friendly part is the way that it deals with collection types:
    • If a product is of type T, then the friendly name is "T".
    • If a product is of type mu2e::T, then the friendly name is "mu2e::T".
    • If a product is of type std::vector<mu2e::T>, then the friendly name is "mu2e::Ts".
    • If a product is of type std::vector< std::vector<mu2e::T> >, then the friendly name is "mu2e::Tss".
    • If a product is of type cet::map_vector<mu2e::T>, then the friendly name is "mu2e::Tmv". See below for a discussion about where underscores may not be used; this example is safe because of the substitution of mv for map_vector.
  2. ModuleLabel identifies the module that created the product; this is the module label, which distinguishes multiple instances of the same module within a process; it is not the class name of the module.
  3. InstanceName is a label for the data product that distinguishes two or more data products of the same type that were produced by the same module, in the same process. If a data product is already unique within this scope, it is legal to leave this field blank. The instance label is the optional argument of the call to "produces" in the constructor of the module (xxxx below):
          produces<T>("xxxx");
          
  4. ProcessName is the name of the process that created this product. It is specified in the fcl file that specifies the run time configuration for the job (ReadBack02 below):
          process_name : ReadBack02
          

Because the full name of the product uses the underscore character to delimit fields, it is forbidden to use underscores in any of the names of the fields. Therefore none of the following may contain underscores:

  • The class name of a class that is a data product; the exception is the cet::map_vector template; when creating the friendly name, art internally recognizes this case and protects against it.
  • The namespace in which a data product class lives.
  • Module labels.
  • Data product instance names.
  • Process names.
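
The naming rule and the underscore restriction above can be sketched as a small helper. This is illustrative only: productName and hasUnderscore are hypothetical functions, not part of art, and the module label makeSH used in the example is an assumed value. The friendly type name is taken as an input that has already been formed according to the rules listed earlier.

```cpp
#include <cassert>
#include <initializer_list>
#include <stdexcept>
#include <string>

// Hypothetical helpers; not part of art.
bool hasUnderscore(std::string const& s) {
  return s.find('_') != std::string::npos;
}

// Join the four fields with underscores, rejecting any field that itself
// contains an underscore, since that would make the full name ambiguous.
std::string productName(std::string const& friendlyType,
                        std::string const& moduleLabel,
                        std::string const& instanceName,
                        std::string const& processName) {
  for (std::string const* field :
       {&friendlyType, &moduleLabel, &instanceName, &processName}) {
    if (hasUnderscore(*field)) {
      throw std::invalid_argument("underscore not allowed in field: " + *field);
    }
  }
  return friendlyType + "_" + moduleLabel + "_" + instanceName + "_" + processName;
}

// Usage example: a blank instance name leaves the third field empty, so two
// underscores appear next to each other.
inline void exampleUsage() {
  assert(productName("mu2e::StrawHits", "makeSH", "", "ReadBack02") ==
         "mu2e::StrawHits_makeSH__ReadBack02");
}
```

A name such as my_label passed as the module label would be rejected by this helper, because splitting the full name at underscores would then yield five fields instead of four.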

You can also read about which names need to match each other (magicnames.shtml).

Writing only Selected Events and Selected Data Products

It is possible to configure an art job so that it writes selected events to one or more different output files. It is also possible to configure each output file so that only selected data products are written to that file. These operations are described in the web page on configuring output files.