MakeProducts
Latest revision as of 02:37, 31 March 2023
Introduction
A Data Product is anything that you can add to an event or see in an event. Examples include the generated particles, the simulated particles produced by Geant4, the hits produced by Geant4, tracks found by the reconstruction algorithms, clusters found in the calorimeters and so on.
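This page contains a short description of how to add a data product to an event and how to define a new type of data product. For more complete information, consult the CMS documentation for making new products (https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCreatingNewProducts).
In 2022, art provided an updated method for putting products in an event, explained here: https://indico.fnal.gov/event/53206/contributions/234465/attachments/151953/196516/2022-2-15.pdf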
A Minimal Module
The code fragment below shows a minimal example of an EDProducer module that adds a StrawHitCollection to the event. If you look back through the nested header files, you will see the StrawHitCollection is just a typedef for std::vector<mu2e::StrawHit> and that StrawHit is a very simple class, really nothing more than a simple struct.
 #include <memory>

 #include "art/Framework/Core/EDProducer.h"
 #include "art/Framework/Core/ModuleMacros.h"
 #include "MCDataProducts/inc/StrawHitCollection.hh"

 namespace mu2e{

   class MyClass : public art::EDProducer {
   public:
     explicit MyClass(fhicl::ParameterSet const& pSet)
     {
       // Declare the product type that produce() will add to each event.
       produces<StrawHitCollection>();
     }
     virtual ~MyClass() { }
     virtual void produce(art::Event& e );
   };

   void MyClass::produce(art::Event& event ) {
     std::unique_ptr<StrawHitCollection> p(new StrawHitCollection);
     // Some sort of loop to fill the collection:
     for ( int i=0; i<10; ++i){
       p->push_back(StrawHit(...));
     }
     // Hand ownership to the event; p is empty after this call.
     event.put(std::move(p));
   }

 } // end of namespace mu2e

 using mu2e::MyClass;
 DEFINE_ART_MODULE(MyClass);
In the above fragment there is a member function of MyClass named produce (singular) and a member function of the base class named produces (plural); the second function is called in the constructor of MyClass. The following text refers to both - so pay attention to which of the two is being discussed. The following pattern describes any producer module:
- The class must inherit from art::EDProducer.
- The constructor must tell the framework what it produces; it does so via the call to produces<StrawHitCollection>(). This is described in more depth below.
- Data products are added to an event inside the produce method. A three step pattern is used:
  - Create a unique_ptr to an empty object.
  - Fill the object.
  - Give the unique_ptr to the event.
  It might be possible to create a fully formed object in step 1; in that case there is no step 2.
- The code must invoke the macro DEFINE_ART_MODULE as shown in the last two lines. This line may appear anywhere in the file after the definition of the class. Mu2e has adopted the convention of putting it at the end of the file.
- After the call to event.put(...), the variable p no longer points at anything. If you try to use it, you will get a run-time error. Therefore you should run diagnostics and other things that read your data product before the call to event.put(...).
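The three-step pattern and the ownership hand-off can be tried outside the framework. The sketch below is only an illustration: ToyHit, ToyHitCollection and ToyEvent are hypothetical stand-ins, not the art or Mu2e classes, and ToyEvent::put merely mimics the way event.put(...) takes ownership of the unique_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-ins for illustration only; these are NOT the art or Mu2e types.
struct ToyHit { double energy; };
using ToyHitCollection = std::vector<ToyHit>;

// Hypothetical sink that takes ownership, playing the role of event.put(...).
struct ToyEvent {
  std::unique_ptr<ToyHitCollection> stored;
  void put(std::unique_ptr<ToyHitCollection> product) { stored = std::move(product); }
};

// Follows the three-step pattern and returns the size seen by the diagnostics.
std::size_t fillAndPut(ToyEvent& event) {
  // Step 1: create a unique_ptr to an empty object.
  auto p = std::make_unique<ToyHitCollection>();

  // Step 2: fill the object.
  for (int i = 0; i < 10; ++i) {
    p->push_back(ToyHit{0.1 * i});
  }

  // Diagnostics must read the product BEFORE ownership is handed over.
  std::size_t const nHits = p->size();

  // Step 3: give the unique_ptr to the event.
  event.put(std::move(p));

  // p is now null; dereferencing it here would be undefined behavior.
  assert(p == nullptr);
  return nHits;
}
```

Reading the size before the put call is the point of the example: once the pointer has been moved, the only safe thing to do with p is to let it go out of scope.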
You might try the following: call event.put(...) and then get the data product out of the event using one of the get methods. This will not work. The reason is that a data product is not actually registered with the event until the produce method of the module returns. The logic behind this restriction is that if a module fails, then none of its data products should be available via the get interface; therefore event.put(...) only schedules the data product for addition to the event, and that addition occurs when the module returns from the produce call.
More about produces<T>();
In the constructor, there is a call to a function template produces<T>(). This tells the framework that when the produce method of this class is called, it is expected to add a data product of type T to the event. If the produce method is expected to add more than one data product to the event, then there must be a corresponding call to produces for each data product.
If the produce method tries to add a product for which the constructor did not make a produces<T>() call, then the framework will throw. The default response to this exception is to stop event processing and to shut down as gracefully as possible; normally this means that your histogram files and log files will be flushed and closed properly.
One natural question is "what should I do if this particular event has no StrawHits?" One needs to distinguish two cases here. If it is perfectly normal that some events will produce no StrawHits, then you should put an empty StrawHitCollection into the event. The event data model is perfectly happy to hold empty collections. If it is an error for any event to produce no StrawHits, then you should issue an appropriate error message using the message logger. If it is a sufficiently severe error, then you should throw an appropriate exception.
In an earlier version of this document it was stated that the framework would throw an exception if a module failed to produce one of its data products advertised via produces calls. This is not true and never was true; the older document was wrong. It remains true that, for objects that are collection types, the recommended procedure is to put an empty collection into the event rather than to put nothing into the event; this greatly simplifies code that reads your output.
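To see why an empty collection simplifies downstream code, consider a consumer that loops over the hits. The sketch below uses a hypothetical ToyHit stand-in rather than the real StrawHit; the point is only that a range-based loop over an empty std::vector needs no special case.

```cpp
#include <cassert>
#include <vector>

// Stand-in for illustration only; this is NOT mu2e::StrawHit.
struct ToyHit { double energy; };
using ToyHitCollection = std::vector<ToyHit>;

// A consumer that sums hit energies; it needs no special case for "no hits".
double totalEnergy(ToyHitCollection const& hits) {
  double sum = 0.0;
  for (auto const& hit : hits) {
    sum += hit.energy;
  }
  return sum;
}
```

If the producer had put nothing into the event, every reader would first have to check whether the product exists at all before it could run a loop like this one.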
If you are wondering where the produces function lives, it comes from deep down in an inheritance chain. First look in the header file for the base class, EDProducer. That class inherits from some other class; check its header file. After several levels you will find the base class that defines produces.
If one module wishes to produce two or more data products of the same data type, these can be distinguished using the instance name argument to produces and put:
 // Constructor, written inside the class body of SampleProducer:
 SampleProducer(fhicl::ParameterSet const& ps){
   produces<SampleCollection>("version1");
   produces<SampleCollection>("version2");
 }

 void SampleProducer::produce(art::Event& e ){
   std::unique_ptr<SampleCollection> result1(new SampleCollection);
   std::unique_ptr<SampleCollection> result2(new SampleCollection);
   // ... fill the collections ...
   e.put(std::move(result1),"version1");
   e.put(std::move(result2),"version2");
 }
where the text strings must be unique but have no other requirements.
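One way to picture the bookkeeping behind produces and put is as a set of (type, instance name) pairs that is filled in the constructor and checked on every put. The toy model below is a sketch under that assumption; ToyProducer, declare and nAccepted are hypothetical names, and art's real implementation is more elaborate.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Toy model only: art's real bookkeeping is more elaborate.
class ToyProducer {
public:
  // Plays the role of produces<T>(instanceName) in the constructor.
  template <typename T>
  void declare(std::string const& instanceName = "") {
    declared_.emplace(std::type_index(typeid(T)), instanceName);
  }

  // Plays the role of event.put(...): refuses products that were not declared.
  template <typename T>
  void put(T const& /*product*/, std::string const& instanceName = "") {
    auto const key = std::make_pair(std::type_index(typeid(T)), instanceName);
    if (declared_.count(key) == 0) {
      throw std::logic_error("put() called for a product that was not declared");
    }
    ++nAccepted_;
  }

  // Number of products accepted so far.
  int nAccepted() const { return nAccepted_; }

private:
  std::set<std::pair<std::type_index, std::string>> declared_;
  int nAccepted_ = 0;
};
```

In this model two products of the same type are distinguished only by their instance names, and a put with an undeclared name fails in the same way as a put of an undeclared type.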
Declaring new Data Products
In the above description it was presumed that the class to be added to the event was already known to the framework. A class is made known to art by creating a ROOT dictionary by using ROOT's genreflex system, as described below.
Making a ROOT dictionary requires two files, named classes_def.xml and classes.h. The convention used by the Mu2e build system is that these files are located in the src subdirectory of each data product package, for example RecoDataProducts/src/classes_def.xml and RecoDataProducts/src/classes.h. If the only data product we had were StrawHitCollection, then classes_def.xml would look like:
 <lcgdict>
   <class name="mu2e::StrawHit"/>
   <class name="mu2e::StrawHitCollection"/>
   <class name="art::Wrapper<mu2e::StrawHitCollection>"/>
 </lcgdict>
and classes.h would look like:
 #include <vector>
 #include "RecoDataProducts/inc/StrawHitCollection.hh"
The rule for classes_def.xml is that the art::Wrapper line must be present for every class that can be given to the event using a call to event.put(...); in this case that is just art::Wrapper<mu2e::StrawHitCollection>. You also need a line for the class that is the template argument of art::Wrapper, in this case StrawHitCollection. And you need a line for every type that is used within StrawHitCollection. This applies recursively until only primitive types are found (that is, we do not need lines for int, double, float, char and so on).
There is one exception to the rule that you must recursively declare all classes that are data members of your class. You must not declare them if they are already found in another dictionary that is known to art. For example, none of the Mu2e dictionaries includes a reference to CLHEP::Hep3Vector or CLHEP::HepLorentzVector; these are found in $CANVAS_ROOT_IO_DIR/source/canvas_root_io/Dictionaries/clhep/classes_def.xml.
Mu2e has adopted the convention that if we need a dictionary entry for a type that is used in more than one dictionary, that type should be defined in DataProducts/src/classes_def.xml and DataProducts/src/classes.h.
You can learn what other dictionaries are defined by art by giving the following unix command:
find $CANVAS_ROOT_IO_DIR -name classes_def.xml
If we had decided that it made sense to add a single StrawHit as a data product, then we would also need to write the wrapper line for StrawHit. Instead we decided that if you would like to store a single StrawHit, you need to store it by creating a collection with only one member and storing that collection.
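Storing a single object as a one-element collection takes only a few lines. The helper below is a hypothetical sketch; ToyHit and wrapSingleHit are illustrative names and not part of the Mu2e code base.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for illustration only; this is NOT mu2e::StrawHit.
struct ToyHit { int strawIndex; double energy; };
using ToyHitCollection = std::vector<ToyHit>;

// Wraps one hit in a collection so that only the collection type
// needs an art::Wrapper entry in the dictionary.
std::unique_ptr<ToyHitCollection> wrapSingleHit(ToyHit const& hit) {
  auto p = std::make_unique<ToyHitCollection>();
  p->push_back(hit);
  return p;
}
```

The returned unique_ptr can then be handed to the event in the same way as any other collection.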
Every header file that is needed to recursively resolve classes declared in classes_def.xml must be #include'd in classes.h. In earlier versions of ROOT, it was necessary to include some explicit template instantiations in classes.h, but these are no longer needed and should not be present.
There is a syntax to make only a subset of the data members of a class persistent. There is also a syntax to tell the framework to make a data product purely transient: that is, it can be added to the event so that other modules may use it, but it will never be written out. For details see the next two sections and see also the CMS documentation for making new products.
Transient Data Products
It is possible to tell the framework that it should allow data products of a certain type to be added to the event but that it should never write out data products of that type. This is useful, for example, for data products that are full of bare pointers. To declare the class MyClass as a transient data product you need to add one line to classes_def.xml
<class name="MyClass" persistent="false"/>
and one line to classes.h,
#include "MyClass.hh"
One should not add a line for art::Wrapper<MyClass> to either of these files. Moreover, it is not necessary to provide lines in classes_def.xml that describe the classes used as data members inside MyClass. When an output module encounters this data product it will not try to persist the data product.
See also the CMS documentation for making new products.
Transient Data Members within a Persistable Class
It is also possible to declare that a data member of a class is transient. This is done in classes_def.xml. Suppose that the class MyClass has a data member with the name _field of type T. The data member can be declared transient using the syntax:
 <class name="MyClass">
   <field name="_field" transient="true"/>
 </class>
In this case the data member _field will not be written to the output file but the remaining data members of MyClass will be. When these objects are read back, the data member _field will be invalid and the user needs to know not to access this data member until it can be properly initialized by some other method. Ideally MyClass should protect against illegal access either by initializing on demand or by throwing.
If there are no persisted objects of type T in any of the data products, then it is not necessary to declare the type T in classes_def.xml.
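The "initialize on demand" protection mentioned above can be sketched as follows. ToyMomentum is a hypothetical class, not a Mu2e data product, and the sketch assumes that the cached members would be the ones marked transient in classes_def.xml and that they come back in their default state (validity flag false) when the object is read from a file.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical class for illustration; not a Mu2e data product.
class ToyMomentum {
public:
  ToyMomentum(double px, double py, double pz) : _px(px), _py(py), _pz(pz) {}

  // Recomputes the cached value on first access, so an object whose
  // cache was never stored is still safe to use.
  double magnitude() const {
    if (!_magnitudeValid) {
      _magnitude = std::sqrt(_px * _px + _py * _py + _pz * _pz);
      _magnitudeValid = true;
    }
    return _magnitude;
  }

  // Reports whether the cached value has been computed yet.
  bool cacheFilled() const { return _magnitudeValid; }

private:
  // Persistent members.
  double _px, _py, _pz;

  // Cache; in this sketch these two members would be declared transient.
  mutable bool _magnitudeValid = false;
  mutable double _magnitude = 0.0;
};
```

Callers never touch the cache directly, so they cannot read an uninitialized value. The alternative named in the text, throwing on access to an uninitialized member, would replace the recomputation with a throw statement.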
See also the CMS documentation for making new products.
Identifiers of a Data Product
Please see the documentation on the art naming convention for products.
Writing only Selected Events and Selected Data Products
It is possible to configure an art job so that it writes selected events to one or more different output files. It is also possible to configure each output file so that only selected data products are written to that file. These operations are described in the web page on configuring output files.