Provenance
Latest revision as of 23:38, 14 February 2025
This page is a draft, please help complete it!
This page needs expert review!
Each data product has an associated provenance that describes how the data product was made:
- Which module made this data product?
- What parameter set was used to configure that module?
- What data products, if any, were read by that module?
Suppose that module MA has no input data products (for example, an event generator) and that it produces a data product DPA. Further suppose that module MB reads DPA and produces data product DPB. In this circumstance both data products, DPA and DPB, have an entry in the provenance registry. Now suppose that only the data product DPB is written to the output file. When that file is read in again, the data product DPA is not present, but both provenances remain in the provenance registry. art does this because the provenance of DPA is part of the provenance of DPB, and art always keeps complete provenances.
If we read this output file and write a new one in which we do NOT write data product DPB, then neither DPA nor DPB will be present in the output. In this case the provenances for both DPA and DPB will be removed from the registry that is written to the output file.
The general rule is that a provenance is retained in the registry so long as at least one of the following is true:
- The data product that it describes is present in the output file.
- At least one of that data product's descendant data products is present in the output file.
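The retention rule can be illustrated with a small toy model. This is only a sketch of the rule as stated above, not art's actual implementation: the `parents` mapping and the function names `descendants` and `retained_provenances` are hypothetical and exist only for this example.

```python
def descendants(product, parents):
    """Return every product that derives, directly or indirectly, from `product`.

    `parents` maps each product name to the list of products that the
    module which made it read as input (hypothetical toy registry).
    """
    found = set()
    frontier = [product]
    while frontier:
        current = frontier.pop()
        for child, inputs in parents.items():
            if current in inputs and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def retained_provenances(parents, written):
    """Apply the retention rule to the toy registry.

    A provenance is kept if its product is written to the output file,
    or if at least one of its descendant products is written.
    """
    written = set(written)
    return {
        product
        for product in parents
        if product in written or descendants(product, parents) & written
    }


# The MA/MB example from the text: DPA has no inputs, DPB reads DPA.
parents = {"DPA": [], "DPB": ["DPA"]}

# Only DPB is written: both provenances are retained.
print(sorted(retained_provenances(parents, ["DPB"])))

# DPB is dropped as well: neither provenance is retained.
print(sorted(retained_provenances(parents, [])))
```

In this model, writing only DPA would retain the provenance of DPA alone, since DPB is a descendant of DPA and not an ancestor.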
Tools
There is an art tool that dumps the list of processes which have run on a file, based on the file's provenance contents:
file_info_dumper --process-history <file>
produces:
Chronological list of process names for processes that produced this file.
1. cosmics1
2. cosmics2
3. cosmics3
4. drap
and a config dumper:
config_dumper <file>
which prints the fcl configuration for each module that was used in the art jobs that produced the file. config_dumper has three switches: -M, -P and -S. You need to run it once with each switch to get the three parts of the fcl, then append the three outputs together. The resulting fcl might contain several blocks if the file was the result of several art jobs. In this case, you will need to append one more line, like
@table::Reconstruct
which selects the block of fcl from the stage you are interested in.
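The procedure above could be scripted roughly as follows. This is a minimal sketch under several assumptions: that each switch is passed before the file name, that config_dumper writes its fcl to standard output, and that simple concatenation is enough to combine the three parts. The helper names `dumper_commands`, `assemble_fcl` and `dump_config` are hypothetical.

```python
import subprocess

# The three switches named in the text; each one yields one part of the fcl.
SWITCHES = ["-M", "-P", "-S"]


def dumper_commands(art_file):
    """Build one config_dumper command line per switch.

    Assumption: the switch goes before the file name.
    """
    return [["config_dumper", switch, art_file] for switch in SWITCHES]


def assemble_fcl(parts, stage=None):
    """Append the fcl parts together, one after another.

    If `stage` is given, add the extra line that selects the block of
    fcl from that stage, e.g. "@table::Reconstruct".
    """
    text = "\n".join(part.rstrip("\n") for part in parts) + "\n"
    if stage is not None:
        text += f"@table::{stage}\n"
    return text


def dump_config(art_file, stage=None):
    """Run config_dumper once per switch and combine the outputs.

    Assumption: the fcl is printed on standard output.
    """
    parts = [
        subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        for cmd in dumper_commands(art_file)
    ]
    return assemble_fcl(parts, stage)
```

For a file produced by several art jobs, a call such as `dump_config("<file>", stage="Reconstruct")` would append the selecting line shown above.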