ProductionProceduresMC: Difference between revisions
Revision as of 18:17, 1 May 2025
POMS MC Production Guide
Introduction
This guide describes how to run Monte Carlo (MC) production using POMS. Jobs fall into two categories:
Simple jobs
- **Definition:** Process a single input file to produce one output via a standard FCL template.
- **Driver:** `Production/Scripts/run_RecoEntuple.py` (consider renaming to `run_DigiReco.py`).
- **Output storage:** Write results to persistent storage to avoid many small tape files; later concatenate before archiving.
- **Examples:** digitization, reconstruction, event-ntupling.
- **Example campaign:** POMS Campaign 24194 (https://pomsgpvm02.fnal.gov/poms/campaign_stage_info/mu2e/production?campaign_stage_id=24194)
Stage Parameter Overrides
Param_Overrides = [
    ['-Oglobal.dataset=', '%(dataset)s'],
    ['--stage=', 'digireco_digi_list'],
    ['-Oglobal.release_v_o=', 'au'],
    ['-Oglobal.dbversion=', 'v1_3'],
    ['-Oglobal.fcl=', 'Production/JobConfig/digitize/OnSpill.fcl'],
    ['-Oglobal.nevent=', '-1'],
]
- `%(dataset)s` – placeholder for POMS slice names (e.g. `dts.sophie.ensembleMDS2a.MDC2020at.art_slice_72935_stage_5`)
- `digireco_digi_list` – stage definition from `…/poms_includes/mdc2020ar.cfg`
- Remaining overrides feed into `run_RecoEntuple.py`
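As a rough illustration of how the overrides listed above could turn into command-line arguments, the sketch below joins each `[key, value]` pair and fills the `%(dataset)s` placeholder with Python %-formatting. This is an assumption about the mechanics, not POMS code: the helper name `render_overrides` is hypothetical, and POMS performs its own expansion internally.

```python
# Sketch only: mimics placeholder expansion with Python %-formatting.
# POMS does its own expansion; render_overrides is a hypothetical helper.
PARAM_OVERRIDES = [
    ['-Oglobal.dataset=', '%(dataset)s'],
    ['--stage=', 'digireco_digi_list'],
    ['-Oglobal.release_v_o=', 'au'],
    ['-Oglobal.dbversion=', 'v1_3'],
    ['-Oglobal.fcl=', 'Production/JobConfig/digitize/OnSpill.fcl'],
    ['-Oglobal.nevent=', '-1'],
]


def render_overrides(overrides, values):
    """Join each [key, value] pair and fill %(name)s placeholders from values."""
    return [(key + value) % values for key, value in overrides]


if __name__ == "__main__":
    slice_name = "dts.sophie.ensembleMDS2a.MDC2020at.art_slice_72935_stage_5"
    for arg in render_overrides(PARAM_OVERRIDES, {"dataset": slice_name}):
        print(arg)
```

Only the first pair contains a placeholder, so the other arguments pass through unchanged.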
The split type that we use is `drainingn(500)`, which is described under `Edit Campaign Stage` and in the POMS docs:
"This type, when filled out as drainingn(n) for some integer n, will pull at most n files at a time from the dataset and deliver them on each iteration, keeping track of the delivered files with a snapshot."
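The behaviour quoted above can be pictured with a small toy model. It is not the POMS implementation: the function name `draining_n` and the use of an in-memory set in place of a snapshot are assumptions made purely for illustration.

```python
def draining_n(list_dataset_files, n):
    """Yield batches of at most n files that have not been delivered yet.

    list_dataset_files is a callable returning the current list of file
    names, so the dataset may grow between iterations (toy stand-in for a
    dataset query).
    """
    delivered = set()  # plays the role of the snapshot of delivered files
    while True:
        current = list_dataset_files()
        batch = [name for name in current if name not in delivered][:n]
        if not batch:
            return  # nothing new to deliver on this iteration
        delivered.update(batch)
        yield batch


if __name__ == "__main__":
    files = [f"file_{i:03d}.art" for i in range(7)]
    for iteration, batch in enumerate(draining_n(lambda: files, 3)):
        print(iteration, batch)
```

With seven files and n = 3, the toy model delivers batches of three, three, and one file, and no file is delivered twice.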
To modify a campaign, the preferred option is to use the GUI editor on the main page.
In the editor, double-click the digi cell to modify the campaign parameters.
Complex jobs
- Require unique, job-specific parameters and configurations.
- Examples: stage-1 processing, stage-2 resampling, mixing.
POMS campaign example[1]
Primaries
We resample primaries from particle stops. We use `gen_Resampler.sh` to produce a parameter (par) file.
Example:
gen_Resampler.sh --json /exp/mu2e/app/users/oksuzian/muse_080224/Production/data/primary_dio.json --json_index 0
Entry 0 of the JSON file looks like:
{
  "dsconf": "MDC2020at",
  "desc": "DIOtail95",
  "fcl": "Production/JobConfig/primary/DIOtail.fcl",
  "resampler_name": "TargetStopResampler",
  "resampler_data": "sim.mu2e.MuminusStopsCat.MDC2020p.art",
  "events": 5000,
  "njobs": 2000,
  "start_mom": 95,
  "end_mom": 1000,
  "run": 1202,
  "simjob_setup": "/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020at/setup.sh"
}
It sets the parameters for `gen_Resampler.sh`, which will produce:
cnf.mu2e.DIOtail95.MDC2020at.0.tar
cnf.mu2e.DIOtail95.MDC2020at.fcl
The fcl file can be used for testing. If you are happy with it, upload the par file to disk:
gen_Resampler.sh --json Production/data/primary_dio.json --json_index 0
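The output names in this section appear to follow the pattern `cnf.<owner>.<desc>.<dsconf>`, with a `.0.tar` par file and a `.fcl` file. The sketch below predicts those names from a JSON entry, which can be handy for checking a configuration before running the script. The pattern is inferred from the examples on this page, and the helper names, the default owner `mu2e`, the sequencer `0`, and the assumption that the JSON file is a top-level list are all hypothetical rather than taken from `gen_Resampler.sh`.

```python
import json


def load_entry(json_path, index):
    """Return entry `index` from a JSON file assumed to hold a top-level list."""
    with open(json_path) as handle:
        return json.load(handle)[index]


def expected_outputs(entry, owner="mu2e", sequencer="0"):
    """Guess the par and fcl file names from the desc and dsconf fields.

    The naming pattern is inferred from the examples on this page, not
    read from the production scripts.
    """
    base = f"cnf.{owner}.{entry['desc']}.{entry['dsconf']}"
    return f"{base}.{sequencer}.tar", f"{base}.fcl"


if __name__ == "__main__":
    entry = {"dsconf": "MDC2020at", "desc": "DIOtail95"}
    par_file, fcl_file = expected_outputs(entry)
    print(par_file)
    print(fcl_file)
```

For the entry shown in this section, the sketch reproduces the two file names listed above.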
Merging
Example:
gen_Merge.sh --json Production/data/merge_filter.json --json_index 4
Entry 4 of the JSON file looks like:
{
  "desc": "ensembleMDS1eOnSpillTriggered-noMC",
  "dsconf": "MDC2020au_best_v1_3",
  "append": [
    "physics.trigger_paths: []",
    "outputs.strip.fileName: \"dig.owner.dsdesc.dsconf.seq.art\""
  ],
  "extra_opts": "--override-output-description",
  "fcl": "Production/JobConfig/digitize/StripMC.fcl",
  "dataset": "dig.mu2e.ensembleMDS1eOnSpillTriggered.MDC2020aq_best_v1_3.art",
  "merge-factor": 1,
  "simjob_setup": "/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020au/setup.sh"
}
It sets the parameters for `gen_Merge.sh`, which will produce:
cnf.mu2e.ensembleMDS1eOnSpillTriggered-noMC.MDC2020au_best_v1_3.0.tar
cnf.mu2e.ensembleMDS1eOnSpillTriggered-noMC.MDC2020au_best_v1_3.fcl
The fcl file can be used for testing. If you are happy with it, upload the par file to disk:
gen_Merge.sh --json Production/data/merge_filter.json --json_index 4 --pushout
Index datasets
Complex job types run off index dataset definitions such as:
$ samdes idx_map042425.txt
Definition Name: idx_map042425.txt
Definition Id: 208459
Creation Date: 2025-04-25T15:58:50+00:00
Username: oksuzian
Group: mu2e
Dimensions: dh.dataset etc.mu2e.index.000.txt and dh.sequencer < 0003892
The definitions are created from a list of par files like:
cnf.mu2e.MuonIPAStopSelector.MDC2020at.tar -1
cnf.mu2e.RMCInternal.MDC2020at.tar 2000
cnf.mu2e.RMCExternal.MDC2020at.tar 8000
cnf.mu2e.IPAMuminusMichel.MDC2020at.tar 2000
cnf.mu2e.CeMLeadingLog.MDC2020at.tar 2000
cnf.mu2e.CePLeadingLog.MDC2020at.tar 2000
cnf.mu2e.DIOtail95.MDC2020at.tar 2000
The first column gives the parameter (par) file names, and the second column gives the number of jobs (-1 means that the number of jobs is extracted from the par file itself).
Then using the list above, we can create a definition:
gen_MergeMap.py /exp/mu2e/data/users/oksuzian/poms_map/map041025.txt
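For reference, the two-column list format described above is simple to read programmatically. The following is a minimal sketch of such a parser. It is not taken from `gen_MergeMap.py`, and the function name `parse_par_list` is hypothetical.

```python
def parse_par_list(text):
    """Parse lines of the form '<par file> <njobs>' into (name, njobs) tuples.

    A job count of -1 is kept as-is; per the description above it means the
    count has to be taken from the par file itself.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        par_file, njobs = line.split()
        entries.append((par_file, int(njobs)))
    return entries


if __name__ == "__main__":
    sample = """\
cnf.mu2e.MuonIPAStopSelector.MDC2020at.tar -1
cnf.mu2e.RMCInternal.MDC2020at.tar 2000
cnf.mu2e.RMCExternal.MDC2020at.tar 8000
"""
    for par_file, njobs in parse_par_list(sample):
        print(par_file, njobs)
```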
Scripts/run_JITfcl.py
This script drives complex job types that run on the index definitions. On the grid it:
- Extracts the parameter file name and local index from the map, e.g. /exp/mu2e/data/users/oksuzian/poms_map/merged_map042425.txt
- Downloads the par file and extracts the fcl file
- Runs the job and pushes out all the relevant output: art, root, log
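One way to picture the first step above is as a lookup from a global job index into cumulative job counts. The sketch below shows that idea under stated assumptions: the format of the merged map file is not shown on this page, so the input here is a plain list of (par file, number of jobs) pairs with resolved job counts, and `locate_job` is a hypothetical helper rather than code from `run_JITfcl.py`.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_job(entries, global_index):
    """Map a global job index to (par file, local index).

    entries is a list of (par_file, njobs) pairs with njobs >= 0; any -1
    placeholder must already have been resolved to a real count.
    """
    counts = [njobs for _, njobs in entries]
    if any(njobs < 0 for njobs in counts):
        raise ValueError("job counts must be resolved before lookup")
    bounds = list(accumulate(counts))  # cumulative end offset of each par file
    if not bounds or not 0 <= global_index < bounds[-1]:
        raise IndexError(f"global index {global_index} is out of range")
    pos = bisect_right(bounds, global_index)  # first par file whose range contains the index
    start = bounds[pos] - counts[pos]
    return entries[pos][0], global_index - start


if __name__ == "__main__":
    entries = [
        ("cnf.mu2e.RMCInternal.MDC2020at.tar", 2000),
        ("cnf.mu2e.RMCExternal.MDC2020at.tar", 8000),
        ("cnf.mu2e.DIOtail95.MDC2020at.tar", 2000),
    ]
    print(locate_job(entries, 2000))
```

Using cumulative offsets with a binary search keeps the lookup cheap even when the map covers many par files.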
Current datasets
You can check recent datasets using `listNewDatasets.sh`.
The current datasets are also listed at: https://mu2ewiki.fnal.gov/wiki/MDC2020#Current_Datasets
These web pages are generated by nightly cron jobs: /exp/mu2e/app/home/mu2epro/cron/datasetMon/