ProductionProceduresMC: Difference between revisions
Revision as of 18:08, 1 May 2025
Introduction
This document outlines the procedures for running Monte Carlo (MC) production with POMS. MC jobs fall into two categories:
Simple jobs
- Process a single input file to produce one output using a standard FCL template.
- Examples: digitization, reconstruction, event ntupling.
These jobs are driven by the Production/Scripts/run_RecoEntuple.py script (needs a name change). The output from particular datasets needs to be saved to a persistent area to avoid small files on tape. At a later stage, the small files will have to be concatenated and saved to tape.
An example of POMS campaign that digitizes all the primaries: https://pomsgpvm02.fnal.gov/poms/campaign_stage_info/mu2e/production?campaign_stage_id=24194
Stage parameters are defined as such:
Param_Overrides = [
  ['-Oglobal.dataset=', '%(dataset)s'],
  ['--stage=', 'digireco_digi_list'],
  ['-Oglobal.release_v_o=', 'au'],
  ['-Oglobal.dbversion=', 'v1_3'],
  ['-Oglobal.fcl=', 'Production/JobConfig/digitize/OnSpill.fcl'],
  ['-Oglobal.nevent=', '-1'],
]
- %(dataset)s – Internal POMS dataset placeholder. For each submission, POMS generates slices named like
dts.sophie.ensembleMDS2a.MDC2020at.art_slice_72935_stage_5 and substitutes the slice name for %(dataset)s.
- digireco_digi_list – The stage definition loaded from
/exp/mu2e/app/users/mu2epro/production_manager/poms_includes/mdc2020ar.cfg
- All other parameters are passed as arguments to the `run_RecoEntuple.py` script within this stage.
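The placeholder substitution described above can be sketched with plain Python %-formatting. This is only an illustration, not POMS code: the helper name expand_overrides is hypothetical, and it assumes each key/value pair is simply concatenated into one command-line argument.

```python
# Illustration only: plain Python %-formatting, not the POMS implementation.
# Assumption: each [key, value] pair is concatenated into one argument.
param_overrides = [
    ["-Oglobal.dataset=", "%(dataset)s"],
    ["--stage=", "digireco_digi_list"],
    ["-Oglobal.fcl=", "Production/JobConfig/digitize/OnSpill.fcl"],
    ["-Oglobal.nevent=", "-1"],
]


def expand_overrides(overrides, dataset):
    """Fill the %(dataset)s placeholder and join each pair into one argument."""
    values = {"dataset": dataset}
    # With a mapping on the right-hand side, strings that contain no
    # placeholder are returned unchanged.
    return [key + (value % values) for key, value in overrides]


slice_name = "dts.sophie.ensembleMDS2a.MDC2020at.art_slice_72935_stage_5"
args = expand_overrides(param_overrides, slice_name)
for arg in args:
    print(arg)
```

Only the first pair changes per submission; the remaining pairs are passed through untouched.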
The split type that we use is:
Split Type: drainingn(500)
It is described under `Edit Campaign Stage` and in the POMS docs:
This type, when filled out as drainingn(n) for some integer n, will pull at most n files at a time from the dataset and deliver them on each iteration, keeping track of the delivered files with a snapshot.
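The behaviour quoted above can be pictured with a small sketch. It is a toy model under stated assumptions, not the POMS implementation: list_dataset_files is a hypothetical callable that returns the current file list, and an in-memory set stands in for the snapshot.

```python
def draining_slices(list_dataset_files, n):
    """Yield successive batches of at most n files not delivered before.

    list_dataset_files is a hypothetical callable returning the dataset's
    current file names; the 'delivered' set plays the role of the snapshot.
    """
    delivered = set()
    while True:
        pending = [f for f in list_dataset_files() if f not in delivered]
        if not pending:
            return  # dataset fully drained
        batch = pending[:n]
        delivered.update(batch)
        yield batch


files = ["file_%02d.art" % i for i in range(7)]
batches = list(draining_slices(lambda: files, 3))
for batch in batches:
    print(len(batch), batch)
```

Because the file list is re-read on every iteration, files added to the dataset between submissions would be picked up by a later batch.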
To modify a campaign, the preferred option is to use the GUI editor on the main page, which brings up the campaign view.
Then double-click the digi cell to modify the campaign parameters.
Complex jobs
- Require unique, job-specific parameters and configurations.
- Examples: stage-1 processing, stage-2 resampling, mixing.
Primaries
We resample primaries from particle stops, using gen_Resampler.sh to produce a parameter (par) file.
Example:
gen_Resampler.sh --json /exp/mu2e/app/users/oksuzian/muse_080224/Production/data/primary_dio.json --json_index 0
Entry 0 of the JSON file looks like:
{
  "dsconf": "MDC2020at",
  "desc": "DIOtail95",
  "fcl": "Production/JobConfig/primary/DIOtail.fcl",
  "resampler_name": "TargetStopResampler",
  "resampler_data": "sim.mu2e.MuminusStopsCat.MDC2020p.art",
  "events": 5000,
  "njobs": 2000,
  "start_mom": 95,
  "end_mom": 1000,
  "run": 1202,
  "simjob_setup": "/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020at/setup.sh"
}
It essentially sets the parameters for gen_Resampler.sh, which will produce:
cnf.mu2e.DIOtail95.MDC2020at.0.tar
cnf.mu2e.DIOtail95.MDC2020at.fcl
The fcl file can be used for testing. If happy, upload the par file to disk:
gen_Resampler.sh --json Production/data/primary_dio.json --json_index 0
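In the example, the names of the produced files appear to follow from the desc and dsconf fields of the selected JSON entry. The sketch below reproduces that pattern for checking purposes only. It is not part of gen_Resampler.sh: the function name par_file_names is hypothetical, the owner field is assumed to be mu2e, and the JSON file is assumed to hold a list of entries selected by --json_index.

```python
import json


def par_file_names(json_text, index, owner="mu2e"):
    """Return the (tar, fcl) names expected for one entry of the JSON list.

    Assumption: names follow cnf.<owner>.<desc>.<dsconf>.0.tar / .fcl,
    as seen in the example output.
    """
    entry = json.loads(json_text)[index]
    stem = "cnf.{}.{}.{}".format(owner, entry["desc"], entry["dsconf"])
    return stem + ".0.tar", stem + ".fcl"


# Trimmed copy of the example entry; only the fields used here are kept.
example = '[{"dsconf": "MDC2020at", "desc": "DIOtail95", "njobs": 2000}]'
tar_name, fcl_name = par_file_names(example, 0)
print(tar_name)
print(fcl_name)
```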
Merging
Example:
gen_Merge.sh --json Production/data/merge_filter.json --json_index 4
Entry 4 of the JSON file looks like:
{
  "desc": "ensembleMDS1eOnSpillTriggered-noMC",
  "dsconf": "MDC2020au_best_v1_3",
  "append": ["physics.trigger_paths: []", "outputs.strip.fileName: \"dig.owner.dsdesc.dsconf.seq.art\""],
  "extra_opts": "--override-output-description",
  "fcl": "Production/JobConfig/digitize/StripMC.fcl",
  "dataset": "dig.mu2e.ensembleMDS1eOnSpillTriggered.MDC2020aq_best_v1_3.art",
  "merge-factor": 1,
  "simjob_setup": "/cvmfs/mu2e.opensciencegrid.org/Musings/SimJob/MDC2020au/setup.sh"
}
It essentially sets the parameters for gen_Merge.sh, which will produce:
cnf.mu2e.ensembleMDS1eOnSpillTriggered-noMC.MDC2020au_best_v1_3.0.tar
cnf.mu2e.ensembleMDS1eOnSpillTriggered-noMC.MDC2020au_best_v1_3.fcl
The fcl file can be used for testing. If happy, upload the par file to disk:
gen_Merge.sh --json Production/data/merge_filter.json --json_index 4 --pushout
Index datasets
Complex job types run off index dataset definitions such as:
$ samdes idx_map042425.txt
Definition Name: idx_map042425.txt
Definition Id: 208459
Creation Date: 2025-04-25T15:58:50+00:00
Username: oksuzian
Group: mu2e
Dimensions: dh.dataset etc.mu2e.index.000.txt and dh.sequencer < 0003892
The definitions are created from a list of par files like:
cnf.mu2e.MuonIPAStopSelector.MDC2020at.tar -1
cnf.mu2e.RMCInternal.MDC2020at.tar 2000
cnf.mu2e.RMCExternal.MDC2020at.tar 8000
cnf.mu2e.IPAMuminusMichel.MDC2020at.tar 2000
cnf.mu2e.CeMLeadingLog.MDC2020at.tar 2000
cnf.mu2e.CePLeadingLog.MDC2020at.tar 2000
cnf.mu2e.DIOtail95.MDC2020at.tar 2000
where the first column is the parameter file name and the second column is the number of jobs (-1 means the number of jobs is extracted from the par file itself).
Then using the list above, we can create a definition:
gen_MergeMap.py /exp/mu2e/data/users/oksuzian/poms_map/map041025.txt
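For a quick sanity check of such a list before building the definition, the two-column format can be parsed as in the sketch below. This is not gen_MergeMap.py; the function name parse_par_list is hypothetical, and the only convention assumed is the one described above (whitespace-separated columns, with -1 meaning the job count is taken from the par file).

```python
def parse_par_list(text):
    """Parse 'par_file njobs' lines into a list of (name, njobs) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name, njobs = line.split()
        entries.append((name, int(njobs)))
    return entries


example = """
cnf.mu2e.MuonIPAStopSelector.MDC2020at.tar -1
cnf.mu2e.RMCInternal.MDC2020at.tar 2000
cnf.mu2e.RMCExternal.MDC2020at.tar 8000
"""
entries = parse_par_list(example)

# Entries marked -1 still need their job count read from the par file.
unresolved = [name for name, njobs in entries if njobs == -1]
explicit_total = sum(njobs for _, njobs in entries if njobs >= 0)
print(unresolved, explicit_total)
```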
Scripts/run_JITfcl.py
This script drives complex job types from the index definitions. On the grid it:
- Extracts the parameter file name and local index from the map, e.g. /exp/mu2e/data/users/oksuzian/poms_map/merged_map042425.txt
- Downloads the par file and extracts the fcl file
- Runs the job and pushes out all the relevant output: art, root, log
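The first step, turning a global job index into a par file and a local index, could work along the lines of the sketch below. This is an assumption-laden illustration, not the code in run_JITfcl.py: the format of the merged map file is not shown here, so the sketch assumes par files are laid out back to back in list order, each occupying as many consecutive indices as it has jobs. The function name locate_job is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_job(entries, global_index):
    """Map a global index to (par_file, local_index).

    entries: list of (par_file, njobs) with all job counts already resolved.
    Assumption: par files occupy consecutive index ranges in list order.
    """
    counts = [njobs for _, njobs in entries]
    if any(n < 0 for n in counts):
        raise ValueError("job counts must be resolved before lookup")
    ends = list(accumulate(counts))  # exclusive end index of each par file
    if not ends or not 0 <= global_index < ends[-1]:
        raise IndexError("global index out of range")
    pos = bisect_right(ends, global_index)
    start = ends[pos] - counts[pos]
    return entries[pos][0], global_index - start


entries = [
    ("cnf.mu2e.RMCInternal.MDC2020at.tar", 2000),
    ("cnf.mu2e.RMCExternal.MDC2020at.tar", 8000),
]
print(locate_job(entries, 1999))
print(locate_job(entries, 2000))
```

A binary search over cumulative counts keeps the lookup cheap even for long par file lists; a precomputed map file achieves the same result with a direct line lookup.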
Current datasets
You can check recent datasets using listNewDatasets.sh. The current datasets are also available at: https://mu2ewiki.fnal.gov/wiki/MDC2020#Current_Datasets
These web pages are generated by nightly cron jobs: /exp/mu2e/app/home/mu2epro/cron/datasetMon/