Difference between revisions of "POMS"

From Mu2eWiki

Revision as of 22:54, 25 July 2022

Introduction

The Production Operations Management Service (POMS) is a computing division tool that helps users to run large and complex grid campaigns. It provides

  • control scripts and GUI interface
  • chaining of multiple stages of a grid campaign
  • re-submission of failed jobs
  • analysis of logs and database entries for results

POMS has been successfully employed for the production of the MDC2020 datasets. project-py is a Python script that provides a command-line interface.


Tutorial

Work in progress

The POMS system is designed around the concept of a campaign. A campaign is a set of stages which can have interdependencies. Each stage corresponds to the submission of a certain number of jobs to the computing grid, with a specific configuration. Each stage typically takes as input an entire SAM dataset and produces one or more SAM datasets as output, which contain the output files produced by each job.

In this tutorial we will create a campaign with two stages, whose goal is to run the reconstruction stage on a pre-existing digi sample. This campaign can then be easily extended to run an arbitrary number of stages (e.g. generation, digitization, and reconstruction).

The first step is to create a proxy certificate that will then be used by POMS:

setup fife_utils
kx509 -n --minhours 168 -o /tmp/x509up_voms_mu2e_Analysis_${USER}
upload_file /tmp/x509up_voms_mu2e_Analysis_${USER}

A POMS campaign requires two files: an INI file, which defines the stage names and the interdependencies, and a CFG file, which describes the configuration for each stage.

INI file

Let's start with the INI file: it can be created locally and then uploaded to POMS through the web interface. In the following INI file we will specify two stages: the first one creates the FCL files, one per job, and the second one takes as input the SAM dataset containing the FCL files and submits N jobs, where N is the number of FCL files in the dataset.

First of all, the INI file needs the definition of the campaign and of the job type we are going to run.

[campaign]
experiment = mu2e
poms_role = analysis
name = srsoleti_tutorial
campaign_stage_list = reco_fcl, reco

[campaign_defaults]
vo_role=Analysis
software_version=MDC2020t
dataset_or_split_data=None
cs_split_type=None
completion_type=complete
completion_pct=100
param_overrides="[]"
test_param_overrides="[]"
merge_overrides=False
login_setup=srsoleti_poms_login
job_type=mu2e_reco_srsoleti_jobtype
stage_type=regular
output_ancestor_depth=1

[login_setup srsoleti_poms_login]
host=pomsgpvm01.fnal.gov
account=poms_launcher
setup=setup fife_utils v3_5_0, poms_client, poms_jobsub_wrapper;

Then, we define two job types, one that corresponds to the stage that will run the mu2e process, and one that corresponds to the stage that will run the generate_fcl script, which creates the FCL files.

[job_type mu2e_reco_srsoleti_jobtype]
launch_script = fife_launch
parameters = [["-c ", "/mu2e/app/users/srsoleti/tutorial/reco.cfg"]]
output_file_patterns = %.art
recoveries = [["proj_status",[["-Osubmit.dataset=","%(dataset)s"]]]]

[job_type generate_fcl_reco_srsoleti_jobtype]
launch_script = fife_launch
parameters = [["-c ", "/mu2e/app/users/srsoleti/tutorial/reco.cfg"]]
output_file_patterns = %.fcl

Finally, we define the two stages that form our campaign, reco and reco_fcl.

[campaign_stage reco_fcl]
param_overrides = [["--stage ", "reco_fcl"]]
test_param_overrides = [["--stage ", "reco_fcl"]]
job_type = generate_fcl_reco_srsoleti_jobtype

[campaign_stage reco]
param_overrides = [["--stage ", "reco"]]
test_param_overrides = [["--stage ", "reco"]]

[dependencies reco]
campaign_stage_1 = reco_fcl
file_pattern_1 = %.fcl
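
Before uploading the INI file, it can be useful to check locally that the stages and their dependencies are what we expect. The following is a minimal sketch using Python's standard configparser on a trimmed copy of the file above; it is not part of POMS. It assumes that [campaign_defaults] supplies the job type for any stage that does not set its own, and the variable names (INI_TEXT, job_types, run_order) are illustrative only.

```python
import configparser
from graphlib import TopologicalSorter  # Python 3.9+

# Trimmed copy of the tutorial INI file (only the keys used below).
INI_TEXT = """
[campaign]
experiment = mu2e
name = srsoleti_tutorial
campaign_stage_list = reco_fcl, reco

[campaign_defaults]
job_type=mu2e_reco_srsoleti_jobtype

[job_type mu2e_reco_srsoleti_jobtype]
output_file_patterns = %.art
recoveries = [["proj_status",[["-Osubmit.dataset=","%(dataset)s"]]]]

[job_type generate_fcl_reco_srsoleti_jobtype]
output_file_patterns = %.fcl

[campaign_stage reco_fcl]
job_type = generate_fcl_reco_srsoleti_jobtype

[campaign_stage reco]

[dependencies reco]
campaign_stage_1 = reco_fcl
file_pattern_1 = %.fcl
"""

# interpolation=None keeps literal '%' values such as "%.art" and
# "%(dataset)s" intact; the default interpolation would reject them.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(INI_TEXT)

stages = [s.strip() for s in parser["campaign"]["campaign_stage_list"].split(",")]
default_job_type = parser["campaign_defaults"]["job_type"]

job_types = {}  # stage name -> job type (assumed fallback to the default)
graph = {}      # stage name -> set of stages it depends on
for stage in stages:
    section = parser[f"campaign_stage {stage}"]
    job_types[stage] = section.get("job_type", default_job_type)
    graph[stage] = set()
    dep_section = f"dependencies {stage}"
    if parser.has_section(dep_section):
        for key, value in parser[dep_section].items():
            if key.startswith("campaign_stage_"):
                graph[stage].add(value)

# Order in which the stages can run, upstream stages first.
run_order = list(TopologicalSorter(graph).static_order())
print(run_order, job_types)
```

A typo in a stage name shows up here as a KeyError for the missing [campaign_stage ...] section, which is cheaper to catch locally than after the upload.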


UconDB

mu2e_ucon_prod - created 11/2021 to help with MDC2020

  • database mu2e_ucon_prod owned by nologin role mu2e_ucon_prod;
  • Kerberos-authenticated roles: brownd kutschke srsoleti
  • md5-authenticated role 'mu2e_ucon_web' (for POMS)
  • port is 5458 (on ifdbprod/ifdb08)
https://dbdata0vm.fnal.gov:9443/mu2e_ucondb_prod/app/... - not cached, external and internal access
http://dbdata0vm.fnal.gov:9090/mu2e_ucondb_prod/app/... - not cached, internal access only 
https://dbdata0vm.fnal.gov:8444/mu2e_ucondb_prod/app/... - cached, external and internal access
http://dbdata0vm.fnal.gov:9091/mu2e_ucondb_prod/app/... - cached, internal access only
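
The four endpoints differ only in whether responses are cached and whether they are reachable from outside the lab. As a small illustration, a script could pick the base URL from those two properties. The helper name ucondb_base_url and the lookup table are hypothetical; only the URLs themselves come from the list above, and the path after /app is left out because it depends on the request.

```python
# (cached, internal_only) -> base URL, copied from the list above.
UCONDB_BASE_URLS = {
    (False, False): "https://dbdata0vm.fnal.gov:9443/mu2e_ucondb_prod/app",
    (False, True): "http://dbdata0vm.fnal.gov:9090/mu2e_ucondb_prod/app",
    (True, False): "https://dbdata0vm.fnal.gov:8444/mu2e_ucondb_prod/app",
    (True, True): "http://dbdata0vm.fnal.gov:9091/mu2e_ucondb_prod/app",
}


def ucondb_base_url(cached: bool, internal_only: bool) -> str:
    """Return the UconDB base URL for the requested access mode."""
    return UCONDB_BASE_URLS[(cached, internal_only)]


# Example: cached endpoint that also works from outside.
print(ucondb_base_url(cached=True, internal_only=False))
```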


FCL files are saved to the UconDB in dedicated folders. It's not possible to use dots in the names of the database folders, so we replace them with underscores, as in: https://dbdata0vm.fnal.gov:9443/mu2e_ucondb_prod/app/UI/folder?folder=cnf_mu2e_pot_db_test_v12_fcl

The POMS FCL stages take care of creating the folders and saving the FCL files. The SAM location of the files stored in the database looks like this:

$ samweb locate-file cnf.mu2e.POT.db_test_v12.001201_00000001.fcl
dbdata0vm.fnal.gov:/mu2e_ucondb_prod/app/data/cnf_mu2e_POT_db_test_v12_fcl
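
The mapping from a file name to its UconDB folder can be sketched in a few lines of Python. This is an illustration rather than the code POMS runs: it assumes, based on the example above, that a file name has six dot-separated fields with the sequencer in fifth position, and that the folder is the file name without the sequencer, with dots replaced by underscores. The function names are hypothetical.

```python
def dataset_name(file_name: str) -> str:
    """Drop the sequencer field from a six-field file name (assumed layout)."""
    fields = file_name.split(".")
    if len(fields) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(fields)}")
    tier, owner, description, configuration, _sequencer, file_format = fields
    return ".".join([tier, owner, description, configuration, file_format])


def ucondb_folder(file_name: str) -> str:
    """UconDB folder names cannot contain dots, so use underscores instead."""
    return dataset_name(file_name).replace(".", "_")


print(ucondb_folder("cnf.mu2e.POT.db_test_v12.001201_00000001.fcl"))
```

For the file in the samweb example this yields cnf_mu2e_POT_db_test_v12_fcl, the folder shown in the reported location.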
