MCProdWorkflow

From Mu2eWiki

Latest revision as of 21:25, 15 October 2021

Introduction

This workflow is for production-style simulation jobs. It can be used for stage-1 jobs, which start with a generator, or for later simulation stages, which start with the output files of previous stages. It can also be used to concatenate a dataset. The output may stay on disk, or the output files can be concatenated, uploaded to tape, and documented through the SAM database. Uploading is intended for output that needs to be saved for more than a month or so, might be used by many collaborators, or needs to be carefully documented. This process may be run from the mu2epro account for collaboration samples, or from a personal account for personal samples.

Before starting this production workflow, you will need to plan the project in some detail:

  • the physics
  • the basic fcl to perform the job, see SimulationFCL for the fcl patterns used in MDC2020.
  • the output dataset names
  • the job plan in terms of number and length of jobs, etc.
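
As a rough illustration of the job-plan arithmetic, the sketch below estimates how many jobs (and therefore fcl files, one per job) a sample needs and how long each job runs. All three input numbers are invented for the example; real values should come from your own interactive test and the job planning page.

```shell
# All numbers below are made up for illustration.
TOTAL_EVENTS=1000000      # events wanted in the full sample
EVENTS_PER_JOB=4000       # events processed by one grid job
SEC_PER_EVENT=5           # time per event, measured in an interactive test

# Number of jobs (one fcl file per job), rounded up
NJOBS=$(( (TOTAL_EVENTS + EVENTS_PER_JOB - 1) / EVENTS_PER_JOB ))
# Approximate wall time of one job in hours (integer division)
HOURS_PER_JOB=$(( EVENTS_PER_JOB * SEC_PER_EVENT / 3600 ))

echo "jobs: $NJOBS, about $HOURS_PER_JOB h each"
```

With these inputs the plan is 250 jobs of about 5 hours each; changing EVENTS_PER_JOB trades the number of jobs against the job length.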

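This page assumes that the user is familiar with the basic infrastructure and its references:

  • Simulation, fcl
  • file names, file tools, SAM
  • grids, dCache, data transfer, enstore
  • grid, job planning, monitoring
  • prestaging, concatenation, mu2egrid (https://cdcvs.fnal.gov/redmine/projects/mu2egrid/wiki)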

The basic steps, expanded below, are:

  • prestage input files, if needed
  • generate a set of fcl files, one for each job
  • copy fcl files to dCache and optionally register the fcl dataset with SAM
  • submit jobs (after creating custom code tarball, if needed)
  • check output and recover failed jobs
  • optionally concatenate output files for each dataset
    • generate a set of fcl files
    • copy fcl files to dCache and optionally register the fcl dataset with SAM
    • submit jobs
    • check output and recover failed jobs
  • optionally upload output files
  • optionally tar and upload log files

The mu2egrid and related packages provide Mu2e-specific code required for submitting jobs and manipulating files. Most scripts support the --help option. Look for the --dry-run and --verbose options to show what will be done without performing the action.

Directories

It is useful to have a working area:

/mu2e/data/users/$USER/projects/project_name

and areas for the job's main fcl

/mu2e/data/users/$USER/projects/project_name/job/fcl

and for each concatenation

/mu2e/data/users/$USER/projects/project_name/output1/fcl
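
The layout above could be created with a few commands like the following. This is only a sketch: the variable names (BASE, PROJECT, WORKDIR) and the single output tag "output1" are assumptions for illustration, and BASE defaults to a temporary directory so the commands can be tried anywhere.

```shell
# Hypothetical project name. On a Mu2e interactive node one would set
# BASE=/mu2e/data/users/$USER/projects before running this.
PROJECT=project_name
BASE=${BASE:-$(mktemp -d)}
WORKDIR=$BASE/$PROJECT

# fcl area for the main job
mkdir -p "$WORKDIR/job/fcl"

# one fcl area per output dataset to be concatenated
for OUT in output1; do
    mkdir -p "$WORKDIR/$OUT/fcl"
done

# list the fcl areas that now exist
ls -d "$WORKDIR"/*/fcl
```

Add more tags to the loop if the job writes several output datasets that will each be concatenated.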

For official collaboration work, the output will go to

/pnfs/mu2e/persistent/users/mu2epro/workflow/project_name/STATUS

For an individual's work, the output will go to

/pnfs/mu2e/scratch/users/$USER/workflow/project_name/STATUS

where STATUS is

  • outstage for output from grid jobs
  • good for jobs that have been checked and passed
  • failed for jobs that have been checked and failed
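
The path convention above can be captured in a small helper. The function name and argument order are hypothetical and not part of mu2egrid; the sketch only assembles the directory names listed on this page.

```shell
# Print the dCache output directory for an account, project and status.
# Usage: workflow_dir <account> <project_name> <status>
# (illustrative helper, not an existing Mu2e tool)
workflow_dir() {
    account=$1; project=$2; status=$3
    # accept only the three STATUS values described above
    case $status in
        outstage|good|failed) ;;
        *) echo "unknown status: $status" >&2; return 1 ;;
    esac
    if [ "$account" = mu2epro ]; then
        # official collaboration work goes to persistent dCache
        echo "/pnfs/mu2e/persistent/users/mu2epro/workflow/$project/$status"
    else
        # an individual's work goes to scratch dCache
        echo "/pnfs/mu2e/scratch/users/$account/workflow/$project/$status"
    fi
}

workflow_dir mu2epro project_name good
workflow_dir "${USER:-someuser}" project_name outstage
```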

Prestage input dataset

Prestaging makes sure the input dataset has been copied off tape to disk, so it is ready to use. If there is no input dataset, or it is known to be on disk (in scratch dCache, for example), skip this step.

We recommend starting the prestaging as soon as possible, since it can take several days. Please follow the prestage instructions for the input dataset.

Generate fcl

In this step you take a fcl file that works interactively to generate a simulation sample (see SimulationFCL for the fcl patterns used in MDC2020) and scale it up for the grid. Please follow the instructions for generating fcl from a template or example. There are examples of a job that starts from a generator (stage 1) and of a later stage that takes input art files from a previous stage.

Submit Jobs

If you are running code in a pre-built release on cvmfs, you can use that build directly on the grid nodes. If you are using locally-built code, please make a tarball of the code area with Muse. Next, please follow the instructions for submitting jobs.

Concatenate output datasets

Some discussion and an overview are available in job planning and concatenation. If you decide to concatenate the output datasets, for each dataset:

  • generate a fcl file set, using the concatenation example
  • submit those fcl files to the grid

This procedure is essentially the same as for the main job: generate fcl, then submit and recover jobs. The output is a set of new datasets, which will be uploaded in the next step.

In Production, and as a recommended convention for everyone, we create the concatenated dataset name by adding "-cat" to the description field of the name.
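
A minimal sketch of this naming convention follows. It assumes the dataset name has the five dot-separated fields tier.owner.description.configuration.format (see the file names page); the function name is hypothetical, and the dataset used is only an example.

```shell
# Append "-cat" to the description field of a dataset name.
# Assumes five dot-separated fields:
#   tier.owner.description.configuration.format
# (illustrative helper, not an existing Mu2e tool)
cat_dataset_name() {
    # split the name on dots into its five fields
    IFS=. read -r tier owner desc conf fmt <<EOF
$1
EOF
    echo "$tier.$owner.$desc-cat.$conf.$fmt"
}

cat_dataset_name sim.mu2e.cd3-beam-g4s1-dsregion.0506a.art
# prints sim.mu2e.cd3-beam-g4s1-dsregion-cat.0506a.art
```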

Upload output files

Please see the background information on the upload page, then follow the recipes in the MC art files section.

Archive log files

Please see the instructions in the log files section of the upload page.