Workflows: Difference between revisions

From Mu2eWiki
 
(10 intermediate revisions by one other user not shown)
Line 3: Line 3:
== Workflow for jobs ==
== Workflow for jobs ==
* [[Grids|Grids]] - overview of the computing farms and job submission
* [[Grids|Grids]] - overview of the computing farms and job submission
* [[POMS]] - grid campaign organization, submission and recovery tool
* [[HPC]] - high performance computing resources
* [[HPC]] - high performance computing resources
* [[Docker]] - software containers for submitting to special farms
* [[Docker]] - software containers for submitting to special farms
Line 10: Line 11:
* [[JobPlan|Job planning]] - considerations of project CPU, disk,  memory and file size  
* [[JobPlan|Job planning]] - considerations of project CPU, disk,  memory and file size  
* [[Prestage]] - make sure files have been retrieved off tape and are ready to use
* [[Prestage]] - make sure files have been retrieved off tape and are ready to use
* [[SimulationFCL]] - an overview of how MDC2020 fcl works
* [[GenerateFcl]] - generate fcl for simulation jobs
* [[GenerateFcl]] - generate fcl for simulation jobs
* [[gridexport]] - export a build of Offline or a satellite release for use on the grid
* [[JustInTimeFcl]] - a technique for simplifying fcl generation
* [[SubmitJobs]] - submit, monitor and recover jobs in the workflow
* [[SubmitJobs]] - submit, monitor and recover jobs in the workflow
* [[Concatenate]] - merging root and art files
* [[Concatenate]] - merging root and art files
* [[Upload]] - upload files to tape
* [[Upload]] - upload files to tape
* [[ErrorRecovery]] - how to recover from common errors
* [[ErrorRecovery]] - how to recover from common errors
* [[Validation]] - check how code is performing
* [[Validation]] - check how code is performing, using MC
* [[DQM]] - Data Quality Monitoring
* [[Datasets]] - production datasets
* [[Datasets]] - production datasets
* [[RunNumbers]] - run number ranges and uses
* [[RunNumbers]] - run number ranges and uses
* [[ProductionLogic]] - production job logic
* [[ProductionProcedures]] - how to run production


== Data Handling ==
== Data Handling ==
* [[Dcache|dCache]] - the large aggregated data disk system
* [[Dcache|dCache]] - the large aggregated data disk system
* [[Enstore|enstore]] - the mass storage tape system
* [[Enstore|enstore]] - the mass storage tape system
* [[FileTransferService|File Transfer Service (FTS)]] - service to create SAM records and copy files to tape
* [[Declad|Declad]] - service to create Metacat records and copy files to tape
* [[RawDataMover|Raw Data Mover (RDM)]] - Mu2e service to copy teststand data files to tape
* [[DataTransfer|Data transfer]] - how to move larger datasets, necessary for grid jobs
* [[DataTransfer|Data transfer]] - how to move larger datasets, necessary for grid jobs
* [[FileNames|File names]] - how to name files for upload or for production
* [[FileNames|File names]] - how to name files for upload or for production

Latest revision as of 19:28, 20 November 2024


Workflow for jobs

Data Handling

  • dCache - the large aggregated data disk system
  • enstore - the mass storage tape system
  • File Transfer Service (FTS) - service to create SAM records and copy files to tape
  • Declad - service to create Metacat records and copy files to tape
  • Raw Data Mover (RDM) - Mu2e service to copy teststand data files to tape
  • Data transfer - how to move larger datasets, necessary for grid jobs
  • File names - how to name files for upload or for production
  • File families - the logical grouping of data on tapes
  • SAM - file metadata and data handling management system
    • SAM metadata - the metadata stored for each file
    • SAM expert - some notes for the future, not of general interest
  • File tools - tools for manipulating large file datasets and metadata
  • FTS Upload - deprecated method of uploading via the FTS area
  • Rucio - next-gen data handling

Operations