Workflows
Latest revision as of 19:28, 20 November 2024
Workflow for jobs
- Grids - overview of the computing farms and job submission
- POMS - grid campaign organization, submission and recovery tool
- HPC - high performance computing resources
- Docker - software containers for submitting to special farms
- analysis workflow - how to read existing datasets
- production simulation workflow - how to run simulation, concatenate and upload the files
- MARS and G4Beamline workflow - these are not in the art framework
- Job planning - considerations of project CPU, disk, memory and file size
- Prestage - make sure files have been retrieved off tape and are ready to use
- SimulationFCL - an overview of how MDC2020 fcl works
- GenerateFcl - generate fcl for simulation jobs
- JustInTimeFcl - a technique for simplifying fcl generation
- SubmitJobs - submit, monitor and recover jobs in the workflow
- Concatenate - merging root and art files
- Upload - upload files to tape
- ErrorRecovery - how to recover from common errors
- Validation - check how code is performing, using MC
- DQM - Data Quality Monitoring
- Datasets - production datasets
- RunNumbers - run number ranges and uses
- ProductionLogic - production job logic
- ProductionProcedures - how to run production
Data Handling
- dCache - the large aggregated data disk system
- enstore - the mass storage tape system
- File Transfer Service (FTS) - service to create SAM records and copy files to tape
- Declad - service to create Metacat records and copy files to tape
- Raw Data Mover (RDM) - Mu2e service to copy teststand data files to tape
- Data transfer - how to move larger datasets, necessary for grid jobs
- File names - how to name files for upload or for production
- File families - the logical grouping of data on tapes
- SAM - file metadata and data handling management system
  - SAM metadata - the metadata stored for each file
  - SAM expert - some notes for the future, not of general interest
- File tools - tools for manipulating large file datasets and metadata
- FTS Upload - deprecated method of uploading via the FTS area
- Rucio - next-gen data handling
Operations
- operations links
- Mu2epro account