RawDataMover
Latest revision as of 18:34, 27 August 2021
Introduction
The Raw Data Mover (RDM) is a set of scripts that moves raw data files (and other files) from a teststand area to tape. It is automated so that raw data reaches tape quickly, with very little effort from the teststand group.
The RDM is not designed to move a high volume of data efficiently, so it is not a candidate for this step when taking beam data.
General Mechanics
The processes run under the mu2epro account. The procedure is driven by a cron job on mu2egpvm01. The main script is ~mu2epro/RDM/rdm.sh. The script takes arguments that select which teststands to operate on. The configuration (such as the node name of the teststand) is embedded in rdm.sh for now. This script has to perform certain operations on the remote teststand node, so mu2epro needs to be able to log in there, and there must be a script there called rdmRemote.sh. This remote script performs only a few simple tasks that examine the local files. As new files are discovered, they are copied to persistent dCache under
/pnfs/mu2e/persistent/users/mu2epro/RDM
If the checksum of the copy in this location agrees with that of the original file (that is, the copy was successful), the file is also copied to an FTS directory under
/pnfs/mu2e/persistent/fts
and a JSON metadata file is created. The FTS is the File Transfer Service, a computing division service that makes a SAM record for each file, moves the file to tape-backed dCache, and then, once a tape location exists, updates the SAM record with that location.
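The copy-and-verify step described above can be sketched roughly as follows. This is only an illustration, not the logic of rdm.sh itself: the function names (`adler32_of`, `copy_verified`, `hand_to_fts`), the choice of Adler-32 as the checksum, the `.json` sidecar naming, and the metadata fields are all assumptions, since this page does not specify the checksum algorithm or the metadata schema.

```python
import json
import shutil
import zlib
from pathlib import Path


def adler32_of(path, chunk_size=1 << 20):
    """Return the Adler-32 checksum of a file as an 8-digit hex string."""
    value = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            value = zlib.adler32(chunk, value)
    return f"{value & 0xFFFFFFFF:08x}"


def copy_verified(source, dest_dir):
    """Copy source into dest_dir; return the copy's path, or None on mismatch."""
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copy = dest_dir / source.name
    shutil.copyfile(source, copy)
    # Only a copy whose checksum matches the original counts as successful.
    if adler32_of(copy) != adler32_of(source):
        return None
    return copy


def hand_to_fts(verified_copy, fts_dir):
    """Copy a verified file to the FTS directory and write a JSON sidecar.

    The sidecar name and its fields are hypothetical, not the real schema.
    """
    verified_copy = Path(verified_copy)
    fts_dir = Path(fts_dir)
    fts_dir.mkdir(parents=True, exist_ok=True)
    fts_copy = fts_dir / verified_copy.name
    shutil.copyfile(verified_copy, fts_copy)
    metadata = {
        "file_name": fts_copy.name,
        "file_size": fts_copy.stat().st_size,
        "checksum": f"adler32:{adler32_of(fts_copy)}",
    }
    sidecar = fts_dir / (fts_copy.name + ".json")
    sidecar.write_text(json.dumps(metadata, indent=2))
    return sidecar
```

In this sketch the two destination directories are passed in as arguments; in the real system they would correspond to the persistent dCache and FTS paths listed above. The file is handed to the FTS directory only after `copy_verified` returns a path, which mirrors the rule that the second copy happens only when the checksums agree.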
The RDM reports the status of files on a web page (https://mu2e.fnal.gov/atwork/computing/ops/rdm/rdm.html).
On the teststand node, the following directories should exist, all in the same parent directory. The mu2epro account needs write access to each of them.
- output: the teststand group writes files directly here
- upload: when files are ready to upload, the teststand group moves them from output to here
- stage: when the files are on persistent dCache, the RDM moves them from upload to here
- delete: when the files are on tape, the RDM moves them here. The teststand group can delete them as they wish
- temp: this area is intended for keeping some data files around, for example to do analysis
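The directory layout above can be read as a simple file lifecycle. The following sketch, under stated assumptions, creates the five directories and moves a file through them. The helper names `create_areas` and `advance` are hypothetical, and the stage to delete transition is inferred from the list (the page says only that the RDM moves files to delete once they are on tape).

```python
import shutil
from pathlib import Path

# Directory names come from the list above; everything else is illustrative.
AREAS = ("output", "upload", "stage", "delete", "temp")

# Transitions described (or, for stage -> delete, inferred) from the list.
ALLOWED_MOVES = {("output", "upload"), ("upload", "stage"), ("stage", "delete")}


def create_areas(parent):
    """Create the five teststand directories under one parent directory."""
    parent = Path(parent)
    for name in AREAS:
        (parent / name).mkdir(parents=True, exist_ok=True)
    return {name: parent / name for name in AREAS}


def advance(parent, file_name, from_area, to_area):
    """Move one file between two areas, refusing any other transition."""
    if (from_area, to_area) not in ALLOWED_MOVES:
        raise ValueError(f"unsupported move: {from_area} -> {to_area}")
    source = Path(parent) / from_area / file_name
    target = Path(parent) / to_area / file_name
    shutil.move(str(source), str(target))
    return target
```

For example, `advance(parent, "run001.dat", "output", "upload")` would correspond to the teststand group marking a file as ready, while the later two moves would be performed by the RDM.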