RawDataMover


Introduction

The Raw Data Mover (RDM) is a set of scripts that moves raw data files (and other files) from a teststand area to tape. It is automated so that raw data reaches tape quickly, with very little effort from the teststand group.

The RDM is not designed to move a high volume of data efficiently, so it is not a candidate for this step when taking beam data.

General Mechanics

The processes run in the mu2epro account. The procedure is driven by a cron job on mu2egpvm01. The main script is ~mu2pro/RDM/rdm.sh. The script takes arguments that select which teststands to operate on. The configuration (such as the node name of the teststand) is embedded in rdm.sh for now. This script has to perform certain operations on the remote teststand node, so mu2epro needs to be able to log in there, and there must be a script there called rdmRemote.sh. This remote script performs only a few simple tasks that examine the local files. As new files are discovered, they are copied to persistent dCache under

/pnfs/mu2e/persistent/users/mu2epro/RDM

If the checksum in this location agrees with that of the original file (that is, the copy was successful), the file is also copied to an FTS directory under

/pnfs/mu2e/persistent/fts

and a JSON metadata file is created. The FTS is the File Transfer Service, a computing division service that makes a SAM record for a file, moves the file to tape-backed dCache, and then, once there is a tape location, updates the SAM record with that location.
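The copy, verify, and hand-off sequence described above can be sketched roughly as follows. This is only an illustration, not the actual contents of rdm.sh: the function names, the metadata fields, and the ".json" naming are hypothetical, and the use of Adler-32 as the checksum is an assumption (the page does not say which checksum the RDM compares).

```python
import json
import shutil
import zlib
from pathlib import Path


def adler32_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits."""
    value = 1  # Adler-32 starting value
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            value = zlib.adler32(chunk, value)
    return f"{value & 0xFFFFFFFF:08x}"


def copy_and_hand_off(source: Path, persistent_dir: Path, fts_dir: Path) -> bool:
    """Copy source to persistent_dir; if checksums match, also copy it to
    fts_dir and write a JSON metadata file next to it."""
    persistent_dir.mkdir(parents=True, exist_ok=True)
    persistent_copy = persistent_dir / source.name
    shutil.copyfile(source, persistent_copy)

    original_sum = adler32_of(source)
    if adler32_of(persistent_copy) != original_sum:
        return False  # copy did not verify; nothing is handed to the FTS

    fts_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(persistent_copy, fts_dir / source.name)

    # Hypothetical metadata fields; the real FTS/SAM metadata will differ.
    metadata = {
        "file_name": source.name,
        "file_size": source.stat().st_size,
        "checksum": f"adler32:{original_sum}",
    }
    (fts_dir / (source.name + ".json")).write_text(json.dumps(metadata, indent=2))
    return True
```

In this sketch a file that fails verification is simply left alone, so a later pass of the cron job could retry it; whether the real script behaves this way is not stated here.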

The RDM reports the status of files on a web page.

On the teststand node, there should be the following set of directories, all in the same parent directory. mu2epro needs write access to the directories.

  • output: the teststand group writes files directly here
  • upload: when files are ready to upload, the teststand group mv's them from output to here
  • stage: when the files are on persistent dCache, the RDM mv's them from upload to here
  • delete: when the files are on tape, the RDM moves them here; the teststand group can delete them as they wish
  • temp: this area is intended for keeping some data files around, for example to do analysis
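The directory layout above amounts to a small file lifecycle. A minimal sketch of that lifecycle is given below, assuming hypothetical helper names (make_layout, advance); it only models the moves that the RDM itself performs (upload to stage, stage to delete) and does not check dCache or tape status, which the real scripts would need to do before each move.

```python
import shutil
from pathlib import Path

# Directory names taken from the list above.
STAGES = ("output", "upload", "stage", "delete", "temp")

# Moves performed by the RDM; output -> upload is done by the teststand group.
NEXT_STAGE = {"upload": "stage", "stage": "delete"}


def make_layout(parent: Path) -> None:
    """Create the five directories under one parent directory."""
    for name in STAGES:
        (parent / name).mkdir(parents=True, exist_ok=True)


def advance(parent: Path, file_name: str, current: str) -> Path:
    """Move a file from its current directory to the next one in the lifecycle."""
    if current not in NEXT_STAGE:
        raise ValueError(f"the RDM does not move files out of '{current}'")
    destination = parent / NEXT_STAGE[current] / file_name
    shutil.move(str(parent / current / file_name), str(destination))
    return destination
```

For example, after a file in upload has been verified on persistent dCache, advance(parent, "raw.dat", "upload") would place it in stage; a second call with "stage" would place it in delete once it is on tape.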

