An example of how to archive files to tape
This procedure saves "random" assortments of files as tape archives. To access the content later, one has to retrieve a tar file and expand it; the individual files are not transparently accessible from tape storage. This corresponds to the "Random files" option here. The current page walks through the steps of an example use case.
Decide how to divide data
Decide how to divide the data to be archived into chunks. A good size target is 2 GB per chunk. For example:
$ du -ks /disk2/data/tmnguyen/socketudp/data*2020 | awk '{s+=$1; print s"\t"$0}'
433728   433728   /disk2/data/tmnguyen/socketudp/data_Feb02_2020
1058092  624364   /disk2/data/tmnguyen/socketudp/data_Feb03_2020
1271012  212920   /disk2/data/tmnguyen/socketudp/data_Feb04_2020
1476712  205700   /disk2/data/tmnguyen/socketudp/data_Feb06_2020
2498948  1022236  /disk2/data/tmnguyen/socketudp/data_Feb07_2020
2498956  8        /disk2/data/tmnguyen/socketudp/data_Feb09_2020
3022512  523556   /disk2/data/tmnguyen/socketudp/data_Feb10_2020
3022712  200      /disk2/data/tmnguyen/socketudp/data_Feb11_2020
3042356  19644    /disk2/data/tmnguyen/socketudp/data_Feb12_2020
3770908  728552   /disk2/data/tmnguyen/socketudp/data_Feb17_2020
4705628  934720   /disk2/data/tmnguyen/socketudp/data_Feb19_2020
4795148  89520    /disk2/data/tmnguyen/socketudp/data_Feb20_2020
5092448  297300   /disk2/data/tmnguyen/socketudp/data_Feb21_2020
5751636  659188   /disk2/data/tmnguyen/socketudp/data_Feb22_2020
5755548  3912     /disk2/data/tmnguyen/socketudp/data_Feb25_2020
5762860  7312     /disk2/data/tmnguyen/socketudp/data_Feb26_2020
6084616  321756   /disk2/data/tmnguyen/socketudp/data_Jan03_2020
6094680  10064    /disk2/data/tmnguyen/socketudp/data_Jan22_2020
6129020  34340    /disk2/data/tmnguyen/socketudp/data_Jan23_2020
6236776  107756   /disk2/data/tmnguyen/socketudp/data_Jan24_2020
6241600  4824     /disk2/data/tmnguyen/socketudp/data_Jan29_2020
6628736  387136   /disk2/data/tmnguyen/socketudp/data_Jan30_2020
7372852  744116   /disk2/data/tmnguyen/socketudp/data_Jan31_2020
The first column is the running total in KB, so the total size of all of the "data*2020" directories is about 7.3 GB.
They can therefore be archived as a single tar file or split into a few files.
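Grouping directories into chunks near a size target can be sketched as a greedy pass over the du output. This is only an illustration, not the method used on this page (the example below splits by date pattern instead). The function name assign_chunks and its limit argument are hypothetical, and the sketch assumes paths without whitespace.

```shell
#!/bin/sh
# Hypothetical helper: read `du -ks` output (size_kb<TAB>path) on stdin and
# print "chunk<TAB>size_kb<TAB>path", starting a new chunk whenever the next
# entry would push the current chunk over the limit.
# limit_kb is an assumed parameter; 2 GB is 2097152 KB if 1 KB = 1024 bytes.
assign_chunks() {
    limit_kb=$1
    awk -v limit="$limit_kb" '
        BEGIN { chunk = 1; used = 0 }
        {
            # An entry larger than the limit still gets a chunk of its own.
            if (used > 0 && used + $1 > limit) { chunk++; used = 0 }
            used += $1
            print chunk "\t" $1 "\t" $2
        }'
}

# Usage (paths taken from the example above):
#   du -ks /disk2/data/tmnguyen/socketudp/data*2020 | assign_chunks 2097152
```

The chunk number in the first output column could then serve as the "part" number when building the tar files.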
Prepare tar files
The naming of files for tape upload is important. File names must follow the Mu2e convention explained here.
File names will look like
data_tier.owner.description.configuration.sequencer.file_format
A critical point is that every file name must be unique; the "sequencer" field makes this possible when uploading a "dataset" of similar files.
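As a rough sketch, a name with the six dot-separated fields shown above can be assembled and sanity-checked before any tar file is created. The helper make_name is hypothetical and not part of the Mu2e tools; it only checks that there are six non-empty fields with no embedded dots (an assumption drawn from the pattern above), not the full Mu2e convention.

```shell
#!/bin/sh
# Hypothetical helper: join six fields into
#   data_tier.owner.description.configuration.sequencer.file_format
# It rejects a wrong field count, empty fields, and fields containing a dot.
make_name() {
    if [ "$#" -ne 6 ]; then
        echo "usage: make_name tier owner description configuration sequencer format" >&2
        return 1
    fi
    for field in "$@"; do
        case $field in
            ''|*.*) echo "invalid field: '$field'" >&2; return 1 ;;
        esac
    done
    printf '%s.%s.%s.%s.%s.%s\n' "$@"
}

# Usage: vary only the sequencer field to keep names in one dataset unique.
#   make_name bck tmnguyen socketudp v0 2020part1 tbz
```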
Continuing with the example,
$ cd /disk2/data/tmnguyen/socketudp
$ tar jcvf /disk2/bck.tmnguyen.socketudp.v0.2020part1.tbz data*Jan*2020
$ tar jcvf /disk2/bck.tmnguyen.socketudp.v0.2020part2.tbz data*Feb0*2020
$ tar jcvf /disk2/bck.tmnguyen.socketudp.v0.2020part3.tbz data*Feb1*2020
Copy the files to a mu2egpvm machine
$ ssh mu2egpvm01.fnal.gov
$ mkdir /mu2e/data/users/tmnguyen
$ cd /mu2e/data/users/tmnguyen
$ scp -p mu2etest.fnal.gov:/disk2/bck.\*.tbz .
(and remove them from mu2etest)
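Before removing the originals, it may be worth confirming that the copies are intact. The following is a minimal sketch using sha256sum; the function names and the manifest file name are hypothetical, and this check is not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the tar files on both machines.

# Write checksums of all bck.*.tbz files in the current directory to a manifest.
make_manifest() {
    sha256sum bck.*.tbz > "$1"
}

# Check the files in the current directory against a manifest.
# Returns non-zero if any file is missing or differs.
verify_manifest() {
    sha256sum -c "$1" > /dev/null 2>&1
}

# Usage sketch:
#   on mu2etest, in /disk2:                  make_manifest bck.sha256
#   copy bck.sha256 along with the tar files (for example with scp)
#   on mu2egpvm01, in the copy directory:    verify_manifest bck.sha256 && echo "copies OK"
```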
Tape upload
This step is best done using VNC or a terminal multiplexer like "screen" or "tmux", because the command will probably take more than a day to complete. A terminal multiplexer prevents the command from being killed if the ssh session disconnects for any reason.
$ ssh mu2egpvm01.fnal.gov
$ screen
$ mu2einit
$ setup mu2efiletools
$ kx509
$ cd /mu2e/data/users/tmnguyen
$ mu2eFileMoveToTape bck.*.tbz
Another way to ensure that the command will complete is to run it under nohup. A complication is that output written to the nohup.out file may be buffered, so the file can lag behind the actual progress of the command.
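A minimal sketch of the nohup approach follows, assuming the environment setup (mu2einit, setup mu2efiletools, kx509) has already been done in the same shell. The wrapper run_detached and the log file name are hypothetical. The wrapper sends output to an explicit log file rather than nohup.out, which does not by itself remove any buffering inside the command.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command under nohup in the background, with
# stdin detached and stdout/stderr sent to the given log file.
# The PID of the background job is left in DETACHED_PID.
run_detached() {
    log=$1
    shift
    nohup "$@" < /dev/null > "$log" 2>&1 &
    DETACHED_PID=$!
}

# Usage sketch (command taken from the example above):
#   cd /mu2e/data/users/tmnguyen
#   run_detached upload.log mu2eFileMoveToTape bck.*.tbz
#   echo "started with PID $DETACHED_PID"
```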
Check
Check the next day whether the process is complete. You can start a new ssh shell and re-connect to the existing "screen" session:
$ ssh mu2egpvm01.fnal.gov
$ screen -d -r
If everything worked, the source files given to the mu2eFileMoveToTape command will have been deleted. If they are still there and you are SURE that the original mu2eFileMoveToTape process is no longer running (on any mu2egpvm node!), you can re-run the same mu2eFileMoveToTape command, repeating until the upload succeeds.
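The check described above can be sketched with two small helpers. Both function names are hypothetical. Note that upload_running only inspects the node it runs on, so it would have to be repeated on every mu2egpvm node where the upload might have been started.

```shell
#!/bin/sh
# Hypothetical helper: count the bck.*.tbz files still present in a directory.
# A result of 0 means all source files have been removed.
count_remaining() {
    dir=$1
    set -- "$dir"/bck.*.tbz
    # If the glob matched nothing, the unexpanded pattern is the only argument.
    if [ ! -e "$1" ]; then
        echo 0
    else
        echo "$#"
    fi
}

# Hypothetical helper: succeed if the current user has a process whose command
# line mentions mu2eFileMoveToTape ON THIS NODE ONLY.
upload_running() {
    pgrep -u "$(id -un)" -f mu2eFileMoveToTape > /dev/null
}

# Usage sketch:
#   count_remaining /mu2e/data/users/tmnguyen
#   upload_running && echo "still running on this node"
```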