An example of how to archive files to tape
The procedure is for saving "random" assortments of files as tape archives. One would have to retrieve and expand a tar file later to access its content; the individual files will not be accessible from their tape storage in a transparent way. This corresponds to the "Random files" option here. The current page walks through steps in an example use case.
Decide how to divide data
Decide how to divide the data to be archived into chunks. A good target size is 2 GB per chunk. For example:
$ du -ks /disk2/data/tmnguyen/socketudp/data*2020 | awk '{s+=$1; print s"\t"$0}'
433728 433728 /disk2/data/tmnguyen/socketudp/data_Feb02_2020
1058092 624364 /disk2/data/tmnguyen/socketudp/data_Feb03_2020
1271012 212920 /disk2/data/tmnguyen/socketudp/data_Feb04_2020
1476712 205700 /disk2/data/tmnguyen/socketudp/data_Feb06_2020
2498948 1022236 /disk2/data/tmnguyen/socketudp/data_Feb07_2020
2498956 8 /disk2/data/tmnguyen/socketudp/data_Feb09_2020
3022512 523556 /disk2/data/tmnguyen/socketudp/data_Feb10_2020
3022712 200 /disk2/data/tmnguyen/socketudp/data_Feb11_2020
3042356 19644 /disk2/data/tmnguyen/socketudp/data_Feb12_2020
3770908 728552 /disk2/data/tmnguyen/socketudp/data_Feb17_2020
4705628 934720 /disk2/data/tmnguyen/socketudp/data_Feb19_2020
4795148 89520 /disk2/data/tmnguyen/socketudp/data_Feb20_2020
5092448 297300 /disk2/data/tmnguyen/socketudp/data_Feb21_2020
5751636 659188 /disk2/data/tmnguyen/socketudp/data_Feb22_2020
5755548 3912 /disk2/data/tmnguyen/socketudp/data_Feb25_2020
5762860 7312 /disk2/data/tmnguyen/socketudp/data_Feb26_2020
6084616 321756 /disk2/data/tmnguyen/socketudp/data_Jan03_2020
6094680 10064 /disk2/data/tmnguyen/socketudp/data_Jan22_2020
6129020 34340 /disk2/data/tmnguyen/socketudp/data_Jan23_2020
6236776 107756 /disk2/data/tmnguyen/socketudp/data_Jan24_2020
6241600 4824 /disk2/data/tmnguyen/socketudp/data_Jan29_2020
6628736 387136 /disk2/data/tmnguyen/socketudp/data_Jan30_2020
7372852 744116 /disk2/data/tmnguyen/socketudp/data_Jan31_2020
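The first column is the running total and the second column is the size of each directory, both in KB, so the total size of all of the "data*2020" directories is a little over 7 GB (7372852 KB).
They can therefore be archived as a single tar file, or split into a few files.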
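The split can also be planned mechanically. The following is a minimal sketch, not part of the original procedure: a hypothetical helper, plan_chunks, reads "du -ks" output on standard input and assigns entries to consecutive chunks, starting a new chunk whenever the next entry would push the running total past the target. The default target of 2097152 KB is an assumption about how the 2 GB figure is meant.

```shell
#!/bin/sh
# plan_chunks: hypothetical helper, not part of mu2efiletools.
# Input : lines of "size_in_KB  path", as printed by "du -ks".
# Output: lines of "chunk_number<TAB>path".
plan_chunks() {
    target_kb="${1:-2097152}"   # assumed target: 2 GB expressed in KB
    awk -v target="$target_kb" '
        BEGIN { chunk = 1; total = 0 }
        {
            size = $1
            path = $0
            # Drop the leading size field, keeping the rest of the line as the path.
            sub(/^[ \t]*[^ \t]+[ \t]+/, "", path)
            # Open a new chunk only if the current one already holds something.
            if (total > 0 && total + size > target) { chunk++; total = 0 }
            total += size
            print chunk "\t" path
        }'
}
```

It could be used as "du -ks /disk2/data/tmnguyen/socketudp/data*2020 | plan_chunks". A single directory larger than the target still gets a chunk of its own, since directories are never split.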
Prepare tar files
The naming of files for tape upload is important. File names must follow the Mu2e convention explained here.
File names will look like
data_tier.owner.description.configuration.sequencer.file_format
A critical point is that every file name must be unique. The "sequencer" field makes this possible when uploading a "dataset" of similar files.
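A quick sanity check on a candidate name can catch a missing field before the upload. This is only a sketch: the hypothetical helper has_six_fields counts the dot-separated fields and rejects empty ones. It does not know which values the Mu2e convention allows in each field.

```shell
#!/bin/sh
# has_six_fields: hypothetical helper. Succeeds when the base name of its
# argument has exactly six non-empty, dot-separated fields:
#   data_tier.owner.description.configuration.sequencer.file_format
# It does NOT validate the content of any field.
has_six_fields() {
    name=$(basename "$1")
    echo "$name" | awk -F. '
        {
            ok = (NF == 6)
            for (i = 1; i <= NF; i++) if ($i == "") ok = 0
        }
        END { exit (ok ? 0 : 1) }'
}
```

For example, "has_six_fields bck.tmnguyen.socketudp.v0.2020part1.tbz && echo ok" prints "ok", because that name has the six fields bck, tmnguyen, socketudp, v0, 2020part1 and tbz.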
Continuing with the example,
$ cd /disk2/data/tmnguyen/socketudp
$ tar jcvf /disk2/bck.tmnguyen.socketudp.v0.2020part1.tbz data*Jan*2020
$ tar jcvf /disk2/bck.tmnguyen.socketudp.v0.2020part2.tbz data*Feb0*2020
$ tar jcvf /disk2/bck.tmnguyen.socketudp.v0.2020part3.tbz data*Feb1*2020
Copy the files to a mu2egpvm machine
$ ssh mu2egpvm01.fnal.gov
$ mkdir /mu2e/data/users/tmnguyen
$ cd /mu2e/data/users/tmnguyen
$ scp -p mu2etest.fnal.gov:/disk2/bck.\*.tbz .
(and remove them from mu2etest)
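Before removing the originals it may be worth confirming that the copies are intact. The following is a minimal sketch, assuming sha256sum (GNU coreutils) is available on both machines, which the original procedure does not state. The helper names are hypothetical.

```shell
#!/bin/sh
# checksum_of: print only the SHA-256 digest of a file.
checksum_of() {
    sha256sum "$1" | awk '{ print $1 }'
}

# same_content: hypothetical helper. Succeeds when the two files given as
# arguments have identical SHA-256 digests.
same_content() {
    [ "$(checksum_of "$1")" = "$(checksum_of "$2")" ]
}
```

When the two copies live on different hosts, the same idea can be applied by computing the checksums remotely and checking them locally, for example by running "ssh mu2etest.fnal.gov 'cd /disk2 && sha256sum bck.*.tbz' | sha256sum -c" from /mu2e/data/users/tmnguyen.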
Tape upload
This step is best done using VNC or a terminal multiplexer like "screen" or "tmux", because the command will probably take more than a day to complete. Running inside such a session prevents the command from being killed if the ssh session disconnects for any reason.
$ ssh mu2egpvm01.fnal.gov
$ screen
$ mu2einit
$ setup mu2efiletools
$ kx509
$ cd /mu2e/data/users/tmnguyen
$ mu2eFileMoveToTape bck.*.tbz
Another way to ensure that the command will complete is to run it under nohup. A complication here is that output written to the nohup.out file may be buffered, so the file can lag behind the actual progress of the command.
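A minimal sketch of the nohup variant follows. The wrapper run_detached and the log file name are hypothetical. The wrapper sends output to a named log rather than the default nohup.out, and it does nothing about buffering inside the command itself.

```shell
#!/bin/sh
# run_detached: hypothetical wrapper.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
# Starts COMMAND under nohup in the background, with stdout and stderr
# written to LOGFILE and stdin detached, then prints the PID of the job.
run_detached() {
    log="$1"
    shift
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo "$!"
}
```

With this, the upload could be started as "run_detached upload.log mu2eFileMoveToTape bck.*.tbz" and followed with "tail -f upload.log". Lines may still appear in the log with some delay.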
Check
Check the next day whether the process is complete. You can start a new ssh shell and re-connect to the existing "screen" session:
$ ssh mu2egpvm01.fnal.gov
$ screen -d -r
If everything worked, the source files given to the mu2eFileMoveToTape command will have been deleted. If they are still there and you are SURE that the original mu2eFileMoveToTape process is no longer running (on any mu2egpvm node!), you can re-run the same mu2eFileMoveToTape command, repeating until the upload succeeds.
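The check for leftover source files can be scripted. This is a sketch with a hypothetical helper, remaining_archives, which lists the bck.*.tbz files still present in a directory and succeeds only when none are left. It does not check whether an upload process is still running.

```shell
#!/bin/sh
# remaining_archives: hypothetical helper.
# Usage: remaining_archives [DIR]
# Prints any bck.*.tbz files still present directly in DIR (default: the
# current directory) and returns non-zero if at least one remains.
remaining_archives() {
    dir="${1:-.}"
    found=$(find "$dir" -maxdepth 1 -type f -name 'bck.*.tbz' | sort)
    if [ -n "$found" ]; then
        echo "$found"
        return 1
    fi
    return 0
}
```

For example, "remaining_archives /mu2e/data/users/tmnguyen && echo 'all files uploaded'". Before re-running the upload, a command such as "pgrep -f mu2eFileMoveToTape" would still have to be run on each mu2egpvm node to confirm that no earlier process is alive.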