Prestage
Latest revision as of 18:38, 28 January 2023
Introduction
When files are uploaded to tape, they are first copied to a tape-backed dCache disk area. From there, they migrate automatically to tape; after that, the least-used files are deleted from dCache when disk space is needed.
Prestaging is the process of making sure all the files in a dataset have been copied off tape and written back to disk in dCache so they are ready to be used in a grid job. All files on tape also have a SAM record and are part of a SAM dataset. We usually prestage all the files in a dataset, but any subset of files can be prestaged by operating on a SAM dataset definition.
Note that accessing a file through the pnfs directory will not trigger a prestage. See below for prestage techniques. It can take a minute or more to mount a tape and find a random file. Backlogs at times of high demand on tape drives can cause hours to days of wait time.
Note
In January 2023 we were informed of a bug in SAM prestage: a prestage process may stop progressing when it gets near the end of the project. It will look as if a few files have been left for a long time. The files may or may not have actually been prestaged. The only solution we know of is to submit a ticket to restart the SAM station.
When to prestage
First, when in doubt, prestage. It is harmless, except for the delay, and you can be confident that tape response will not be a problem.
Next, you can check whether the files are on disk by running the SAM utility script samOnDisk on a dataset name:
setup dhtools
samOnDisk <DATASET>
This script selects files at random and checks whether they are on disk. After a few minutes it should become clear what fraction is on disk. If it is nearly 100%, you don't need to prestage.
As the script loops over random files, it updates the running totals. Here is the output for a file that is both on tape (NEARLINE) and on disk (ONLINE):
ONLINE_AND_NEARLINE
0
120/ 159 are on disk, 75.5 %
You can see that 120 of 159 files so far were found on disk. Here is one that is not on disk:
dc_stage fail : File not cached
System error: Resource temporarily unavailable
NEARLINE
255
120/ 160 are on disk, 75.0 %
This file is only on tape (NEARLINE). The total on disk is still 120, but the total checked is now 160, so the fraction went down. Since the file selection is random, the fraction should settle to a usefully accurate answer after about 100 files.
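The running fraction above can be reproduced with a short sketch. This is only an illustration of the arithmetic, not what samOnDisk does internally: the locality strings are taken from the output shown above, while the function name and the binomial standard error are assumptions added here to show why about 100 random files is enough (at 100 files the standard error is at most about 5 percentage points).

```python
import math


def on_disk_estimate(localities):
    """Summarize a random sample of file localities.

    `localities` holds strings such as "ONLINE_AND_NEARLINE" or "NEARLINE",
    as printed in the samOnDisk output above.
    Returns (n_on_disk, n_checked, fraction, standard_error).
    """
    checked = len(localities)
    if checked == 0:
        raise ValueError("need at least one checked file")
    # A file counts as on disk if ONLINE is one of its locality components.
    on_disk = sum(1 for loc in localities if "ONLINE" in loc.split("_AND_"))
    fraction = on_disk / checked
    # Binomial standard error of the sampled fraction (an added assumption,
    # valid because the files are chosen at random).
    std_err = math.sqrt(fraction * (1.0 - fraction) / checked)
    return on_disk, checked, fraction, std_err


# The second example above: 120 of 160 sampled files are on disk.
sample = ["ONLINE_AND_NEARLINE"] * 120 + ["NEARLINE"] * 40
n_disk, n_checked, frac, err = on_disk_estimate(sample)
print(f"{n_disk}/{n_checked} are on disk, {100 * frac:.1f} % +/- {100 * err:.1f} %")
```

The uncertainty shrinks only as the square root of the number of files checked, so letting the script run far beyond a few hundred files gains little.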
You do not need to prestage a dataset if it has fewer than a few hundred files. In this case the system should respond in time for your grid job to succeed. Note that the prestage should also be quick, so it is still a good idea to run it.
Prestage less than 100K files
As a practical matter, it is better to handle larger datasets by splitting them up first. Smaller datasets, with fewer than 100K files, can be prestaged in one command. You can see how many files are in your dataset with:
samweb count-files "dh.dataset=DATASET"
where DATASET is a mu2e SAM dataset name. Prestage with:
setup dhtools
samweb prestage-dataset --parallel=5 --defname=DATASET
You will need certificate authentication to run this command, since it writes to the SAM database.
The prestage will create a SAM project on the SAM station and create consumers to start requesting files from the project. The project on the SAM station knows all the files it will need, so it does two things:
- gives out the files it concludes are on disk
- looks ahead in the file list and starts prestaging upcoming files in a logical and efficient manner
The first point can be seen as the prestaging proceeding quickly as long as it keeps finding files on disk, then slowing down when it starts requesting files off tape. The --parallel switch sets how many SAM consumers run in parallel, making requests to the SAM database; 5 should always be reasonable.
The SAM station knows which files are on disk, but we don't know how it knows, or how often that information is updated. When SAM prestages a file from tape to disk, it updates the last-accessed time of the file in the dCache record. If it finds the file is already on disk, it does not update the dCache last-accessed time. This means dCache may purge a file right after SAM reports that it was found on disk.
Prestage more than 100K files
Please take a look at the above section for some background information.
As a practical matter, it is better to handle larger datasets by splitting them up first. You can do this by creating a set of dataset definitions that contain subsets of the big dataset. There is a script in dhtools to do this.
We recommend breaking up the dataset into chunks of 100K files, so let N=Nfiles/100K, rounded up to a whole number, then:
setup dhtools
samSplit DATASET TAG N
The subsets will come out as N dataset definitions, each a subset of DATASET, with SUBSET_NAME of the form ${USER}_${TAG}_X, where X is the subset's number.
It takes about a day to prestage 100K files, so run one of these every day:
setup dhtools
samweb prestage-dataset --parallel=5 --defname=SUBSET_NAME
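The chunk arithmetic can be sketched as follows. This is a minimal illustration that assumes N is rounded up so that no subset exceeds 100K files, and that the file count comes from the samweb count-files command shown earlier. The function names are hypothetical, and how samSplit itself distributes files among the subsets is not specified here.

```python
CHUNK_SIZE = 100_000  # recommended number of files per subset


def split_count(nfiles, chunk_size=CHUNK_SIZE):
    """Number of subsets N such that no subset needs more than chunk_size files."""
    if nfiles <= 0:
        raise ValueError("nfiles must be positive")
    # Integer ceiling division: rounds up without floating point.
    return -(-nfiles // chunk_size)


def samsplit_command(dataset, tag, nfiles):
    """Build the samSplit command line for a dataset with nfiles files."""
    return f"samSplit {dataset} {tag} {split_count(nfiles)}"


# Example with a made-up file count of 250,000.
print(samsplit_command("DATASET", "TAG", 250_000))
```

At roughly one subset per day, the value of N is also a rough estimate of how many days the full prestage will take when things are going well.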
Prestaging Multiple Small Datasets
From time to time it is necessary to prestage several datasets as input to one job. For example, MDC2020Dev mixing jobs read inputs from 14 different datasets. There are 1776 files in these datasets and the files are scattered across a small number of tape volumes. In this example, prestage time is dominated by waiting for an available tape drive, robot arm motion and tape seek time. If you prestage each dataset on its own you will pay most of these costs 14 times.
To optimize this you can form a new dataset definition that merges these 14 datasets into one; then you can prestage the new dataset. In the MDC2020Dev example, it took 1 hour to prestage 70 files from one dataset, and it also took 1 hour to prestage 1706 files from the other 13 datasets that were combined into a single new dataset definition.
When prestaging a dataset, SAM knows how to find all of the tapes that contain one or more of the files, mount each tape once, and copy all files from the specified dataset that are on each tape, in the order in which the files are found on the tape; there is no wasted robot or tape seek motion.
If there are more than 100K files in the new dataset definition, split it as described in the earlier section.
For example, the SAM command to create a dataset definition that joins three datasets is:
samweb create-definition <your_user_name>_<unique_id> "dh.dataset dsname1 or dh.dataset dsname2 or dh.dataset dsname3"
where <your_user_name>_<unique_id> is the name of the new dataset definition. The name should start with your username and must be unique; for more details see the discussion of dataset definitions.
In the above command, the full list of "dh.dataset name" terms must be on a single line; in a shell script, the line may not be broken with a trailing \.
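For a long list of datasets, one way to keep the dimension string on a single line is to assemble the command programmatically. The sketch below only builds and prints the command text and does not run samweb. The function name and the example definition name are hypothetical, and the dataset names are the placeholders used above.

```python
import shlex


def create_definition_command(defname, datasets):
    """Build a one-line samweb create-definition command joining several datasets."""
    if not datasets:
        raise ValueError("need at least one dataset name")
    # Join the "dh.dataset NAME" terms with "or", all on a single line.
    dimensions = " or ".join(f"dh.dataset {name}" for name in datasets)
    # shlex.join quotes the dimension string so the shell sees it as one argument.
    return shlex.join(["samweb", "create-definition", defname, dimensions])


# "myuser_mix_inputs" is a made-up definition name for illustration.
command = create_definition_command(
    "myuser_mix_inputs", ["dsname1", "dsname2", "dsname3"]
)
print(command)
```

The printed line can be pasted into a shell or a script as is; because the string is assembled in one piece, there is no need for a trailing backslash.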
Prestage speed
Overall, the prestage speed is about 100K files per day if things are going well. If the project has to get files off tape and Enstore is very busy, it may slow down substantially. dCache operators have told us that, during busy periods, it can take up to two weeks to prestage files and a week to upload files. You can monitor the progress of the SAM station through the SAM station links on the OfflineOps page.