DataTransfer
Revision as of 22:17, 27 March 2017
Introduction
Grid jobs that run on thousands of nodes can overwhelm a disk system if they are all reading or writing without throttles. The procedures here are all designed to maximize throughput by scheduling transfers in an orderly way. We recommend using ifdh for all non-trivial transfers.
Monte Carlo
For running Monte Carlo grid jobs, please follow the instructions at mu2egrid scripts. This will use the proper tools in an optimized way.
non-Monte Carlo
Whenever you are transferring a non-trivial amount of data, and always when running on a grid, you should use ifdh. This product looks at the input and output destinations and chooses the best method to transfer the data, then schedules your transfer so that no servers or disks are overloaded and the system remains efficient. It usually picks gridftp, which is the most efficient mechanism.
If used interactively, you should make sure you have a kerberos ticket. If used in a grid job, your authentication is established automatically for you. Then set up the product:
setup ifdhc (the c is not a typo, it means "client")
and issue transfer commands:
ifdh cp my-local-file /pnfs/mu2e/scratch/users/$USER/remote-file
ifdh knows about dCache (the /pnfs directory) and other Fermilab disks wherever you are running it, even if the /pnfs directory is not mounted on that grid node.
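Putting the pieces together, a grid-job transfer might follow the pattern below. This is only a sketch: the file names are hypothetical, the dCache scratch path follows the example above, and the ifdh calls are guarded so the script is harmless on a machine where the product is not set up.

```shell
#!/bin/sh
# Sketch of a grid-job transfer pattern with ifdh (file names are examples only).
SCRATCH="/pnfs/mu2e/scratch/users/$USER"    # dCache scratch area, as above
INPUT="$SCRATCH/job-input.dat"              # hypothetical input file
OUTPUT="$SCRATCH/job-output.dat"            # hypothetical output file

# Guard: only call ifdh where the product is actually available.
if command -v ifdh >/dev/null 2>&1; then
    ifdh cp "$INPUT" ./local-input.dat      # stage input onto the worker node
    # ... run the job on local-input.dat, producing local-output.dat ...
    ifdh cp ./local-output.dat "$OUTPUT"    # stage output back to dCache
else
    echo "ifdh not set up; would read $INPUT and write $OUTPUT"
fi
```

Staging the input to the worker node and copying the output back at the end keeps all heavy I/O local to the job, with ifdh scheduling the two transfers.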
You can also transfer data to/from /mu2e/data/users/$USER. In this case ifdh will use "cpn" locks to make sure that no more than 5 grid nodes are writing at any time. Never try to defeat this mechanism by writing from a grid node directly to /mu2e/data: doing so will almost certainly block access to the disk for everyone else and may crash the system. The /mu2e/data disk is not designed for high bandwidth to multiple processes, so this transfer is inherently inefficient.
We recommend always reading and writing from dCache when transferring non-trivial amounts of data, or files of any size to/from non-trivial numbers of grid nodes. dCache has much better bandwidth than /mu2e/data and is fundamentally designed to serve data to grid systems.
cpn notes
Please use ifdh; it will take care of the proper throttling, including using cpn where appropriate. These notes are for general information.
setup cpn
cpn local-file /mu2e/data/users/$USER/remotefile
With one exception, cpn behaves just like the Unix cp command. The exception is that it first checks the size of the file. If the file is small, it just copies it. If the file is large, it checks the number of ongoing copies of large files; if too many copies are happening at the same time, cpn waits its turn before doing its copy. A side effect of this strategy is that there can be some dead time when your job is occupying a worker node while doing nothing but waiting for a chance to copy; the experience of the MINOS experiment is that this loss is small compared to what occurs when /grid/data starts thrashing.
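The throttling strategy can be sketched in plain shell. This is NOT the real MINOS cpn script, only a simplified illustration: the 5 MB threshold and the limit of 5 concurrent copies come from the text, and a temporary directory stands in for the shared LOCK area.

```shell
#!/bin/sh
# Simplified sketch of a cpn-style throttled copy (not the actual cpn code).
LOCKDIR="${TMPDIR:-/tmp}/cpn-sketch-locks.$$"   # stands in for the shared LOCK area
THRESHOLD=$((5 * 1024 * 1024))                  # 5 MB small/large boundary
LIMIT=5                                         # max concurrent large-file copies
mkdir -p "$LOCKDIR"

cpn_sketch() {
    src=$1; dst=$2
    size=$(wc -c < "$src" | tr -d ' ')
    if [ "$size" -lt "$THRESHOLD" ]; then
        cp "$src" "$dst"            # small file: locking would cost more than copying
        return
    fi
    mylock="$LOCKDIR/lock.$$"
    while :; do
        : > "$mylock"                                   # announce this copy with a LOCK file
        active=$(ls "$LOCKDIR" | wc -l | tr -d ' ')     # count ongoing large copies
        if [ "$active" -le "$LIMIT" ]; then
            cp "$src" "$dst"                            # our turn: do the copy
            rm -f "$mylock"
            return
        fi
        rm -f "$mylock"                                 # too many copies in flight:
        sleep 1                                         # back off and wait for a turn
    done
}
```

The small-file shortcut and the lock-file counting mirror the behavior described above; the real cpn also handles races and stale locks, which this sketch omits.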
If you are certain that a file will always be small just use cp. If the file size is variable and may sometimes be large, then use cpn.
We are using the cpn program directly from MINOS: /grid/fermiapp/minos/scripts/cpn
The locking mechanism inside cpn uses LOCK files that are maintained in /grid/data/mu2e/LOCK and in corresponding locations for other collaborations. Using the file system to perform locking means that locking has some overhead: if a file is small enough, it is less work to just copy the file than to use the locking mechanism to serialize the copy. After some experience it was found that 5 MB is a good choice for the boundary between large and small files.