StashCache

Revision as of 15:21, 2 May 2017

Introduction

Different disk systems are optimized for different demands. CVMFS is designed to distribute code releases. dCache is designed to deliver large datasets to the grid. The lab found there was an intermediate case: a few large files that have to be delivered to all nodes during a grid job. (Note that this is substantially different from the dCache case, where each node gets a different data file.) Examples of this use case are GENIE files or analysis template libraries for NOvA; in the case of Mu2e, stopped muon ntuples, magnetic field maps, or sample data files would qualify. (Currently the first two are on CVMFS.)

The OSG solution for this use case is called StashCache, a merge of the CVMFS interface with dCache storage. The CVMFS interface makes the files look like they are on an NFS-mounted disk, so they can be copied to the working area or opened in place. The files are kept in dCache, so the space available is much larger than for pure CVMFS, which is limited to a cache of a few GB on the local node.

Like CVMFS, StashCache is filled by copying data to a central location. The data migrates out to other cache locations at grid sites on a time scale of about an hour. After this latency, the files are available to all grid nodes on OSG.

Usage

Define a read area and a write area:

export MU2E_STASH_WRITE=/pnfs/mu2e/persistent/stash
export MU2E_STASH_READ=/cvmfs/mu2e.osgstorage.org/pnfs/fnal.gov/usr/mu2e/persistent

Files to be distributed are copied interactively to the appropriate area under the stash cache directory.

cp foo $MU2E_STASH_WRITE/users/$USER/foo
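
The copy step above can be wrapped in a small function. This is a minimal sketch, not an existing Mu2e tool: the name `stash_put` is hypothetical, and it assumes that the per-user directory may not exist yet and that `mkdir -p` is permitted under the write area.

```shell
# Hypothetical helper (not part of any Mu2e tool): copy one file into the
# per-user directory under the stash write area.
stash_put() {
    local src="$1"
    if [ -z "$MU2E_STASH_WRITE" ] || [ -z "$USER" ]; then
        echo "MU2E_STASH_WRITE and USER must be set" >&2
        return 1
    fi
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    local dest_dir="$MU2E_STASH_WRITE/users/$USER"
    local dest="$dest_dir/$(basename "$src")"
    # Assumption: the user is allowed to create this directory.
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest" || return 1
    # Compare byte counts as a cheap check that the copy is complete.
    local src_size dest_size
    src_size=$(wc -c < "$src")
    dest_size=$(wc -c < "$dest")
    [ "$src_size" -eq "$dest_size" ]
}
```

With the write area defined as above, `stash_put foo` would place the file at `$MU2E_STASH_WRITE/users/$USER/foo`.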

Interactively, or on any grid node, the user can copy the file locally if a fast disk response is needed

cp $MU2E_STASH_READ/users/$USER/foo .

or simply open it as a disk file.
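
Because newly written files take on the order of an hour to reach the caches, a grid job that runs soon after an upload may want to wait for the file before copying it. The following is a sketch under stated assumptions: the function name `stash_fetch` and its default timeout and polling interval are illustrative choices, not values taken from the StashCache documentation.

```shell
# Hypothetical helper: wait until a file is readable under the stash read
# area, then copy it into the current directory.
#   $1  path relative to $MU2E_STASH_READ (for example users/$USER/foo)
#   $2  timeout in seconds (assumed default: 7200)
#   $3  polling interval in seconds (assumed default: 300)
stash_fetch() {
    local rel="$1"
    local timeout="${2:-7200}"
    local interval="${3:-300}"
    if [ -z "$MU2E_STASH_READ" ]; then
        echo "MU2E_STASH_READ must be set" >&2
        return 1
    fi
    local src="$MU2E_STASH_READ/$rel"
    local waited=0
    # Poll until the file shows up or the timeout is reached.
    while [ ! -r "$src" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $src" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    cp "$src" .
}
```

A job script could then call `stash_fetch "users/$USER/foo"` and stop if the function returns a non-zero status.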

The cache area is not executable.
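
If a script or program is stored in the cache area, it presumably has to be copied to a local directory and marked executable before it can be run. A minimal sketch of that pattern, where the function name `run_from_stash` is hypothetical and the current working directory is assumed to allow execution:

```shell
# Hypothetical helper: copy a file out of the (non-executable) cache area
# into the current directory, mark it executable, and run it with any
# remaining arguments.
run_from_stash() {
    local src="$1"
    shift
    local exe="./$(basename "$src")"
    cp "$src" "$exe" || return 1
    chmod u+x "$exe" || return 1
    "$exe" "$@"
}
```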