StashCache

Introduction

Different disk systems are optimized for different demands. CVMFS is designed to distribute code releases. dCache is designed to deliver large datasets to the grid. The lab found there was an intermediate case: a few large files that have to be delivered to all nodes during a grid job. (Note that this is substantially different from the dCache case, where each node gets a different data file.) Examples of this use case are GENIE files or analysis template libraries for NOvA; for Mu2e, stopped muon ntuples, magnetic field maps, or sample data files would qualify. (Currently the first two are on CVMFS.)

The OSG solution for this use case is called StashCache, a merge of the CVMFS interface with dCache storage. The CVMFS interface makes the files look like they are on an NFS-mounted disk, so they can be copied to the working area or opened in place. The files are kept in dCache, so the space available is much larger than for pure CVMFS, which is limited to a few GB of cache on the local node.

Like CVMFS, StashCache is filled by copying data to a central location. The data migrates out to other cache locations at grid sites on a time scale of about an hour. After this latency, the files are available to all grid nodes on OSG.
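
Because of this latency, a newly written file may not be visible on the read side right away. The following is a minimal sketch of one way to wait for it before submitting jobs. The function name wait_for_stash_file, its arguments, and the default timeout and polling interval are all hypothetical choices made for illustration; they are not part of StashCache or CVMFS.

```shell
# Hypothetical helper: poll until a file becomes readable at the given path.
# Usage: wait_for_stash_file PATH [TIMEOUT_SECONDS] [POLL_SECONDS]
wait_for_stash_file() {
    path="$1"
    timeout="${2:-7200}"   # assumed default: give up after two hours
    poll="${3:-300}"       # assumed default: check every five minutes
    waited=0
    while [ ! -r "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep "$poll"
        waited=$((waited + poll))
    done
    return 0
}
```

The path passed in would be the location of the file under the read area described in the Usage section. The function returns 0 once the file is readable and 1 if the timeout expires first.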

Usage

Define a read area and a write area:

export MU2E_STASH_WRITE=/pnfs/mu2e/persistent/stash
export MU2E_STASH_READ=/cvmfs/mu2e.osgstorage.org/pnfs/fnal.gov/usr/mu2e/persistent

Files to be distributed are copied interactively to the appropriate area under the stash cache directory.

mkdir -p $MU2E_STASH_WRITE/users/$USER
cp foo $MU2E_STASH_WRITE/users/$USER/foo

Interactively, or on any grid node, the user can copy the file to local disk if a fast disk response is needed

cp $MU2E_STASH_READ/users/$USER/foo .

or simply open it as a disk file.
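
In a grid job it can help to fail early with a clear message if the file is not visible on the node. The sketch below wraps the copy step with a readability check and a byte-for-byte comparison after the copy. The function name fetch_from_stash and its arguments are hypothetical; only standard cp, cmp, and basename are used.

```shell
# Hypothetical helper: copy one file from the read area into a local directory.
# Usage: fetch_from_stash SOURCE_FILE [DEST_DIR]
fetch_from_stash() {
    src="$1"
    dest="${2:-.}"         # default: current working directory
    if [ ! -r "$src" ]; then
        echo "not readable: $src" >&2
        return 1
    fi
    cp "$src" "$dest/" || return 1
    # Confirm the local copy matches the source byte for byte.
    cmp -s "$src" "$dest/$(basename "$src")"
}
```

With the variables defined above, a job script might call it as fetch_from_stash "$MU2E_STASH_READ/users/$USER/foo" and stop if it returns a non-zero status.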

The cache area is not executable.
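
Assuming this means that scripts and programs cannot be run in place from the read area, one workaround is to copy the file to local disk and set the execute bit there. The sketch below does this; the function name run_from_stash is hypothetical, and it assumes the temporary directory chosen by mktemp allows execution.

```shell
# Hypothetical helper: copy a program out of the read area and run the copy.
# Usage: run_from_stash SOURCE_FILE [ARGS...]
run_from_stash() {
    src="$1"
    shift
    # Local scratch directory; it is not removed afterwards.
    workdir=$(mktemp -d) || return 1
    cp "$src" "$workdir/" || return 1
    local_copy="$workdir/$(basename "$src")"
    chmod u+x "$local_copy" || return 1
    # Run the local copy with any remaining arguments.
    "$local_copy" "$@"
}
```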
