Rucio: Difference between revisions
Revision as of 16:30, 20 June 2023
Introduction
Rucio is a CERN software system for storing file metadata and organizing the delivery of that data to users. Its primary features are scalability, flexibility, adaptive file replication, and built-in monitoring. It can use various backends for databases, various platforms for its servers and daemons, and various transfer and storage method plug-ins, and it provides a command line and Python interface for users.
The new system would consist of these parts:
- Metacat - a database of file metadata (docs GUI)
- Rucio - a database of file locations, and servers which can move and track data, responding to user rules (docs)
- Data Dispatcher - a modern replacement for SAM project file delivery (docs GUI)
- mdh - Mu2e data-handling commands added to supplement the above systems (see mdh -h)
A few overarching concepts to keep in mind:
- these systems only recognize authentication with tokens
- metacat requires you to be authenticated to write to the database (create files, datasets)
- all files belong to a namespace, also known as the scope. Namespaces can't be deleted.
- if you create a namespace, it must start with your username (by policy)
- the combination of namespace:filename uniquely identifies a file and is called a did
- files must be named by the naming convention (by policy)
- all files are readable by all users
- metacat has roles. Users can be members of a role, and then the role can create and own objects
- file records are not deleted; they are either retired (no new file of the same name can be created) or modified
- each file must belong to at least one dataset (which might not be in the same namespace)
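As a rough illustration of the did and namespace concepts above, the sketch below splits a did of the form namespace:filename into its two parts and checks whether a namespace starts with a given username. The function names and the example did are hypothetical and are not part of metacat or mdh; the example filename is only a placeholder and does not follow the naming convention.

```shell
# Hypothetical helpers, not part of metacat or mdh.

# Print the namespace part of a did (everything before the first colon).
did_namespace() {
    printf '%s\n' "${1%%:*}"
}

# Print the filename part of a did (everything after the first colon).
did_filename() {
    printf '%s\n' "${1#*:}"
}

# Succeed if namespace $1 starts with username $2
# (the policy for user-created namespaces described above).
namespace_allowed() {
    case "$1" in
        "$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with a placeholder did:
did="alice_test:example.txt"
echo "namespace: $(did_namespace "$did")"
echo "filename:  $(did_filename "$did")"
```

These helpers only manipulate strings; they do not contact the metacat server, so a did that parses cleanly may still not exist in the database.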
Quick start
setup
setup mu2e
setup mdh
This sets up all related data-handling tools.
Authentication
Authenticate yourself
metacat auth login -m token $USER
- if you get a "token file not found" error, run getToken, or see the token docs
- if you get Authentication failed, you might not have an account
Your authentication lasts as long as your token's validity period. To check your authentication:
metacat auth list
There is no logout
Listing
list namespaces
metacat namespace list
list your namespaces
metacat namespace list -u $USER
Implementation
export METACAT_SERVER_URL=http://dbweb5.fnal.gov:9094/mu2e_meta_prod/app
export METACAT_AUTH_SERVER_URL=https://metacat.fnal.gov:8143/auth/mu2e
export DATA_DISPATCHER_URL=https://metacat.fnal.gov:9443/mu2e_dd_prod/data
export DATA_DISPATCHER_AUTH_URL=https://metacat.fnal.gov:8143/auth/mu2e
/pnfs/mu2e/tape
/pnfs/mu2e/persistent/datasets
/pnfs/mu2e/scratch/datasets (expect to have a greedy cleanup of two weeks)
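A minimal sketch for checking that the four URL variables listed above are set in the current shell, which can help when the tools cannot find a server. The function name check_dh_env is hypothetical and is not part of metacat, the Data Dispatcher, or mdh.

```shell
# Hypothetical helper: report which of the data-handling URL variables are unset.
# Returns 0 if all four are set, 1 otherwise.
check_dh_env() {
    missing=0
    for name in METACAT_SERVER_URL METACAT_AUTH_SERVER_URL \
                DATA_DISPATCHER_URL DATA_DISPATCHER_AUTH_URL; do
        # Indirect lookup of the variable named in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_dh_env after the setup commands prints one line per missing variable and prints nothing if all four are defined.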
Admin
- create new accounts via the GUI
- anonymizer is for token access and is the text of the "sub" field from the user's token. Enabling token access is required.
- DN is for proxy access and is easiest to get from the user's account by running metacat auth mydn -c /tmp/x509up_u$UID. It can also be left blank
- password can be left blank since we expect only token access