Authentication

From Mu2eWiki


Introduction

The lab services and utilities used by mu2e require gaining and using several kinds of authentication. You will have one username for all computing purposes at the lab.

You log in to the virtual machines with kerberos authentication. You will need a permanent ID called a kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your username. You will have a password associated with your principal. You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.

The second identity you will need is the services principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password). You will need this identity to log into Fermilab email, the servicedesk web site and some other services based at the lab. You would typically only use this authentication at the point you log into the service.

The third identity you will need is a CILogin certificate. This "cert" is the basis of authentication to the mu2e documents database, the computing farms, and a few other services. You will use this cert in two ways. The first way is to load it into your browser, which then gives you access to web pages and web services. The second is by using your kerberos authentication to access a copy of your certificate maintained in a remote database. You get this certificate once and then renew it only once a year.

Hypernews is an archived blog and email list. To access it, you will need a hypernews password and your services password.

Finally, the mu2e internal web pages require a collaboration username and password; please ask your mu2e mentor.

There is a standard lab seminar on authentication for users.

If you need to do any computing in mu2e, please go ahead and start the procedure on the ComputingAccounts page to create your accounts and authentication.

Kerberos

You log in to the virtual machines with kerberos authentication. You will need a permanent ID called a kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your username. You will have a password associated with your principal. You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.

Your kerberos authentication is stored in a file

/tmp/krb5cc_`id -u`_*

where "*" will be a random string. Each time you ssh into a machine, it will produce a new file in /tmp for that process. The environment variable KRB5CCNAME will point to the ticket file. The ticket may be viewed with

klist  
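
For scripts, it can be handy to test for a usable ticket without reading the klist printout. The following is a minimal sketch, not a lab-provided tool: the function name check_ticket is hypothetical, and it assumes the MIT kerberos klist, whose -s option prints nothing and sets a non-zero exit status when the cache is missing or expired.

```shell
# check_ticket (hypothetical helper): report the ticket cache location and
# whether the ticket is usable. Relies on "klist -s", which is silent and
# exits non-zero when the credential cache cannot be read or has expired.
check_ticket() {
    echo "cache: ${KRB5CCNAME:-KRB5CCNAME is not set}"
    if ! command -v klist >/dev/null 2>&1; then
        echo "klist not found on this machine"
        return 2
    fi
    if klist -s; then
        echo "ticket is valid"
    else
        echo "no valid ticket, run kinit"
        return 1
    fi
}
```

After sourcing this, running check_ticket at the prompt prints the cache path and one status line, and its exit status can be used in an if statement.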

If you log into a lab desktop, you would typically use your username (just "xyz" without the "@FNAL.GOV") and a new ticket is created. You can renew the ticket, or create it if you logged on by some other means, using

kinit

kinit takes an argument, which is the username (or full principal), and asks for your password. Tickets are "forwardable" by default. This means that if you are logged into machine A with a ticket and ssh to machine B, the ticket is also forwarded to machine B, so you can ssh (or scp, etc.) onward once you are on B.

Tickets are only valid for 26 h, and you typically refresh your kerberos authentication every day. This would normally happen as you log in to a desktop. If you have an ssh session from machine A to machine B and leave the session up, you may renew the ticket on A, but the ticket on B will not automatically be updated. In this case, you can use

k5push <node>

to push your fresh ticket out through all ssh sessions and update the remote tickets.

There are two utilities in kerberos which we will only note here. One is kcron, together with kcroninit, which allows you to obtain a kerberos ticket in a cron job. The other is the keytab file. This file can hold a ticket that is good for a year and can be used by anyone with access to the keytab file. This feature is not as secure, so it is usually only issued by the lab when needed, for example, in a group account.

Read more at the FNAL kerberos link (Miscellaneous Kerberos Topics for the User -> Automated Processes). Reset password here.

Services

The second identity you will need is the services principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password). You will need this identity to log into Fermilab email, the servicedesk web site, sharepoint and some other services based at the lab. You would typically only use this authentication at the point you log into the service, and there is no local credential cache.

Certificate

For some interactive purposes on linux, via the command line or browsers, you will need a certificate to prove your identity. Our certificates are based on the CILogon Certification Authority (CA) which is geared to big science. Some of the systems which require cert authentication are jobsub job submission, ifdh data transfer and writing to the SAM database as part of file upload to tape.

You should have received a certificate as part of registering for computing accounts. (See also Fermilab docs on certs for creating or renewing a cert.) The cert can be downloaded as a password-protected pem or p12 file and imported in your browser. Your cert is good for a year and only needs to be updated in your browser once a year.

When your cert is first created, it is also communicated to the lab which can then manage and provide the cert for you. When you access your cert at the linux command line, you usually access it from this cache.

kinit 
kx509

kinit makes sure your kerberos identity is valid, and kx509 uses that authentication to make a local, temporary copy of your cert, called a proxy, and writes it to a file named with your UID:

kx509
ls -l /tmp/x509up_u`id -u`
-rw------- 1 rlc mu2e 8171 Aug 15 10:07 /tmp/x509up_u1311


It is this proxy, in this standard location, that commands can use to authenticate you. Note that jobsub and ifdh can automatically run kx509 for you if it is needed (so you only need to remember to kinit). However, samweb does not run kx509 automatically, and you will need to run it yourself if you get authentication errors.
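
Since samweb will not refresh the proxy for you, a script may want to check the proxy file before it starts. This is a minimal sketch under stated assumptions: the function name proxy_ok and its one-hour default are hypothetical, and it assumes openssl is installed and that the first certificate in the proxy file is the one whose expiry matters.

```shell
# proxy_ok FILE [MIN_SECONDS] (hypothetical helper): succeed if FILE exists
# and its first certificate stays valid for at least MIN_SECONDS more seconds
# (default 3600).
proxy_ok() {
    po_file=$1
    po_min=${2:-3600}
    if [ ! -s "$po_file" ]; then
        echo "no proxy at $po_file" >&2
        return 1
    fi
    # -checkend N exits non-zero if the certificate expires within N seconds
    openssl x509 -in "$po_file" -noout -checkend "$po_min" >/dev/null
}

# Possible use before a samweb write operation:
#   proxy_ok "/tmp/x509up_u$(id -u)" || kx509
```

The check only looks at the file and its expiry date. It does not confirm that a remote service will accept the proxy.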

kx509 is equivalent to

setup cigetcert
cigetcert --institution="Fermi National Accelerator Laboratory"

which you might see in some places. If you do not have a kerberos ticket when you run cigetcert, it will prompt you for your services password, and you can use this authentication to access your cert and make a proxy.

You can print your proxy certificate with

voms-proxy-info -all

Your cert works by providing encrypted identity information. This packet will be "signed" by a Certificate Authority (CA). The party, such as a lab service, that wants to check your identity can ask the CA if your packet is valid. CA's may be signed by other CA's in turn up to a nationally-recognized organization, in a "trust chain".

Certificate Error

If you believe you have a valid certificate and you still see errors, for example,

Error creating dataset definition for ...
500 SSL negotiation failed: .

then try removing the cert and recreating it:

kinit 
rm /tmp/x509up_u`id -u`
kx509

The problem is that the proxy created by jobsub or ifdh may require extra steps in authentication because it is "issued" by you. The cert created by kx509 is issued by the CILogon CA, and requires fewer steps to authenticate. While this delete-and-kx509 procedure works fine, the best solution is to provide the intermediate CA in the trust chain to the command (more below).


Types of certs and proxies

A proxy is a copy of your certificate that expires quickly, usually in a few days or hours (see the printouts for "timeleft"). If a bad actor were to gain access to the proxy, they could only use it for the valid time, and they could not replicate it. The proxy is considered safe enough to pass around the grid and over networks. At the command line, we only use proxies.


kx509 creates "end entity certificates", which are temporary copies of your cert. (I've had experts tell me these are technically proxies, but this seems to not be helpful.) Here is what one printout looks like:

subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
issuer    : /DC=org/DC=cilogon/C=US/O=CILogon/CN=CILogon Basic CA 1
identity  : /DC=org/DC=cilogon/C=US/O=CILogon/CN=CILogon Basic CA 1
type      : unknown
strength  : 2048 bits
path      : /tmp/x509up_u1311
timeleft  : 167:59:40
key usage : Digital Signature, Key Encipherment, Data Encipherment

The term "end-entity credential" just means it is below a CA in the certificate trust chain.

If you print your cert and see "subject" with an appendage like "/CN=2707985426", then this is a proxy. It may also say "proxy" in the "type" field. For backwards compatibility, the kx509 certs are not fully RFC-compliant (the current industry standard), so they have type "unknown".
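
The subject rule above can be sketched as a small test on the subject string. The function name is_proxy_subject is hypothetical, and the check covers only the numeric "/CN=..." appendage described here, not the "type" field.

```shell
# is_proxy_subject SUBJECT (hypothetical helper): succeed if the subject ends
# in a purely numeric /CN= component, the appendage that marks a proxy.
is_proxy_subject() {
    printf '%s\n' "$1" | grep -Eq '/CN=[0-9]+$'
}

# Possible use with the subject printed by voms-proxy-info:
#   is_proxy_subject "$(voms-proxy-info -subject)" && echo "this is a proxy"
```

Applied to the printouts on this page, a subject ending in "/CN=UID:rlc" is not flagged, while one ending in "/CN=UID:rlc/CN=528207979" is.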

You can generate a proxy from your certificate. The easiest way I know is:

kx509 --proxyhours=1

which gives

subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc/CN=528207979
issuer    : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
identity  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
type      : RFC compliant proxy
strength  : 2048 bits
path      : /tmp/x509up_u1311
timeleft  : 0:59:55

This is a proxy derived from your end entity cert, so you are the issuer. It has type proxy and has the "/CN=..." and limited time features of a proxy.


Under some circumstances, you might end up with a voms proxy. voms is a system to track identities for use on the grid. When you run a job on the grid, this will be the form of the cert you have on the grid node as you run. These proxies have been "extended" with additional fields that can be considered part of your identity, such as your VO (Virtual Organization, i.e. your experiment) and your role ("Analysis" or "Production").

In our current procedures, you should never need to do this directly, but if you want, you can create a voms proxy:

 voms-proxy-init -noregen -rfc -voms fermilab:/fermilab/mu2e/Role=Analysis

This will create a proxy with an extension:

subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc/CN=333302448
issuer    : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
identity  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
type      : RFC compliant proxy
strength  : 1024 bits
path      : /tmp/x509up_u1311
timeleft  : 11:59:57
key usage : Digital Signature, Key Encipherment, Data Encipherment
=== VO fermilab extension information ===
VO        : fermilab
subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
issuer    : /DC=org/DC=opensciencegrid/O=Open Science Grid/OU=Services/CN=voms2.fnal.gov
attribute : /fermilab/mu2e/Role=Analysis/Capability=NULL
attribute : /fermilab/mu2e/Role=NULL/Capability=NULL
attribute : /fermilab/Role=NULL/Capability=NULL
timeleft  : 11:59:57
uri       : voms2.fnal.gov:15001

The "fermilab" VO is used for all smaller experiments at the lab. In the fermilab VO, there is a "mu2e" group. Again, this has all the features of a proxy.

A different location of the proxy can be indicated to most services with some combination of

X509_USER_PROXY=/tmp/x509up_u`id -u`
X509_USER_CERT=/tmp/x509up_u`id -u`
X509_USER_KEY=/tmp/x509up_u`id -u`
X509_CERT_DIR=/etc/grid-security/certificates

With a real cert, you might need to point CERT and KEY to different files, but with a proxy you can point them both to the proxy. The CERT_DIR variable points the command to the intermediate CA certs.
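
As an illustration of pointing all of these at a single proxy file, here is a minimal sketch. The function name use_proxy is hypothetical, and whether a given service reads every one of these variables is not guaranteed.

```shell
# use_proxy FILE (hypothetical helper): point the X509 variables at one proxy.
# With a proxy, CERT and KEY can both name the same file.
use_proxy() {
    if [ ! -r "$1" ]; then
        echo "cannot read proxy file: $1" >&2
        return 1
    fi
    export X509_USER_PROXY="$1"
    export X509_USER_CERT="$1"
    export X509_USER_KEY="$1"
    # keep an existing CERT_DIR, otherwise use the standard location
    export X509_CERT_DIR="${X509_CERT_DIR:-/etc/grid-security/certificates}"
}

# Possible use:
#   use_proxy "/tmp/x509up_u$(id -u)"
```

The function must be sourced into the current shell, not run as a separate script, or the exported variables will not persist.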

There is an equivalent set for the http protocol:

HTTPS_CA_FILE
HTTPS_CERT_FILE
HTTPS_KEY_FILE
HTTPS_CA_DIR

The RFC for proxy certificates is RFC 3820, and the RFC for X.509 certificates is RFC 5280.

Browsers

When you first get your certificate, you want to load it in your browsers. This is usually straightforward following the browser instructions. When you visit secure web sites, such as DocDB or Jenkins, you may have to select the certificate to present to the web site.

When you connect to a secure web site that expects a certificate from you, that site will also present your browser with a certificate of its own. Your browser will then attempt to authenticate the certificate. If it cannot, it will open a dialog box telling you that it does not recognize the site's certificate and asking you if you would like to "add an exception". If you add the exception, then your browser will accept this site even though the browser cannot itself authenticate the certificate. This is usually OK, but not ideal.

The way that your browser authenticates a certificate is that it contacts a recognized, trusted, Certificate Authority (CA). It then forwards the certificate in question to the CA and asks "Can I trust this?". If all is well, the CA replies that you can trust it. If your browser does not know the relevant CA to use, or if it does not trust the CA that the certificate says to use, then your browser will start the "add exception" dialog.

Out of the box, your browser usually has a set of CA's from commercial services, but not the CILogon CA, which we are using currently. You can see what CA's are needed by entering the URL at digicert or running

openssl s_client -connect HOST:PORT

If it uses the CILogon certs, you can usually find them here:

/etc/grid-security/certificates/cilogon-basic.pem
/etc/grid-security/certificates/cilogon-osg.pem

Or you can download the same files from CILogon CA certs.

After following the browser instructions for importing the CA cert to your browser, you usually need to check a box to "trust" these certs.

KCA

KCA refers to an old lab-based Certificate Authority system which is now disabled (2016).

Grid Workflows

The following operations (from Workflows) require certs:

  • samweb write operations (such as mu2eFileDeclare) require a kx509 cert
  • ifdh copies to dCache (including mu2eFileUpload --ifdh) require a voms proxy
  • jobsub grid submission requires a voms proxy
  • reading files via xrootd (filespec like xroot://..) requires a voms proxy

When you land on a grid node, you will have a voms proxy, so all procedures should "just work". But when you perform procedures interactively, you need to manage your certificates.

Two procedures, jobsub and ifdh, will try to make a cert for you. We believe the pattern is:

  • check for a voms proxy pointed to by the X509 environment variables
  • check for a voms proxy in /tmp/x509up_u$UID
  • check for a voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID
  • if a voms proxy is found, stop and use it (issue a warning if the proxy has less than an hour left)
  • if no voms proxy is found, look for an x509 cert in the X509 environment variables, or in /tmp/x509up_u$UID
  • if an x509 cert is found, create a 12 h voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID and use that
  • if no x509 cert is found, look for a one-session (daily, default) kerberos ticket in /tmp/krb5cc_${UID}_*, as pointed to by the KRB5CCNAME environment variable
  • if a kerberos ticket is found, create an x509 cert in /tmp/x509up_u$UID and a voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID, and use the voms proxy
  • if the voms proxy expires, repeat the above
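
To see which file such a search would reach first on your machine, the file locations in the list above can be walked in the same order. This is a simplified sketch and not the actual jobsub or ifdh code: the function name first_credential is hypothetical, and it only tests that a file exists and is non-empty, without checking whether it is a voms proxy or how much time is left.

```shell
# first_credential DIR UID_NUM (hypothetical helper): print the first existing,
# non-empty candidate file, in the order listed above; return 1 if none exists.
first_credential() {
    fc_dir=$1
    fc_uid=$2
    for fc_f in "${X509_USER_PROXY:-}" \
                "$fc_dir/x509up_u$fc_uid" \
                "$fc_dir/x509up_voms_mu2e_Analysis_$fc_uid"; do
        if [ -n "$fc_f" ] && [ -s "$fc_f" ]; then
            printf '%s\n' "$fc_f"
            return 0
        fi
    done
    return 1
}

# Possible use:
#   first_credential /tmp "$(id -u)" || echo "no cert or proxy file found"
```

Printing the winning file and then inspecting it with voms-proxy-info -file is a quick way to find out which credential a command is likely to pick up.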

While this logic in jobsub and ifdh is trying to be helpful, it can fall short. For example, if it finds an x509 cert with 2 h left on it, the voms proxy will also be limited to 2 h. Kerberos session tickets are deleted when the interactive process that created them exits, which prevents ifdh and jobsub from using them to renew the voms proxy. (Certs are not deleted when you log off.) Cert-based authentication is unique to a node and user, but not to a session, which means you might start a script while you have good authentication, then overwrite those certs by an action in a different session on the same machine. These effects can lead to the procedures failing at unexpected times. The only complete solution is to understand, plan, and monitor your authentication.

To set up voms authentication for a long time, so you can run a long mu2eFileUpload, for example, you probably want to refresh your certs first:

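# temporary files for the fresh cert and the new voms proxy
TMPP=$(mktemp)
TMPV=$(mktemp)
FINV=/tmp/x509up_u$UID
# show the kerberos ticket that cigetcert will use
/usr/krb5/bin/klist
# this is the same as kx509, but with more options
/usr/bin/cigetcert -i "Fermi National Accelerator Laboratory" -o "$TMPP"
voms-proxy-info -file "$TMPP"
# make a 120 h voms proxy from the fresh cert
voms-proxy-init -hours 120 -noregen -rfc \
 -voms fermilab:/fermilab/mu2e/Role=Analysis \
 -cert "$TMPP" -out "$TMPV"
# swap the new proxy into the standard location in one step
mv "$TMPV" "$FINV"
rm "$TMPP"
voms-proxy-info -all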

A command has been provided on cvmfs:

/cvmfs/mu2e.opensciencegrid.org/bin/vomsCert


At this point, you should see an x509 cert and voms proxy with 120 h lifetime in /tmp/x509up_u$UID and all procedures should work. You must simply be aware and careful to not overwrite these certs on this machine. The "mv" in the above script makes swapping the new cert in an atomic action, so the cert is never invalid and other scripts can use the cert uninterrupted.
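
Before starting a long procedure, it may help to confirm that the proxy will outlast it. A minimal sketch, assuming the remaining lifetime is supplied in seconds (as printed by voms-proxy-info -timeleft); the function name enough_lifetime and the 24 h figure in the usage line are hypothetical.

```shell
# enough_lifetime SECONDS_LEFT HOURS_NEEDED (hypothetical helper): succeed if
# the remaining proxy lifetime covers the requested number of hours.
enough_lifetime() {
    case $1 in
        ''|*[!0-9]*) echo "not a number of seconds: '$1'" >&2; return 2 ;;
    esac
    [ "$1" -ge $(( $2 * 3600 )) ]
}

# Possible use before a long mu2eFileUpload:
#   enough_lifetime "$(voms-proxy-info -timeleft)" 24 || echo "refresh your certs first"
```

The function returns 2 when the input is not a plain number, which is what happens if voms-proxy-info finds no proxy and prints nothing on standard output.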

Theoretically, this procedure could be put in your login scripts, but we wouldn't recommend that since you won't be so aware of what's happening.

We have also established that you can create a kcron kerberos ticket (see above) and run the authentication procedure off of the kcron ticket in a cron job, thereby establishing perpetual authentication. To do this, you need to create the kcron keytab file (see also above). To do that, just run:

kcroninit

You'll be prompted for your kerberos password, and the keytab will be put in /var/adm. At this point you can see your kcron principal:

kdestroy 
kcron
klist

Run the authentication with a crontab entry:

0 */8 * * * kcron /cvmfs/mu2e.opensciencegrid.org/bin/vomsCert > ~/vomsCert.log 2>&1

You will need to do this procedure once on each machine you want to work on. (kcroninit is permanent until you kcrondestroy.)