Authentication
Latest revision as of 20:42, 4 October 2024
Introduction
You have one Fermilab username for all computing purposes at Fermilab. It was assigned when you applied for your Fermilab ID and computer accounts. This one username is used by several different authentication systems.
The lab provides interactive linux machines for use by Mu2e; see ComputingTutorials#Interactive_logins. You log in to these machines using kerberos authentication. You identify yourself to kerberos using your kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your Fermilab username. Usually you only need to type the xyz part. When your Fermilab accounts were created you set a password for kerberos. You will use your kerberos principal and password to log into all lab-supplied linux computing resources.
The second identity you will need is your Fermilab Single Sign On (SSO) identity. For historical reasons this is sometimes called your services principal, which looks like xyz@services.fnal.gov, where xyz is your Fermilab username. Usually you only need to type the xyz part. When your Fermilab computing accounts were created, you set your SSO password; it should be different from your kerberos password. You use your SSO password to log into the Mu2e web site, the Mu2e DocDB, the Electronic Log Book, the Mu2e internal wiki and to edit the Mu2e public wiki. You also use it to access Fermilab email, the Service Desk web site and some other services hosted by Fermilab.
The third identity you will need is a CILogon certificate. You use this certificate by loading it into your browser, which then gives you access to some Fermilab supported web pages and web services. See #Certificate, below.
The fourth identity that you need is for access to Mu2e Hypernews, which is an archived blog and email list. To learn about Hypernews see Communications and Collaborative Tools#Hypernews.
A small number of Mu2e people may also need a fifth identity. Fermilab maintains a separate kerberos system to grant access to a cluster of interactive Windows machines. The kerberos principal for these machines looks like xyz@FERMI.GOV (FERMI, not FNAL), where xyz is your Fermilab username. If you need an account on these machines you can request one by opening a Service Desk ticket.
There is a standard lab seminar on authentication for users.
For instructions on how to get a Fermilab ID, Fermilab computing accounts and Mu2e computing accounts, see ComputingAccounts.
Kerberos
You log in to the virtual machines with kerberos authentication. You will need a permanent ID called a kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your username. You will have a password associated with your principal. You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.
Your kerberos authentication is stored in a file
/tmp/krb5cc_`id -u`_*
where "*" will be a random string. Each time you ssh into a machine, it will produce a new file in /tmp for that process. The environment variable KRB5CCNAME will point to the ticket file.
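A script that needs the ticket file can read its location from KRB5CCNAME. The following is a minimal sketch, assuming the value may carry the "FILE:" type prefix that MIT kerberos commonly puts in front of the path. The function name ticket_cache_path is hypothetical and is not a lab-provided tool.

```shell
# Hypothetical helper: print the path of the kerberos ticket cache
# that KRB5CCNAME points to. Returns 1 if the variable is unset or empty.
ticket_cache_path() {
    [ -n "${KRB5CCNAME:-}" ] || return 1
    # Strip a leading "FILE:" type prefix if present.
    printf '%s\n' "${KRB5CCNAME#FILE:}"
}
```

For example, <code>ls -l "$(ticket_cache_path)"</code> would list the ticket file for the current session.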
The ticket may be viewed with
klist
If you log into a lab desktop, you would typically use your username (just "xyz" without the "@FNAL.GOV"), and a new ticket is created. You can renew the ticket, or create it if you logged on by some other means, using
kinit
kinit takes an argument which is the user name (or full principal) and asks for your password. Tickets are "forwardable" by default. This means if you are logged into machine A with a ticket, and ssh to machine B, the ticket will also be moved to machine B, so you can ssh (or scp, etc) once you are on B.
Tickets are only valid for 26h and you typically refresh your kerberos authentication every day. This would normally happen as you log in to a desktop. If you have an ssh session from machine A to machine B, and leave the session up, you may renew the ticket on A, but the ticket on B will not automatically be updated. In this case, you can use
k5push <node>
to push your fresh ticket out through all ssh sessions and update the remote tickets.
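Because tickets expire, it can be useful for a script to check for a valid ticket before starting long-running work. The sketch below relies on the <code>-s</code> option of MIT kerberos klist, which prints nothing and reports validity through its exit status. The function name have_ticket and the KLIST override variable are assumptions made for illustration.

```shell
# Hypothetical helper: report whether a usable kerberos ticket exists.
# KLIST can be overridden (an assumption made here for testing); it
# defaults to the klist found in PATH.
have_ticket() {
    if "${KLIST:-klist}" -s 2>/dev/null; then
        echo "ticket valid"
    else
        echo "no valid ticket: run kinit"
        return 1
    fi
}
```

A script could then begin with <code>have_ticket || exit 1</code>.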
There are two utilities in kerberos which we will only note here. One is kcron and kcroninit. These allow you to gain a kerberos ticket in a cron job. The other variant is the keytab file. This file can hold a ticket that is good for a year and can be accessed by anyone with access to the keytab file. This feature is not as secure, so it is usually only issued by the lab when needed, for example, in a group account.
Read more at the FNAL kerberos link (Miscellaneous Kerberos Topics for the User -> Automated Processes). Reset password here.
Services
The second identity you will need is the services principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password). You will need this identity to log into Fermilab email, the servicedesk web site, sharepoint and some other services based at the lab. You would typically only use this authentication at the point you log into the service, and there is no local credential cache.
Certificate
For some interactive purposes on linux, via the command line or browsers, you will need a certificate to prove your identity. Our certificates are based on the CILogon Certification Authority (CA) which is geared to big science. Some of the systems which require cert authentication are jobsub job submission, ifdh data transfer and writing to the SAM database as part of file upload to tape.
You should have received a certificate as part of registering for computing accounts. (See also Fermilab docs on certs for creating or renewing a cert.) The cert can be downloaded as a password-protected pem or p12 file and imported in your browser. Your cert is good for a year and only needs to be updated in your browser once a year.
When your cert is first created, it is also communicated to the lab which can then manage and provide the cert for you. When you access your cert at the linux command line, you usually access it from this cache. To access the cache you need a valid kerberos ticket. When this access occurs, you get a local file called a proxy which is a copy of your cert, but only valid for a finite time.
It is this proxy that commands can use to authenticate you. Note that jobsub and ifdh can automatically create a proxy for you if it is needed (so you only need to remember to kinit), however, samweb does not create proxies automatically, and you will need to do that yourself if you get authentication errors.
Certs may also have "extensions" added by the lab which enable certain access. The one we use is the "VO" or "virtual organization" extension. All experiments in the Intensity Frontier are part of the fermilab VO. This allows access to OSG grid sites, for example.
The easiest way to get a proxy is to run vomsCert which is in your path after you mu2einit:
mu2einit
kinit
vomsCert
The following are some details and alternatives for getting proxies:
kinit
kx509
kx509 uses kerberos authentication to make a proxy, and writes it to a file named with your UID:
kx509
ls -l /tmp/x509up_u`id -u`
-rw------- 1 rlc mu2e 8171 Aug 15 10:07 /tmp/x509up_u1311
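A script can test for the proxy file before calling a tool that needs it. This is only a sketch: the function name proxy_ready is hypothetical, and it checks just that a non-empty file exists at the default location, not that the proxy inside is still valid.

```shell
# Hypothetical helper: succeed if a non-empty proxy file exists.
# With no argument it checks the default location used by kx509.
proxy_ready() {
    proxy_file="${1:-/tmp/x509up_u$(id -u)}"
    [ -s "$proxy_file" ]
}
```

For example, <code>proxy_ready || kx509</code> would create a proxy only when none is present.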
kx509 is equivalent to
setup cigetcert
cigetcert --institution="Fermi National Accelerator Laboratory"
which you might see in some places. If you do not have a kerberos ticket when you run cigetcert, it will prompt you for your services password, and you can use this authentication to access your cert and make a proxy.
You can print your proxy certificate with
voms-proxy-info -all
Your cert works by providing encrypted identity information. This packet will be "signed" by a Certificate Authority (CA). The party, such as a lab service, that wants to check your identity can ask the CA if your packet is valid. CA's may be signed by other CA's in turn up to a nationally-recognized organization, in a "trust chain".
Certificate Error
If you believe you have a valid certificate and you still see errors, for example,
Error creating dataset definition for ... 500 SSL negotiation failed: .
Then try removing the cert and recreating it.
kinit
rm /tmp/x509up_u`id -u`
kx509
The problem is that the proxy created by jobsub or ifdh may require extra steps in authentication because it is "issued" by you. The cert created by kx509 is issued by the CILogon CA, and requires fewer steps to authenticate. While this delete-and-kx509 procedure works fine, the best solution is to provide the intermediate CA in the trust chain to the command (more below).
Types of certs and proxies
A proxy is a copy of your certificate that expires quickly, usually in a few days or hours (see the printouts for "timeleft"). If a bad actor were to gain access to the proxy, they could only use it for the valid time, and they could not replicate it. The proxy is considered safe enough to pass around the grid and over networks. At the command line, we only use proxies.
kx509 creates "end entity certificates", which are temporary copies of your cert. (I've had experts tell me these are technically proxies, but this seems to not be helpful.) Here is what one print looks like:
subject : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
issuer : /DC=org/DC=cilogon/C=US/O=CILogon/CN=CILogon Basic CA 1
identity : /DC=org/DC=cilogon/C=US/O=CILogon/CN=CILogon Basic CA 1
type : unknown
strength : 2048 bits
path : /tmp/x509up_u1311
timeleft : 167:59:40
key usage : Digital Signature, Key Encipherment, Data Encipherment
The term "end-entity credential" just means it is below a CA in the certificate trust chain.
If you print your cert and see "subject" with an appendage like "/CN=2707985426" then this is a proxy. It may also say "proxy" in the "type" field. The kx509 certs are not fully RFC-compliant (the current industry standard) for backwards compatibility, so have type "unknown".
You can generate a proxy from your certificate; the easiest way I know is:
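The subject test described above can be written as a one-line check. This is a rough sketch, assuming the numeric appendage is always the last component of the subject; the function name is_proxy_subject is hypothetical.

```shell
# Hypothetical helper: succeed if a certificate subject carries the
# numeric "/CN=<digits>" appendage that marks a proxy.
is_proxy_subject() {
    printf '%s\n' "$1" | grep -Eq '/CN=[0-9]+$'
}
```

A subject ending in "/CN=UID:rlc" fails the test, while one ending in "/CN=2707985426" passes.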
kx509 --proxyhours=1
which gives
subject : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc/CN=528207979
issuer : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
identity : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
type : RFC compliant proxy
strength : 2048 bits
path : /tmp/x509up_u1311
timeleft : 0:59:55
This is a proxy derived from your end entity cert, so you are the issuer. It has type proxy and has the "/CN=..." and limited time features of a proxy.
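The "timeleft" field in these printouts is given as hours:minutes:seconds. When a script needs to compare the remaining lifetime against a threshold, it helps to convert the field to seconds. The following is a minimal sketch; the function name timeleft_seconds is hypothetical.

```shell
# Hypothetical helper: convert an "H:MM:SS" timeleft value to seconds.
# awk treats each field as a decimal number, so "08" is read as 8.
timeleft_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}
```

With the values shown above, "0:59:55" becomes 3595 and "167:59:40" becomes 604780.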
Under some circumstances, you might end up with a voms proxy. voms is a system to track identities for uses on the grid. When you run a job on the grid, this will be the form of the cert you have on the grid node as you run. These proxies have been "extended" with additional fields that can be considered part of your identity, such as your VO (Virtual Organization, i.e. your experiment) and your role ("Analysis" or "Production").
In our current procedures, you should never need to do this directly, but if you want, you can create a voms proxy:
voms-proxy-init -noregen -rfc -voms fermilab:/fermilab/mu2e/Role=Analysis
This will create a proxy with an extension:
subject : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc/CN=333302448
issuer : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
identity : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
type : RFC compliant proxy
strength : 1024 bits
path : /tmp/x509up_u1311
timeleft : 11:59:57
key usage : Digital Signature, Key Encipherment, Data Encipherment
=== VO fermilab extension information ===
VO : fermilab
subject : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
issuer : /DC=org/DC=opensciencegrid/O=Open Science Grid/OU=Services/CN=voms2.fnal.gov
attribute : /fermilab/mu2e/Role=Analysis/Capability=NULL
attribute : /fermilab/mu2e/Role=NULL/Capability=NULL
attribute : /fermilab/Role=NULL/Capability=NULL
timeleft : 11:59:57
uri : voms2.fnal.gov:15001
The "fermilab" VO is used for all smaller experiments at the lab. In the fermilab VO, there is a "mu2e group". Again, this has all the features of a proxy.
A different location of the proxy can be indicated to most services with some combination of
X509_USER_PROXY=/tmp/x509up_u`id -u`
X509_USER_CERT=/tmp/x509up_u`id -u`
X509_USER_KEY=/tmp/x509up_u`id -u`
X509_CERT_DIR=/etc/grid-security/certificates
With a real cert, you might need to point CERT and KEY to different files, but with a proxy you can point them both to the proxy. The CERT_DIR variable points the command to the intermediate CA certs.
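When the proxy lives somewhere other than the default file, the variables can be set together. The sketch below assumes a single proxy file serves as both CERT and KEY, as described above; the function name use_proxy and the example path are hypothetical.

```shell
# Hypothetical helper: point the X509 variables at one proxy file.
# The optional second argument overrides the CA certificate directory.
use_proxy() {
    X509_USER_PROXY="$1"
    X509_USER_CERT="$1"
    X509_USER_KEY="$1"
    X509_CERT_DIR="${2:-/etc/grid-security/certificates}"
    export X509_USER_PROXY X509_USER_CERT X509_USER_KEY X509_CERT_DIR
}
```

For example, <code>use_proxy /tmp/my_long_proxy</code> would make later commands in the same shell look at that file.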
There is an equivalent set for the http protocol:
HTTPS_CA_FILE
HTTPS_CERT_FILE
HTTPS_KEY_FILE
HTTPS_CA_DIR
The RFC for proxy certificates is here: link and for X.509 certificates here: link.
Browsers
When you first get your certificate, you want to load it in your browsers. This is usually straightforward following the browser instructions. When you visit secure web sites, such as Jenkins, you may have to select the certificate to present to the web site.
When you connect to a secure web site that expects a certificate from you, that site will also present your browser with a certificate of its own. Your browser will then attempt to authenticate the certificate. If it cannot, it will open a dialog box telling you that it does not recognize the site's certificate and asking you if you would like to "add an exception". If you add the exception, then your browser will accept this site even though the browser cannot itself authenticate the certificate. This is usually OK, but not ideal.
The way that your browser authenticates a certificate is that it contacts a recognized, trusted, Certificate Authority (CA). It then forwards the certificate in question to the CA and asks "Can I trust this?". If all is well, the CA replies that you can trust it. If your browser does not know the relevant CA to use, or if it does not trust the CA that the certificate says to use, then your browser will start the "add exception" dialog.
Out of the box, your browser usually has a set of CA's from commercial services, but not the CILogon CA, which we are using currently. You can see what CA's are needed by entering the URL at digicert or running
openssl s_client -connect URL:PORT
If it uses the CILogon certs, you can usually find them here:
/etc/grid-security/certificates/cilogon-basic.pem
/etc/grid-security/certificates/cilogon-osg.pem
Or you can download the same files from CILogon CA certs.
After following the browser instructions for importing the CA cert to your browser, you usually need to check a box to "trust" these certs.
KCA
KCA refers to an old lab-based Certificate Authority system which is now disabled (2016).
Grid Workflows
The following operations (from Workflows) require certs:
- samweb write operations (such as mu2eFileDeclare) require a kx509 cert
- ifdh copies to dCache (including mu2eFileUpload --ifdh) require a voms proxy
- jobsub grid submission requires a voms proxy
- reading files via xrootd (filespec like xroot://..) requires a voms proxy
When you land on a grid node, you will have a voms proxy, so all procedures should "just work". But when you perform procedures interactively, you need to manage your certificates.
Two procedures, jobsub and ifdh, will try to make a cert for you. We believe the pattern is:
- check for a voms proxy pointed to by X509 environmentals
- check for a voms proxy in /tmp/x509up_u$UID
- check for a voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID
- if a voms proxy is found, stop and use it (issue a warning if proxy has less than an hour left)
- if no voms proxy found look for x509 cert: in X509 environmentals, or in /tmp/x509up_u$UID
- if a x509 is found, create a 12 h voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID, use that
- if no x509 proxy is found, look for one-session (daily, default) kerberos ticket in /tmp/krb5cc_${UID}_*, as pointed to by KRB5CCNAME environmental.
- if a kerberos ticket is found, create x509 cert in /tmp/x509up_u$UID, and voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID, use voms proxy
- if the voms proxy expires repeat the above
While this code is trying to be helpful, it can fall short. For example, if it finds a x509 cert with 2h left on it, the voms proxy will also be limited to 2h. Kerberos session tickets will be deleted when the interactive process that created them exits, which will prevent ifdh and jobsub from using them to renew the voms proxy. (Certs are not deleted when you log off.) Cert-based authentication is unique to a node and user, but not to a session, which means you might start a script while you have good authentication, then overwrite those certs by an action in a different session on the same machine. These effects can lead to the procedures failing at unexpected times. The only complete solution is to understand, plan, and monitor your authentication.
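To see which credential file such a search would pick up on your node, the order of the file checks can be imitated in a few lines. This is a simplified sketch, not the actual jobsub or ifdh logic: it only tests for non-empty files in the order listed, does not distinguish voms proxies from plain x509 certs, and creates nothing. The function name find_credential and its optional directory argument are hypothetical.

```shell
# Hypothetical sketch of the file lookup order. It only reports which
# file would be seen first; it does not inspect or create certs.
# The optional second argument replaces /tmp (an assumption, for testing).
find_credential() {
    uid="${1:-$(id -u)}"
    dir="${2:-/tmp}"
    for f in "${X509_USER_PROXY:-}" \
             "$dir/x509up_u$uid" \
             "$dir/x509up_voms_mu2e_Analysis_$uid"; do
        if [ -n "$f" ] && [ -s "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "none: fall back to the kerberos ticket"
    return 1
}
```

Running <code>find_credential</code> before a long job shows which file to inspect with voms-proxy-info.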
To set up voms authentication for a long time, so you can run a long mu2eFileUpload, for example, you probably want to refresh your certs first:
TMPP=$(mktemp)
TMPV=$(mktemp)
FINV=/tmp/x509up_u$UID
/usr/krb5/bin/klist
# this is the same as kx509, but with more options
/usr/bin/cigetcert -i "Fermi National Accelerator Laboratory" -o $TMPP
voms-proxy-info -file $TMPP
voms-proxy-init -hours 120 -noregen -rfc \
  -voms fermilab:/fermilab/mu2e/Role=Analysis \
  -cert $TMPP -out $TMPV
mv $TMPV $FINV
rm $TMPP
voms-proxy-info -all
The same procedure is provided as a command on cvmfs:
/cvmfs/mu2e.opensciencegrid.org/bin/vomsCert
At this point, you should see an x509 cert and voms proxy with 120 h lifetime in /tmp/x509up_u$UID and all procedures should work. You must simply be aware and careful to not overwrite these certs on this machine. The "mv" in the above script makes swapping the new cert in an atomic action, so the cert is never invalid and other scripts can use the cert uninterrupted.
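The swap-by-rename idea behind the "mv" step can be sketched on its own. The helper below is hypothetical (install_atomically is not part of vomsCert). It stages the new file in the destination directory so that the final mv is a rename within one filesystem, which is the condition under which the replacement happens in a single step.

```shell
#!/bin/bash
# Install a freshly made credential over an existing one with a single rename.
# install_atomically is a hypothetical helper, not part of vomsCert.
install_atomically() {
    local new_file="$1"
    local final_file="$2"
    local staged
    # Stage the copy next to the destination so the final mv stays on one
    # filesystem and replaces the target in one step.
    staged="$(mktemp "$(dirname "$final_file")/.staged.XXXXXX")" || return 1
    cp "$new_file" "$staged" || { rm -f "$staged"; return 1; }
    # Credentials should be readable by the owner only.
    chmod 600 "$staged"
    mv -f "$staged" "$final_file"
}
```

A reader of the final file sees either the old contents or the new contents, never a partially written file.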
Theoretically, this procedure could be put in your login scripts, but we wouldn't recommend that since you won't be so aware of what's happening.
We have also established that you can create a kcron kerberos ticket (see above) and run the authentication procedure off of the kcron ticket in a cron job, thereby establishing perpetual authentication. To do this, you need to create the kcron keytab file (see also above). To do that, just run:
kcroninit
You'll be prompted for your kerberos password and the keytab will be put on /var/adm. At this point you can see your kcron principal:
kdestroy
kcron
klist
You can run the authentication with a crontab entry:
0 */8 * * * kcron /cvmfs/mu2e.opensciencegrid.org/bin/vomsCert >& ~/vomsCert.log
but first you must get your kcron kerberos principal associated to your certificate id as explained on the fife pages, which points to this servicedesk item.
You will need to do this procedure once on each machine you want to work on. (kcroninit is permanent until you kcrondestroy.)
Tokens
Tokens are a form of authentication intended to replace x509 certs in the process of authenticating
- grid job submission
- dCache reads and writes (using protocols other than nfs access in /pnfs file system)
- SAM database write access
Tokens are (as of 2/2023) becoming the leading authentication method for Mu2e.
There are several files involved in tokens:
- your permanent token at cilogon.org
- the refresh token, which is valid for 30d and is kept in the lab's central vaultserver database
- /tmp/vt_u$(id -u), the vault token, valid for 7d
- $HOME/.config/htgettoken/credkey-mu2e-default, a pointer into a record in the vaultserver, saved here for faster lookup
- ${XDG_RUNTIME_DIR:-/tmp}/bt_u$(id -u) (looks like /run/user/1311/bt_u1311), the bearer token, usually valid on the scale of hours. This is the one actually used in authentication.
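To see which of the local token files are present on a given node, a small check such as the following can help. This is only a sketch: list_token_files is a hypothetical helper, and its optional argument exists solely to override the /tmp location of the vault token for testing.

```shell
#!/bin/bash
# Print the local token file paths named above and whether each exists.
# list_token_files is a hypothetical helper; the optional argument overrides
# the /tmp location of the vault token so the function can be tried elsewhere.
list_token_files() {
    local tmp_dir="${1:-/tmp}"
    local uid
    uid="$(id -u)"
    local path
    for path in \
        "$tmp_dir/vt_u$uid" \
        "$HOME/.config/htgettoken/credkey-mu2e-default" \
        "${XDG_RUNTIME_DIR:-/tmp}/bt_u$uid"
    do
        if [ -f "$path" ]; then
            echo "present $path"
        else
            echo "missing $path"
        fi
    done
}
```

The permanent token and the refresh token are held remotely, so they do not appear in this list.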
To get a new token, issue the command
htgettoken -i mu2e --vaultserver htvaultprod.fnal.gov
or use the script that is available after mu2einit:
getToken
(In the future the arguments may have useful defaults.) Now there are two cases. If you have not been active in over 30d, or this is the first time getting a token, then none of your token caches are valid except the permanent one at cilogon.org, so you have to go there to start the process. You will see:
Complete the authentication at: https://cilogon.org/device/?user_code=ZCC-9M3-W6K
If the system can access a local web browser, it may open the page for you, but if not, you will have to cut and paste this url into a browser. Once you access this url and enter your lab SSO user/password, the system will recognize your authentication and populate all your token caches.
The second case occurs when it has been less than 30d since the last time you got a token. In this case, your vault token or your refresh token will be valid, and then your kerberos ticket will be used to access those caches and refresh all your tokens.
You can see the token with either of
httokendecode
seeToken
and delete the token with
delToken
The common procedures that need this authentication (job submission using jobsub, data copies using ifdh and SAM write commands using samweb) will attempt to use your kerberos ticket to refresh your tokens (the second case) whenever tokens are needed. The envisioned process is that the refresh is done by the tool when you use the tool, and without you taking any explicit token action (except when you have not been active in 30d and fall back into the first case).
If you need to run a command for many hours, and so need the token renewed repeatedly, you can use httokensh.
The following allows a test of the tokens:
setup ifdhc v2_6_6
export IFDH_PROXY_ENABLE=0
export IFDH_TOKEN_ENABLE=1
export IFDH_DEBUG=10
ifdh cp ...
Mu2epro
The SCD FIFE group provides a service to push a valid vault token to the mu2epro account every 10 min (probably will be set to several hours in the future). The write access is granted by adding
mu2epro/managedtokens/fifeutilgpvm01.fnal.gov@FNAL.GOV
to the k5login. Setting
export HTGETTOKENOPTS="--credkey=mu2epro/managedtokens/fifeutilgpvm01.fnal.gov"
in bashrc allows token refresh commands to find it.
There are actually two vault tokens pushed:
-rw------- 1 mu2epro mu2e 96 Aug 4 08:40 /tmp/vt_u44592
-rw------- 1 mu2epro mu2e 96 Aug 4 08:40 /tmp/vt_u44592-mu2e_production
Either of these can help generate the identical bearer token. Essentially, the two names for one thing are baked into the tool scripts for now. This might be fixed in the future.
To refresh tokens (usually done automatically by the tools)
htgettoken -a fermicloud543.fnal.gov -i mu2e -r production
This is also available after mu2einit as
getToken
Setting
export BEARER_TOKEN_FILE=/tmp/bt_token_mu2e_production_44592
causes the token procedures to always find the "production" one.
SSH Keys
What are SSH keys
SSH has many options for performing authentication, one of which is RSA public key cryptography. If you want, you can read more about that at, for example, https://en.wikipedia.org/wiki/RSA_(cryptosystem) . SSH keys are a pair of files, one of which contains the private key and the other of which contains the public key. You share the public key with people/applications with whom you wish to communicate securely. It is important to keep your private key secure.
One of the ways to authenticate to github is with SSH keys and github has some good information about SSH keys:
- Generating and testing keys: https://help.github.com/en/articles/connecting-to-github-with-ssh
- Troubleshooting problems: https://help.github.com/en/categories/authenticating-to-github
The following contains a shortened version of that information plus some additional details.
Generating a public and private key pair
Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.
The instructions below are good for most unix based systems, including Mac OS. If someone knows the instructions for other operating systems, please add them here.
Instructions for Unix and Mac
The first step is to choose a password that you will provide when prompted. Please use something unique and do not recycle your kerberos or services password. You will rarely need this password so make sure you have a means to remember it.
- cd ~/.ssh
- ssh-keygen
- it will prompt you for a file name; I chose robk_rsa .
- it will then prompt you for a password; use the new one you chose above.
This will create two files in the current directory named robk_rsa and robk_rsa.pub; the former holds your private key and the latter your public key.
Adding a key to your ssh-agent
Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.
The next step is to add your private key to the ssh-agent running on your machine. You can check to see if there are already any registered private keys with the command:
- ssh-add -l
The option is a lower case letter L, not the numeral one.
To add your new key: (n.b. rhb was having trouble with this step after a recent Mac upgrade. Neither the old key, which had been working, nor the new key could be added until rhb tried the unqualified pathname, and then it (ahem) just worked.)
- cd ~/.ssh
- ssh-add ~/.ssh/robk_rsa
- You will be prompted for a password; use the password you gave to ssh-keygen.
RHB has also determined that at least on OSX 10.14.6 every reboot requires this ssh-add step.
If you get the error message "Could not open a connection to your authentication agent", the agent is not running. You can start it with
- eval `ssh-agent`
You can verify that the ssh-agent has your key:
- ssh-add -l
If you do this in one terminal window during an interactive session it will immediately be present in all other terminal windows and all other applications.
When you reboot your computer the key will no longer be in your ssh-agent. To add it you need again to run
- ssh-add ~/.ssh/robk_rsa
so you had better remember that password! You could put ssh-add -l in your .profile so that you see, early in each session, whether you need to add the key again. Once this step is complete you will be able to communicate securely from your laptop/desktop to people/applications with whom you have shared your public key.
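If you do put such a check in your .profile, the exit status of ssh-add -l is more convenient than its text output: it is 0 when the agent holds keys, 1 when the agent has none, and 2 when no agent can be contacted. The sketch below turns that status into a reminder; the function name describe_agent_status is hypothetical.

```shell
#!/bin/bash
# Translate the exit status of "ssh-add -l" into a reminder.
# describe_agent_status is a hypothetical helper name.
describe_agent_status() {
    case "$1" in
        0) echo "ssh-agent is running and holds at least one key" ;;
        1) echo "ssh-agent is running but holds no keys; run ssh-add" ;;
        2) echo 'no ssh-agent reachable; start one with: eval `ssh-agent`' ;;
        *) echo "unexpected ssh-add status: $1" ;;
    esac
}
```

In a .profile it could be used as: ssh-add -l >/dev/null 2>&1; describe_agent_status $?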
To delete a key from ssh-agent:
- ssh-add -d robk_rsa
Forwarding your keys from your laptop/desktop to the Mu2e interactive machines
Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.
The final step is to forward the private key information from the ssh-agent on your laptop/desktop to a Mu2e interactive machine so that you can also have secure communications when logged into one of those machines. Forwarding your private keys via ssh is more secure than placing a copy of your private key on the Mu2e interactive machines.
There are two ways to do this. The first way is to add the -A option to your ssh command line:
ssh -A mu2egpvm01
You will need to add the -A option every time that you ssh into a machine. You can verify that your credentials have been forwarded by logging into, for example, mu2egpvm01 and issuing the following command:
ssh-add -l
In my case I see the forwarded key information plus a warning message that indicates that the older protocol 1 is still active but not in use.
error fetching identities for protocol 1: agent refused operation
This is only a warning message.
The second way is to modify your ~/.ssh/config file so that the -A option is present by default on every ssh command. Make a backup copy of ~/.ssh/config first. The structure of this file is a series of blocks beginning with "Host pattern", followed by lines with option-value pairs. When you ssh into a machine, ssh reads the config file from top to bottom and matches the name of your target machine against the patterns given in the lines beginning with "Host". If the name matches multiple patterns, every matching block is read in order and, for each option, the first value found wins. For this reason the more specific Host blocks must come first, and the final Host block should begin with "Host *" to match all machine names that have not already matched. The options given in the matched Host blocks will be applied to your ssh session. The option that controls forwarding of ssh-agent credentials is "ForwardAgent". The spirit of the following instructions is that you should only forward these credentials to machines that need them and that you trust; forwarding them to all machines risks theft of the credentials.
In the block beginning:
Host *.fnal.gov
add the line:
ForwardAgent yes
ssh matches Host patterns against the machine name exactly as you type it on the command line. So if you normally type ssh mu2egpvm01, not mu2egpvm01.fnal.gov, it will not match the *.fnal.gov pattern. To rectify this you can copy the "Host *.fnal.gov" block to a new block named "Host mu2e*" or simply use the "Host mu2e*" block only, if that matches your needs. The final block of the file should begin:
Host *
which specifies options for all machines not matched by other patterns. It should contain:
ForwardAgent no
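Combining the Host blocks described above, the relevant part of a config file might look like the sketch below. It is written to a scratch file rather than to your real ~/.ssh/config, and the variable name sample_config is hypothetical. Listing both patterns on one Host line is a variation on copying the block; only the ForwardAgent settings come from the text above.

```shell
#!/bin/bash
# Write a sample ssh config to a scratch file (not to ~/.ssh/config).
sample_config="$(mktemp)"
cat > "$sample_config" <<'EOF'
# Forward the agent only to Fermilab machines, by full or short name.
Host *.fnal.gov mu2e*
    ForwardAgent yes

# Everything else: never forward.
Host *
    ForwardAgent no
EOF
echo "wrote $sample_config"
```

If your OpenSSH client supports the -G option, ssh -G -F "$sample_config" mu2egpvm01 | grep -i forwardagent prints the setting that would apply to that host without opening a connection.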
Authenticating to github
GitHub has recently announced that password based authentication is deprecated and will soon be disabled. Several types of token based authentication, including ssh keys, will continue to be supported. This section describes how to use ssh keys for authentication.
GitHub has good instructions for registering ssh keys with GitHub. The instructions below add a few details:
- Follow the instructions in the SSH Keys section above to enable ssh keys on your laptop/desktop and forward them to the Mu2e interactive machines.
- Log in to your github account via the web interface: https://github.com
- In the upper right corner, click on your picture or avatar icon to get a pop-up menu
- From the menu select "Settings"
- In the left hand task bar select "SSH and GPG Keys"
- In the upper right of the main pane there is a green button "New SSH Key". Click it.
- Paste the content of your public key file into the place provided. Fill in the title field with a unique name.
- Click on the green "Add SSH Key" below the field for the content of your key file.
- The system will prompt you for the password to your *github* account, not the password you used to encrypt the key.
On your laptop or desktop you can test that everything is working by cloning a repository:
git clone git@github.com:username/Offline
where "username" is the GitHub name of the fork that you are cloning; the official Mu2e repository has the username "mu2e".
Also test that your ssh keys have been forwarded to the mu2e interactive machines as follows:
ssh mu2egpvm01
mu2einit
git clone git@github.com:username/Offline
Jupyter
Jupyter is a web-based interactive development environment that lets you work with notebooks containing code, text, and plots. The most convenient way to use Jupyter and access Mu2e files is to run a session on a Mu2e gpvm and create an SSH tunnel to your local machine. The recommended way to install Jupyter is through conda, a package management system which manages self-contained and independent Python environments.
Install Miniconda
Starting on Oct 16, 2024, Fermilab will block the Anaconda and Miniconda repositories when accessed from onsite. The recommended option is to use Miniforge. For more details see Pyana#Anaconda.
Fixme: The instructions below need to be edited to point people at Miniforge.
The first thing we do is to download and run Miniconda, a minimal installer of conda. So we log into a Mu2e gpvm (e.g. mu2egpvm01) and run:
cd
mu2einit
setup python v3_8_3b
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
echo ". ~/miniconda3/etc/profile.d/conda.sh" >> ~/.bash_profile
Install Jupyter
Now, we log out and log in again to create a conda environment containing Jupyter:
conda create -n jupyter_env
conda activate jupyter_env
conda install -c conda-forge jupyterlab
Create an SSH tunnel
Now it is possible to run a Jupyter session (in this case JupyterLab):
jupyter-lab --no-browser
A token, needed to access the session from another browser, will appear. Take note of it and open an SSH tunnel on your local machine:
ssh -L localhost:8888:localhost:8888 user@mu2egpvm01.fnal.gov
Now, open your browser at http://localhost:8888 on your local machine and enter the token. The Jupyter session will appear and you should be able to access the files on the gpvm.
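If port 8888 is already in use on the gpvm, Jupyter starts on a different port and prints that port in the URL it shows at startup, so the tunnel has to use the same number. The sketch below only builds the matching ssh command; tunnel_command is a hypothetical helper and the user and host values are placeholders.

```shell
#!/bin/bash
# Build the ssh tunnel command for a given remote Jupyter port.
# tunnel_command is a hypothetical helper; it only prints the command.
tunnel_command() {
    local user="$1"
    local host="$2"
    local port="${3:-8888}"
    echo "ssh -L localhost:${port}:localhost:${port} ${user}@${host}"
}
```

For example, tunnel_command user mu2egpvm01.fnal.gov 8889 prints the command to use when Jupyter reports port 8889; the browser URL then becomes http://localhost:8889.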
If you terminate the session on the gpvm you will now only need to load the conda environment and create a new Jupyter session:
conda activate jupyter_env
jupyter-lab --no-browser