Authentication: Difference between revisions

From Mu2eWiki
 
(85 intermediate revisions by 5 users not shown)
Line 1: Line 1:


{{Draft}}


==Introduction==
==Introduction==
The lab services and utilities used by mu2e require gaining and using several kinds of authentication.  You will have one username for all computing purposes lab.
You have one Fermilab username for all computing purposes at Fermilab.  It was assigned when you applied for your Fermilab ID and computer accounts.  This one username is used by several different authentication systems.


You login to the virtual machines with '''kerberos''' authentication.  You will need a permanent ID called a kerberos "principal" which is looks like "xyz@FNAL.GOV", where xyz is your username.   You will have a password associated with your principal.  You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.  
The lab provides interactive linux machines for use by Mu2e;  see [[ComputingTutorials#Interactive_logins]].  You log in to these machines using '''kerberos''' authentication.  You identify yourself to kerberos using your kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your Fermilab username. Usually you only need to type the xyz part.  When your Fermilab accounts were created you set a password for kerberos.  You will use your kerberos principal and password to log into all lab-supplied linux computing resources.


The second identity you will need is the '''services''' principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password).  You will need this identity to log into Fermilab email, the servicedesk web site and some other services based at the lab.  You would typically only use this authentication at the point you log into the service.
The second identity you will need is your Fermilab Single Sign On ('''SSO''') identity.  For historical reasons this is sometimes called your '''services''' principal, which looks like xyz@services.fnal.gov, where xyz is your Fermilab username. Usually you only need to type the xyz part. When your Fermilab computing accounts were created, you set your SSO password; it should be different from your kerberos password.  You use your SSO password to log into the [https://mu2e.fnal.gov/atwork/ Mu2e web site], [https://mu2e-docdb.fnal.gov/cgi-bin/sso/DocumentDatabase/ the  Mu2e DocDB],  [https://dbweb9.fnal.gov:8443/ECL/mu2e/U/login?message=Login%20required&ret_url=/ECL/mu2e/E/index the Electronic Log Book], the [https://mu2einternalwiki.fnal.gov/wiki/Main_Page Mu2e internal wiki] and to edit [https://mu2ewiki.fnal.gov/wiki/Main_Page the Mu2e public wiki].  You also use it to access Fermilab email, the Service Desk web site and some other services hosted by Fermilab.  


The third identity you will need is a '''CILogin certificate'''.  This "cert" is the basis of authentication to the mu2e documents database, the computing farms, and a few other services.  You will use this cert in two ways.  The first way is to load it into your browser, which then gives you access to web pages and web services. The second is by using your kerberos authentication to access a copy of your certificate maintained in a remote database.  You get this certificate once and then renew it only once a year.  
The third identity you will need is a '''CILogon certificate'''.  You use this certificate by loading it into your browser, which then gives you access to some Fermilab-supported web pages and web services. See [[#Certificate]], below.


[https://mu2e-hnews.fnal.gov/HyperNews/Mu2e/top.pl hypernews] is an archived blog and email list - for access here, you will need a hypernews password and your services password!
The fourth identity that you need is for access to [https://mu2e-hnews.fnal.gov/HyperNews/Mu2e/top.pl Mu2e Hypernews], which is an archived blog and email list. To learn about Hypernews see [[Communications and Collaborative Tools#Hypernews]].


Finally, the mu2e [http://mu2e.fnal.gov/atwork/ internal web pages] require a collaboration username and password, please ask your mu2e mentor.
A small number of Mu2e people may also need a fifth identity. Fermilab maintains a separate kerberos system to grant access to a cluster of interactive Windows machines.  The kerberos principal for these machines looks like xyz@FERMI.GOV (FERMI, not FNAL), where xyz is your Fermilab username.  If you need an account on these machines you can request one by opening a [https://servicedesk.fnal.gov/ Service Desk] ticket.


If you need to do any computing in mu2e, please go ahead and start the procedure on the [[ComputingAccounts]] to create your accounts and authentication.
There is a standard [http://cd-docdb.fnal.gov/cgi-bin/RetrieveFile?docid=5884 lab seminar] on authentication for users.


For instructions on how to get a Fermilab ID, Fermilab computing accounts and Mu2e computing accounts, see [[ComputingAccounts]].


==Kerberos==
==Kerberos==
You login to the virtual machines with '''kerberos''' authentication.  You will need a permanent ID called a kerberos "principal" which is looks like "xyz@FNAL.GOV", where xyz is your username.  You will have a password associated with your principal.  You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.   
You log in to the virtual machines with [https://fermi.servicenowservices.com/kb_view.do?sys_kb_id=628123de1be868107319ea41f54bcba0&sysparm_rank=1&sysparm_tsqueryId=52fedcbe1bf86050ced962cfe54bcbef '''kerberos'''] authentication.  You will need a permanent ID called a kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your username.  You will have a password associated with your principal.  You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.   


Your kerberos authentication is stored in a file  
Your kerberos authentication is stored in a file  
Line 38: Line 38:
There are two utilities in kerberos which we will only note here.  One is the pair <code>kcron</code> and <code>kcroninit</code>, which allow you to obtain a kerberos ticket in a cron job.  The other is the <code>keytab</code> file.  This file can hold a ticket that is good for a year and can be used by anyone with access to the keytab file.  This feature is less secure, so it is usually only issued by the lab when needed, for example, in a group account.
There are two utilities in kerberos which we will only note here.  One is the pair <code>kcron</code> and <code>kcroninit</code>, which allow you to obtain a kerberos ticket in a cron job.  The other is the <code>keytab</code> file.  This file can hold a ticket that is good for a year and can be used by anyone with access to the keytab file.  This feature is less secure, so it is usually only issued by the lab when needed, for example, in a group account.


Read more at the FNAL [https://fermi.service-now.com/nav_to.do?uri=%2Fkb_view_customer.do%3Fsysparm_article%3DKB0011308 kerberos link].
Read more at the FNAL [https://fermi.servicenowservices.com/kb_view.do?sysparm_article=KB0011308 kerberos link] (Miscellaneous Kerberos Topics for the User -> Automated Processes).
Reset password [https://fermi.service-now.com/nav_to.do?uri=%2Fkb_view_customer.do%3Fsysparm_article%3DKB0010628 |here].
Reset password [https://fermi.servicenowservices.com/kb_view.do?sysparm_article=KB0010628 here].


==Services==
==Services==
The second identity you will need is the '''services''' principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password).  You will need this identity to log into [http://email.fnal.gov Fermilab email], the [https://fermi.service-now.com servicedesk] web site, sharepoint and some other services based at the lab.  You would typically only use this authentication at the point you log into the service and their is no local credential cache.
The second identity you will need is the '''services''' principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password).  You will need this identity to log into [http://email.fnal.gov Fermilab email], the [https://fermi.servicenowservices.com servicedesk] web site, sharepoint and some other services based at the lab.  You typically use this authentication only at the point you log into the service, and there is no local credential cache.


==Certificate==
==Certificate==
Line 48: Line 48:
Some of the systems which require cert authentication are [[Grids|jobsub]] job submission, [[DataTransfer|ifdh]] data transfer and writing to the [[SAM|SAM]] database as part of [[Upload|file upload]] to tape.
Some of the systems which require cert authentication are [[Grids|jobsub]] job submission, [[DataTransfer|ifdh]] data transfer and writing to the [[SAM|SAM]] database as part of [[Upload|file upload]] to tape.


You should have received a certificate as part of [[ComputingAccounts|registering]] for computing accounts.  (See also [https://fermi.service-now.com/kb_view.do?sysparm_article=KB0010773 Fermilab docs on certs].) The cert can be downloaded as a password-protected '''pem''' or '''p12''' file and installed in your browser.  Your cert is good for a year and only needs to be updated in your browser once a year.   
You should have received a certificate as part of [[ComputingAccounts|registering]] for computing accounts.  (See also [https://fermi.servicenowservices.com/kb_view.do?sysparm_article=KB0010773 Fermilab docs on certs] for creating or renewing a cert.) The cert can be downloaded as a password-protected '''pem''' or '''p12''' file and imported into your browser.  Your cert is good for a year and only needs to be updated in your browser once a year.   


When your cert is first created, it is also communicated to the lab which can then manage and provide the cert for you. When you access your cert at the linux command line, you usually access it from this cache.  
When your cert is first created, it is also communicated to the lab which can then manage and provide the cert for you. When you access your cert at the linux command line, you usually access it from this cache.  To access the cache you need a valid kerberos ticket. When this access occurs, you get a local file called a '''proxy''' which is a copy of your cert, but only valid for a finite time.


It is this proxy that commands can use to authenticate you.  Note that '''jobsub''' and '''ifdh''' can automatically create a proxy for you if it is needed (so you only need to remember to kinit), however, [[SAM|samweb]] does not create proxies automatically, and you will need to do that yourself if you get authentication errors.
Certs may also have "extensions" added by the lab which enable certain access.  The one we use is the "VO" or "virtual organization" extension.  All experiments in the Intensity Frontier are part of the fermilab VO.  This allows access to OSG grid sites, for example.
The easiest way to get a proxy is to run '''vomsCert''', which is in your path after you run <code>mu2einit</code>:
 mu2einit
 kinit
 vomsCert
The following are some details and alternatives for getting proxies:
<pre>
<pre>
kinit  
kinit  
Line 57: Line 68:
</pre>
</pre>


'''kinit''' makes sure your kerberos identity is valid, and '''kx509''' uses that authentication to make a local, temporary copy of your cert, called a '''proxy''', and writes it to a file named with your UID:
'''kx509''' uses kerberos authentication to make a proxy, and writes it to a file named with your UID:
<pre>
<pre>
kx509
kx509
Line 64: Line 75:
</pre>
</pre>


It is this proxy, in this standard location, that commands can use to authenticate you.  Note that '''jobsub''' and '''ifdh''' can automatically run kx509 for you if it is needed (so you only need to remember to kinit), however, [[SAM|samweb]] does not run kx509 automatically, and you will need to run it yourself if you get authentication errors.


kx509 is equivalent to  
kx509 is equivalent to  
  setup cigetcert
  setup cigetcert
  cigetcert --institution="Fermi National Accelerator Laboratory"
  cigetcert --institution="Fermi National Accelerator Laboratory"
which you might see some places.  If you do not have a kerberos ticket when you run cigetcert, it will prompt you for your service password, and use that authentication to access your cert cache and make a copy.  
which you might see in some places.  If you do not have a kerberos ticket when you run cigetcert, it will prompt you for your services password, and you can use this authentication to access your cert and make a proxy.  


You can print your certificate with  
 
You can print your proxy certificate with  
<pre>
<pre>
voms-proxy-info -all
voms-proxy-info -all
</pre>
</pre>
Your cert works by providing encrypted identity information.  This packet will be "signed" by a Certificate Authority (CA). The party, such as a lab service, that wants to check your identity can ask the CA if your packet is valid.  CAs may be signed by other CAs in turn, up to a nationally-recognized organization, in a "trust chain".


===Certificate Error===
===Certificate Error===
Line 90: Line 102:
</pre>
</pre>


The problem is that the proxy created by jobsub may require extra steps in authentication because it is "self-signed" by you.  The cert created by kx509 is signed by the CILogon CA, and requires fewer steps to authenticate.
The problem is that the proxy created by jobsub or ifdh may require extra steps in authentication because it is "issued" by you.  The cert created by kx509 is issued by the CILogon CA, and requires fewer steps to authenticate.  While this delete-and-kx509 procedure works fine, the best solution is to provide the intermediate CA in the trust chain to the command (more below).




===Proxies===
===Types of certs and proxies===
A proxy is a copy of your certificate except that it expires quickly, usually in 12 or 24 hours.  If a bad actor were to gain access to the proxy, they could only use it for the valid time, and they could not replicate it.  The proxy is considered safer to pass around the grid and over networks.  At the command line, we only use proxies.


A proxy is a copy of your certificate that expires quickly, usually in a few days or hours (see the printouts for "timeleft").  If a bad actor were to gain access to the proxy, they could only use it for the valid time, and they could not replicate it.  The proxy is considered safe enough to pass around the grid and over networks.  At the command line, we only use proxies.


Proxies created by kx509 are "plain" proxies, simple temporary copies of your cert.  Here is what one print looks like:
 
kx509 creates "end entity certificates", which are temporary copies of your cert.  (I've had experts tell me these are technically proxies, but this seems to not be helpful.) Here is what a printout of one looks like:
<pre style="font-size:90%">
<pre style="font-size:90%">
subject  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
subject  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
Line 109: Line 122:
</pre>
</pre>


If you print your cert and see "subject' with an appendage like "/CN=2707985426" then this is a proxy, probably created with <code>voms-proxy-init</code>.  It may also say "proxy" in the "type" field when printed. Certs created by kx509 are actually proxies, even though the word proxy and the "CN" tag doesn't appear. These are not fully RFC-compliant, which is the general, current standard, due to backwards compatibility, so have type "unknown".  All our proxies may be called an "end-entity credential" which just means it is below a CA in the certificate chain.
The term "end-entity credential" just means it is below a CA in the certificate trust chain.
 
If you print your cert and see "subject" with an appendage like "/CN=2707985426" then this is a proxy.  It may also say "proxy" in the "type" field. For backwards compatibility, the kx509 certs are not fully RFC-compliant (RFC compliance being the current industry standard), so they have type "unknown".   
 
You can generate a proxy from your certificate; the easiest way I know is:
 kx509 --proxyhours=1
which gives
<pre>
subject  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc/CN=528207979
issuer    : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
identity  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
type      : RFC compliant proxy
strength  : 2048 bits
path      : /tmp/x509up_u1311
timeleft  : 0:59:55
</pre>
This is a proxy derived from your end entity cert, so you are the issuer.  It has type proxy and has the "/CN=..." and limited time features of a proxy.
 


The proxies created by jobsub and ifdh are '''voms''' proxies.  voms is a system to track identities for uses on the grid.  They have been "extended" with additional fields that can be considered part of your identity, such as your '''VO''' (Virtual Organization, ''i.e.'' your experiment) and your role ("Analysis" or "Production").
Under some circumstances, you might end up with a '''voms''' proxy.  voms is a system to track identities for uses on the grid.  When you run a job on the grid, this will be the form of the cert you have on the grid node as you run. These proxies have been "extended" with additional fields that can be considered part of your identity, such as your '''VO''' (Virtual Organization, ''i.e.'' your experiment) and your role ("Analysis" or "Production").


In our current procedures, uou should never need to do this directly, but if you want, you can create a voms proxy:
In our current procedures, you should never need to do this directly, but if you want, you can create a voms proxy:
   voms-proxy-init -noregen -rfc -voms fermilab:/fermilab/mu2e/Role=Analysis
   voms-proxy-init -noregen -rfc -voms fermilab:/fermilab/mu2e/Role=Analysis
This will create a proxy with an extension:
This will create a proxy with an extension:
Line 136: Line 166:
</pre>
</pre>
The "fermilab" VO is used for all smaller experiments at the lab.  In the fermilab VO, there is a "mu2e group".
The "fermilab" VO is used for all smaller experiments at the lab.  In the fermilab VO, there is a "mu2e group".
Again, this has all the features of a proxy.


A different location of the proxy can be indicated to most services with some combination of
A different location of the proxy can be indicated to most services with some combination of
Line 144: Line 175:
X509_CERT_DIR=/etc/grid-security/certificates
X509_CERT_DIR=/etc/grid-security/certificates
</pre>
</pre>
With a real cert, you might need to point CERT and KEY to different files, but with a proxy, you can point them all to the proxy.
With a real cert, you might need to point CERT and KEY to different files, but with a proxy you can point them both to the proxy.
The CERT_DIR variable points the command to the intermediate CA certs.
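As a minimal sketch of the point above, the following shell function points the proxy, cert and key variables at one proxy file. The names <code>X509_USER_PROXY</code>, <code>X509_USER_CERT</code> and <code>X509_USER_KEY</code> are assumed here to be the variables meant by "CERT and KEY"; only <code>X509_CERT_DIR</code> is quoted from the text. The function name <code>use_proxy_file</code> is hypothetical.

```shell
# Sketch only: point the X509 variables at a single proxy file.
# X509_USER_PROXY, X509_USER_CERT and X509_USER_KEY are assumed names;
# only X509_CERT_DIR appears in the text above.
use_proxy_file() {
    # Default to the standard per-user location in /tmp.
    local proxy="${1:-/tmp/x509up_u$(id -u)}"
    export X509_USER_PROXY="$proxy"
    # A proxy file holds both the certificate and the key,
    # so both variables can point at the same file.
    export X509_USER_CERT="$proxy"
    export X509_USER_KEY="$proxy"
    # Directory with the intermediate CA certs; keep any value already set.
    export X509_CERT_DIR="${X509_CERT_DIR:-/etc/grid-security/certificates}"
}

use_proxy_file
echo "proxy file: $X509_USER_PROXY"
```

Calling <code>use_proxy_file /some/other/file</code> would select a proxy kept somewhere other than the default location.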


There is an equivalent set for http protocol:
There is an equivalent set for the http protocol:
<pre>
<pre>
HTTPS_CA_FILE
HTTPS_CA_FILE
Line 154: Line 186:
</pre>
</pre>


 
The RFC for proxy certificates is here: [https://www.ietf.org/rfc/rfc3820.txt link] and for X.509 certificates here: [https://tools.ietf.org/html/rfc5280#section-4.1.2.5 link].


===Browsers===
===Browsers===
When you first get your certificate, you want to load it in your browsers.  This is usually straightforward following the browser instructions.  When you visit secure web sites, such as [https://mu2e-docdb.fnal.gov:440/cgi-bin/DocumentDatabase DocDB]] or [[Jenkins]], you may have to select the certificate to present to the web site.
When you first get your certificate, you want to load it in your browsers.  This is usually straightforward following the browser instructions.  When you visit secure web sites, such as [[Jenkins]], you may have to select the certificate to present to the web site.


When you connect to a secure web site that expects a certificate from you, that
When you connect to a secure web site that expects a certificate from you, that
Line 191: Line 223:
KCA refers to an old lab-based Certificate Authority system which is now disabled (2016).
KCA refers to an old lab-based Certificate Authority system which is now disabled (2016).


===Grid Workflows===
The following operations (from [[Workflows]]) require certs:
* samweb write operations (such as mu2eFileDeclare) require a kx509 cert
* ifdh copies to dCache (including mu2eFileUpload --ifdh) require a voms proxy
* jobsub grid submission requires a voms proxy
* reading files via xrootd (filespec like xroot://..) requires a voms proxy
When you land on a grid node, you will have a voms proxy, so all procedures should "just work".  But when you perform a procedure interactively, you need to manage your certificates.
Two procedures, jobsub and ifdh, will try to make a cert for you.  We believe the pattern is:
* check for a voms proxy pointed to by the X509 environment variables
* check for a voms proxy in /tmp/x509up_u$UID
* check for a voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID
* if a voms proxy is found, stop and use it (issue a warning if the proxy has less than an hour left)
* if no voms proxy is found, look for an x509 cert in the X509 environment variables or in /tmp/x509up_u$UID
* if an x509 cert is found, create a 12 h voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID and use that
* if no x509 cert is found, look for a one-session (daily, default) kerberos ticket in /tmp/krb5cc_${UID}_*, as pointed to by the KRB5CCNAME environment variable
* if a kerberos ticket is found, create an x509 cert in /tmp/x509up_u$UID and a voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID, and use the voms proxy
* if the voms proxy expires, repeat the above
While this code is trying to be helpful, it can fall short.  For example, if it finds an x509 cert with 2 h left on it, the voms proxy will also be limited to 2 h.  Kerberos session tickets are deleted when the interactive process that created them exits, which prevents ifdh and jobsub from using them to renew the voms proxy. (Certs are not deleted when you log off.) Cert-based authentication is unique to a node and user, but not to a session, which means you might start a script while you have good authentication, then overwrite those certs by an action in a different session on the same machine.  These effects can lead to the procedures failing at unexpected times.  The only complete solution is to understand, plan, and monitor your authentication.
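To see which file the search order listed above would reach first on your machine, a rough shell sketch may help. It is an illustration only, not the code used by jobsub or ifdh: it checks whether a file exists at each location and does not distinguish a voms proxy from a plain x509 cert. The function name <code>first_credential_file</code>, its two arguments, and the use of <code>X509_USER_PROXY</code> as the environment variable are assumptions made for the example.

```shell
# Sketch only: print the first credential file that exists, in the order
# described above. This does NOT inspect the file contents.
#   $1 = directory to search (default /tmp)        -- hypothetical argument
#   $2 = numeric user id (default: current user)   -- hypothetical argument
first_credential_file() {
    local dir="${1:-/tmp}"
    local uid="${2:-$(id -u)}"
    local f
    for f in "${X509_USER_PROXY:-}" \
             "$dir/x509up_u$uid" \
             "$dir/x509up_voms_mu2e_Analysis_$uid"; do
        if [ -n "$f" ] && [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    # Nothing found: at this point the tools would fall back to kerberos.
    return 1
}

first_credential_file || echo "no x509 cert or proxy file found"
```

Running <code>voms-proxy-info -all</code> on the file that is printed shows whether it is a voms proxy and how much time is left.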
To set up voms authentication for a long time, so you can run a long mu2eFileUpload, for example, you probably want to refresh your certs first:
 TMPP=$(mktemp)
 TMPV=$(mktemp)
 FINV=/tmp/x509up_u$UID
 # show the current kerberos ticket
 /usr/krb5/bin/klist
 # this is the same as kx509, but with more options
 /usr/bin/cigetcert -i "Fermi National Accelerator Laboratory" -o "$TMPP"
 voms-proxy-info -file "$TMPP"
 voms-proxy-init -hours 120 -noregen -rfc \
   -voms fermilab:/fermilab/mu2e/Role=Analysis \
   -cert "$TMPP" -out "$TMPV"
 # swap the new proxy into the standard location in one step
 mv "$TMPV" "$FINV"
 rm "$TMPP"
 voms-proxy-info -all
A command that does this has been provided on cvmfs:
 /cvmfs/mu2e.opensciencegrid.org/bin/vomsCert
At this point, you should see an x509 cert and voms proxy with 120 h lifetime in <code>/tmp/x509up_u$UID</code> and all procedures should work. You must simply be aware and careful not to overwrite these certs on this machine.  The "mv" in the above script makes swapping in the new cert an atomic action, so the cert is never invalid and other scripts can use the cert uninterrupted. 
Theoretically, this procedure could be put in your login scripts, but we wouldn't recommend that since you won't be so aware of what's happening. 
We have also established that you can create a kcron kerberos ticket (see above) and run the authentication procedure off of the kcron ticket in a cron job, thereby establishing perpetual authentication.  To do this, you need to create the kcron keytab file (see also above).  To do that, just run:
 kcroninit
You will be prompted for your kerberos password and the keytab will be put under <code>/var/adm</code>. At this point you can see your kcron principal:
 kdestroy
 kcron klist
You can run the authentication with a crontab entry:
 0 */8 * * * kcron /cvmfs/mu2e.opensciencegrid.org/bin/vomsCert > ~/vomsCert.log 2>&1
but first you must get your kcron kerberos principal associated with your certificate id as explained on the [https://cdcvs.fnal.gov/redmine/projects/fife/wiki/Authentication#Authentication-with-kcron-for-SL7 fife pages], which points to this [https://fermi.servicenowservices.com/wp?id=evg_sc_cat_item&sys_id=11cd3721db8adf4096f5ff621f9619fc&spa=1 servicedesk item].
You will need to do this procedure once on each machine you want to work on. (kcroninit is permanent until you kcrondestroy.)
==Tokens==
Tokens are a form of authentication intended to replace x509 certs in the process of authenticating
* grid job submission
* dCache reads and writes (using protocols other than nfs access in /pnfs file system)
* SAM database write access
As of 2/2023, tokens are becoming the leading authentication method for Mu2e. There are docs:
* [https://indico.fnal.gov/event/57514/ computing division tutorial]
* [https://fifewiki.fnal.gov/wiki/Getting_started_with_jobsub_lite FIFE Wiki]
* [https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=44379 Mu2e notes]
* [https://landscape.fnal.gov/monitor/d/56WoQ2cVk/token-scopes-capability-sets?orgId=1&var-experiment=mu2e&var-role=All roles]
There are several files involved in tokens:
# your permanent token at <code>cilogon.org</code>
# the '''refresh token''' which is valid for 30d and is kept in the lab's central vaultserver database
# <code>/tmp/vt_u$(id -u)</code> the '''vault token''', valid for 7d
# <code>$HOME/.config/htgettoken/credkey-mu2e-default</code> a pointer into a record in the vaultserver, saved here for faster lookup
# <code>${XDG_RUNTIME_DIR:-/tmp}/bt_u$(id -u)</code> (looks like <code>/run/user/1311/bt_u1311</code>), the '''bearer token''', usually valid for a time on the scale of hours. This is the one actually used in authentication.
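As a quick check of the local token files in the list above, a small shell sketch can print where the vault token and bearer token are expected and whether they exist. The paths are taken from the list; the function names are hypothetical, and letting <code>BEARER_TOKEN_FILE</code> override the default bearer token location is an assumption based on how that variable is used elsewhere on this page.

```shell
# Sketch only: expected locations of the local token files.
vault_token_path() {
    echo "/tmp/vt_u$(id -u)"
}

bearer_token_path() {
    # Assumption: BEARER_TOKEN_FILE, if set, overrides the default location.
    echo "${BEARER_TOKEN_FILE:-${XDG_RUNTIME_DIR:-/tmp}/bt_u$(id -u)}"
}

for f in "$(vault_token_path)" "$(bearer_token_path)"; do
    if [ -f "$f" ]; then
        echo "present: $f"
    else
        echo "missing: $f"
    fi
done
```

A missing file is not necessarily a problem, since the tools try to refresh tokens when they are needed.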
To get a new token, issue the command:
 htgettoken -i mu2e --vaultserver htvaultprod.fnal.gov
or use the script that is available after <code>mu2einit</code>:
 getToken
(In the future the arguments may have useful defaults.) Now there are two cases. If you have not been active in over 30d, or this is the first time getting a token, then none of your token caches are valid except the permanent one at cilogon.org, so you have to go there to start the process.  You will see:
Complete the authentication at:
    https://cilogon.org/device/?user_code=ZCC-9M3-W6K
If the system can access a local web browser, it may open the page for you, but if not, you will have to cut and paste this URL into a browser. Once you access this URL and enter your lab SSO username and password, the system will recognize your authentication and populate all your token caches. 
The second case occurs when it has been less than 30d since the last time you got a token. In this case, your vault token or your refresh token will be valid, and then your kerberos ticket will be used to access those caches and refresh all your tokens.
You can see the token with
httokendecode
  or
seeToken
and delete the token with
  delToken
The common procedures that need this authentication (job submission using <code>jobsub</code>, data copies using <code>ifdh</code> and SAM write commands using <code>samweb</code>) will attempt to use your kerberos ticket to refresh your tokens (the second case) whenever tokens are needed.  The envisioned process is that the refresh is done by the tool when you use the tool, and without you taking any explicit token action (except when you have not been active in 30d and fall back into the first case).
If you need to run a command for many hours, and so need to have a token repeatedly renewed, you can use <code>httokensh</code>.
The following allows a test of the tokens:
setup ifdhc v2_6_6
export IFDH_PROXY_ENABLE=0
export IFDH_TOKEN_ENABLE=1
export IFDH_DEBUG=10
ifdh cp ...
===Mu2epro===
The SCD FIFE group provides a service to push a valid vault token to the mu2epro account every 10 min (probably will be set to several hours in the future). The write access is granted by adding
mu2epro/managedtokens/fifeutilgpvm01.fnal.gov@FNAL.GOV
to the <code>k5login</code>.  Setting
export HTGETTOKENOPTS="--credkey=mu2epro/managedtokens/fifeutilgpvm01.fnal.gov"
in bashrc allows token refresh commands to find it.
There are actually two vault tokens pushed:
-rw------- 1 mu2epro mu2e 96 Aug  4 08:40 /tmp/vt_u44592
-rw------- 1 mu2epro mu2e 96 Aug  4 08:40 /tmp/vt_u44592-mu2e_production
Either of these can help generate the identical bearer token.  Essentially, the two names for one thing are baked into the tool scripts for now.  This might be fixed in the future.
To refresh tokens (usually done automatically by the tools):
 htgettoken -a fermicloud543.fnal.gov -i mu2e -r production
This is also available after <code>mu2einit</code> as
 getToken
Setting
export BEARER_TOKEN_FILE=/tmp/bt_token_mu2e_production_44592
causes the token procedures to always find the "production" one.
===references===
[https://docs.google.com/document/d/1JoC1lyk0WJi9xMyaoJHvEBHsQfROslowe606nEAqFc4/edit# Tanya page]
[https://indico.cern.ch/event/948465/contributions/4323985/ Dave CHEP]
[https://indico.fnal.gov/event/51072/ 9/22/21]
[https://indico.fnal.gov/event/51189/ 10/6/21]
==SSH Keys==
===What are SSH keys===
SSH has many options for performing authentication, one of which is RSA public key cryptography.  If you want, you can read more about that at, for example, https://en.wikipedia.org/wiki/RSA_(cryptosystem) .    SSH keys are a pair of files, one of which contains the private key and the other of which contains the public key.  You share the public key with people/applications with whom you wish to communicate securely.  It is important to keep your private key secure.
One of the ways to authenticate to github is with SSH keys and github has some good information about SSH keys:
* Generating and testing keys:  https://help.github.com/en/articles/connecting-to-github-with-ssh
* Troubleshooting problems: https://help.github.com/en/categories/authenticating-to-github
The following contains a shortened version of that information plus some additional details.
===Generating a public and private key pair===
Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.
The instructions below are good for most unix based systems, including Mac OS.  If someone knows the instructions for other operating systems, please add them here.
==== Instructions for Unix and Mac ====
The first step is to choose a password that you will provide when prompted.  Please use something unique and do not recycle your kerberos or services password.  You will rarely need this password so make sure you have a means to remember it.
# cd ~/.ssh
# ssh-keygen
## it will also prompt you for a file name; I chose robk_rsa .
## it will prompt you for a password; use the new one you chose above.
This will create two files in the current directory named robk_rsa and robk_rsa.pub; the former holds your private key and the latter your public key.
===Adding a key to your ssh-agent===
Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.
The next step is to add your ''private'' key to the ssh-agent running on your machine.  You can check to see if there are already any registered private keys with the command:
* ssh-add -l
The option is a lower case letter L, not the numeral one.
To add your new key: (n.b.  rhb had trouble with this step after a recent Mac upgrade: neither the old key, which had been working, nor the new key could be added.  Trying again with the unqualified pathname then (ahem) just worked.)
# cd ~/.ssh
# ssh-add ~/.ssh/robk_rsa
# You will be prompted for a password; use the password you gave to ssh-keygen.
RHB has also determined that at least on OSX 10.14.6 every reboot requires this ssh-add step.
If you get the error message "Could not open a connection to your authentication agent", the agent is not running; you can start it with
* eval `ssh-agent`
You can verify that the ssh-agent has your key:
* ssh-add -l
If you do this in one terminal window during an interactive session it will immediately be present in all other terminal windows and all other applications.
When you reboot your computer the key will no longer be in your ssh-agent.  To add it you need again to run
* ssh-add ~/.ssh/robk_rsa
so you had better remember that password! You could put ssh-add -l in your .profile; then, before you get too far, you will see whether you need to add the key again. Once this step is complete you will be able to communicate securely from your laptop/desktop with the people and applications with whom you have shared your public key.
To delete a key from ssh-agent:
* ssh-add -d robk_rsa
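The check suggested above for your .profile can be made a little more informative.  The following is a sketch, not a tested recipe for any particular system; the function name describe_agent_status is made up for this example.  It relies on the exit statuses documented for ssh-add: 0 on success, 1 if the command fails (for -l, this means the agent holds no keys) and 2 if the agent cannot be contacted.

```shell
# Translate the exit status of "ssh-add -l" into a short message.
# describe_agent_status is a hypothetical helper name.
describe_agent_status() {
    case "$1" in
        0) echo "agent running, key loaded" ;;
        1) echo "agent running, no keys: run ssh-add" ;;
        2) echo "no agent: start ssh-agent first" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Capture the status without aborting if ssh-add fails or is missing.
status=0
ssh-add -l >/dev/null 2>&1 || status=$?
describe_agent_status "$status"
```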
===Forwarding your keys from your laptop/desktop to the Mu2e interactive machines===
Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.
The final step is to forward the private key information from the ssh-agent on your laptop/desktop to a Mu2e interactive machine so that you can also have secure communications when logged into one of those machines.  Forwarding your private keys via ssh is more secure than placing a copy of your private key on the Mu2e interactive machines.
There are two ways to do this.  The first way is to add the -A option to your ssh command line:
<nowiki>
ssh -A mu2egpvm01</nowiki>
You will need to add the -A option every time that you ssh into a machine.  You can verify that your credentials have been forwarded by logging into, for example, mu2egpvm01 and issuing the following command:
<nowiki>
ssh-add -l</nowiki>
In my case I see the forwarded key information plus a warning message that indicates that the older protocol 1 is still active but not in use.
<nowiki>
error fetching identities for protocol 1: agent refused operation</nowiki>
This is only a warning message. 
The second way is to modify your ~/.ssh/config file so that the -A option is present by default on every ssh command.  Make a backup copy of ~/.ssh/config before editing it.  The structure of this file is a series of blocks, each beginning with "Host pattern" and followed by lines with option-value pairs.  When you ssh into a machine, ssh looks in the config file and matches the name of your target machine against the patterns given in the lines beginning with "Host".  If the name matches multiple patterns, the first match wins.  The final Host block should begin with "Host *" to match all machine names that have not already matched.  The options given in the matched Host block are applied to your ssh session.  The option that controls forwarding of ssh-agent credentials is "ForwardAgent".  The spirit of the following instructions is that you should forward these credentials only to machines that need them and that you trust; forwarding them to all machines risks theft of the credentials.
In the block beginning: 
<nowiki>Host *.fnal.gov</nowiki>
add the line:
<nowiki> ForwardAgent yes</nowiki>
ssh matches the Host patterns against the machine name exactly as you type it on the command line.  So if you normally type ssh mu2egpvm01, not mu2egpvm01.fnal.gov, it will not match the *.fnal.gov pattern.  To rectify this you can copy the "Host *.fnal.gov" block to a new block named "Host mu2e*" or simply use the "Host mu2e*" block only, if that matches your needs.  The final block of the file should begin:
<nowiki>Host *</nowiki>
which specifies options for all machines not matched by other patterns.  It should contain:
<nowiki> ForwardAgent no</nowiki>
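One way to check the effect of these blocks without touching your real ~/.ssh/config is sketched below.  It writes an example configuration to a scratch file and uses ssh -G, which prints the options ssh would apply to a host without connecting to it.  Putting both patterns on one Host line is a shortcut taken for this example, the helper name forward_setting is made up, and ssh -G needs a reasonably recent OpenSSH.

```shell
# Sketch: write an example config to a scratch file (not ~/.ssh/config)
# and ask ssh which ForwardAgent value it would use for several hosts.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# Forward the agent only to Fermilab machines, by full or short name.
Host *.fnal.gov mu2e*
    ForwardAgent yes

# Everything else: do not forward.
Host *
    ForwardAgent no
EOF

# forward_setting is a hypothetical helper: it prints "yes" or "no".
forward_setting() {
    ssh -F "$cfg" -G "$1" 2>/dev/null | awk '$1 == "forwardagent" { print $2 }'
}

if command -v ssh >/dev/null 2>&1; then
    for host in mu2egpvm01 mu2egpvm01.fnal.gov example.org; do
        echo "$host: ForwardAgent $(forward_setting "$host")"
    done
fi
```

Once the output looks right, the same two blocks can be copied into ~/.ssh/config, with the "Host *" block kept last.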
==Authenticating to github==
GitHub has recently announced that [https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/ password based authentication is deprecated and will soon be disabled].  Several types of token based authentication, including ssh keys, will continue to be supported.  This section describes how to use ssh keys for authentication.
GitHub has good [https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account instructions for registering ssh keys with GitHub].  The instructions below add a few details:
# Follow the instructions [[Authentication#SSH Keys]] to enable ssh keys on your laptop/desktop and forward them to the Mu2e interactive machines.
# Log in to your github account via the web interface: https://github.com
# In the upper right corner, click on your picture or avatar icon to get a pop-up menu
# From the menu select "Settings"
# In the left hand task bar select "SSH and GPG Keys"
# In the upper right of the main pane there is a green button "New SSH Key".  Click it.
# Copy and paste the content of your ''public'' key file into the place provided.  Fill in the title field with a unique name.
# Click on the green "Add SSH Key" below the field for the content of your key file.
# The system will prompt you for the password to your ''github'' account, not the password you used to encrypt the key.
On your laptop or desktop you can test that everything is working by cloning a repository:
<nowiki>
git clone git@github.com:username/Offline</nowiki>
where "username" is the GitHub name of the fork that you are cloning;  the official Mu2e repository has the username "mu2e".
Also test that your ssh keys have been forwarded to the mu2e interactive machines as follows:
<nowiki>ssh mu2egpvm01
mu2einit
git clone git@github.com:username/Offline</nowiki>
==Jupyter==
[https://jupyter.org Jupyter] is a web-based interactive development environment that lets you create notebooks containing code, text, and plots. The most convenient way to use Jupyter and access Mu2e files is to run a session on a Mu2e gpvm and create an SSH tunnel to your local machine.
The recommended way to install Jupyter is through [https://docs.conda.io/en/latest/ conda], a package management system which manages self-contained and independent Python ''environments''.
===Install Miniconda===
Starting on Oct 16, 2024, Fermilab will block the Anaconda and Miniconda repositories when accessed from onsite.  The recommended option is to use Miniforge. For more details see [[Pyana#Anaconda]]. 
<font color=red>Fixme:</font> The instructions below need to be edited to point people at Miniforge.
The first thing we do is to download and run [https://docs.conda.io/en/latest/miniconda.html Miniconda], a minimal installer of conda.
So we log into a Mu2e gpvm (e.g. mu2egpvm01) and run:
<nowiki>cd
mu2einit
setup python v3_8_3b
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
echo ". ~/miniconda3/etc/profile.d/conda.sh" >> ~/.bash_profile</nowiki>
===Install Jupyter===
Now, we logout and login again to create a conda environment containing Jupyter:
<nowiki>
conda create -n jupyter_env
conda activate jupyter_env
conda install -c conda-forge jupyterlab</nowiki>
===Create a SSH tunnel===
Now it is possible to run a Jupyter session (in this case JupyterLab):
<nowiki>
jupyter-lab --no-browser</nowiki>
A token, needed to access the session from another browser, will appear. Take note of it and open an SSH tunnel on your local machine:
<nowiki>
ssh -L localhost:8888:localhost:8888 user@mu2egpvm01.fnal.gov </nowiki>
Now, open your browser at http://localhost:8888 on your local machine and enter the token. The Jupyter session will appear and you should be able to access the files on the gpvm.
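The gpvms are shared, so port 8888 may already be taken by someone else's session; in that case the URL that jupyter-lab prints shows a different port, and the tunnel has to use that port instead.  The following sketch only prints the matching ssh command.  The function name tunnel_command and the user name xyz are placeholders for this example.

```shell
# Sketch: print the tunnel command for a given user, host and remote port.
# The remote port is the one shown in the URL that jupyter-lab prints.
# tunnel_command is a hypothetical helper name.
tunnel_command() {
    user=$1
    host=$2
    remote_port=${3:-8888}            # default matches the text above
    local_port=${4:-$remote_port}     # use the same port locally unless told otherwise
    echo "ssh -L localhost:${local_port}:localhost:${remote_port} ${user}@${host}"
}

tunnel_command xyz mu2egpvm01.fnal.gov 8889
```

With the arguments shown, you would then point your browser at http://localhost:8889 instead of port 8888.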


If you terminate the session on the gpvm you will now only need to load the conda environment and create a new Jupyter session:
<nowiki>
conda activate jupyter_env
jupyter-lab --no-browser</nowiki>


[[Category:Computing]]
[[Category:Infrastructure]]

Latest revision as of 20:42, 4 October 2024


Introduction

You have one Fermilab username for all computing purposes at Fermilab. It was assigned when you applied for your Fermilab ID and computer accounts. This one username is used by several different authentication systems.

The lab provides interactive linux machines for use by Mu2e; see ComputingTutorials#Interactive_logins. You log in to these machines using kerberos authentication. You identify yourself to kerberos using your kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your Fermilab username. Usually you only need to type the xyz part. When your Fermilab accounts were created you set a password for kerberos. You will use your kerberos principal and password to log into all lab-supplied linux computing resources.

The second identity you will need is your Fermilab Single Sign On (SSO) identity. For historical reasons this is sometimes called your services principal, which looks like xyz@services.fnal.gov, where xyz is your Fermilab username. Usually you only need to type the xyz part. When your Fermilab computing accounts were created, you set your SSO password; it should be different from your kerberos password. You use your SSO password to log into the Mu2e web site, the Mu2e DocDB, the Electronic Log Book, the Mu2e internal wiki and to edit the Mu2e public wiki. You also use it to access Fermilab email, the Service Desk web site and some other services hosted by Fermilab.

The third identity you will need is a CILogon certificate. You use this certificate by loading it into your browser, which then gives you access to some Fermilab supported web pages and web services. See #Certificate, below.

The fourth identity that you need gives you access to Mu2e Hypernews, which is an archived blog and email list. To learn about Hypernews see Communications and Collaborative Tools#Hypernews.

A small number of Mu2e people may also need a fifth identity. Fermilab maintains a separate kerberos system to grant access to a cluster of interactive Windows machines. The kerberos principal for these machines looks like xyz@FERMI.GOV (FERMI, not FNAL), where xyz is your Fermilab username. If you need an account on these machines you can request one by opening a Service Desk ticket.

There is a standard lab seminar on authentication for users.

For instructions on how to get a Fermilab ID, Fermilab computing accounts and Mu2e computing accounts, see ComputingAccounts.

Kerberos

You log in to the virtual machines with kerberos authentication. You will need a permanent ID called a kerberos "principal", which looks like "xyz@FNAL.GOV", where xyz is your username. You will have a password associated with your principal. You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.

Your kerberos authentication is stored in a file

/tmp/krb5cc_`id -u`_*

where "*" will be a random string. Each time you ssh into a machine, it will produce a new file in /tmp for that process. The environment variable KRB5CCNAME will point to the ticket file. The ticket may be viewed with

klist  

If you log into a lab desktop, you would typically use your username (just "xyz" without the "@FNAL.GOV") and a new ticket is created. You can renew the ticket, or create it if you logged on by some other means, using

kinit

kinit takes an argument, which is the user name (or full principal), and asks for your password. Tickets are "forwardable" by default. This means that if you are logged into machine A with a ticket, and ssh to machine B, the ticket will also be moved to machine B, so you can ssh (or scp, etc.) once you are on B.

Tickets are only valid for 26h, so you typically refresh your kerberos authentication every day. This would normally happen as you log in to a desktop. If you have an ssh session from machine A to machine B, and leave the session up, you may renew the ticket on A, but the ticket on B will not automatically be updated. In this case, you can use

k5push <node>

to push your fresh ticket out through all ssh sessions and update the remote tickets.
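Because tickets expire, it can be useful for a script to test for a valid ticket before starting work. The following is a minimal sketch that relies on klist -s, which prints nothing and returns 0 only if the credential cache can be read and has not expired. The function name ticket_state is made up for this example.

```shell
# Report whether a usable kerberos ticket exists.
# ticket_state is a hypothetical helper; it prints "valid" or "missing-or-expired".
ticket_state() {
    if command -v klist >/dev/null 2>&1 && klist -s 2>/dev/null; then
        echo "valid"
    else
        echo "missing-or-expired"
    fi
}

if [ "$(ticket_state)" = "valid" ]; then
    klist                              # show the ticket and its expiry time
else
    echo "No valid ticket: run kinit"
fi
```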

There are two utilities in kerberos which we will only note here. The first is the pair kcron and kcroninit, which allow you to obtain a kerberos ticket in a cron job. The second is the keytab file. This file can hold a ticket that is good for a year and can be used by anyone with access to the keytab file. This feature is not as secure, so it is usually only issued by the lab when needed, for example, in a group account.

Read more at the FNAL kerberos link (Miscellaneous Kerberos Topics for the User -> Automated Processes). Reset password here.

Services

The second identity you will need is the services principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password). You will need this identity to log into Fermilab email, the servicedesk web site, sharepoint and some other services based at the lab. You would typically only use this authentication at the point you log into the service, and there is no local credential cache.

Certificate

For some interactive purposes on linux, via the command line or browsers, you will need a certificate to prove your identity. Our certificates are based on the CILogon Certification Authority (CA) which is geared to big science. Some of the systems which require cert authentication are jobsub job submission, ifdh data transfer and writing to the SAM database as part of file upload to tape.

You should have received a certificate as part of registering for computing accounts. (See also Fermilab docs on certs for creating or renewing a cert.) The cert can be downloaded as a password-protected pem or p12 file and imported in your browser. Your cert is good for a year and only needs to be updated in your browser once a year.

When your cert is first created, it is also communicated to the lab which can then manage and provide the cert for you. When you access your cert at the linux command line, you usually access it from this cache. To access the cache you need a valid kerberos ticket. When this access occurs, you get a local file called a proxy which is a copy of your cert, but only valid for a finite time.

It is this proxy that commands can use to authenticate you. Note that jobsub and ifdh can automatically create a proxy for you if it is needed (so you only need to remember to kinit), however, samweb does not create proxies automatically, and you will need to do that yourself if you get authentication errors.

Certs may also have "extensions" added by the lab which enable certain access. The one we use is the "VO" or "virtual organization" extension. All experiments in the Intensity Frontier are part of the fermilab VO. This allows access to OSG grid sites, for example.

The easiest way to get a proxy is to run vomsCert which is in your path after you mu2einit:

mu2einit
kinit
vomsCert


The following are some details and alternatives for getting proxies

kinit 
kx509
kx509 uses kerberos authentication to make a proxy, and writes it to a file named with your UID:
kx509
ls -l /tmp/x509up_u`id -u`
-rw------- 1 rlc mu2e 8171 Aug 15 10:07 /tmp/x509up_u1311


kx509 is equivalent to

setup cigetcert
cigetcert --institution="Fermi National Accelerator Laboratory"

which you might see some places. If you do not have a kerberos ticket when you run cigetcert, it will prompt you for your services password, and you can use this authentication to access your cert and make a proxy.


You can print your proxy certificate with

voms-proxy-info -all

Your cert works by providing encrypted identity information. This packet will be "signed" by a Certificate Authority (CA). The party, such as a lab service, that wants to check your identity can ask the CA if your packet is valid. CA's may be signed by other CA's in turn up to a nationally-recognized organization, in a "trust chain".

Certificate Error

If you believe you have a valid certificate and you still see errors, for example,

Error creating dataset definition for ...
500 SSL negotiation failed: .

Then try removing the cert and recreating it.

kinit 
rm /tmp/x509up_u`id -u`
kx509

The problem is that the proxy created by jobsub or ifdh may require extra steps in authentication because it is "issued" by you. The cert created by kx509 is issued by the CILogon CA, and requires fewer steps to authenticate. While this delete-and-kx509 procedure works fine, the best solution is to provide the intermediate CA in the trust chain to the command (more below).


Types of certs and proxies

A proxy is a copy of your certificate that expires quickly, usually in a few days or hours (see the printouts for "timeleft"). If a bad actor were to gain access to the proxy, they could only use it for the valid time, and they could not replicate it. The proxy is considered safe enough to pass around the grid and over networks. At the command line, we only use proxies.
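If the voms tools are not at hand, the remaining lifetime can also be checked with openssl. The following is a sketch under the assumption that the proxy file starts with the certificate itself, which is the usual layout; the function name proxy_valid_for is made up for this example.

```shell
# Return success if the certificate at the start of file $1 is still
# valid $2 seconds from now (default: one hour).
# "openssl x509 -checkend N" exits 0 if the cert does not expire within N seconds.
# proxy_valid_for is a hypothetical helper name.
proxy_valid_for() {
    file=$1
    seconds=${2:-3600}
    [ -r "$file" ] &&
        openssl x509 -in "$file" -noout -checkend "$seconds" >/dev/null 2>&1
}

proxy="/tmp/x509up_u$(id -u)"
if proxy_valid_for "$proxy" 3600; then
    echo "proxy $proxy is good for at least another hour"
else
    echo "proxy $proxy is missing or expires within the hour"
fi
```

A long-running script could call such a check before each step that needs authentication, rather than discovering an expired proxy through a failed transfer.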


kx509 creates "end entity certificates", which are temporary copies of your cert. (I've had experts tell me these are technically proxies, but this seems to not be helpful.) Here is what one print looks like:

subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
issuer    : /DC=org/DC=cilogon/C=US/O=CILogon/CN=CILogon Basic CA 1
identity  : /DC=org/DC=cilogon/C=US/O=CILogon/CN=CILogon Basic CA 1
type      : unknown
strength  : 2048 bits
path      : /tmp/x509up_u1311
timeleft  : 167:59:40
key usage : Digital Signature, Key Encipherment, Data Encipherment

The term "end-entity credential" just means it is below a CA in the certificate trust chain.

If you print your cert and see "subject" with an appendage like "/CN=2707985426" then this is a proxy. It may also say "proxy" in the "type" field. For backwards compatibility, the kx509 certs are not fully RFC-compliant (the current industry standard), so they have type "unknown".

You can generate a proxy from your certificate; the easiest way I know is:

kx509 --proxyhours=1

which gives

subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc/CN=528207979
issuer    : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
identity  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
type      : RFC compliant proxy
strength  : 2048 bits
path      : /tmp/x509up_u1311
timeleft  : 0:59:55

This is a proxy derived from your end entity cert, so you are the issuer. It has type proxy and has the "/CN=..." and limited time features of a proxy.


Under some circumstances, you might end up with a voms proxy. voms is a system to track identities for uses on the grid. When you run a job on the grid, this will be the form of the cert you have on the grid node as you run. These proxies have been "extended" with additional fields that can be considered part of your identity, such as your VO (Virtual Organization, i.e. your experiment) and your role ("Analysis" or "Production").

In our current procedures, you should never need to do this directly, but if you want, you can create a voms proxy:

 voms-proxy-init -noregen -rfc -voms fermilab:/fermilab/mu2e/Role=Analysis

This will create a proxy with an extension:

subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc/CN=333302448
issuer    : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
identity  : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
type      : RFC compliant proxy
strength  : 1024 bits
path      : /tmp/x509up_u1311
timeleft  : 11:59:57
key usage : Digital Signature, Key Encipherment, Data Encipherment
=== VO fermilab extension information ===
VO        : fermilab
subject   : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Raymond Culbertson/CN=UID:rlc
issuer    : /DC=org/DC=opensciencegrid/O=Open Science Grid/OU=Services/CN=voms2.fnal.gov
attribute : /fermilab/mu2e/Role=Analysis/Capability=NULL
attribute : /fermilab/mu2e/Role=NULL/Capability=NULL
attribute : /fermilab/Role=NULL/Capability=NULL
timeleft  : 11:59:57
uri       : voms2.fnal.gov:15001

The "fermilab" VO is used for all smaller experiments at the lab. In the fermilab VO, there is a "mu2e group". Again, this has all the features of a proxy.

A different location of the proxy can be indicated to most services with some combination of

X509_USER_PROXY=/tmp/x509up_u`id -u`
X509_USER_CERT=/tmp/x509up_u`id -u`
X509_USER_KEY=/tmp/x509up_u`id -u`
X509_CERT_DIR=/etc/grid-security/certificates

With a real cert, you might need to point CERT and KEY to different files, but with a proxy you can point them both to the proxy. The CERT_DIR variable points the command to the intermediate CA certs.
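As a sketch, assuming the proxy sits in the default location used throughout this page, the variables can be set in a shell session like this:

```shell
# Point the X509 variables at the default proxy location.
proxy="/tmp/x509up_u$(id -u)"
export X509_USER_PROXY="$proxy"
export X509_USER_CERT="$proxy"     # with a proxy, cert and key are the same file
export X509_USER_KEY="$proxy"
export X509_CERT_DIR=/etc/grid-security/certificates   # intermediate CA certs

# Warn, but do not fail, if the proxy has not been created yet.
[ -r "$proxy" ] || echo "warning: $proxy does not exist yet"
```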

There is an equivalent set for the http protocol:

HTTPS_CA_FILE
HTTPS_CERT_FILE
HTTPS_KEY_FILE
HTTPS_CA_DIR

The RFC for proxy certificates is here: link and for X.509 certificates here: link.

Browsers

When you first get your certificate, you want to load it in your browsers. This is usually straightforward following the browser instructions. When you visit secure web sites, such as Jenkins, you may have to select the certificate to present to the web site.

When you connect to a secure web site that expects a certificate from you, that site will also present your browser with a certificate of its own. Your browser will then attempt to authenticate the certificate. If it cannot, it will open a dialog box telling you that it does not recognize the site's certificate and asking you if you would like to "add an exception". If you add the exception, then your browser will accept this site even though the browser cannot itself authenticate the certificate. This is usually OK, but not ideal.

The way that your browser authenticates a certificate is that it contacts a recognized, trusted, Certificate Authority (CA). It then forwards the certificate in question to the CA and asks "Can I trust this?". If all is well, the CA replies that you can trust it. If your browser does not know the relevant CA to use, or if it does not trust the CA that the certificate says to use, then your browser will start the "add exception" dialog.

Out of the box, your browser usually has a set of CA's from commercial services, but not the CILogon CA, which we are using currently. You can see what CA's are needed by entering the URL at digicert or running

openssl s_client -connect URL:PORT

If it uses the CILogon certs, you can usually find them here:

/etc/grid-security/certificates/cilogon-basic.pem
/etc/grid-security/certificates/cilogon-osg.pem

Or you can download the same files from CILogon CA certs.

After following the browser instructions for importing the CA cert to your browser, you usually need to check a box to "trust" these certs.
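The output of openssl s_client is long. The following sketch reduces it to the two lines that matter when deciding which CA cert to import: the subject of the site's certificate and the CA that issued it. The function name server_cert_issuer is made up for this example, and the host in the usage comment is only an illustration.

```shell
# Print the subject and issuer of the certificate a server presents.
# Usage: server_cert_issuer HOST [PORT]
# server_cert_issuer is a hypothetical helper name.
server_cert_issuer() {
    host=$1
    port=${2:-443}
    # </dev/null makes s_client exit after the handshake instead of waiting for input
    openssl s_client -connect "${host}:${port}" -servername "$host" </dev/null 2>/dev/null |
        openssl x509 -noout -subject -issuer
}

# Example (needs network access); substitute the site you are checking:
# server_cert_issuer mu2e-docdb.fnal.gov
```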

KCA

KCA refers to an old lab-based Certificate Authority system which is now disabled (2016).

Grid Workflows

The following operations (from Workflows) require certs:

  • samweb write operations (such as mu2eFileDeclare) require a kx509 cert
  • ifdh copies to dCache (including mu2eFileUpload --ifdh) require a voms proxy
  • jobsub grid submission requires a voms proxy
  • reading files via xrootd (filespec like xroot://..) requires a voms proxy

When you land on a grid node, you will have a voms proxy, so all procedures should "just work". But when you perform procedures interactively, you need to manage your certificates.

Two procedures, jobsub and ifdh, will try to make a cert for you. We believe the pattern is:

  • check for a voms proxy pointed to by X509 environmentals
  • check for a voms proxy in /tmp/x509up_u$UID
  • check for a voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID
  • if a voms proxy is found, stop and use it (issue a warning if proxy has less than an hour left)
  • if no voms proxy found look for x509 cert: in X509 environmentals, or in /tmp/x509up_u$UID
  • if a x509 is found, create a 12 h voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID, use that
  • if no x509 proxy is found, look for one-session (daily, default) kerberos ticket in /tmp/krb5cc_${UID}_*, as pointed to by KRB5CCNAME environmental.
  • if a kerberos ticket is found, create x509 cert in /tmp/x509up_u$UID, and voms proxy in /tmp/x509up_voms_mu2e_Analysis_$UID, use voms proxy
  • if the voms proxy expires repeat the above

While this code is trying to be helpful, it can fall short. For example, if it finds a x509 cert with 2h left on it, the voms proxy will also be limited to 2h. Kerberos session tickets will be deleted when the interactive process that created them exits, which will prevent ifdh and jobsub from using them to renew the voms proxy. (Certs are not deleted when you log off.) Cert-based authentication is unique to a node and user, but not to a session, which means you might start a script while you have good authentication, then overwrite those certs by an action in a different session on the same machine. These effects can lead to the procedures failing at unexpected times. The only complete solution is to understand, plan, and monitor your authentication.
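To see which credential file would be found first on the node you are working on, the search order listed above can be sketched as follows. This is a simplification and not the code used by jobsub or ifdh: it only tests that a file is readable and does not distinguish a voms proxy from a plain x509 cert. The function name first_credential_file is made up, and its optional argument replaces /tmp so that it can be tried in a scratch directory.

```shell
# Print the first credential file found, following the search order above.
# first_credential_file is a hypothetical helper name.
first_credential_file() {
    dir=${1:-/tmp}
    uid=$(id -u)
    for f in "${X509_USER_PROXY:-}" \
             "$dir/x509up_u$uid" \
             "$dir/x509up_voms_mu2e_Analysis_$uid" \
             "${KRB5CCNAME:-}"; do
        # KRB5CCNAME may carry a "FILE:" prefix; strip it before testing
        f=${f#FILE:}
        if [ -n "$f" ] && [ -r "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

first_credential_file || echo "no credential found: run kinit"
```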

To set up voms authentication for a long time, so you can run a long mu2eFileUpload, for example, you probably want to refresh your certs first:

TMPP=$(mktemp)
TMPV=$(mktemp)
FINV=/tmp/x509up_u$UID
/usr/krb5/bin/klist
# this is the same as kx509, but with more options
/usr/bin/cigetcert -i "Fermi National Accelerator Laboratory" -o $TMPP
voms-proxy-info -file $TMPP
voms-proxy-init -hours 120 -noregen -rfc \
 -voms fermilab:/fermilab/mu2e/Role=Analysis \
 -cert $TMPP -out $TMPV
mv $TMPV $FINV
rm $TMPP
voms-proxy-info -all

A command has been provided on cvmfs:

/cvmfs/mu2e.opensciencegrid.org/bin/vomsCert


At this point, you should see an x509 cert and voms proxy with 120 h lifetime in /tmp/x509up_u$UID and all procedures should work. You must simply be aware and careful to not overwrite these certs on this machine. The "mv" in the above script makes swapping the new cert in an atomic action, so the cert is never invalid and other scripts can use the cert uninterrupted.

Theoretically, this procedure could be put in your login scripts, but we wouldn't recommend that since you won't be so aware of what's happening.

We have also established that you can create a kcron kerberos ticket (see above) and run the authentication procedure off of the kcron ticket in a cron job, thereby establishing perpetual authentication. To do this, you need to create the kcron keytab file (see also above). To do that, just run:

kcroninit

You'll be prompted for your kerberos password and the keytab will be put on /var/adm. At this point you can see your kcron principal:

kdestroy 
kcron klist

You can run the authentication with a crontab entry:

0 */8 * * * kcron /cvmfs/mu2e.opensciencegrid.org/bin/vomsCert > ~/vomsCert.log 2>&1

but first you must get your kcron kerberos principal associated to your certificate id as explained on the fife pages, which points to this servicedesk item.

You will need to do this procedure once on each machine you want to work on. (kcroninit is permanent until you kcrondestroy.)

Tokens

Tokens are a form of authentication intended to replace x509 certs in the process of authenticating:

  • grid job submission
  • dCache reads and writes (using protocols other than nfs access in /pnfs file system)
  • SAM database write access

Tokens are (2/2023) becoming the leading authentication method for Mu2e. There are docs:

There are several files involved in tokens:

  1. your permanent token at cilogon.org
  2. the refresh token which is valid for 30d and is kept in the lab's central vaultserver database
  3. /tmp/vt_u$(id -u) the vault token, valid for 7d
  4. $HOME/.config/htgettoken/credkey-mu2e-default a pointer into a record in the vaultserver, saved here for faster lookup
  5. ${XDG_RUNTIME_DIR:-/tmp}/bt_u$(id -u) (looks like /run/user/1311/bt_u1311), the bearer token, usually valid for a time on the scale of hours. This is the one actually used in authentication.
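To check where the bearer token of item 5 should be on a given node, the location can be computed as in the following sketch. It assumes that BEARER_TOKEN_FILE, when set, overrides the default location; the function name bearer_token_path is made up for this example.

```shell
# Print the path where the bearer token is expected:
# $BEARER_TOKEN_FILE if set, else bt_u<uid> under $XDG_RUNTIME_DIR,
# falling back to /tmp.  bearer_token_path is a hypothetical helper name.
bearer_token_path() {
    if [ -n "${BEARER_TOKEN_FILE:-}" ]; then
        echo "$BEARER_TOKEN_FILE"
    else
        echo "${XDG_RUNTIME_DIR:-/tmp}/bt_u$(id -u)"
    fi
}

bt=$(bearer_token_path)
if [ -r "$bt" ]; then
    echo "bearer token found at $bt"
else
    echo "no bearer token at $bt"
fi
```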

To get a new token, issue the command

htgettoken -i mu2e --vaultserver htvaultprod.fnal.gov

or this is in a script available after mu2einit:

getToken

(In the future the arguments may have useful defaults.) Now there are two cases. If you have not been active in over 30d, or this is the first time getting a token, then none of your token caches are valid except the permanent one at cilogon.org, so you have to go there to start the process. You will see:

Complete the authentication at:
    https://cilogon.org/device/?user_code=ZCC-9M3-W6K

If the system can access a local web browser, it may open the page for you, but if not, you will have to cut and paste this URL into a browser. Once you access this URL and enter your lab SSO user/password, the system will recognize your authentication and populate all your token caches.

The second case occurs when it has been less than 30d since the last time you got a token. In this case, your vault token or your refresh token will be valid, and then your kerberos ticket will be used to access those caches and refresh all your tokens.

You can see the token with

httokendecode
  or
seeToken

and delete the token with

 delToken

The common procedures (job submission using jobsub, data copies using ifdh and SAM write commands using samweb) that need this authentication, will attempt to use your kerberos ticket to refresh your tokens (the second case) whenever tokens are needed. The envisioned process is that the refresh is done by the tool when you use the tool, and without you taking any explicit token action (except when you have not been active in 30d and fall back into the first case).

If you need to run a command for many hours, and so need to have a token repeatedly renewed, you can use httokensh.

The following allows a test of the tokens

setup ifdhc v2_6_6
export IFDH_PROXY_ENABLE=0
export IFDH_TOKEN_ENABLE=1
export IFDH_DEBUG=10
ifdh cp ...

Mu2epro

The SCD FIFE group provides a service to push a valid vault token to the mu2epro account every 10 min (probably will be set to several hours in the future). The write access is granted by adding

mu2epro/managedtokens/fifeutilgpvm01.fnal.gov@FNAL.GOV

to the k5login. Setting

export HTGETTOKENOPTS="--credkey=mu2epro/managedtokens/fifeutilgpvm01.fnal.gov"

in bashrc allows token refresh commands to find it.

There are actually two vault tokens pushed:

-rw------- 1 mu2epro mu2e 96 Aug  4 08:40 /tmp/vt_u44592
-rw------- 1 mu2epro mu2e 96 Aug  4 08:40 /tmp/vt_u44592-mu2e_production

Either of these can help generate the identical bearer token. Essentially, the two names for one thing are baked into the tool scripts for now. This might be fixed in the future.

To refresh tokens (usually done automatically by the tools)

htgettoken -a fermicloud543.fnal.gov -i mu2e -r production

This is also available after mu2einit as

getToken

Setting

export BEARER_TOKEN_FILE=/tmp/bt_token_mu2e_production_44592

causes the token procedures to always find the "production" one.

references

Tanya page

Dave CHEP

9/22/21

10/6/21

SSH Keys

What are SSH keys

SSH has many options for performing authentication, one of which is RSA public key cryptography. If you want, you can read more about it at, for example, https://en.wikipedia.org/wiki/RSA_(cryptosystem) . SSH keys are a pair of files, one of which contains the private key and the other the public key. You share the public key with the people and applications with whom you wish to communicate securely. It is important to keep your private key secure.

One of the ways to authenticate to GitHub is with SSH keys, and GitHub has some good information about SSH keys. The following contains a shortened version of that information plus some additional details.

Generating a public and private key pair

Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.

The instructions below work for most Unix-based systems, including macOS. If someone knows the instructions for other operating systems, please add them here.


Instructions for Unix and Mac

The first step is to choose a password that you will provide when prompted. Please use something unique and do not recycle your kerberos or services password. You will rarely need this password so make sure you have a means to remember it.

  1. cd ~/.ssh
  2. ssh-keygen
    1. it will prompt you for a file name; I chose robk_rsa .
    2. it will prompt you for a password; use the new one you chose above.

This will create two files in the current directory named robk_rsa and robk_rsa.pub; the former holds your private key and the latter your public key.

Adding a key to your ssh-agent

Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.

The next step is to add your private key to the ssh-agent running on your machine. You can check to see if there are already any registered private keys with the command:

  • ssh-add -l

The option is a lower case letter L, not the numeral one.

To add your new key, follow the steps below. (N.B. RHB had trouble with this step after a recent Mac upgrade: neither the old key, which had been working, nor the new key could be added, and only the unqualified pathname worked.)


  1. cd ~/.ssh
  2. ssh-add ~/.ssh/robk_rsa
  3. You will be prompted for a password; use the password you gave to ssh-keygen.

RHB has also determined that at least on OSX 10.14.6 every reboot requires this ssh-add step.

If you get the error message "Could not open a connection to your authentication agent", the agent is not running; you can start it with

  • eval `ssh-agent`

You can verify that the ssh-agent has your key:

  • ssh-add -l

If you add a key in one terminal window during an interactive session, it will immediately be available in all other terminal windows and applications that use the same ssh-agent.

When you reboot your computer the key will no longer be in your ssh-agent. To add it you need to run again

  • ssh-add ~/.ssh/robk_rsa

so you had better remember that password! You could put ssh-add -l in your .profile so that you see early on whether you need to add the key again. Once this step is complete you will be able to communicate securely from your laptop/desktop to people/applications with whom you have shared your public key.
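
The .profile check mentioned above could be sketched as follows. The exit-status meanings (0 for keys present, 1 for no keys, 2 for no reachable agent) are those documented in the OpenSSH ssh-add man page; the function name agent_reminder and the wording of the messages are hypothetical.

```shell
# Hypothetical helper for ~/.profile: turn the exit status of `ssh-add -l`
# into a reminder. Status meanings follow the OpenSSH ssh-add man page.
agent_reminder() {
    case "$1" in
        0) echo "ssh-agent holds at least one key" ;;
        1) echo "ssh-agent is running but holds no keys; run ssh-add" ;;
        2) echo 'no ssh-agent reachable; start one with: eval `ssh-agent`' ;;
        *) echo "unexpected ssh-add status: $1" ;;
    esac
}

# Usage in ~/.profile (commented out here so the sketch has no side effects):
# ssh-add -l >/dev/null 2>&1; agent_reminder $?
```

Passing the status as an argument keeps the helper independent of whether an agent happens to be running when the function is defined.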

To delete a key from ssh-agent:

  • ssh-add -d robk_rsa

Forwarding your keys from your laptop/desktop to the Mu2e interactive machines

Do this step on your desktop or laptop, not on one of the Fermilab interactive machines.

The final step is to forward the private key information from the ssh-agent on your laptop/desktop to a Mu2e interactive machine so that you can also have secure communications when logged into one of those machines. Forwarding your private keys via ssh is more secure than placing a copy of your private key on the Mu2e interactive machines.

There are two ways to do this. The first way is to add the -A option to your ssh command line:

ssh -A mu2egpvm01

You will need to add the -A option every time that you ssh into a machine. You can verify that your credentials have been forwarded by logging into, for example, mu2egpvm01 and issuing the following command:

ssh-add -l

In my case I see the forwarded key information plus a warning message that indicates that the older protocol 1 is still active but not in use.

error fetching identities for protocol 1: agent refused operation

This is only a warning message.

The second way is to modify your ~/.ssh/config file so that agent forwarding is enabled by default, without typing -A. Make a backup copy of ~/.ssh/config first.

The structure of this file is a series of blocks, each beginning with a line "Host pattern" followed by lines with option-value pairs. When you ssh into a machine, ssh looks in the config file and matches the name of your target machine against the patterns given on the lines beginning with "Host". If the name matches multiple patterns, then for each option the first value found wins, so more specific blocks should come before more general ones. The final Host block should begin with "Host *" to match all machine names that have not already matched. The options given in the matched Host blocks are applied to your ssh session.

The option that controls forwarding of ssh-agent credentials is "ForwardAgent". The spirit of the following instructions is that you should forward these credentials only to machines that need them and that you trust; forwarding them to all machines risks theft of the credentials.

In the block beginning:

Host *.fnal.gov

add the line:

 ForwardAgent yes

ssh matches the Host patterns against the machine name exactly as you type it on the command line. So if you normally type ssh mu2egpvm01, not ssh mu2egpvm01.fnal.gov, the name will not match the *.fnal.gov pattern. To rectify this you can copy the "Host *.fnal.gov" block to a new block named "Host mu2e*", or simply use only the "Host mu2e*" block if that matches your needs. The final block of the file should begin:

Host *

which specifies options for all machines not matched by other patterns. It should contain:

 ForwardAgent no
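
Putting the pieces above together, a complete file might look like the sketch below. It is written to a scratch file rather than to ~/.ssh/config so that nothing is overwritten; the scratch file name and the comments inside the file are illustrative assumptions, and any other options you already have in your Host blocks should be kept.

```shell
# Sketch only: write an example config to a scratch file, not to ~/.ssh/config.
example="${TMPDIR:-/tmp}/ssh_config_example"
cat > "$example" <<'EOF'
# Short names such as "ssh mu2egpvm01"
Host mu2e*
    ForwardAgent yes

# Fully qualified names such as "ssh mu2egpvm01.fnal.gov"
Host *.fnal.gov
    ForwardAgent yes

# Everything else: never forward the agent
Host *
    ForwardAgent no
EOF
```

With a reasonably recent OpenSSH, running ssh -G -F "$example" mu2egpvm01 prints the options that ssh would actually apply to that host, which is a convenient way to check which ForwardAgent value wins before copying the blocks into your real config file.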

Authenticating to github

GitHub no longer accepts password-based authentication for git operations; it was disabled in August 2021. Several types of token-based authentication, including ssh keys, continue to be supported. This section describes how to use ssh keys for authentication.

GitHub has good instructions for registering ssh keys with GitHub. The instructions below add a few details:

  1. Follow the instructions in Authentication#SSH Keys to enable ssh keys on your laptop/desktop and forward them to the Mu2e interactive machines.
  2. Log in to your github account via the web interface: https://github.com
  3. In the upper right corner, click on your picture or avatar icon to get a pop-up menu
  4. From the menu select "Settings"
  5. In the left hand task bar select "SSH and GPG Keys"
  6. In the upper right of the main pane there is a green button "New SSH Key". Click it.
  7. Copy and paste the content of your public key file into the place provided. Fill in the title field with a unique name.
  8. Click on the green "Add SSH Key" below the field for the content of your key file.
  9. The system will prompt you for the password to your *github* account, not the password you used to encrypt the key.

On your laptop or desktop you can test that everything is working by cloning a repository:

git clone git@github.com:username/Offline

where "username" is the GitHub name of the fork that you are cloning; the official Mu2e repository has the username "mu2e".

Also test that your ssh keys have been forwarded to the Mu2e interactive machines as follows (the -A option is not needed if your ~/.ssh/config already sets ForwardAgent for this host):

ssh -A mu2egpvm01
mu2einit
git clone git@github.com:username/Offline

Jupyter

Jupyter is a web-based interactive development environment for notebooks that contain code, text, and plots. The most convenient way to use Jupyter and access Mu2e files is to run a session on a Mu2e gpvm and create an SSH tunnel from your local machine. The recommended way to install Jupyter is through conda, a package management system that manages self-contained and independent Python environments.

Install Miniconda

Starting on Oct 16, 2024, Fermilab will block the Anaconda and Miniconda repositories when accessed from onsite. The recommended option is to use Miniforge. For more details see Pyana#Anaconda.

Fixme: The instructions below need to be edited to point people at Miniforge.

The first thing we do is to download and run Miniconda, a minimal installer of conda. So we log into a Mu2e gpvm (e.g. mu2egpvm01) and run:

cd
mu2einit
setup python v3_8_3b
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
echo ". ~/miniconda3/etc/profile.d/conda.sh" >> ~/.bash_profile
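
Until the instructions above are updated, the following sketch shows what the Miniforge equivalent might look like. The download URL pattern is the one given in the Miniforge README; the helper name miniforge_url is hypothetical, and the commented install steps are assumptions that have not been checked on a Mu2e gpvm.

```shell
# Hypothetical helper: build the Miniforge installer URL for a platform.
# Defaults to the current machine, e.g. Linux and x86_64 on a gpvm.
miniforge_url() {
    os="${1:-$(uname)}"
    arch="${2:-$(uname -m)}"
    echo "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-${os}-${arch}.sh"
}

# Assumed install steps (commented out so the sketch downloads nothing):
# wget "$(miniforge_url)"
# bash "Miniforge3-$(uname)-$(uname -m).sh"
# echo ". ~/miniforge3/etc/profile.d/conda.sh" >> ~/.bash_profile
```

The last commented line assumes the installer's default location, ~/miniforge3; adjust it if you choose a different prefix when the installer asks.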

Install Jupyter

Now, we log out and log in again to create a conda environment containing Jupyter:

conda create -n jupyter_env
conda activate jupyter_env
conda install -c conda-forge jupyterlab

Create an SSH tunnel

Now it is possible to run a Jupyter session (in this case JupyterLab):

jupyter-lab --no-browser

A token, needed to access the session from another browser, will appear. Take note of it and open an SSH tunnel on your local machine:

ssh -L localhost:8888:localhost:8888 user@mu2egpvm01.fnal.gov 

Now, open your browser at http://localhost:8888 on your local machine and enter the token. The Jupyter session will appear and you should be able to access the files on the gpvm.
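
If port 8888 is already in use on the gpvm, JupyterLab picks a different port and shows it in the URL it prints, and the tunnel must then point at that port. The sketch below only prints the matching ssh command; the helper name jupyter_tunnel_cmd is hypothetical, and the -N option (forward ports without starting a remote shell) is an optional addition to the command shown above.

```shell
# Hypothetical helper: print the ssh tunnel command for a given remote port.
# Arguments: user, host, remote port (from the jupyter-lab URL), local port.
jupyter_tunnel_cmd() {
    user="$1"
    host="$2"
    remote_port="${3:-8888}"
    local_port="${4:-$remote_port}"
    echo "ssh -N -L localhost:${local_port}:localhost:${remote_port} ${user}@${host}"
}

# Example: JupyterLab reported port 8889 instead of 8888
jupyter_tunnel_cmd user mu2egpvm01.fnal.gov 8889
```

In that case the browser address becomes http://localhost:8889, or whichever local port was passed as the fourth argument.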

If you terminate the session on the gpvm, you only need to activate the conda environment and start a new Jupyter session:

conda activate jupyter_env
jupyter-lab --no-browser