Difference between revisions of "ComputingLogin"

From Mu2eWiki
 
(28 intermediate revisions by 5 users not shown)
Line 14: Line 14:
 
The interactive login nodes come in these groups:
 
The interactive login nodes come in these groups:
  
*'''mu2evm.fnal.gov''' This is a pool of virtual machines within the [https://ssiwiki.fnal.gov/wiki/Interactive_Server_Facility General Purpose Computing Farm (GPCF)], the main interactive computing facility for use by the Intensity Frontier experiments.  They are identical virtual quad-core machines running SL7 linuxA virtual system looks like a single machine but is actually a copy of the operating system that may share the hardware with other copies of the OS.  If you login to mu2evm you will be diverted to one of the machines in the pool, currently <b>mu2egpvm01</b> through <b>mu2egpvm06</b>. You can also ssh directly to one of these nodes.  These machines are intended for developing, compiling and linking jobs and other general purpose uses.  You can also run short jobs here - longer jobs should be submitted to the grid.  These nodes mount all the [[disks]].  
+
* '''mu2egpvm01''' through '''mu2egpvm07'''
** mu2egpvm02 has updates '''live''' - they follow the repos from the night before
+
** This is a pool of virtual machines within the [https://ssiwiki.fnal.gov/wiki/Interactive_Server_Facility General Purpose Computing Farm (GPCF)], the main interactive computing facility for use by the Intensity Frontier experiments.  They are identical quad-core virtual machines running the Alma Linux 9 (AL9) operating systemEach VM shares a host server with other VMs, some of which may be assigned to other experiments.  
** all others are '''delayed''' - they follow the clone from 30 days ago
+
** These machines are intended for developing, compiling and linking jobs and other general purpose uses.  You can also run short to moderate jobs here - large jobs should be submitted to the [[SubmitJobs | grid]].  These nodes mount all the [[disks]].  
 +
** You can also access these machines by login into '''mu2evm.fnal.gov'''.  This is a load balancer which will forward to one of 01 through 07.
 +
** Monitoring of infrastructure, nightly validation and submission of production jobs (using the mu2epro account) all run on mu2egpvm01.
  
* '''mu2ebuild01.fnal.gov''' This is a 16-core bare metal machine setup very similarly to the virtual machines.  This machine is only for building code (running scons) and running a few simple tests on the build results.  It explicitly is '''not''' for running production jobs, analysis, or anything at all that takes more than a few minutes.  As soon as anyone uses the machine for running jobs, its purpose as a build machine would be defeated.
+
* '''mu2ebuild02.fnal.gov''' This a 32-core (64 thread) bare metal machine running AL9.  It is setup very similarly to the virtual machines.  This machine is only for building code (running scons) and running a few simple tests on the build results.  It explicitly is '''not''' for running production jobs, analysis, or anything at all that takes more than a few minutes.  As soon as anyone uses the machine for running jobs, its purpose as a fast turn-around build machine would be defeated.
  
* '''mu2edaq01.fnal.gov''' This is a 24-core virtual machine, running SL7, dedicated to developing the DAQ and triggers.  Special login permission required.
+
* '''mu2edaq01.fnal.gov''' This is a 24-core virtual machine, running AL9, dedicated to developing the DAQ and triggers.  Special login permission required.
  
 
* '''mu2eimagegpmv01.fnal.gov'''  A [https://ssiwiki.fnal.gov/wiki/Container_Build_Service_Home dedicated machine], running [[Docker]] server, for building Docker images.  Ask offline management for login permissions.
 
* '''mu2eimagegpmv01.fnal.gov'''  A [https://ssiwiki.fnal.gov/wiki/Container_Build_Service_Home dedicated machine], running [[Docker]] server, for building Docker images.  Ask offline management for login permissions.
Line 28: Line 30:
 
* [https://cdcvs.fnal.gov/redmine/projects/fermicloud-openstack/wiki FermiCloud OpenStack Service] Request a custom configured VM; best suited for temporary use.
 
* [https://cdcvs.fnal.gov/redmine/projects/fermicloud-openstack/wiki FermiCloud OpenStack Service] Request a custom configured VM; best suited for temporary use.
  
 
+
You can only log in to the above nodes if you have a Fermilab computing account and accounts on the Mu2e interactive machines.  See [[ComputingAccounts]]. The only permitted access method is '''ssh''' authenticated with a '''kerberos''' ticket, as described more below.
You can only log in to the above nodes if you have a Fermilab [[ComputingAccounts|computer account]]. The only permitted access method is '''ssh''' authenticated with a '''kerberos''' ticket, as described more below.
 
 
 
Monitoring of infrastructure and production runs on mu2egpvm01 (in mu2epro account).  mu2egpvm02 gets updated to the next minor release of scientific linux as soon as it is available, but the rest are updated at the next monthly opportunity.  The idea is that if something bad happens during an upgrade, we catch it on 02 first.
 
  
 
==Logging in from Linux or Mac's==
 
==Logging in from Linux or Mac's==
  
 
If you are running a Fermilab SL distribution or are using a Fermilab maintained Mac, your /etc/krb5.conf file will be correct.  In all other cases update your krb5.conf with the  [https://metrics.fnal.gov/authentication/krb5conf/ Fermlab recommended krb5.conf]; this page requires
 
If you are running a Fermilab SL distribution or are using a Fermilab maintained Mac, your /etc/krb5.conf file will be correct.  In all other cases update your krb5.conf with the  [https://metrics.fnal.gov/authentication/krb5conf/ Fermlab recommended krb5.conf]; this page requires
[[ComputingLogin#VPN|VPN]] for access.
+
[[ComputingLogin#VPN|VPN]] for access.  Anyone can apply for [[#VPN]] access but it takes a few days so we recommend that you ask a colleague to send your their known-good /etc/krb5.conf file and install it on your machine.
  
 
From a Unix or Mac system you need to issue the following commands to log into one of the Fermilab interactive machines:
 
From a Unix or Mac system you need to issue the following commands to log into one of the Fermilab interactive machines:
  
  > kinit -l 6d -f
+
  > kinit -l 6d -f [username@FNAL.GOV]
 
  > ssh -AKX -l your_kerberos_principal mu2egpvm01.fnal.gov
 
  > ssh -AKX -l your_kerberos_principal mu2egpvm01.fnal.gov
  
Line 52: Line 51:
 
If your username on your desktop/laptop is he same as your kerberos principal, you may omit the "-l your_kerberos_principal" or the "your_kerberos_principa@".
 
If your username on your desktop/laptop is he same as your kerberos principal, you may omit the "-l your_kerberos_principal" or the "your_kerberos_principa@".
  
You can change the default behaviour of your ssh using the file ~/.ssh/config .  We recommend adding the following lines:
+
You can change the default behaviour of your ssh using the file <code>~/.ssh/config</code> .  We recommend adding the following lines:
  
 
<pre>
 
<pre>
 
Host mu2e*
 
Host mu2e*
 +
GSSAPIAuthentication yes
 
  GSSAPIDelegateCredentials yes
 
  GSSAPIDelegateCredentials yes
 
  ForwardAgent yes
 
  ForwardAgent yes
Line 62: Line 62:
 
</pre>
 
</pre>
  
On most systems this will let you drop the explicit -AKX arguments.  If your username also matches, the command is reduced to:
+
On most systems this will let you drop the explicit -AKX arguments.  If your username on your laptop is the same as your kerberos principal, then the command is reduced to:
  
 
  > ssh mu2egpvm01.fnal.gov
 
  > ssh mu2egpvm01.fnal.gov
  
 +
 +
To learn more about the file <code>~/.ssh/config</code> google it.  I found this one useful: https://linuxize.com/post/using-the-ssh-config-file .
  
  
Line 124: Line 126:
  
 
* MacOS 11, Big Sur, was released Nov 12, 2020.  See [[#Known_Issues_with_MacOS_11.2C_Big_Sur]] for a description of known issues.
 
* MacOS 11, Big Sur, was released Nov 12, 2020.  See [[#Known_Issues_with_MacOS_11.2C_Big_Sur]] for a description of known issues.
* To log into to the Mu2e interactive machines at Fermilab, you may need to download and install the [https://metrics.fnal.gov/authentication/krb5conf/ Fermilab recommended /etc/krb5.conf] (requires [[ComputingLogin#VPN|VPN]] for access) and consult the web page with suggestions on [https://cdcvs.fnal.gov/redmine/projects/admin/wiki/SSH how to resolve ssh problems].
+
* To log into to the Mu2e interactive machines at Fermilab, you may need to download and install the [https://metrics.fnal.gov/authentication/krb5conf/ Fermilab recommended /etc/krb5.conf] (requires [[ComputingLogin#VPN|VPN]] for access) and consult the web page with suggestions on [https://cdcvs.fnal.gov/redmine/projects/admin/wiki/SSH how to resolve ssh problems].  Be aware there is a ticket cache and a "Ticket Viewer".  
 
** Any time that you upgrade your OS you must check whether or not the upgrade overwrote /etc/krb5.conf; if necessary restore it from the link above.
 
** Any time that you upgrade your OS you must check whether or not the upgrade overwrote /etc/krb5.conf; if necessary restore it from the link above.
 
* Starting Wed, Nov 4, 2020, any Mac computer running macOS High Sierra (v10.13) will be blocked from accessing the Fermilab network until the machine is upgraded; all earlier versions of Mac OS are already blocked.
 
* Starting Wed, Nov 4, 2020, any Mac computer running macOS High Sierra (v10.13) will be blocked from accessing the Fermilab network until the machine is upgraded; all earlier versions of Mac OS are already blocked.
Line 137: Line 139:
 
If you install this release, read the instructions carefully.  The install requires a reboot but many people miss that.  The result is login sessions that hang.
 
If you install this release, read the instructions carefully.  The install requires a reboot but many people miss that.  The result is login sessions that hang.
  
===Known Issues with MacOS 11, Big Sur===
+
=== Monterey ===
 +
Update 12/2022: On MacOS Big Sur (and apparently also Monterey), if you want to use kinit on the command line, you need to explicitly configure where the ticket cache is; in .bashrc,  
 +
export KRB5CCNAME=KCM:uid.  See also [http://kb.mit.edu/confluence/pages/viewpage.action?pageId=4981397 MIT].
 +
 
 +
=== Big Sur===
 +
 
 +
Update 9/2023, from the lab: Starting Tuesday, Oct. 31, 2023, any Mac computer running macOS Big Sur (v11) will be blocked from accessing the Fermilab network. Fermilab-owned computers, by laboratory policy, must run vendor-supported operating systems to avoid security risks.
 +
 
 +
Update 12/2022: On MacOS Big Sur, if you want to use kinit on the command line, you need to explicitly configure where the ticket cache is; in .bashrc,
 +
export KRB5CCNAME=KCM:uid.  See also [http://kb.mit.edu/confluence/pages/viewpage.action?pageId=4981397 MIT].
  
 
Update Feb 23, 2021: I have not heard anything about this in a long time.  I expect that it has been solved but I have not heard a conclusive answer.
 
Update Feb 23, 2021: I have not heard anything about this in a long time.  I expect that it has been solved but I have not heard a conclusive answer.
Line 161: Line 172:
 
You may need to research a little more to see what you need to do to include OpenGL with your builds
 
You may need to research a little more to see what you need to do to include OpenGL with your builds
 
</nowiki>
 
</nowiki>
 
  
 
===Mojave===
 
===Mojave===
 
[https://discussions.apple.com/thread/252808834 Mojave Security Update 2021-004 broke Kerberos for me]
 
[https://discussions.apple.com/thread/252808834 Mojave Security Update 2021-004 broke Kerberos for me]
  
 +
Mojave will be blocked from the Fermilab network starting Oct 22, 2021.
  
 
===Anaconda===
 
===Anaconda===
  
Installing anaconda python analysis package can interfere with shell operation and kerberos. [https://mu2e-hnews.fnal.gov/HyperNews/Mu2e/get/HELPBug/429.html hypernews]
+
Installing the anaconda python analysis package can interfere with shell operation and kerberos. [https://mu2e-hnews.fnal.gov/HyperNews/Mu2e/get/HELPBug/429.html hypernews].  Here's an overview from experience with Big Sur:
 +
* it changed the default kinit to a conda version, which fails (gssapi-etc)</li>
 +
* it switches you to the zsh (your prompt will probably change)</li>
 +
Ask "which kinit" or echo $SHELL to see if that's the problem.  /usr/bin/yourname not anything with anaconda or conda in it!
 +
 
 +
Adding "conda deactivate" as the last line of your .bashrc doesn't do the trick; you still end up in the zshell.  I added conda deactivate as the last line of my .zshrc (which it created) and left it in my .bashrc and all was well. There may well be more intelligent ways to do this. You can conda activate later. 
 +
* https://uscms.org/uscms_at_work/physics/computing/getstarted/uaf.shtml#MacBigSur has useful information </li>
 +
 
 +
In one case none of the above worked to restore ssh functionality following installation of anaconda.  The solution was to use the -v (verbose) option on ssh. Why this worked is not understood but we have a guess: verbose mode enables debugging and it tries addition options if the first attempts fail.  Presumably this modified some sticky state that had been polluted by the Anaconda kerberos.
 +
 
 +
 
 +
Added Sept 19, 2024: I was told that adding this line to krb5.conf
 +
 
 +
  default_ccache_name = API:
 +
 
 +
will allow us to access Fermilab machines using both regular and anaconda kinit.  If you do try this please comment on the Mu2e slack about success or failure.
 +
 
 +
===Networking connection 6/2022===
 +
I just came to Fermilab for the first time in a month or two, and my computer would not connect to the internet.  It would connect to fgz,  but then DNS wouldn’t work.  It’s been registered for almost five years and it always works.
 +
 
 +
Like I said, this stumped the Help Desk.  I started poking around, and if you open network preferences, there’s a new option (or at least I don’t recall ever seeing it) called “Limit IP Address Tracking”.  I unchecked this and it immediately started working.
 +
 
 +
===Catalina===
 +
 
 +
Because it was declared end-of-life, Catalina (v10.15) was blocked from the Fermilab network on 10/28/22.
 +
 
 +
===Ventura===
 +
 
 +
4/2023 from Rob: I recently upgraded my lab Mac laptop to Ventura, upgraded XCode to version 14 and root to v6.28/00.  Root worked properly for viewing histograms and browsing TTrees.  However, there are two issues:
 +
 
 +
#Using the GeometryBrowser I can't get the OGL view to work, only the default wireframe view.
 +
#When I use #include within a .C file to include code to be compiled, root cannot find the #included file; true even with ROOT_INCLUDE_PATH set what I believe is correctly.
 +
 
 +
I upgraded to root 6.28/02. Still not luck.  I also tried the homebrew install and got a random old version that did not work.
 +
 
 +
I spoke with Philippe Canal earlier this afternoon who told me that the root team is aware of issues with root on Ventura and are working on them.  They will be fixed in an upcoming release.  Many things are fixed in the head of root if anyone wants to build it on their own.  I will wait for a new release.
 +
 
 +
Update 5/8/23 from Rob:  The workaround is to install an older root, v6.26.10.  I useed the root distribution tar.gz file for Mac OS 13.0 and Code 14, even though I am running Mac OS 13.3.1.  It's working.  Today Axel Naumann announced the release of root 6.28/04 with support for gcc 13, Xcode 14 and macOS 13.3 (Ventura).  I have asked Kyle to provide an art suite with this root.
  
 
==Logging in From PC's==
 
==Logging in From PC's==
Line 205: Line 253:
  
 
Most Mu2e people have an RSA App on their phone to generate the "soft token" needed to access Fermilab VPN.  If you change to a new phone, you may need to contact the service desk to request a new "verification code" for the app.
 
Most Mu2e people have an RSA App on their phone to generate the "soft token" needed to access Fermilab VPN.  If you change to a new phone, you may need to contact the service desk to request a new "verification code" for the app.
 +
 +
If you try click on a link to a service that requires you be on site or on VPN, but your are not, then you will see one of two outcomes. Sometimes you get an error message saying that you must be on site or on VPN. An increasing fraction of the time nothing happens - no error message and your browser does not move from the page it was on. If the latter happens to you, ask a colleague if the page requires on site or VPN.
  
 
===<ins>VNC </ins>===
 
===<ins>VNC </ins>===
Line 232: Line 282:
 
<li> once logged in, do: </li>
 
<li> once logged in, do: </li>
 
<pre>
 
<pre>
kinit                 # get Kerberos ticket
+
kinit [username@FNAL.GOV]  # get Kerberos ticket
vncpasswd             # create your VNC server password
+
vncpasswd                 # create your VNC server password
 
</pre>
 
</pre>
 
<li> check already running VNC servers:  
 
<li> check already running VNC servers:  
 +
 
<pre>
 
<pre>
 
murat@mu2egpvm02:~>uname -a; ps -efl | grep -i Xvnc | cut -c-120
 
murat@mu2egpvm02:~>uname -a; ps -efl | grep -i Xvnc | cut -c-120
Line 250: Line 301:
 
0 S huangs  32386    1  0  80  0 - 92075 ep_pol Aug04 ?        03:27:20 /usr/bin/Xvnc :29 -auth /nashome/h/huangs/.Xa
 
0 S huangs  32386    1  0  80  0 - 92075 ep_pol Aug04 ?        03:27:20 /usr/bin/Xvnc :29 -auth /nashome/h/huangs/.Xa
 
</pre>
 
</pre>
Trap for the unwary: if you do this next step after you have done a setup.sh in that shell, you will pollute the client environment with incorrect values.  Symptom is setup.sh will give you the "already set up" message.  Infinite bad things can then happen.
+
In this printout the Xserver ID is the field that immediately follows Xvnc; for example in the last line it is 29.
 
+
<li> start your VNC server with unused Xserver ID, for example, use ID=7:  
<li> start your VNC server with unused Xserver ID, for example, use ID=7: </li>
+
: <b>(make sure you start the VNC server before running any Mu2e-specific setups, having those in ~/.bashrc, for example, might result in a problem!)</b></li>
 
<pre>
 
<pre>
vncserver :7 -name MY_VNC -depth 24 -geometry 1920x1080 -localhost
+
vncserver :7 -name MY_VNC -depth 24 -geometry 1920x1080 -localhost -bs
 
</pre>
 
</pre>
 +
Replace MY_VNC with your kerberos principal.
 
Using <b>'-localhost'</b> option is a must, the server should be listening to a local port
 
Using <b>'-localhost'</b> option is a must, the server should be listening to a local port
 
</ol>
 
</ol>
Line 270: Line 322:
  
 
     <b> ssh -v -f -KX -N -L PORT:localhost:PORT <kerberos_username>@mu2egpvm04.fnal.gov </b>
 
     <b> ssh -v -f -KX -N -L PORT:localhost:PORT <kerberos_username>@mu2egpvm04.fnal.gov </b>
 
+
<Br>
 
'''notes''':  
 
'''notes''':  
  
Line 283: Line 335:
 
</ol>
 
</ol>
 
This is it! you now have acccess to your server that will look like a linux machine.
 
This is it! you now have acccess to your server that will look like a linux machine.
 +
 +
* to use KDE as a window manager (KDE is better than GNOME, but often GNOME comes as the default) configure <b>~/.Xclients</b> and <b>~/.vnc/xstartup</b> files on the server side
 +
<pre>
 +
# ---------------------------- ~/.Xclients  the mode: -rwx--x--x
 +
#!/bin/bash
 +
 +
startkde &
 +
 +
# --------------------------------- ~~/.vnc/xstartup :  -rwx--x--x
 +
#!/bin/sh
 +
 +
[ -r /etc/sysconfig/i18n ] && . /etc/sysconfig/i18n
 +
export LANG
 +
export SYSFONT
 +
vncconfig -iconic &
 +
unset SESSION_MANAGER
 +
unset DBUS_SESSION_BUS_ADDRESS
 +
OS=`uname -s`
 +
 +
WINDOWMANAGER=kde
 +
 +
if [ $OS = 'Linux' ]; then
 +
  case "$WINDOWMANAGER" in
 +
    *gnome*)
 +
      if [ -e /etc/SuSE-release ]; then
 +
        PATH=$PATH:/opt/gnome/bin
 +
        export PATH
 +
      fi
 +
      ;;
 +
  esac
 +
fi
 +
 +
export DESKTOP=KDE
 +
 +
if [ -x /etc/X11/xinit/xinitrc ]; then
 +
  exec /etc/X11/xinit/xinitrc
 +
fi
 +
if [ -f /etc/X11/xinit/xinitrc ]; then
 +
  exec sh /etc/X11/xinit/xinitrc
 +
fi
 +
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
 +
xsetroot -solid grey
 +
# xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
 +
kdeinit4 &
 +
</pre>
  
 
==== <ins>Setting up a VNC client on Windows</ins> ====
 
==== <ins>Setting up a VNC client on Windows</ins> ====
Line 344: Line 441:
  
 
<pre>
 
<pre>
   pf -efl | grep ssh
+
   ps -efl | grep ssh
 
</pre>
 
</pre>
  

Latest revision as of 19:58, 19 September 2024

Introduction

Fermilab maintains several machines for interactive login by Mu2e members. Typically all interactive work would be done on the mu2evm (GPCF) nodes. This web page describes how to log in to these machines and a few others. We support the bash shell exclusively.


If you cannot find the information you need on this page, please

  1. Remember to check the section on documentation supported by other organizations
  2. Look at the Mu2e Computing Help page which contains links to additional material.


Machines

The interactive login nodes come in these groups:

  • mu2egpvm01 through mu2egpvm07
    • This is a pool of virtual machines within the General Purpose Computing Farm (GPCF), the main interactive computing facility for use by the Intensity Frontier experiments. They are identical quad-core virtual machines running the Alma Linux 9 (AL9) operating system. Each VM shares a host server with other VMs, some of which may be assigned to other experiments.
    • These machines are intended for developing, compiling and linking jobs and other general purpose uses. You can also run short to moderate jobs here - large jobs should be submitted to the grid. These nodes mount all the disks.
    • You can also access these machines by logging into mu2evm.fnal.gov. This is a load balancer which will forward to one of 01 through 07.
    • Monitoring of infrastructure, nightly validation and submission of production jobs (using the mu2epro account) all run on mu2egpvm01.
  • mu2ebuild02.fnal.gov This is a 32-core (64 thread) bare metal machine running AL9. It is set up very similarly to the virtual machines. This machine is only for building code (running scons) and running a few simple tests on the build results. It explicitly is not for running production jobs, analysis, or anything at all that takes more than a few minutes. As soon as anyone uses the machine for running jobs, its purpose as a fast turn-around build machine would be defeated.
  • mu2edaq01.fnal.gov This is a 24-core virtual machine, running AL9, dedicated to developing the DAQ and triggers. Special login permission required.
  • mu2eimagegpmv01.fnal.gov A dedicated machine, running Docker server, for building Docker images. Ask offline management for login permissions.
  • fnalu.fnal.gov This is a legacy machine which we have access to, but it is only used for a few very specific purposes.

You can only log in to the above nodes if you have a Fermilab computing account and accounts on the Mu2e interactive machines. See ComputingAccounts. The only permitted access method is ssh authenticated with a kerberos ticket, as described more below.

Logging in from Linux or Mac's

If you are running a Fermilab SL distribution or are using a Fermilab maintained Mac, your /etc/krb5.conf file will be correct. In all other cases update your krb5.conf with the Fermilab recommended krb5.conf; this page requires VPN for access. Anyone can apply for #VPN access but it takes a few days, so we recommend that you ask a colleague to send you their known-good /etc/krb5.conf file and install it on your machine.

From a Unix or Mac system you need to issue the following commands to log into one of the Fermilab interactive machines:

> kinit -l 6d -f [username@FNAL.GOV]
> ssh -AKX -l your_kerberos_principal mu2egpvm01.fnal.gov

An alternate form of the last line is:

> ssh -AKX your_kerberos_principal@mu2egpvm01.fnal.gov


The -f argument to kinit requests a forwardable kerberos ticket; this means that when you are logged in to your target machine you can request services that ask for kerberos authentication. Otherwise you just have permission to get into the machine, not to get further authorization once there. The -AK argument to ssh requests ssh to forward your forwardable ticket. The X argument might or might not be necessary: see the next section.

If your username on your desktop/laptop is the same as your kerberos principal, you may omit the "-l your_kerberos_principal" or the "your_kerberos_principal@".
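
As a rough sketch, the two commands above can be wrapped in a small shell function. The names mu2e_ssh_command and mu2e_login are hypothetical and not part of any Mu2e tooling. Using "klist -s" to skip kinit when a valid ticket already exists is an assumption about your local kerberos installation; check that your klist supports it.

```shell
# Hypothetical helpers, for illustration only.

# Print the ssh command line for a given principal and node (dry run).
# The node defaults to mu2egpvm01.
mu2e_ssh_command() {
    mu2e_principal="$1"
    mu2e_node="${2:-mu2egpvm01}"
    printf 'ssh -AKX %s@%s.fnal.gov\n' "$mu2e_principal" "$mu2e_node"
}

# Get a forwardable ticket if needed, then log in.
mu2e_login() {
    mu2e_principal="$1"
    mu2e_node="${2:-mu2egpvm01}"
    # Assumption: "klist -s" exits with status 0 only if a valid ticket exists.
    # If not, request a forwardable ticket with a 6 day lifetime.
    klist -s || kinit -l 6d -f "${mu2e_principal}@FNAL.GOV" || return 1
    # -A forwards the ssh agent, -K forwards the kerberos ticket, -X enables X11.
    ssh -AKX "${mu2e_principal}@${mu2e_node}.fnal.gov"
}
```

For example, "mu2e_ssh_command your_kerberos_principal mu2egpvm03" only prints the command, while "mu2e_login your_kerberos_principal" actually runs kinit (if needed) and ssh.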

You can change the default behaviour of your ssh using the file ~/.ssh/config . We recommend adding the following lines:

Host mu2e*
 GSSAPIAuthentication yes
 GSSAPIDelegateCredentials yes
 ForwardAgent yes
 ForwardX11Trusted yes
 ForwardX11 yes

On most systems this will let you drop the explicit -AKX arguments. If your username on your laptop is the same as your kerberos principal, then the command is reduced to:

> ssh mu2egpvm01.fnal.gov


To learn more about the file ~/.ssh/config google it. I found this one useful: https://linuxize.com/post/using-the-ssh-config-file .
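
If you prefer to script this, the following is a minimal sketch that appends the recommended lines to an ssh config file only if no "Host mu2e*" entry is present yet. The function name add_mu2e_ssh_stanza is hypothetical, and the sketch assumes that an existing "Host mu2e*" line means the entry is already configured as you want it.

```shell
# Hypothetical helper, for illustration only.
# Usage: add_mu2e_ssh_stanza [path-to-config]   (default: ~/.ssh/config)
add_mu2e_ssh_stanza() {
    mu2e_config="${1:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$mu2e_config")"
    touch "$mu2e_config"
    # Do nothing if a "Host mu2e*" entry already exists.
    if grep -q '^Host mu2e\*' "$mu2e_config"; then
        return 0
    fi
    cat >> "$mu2e_config" <<'EOF'

Host mu2e*
  GSSAPIAuthentication yes
  GSSAPIDelegateCredentials yes
  ForwardAgent yes
  ForwardX11Trusted yes
  ForwardX11 yes
EOF
    # ssh may refuse a config file that is writable by others.
    chmod 600 "$mu2e_config"
}
```

With a reasonably recent OpenSSH you can then inspect the options that ssh will actually apply with "ssh -G mu2egpvm01.fnal.gov", which prints the resolved configuration without connecting.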


Mac OpenGL 3D Error

When using OpenGL based 3D event displays on macOS, some people get an error message containing the string "cannot load swrast driver". When this occurs no graphics are produced. The specific circumstances in which this error has been observed are:

  • You are working on a Mac laptop or desktop.
  • You are logged into one of the Mu2e linux machines ( for example one of the mu2egpvm machines ).
  • You are running a graphics program on the linux machine that opens a graphics window on your display.
  • Your Mac is using XQuartz installed as its X11 server and the version is at least 2.7.10.
  • Two cases in which this error has been seen are:

The most general solution is:

  1. Quit XQuartz.
  2. Open a fresh Terminal window and issue the command:
     defaults write org.macosforge.xquartz.X11 enable_iglx -bool true
  3. Restart XQuartz and open a new Terminal window (either from the Applications menu or using the cmd-N keyboard shortcut). (It is different from the default Mac Terminal - it is really an xTerm.)
  4. Use this xTerm window to log in to a mu2egpvmxx machine:
     ssh -KAX userID@mu2egpvmxx.fnal.gov
     where userID is of course your own user ID.
  5. Your 3D graphics should now work, although you may still see an initial complaint.

In the error message, "swrast" refers to software rasterization. Depending on the details of your computer, you may have a high end graphics card capable of hardware rasterization of 3D graphics. If you do not, then the X11 installation needs to do software rasterization of 3D graphics (which will work but will be slower).

If you have both a low end and a high end graphics card, your Mac normally switches between the two automatically (it uses the low end card when it can in order to save energy). However there is a bug in the automatic switching that leads to the same error as seen above. In this case a possible solution is to disable automatic switching and to always use the high end card. The instructions for this are:

  1. To see which video cards you have: open up the Apple Menu and choose “About This Mac”. Click on “System Report”. In the left hand sidebar click on Graphics/Displays. On my machine this shows:
     AMD Radeon R9 M370X
     Intel Iris Pro
     The first is the more powerful video card; the second is the default on-chip video “card”.
  2. If you see only one video card then these instructions are not useful; use the instructions above.
  3. Go to the System Preferences (Gear icon) and choose energy saver. At the top is a check box for automatic graphics switching. Uncheck the box.
  4. Log out of the remote linux machine; restart XQuartz; log in again to the remote linux machine and retry.

Note on Mac's

  • MacOS 11, Big Sur, was released Nov 12, 2020. See #Known_Issues_with_MacOS_11.2C_Big_Sur for a description of known issues.
  • To log in to the Mu2e interactive machines at Fermilab, you may need to download and install the Fermilab recommended /etc/krb5.conf (requires VPN for access) and consult the web page with suggestions on how to resolve ssh problems. Be aware there is a ticket cache and a "Ticket Viewer".
    • Any time that you upgrade your OS you must check whether or not the upgrade overwrote /etc/krb5.conf; if necessary restore it from the link above.
  • Starting Wed, Nov 4, 2020, any Mac computer running macOS High Sierra (v10.13) will be blocked from accessing the Fermilab network until the machine is upgraded; all earlier versions of Mac OS are already blocked.
  • Mac OS X users can acquire the X.Org X Window System disk image at xquartz.org.
  • For most general Mac questions the best source of information is the Fermilab mailing list, MACUSERS@listserv.fnal.gov. See [1] to subscribe. The list is archived at: [2] . The lab also has a MAC_ANNOUNCE list that is archived at [3] .
  • The lab goes through a process to determine if a new OS meets the lab’s security baseline; the process typically takes a few to 6 months. During this time Fermilab owned Macs will not be upgraded to the new OS but personal or university owned Macs with the new OS are permitted at the lab. The most common issue is forgetting to configure /etc/krb5.conf ( requires VPN for access). For other issues consult the MACUSERS mailing list, described above.
  • If you use Scientific Linux Fermi (SLF) it is normally configured so that when you use ssh on your laptop to log in to a remote node ( like one of the GPCF nodes), software running on that remote node is permitted to open a new window on your laptop display. If this does not work, add the -X or -Y options to your ssh command:
> ssh -X -AK -l your_kerberos_principal mu2egpvm01.fnal.gov

XQuartz Beta Release 2.8.0

If you install this release, read the instructions carefully. The install requires a reboot but many people miss that. The result is login sessions that hang.

Monterey

Update 12/2022: On MacOS Big Sur (and apparently also Monterey), if you want to use kinit on the command line, you need to explicitly configure where the ticket cache is; in .bashrc, export KRB5CCNAME=KCM:uid. See also MIT.

Big Sur

Update 9/2023, from the lab: Starting Tuesday, Oct. 31, 2023, any Mac computer running macOS Big Sur (v11) will be blocked from accessing the Fermilab network. Fermilab-owned computers, by laboratory policy, must run vendor-supported operating systems to avoid security risks.

Update 12/2022: On MacOS Big Sur, if you want to use kinit on the command line, you need to explicitly configure where the ticket cache is; in .bashrc, export KRB5CCNAME=KCM:uid. See also MIT.

Update Feb 23, 2021: I have not heard anything about this in a long time. I expect that it has been solved but I have not heard a conclusive answer.

MacOS 11, Big Sur, was released Nov 12, 2020. There is one known issue.

One user has reported that out of the box OpenGL does not work correctly. This affects some of our event displays and geometry browsers and probably some other ROOT based graphics. The ROOT team is aware of the issue and is working with a member of SCD to resolve the issue.

The lab’s Mac support team has said that this is out of the scope of their expertise but they did provide some information:

A few years ago Apple announced that macOS would move away from OpenGL in favor of the Apple created Metal. At that time it was only mentioned that it would be removed in a “future OS”. From memory OpenGL and OpenCL were deprecated with macOS Mojave v10.14. I have seen references online that macOS Catalina does still have it, but it remains stuck at v4.1 (2010) and will throw error messages when building.
 
Looking online, it appears that OpenGL still exists in macOS Big Sur according to this Apple Developer article https://developer.apple.com/documentation/xcode/porting_your_macos_apps_to_apple_silicon
 
From the article:

If your app uses Metal, OpenGL, or OpenCL, be aware of the following differences:
• The GPU and CPU on Apple silicon share memory.
• OpenGL is deprecated, but is available on Apple silicon.
• OpenCL is deprecated, but is available on Apple silicon when targeting the GPU. The OpenCL CPU device is not available to arm64 apps.
 
You may need to research a little more to see what you need to do to include OpenGL with your builds

Mojave

Mojave Security Update 2021-004 broke Kerberos for me

Mojave will be blocked from the Fermilab network starting Oct 22, 2021.

Anaconda

Installing the Anaconda Python analysis package can interfere with shell operation and Kerberos (see hypernews). Here's an overview from experience with Big Sur:

  • it changed the default kinit to a conda version, which fails (gssapi-etc)
  • it switches you to the zsh (your prompt will probably change)

Run "which kinit" and "echo $SHELL" to see if that's the problem. The kinit you get should live under /usr/bin, not under any path with anaconda or conda in it!
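
The check above can be wrapped in a small function. This is only a sketch: check_kinit_path is a hypothetical helper name, and the test is the simple "does the path mention conda" rule described above.

```shell
# Hypothetical helper: classify a kinit path as usable or conda-provided.
# Returns 0 if the path looks fine, 1 if it points into a conda install,
# 2 if no path was given (kinit not found).
check_kinit_path() {
    case "$1" in
        "")      echo "kinit not found"; return 2 ;;
        *conda*) echo "conda kinit, expect Kerberos failures: $1"; return 1 ;;
        *)       echo "looks fine: $1"; return 0 ;;
    esac
}

# Usage on your own machine:
#   check_kinit_path "$(command -v kinit)"
#   echo "$SHELL"
```

If the first command reports a conda kinit, or the second prints a zsh path when you expected bash, the Anaconda installation is the likely culprit.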

Adding "conda deactivate" as the last line of your .bashrc doesn't do the trick; you still end up in the zshell. I added conda deactivate as the last line of my .zshrc (which it created) and left it in my .bashrc and all was well. There may well be more intelligent ways to do this. You can conda activate later.

In one case none of the above worked to restore ssh functionality following installation of anaconda. The solution was to use the -v (verbose) option on ssh. Why this worked is not understood but we have a guess: verbose mode enables debugging and it tries additional options if the first attempts fail. Presumably this modified some sticky state that had been polluted by the Anaconda kerberos.


Added Sept 19, 2024: I was told that adding this line to krb5.conf

  default_ccache_name = API:

will allow us to access Fermilab machines using both regular and anaconda kinit. If you do try this please comment on the Mu2e slack about success or failure.

Networking connection 6/2022

I just came to Fermilab for the first time in a month or two, and my computer would not connect to the internet. It would connect to fgz, but then DNS wouldn’t work. It’s been registered for almost five years and it always works.

Like I said, this stumped the Help Desk. I started poking around, and if you open network preferences, there’s a new option (or at least I don’t recall ever seeing it) called “Limit IP Address Tracking”. I unchecked this and it immediately started working.

Catalina

Because it was declared end-of-life, Catalina (v10.15) was blocked from the Fermilab network on 10/28/22.

Ventura

4/2023 from Rob: I recently upgraded my lab Mac laptop to Ventura, upgraded XCode to version 14 and root to v6.28/00. Root worked properly for viewing histograms and browsing TTrees. However, there are two issues:

  1. Using the GeometryBrowser I can't get the OGL view to work, only the default wireframe view.
  2. When I use #include within a .C file to include code to be compiled, root cannot find the #included file; this is true even with ROOT_INCLUDE_PATH set to what I believe is the correct value.

I upgraded to root 6.28/02. Still no luck. I also tried the homebrew install and got a random old version that did not work.

I spoke with Philippe Canal earlier this afternoon who told me that the root team is aware of issues with root on Ventura and are working on them. They will be fixed in an upcoming release. Many things are fixed in the head of root if anyone wants to build it on their own. I will wait for a new release.

Update 5/8/23 from Rob: The workaround is to install an older root, v6.26.10. I used the root distribution tar.gz file for Mac OS 13.0 and Xcode 14, even though I am running Mac OS 13.3.1. It's working. Today Axel Naumann announced the release of root 6.28/04 with support for gcc 13, Xcode 14 and macOS 13.3 (Ventura). I have asked Kyle to provide an art suite with this root.

Logging in From PC's

Fermilab does not officially support logging in to one of the Mu2e interactive nodes from a PC. We are trying to change this. Your options are:

  • The current recommended option is to use the PuTTY ssh client and the Xming X Windows server. Unofficial but helpful links: CMS doc link link link.
  • Another option is to install Cygwin, which provides a Linux-like environment for Windows. Useful instructions can be found here: cygwin-instructions. You can install kerberized ssh in this environment. Consult with the Service Desk to learn how to configure ssh to work correctly.
  • The service desk may recommend Reflection software. You can run terminals on a remote linux host and display the terminal and other xwindows on your PC, and use ftp to move files.
  • You can install SLF as a guest virtual OS hosted by your Windows machine.
  • Purchase and install WRQ, which is X-Windows software that runs on PCs; configure it to do kerberized ssh. The Service Desk may be able to help with this configuration because they support WRQ but only for Fermilab employees. You need to pay for WRQ and non-employees are not covered by the Fermilab license.
  • You can configure your machine for dual boot and install both Linux and Windows. Boot to Linux when you wish to log in to Fermilab. This is very inconvenient if you need to switch from one OS to the other frequently.

Documentation Supported by other Organizations

The U.S. CMS LHC Physics Center (LPC), based at Fermilab, maintains a page on their wiki with a lot of information on logging in to Fermilab machines from many types of laptops and workstations. They have the resources to keep it much more detailed and much more up to date than we do:

   https://uscms.org/uscms_at_work/physics/computing/getstarted/uaf.shtml

Remote Work

We recommend you install and use, or at least be prepared to use, the lab VPN. For working on the Mu2e central interactive linux machines (code development and job submission), we recommend using a remote desktop, a VNC.

Known restrictions

When working from offsite, the restrictions are a bit higher.

  • Some lab web sites, such as Jenkins and employee services, require you to access the site only through the lab Virtual Private Network (VPN).
  • For direct ssh (no VPN) from a remote linux machine to a Mu2e central interactive server, we have observed a limit of no more than 20 ssh connections in a minute.


VPN

Some web pages at the lab are restricted to viewing only on the lab network. If you are offsite and want to access one of these pages, you can use a VPN, which runs on a gateway node, authenticates you, then redirects your web traffic onto the lab network. Here is the lab VPN (use your services [email] password) and some VPN help. As of 8/2019, the VPN requires multi-factor authentication. There are two forms, a software or RSA form which is good enough for most users, and a yubikey (usb dongle) stronger form for people who need to access personnel and business information.

Most Mu2e people have an RSA App on their phone to generate the "soft token" needed to access Fermilab VPN. If you change to a new phone, you may need to contact the service desk to request a new "verification code" for the app.

If you try to click on a link to a service that requires you to be on site or on VPN, but you are not, then you will see one of two outcomes. Sometimes you get an error message saying that you must be on site or on VPN. An increasing fraction of the time nothing happens: no error message, and your browser does not move from the page it was on. If the latter happens to you, ask a colleague if the page requires on site or VPN.

VNC

General information

VNC stands for Virtual Network Computing. It allows you to run your session on one of the Mu2e interactive machines and display the running session on your local desktop or laptop. VNC runs in a client-server mode, with the client performing only display functions. Therefore, if you disconnect and reconnect later, you will continue to work in the same session. VNC handles X11 compression efficiently and works over the network much faster than a direct SSH connection. It works well on network connections with ping times up to 200 ms, getting into trouble only if TCP packets start getting lost.

VNC server software is installed on the MU2EGPVM* machines. The VNC server runs in user mode: each user starts their own server(s). The local computer only needs a VNC client and a kerberized SSH installation.

Below are a few links to general VNC tutorials:

Setting up a VNC server

To set up a VNC server follow these instructions:

  1. ssh into one of the mu2egpvm machines (like mu2egpvm04)
  2. once logged in, get a Kerberos ticket:
    kinit [username@FNAL.GOV]
  3. create your VNC server password:
    vncpasswd
    
  4. check already running VNC servers:
    murat@mu2egpvm02:~>uname -a; ps -efl | grep -i Xvnc | cut -c-120
    Linux mu2egpvm02.fnal.gov 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 10:32:27 CDT 2020 x86_64 x86_64 x86_64 GNU/Linux
    0 S bvitali   3704     1  0  80   0 - 286048 ep_pol Jul18 ?       02:37:19 /usr/bin/Xvnc :6 -auth /nashome/b/bvitali/.Xa
    0 S murat     3797  3326  0  80   0 - 28204 pipe_w 17:11 pts/57   00:00:00 grep --color=auto -i Xvnc
    0 S mmackenz  3888     1  0  80   0 - 125007 ep_pol Jul31 ?       03:32:25 /usr/bin/Xvnc :23 -auth /nashome/m/mmackenz/.
    0 S defelice  4232     1  0  80   0 - 205764 ep_pol Jul17 ?       03:49:00 /usr/bin/Xvnc :08 -auth /nashome/d/defelice/.
    0 S zanetti  11491     1  0  80   0 - 65755 ep_pol Sep08 ?        00:00:13 /usr/bin/Xvnc :2 -auth /nashome/z/zanetti/.Xa
    0 S mmackenz 14149     1  0  80   0 - 63366 ep_pol Jul17 ?        00:03:11 /usr/bin/Xvnc :1 -auth /nashome/m/mmackenz/.X
    0 S bbarton  18286     1  0  80   0 - 77543 ep_pol Jul17 ?        02:33:33 /usr/bin/Xvnc :5 -auth /nashome/b/bbarton/.Xa
    0 S chenj    21733     1  0  80   0 - 129571 ep_pol Jul28 ?       06:07:57 /usr/bin/Xvnc :9 -auth /nashome/c/chenj/.Xaut
    0 S gianipez 30193     1  0  80   0 - 83661 ep_pol Aug19 ?        00:09:30 /usr/bin/Xvnc :4 -auth /nashome/g/gianipez/.X
    0 S huangs   32386     1  0  80   0 - 92075 ep_pol Aug04 ?        03:27:20 /usr/bin/Xvnc :29 -auth /nashome/h/huangs/.Xa
    

    In this printout the Xserver ID is the field that immediately follows Xvnc; for example in the last line it is 29.

  5. start your VNC server with an unused Xserver ID, for example ID=7. Make sure you start the VNC server before running any Mu2e-specific setups; having those in ~/.bashrc, for example, might cause a problem.
    vncserver :7 -name MY_VNC -depth 24 -geometry 1920x1080 -localhost -bs

    Replace MY_VNC with your kerberos principal. The '-localhost' option is a must: the server should listen only on a local port.
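
Picking an unused Xserver ID from the ps listing can be automated. The sketch below is one possible approach, not an official tool: used_vnc_displays and first_free_vnc_display are hypothetical helper names, and they assume each server shows up in the listing as "Xvnc :N" like in the printout above.

```shell
# Print the display IDs found in `ps` output read from stdin, one per line.
# Leading zeros are dropped, so ":08" is reported as 8.
used_vnc_displays() {
    sed -n 's/.*Xvnc *:0*\([0-9][0-9]*\).*/\1/p' | sort -n | uniq
}

# Print the smallest display ID, counting up from 1, that is not in use.
first_free_vnc_display() {
    local used id
    used=$(used_vnc_displays)
    id=1
    while printf '%s\n' "$used" | grep -qx "$id"; do
        id=$((id + 1))
    done
    echo "$id"
}

# Usage on the server:
#   ps -efl | first_free_vnc_display
```

Another user may start a server between your check and your vncserver command, so treat the result as a suggestion and pick another ID if vncserver refuses to start.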

Setting up a VNC client on Linux

  1. return to your local machine
  2. download a vncviewer. There are many of those, for example TigerVNC, RealVNC, etc. Mac OS has a native vncviewer, but it is slower than the others. TigerVNC is the most performant one; we recommend trying it first.
  3. once you have a VNC viewer installed, start an SSH tunnel on the local machine connecting it to the remote server:
    ssh -v -f -KX -N -L PORT:localhost:PORT <kerberos_username>@mu2egpvm04.fnal.gov
    notes:
    • replace PORT with 5900+ID, and replace mu2egpvm04 with the name of the machine where you started the vncserver.
  4. open the vncviewer and type localhost:ID, where ID is the same as above, then click "connect". On Macs, use 5900+ID instead of ID (see the MacOS section below).
  5. you will be asked to enter the VNC password you created with vncpasswd on the server

That's it! You now have access to your server, which will look like a Linux machine.
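
The PORT = 5900+ID rule is easy to get wrong by hand, so here is a small sketch that computes the port and prints the matching tunnel command without running it. The function names vnc_port and vnc_tunnel_cmd are hypothetical; the ssh flags are copied from the instructions above.

```shell
# PORT = 5900 + ID. expr reads "08" as decimal 8, whereas bash arithmetic
# $((5900 + 08)) would reject "08" as an invalid octal number.
vnc_port() {
    expr 5900 + "$1"
}

# Print (do not run) the tunnel command.
#   $1 = display ID, $2 = short host name, $3 = Kerberos user name
vnc_tunnel_cmd() {
    local port
    port=$(vnc_port "$1")
    echo "ssh -v -f -KX -N -L ${port}:localhost:${port} $3@$2.fnal.gov"
}

# Usage: vnc_tunnel_cmd 7 mu2egpvm04 YOUR_FNAL_USERNAME
```

Printing the command first lets you check the port and host before opening the tunnel.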

  • to use KDE as a window manager (KDE is better than GNOME, but often GNOME comes as the default) configure ~/.Xclients and ~/.vnc/xstartup files on the server side
# ---------------------------- ~/.Xclients  the mode: -rwx--x--x
#!/bin/bash

startkde &

# --------------------------------- ~/.vnc/xstartup :  -rwx--x--x 
#!/bin/sh

[ -r /etc/sysconfig/i18n ] && . /etc/sysconfig/i18n
export LANG
export SYSFONT
vncconfig -iconic &
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
OS=`uname -s`

WINDOWMANAGER=kde

if [ $OS = 'Linux' ]; then
  case "$WINDOWMANAGER" in
    *gnome*)
      if [ -e /etc/SuSE-release ]; then
        PATH=$PATH:/opt/gnome/bin
        export PATH
      fi
      ;;
  esac
fi

export DESKTOP=KDE

if [ -x /etc/X11/xinit/xinitrc ]; then
  exec /etc/X11/xinit/xinitrc
fi
if [ -f /etc/X11/xinit/xinitrc ]; then
  exec sh /etc/X11/xinit/xinitrc
fi
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
xsetroot -solid grey
# xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
kdeinit4 & 
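
The mode -rwx--x--x shown in the two file headers above corresponds to octal 711. A minimal sketch for applying it follows; set_startup_modes is a hypothetical helper name, and a plain chmod 711 on each file does the same thing.

```shell
# Give each existing file the mode -rwx--x--x (octal 711).
# Files that do not exist are skipped silently.
set_startup_modes() {
    local f
    for f in "$@"; do
        if [ -f "$f" ]; then
            chmod 711 "$f"
        fi
    done
}

# Usage on the server:
#   set_startup_modes ~/.Xclients ~/.vnc/xstartup
```

You can confirm the result with ls -l on the two files.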

Setting up a VNC client on Windows

  • install Cygwin in a minimal configuration, you only need the core, krb5 client and openssh
  • copy /etc/krb5.conf from one of the Mu2e interactive nodes to /etc directory of Cygwin installation (requires admin privilege)
 wget http://metrics.fnal.gov/authentication/krb5conf/Linux/krb5.conf
  • create the following .ssh/config file in the home directory
Host *
    GSSAPIAuthentication      yes
    GSSAPIDelegateCredentials no
    ForwardX11Trusted         yes
  • start Cygwin
  • the rest is the same as for Linux

Setting up a VNC client on MacOS

  • mostly, the same as on Linux
  • to setup an ssh tunnel, run
 ssh -v -f -KXY -N -L PORT:localhost:PORT <kerberos_username>@mu2egpvm04.fnal.gov 
  • now start your client (Chicken etc); use your password; localhost:5900+ID
  • when connecting a client, use 5900+ID instead of ID, i.e. localhost:5901 instead of localhost:1

A few useful scripts on mu2egpvm* nodes

  • check VNC servers running on a given machine, in this example mu2egpvm01. Look at several (mu2egpvm01-mu2egpvm06) and choose the one which is least loaded
kinit
ssh YOUR_FNAL_USERNAME@mu2egpvm01.fnal.gov /mu2e/app/users/murat/bin/check_running_vnc_servers
/usr/bin/Xvnc  :2 huangs               
/usr/bin/Xvnc :22 mchattor             
/usr/bin/Xvnc  :3 kharrig              
/usr/bin/Xvnc :99 murat                
/usr/bin/Xvnc  :1 merrill              
  • choose the display number which is not used, for this example display=23 would do. Start VNC server and create password:
ssh YOUR_FNAL_USERNAME@mu2egpvm01.fnal.gov 
/mu2e/app/users/murat/bin/start_vnc_server :23 1920x1080  #  1920x1080 - is the screen size displayed by the server 
vncpasswd
  • kill server running on display :23 :
ssh YOUR_FNAL_USERNAME@mu2egpvm01.fnal.gov /mu2e/app/users/murat/bin/kill_vnc_server :23

Comments

  • you must limit the terminal history scrolling to not more than 10000 lines. Otherwise, the /tmp area on the central machine
will get filled up and the machine will need to be rebooted.

  • Once in a while, you need to reconnect. When restarting the SSH tunnel on the local machine, don't forget to kill the stale SSH tunnel processes.
Issue the following command to find those:
  ps -efl | grep ssh
  • you can add the following function to your .bashrc and use it to restart the SSH tunnel - it automatically finds and kills stale tunnel processes.
it is assumed that the used port numbers on local and remote machines are the same and that the server is running within the fnal.gov domain
#---------------------------------------------------------------------------------------
# to restart SSH port tunneling for VNC server running on mu2egpvm04 and using port 5902: 
# 
# vnc 2 mu2egpvm04
#----------------------------------------------------------------------------------------
function vnc  () {
    display=$1
    host=$2                              # assume "$host.fnal.gov" exists
    port=`printf "59%02i" $display`      # don't forget to check if the port is available
    forwarding=$port:localhost:$port
    kill -9 `ps -efl | grep ssh | grep $forwarding | awk '{print $4}'` ;
    ssh -v -f -X -N -L $forwarding $host.fnal.gov ;
}
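
If you would rather inspect the stale tunnels before killing them, the PID lookup can be split out. This is a sketch under one assumption: the listing comes from "ps -efl" in the Linux layout, where the PID is the 4th field and the command name is the 15th. The name tunnel_pids is hypothetical.

```shell
# Read `ps -efl` output on stdin and print the PIDs of ssh processes
# that forward the given spec, for example 5902:localhost:5902.
# Matching on field 15 keeps grep or awk helper processes out of the result.
tunnel_pids() {
    awk -v fwd="$1" '$15 == "ssh" && index($0, " -L " fwd) { print $4 }'
}

# Usage on the local machine:
#   pids=$(ps -efl | tunnel_pids 5902:localhost:5902)
#   [ -n "$pids" ] && kill $pids
```

Checking that the list is non-empty before calling kill avoids the error message that kill prints when it is given no PIDs.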

Known problems

  • when you restart a SSH tunnel on a client side, check currently running SSH tunnels, kill the stale ones:
[murat@murat01 ~]$ ps -efl | grep ssh | grep "\-L"
1 S murat     344960       1  0  80   0 - 57739 -      Feb12 ?        00:00:28 ssh -v -f -XK -N -L 5999:localhost:5999 mu2egpvm06.fnal.gov
1 S murat     345784       1  0  80   0 - 57745 -      Feb12 ?        00:00:59 ssh -v -f -XK -N -L 5997:localhost:5997 mu2etrk@mu2edaq09.fnal.gov
[murat@murat01 ~]$ kill -9 344960 345784
  • VNC server v1.8.0 crashes on SL7: disable the backing store by using the -bs option when starting the VNC server
see details at https://mu2e-hnews.fnal.gov/HyperNews/Mu2e/get/Sim/744/6/2/2/1/1.html
  • 8/8/19 - Using "-bs" seems to break the root geometry viewer ogl (OpenGL) option - driver upgrade in the end of Sep'2019 fixed that
  • make sure you have the following in ~/.vnc/xstartup : unset DBUS_SESSION_BUS_ADDRESS

Accessing Fermi windows cluster from Linux

rdesktop on SL6, xfreerdp on SL7 (xfreerdp comes from the RPM called 'freerdp'). Command:

xfreerdp fermi-ts -u <username> -d fermi

More info