ComputingLogin: Difference between revisions

From Mu2eWiki
Revision as of 22:18, 7 February 2022

Introduction

Fermilab maintains several machines for interactive login by Mu2e members. Typically all interactive work is done on the mu2evm (GPCF) nodes. This web page describes how to log in to these machines and a few others. We support the bash shell exclusively.


If you cannot find the information you need on this page, please

  1. Remember to check the section on documentation supported by other organizations
  2. Look at the Mu2e Computing Help page which contains links to additional material.


Machines

The interactive login nodes come in these groups:

  • mu2evm.fnal.gov This is a pool of virtual machines within the General Purpose Computing Farm (GPCF), the main interactive computing facility for use by the Intensity Frontier experiments. They are identical virtual quad-core machines running SL7 Linux. A virtual system looks like a single machine but is actually a copy of the operating system that may share the hardware with other copies of the OS. If you log in to mu2evm you will be diverted to one of the machines in the pool, currently mu2egpvm01 through mu2egpvm06. You can also ssh directly to one of these nodes. These machines are intended for developing, compiling and linking jobs and other general purpose uses. You can also run short to moderate jobs here; large jobs should be submitted to the grid. These nodes mount all the disks.
    • mu2egpvm02 has updates live - they follow the repos from the night before
    • all others are delayed - they follow the clone from 30 days ago
  • mu2ebuild01.fnal.gov This is a 16-core bare metal machine set up very similarly to the virtual machines. This machine is only for building code (running scons) and running a few simple tests on the build results. It is explicitly not for running production jobs, analysis, or anything at all that takes more than a few minutes. As soon as anyone uses the machine for running jobs, its purpose as a build machine is defeated.
  • mu2edaq01.fnal.gov This is a 24-core virtual machine, running SL7, dedicated to developing the DAQ and triggers. Special login permission required.
  • mu2eimagegpmv01.fnal.gov A dedicated machine, running Docker server, for building Docker images. Ask offline management for login permissions.
  • fnalu.fnal.gov This is a legacy machine which we have access to, but it is only used for a few very specific purposes.


You can only log in to the above nodes if you have a Fermilab computer account. The only permitted access method is ssh authenticated with a kerberos ticket, as described below.

Monitoring of infrastructure and production runs on mu2egpvm01 (in the mu2epro account). mu2egpvm02 gets updated to the next minor release of Scientific Linux as soon as it is available, but the rest are updated at the next monthly opportunity. The idea is that if something bad happens during an upgrade, we catch it on 02 first.

Logging in from Linux or Mac's

If you are running a Fermilab SL distribution or are using a Fermilab maintained Mac, your /etc/krb5.conf file will be correct. In all other cases update your krb5.conf with the Fermilab recommended krb5.conf; this page requires VPN for access.

From a Unix or Mac system you need to issue the following commands to log into one of the Fermilab interactive machines:

> kinit -l 6d -f
> ssh -AKX -l your_kerberos_principal mu2egpvm01.fnal.gov

An alternate form of the last line is:

> ssh -AKX your_kerberos_principal@mu2egpvm01.fnal.gov


The -f argument to kinit requests a forwardable kerberos ticket; this means that when you are logged in to your target machine you can request services that ask for kerberos authentication. Otherwise you just have permission to get into the machine, not to get further authorization once there. The -K argument to ssh asks ssh to forward (delegate) your forwardable ticket to the remote machine, and -A enables forwarding of the authentication agent connection. The -X argument, which enables X11 forwarding, might or might not be necessary: see the next section.

If your username on your desktop/laptop is the same as your kerberos principal, you may omit the "-l your_kerberos_principal" or the "your_kerberos_principal@".

You can change the default behaviour of your ssh using the file ~/.ssh/config . We recommend adding the following lines:

Host mu2e*
 GSSAPIDelegateCredentials yes
 ForwardAgent yes
 ForwardX11Trusted yes
 ForwardX11 yes

On most systems this will let you drop the explicit -AKX arguments. If your username also matches, the command is reduced to:

> ssh mu2egpvm01.fnal.gov
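
One way to confirm that the ~/.ssh/config settings above are actually picked up is to look at the effective configuration that OpenSSH prints with `ssh -G`. The following is only a sketch: it assumes an OpenSSH client recent enough to support `-G` and built with GSSAPI support, and the function name `missing_ssh_options` is hypothetical. It reads `ssh -G` output on stdin and prints any of the four recommended options that are not reported as "yes".

```shell
# Hypothetical helper: reads "ssh -G <host>" output on stdin and prints the
# recommended options that are not reported as "yes".
# Assumption: ssh -G prints one lowercase "keyword value" pair per line.
missing_ssh_options() (
    effective=$(cat)
    for opt in gssapidelegatecredentials forwardagent forwardx11trusted forwardx11; do
        # -x matches the whole line, so "forwardx11 yes" cannot be satisfied
        # by the "forwardx11trusted yes" line.
        if ! printf '%s\n' "$effective" | grep -qix "$opt yes"; then
            echo "$opt"
        fi
    done
)
```

Typical use would be `ssh -G mu2egpvm01.fnal.gov | missing_ssh_options`; no output means that all four options are enabled for that host.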


Mac OpenGL 3D Error

When using OpenGL-based 3D event displays on macOS, some people get an error message containing the string "cannot load swrast driver". When this occurs no graphics are produced. The specific circumstances in which this error has been observed are:

  • You are working on a Mac laptop or desktop.
  • You are logged into one of the Mu2e linux machines ( for example one of the mu2egpvm machines ).
  • You are running a graphics program on the linux machine that opens a graphics window on your display.
  • Your Mac is using XQuartz as its X11 server and the version is at least 2.7.10.
  • Two cases in which this error has been seen are:

The most general solution is:

  1. Quit XQuartz.
  2. Open a fresh Terminal window and issue the command:
       defaults write org.macosforge.xquartz.X11 enable_iglx -bool true
  3. Restart XQuartz and open a new Terminal window (either from the Applications menu or using the cmd-N keyboard shortcut). (It is different from the default Mac Terminal; it is really an xTerm.)
  4. Use this xTerm window to log in to a mu2egpvmxx machine:
       ssh -KAX userID@mu2egpvmxx.fnal.gov
     where userID is your own user ID.
  5. Your 3D graphics should now work, although you may still see an initial complaint.

In the error message, "swrast" refers to software rasterization. Depending on the details of your computer, you may have a high end graphics card capable of hardware rasterization of 3D graphics. If you do not, then your X11 installation needs to do software rasterization of 3D graphics (which will work but will be slower).

If you have both a low end and a high end graphics card, your Mac normally switches automatically between the two (it uses the low end card when it can in order to save energy). However, there is a bug in the automatic switching that leads to the same error as seen above. In this case a possible solution is to disable automatic switching and to always use the high end card. The instructions for this are:

  1. To see which video cards you have: open the Apple menu and choose “About This Mac”. Click on “System Report”. In the left hand sidebar click on Graphics/Displays. On my machine this shows:
       AMD Radeon R9 M370X
       Intel Iris Pro
     The first is the more powerful video card; the second is the default on-chip video “card”.
  2. If you see only one video card then these instructions are not useful; use the instructions above.
  3. Go to the System Preferences (Gear icon) and choose Energy Saver. At the top is a check box for automatic graphics switching. Uncheck the box.
  4. Log out of the remote linux machine; restart XQuartz; log in again to the remote linux machine and retry.

Note on Mac's

  • MacOS 11, Big Sur, was released Nov 12, 2020. See #Known_Issues_with_MacOS_11.2C_Big_Sur for a description of known issues.
  • To log in to the Mu2e interactive machines at Fermilab, you may need to download and install the Fermilab recommended /etc/krb5.conf (requires VPN for access) and consult the web page with suggestions on how to resolve ssh problems.
    • Any time that you upgrade your OS you must check whether or not the upgrade overwrote /etc/krb5.conf; if necessary restore it from the link above.
  • Starting Wed, Nov 4, 2020, any Mac computer running macOS High Sierra (v10.13) will be blocked from accessing the Fermilab network until the machine is upgraded; all earlier versions of Mac OS are already blocked.
  • Mac OS X users can acquire the X.Org X Window System disk image at xquartz.org.
  • For most general Mac questions the best source of information is the Fermilab mailing list, MACUSERS@listserv.fnal.gov. See [1] to subscribe. The list is archived at: [2] . The lab also has a MAC_ANNOUNCE list that is archived at [3] .
  • The lab goes through a process to determine if a new OS meets the lab’s security baseline; the process typically takes between a few and 6 months. During this time Fermilab owned Macs will not be upgraded to the new OS but personal or university owned Macs with the new OS are permitted at the lab. The most common issue is forgetting to configure /etc/krb5.conf (requires VPN for access). For other issues consult the MACUSERS mailing list, described above.
  • If you use Scientific Linux Fermi (SLF) it is normally configured so that when you use ssh on your laptop to log in to a remote node ( like one of the GPCF nodes), software running on that remote node is permitted to open a new window on your laptop display. If this does not work, add the -X or -Y options to your ssh command:
> ssh -X -AK -l your_kerberos_principal mu2egpvm01.fnal.gov

XQuartz Beta Release 2.8.0

If you install this release, read the instructions carefully. The install requires a reboot but many people miss that. The result is login sessions that hang.

Known Issues with MacOS 11, Big Sur

Update Feb 23, 2021: I have not heard anything about this in a long time. I expect that it has been solved but I have not heard a conclusive answer.

MacOS 11, Big Sur, was released Nov 12, 2020. There is one known issue.

One user has reported that out of the box OpenGL does not work correctly. This affects some of our event displays and geometry browsers and probably some other ROOT based graphics. The ROOT team is aware of the issue and is working with a member of SCD to resolve the issue.

The lab’s Mac support team has said that this is out of the scope of their expertise but they did provide some information:

A few years ago Apple announced that macOS would move away from OpenGL in favor of the Apple created Metal. At that time it was only mentioned that it would be removed in a “future OS”. From memory OpenGL and OpenCL were deprecated with macOS Mojave v10.14. I have seen references online that macOS Catalina does still have it, but it remains stuck at v4.1 (2010) and will throw error messages when building.
 
Looking online, it appears that OpenGL still exists in macOS Big Sur according to this Apple Developer article https://developer.apple.com/documentation/xcode/porting_your_macos_apps_to_apple_silicon
 
From the article:

If your app uses Metal, OpenGL, or OpenCL, be aware of the following differences:
• The GPU and CPU on Apple silicon share memory.
• OpenGL is deprecated, but is available on Apple silicon.
• OpenCL is deprecated, but is available on Apple silicon when targeting the GPU. The OpenCL CPU device is not available to arm64 apps.
 
You may need to research a little more to see what you need to do to include OpenGL with your builds


Mojave

Mojave Security Update 2021-004 broke Kerberos for me

Mojave will be blocked from the Fermilab network starting Oct 22, 2021.

Anaconda

Installing the anaconda python analysis package can interfere with shell operation and kerberos (see hypernews). Here's an overview from experience with Big Sur:

  • it changed the default kinit to a conda version, which fails (gssapi-etc)
  • it switches you to the zsh (your prompt will probably change)
  • Run "which kinit" or "echo $SHELL" to see if that's the problem: the answer should be a system path (/usr/bin/...), not anything with anaconda or conda in it! Adding "conda deactivate" as the last line of your .bashrc doesn't do the trick; you still end up in the zshell. I added "conda deactivate" as the last line of my .zshrc (which it created) and left it in my .bashrc and all was well. There may well be more intelligent ways to do this. You can "conda activate" later.

  • https://uscms.org/uscms_at_work/physics/computing/getstarted/uaf.shtml#MacBigSur has useful information
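
The "which kinit" and "echo $SHELL" checks described above can be bundled into a small shell helper. This is a minimal sketch, not an established Mu2e tool: the function names `is_conda_path` and `check_conda_interference` are hypothetical, and the only assumption is that a conda-provided program lives under a directory whose name contains "conda" (for example anaconda3 or miniconda3).

```shell
# Hypothetical helper: succeeds (exit status 0) if the given path contains
# "conda" (which also covers "anaconda" and "miniconda"), ignoring case.
is_conda_path() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *conda*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Report whether kinit or the login shell resolve to a conda location.
check_conda_interference() {
    kinit_path=$(command -v kinit 2>/dev/null)
    for p in "$kinit_path" "$SHELL"; do
        if is_conda_path "$p"; then
            echo "conda path in use: $p"
        fi
    done
}
```

Running `check_conda_interference` in a fresh terminal prints nothing if neither kinit nor $SHELL points into a conda installation.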

Logging in From PC's

    Fermilab does not officially support logging in to one of the Mu2e interactive nodes from a PC. We are trying to change this. Your options are:

    • The current recommended option is to use the PuTTY ssh client and the Xming X Window server. Unofficial but helpful links: CMS doc link link link.
    • Another option is to install Cygwin, which provides a Linux-like environment for Windows. Useful instructions can be found here cygwin-instructions. You can install kerberized ssh in this environment. Consult with the Service Desk to learn how to configure ssh to work correctly.
    • The service desk may recommend Reflection software. You can run terminals on a remote linux host and display the terminal and other xwindows on your PC, and use ftp to move files.
    • You can install SLF as a guest virtual OS hosted by your Windows machine.
    • Purchase and install WRQ, which is X-Windows software that runs on PCs; configure it to do kerberized ssh. The Service Desk may be able to help with this configuration because they support WRQ but only for Fermilab employees. You need to pay for WRQ and non-employees are not covered by the Fermilab license.
    • You can configure your machine for dual boot and install both Linux and Windows. Boot to Linux when you wish to log in to Fermilab. This is very inconvenient if you need to switch from one OS to the other frequently.

    Documentation Supported by other Organizations

    The U.S. CMS LHC Physics Center (LPC), based at Fermilab, maintains a page on their wiki with a lot of information on logging in to Fermilab machines from many types of laptops and workstations. They have the resources to keep it much more detailed and much more up to date than we do:

       https://uscms.org/uscms_at_work/physics/computing/getstarted/uaf.shtml
    

    Remote Work

    We recommend you install and use, or at least be prepared to use, the lab VPN. For working on the Mu2e central interactive linux machines (code development and job submission), we recommend using a remote desktop, a VNC.

    Known restrictions

    When working from offsite, the restrictions are somewhat tighter.

    • Some lab web sites, such as Jenkins and employee services, require you to access the site only through the lab Virtual Private Network (VPN).
    • For direct ssh (no VPN) from a remote linux machine to a Mu2e central interactive server, we have observed a limit of no more than 20 ssh connections in a minute.


    VPN

    Some web pages at the lab are restricted to viewing only on the lab network. If you are offsite and want to access one of these pages, you can use a VPN, which runs on a gateway node, authenticates you, then redirects your web traffic onto the lab network. Here is the lab VPN (use your services [email] password) and some VPN help. As of 8/2019, the VPN requires multi-factor authentication. There are two forms, a software or RSA form which is good enough for most users, and a yubikey (usb dongle) stronger form for people who need to access personnel and business information.

    Most Mu2e people have an RSA App on their phone to generate the "soft token" needed to access Fermilab VPN. If you change to a new phone, you may need to contact the service desk to request a new "verification code" for the app.

    VNC

    General information

    VNC stands for Virtual Network Computing. It allows you to run your session on one of the Mu2e interactive machines and display it on your local desktop or laptop. VNC runs in a client-server mode, with the client performing only display functions. Therefore, if you disconnect and reconnect later, you will continue to work in the same session. VNC handles X11 compression efficiently and works over the network much faster than a direct SSH connection. It works well on network connections with ping times up to 200 ms, running into trouble only if TCP packets start getting lost.

    VNC server software is installed on the MU2EGPVM* machines. The VNC server runs in user mode: each user starts their own server(s). The local computer only needs a VNC client and a kerberized SSH installation.

    Below are a few links to general VNC tutorials:

    Setting up a VNC server

    To set up a VNC server follow these instructions:

    1. ssh into one of the mu2egpvm machines (like mu2egpvm04)
    2. once logged in, do:
      kinit                  # get Kerberos ticket
      vncpasswd              # create your VNC server password
      
    3. check already running VNC servers:
      murat@mu2egpvm02:~>uname -a; ps -efl | grep -i Xvnc | cut -c-120
      Linux mu2egpvm02.fnal.gov 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 10:32:27 CDT 2020 x86_64 x86_64 x86_64 GNU/Linux
      0 S bvitali   3704     1  0  80   0 - 286048 ep_pol Jul18 ?       02:37:19 /usr/bin/Xvnc :6 -auth /nashome/b/bvitali/.Xa
      0 S murat     3797  3326  0  80   0 - 28204 pipe_w 17:11 pts/57   00:00:00 grep --color=auto -i Xvnc
      0 S mmackenz  3888     1  0  80   0 - 125007 ep_pol Jul31 ?       03:32:25 /usr/bin/Xvnc :23 -auth /nashome/m/mmackenz/.
      0 S defelice  4232     1  0  80   0 - 205764 ep_pol Jul17 ?       03:49:00 /usr/bin/Xvnc :08 -auth /nashome/d/defelice/.
      0 S zanetti  11491     1  0  80   0 - 65755 ep_pol Sep08 ?        00:00:13 /usr/bin/Xvnc :2 -auth /nashome/z/zanetti/.Xa
      0 S mmackenz 14149     1  0  80   0 - 63366 ep_pol Jul17 ?        00:03:11 /usr/bin/Xvnc :1 -auth /nashome/m/mmackenz/.X
      0 S bbarton  18286     1  0  80   0 - 77543 ep_pol Jul17 ?        02:33:33 /usr/bin/Xvnc :5 -auth /nashome/b/bbarton/.Xa
      0 S chenj    21733     1  0  80   0 - 129571 ep_pol Jul28 ?       06:07:57 /usr/bin/Xvnc :9 -auth /nashome/c/chenj/.Xaut
      0 S gianipez 30193     1  0  80   0 - 83661 ep_pol Aug19 ?        00:09:30 /usr/bin/Xvnc :4 -auth /nashome/g/gianipez/.X
      0 S huangs   32386     1  0  80   0 - 92075 ep_pol Aug04 ?        03:27:20 /usr/bin/Xvnc :29 -auth /nashome/h/huangs/.Xa
      
    4. start your VNC server with an unused Xserver ID, for example, use ID=7:
      (make sure you start the VNC server before running any Mu2e-specific setups; having those in ~/.bashrc, for example, might result in a problem!)
      vncserver :7 -name MY_VNC -depth 24 -geometry 1920x1080 -localhost -bs
      

      Using the '-localhost' option is a must: the server should be listening only on a local port
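
Picking an unused Xserver ID from the `ps` listing can also be done mechanically. The sketch below is an illustration only: the function names `used_vnc_displays` and `first_free_display` are hypothetical, and it assumes that each running server appears in the `ps -efl` output as "Xvnc :N", as in the listing above.

```shell
# Hypothetical helper: reads "ps -efl" style text on stdin and prints the
# display IDs used by Xvnc, one per line, sorted, with leading zeros removed
# (":08" becomes 8). The "grep ... Xvnc" process itself has no ":N" after
# "Xvnc" and is therefore ignored.
used_vnc_displays() {
    grep -o 'Xvnc *:[0-9][0-9]*' \
        | sed -e 's/.*://' -e 's/^0*\([0-9]\)/\1/' \
        | sort -n -u
}

# Hypothetical helper: reads used display IDs on stdin and prints the lowest
# ID, starting from $1 (default 1), that is not in the list.
first_free_display() (
    candidate=${1:-1}
    used=$(cat)
    while printf '%s\n' "$used" | grep -qx "$candidate"; do
        candidate=$((candidate + 1))
    done
    echo "$candidate"
)
```

On a login node this could be used as `ps -efl | used_vnc_displays | first_free_display`, and the printed number passed to vncserver as the ID. Another user may start a server between the check and your vncserver command, so treat the result as a suggestion rather than a reservation.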

    Setting up a VNC client on Linux

    1. return to your local machine
    2. download a vncviewer. There are many of those, for example TigerVNC, RealVNC, etc. Mac OS has a native vncviewer, but it is slower than the others. TigerVNC is the most performant one; we recommend trying it first.
    3. once you have a VNC viewer installed, on the local machine, start an SSH tunnel connecting the local machine to the remote server:
      ssh -v -f -KX -N -L PORT:localhost:PORT <kerberos_username>@mu2egpvm04.fnal.gov
      notes:
      • replace PORT with 5900+ID, and replace mu2egpvm04 with the name of the machine where you started the vncserver.
    4. open the vncviewer and type: localhost:ID, where ID is the same as above, then click "connect" (on Macs, see below: this should be 5900+ID!)
    5. you will be asked to enter the password you created with vncpasswd when setting up the server

    That's it! You now have access to your server, which will look like a linux machine.
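
The PORT = 5900+ID rule for the tunnel can be written down as a small helper. This is a sketch under stated assumptions: the function names `vnc_port` and `vnc_tunnel_command` are hypothetical, the host is assumed to be in the fnal.gov domain, and the second function only prints the ssh command so that it can be checked before it is run.

```shell
# Hypothetical helper: print the TCP port for VNC display ID $1 (5900 + ID).
vnc_port() {
    # Strip leading zeros so that an ID such as 08 is not read as octal.
    id=$(printf '%s' "$1" | sed 's/^0*\([0-9]\)/\1/')
    echo $((5900 + id))
}

# Hypothetical helper: print (do not run) the tunnel command for
# display ID $1 on host $2 for kerberos user $3.
vnc_tunnel_command() {
    port=$(vnc_port "$1")
    echo "ssh -v -f -KX -N -L $port:localhost:$port $3@$2.fnal.gov"
}
```

For example, `vnc_tunnel_command 7 mu2egpvm04 <kerberos_username>` prints the tunnel command for the server started with ID=7, using port 5907.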

    Setting up a VNC client on Windows

    • install Cygwin in a minimal configuration, you only need the core, krb5 client and openssh
    • copy /etc/krb5.conf from one of the Mu2e interactive nodes to /etc directory of Cygwin installation (requires admin privilege)
     wget http://metrics.fnal.gov/authentication/krb5conf/Linux/krb5.conf
    
    • create the following .ssh/config file in the home directory
    Host *
        GSSAPIAuthentication      yes
        GSSAPIDelegateCredentials no
        ForwardX11Trusted         yes
    
    • start Cygwin
    • the rest is the same as for Linux

    Setting up a VNC client on MacOS

    • mostly, the same as on Linux
    • to set up an ssh tunnel, run
     ssh -v -f -KXY -N -L PORT:localhost:PORT <kerberos_username>@mu2egpvm04.fnal.gov 
    
    • now start your client (Chicken etc); use your password; localhost:5900+ID
    • when connecting a client, use 5900+ID instead of ID, i.e. localhost:5901 instead of localhost:1

    Few useful scripts on mu2egpvm* nodes

    • check VNC servers running on a given machine, in this example mu2egpvm01. Look at several (mu2egpvm01-mu2egpvm06) and choose the one which is least loaded
    kinit
    ssh YOUR_FNAL_USERNAME@mu2egpvm01.fnal.gov /mu2e/app/users/murat/bin/check_running_vnc_servers
    /usr/bin/Xvnc  :2 huangs               
    /usr/bin/Xvnc :22 mchattor             
    /usr/bin/Xvnc  :3 kharrig              
    /usr/bin/Xvnc :99 murat                
    /usr/bin/Xvnc  :1 merrill              
    
    • choose the display number which is not used, for this example display=23 would do. Start VNC server and create password:
    ssh YOUR_FNAL_USERNAME@mu2egpvm01.fnal.gov 
    /mu2e/app/users/murat/bin/start_vnc_server :23 1920x1080  #  1920x1080 - is the screen size displayed by the server 
    vncpasswd
    
    • kill server running on display :23 :
    ssh YOUR_FNAL_USERNAME@mu2egpvm01.fnal.gov /mu2e/app/users/murat/bin/kill_vnc_server :23
    

    Comments

    • you must limit the terminal history scrolling to not more than 10000 lines. Otherwise, the /tmp area on the central machine
    will get filled up and the machine will need to be rebooted.

    • Once in a while, you need to reconnect. When restarting the SSH tunnel on the local machine, don't forget to kill the stale SSH tunnel processes.
    Issue the following command to find those:
      ps -efl | grep ssh
    
    • you can add the following function to your .bashrc and use it to restart the SSH tunnel - it automatically finds and kills stale tunnel processes.
    it is assumed that the used port numbers on local and remote machines are the same and that the server is running within the fnal.gov domain
    #---------------------------------------------------------------------------------------
    # to restart SSH port tunneling for VNC server running on mu2egpvm04 and using port 5902: 
    # 
    # vnc 2 mu2egpvm04
    #----------------------------------------------------------------------------------------
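    function vnc  () {
        local display=$1
        local host=$2                            # assume "$host.fnal.gov" exists
        # VNC port = 5900 + display ID; the 10# prefix keeps IDs such as 08 from being parsed as octal
        local port=$((5900 + 10#$display))       # don't forget to check if the port is available
        local forwarding=$port:localhost:$port
        # find stale tunnel processes for this forwarding (PID is column 4 of ps -efl)
        local pids=$(ps -efl | grep ssh | grep "$forwarding" | awk '{print $4}')
        # kill only if something was found; a bare "kill -9" with no PID is an error
        if [ -n "$pids" ]; then kill -9 $pids; fi
        ssh -v -f -X -N -L "$forwarding" "$host.fnal.gov"
    }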
    

    Known problems

    • when you restart a SSH tunnel on a client side, check currently running SSH tunnels, kill the stale ones:
    [murat@murat01 ~]$ ps -efl | grep ssh | grep "\-L"
    1 S murat     344960       1  0  80   0 - 57739 -      Feb12 ?        00:00:28 ssh -v -f -XK -N -L 5999:localhost:5999 mu2egpvm06.fnal.gov
    1 S murat     345784       1  0  80   0 - 57745 -      Feb12 ?        00:00:59 ssh -v -f -XK -N -L 5997:localhost:5997 mu2etrk@mu2edaq09.fnal.gov
    [murat@murat01 ~]$ kill -9 344960 345784
    
    • VNC server v1.8.0 crashes on SL7 : disable back store - use -bs option when starting the VNC server
    see details at https://mu2e-hnews.fnal.gov/HyperNews/Mu2e/get/Sim/744/6/2/2/1/1.html
    • 8/8/19 - Using "-bs" seems to break the root geometry viewer ogl (OpenGL) option - driver upgrade in the end of Sep'2019 fixed that
    • make sure you have the following in ~/.vnc/xstartup : unset DBUS_SESSION_BUS_ADDRESS
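
For the stale tunnel check described in the first item above, the PIDs can be extracted from the `ps -efl` output instead of being read off by eye. The following is a minimal sketch: the function name `tunnel_pids` is hypothetical, and it assumes that local and remote ports are the same and that the PID is the 4th column of the listing, as in the example output above.

```shell
# Hypothetical helper: reads "ps -efl" output on stdin and prints the PIDs
# (4th column) of ssh processes that forward local port $1 with -L.
tunnel_pids() {
    awk -v fwd="$1:localhost:$1" '
        {
            is_ssh = 0; has_L = 0; has_fwd = 0
            for (i = 1; i <= NF; i++) {
                if ($i ~ /(^|\/)ssh$/) is_ssh = 1   # "ssh" or ".../ssh"
                if ($i == "-L")        has_L = 1
                if ($i == fwd)         has_fwd = 1
            }
            if (is_ssh && has_L && has_fwd) print $4
        }'
}
```

`ps -efl | tunnel_pids 5999` lists the candidate processes; it is worth looking at the list before passing it to kill.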

    Accessing Fermi windows cluster from Linux

    rdesktop on SL6, xfreerdp on SL7 (xfreerdp comes from the RPM called 'freerdp'). Command:

    xfreerdp fermi-ts -u <username> -d fermi
    

    More info