Difference between revisions of "ComputingManagement"

From Mu2eWiki
Line 12: Line 12:
 
* FIFE roadmap  [https://indico.fnal.gov/event/15555/ 12/2017] [https://indico.fnal.gov/event/19031/ 1/2019] [https://indico.fnal.gov/event/20911/ 7/2019] [https://indico.fnal.gov/event/50150/overview 2021]
 
 
* [https://indico.fnal.gov/event/16923/ Future computing workshop]
 
* [https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=30167 doc-30167 tape/disk cost guidelines] The "tape support costs" capture the prorated capital costs of libraries (aka robots), drives, and the nodes that host the drives.  We pay this on incremental tape purchases each year.  The "slot charges" capture the cost of yearly maintenance, power, cooling, etc.  They are charged on the integral (total) tape that we have in the system.
 
* [http://www-enstore.fnal.gov/ISA/PNFS_Auth_list.html who can authorize] new /pnfs mounts
 
* [http://cd-docdb.fnal.gov/cgi-bin/RetrieveFile?docid=5644 IF security guidelines]
 
* [https://fermipoint.fnal.gov/organization/cs/scd/Lists/Experiment%20and%20Scientific%20Collaboration%20Liaison%20Li/AllItems.aspx computing liaisons ] to all experiments
 
 
* [https://fermi.servicenowservices.com/nav_to.do?uri=%2Fhome.do%3Fsysparm_view%3Dscd_outage_calendar SCD downtime schedule]
 
* Rob, Andrei and Ray can log into root@if-admin-mu2e (from *.fnal.gov)
 
** to chown directories on pnfs (needed for some personal file upload) - see $HOME/bin/mu2e_pnfs_useradd.sh
 
  [root@if-admin-mu2e tape]# cd /pnfs/mu2e/tape
 
  [root@if-admin-mu2e tape]# for i in usr-{etc/bck,nts/nts,sim/{sim,dig}}; do echo $i; mkdir $i/sophie; chown 54683:9914 $i/sophie; done
 
 
** to delete any users' files
 
** to find largest users
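
The mkdir/chown loop above can be previewed before running it as root. A minimal dry-run sketch, assuming plain POSIX sh: the brace expansion is written out as the four directories it expands to, and the function name plan_user_dirs is hypothetical. It only prints commands and does not touch /pnfs.

```shell
# Dry run: print (do not execute) the mkdir/chown commands for one user.
# plan_user_dirs is a hypothetical helper, not an existing admin tool.
plan_user_dirs() {
    user=$1    # directory name to create, e.g. sophie
    owner=$2   # numeric uid:gid (colon form), e.g. 54683:9914
    # usr-{etc/bck,nts/nts,sim/{sim,dig}} expands to these four paths
    for d in usr-etc/bck usr-nts/nts usr-sim/sim usr-sim/dig; do
        printf 'mkdir %s/%s && chown %s %s/%s\n' "$d" "$user" "$owner" "$d" "$user"
    done
}

plan_user_dirs sophie 54683:9914
```

Review the printed lines, then run them from /pnfs/mu2e/tape in the root shell if they look right.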
 
* new disk admin system
 
** to start, ssh mu2e-admin@mu2egpvm01, then you can issue the following commands
 
** ifadmin-check-container - See if the container is up and healthy
 
 
** ifadmin-command - Run a single command in the container, for example:
 
[mu2e-admin@mu2egpvm01 bin]$ ifadmin-command "whoami"
 
** ifadmin-connect - Open a shell as root in the container.  This gives you command-line access, as root, to the NFS mounts.
 
** ifadmin-rebuild-container - Rebuild the container if anything seems wrong.  This should restart the container as well.
 
** ifadmin-start-container - Start the container if, for some reason, it is stopped.
 
** ifadmin-stop-container - Stop the container.
 
** IMPORTANT NOTE: DO NOT TRY TO RUN CRONS INSIDE THE CONTAINER.  Instead, run them from cron on the host gpvm node, as the {experiment}-admin user.  Put your cron job into /{experiment}/app, and call it using ifadmin-command "/{experiment}/app/your_cron" from the admin user's crontab on the host interactive node (gpvm node), edited via crontab -e.
 
** The container should have your experiment password file as well, so you can 'su' to any user as needed.
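
The cron note above amounts to one crontab line in the admin user's crontab on the host node. A small sketch that prints such a line: the helper name cron_line and the 03:15 daily schedule are assumptions for illustration, and your_cron is the placeholder name from the note. Whether ifadmin-command is on cron's PATH is not known here, so a full path may be needed.

```shell
# Print a crontab line that runs a script from /{experiment}/app in the
# container via ifadmin-command. cron_line is a hypothetical helper.
cron_line() {
    schedule=$1     # five cron time fields, e.g. "15 3 * * *"
    experiment=$2   # e.g. mu2e
    script=$3       # file name under /{experiment}/app
    printf '%s ifadmin-command "/%s/app/%s"\n' "$schedule" "$experiment" "$script"
}

cron_line "15 3 * * *" mu2e your_cron
```

Paste the printed line into the editor opened by crontab -e as the admin user on the gpvm node, not inside the container.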
 
* Docker and Singularity build
 
** Rob, Andrei and Ray can [https://ssiwiki.fnal.gov/wiki/Container_Build_Service_Home#Adding_users log into] repoadmin@mu2eimagegpvm01 (docker build machine) to add users
 
** mailing list: container-build-users and containers@listserv.fnal.gov
 
 
* To view password and group files on an interactive node:
 
 
 
** getent passwd [<username>]
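
A getent passwd entry is a single colon-separated line, so individual fields such as the numeric UID and GID can be pulled out with cut. A minimal sketch: the helper name passwd_field is hypothetical, and the sample entry is made up for illustration (the home directory and full name are not real values).

```shell
# Extract one colon-separated field from a passwd-format line.
# Standard passwd fields: 1=name, 3=UID, 4=GID, 6=home directory, 7=shell.
passwd_field() {
    printf '%s\n' "$1" | cut -d: -f"$2"
}

# On an interactive node this would typically be used as:
#   passwd_field "$(getent passwd "$USER")" 3    # numeric UID
# Hypothetical sample entry, for illustration only:
entry='sophie:x:54683:9914:Example User:/home/sophie:/bin/bash'
passwd_field "$entry" 3   # UID field
passwd_field "$entry" 4   # GID field
```

Group entries can be listed the same way with getent group [<groupname>].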
Line 48: Line 22:
 
* [http://www-giduid.fnal.gov/cd/FUE/uidgid/gid.lis Fermilab GID List (name to number mapping)]
 
 
* [https://fermi.servicenowservices.com/wp/?id=evg_sc_cat_item&sys_id=97be09036f276d005232ce026e3ee435 Add UID/GID to existing User]
 
* web sites
 
** [https://fermi.servicenowservices.com/nav_to.do?uri=%2Fcom.glideapp.servicecatalog_cat_item_view.do%3Fsysparm_id%3Dc0be71ed319cd980f9e6279cd360d4eb%26amp;sysparm_service%3D148bd88b6f54354032544d1fde3ee420%26amp;sysparm_affiliation%3D Add users to mu2e.fnal.gov web site] (Rob and ??? - requires a human at the service desk to respond)
 
** [https://mu2e.fnal.gov/atwork/general/webinfo/intro.shtml static site intro]
 
** [https://metrics.fnal.gov/cws/apache.html#fullreport web site metrics]
 
** [https://cd-docdb.fnal.gov/cgi-bin/ShowDocument?docid=5372 CS-5372] web site owner's manual
 
** [https://directorate-docdb.fnal.gov/cgi-bin/RetrieveFile?docid=31 dir 31] web site governance (wiki moderate, internalwiki low)
 
** [https://toolsdev.fnal.gov/wiki-user-role-management/ edit wiki user list] (Rob and Ray - 1h propagation delay)
 
** mu2ewiki and mu2einternalwiki go-live requests are renewed every year - a request is sent to Ray
 
 
* Mu2e mailing list for sites like Docker: mu2e-production@fnal.gov
 
 
* [[CodeDevelopment]]
 
Line 61: Line 27:
 
* [[GitDirExample]] for what might go into each git subdirectory, as time-dependent documentation
 
 
* pages with dead links to retired static pages [[ComputingTutorials]] [[ComputingLogin]] [[Geometry]]
 
* files on scisoft.fnal.gov can be corrected by logging into scisoftportal.fnal.gov and editing /nasroot/SciSoft/packages
 
 
* [https://ssiwiki.fnal.gov/wiki/Interactive_Server_Facility GPCF], host of gpvm machines
 
* edit Fermigrid subgroup quotas: servicedesk (not classic view), search "Modify Quota on Fermigrid" or in classic view "Batch job Management" and "modify quota"
 
 
* [[DatabaseList]]
 
 
* Notes
 

Revision as of 19:43, 24 May 2022

To do lists for mu2e offline (password protected)

Other links:

  • interactive node notes
  • Testing that edits work on the exported clone.
  • TestPage - a page to experiment with formatting