ComputingTutorials: Difference between revisions

From Mu2eWiki
No edit summary
 
(63 intermediate revisions by 5 users not shown)
Line 1: Line 1:
==Introduction==
This page is intended for physicists who are just starting to work in the mu2e computing environment.  The following is a broad overview of the major components, their terminology, and how you might use them.  Following each intro paragraph, there are links to more specific tutorials and to the rest of the mu2e documentation, which is more terse and is intended as a reference once you get through the introductory material.
You probably don't have to work through this entire page and you can stop at any point; please talk to your adviser or mentor to see what's appropriate. The material you are most likely to need comes first, followed by more in-depth tutorials for people who will be spending years on mu2e and will learn to do more complex work.
From time to time we will hold in person tutorials.  See
[https://mu2einternalwiki.fnal.gov/wiki/In_Person_Tutorials In Person Tutorials] on the Mu2e internal wiki.
==Reporting Errors or Ambiguities==
If you find errors or ambiguities in these tutorials, including the accompanying written material, please report them using the issue tracker on the Mu2e/Tutorial GitHub page: https://github.com/Mu2e/Tutorial .  The issue button is second from the left on the top of the page.  To report an issue you need to [[GitHubWorkflow#Make_your_Own_GitHub_Account_and_Join_the_Mu2e_GitHub_Organization | join the Mu2e GitHub Organization]].
==Prerequisites==
Have you completed the Mu2e [[Day 1 CheckList]]?  If not, please do so.


If you have not already done so, please sign up for these slack channels:
* tutorial-questions - ask questions about the tutorials here
* computing_and_software - the announcements and general discussion list
* is_it_me_or_a_bug - ask questions about anything except the tutorials.


==Introduction==
In this tutorial we will assume you are familiar with the topics below. They are not hard prerequisites.  You don't need to master this material before you start but we recommend that you skim it and learn where to look up information when you need it:
This page is intended for physicists who are just starting to work in the mu2e computing environment.  The following is a broad overview of the major components, their terminology, and how you might use them.  There are links into the rest of the mu2e documentation which is more terse and is intended as a reference once you get through the introductory material.
* The [[LearnAboutMu2e|Mu2e detector]]
** Have a basic idea of the Tracker, Calorimeter and Cosmic Ray Veto System (CRV)
* [https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=1120 Units and Coordinate Systems Used by Mu2e Offline]
* [[Computing Concepts|Concepts needed to understand Mu2e computing]]
** Understand these terms: On-spill, off-spill, Event, EventId, Digi, Hit, Track, Cluster, Reconstruction, Simulation
* [[UPS to Spack Transition]]
* bash shell commands
** [[Shells#Beginner_Cheat_Sheet | Beginner Cheat Sheet]]
** [[LinuxFAQ|Linux FAQ]] - includes links to suggested references
* The basics of C++.  See [[CppFAQ#What_C.2B.2B_References_are_Recommended.3F | suggested C++ references]].
 
 
When you do these exercises you will start on your local computer and log in to one of the Mu2e interactive machines, which are sometimes called the "central machines".
The software you will use is on these machines and you will do your work on these machines.  You will be working at a terminal window and typing commands at a prompt.  The editors available on these machines are: vi, vim, emacs and nedit.  If this way of working is not familiar to you, contact a colleague to help get you started.
 
==A Convention You will See==


We suggest reading through this page, then taking a look at the "your workflow" notes at the end, to decide what to go back to and understand in more detail.
You will sometimes see an instruction that says something like: "To check that this step worked correctly, do"
> echo $PRODUCTS
  /cvmfs/fermilab.opensciencegrid.org/products/common/db
The right arrow character (>) represents the shell prompt and you should type everything to the right of that character at the prompt in your working shell.  The expected output of that command is shown on the following lines.  There may be many lines of output.  If the output that you see is the same as shown in the example, then everything is working.  If the example output contains a version number or a timestamp, it's OK if your output has different values, unless, of course, the example is there explicitly to check the version number.


==Interactive logins==
Collaborators can do interactive computing work in several places.  Probably the best place to start is the collaboration's interactive machines at Fermilab.  These are a set of virtual machines running SL6 (a variant of linux).  The disks that the user sees are located on specialized disk server hardware and the same disks are mounted by all the interactive machines. There are five quad-core machines named mu2egpvm01.fnal.gov through mu2egpvm05.fnal.gov.  You will have an account and a home area here (the same home area is on all machines) and some disk space for data.  We prefer using the bash shell for all purposes.


Collaborators can also compile and run mu2e code on their linux desktops or laptops. The code should be manageable on Mac's unix layer, and maybe even Windows, but we have not pursued these options yet.  Finally, several of the collaboration institutions have set up working areas at their home institution's linux systems. For these options, someone would need to mount the distributed code disk or copy the code distribution and supporting software tools to the local machine.  It's best to discuss this with your group leaders and mu2e mentors first.  Generally we recommend working on the collaboration interactive nodes unless the networking from your institution makes that prohibitive.
Mu2e has 6 machines that you can use for the tutorials: mu2egpvm01 ... mu2egpvm06. For purposes of this tutorial these machines are identical and they all mount the same disks, including your home disk.  You can start on one machine and restart on another.  Pick one and log in.  Last I checked, Fermilab security policy broke the automatic load balancer so we need to load balance by hand.  If your machine is slow for more than a few minutes, try another.


When ready, you can read more about [[ComputingLogin|logging into interactive machines]], the [[Disks|disks]], the [[Shells|bash shell]], and how to get the code [[CodeDistribution|distribution]] from the code disk, or copied to a desktop, laptop or remote institution.
You log in to the mu2e interactive machines using your kerberos principal, "your_username@FNAL.GOV".  If you need a refresher on kerberos vs SSO, revisit [[Day_1_CheckList#Computer_Accounts | the Day 1 Checklist]].


==Authentication==
* To log in, follow the instructions at: [[LoginTutorial]].
* Work through the rest of the page to set up your login scripts, and explore the disks that are available.
* If you find ambiguities or errors in the above, let us know on the tutorial-questions channel.


You log in to the virtual machines with '''kerberos''' authentication.  You will need a permanent ID called a kerberos "principal" which looks like "xyz@FNAL.GOV", where xyz is your username.  (You will have one username for all computing purposes at the lab.)  You will have a password associated with your principal.  You will use this principal and password to log into the various personal linux desktops located at Fermilab or to ssh into the collaboration interactive machines from your home institution.  You typically refresh your kerberos authentication every day.
Here are some references to know about.  You don't need to master them now.
* [[ComputingLogin]] - lots of hints on resolving issues with logging in.
* [[Shells|bash shell]] -
* [[CodeEnvironment]]
* [[Disks|disks]]


The second identity you will need is the '''services''' principal, which looks like xyz@services.fnal.gov, or often just xyz, and also has a password (different from your kerberos password).  You will need this identity to log into Fermilab email, the servicedesk web site and some other services based at the lab.  You would typically only use this authentication at the point you log into the service.


The third identity you will need is a '''CILogin certificate'''.  This "cert" is the basis of authentication to the mu2e documents database, the computing farms, and a few other services.  You will use this cert in two ways.  The first way is to load it into your browser, which then gives you access to web pages and web services.  The second is by using your kerberos authentication to access a copy of your certificate maintained in a remote database.  You get this certificate once and then renew it only once a year.
==[[NtupleTutorial|Ntuple]]==
The data from the detector, and the reconstructed data, is stored in files in '''art''' format.  Accessing this data generally requires compiling code and learning a special configuration language, so we will save that for a later tutorial. To simplify and speed up access to the data, we often run a program to copy a small part of the data in the art file into a convenient format called a root ntuple (pronounced "en-tuple").  This format is easy to browse interactively and to make histograms from.  The ntuple file may contain histograms that were already made, or a list of the tracks in each event along with interesting quantities, such as the number of hits on the track or its reconstructed momentum.


hypernews is an archived blog and email list - for access here, you will need a hypernews password and your services password!
Tutorial:
* [[NtupleTutorial]] is the tutorial for this section and will guide you through making plots with one of the Mu2e-specific ntuples that are available.


Finally, the mu2e internal web pages require a collaboration username and password; please ask your mu2e mentor.
Other useful pages:
* [https://root.cern.ch/ ROOT] is a very useful resource. In particular: [https://root.cern.ch/getting-started getting started] and [https://root.cern.ch/guides/reference-guide code reference]
* Overview of existing [[Ntuples|mu2e ntuples]]


If you need to do any computing in mu2e, please go ahead and start the procedure on the [[ComputingAccounts]] to create your accounts and authentication.
==TrkAnaTutorial==
[[TrkAna]] is one of the mu2e ntuples.


==Code==
Tutorial:
===Events===
* The [https://github.com/Mu2e/TrkAna/blob/main/tutorial/README.md TrkAnaTutorial] is hosted on GitHub
Very briefly, the experiment is driven by a short burst of millions of protons hitting the primary target every 1695 ns.  This cycle is called a '''microbunch'''.   After the burst interacts, outgoing muons migrate to, and come to rest in, the stopping target. After this surge of particles dies down during the first part of the microbunch, the detector observes what happens to the stopped muons during the second part of the microbunch.  The data recorded during this ~900 ns is written out (if it passes the trigger) in a data structure called an event.  Many events can be written in one file.  Events have unique identifying numbers.  Short periods of data-taking (minutes) are grouped into subruns with a unique ID number, and longer periods of stable running (a few hours) will be grouped into a run with unique ID number. 
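The numbers and identifiers in the paragraph above can be made concrete with a small sketch.  This is only an illustration in Python, not Mu2e code: the <code>EventId</code> class and the assumption that an event is identified by the triple (run, subrun, event) are hypothetical names chosen here, and the 900 ns live window is the approximate value quoted above.

```python
from dataclasses import dataclass

MICROBUNCH_PERIOD_NS = 1695.0  # period quoted in the text above
LIVE_WINDOW_NS = 900.0         # approximate recording window quoted above


@dataclass(frozen=True, order=True)
class EventId:
    """Hypothetical identifier: a run contains subruns, a subrun contains events."""
    run: int
    subrun: int
    event: int


def live_fraction(live_ns=LIVE_WINDOW_NS, period_ns=MICROBUNCH_PERIOD_NS):
    """Fraction of each microbunch during which data is recorded."""
    return live_ns / period_ns


if __name__ == "__main__":
    ids = [EventId(2, 0, 1), EventId(1, 3, 7), EventId(1, 3, 2)]
    # Sorting compares the run first, then the subrun, then the event number.
    print(sorted(ids))
    print(f"live fraction is about {live_fraction():.2f}")
```

With these numbers, roughly half of each microbunch is spent recording data.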


===Sim and Reco===
==Geometry Browser==
Since we do not have data yet, we analyze '''simulation''' or '''sim''' events which are based on our expectation of what data will look like.  We draw randomly from expected and potential physics processes and then trace the particles through the detector, and write out events.  The interaction of particles in the detector materials is simulated with the geant software package. The simulated data looks like the real data will, except that it also contains the truth of what happened in the interactions.
It is often useful to look at the detector as it is implemented in the simulation either to debug (e.g. double-check the geometry is as you expect) or to get images for presentations.


The output of simulation would typically be data events in the '''raw''' formats that we will see produced by the detector readout.  These are typically ADC values indicating energy deposited, or TDC values indicating the time of an energy deposit.  In the '''reconstruction''' or '''reco''' process, we run this raw data through modules that analyze the raw data and look for patterns that can be identified as evidence of particular particles.  For example, hits in the tracker are reconstructed into individual particle paths, and energy in the calorimeter crystals is clustered into showers caused by individual electrons.
Tutorial:
Exactly how a reconstruction module does its work is called its '''algorithm'''.  A lot of the work of physicists is invested in these algorithms, because they are fundamental to the quality of the experimental results.
* [[GeometryBrowserTutorial2019 | Geometry browser tutorial, revised for June 2019]]
* [[GeometryBrowserTutorial| Geometry browser tutorial from summer 2016]]
Related references:
* [[Geometry]]
* [[EventDisplays]]


===Coding===
==Code, art and fcl==
The mu2e simulation and reconstruction code is written in c++.   We write modules which create simulated data, or read data out of the event, process it, and write the results back into the event.  The modules plug into a framework called '''art''', and this framework calls the modules to do the actual work, as the framework reads an input file and writes an output file. The primary data format is determined by the framework, so it is called the '''art format''' and the file will have an extension .art.
The main program that is used for simulation, reconstruction and analysis is Mu2e Offline. This is built on top of art, an event processing framework in which the data passes through a series of modules to perform a variety of tasks. Sometimes you will need to build the full Mu2e Offline framework (e.g. if you are a developer) but in other cases you might only need a partial build or can use an already existing build.


We use the '''git''' code management system to store and version our code.   Currently, we have one main git repository which contains all our simulation and reconstruction code. You can check out this repository, or a piece of it, and build it locally. In this local area you can make changes to code and read, write and analyze small amounts of data.  We build the code with a make system called '''scons'''.  The code may be built optimized ('''prof''') or non-optimized and prepared for running a debugger ('''debug''').
Tutorial:
* The tutorial is now on [https://github.com/Mu2e/Tutorial/blob/main/AllInOne/doc/AllInOne.md Mu2e GitHub].
* A reminder about the Mu2e [[CodeEnvironment]].


At certain times, the code is tagged, built, and published as a stable release.  These releases are available on the /cvmfs disk area.  cvmfs is a sophisticated distributed disk system with layers of servers and caches, but to us it just looks like a read-only local disk, which can be mounted almost anywhere.  We often run large projects using these tagged releases.  cvmfs is mounted on the interactive nodes, at remote institutions, on some desktops, and on all the many farm nodes we use.
Old material.  Obsolete, but kept so it can be mined for good content:
* Old [[CodeArtFclTutorial]]
* [[ReleaseList]] gives the list of releases that are available on cvmfs.


You can read more about accessing and building [[CodeRecipe|code]], [[git]], [[scons]] and [[cvmfs]].
<!--
where is the code (do not run git yet tho), intro to releases, cvmfs, run genReco, set output file names and Nevents
intro to fcl, paths and filters. maybe a couple of tutorials?
-->


===Executables===
==Event Display==
Which modules are run and how they are configured is determined by a control file, written in '''fcl''' (pronounced fickle).  This control file can change the random seeds for the simulation and the input and output file names, for example.  A typical run might be to create a new simulation file.  For various reasons, we often do our simulation in several stages, writing out a file between each run of the executable, or stage, and reading it in to start the next stage.  A second type of job might be to run one of the simulation stages with a variation of the detector design, for example.  Another typical run might be to take a simulation file as input and test various reconstruction algorithms, and write out reconstruction results.
Not only is it useful to look at the geometry but it is also useful to look at specific events in the simulation to see what is happening.


===Data products===
Tutorial:
The data in an event in a file is organized into data products.  Examples of data products include straw tracker hits, tracks, or clusters in the calorimeter.  The fcl is often used to decide which data products to read, which ones to make, and which ones to write out. There are data products which contain the information of what happened during the simulation, such as the main particle list, SimParticles.
* The [[EventDisplayTutorial]] provides two example tasks to make you familiar with the Mu2eEventDisplay display
** [[EventDisplays]] gives an overview of the current ways to display events.


===UPS Products===
==Art Data Products==
Disambiguation of "products" - please note that we have both '''data products''' and '''UPS products''' which unfortunately are both referred to as "products" at times. Please be aware of the difference, which you can usually determine from the context.  
All objects (e.g. straw hits, calorimeter clusters) are stored in the art event as art data products. These are accessed and created in Offline modules.


The art framework and fcl control language are provided as a '''product''' inside the '''UPS''' software release management system. There are several other important UPS products we use. This software is distributed as UPS products because many experiments at the lab use these utilities.  You can control which UPS products are available to you (you can recognize this as a setup command like "setup root v6_06_08") but most of this is organized as defaults inside of a setup script.
Tutorial:
* The [[ArtDataProductTutorial]] is still to be written but...
** [[ReadProducts]] gives information on reading products; and,
** [[MakeProducts]] gives information on creating products.


You can read more about  how [[UPS|'''UPS''']] works.
<!---
print and dump files, list products, write input tags. Maybe look at RecoDataProducts
--->


===Histogramming===
==Checkout and build code==
Once you have an art file, how do you actually make plots and histograms of the data?  There are many ways to do this, so it is important to consult with the people you work with, and make sure you are working in a style that is consistent with their expertise and preferences, so you can work together effectively.
If you need to write your own modules or edit code in Offline itself, then you will need your own build of Offline.


In any case, we always use the '''root''' UPS product for making and viewing histograms.
Tutorial:
There are two main ways to approach it.  The first is to insert the histogram code into a module and write out a file which contains the histograms.  The second method is to use a module to write out an '''ntuple''', also called a '''tree'''. This is a summary of the data in each event, so instead of writing out the whole track data product, you might just write out the momentum and the number of hits in the ntuple.  The ntuple is very compact, so you can easily open this and make histograms interactively very quickly.
* [[CheckoutAndBuildCodeTutorial]] is the build system up to spring 2021
* [[MuseBuildCodeTutorial]] is the build system after spring 2021
** [[ReleaseList]] gives the list of releases that are available on cvmfs.


Read more about ways to histogram or [[ntuples|ntuple]] data for analysis.
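The ntuple idea described above, keeping only a few numbers per track and histogramming them later, can be sketched in a few lines.  This is a toy illustration in plain Python rather than ROOT; the event layout, the function names and the example values are all made up for the sketch.

```python
def make_ntuple(events):
    """Flatten events into compact rows of (event index, momentum, nhits)."""
    return [(i, track["momentum"], track["nhits"])
            for i, event in enumerate(events)
            for track in event["tracks"]]


def histogram(values, lo, hi, nbins):
    """Count values into nbins equal-width bins on [lo, hi); others are dropped."""
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        if lo <= v < hi:
            # min() guards against rounding pushing a value into bin nbins
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    return counts


if __name__ == "__main__":
    # Made-up events: each holds a list of tracks with two summary quantities.
    events = [
        {"tracks": [{"momentum": 104.9, "nhits": 42},
                    {"momentum": 52.3, "nhits": 18}]},
        {"tracks": []},
        {"tracks": [{"momentum": 103.2, "nhits": 37}]},
    ]
    rows = make_ntuple(events)
    print(histogram([momentum for _, momentum, _ in rows], 50.0, 110.0, 6))
```

Because the rows are small, the histogram step can be repeated with different binning or cuts without going back to the full event data.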
<!---
scons, satellite releases, warning about changing include files
--->


==Workflows==
==Modules==
===Designing larger jobs===
There are a few different types of art module that you will encounter. "Analyzers" can only analyze data products that are already in the event; "producers" can create new data products; and, "filters" make a decision as to whether an event passes or fails some criteria.
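The three roles can be illustrated with a toy model.  Real art modules are C++ classes run by the framework; the Python below is only a sketch of the division of labour, and every name in it (<code>make_hit_count</code>, <code>enough_hits</code>, <code>HitCounter</code>, <code>run_path</code>) is hypothetical.

```python
def make_hit_count(event):
    """Producer: adds a new data product ("nhits") to the event."""
    event["nhits"] = len(event["hits"])


def enough_hits(event, minimum=3):
    """Filter: returns True if the event passes the selection."""
    return event["nhits"] >= minimum


class HitCounter:
    """Analyzer: reads products already in the event and never modifies it."""

    def __init__(self):
        self.total = 0
        self.events_seen = 0

    def analyze(self, event):
        self.total += event["nhits"]
        self.events_seen += 1


def run_path(events, producer, event_filter, analyzer):
    """Run producer, then filter, then analyzer on each event in turn."""
    passed = []
    for event in events:
        producer(event)
        if not event_filter(event):
            continue  # a failed filter stops processing of this event
        analyzer.analyze(event)
        passed.append(event)
    return passed


if __name__ == "__main__":
    events = [{"hits": [1, 2, 3, 4]}, {"hits": [1]}, {"hits": [5, 6, 7]}]
    counter = HitCounter()
    kept = run_path(events, make_hit_count, enough_hits, counter)
    print(len(kept), counter.total)
```

Note the ordering constraint this makes visible: the filter and the analyzer can only use <code>nhits</code> because the producer ran first.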
After understanding data on a small level by running interactive jobs, you may want to run on larger datasets.  If a job is working interactively, it is not too hard to take that workflow and adapt it for running on large datasets on the compute farms.  First, you will need to understand the '''mu2egrid''' UPS product which is a set of scripts to help you submit jobs and organize the output. mu2egrid will call the '''jobsub''' UPS product to start your job on the farm.  Your data will be copied back using the '''ifdh''' UPS product, which is a wrapper to data transfer software.  The output will go to '''dCache''', which is a high-capacity and high-throughput distributed disk system.  We have 100's of terabytes of disk space here, divided into three types (a scratch area, a persistent disk area, and a tape-backed area).  Once the data is written, there are procedures to check it and optionally concatenate the files and write them to tape.  We track our files in a database that is part of the '''SAM''' UPS product.  You can see the files in dCache by looking under the '''/pnfs''' filesystem.  Writing and reading files to dCache can have consequences, so please understand how to use dCache and also consult with an experienced user before running a job that uses this disk space.


===Grid resources===
Tutorial:
mu2e has access to a compute farm at Fermilab, called '''Fermigrid'''. This farm is several thousand nodes and mu2e is allocated a portion of the nodes (our '''dedicated''' nodes). Once you have used the interactive machines to build and test your code, you can submit a large job to the compute farms.  You can typically get 1000 nodes for a day before your priority goes down and you get fewer.  If the farm is not crowded, which is not uncommon, you can get several times that by running on the idle or '''opportunistic''' nodes.
* The [[ModulesTutorial]] is still to be written but...
** [[Modules]] discusses module names and labels; and
** [[FilterModules]] discusses filter modules.


mu2e also has access to compute farms at other institutions through a collaboration called Open Science Grid (OSG).  It is easy to modify your submit command to use these resources.  We do not have a quota here and can only access opportunistic nodes, so we don't really know how many nodes we can get, but it is usually at least as much as we can get on Fermigrid.  This system is less reliable than Fermigrid so we often see unusual failure modes or jobs restarting.
<!---
example module code.  access a product, make histograms,
write a product(?)
access geometry and config
--->


==Your workflow==
==Geometry and Config==
Hopefully you now have a good idea of the concepts and terminology of the mu2e offline.  What part of the offline system you will need to be familiar with will depend on what tasks you will be doing.  Let's identify four nominal roles.  In all cases, you will need to understand the accounts and authentication.
For your study, you might need to edit part of the geometry or change the generated particle that is simulated. This is done with config files.
# ntuple user. This is the simplest case.  You probably will be given an ntuple, or a simple recipe to make an ntuple, then you will want to analyze the contents.  You will need to have a good understanding of c++ and root, but not much else.
# art user.  In this level you would be running art executables, so you will also need to understand modules, fcl, and data products.  Probably also how to make histograms or ntuples from the art file.
# farm user.  In this level you would be running art executables on the compute farms, so you will also need to understand the  farms, workflows, dCache, and possibly uploading files to tape.
# developer.  In this case, you will be writing modules and algorithms, so you need to understand the art framework, data products, geometry, c++, and standards in some detail, as well as the detector itself. 


Tutorial:
* The [[GeometryAndConfig]] tutorial is still to be written but...
** [[SimpleConfig]] describes the format of these files.


<!---
alter a geometry file, examine a generator config file
--->


==Staging and Mixing Concepts==
For the simulation, we don't just run from protons-on-target (POT) all the way through to hits in the tracker; instead, we run it in stages. This allows us to re-run specific stages to test new geometries without having to run everything again; and it also allows us to generate large samples of specific processes (e.g. particles emitted after a muon is captured by a nucleus) with better efficiency before mixing them all together into a full microbunch event.
Tutorial:
* The [[StagingAndMixingTutorial]] is still to be written but...
** [[Staging]] describes many of the relevant concepts; and,
** [[Mixing]] explains background mixing and how to run this type of job.
<!---
run multi-stage, run mixing, make fcl changes and re-run
--->
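The benefit of staging described above can be seen in a toy model: the expensive first stage is run once and its output is reused while a later stage is re-run with different settings.  This Python sketch is not the Mu2e workflow; the function names, the stopping probability and the "acceptance" parameter are all invented for illustration.

```python
import random


def stage1_beam(n_pot, seed, stop_probability=0.01):
    """Stage 1 (toy): for each proton on target, decide whether a muon stops.

    stop_probability is a made-up number, not a Mu2e value.
    """
    rng = random.Random(seed)  # a fixed seed makes the stage reproducible
    return [i for i in range(n_pot) if rng.random() < stop_probability]


def stage2_detector(stopped_muons, acceptance):
    """Stage 2 (toy): keep a fraction of the stage 1 output.

    acceptance stands in for a geometry-dependent setting.
    """
    keep = int(len(stopped_muons) * acceptance)
    return stopped_muons[:keep]


def mix(signal, background):
    """Mixing (toy): overlay separately generated samples into one event."""
    return {"signal": list(signal), "background": list(background)}


if __name__ == "__main__":
    stops = stage1_beam(n_pot=100000, seed=1)      # run the slow stage once
    nominal = stage2_detector(stops, acceptance=1.0)
    variant = stage2_detector(stops, acceptance=0.5)  # re-run stage 2 only
    print(len(stops), len(nominal), len(variant))
    print(mix(variant[:2], background=[7, 8]))
```

In the real system the hand-off between stages is a file written by one job and read by the next, which is what makes it possible to re-run only the stage that changed.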
==Datasets and dCache==
dCache is the distributed disk system, including a tape-backed area, that we use to store our art files.
Tutorial:
* The [[dCacheTutorial]] is still to be written but...
** [[Dcache]] gives a good introduction
<!---
explain dCache, write to scratch dcache, use ifdh, intro to upload?
--->
==Grids==
To run large jobs, we use the grid rather than run on a local machine.
Tutorial:
* The [[GridTutorial]] is still to be written but...
** [[Grids]] has a lot of useful information; and,
** [[Workflows]] gives a lot of information on how best to organise your work in Mu2e.
<!---
build and submit a mu2eprodsys job, second tutorial to monitor the job
--->
==Git commits==
Git is a version control system that allows us to coordinate many people developing software at the same time. It is widely used in the software development world so you will be able to find a lot of information online.
Tutorial:
* The [[GitTutorial]] is still to be written but...
** [[Git]] has some Mu2e specific information
<!---
provide a scratch repo that they can checkout and commit to randomly.  Walk through both commit patterns
--->
==Code standards==
In order to ensure that our code is stable and doing what we expect, there are various tasks where we enforce standard ways of performing them.
Tutorial:
* The [[CodeStandardsTutorial]] is still to be written but...
** [[CodingStandards]] has a nice summary; and,
** [[RandomNumbers]] and [[RandomNumbersBasic]] describe the way you should generate random numbers in Offline.
<!---
code style, art standards, random numbers, CLHEP, boost, magic numbers
--->
==References and resources==
The best ways to get help can be found on the [[ComputingHelp]] page.
<!---
discuss when and how to get help, maybe tour the wiki?
--->


==Outline (scratch)==
* Intro to physics goals
** concept, new physics, current limits, timeline
** ideas of beamline, stopping, DIO, tracker, final momentum plot
* Detector overview
** overview of why the detector is laid out this way, purpose of calorimeter, gradient field, cosmic shield
** tour individual pieces, specific points to make
* Backgrounds
** flash and time window
** DIO
** RPC
** antip
** cosmics
* Computing Stage I (getting around, using ntuples)
** Authentication, interactive machines, OS, shell
** c++, how code is organized, cvmfs, UPS products, setup procedure
** intro to art files
** geometry browser
** root ntuples and plots
** documents and getting help
* Computing Stage II (understanding sim, local builds)
** intro to modules and products
** running mu2e exe, fcl,
** intro to simulation, staging and mixing
** paths, generator and geometry files
** checkout and build commands
* Computing Stage III (submit grid jobs)
** dcache
** grids
** mu2eprodsys
** monitoring
* Computing Stage IV (developer)
** committing code
** releases and tags
** how to make products
** random numbers, handles, exceptions, etc


==Random Links (scratch)==


[https://mu2e-docdb.fnal.gov:440/cgi-bin/DisplayMeeting?sessionid=3729 latest meeting]
[https://mu2e-docdb.fnal.gov/cgi-bin/sso/DisplayMeeting?sessionid=3729 latest meeting]


Sarah's [https://docs.google.com/document/d/1YuwYfarDO7d2ficXsjpMr-yYfFo4l72FNPEZT4rfkjQ/edit google doc] on clickable status and intro paragraphs


Rob's 10/26/17 [https://mu2e-docdb.fnal.gov:440/cgi-bin/ShowDocument?docid=14074 talk] on intro to computing plan
Rob's 10/26/17 [https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=14074 talk] on intro to computing plan


[http://mu2e.fnal.gov/atwork/tmp/ clickable detector]
[[ComputingStart|Overview]]


[[CodeRecipe|Build recipe]]


[http://mu2e.fnal.gov/atwork/computing/g4fwk.shtml Rob's first geant run for new users]
[http://mu2e.fnal.gov/atwork/computing_retired/g4fwk.shtml Rob's first geant run for new users]


[http://mu2e.fnal.gov/atwork/computing/artworkbook.shtml art workbook]
[http://mu2e.fnal.gov/atwork/computing_retired/artworkbook.shtml art workbook]


[http://mu2e.fnal.gov/atwork/computing/rootTest.shtml test root]
[http://mu2e.fnal.gov/atwork/computing_retired/rootTest.shtml test root]


[http://mu2e.fnal.gov/atwork/computing/G4EventDisplayTest.shtml test display]
[http://mu2e.fnal.gov/atwork/computing_retired/G4EventDisplayTest.shtml test display]


[http://mu2e.fnal.gov/atwork/workgroups/SoftwareAndSimulations/summer_2016.shtml Summer 2016 SCD workshops] (includes geometry tutorial)
[http://mu2e.fnal.gov/atwork/workgroups_retired/SoftwareAndSimulations/summer_2016.shtml Summer 2016 SCD workshops] (includes geometry tutorial)


[http://mu2e.fnal.gov/atwork/computing/Tutorials/2016/Agenda/index.html Summer 2016 mu2e tutorials]
[http://mu2e.fnal.gov/atwork/computing_retired/Tutorials/2016/Agenda/index.html Summer 2016 mu2e tutorials]


[http://mu2e.fnal.gov/atwork/computing/UnixHints.shtml unix hints]


[http://mu2e.fnal.gov/atwork/computing/standaloneROOT.shtml setup root by itself]
[http://mu2e.fnal.gov/atwork/computing_retired/standaloneROOT.shtml setup root by itself]


[[CppFAQ|c++]]
Line 161: Line 259:
July 2016 intro talks


[https://mu2e-docdb.fnal.gov:440/cgi-bin/ShowDocument?docid=7746 Software tutorial]
[https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=7746 Software tutorial]


[https://mu2e-docdb.fnal.gov:440/cgi-bin/ShowDocument?docid=7859 Practicalities of MC]
[https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=7859 Practicalities of MC]


[https://mu2e-docdb.fnal.gov:440/cgi-bin/ShowDocument?docid=7861 Hits and Mixing]
[https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=7861 Hits and Mixing]


[[Category:Computing]]
[[Category:Tutorial]]


== Tutorials (scratch) ==

Latest revision as of 15:06, 14 October 2024

Introduction

This page is intended for physicists who are just starting to work in the mu2e computing environment. The following is a broad overview of the major components, their terminology, and how you might use them. Following each into paragraph, here are links into more specific tutorials and the rest of the mu2e documentation which is more terse and is intended as a reference once you get through the introductory material.

You probably don't have to work through this entire page, and you can stop at any point; please talk to your adviser or mentor to see what's appropriate. The material you are most likely to need comes first, followed by more in-depth tutorials for people who will be spending years on mu2e and will learn to do more complex work.

From time to time we will hold in person tutorials. See In Person Tutorials on the Mu2e internal wiki.

Reporting Errors or Ambiguities

If you find errors or ambiguities in these tutorials, including the accompanying written material, please report them using the issue tracker on the Mu2e/Tutorial GitHub page: https://github.com/Mu2e/Tutorial . The issue button is second from the left on the top of the page. To report an issue you need to join the Mu2e GitHub Organization.


Prerequisites

Have you completed the Mu2e Day 1 CheckList? If not, please do so.

If you have not already done so, please sign up for these slack channels:

  • tutorial-questions - ask questions about the tutorials here
  • computing_and_software - the announcements and general discussion list
  • is_it_me_or_a_bug - ask questions about anything except the tutorials.

In this tutorial we will assume you are familiar with the topics below. They are not hard prerequisites. You don't need to master this material before you start but we recommend that you skim it and learn where to look up information when you need it:


When you do these exercises you will start on your local computer and log in to one of the Mu2e interactive machines, which are sometimes called the "central machines". The software you will use is on these machines and you will do your work on these machines. You will be working at a terminal window and typing commands at a prompt. The editors available on these machines are: vi, vim, emacs and nedit. If this way of working is not familiar to you, contact a colleague to help get you started.

A Convention You will See

You will sometimes see an instruction that says something like: "To check that this step worked correctly, do"

> echo $PRODUCTS
/cvmfs/fermilab.opensciencegrid.org/products/common/db

The ">" character represents the shell prompt, and you should type everything to the right of that character at the prompt in your working shell. The expected output of that command is shown on the following lines. There may be many lines of output. If the output that you see is the same as shown in the example, then everything is working. If the example output contains a version number or a timestamp, it's OK if your output has different values, unless, of course, the example is there explicitly to check the version number.

Interactive logins

Mu2e has 6 machines that you can use for the tutorials: mu2egpvm01 ... mu2egpvm06. For purposes of this tutorial these machines are identical and they all mount the same disks, including your home disk. You can start on one machine and restart on another. Pick one and log in. Last I checked, Fermilab security policy broke the automatic load balancer, so we need to load balance by hand. If your machine is slow for more than a few minutes, try another.
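Load balancing by hand can be as simple as picking a machine at random and excluding one that is currently slow. The following is a minimal sketch in Python, not an official Mu2e tool: the function names are hypothetical, and only the short machine names mu2egpvm01 ... mu2egpvm06 come from the text above.

```python
import random

def mu2e_nodes(first=1, last=6):
    """Return the short machine names mu2egpvm01 ... mu2egpvm06."""
    return [f"mu2egpvm{i:02d}" for i in range(first, last + 1)]

def pick_node(exclude=()):
    """Pick a machine at random, skipping any listed in `exclude`."""
    candidates = [name for name in mu2e_nodes() if name not in exclude]
    return random.choice(candidates)

if __name__ == "__main__":
    # Example: mu2egpvm01 has been slow, so choose among the others.
    print(pick_node(exclude=["mu2egpvm01"]))
```

You would then log in to the chosen machine following the LoginTutorial instructions.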

You log in to the mu2e interactive machines using your kerberos principal, "your_username@FNAL.GOV". If you need a refresher on kerberos vs SSO, revisit the Day 1 Checklist.

  • To log in, follow the instructions at: LoginTutorial.
  • Work through the rest of the page to set up your login scripts, and explore the disks that are available.
  • If you find ambiguities or errors in the above, let us know on the tutorial-questions channel.

Here are some references to know about. You don't need to master them now.


Ntuple

The data from the detector, and the reconstructed data, are stored in files in art format. Accessing this data generally requires compiling code and learning a special configuration language, so we will save that for a later tutorial. To simplify and speed up access to the data, we often run a program to copy a small part of the data in the art file into a convenient format called a root ntuple (pronounced "en-tuple"). This format is easy to browse interactively and to make histograms from. The ntuple file may contain histograms that were already made, or a list of the tracks in each event along with interesting quantities, such as the number of hits on the track or its reconstructed momentum.
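To make the idea concrete, here is a toy sketch in plain Python of what an ntuple holds and how a histogram is made from it. It does not use ROOT or any Mu2e file: the field names (nhits, mom), the values and the histogram helper are all invented for illustration.

```python
# Toy "ntuple": one entry per event, each holding a list of tracks.
# Field names and values are invented; real Mu2e ntuples differ.
events = [
    {"event": 1, "tracks": [{"nhits": 42, "mom": 104.2}]},
    {"event": 2, "tracks": []},
    {"event": 3, "tracks": [{"nhits": 25, "mom": 98.7},
                            {"nhits": 31, "mom": 103.6}]},
]

def histogram(values, low, high, nbins):
    """Count values into nbins equal-width bins on [low, high)."""
    width = (high - low) / nbins
    counts = [0] * nbins
    for v in values:
        if low <= v < high:
            index = min(int((v - low) / width), nbins - 1)
            counts[index] += 1
    return counts

# Select tracks with at least 30 hits and histogram their momentum.
momenta = [t["mom"] for ev in events for t in ev["tracks"] if t["nhits"] >= 30]
print(histogram(momenta, 95.0, 110.0, 3))  # prints [0, 2, 0]
```

The NtupleTutorial shows how to do the equivalent selection and plotting with the real tools.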

Tutorial:

  • NtupleTutorial is the tutorial for this section and will guide you through making plots with one of the Mu2e-specific ntuples that are available.

Other useful pages:

TrkAnaTutorial

TrkAna is one of the mu2e ntuples.

Tutorial:

Geometry Browser

It is often useful to look at the detector as it is implemented in the simulation either to debug (e.g. double-check the geometry is as you expect) or to get images for presentations.

Tutorial:

Related references:

Code, art and fcl

The main program that is used for simulation, reconstruction and analysis is Mu2e Offline. This is built on top of art, an event-processing framework in which the data passes through a series of modules to perform a variety of tasks. Sometimes you will need to build the full Mu2e Offline framework (e.g. if you are a developer), but in other cases you might only need a partial build or use an already existing build.

Tutorial:

Old material. Obsolete, but kept so that it can be scrubbed for good content:


Event Display

Not only is it useful to look at the geometry but it is also useful to look at specific events in the simulation to see what is happening.

Tutorial:

  • The EventDisplayTutorial provides two example tasks to make you familiar with the Mu2eEventDisplay.
    • EventDisplays gives an overview of the current ways to display events.

Art Data Products

All objects (e.g. straw hits, calorimeter clusters) are stored in the art event as art data products. These are accessed and created in Offline modules.

Tutorial:


Checkout and build code

If you need to write your own modules or edit code in Offline itself, then you will need your own build of Offline.

Tutorial:


Modules

There are a few different types of art module that you will encounter. "Analyzers" can only analyze data products that are already in the event; "producers" can create new data products; and "filters" make a decision as to whether an event passes or fails some criteria.
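The three roles can be illustrated with a toy sketch in Python. This is not the art API (art modules are written in C++ and configured with fcl); the function names, the dictionary used as an "event" and the nStrawHits product are all hypothetical.

```python
# Toy stand-ins for the three module roles; this is NOT the art API.

def producer(event):
    """Create a new data product from ones already in the event."""
    event["nStrawHits"] = len(event["strawHits"])

def event_filter(event, min_hits=2):
    """Return True if the event passes the criterion, False otherwise."""
    return event["nStrawHits"] >= min_hits

def analyzer(event, summary):
    """Read the event without modifying it; accumulate into a summary."""
    summary.append(event["nStrawHits"])

def run(events):
    """Pass each event through producer, filter, then analyzer."""
    summary = []
    for event in events:
        producer(event)
        if not event_filter(event):
            continue  # in this sketch, failed events skip the analyzer
        analyzer(event, summary)
    return summary

events = [
    {"strawHits": [0.1, 0.2, 0.3]},
    {"strawHits": [0.4]},
    {"strawHits": [0.5, 0.6]},
]
print(run(events))  # prints [3, 2]
```

The second event has only one hit, so the filter rejects it and the analyzer never sees it.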

Tutorial:


Geometry and Config

For your study, you might need to edit part of the geometry or change the generated particle that is simulated. This is done with config files.

Tutorial:


Staging and Mixing Concepts

For the simulation, we don't run in a single pass from protons-on-target (POT) all the way through to hits in the tracker; instead, we run it in stages. This allows us to re-run specific stages to test new geometries without having to run everything again, and it also allows us to generate large samples of specific processes (e.g. particles emitted after a muon is captured by a nucleus) with better efficiency before mixing them all together into a full microbunch event.
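The benefit of staging can be sketched with a toy example in Python. Nothing here reflects the real Mu2e stages: the function names, the fractions 0.01, 0.5 and 0.4, and the seeds are invented, and mixing is not modeled. The point is only that the saved output of an upstream stage can be reused when a downstream stage is re-run with a different configuration.

```python
import random

def stage1(n_pot, seed=1):
    """Toy upstream stage: expensive, so run it once and save the output."""
    rng = random.Random(seed)
    # Keep a small, made-up fraction of protons-on-target.
    return [i for i in range(n_pot) if rng.random() < 0.01]

def stage2(stage1_output, acceptance, seed=2):
    """Toy downstream stage: cheap to re-run on saved stage-1 output."""
    rng = random.Random(seed)
    return [i for i in stage1_output if rng.random() < acceptance]

saved = stage1(100_000)        # run the upstream stage once
nominal = stage2(saved, 0.5)   # downstream stage, first configuration
modified = stage2(saved, 0.4)  # new configuration: only stage 2 is re-run
print(len(saved), len(nominal), len(modified))
```

Because each stage uses its own fixed seed, re-running stage 2 is reproducible and does not disturb the saved stage-1 sample.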

Tutorial:


Datasets and dCache

dCache is the disk-based storage system, parts of which are backed by tape, that we use to store our art files.

Tutorial:


Grids

To run large jobs, we use the grid rather than run on a local machine.

Tutorial:

  • The GridTutorial is still to be written but...
    • Grids has a lot of useful information; and,
    • Workflows gives a lot of information on how best to organise your work in Mu2e.


Git commits

Git is a version control system that allows us to coordinate many people developing software at the same time. It is widely used in the software development world so you will be able to find a lot of information online.

Tutorial:

  • The GitTutorial is still to be written but...
    • Git has some Mu2e specific information


Code standards

In order to ensure that our code is stable and doing what we expect, we enforce standard ways of performing various tasks.

Tutorial:


References and resources

The best ways to get help can be found on the ComputingHelp page.


Random Links (scratch)

latest meeting

Sarah's google doc on clickable status and intro paragraphs

Rob's 10/26/17 talk on intro to computing plan

clickable detector

Build recipe

Rob's first geant run for new users

art workbook

test root

test display

Summer 2016 SCD workshops (includes geometry tutorial)

Summer 2016 mu2e tutorials


setup root by itself

c++

linux

root

July 2016 intro talks

Software tutorial

Practicalities of MC

Hits and Mixing

Tutorials (scratch)

  • Testing the ROOT display
  • Testing the Geant4 based event display
  • Notes on dynamic libraries
  • The First Step: the art workbook
  • Running G4 within art: The first examples.
  • Mu2e maintained FAQs: C++ FAQ, Unix/Linux FAQ, ROOT FAQ, Geant4 Notes