Computing Concepts



Introduction

This page is intended for physicists who are just starting to work in the Mu2e computing environment. It explains ideas and jargon that you need to know to navigate the Mu2e Tutorials. In case this is your first exposure to computing in High-Energy Physics (HEP), it also points out which ideas and language are common throughout the field. It supports the ComputingTutorials.

This is a fairly long document (about 14 printed pages) and you don't need to understand everything in it on day one. We suggest that you skim the document to learn what it covers. Then come back to it as needed while you are working on the tutorials.

Online and Offline

The Mu2e software is used in two different environments, online and offline. Online refers to activities in the Mu2e Hall, such as Data Acquisition (DAQ) and Triggering, collectively known as TDAQ. Offline refers to activities that take place after the data has reached the Computer Center, such as reconstruction, calibration and analysis. Some software is used in both places and some software is used in only one or the other.

This writeup will focus on offline computing but will have some references to online computing because some important ideas originate in the online world.

On-spill and Off-spill

When Mu2e is in a steady state of data taking, there will be a repeating cycle that is 1.4 seconds long. The cycle is shown schematically in the figure below:

[Figure: MI Cycle.png, a schematic of the 1.4 s Main Injector cycle]

At the start of the cycle, the Fermilab accelerator complex will deliver a beam of protons to the Mu2e production target. The beam is structured as a series of pulses of nominally 39 million protons per pulse, separated by approximately 1695 ns; ideally there are no protons between the pulses. This continues for about 43.1 ms, or about 25,400 pulses. The period over which the 25,400 pulses arrive is called a spill. This will be followed by a brief period of about 5 ms when no beam is delivered to the production target. There are eight spills plus seven 5 ms inter-spill periods in the 1.4 s cycle. These are followed by a period of about 1.02 seconds during which no protons arrive at the target. A single pulse of nominally 39 million protons is correctly called a pulse; however it is sometimes called a bunch or a micro-bunch. For historical reasons those other terms are present throughout the Mu2e documentation and code.
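
As a quick arithmetic check of the numbers above (all of the values are approximate, so the sums are only approximate):

  \[ \frac{43.1\ \mathrm{ms}}{1695\ \mathrm{ns}} \approx 25{,}400 \ \text{pulses per spill} \]
  \[ 8 \times 43.1\ \mathrm{ms} \;+\; 7 \times 5\ \mathrm{ms} \;+\; 1.02\ \mathrm{s} \;\approx\; 0.345\ \mathrm{s} + 0.035\ \mathrm{s} + 1.020\ \mathrm{s} \;=\; 1.4\ \mathrm{s} \]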

This cycle of 1.4 s is an example of a Main Injector Cycle (MI cycle). The Main Injector is the name of one of the accelerators in the Fermilab complex. During the 1.020 s period with no beam to Mu2e, the Fermilab accelerator complex will deliver protons to the NOvA experiment (in the early years of Mu2e) or the DUNE experiment (later on). Towards the end of each 1.020 s period the accelerator complex will begin the preparatory steps to deliver the next spills of protons to Mu2e.

In normal operations this MI cycle will repeat without a break for about 1 minute. Then there will be a brief pause, perhaps 5 to 10 seconds, during which other sorts of MI cycles are executed. One example is delivering protons to the Fermilab test beam areas, called MTEST. This whole process is called a supercycle.

When operations are stable Mu2e will run continuous repeats of this supercycle. The explanation of why the MI cycle and supercycle are the way they are is outside of the scope of this writeup.

We define the term “on-spill” to mean during the 8 spills. We define the term “off-spill” to mean any time that the detector is taking data that is not on-spill. At this time Mu2e does not have widely agreed upon terms to differentiate several different notions of off-spill:

  1. During the 1.020 s no-beam period within an MI cycle
  2. During the seven 5 ms inter-spill periods within the MI cycle
  3. During the portion of the supercycle when the MI is not running a protons-to-Mu2e MI cycle.
  4. During an extended period of time, minutes to months, when the accelerator complex is not delivering protons to the Production Target.

Some groups within Mu2e do use language to make some of these distinctions but it is neither uniform nor widely adopted.

During off-spill data taking the Mu2e detector does some or all of the following. Some of these can be done at the same time but some require a special configuration of the data taking system.

  1. Measure cosmic ray induced activity in the detector.
    • This is used to look for cosmic rays that produce signal-like particles and to measure the ability of the cosmic ray veto system to identify when this happens.
    • This is also a source of data that we will use to calibrate the detector in-situ.
  2. Randomly read out portions of the detector to collect samples that can be used to measure what a quiet detector looks like. This is referred to as “pedestal” data.
  3. Perform dedicated calibration operations.

Doing all three well is important to the success of Mu2e.

1BB and 2BB

The previous section described how Mu2e will operate after it has been fully commissioned. When Mu2e starts commissioning with beam, we will run a different MI-cycle. In this cycle there will only be 4 spills, each of longer duration, and the total number of protons on target (POT) during this MI cycle will be about 55% of that of the MI cycle described above. The reason for this is to have additional radiation safety margin.

The MI cycle defined above is called 2BB while the modified cycle is called 1BB, where BB is an acronym for Booster Batch. The Booster is part of the Fermilab accelerator complex. A batch is the unit of protons transferred out of the Booster to the next accelerator in the chain. You can infer from the above discussion that one Booster Batch is able to produce four spills. You can read more about the delivery of the proton beam to Mu2e.

Events

The basic element of Mu2e data processing is an “event”, which is defined to be all of the data associated with a given time interval. For example, when Mu2e is taking on-spill data, a proton pulse arrives at the production target approximately every 1695 ns. All of the raw data collected after the arrival of one proton pulse and before the arrival of the next proton pulse forms one event. When Mu2e processes its data, the unit of processing is one event at a time; each event is processed independently of the others.

By convention Mu2e has chosen that each event will start at the middle of the proton pulse that defines the event start; that is, for each event the time t=0 is defined to be the middle of the proton pulse. This is illustrated in the figure shown below.

For off-spill data taking, there is no external reference to define an event and Mu2e has made the following choice: off-spill events will have a duration of 100 μs and one will follow the other with no break between them. For off-spill events the time t=0 is simply the start of the event.

For on-spill events, some subsystems start recording data for an event at t=0 but other subsystems start recording data at a later time; in those subsystems the early-time data is dominated by backgrounds and are not of use for physics analysis. The figure below illustrates some of the timing within an event.

[Figure: TimeLineLiveGate2.png, timing within an on-spill event]

The horizontal axis shows time, in ns, and the vertical axis shows the number of particles that arrive in a 10 ns window. The black hatched histogram on the left shows the time profile of one proton pulse arriving at the production target at t=0. This is followed 1695 ns later by the time profile of the next proton pulse. The time profile of the proton pulse is a consequence of details of the Fermilab accelerator complex that are beyond the scope of this note. The solid red histogram shows the time at which negative muons stop in the stopping target. The time profile of the muon stopping times is the convolution of the proton pulse shape with the travel time from the Production Target to the Stopping Target. The caption says that the muon stopping time distribution is scaled up by a factor of 300. Note that the black histogram shows time as measured at the production target while the red histogram shows time measured at the stopping target.

The hatched red area shows the time at which the muon in a muonic Al atom either decays or is captured onto the nucleus. Its time profile is the convolution of the muon stopping time distribution with the exponential decay of the bound-state muon. The caption says that this is scaled up another factor of about 3.3 relative to the muon stopping time distribution. If Mu2e conversion occurs at a measurable rate, the shape of the time profile of the signal will be the same as that of the red hatched area but the normalization will be very different. The figure is drawn for one pulse in the middle of a spill of nearly identical pulses. On the left hand side you can see that muonic atoms produced during earlier pulses will still be decaying when this pulse begins.

The hashed box shows the fiducial time window for signal candidates. A version of this figure showing the live window, or live gate, the time during which the subsystems record data, is not yet available. Most of the subdetectors start recording data about 100 to 150 ns before the fiducial time window.

Finally, look at the blue hatched histogram. This shows the time profile of beam flash arriving at the stopping target. Beam flash is mostly electrons, mostly with momentum < 10 MeV/c but with a tail out to a few hundred MeV/c. According to the caption it is scaled up by only a factor of 4; since the muon stopping time distribution is scaled up by a factor of 300, there are approximately 300/4 ≈ 75 beam flash particles for every muon arriving at the stopping target. Many of these low energy electrons bremsstrahlung in the stopping target foils and the bremsstrahlung photons are an important source of pileup (discussed below) in the tracker and calorimeter.

The other histogram shown in the figure is important for the discussion of a particular background that is described at: https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=39255. It's not important to understand that paper in order to proceed with the tutorials but it is something you should understand in your first few months on Mu2e. For a shorter discussion of backgrounds see BackgroundsPhysIntro.

art, EventIDs, Runs and Subruns

Mu2e uses an event processing framework named art to manage the processing of events. It is supported by Fermilab Computing and is used by many of the Intensity Frontier experiments. art is written in C++ and the Mu2e code that works within art is also written in C++. You will learn more about art later in this document and in the ComputingTutorials.

The art convention for uniquely labeling each event is to give it a unique art::EventID, which is a set of 3 positive integers named the Run number, the SubRun number and the Event number.

The setting of the Run, SubRun and Event fields is under the control of the Mu2e Data Acquisition (DAQ) system. Mu2e has the following plan for how to use these features during normal data taking:

  • A run will have a duration between 8 and 24 hours. A new run will start with the run number incremented by 1 and with the SubRun and Event numbers both reset to 1. This will introduce about a 5 minute pause in data taking to reload firmware and restart some software.
  • A SubRun will have a duration of between 14 s and a few minutes. When a new SubRun starts, the SubRun number will be incremented by 1 and the Event number will be reset to 1. A SubRun transition will create no deadtime.
  • Within a SubRun, the Event number is monotonically increasing in time.

The final choice for the duration of Runs and SubRuns will be made when we are closer to having data and we better understand the tradeoffs. When an anomaly occurs during data taking, we will sometimes decide to stop the run and start a new one in order to segregate the data that has the anomaly. So there will be some short runs.

The key feature of a SubRun is that it must be short enough to follow the fastest changing calibrations. In HEP the name used for calibration information is “Conditions Data”, which is stored in a Conditions Database and managed by a Conditions System. For example Mu2e will record the temperature at many places on the apparatus in order to apply temperature dependent calibrations during data processing. Other calibration information might be determined from data, such as the relative timing and alignment of tracker and calorimeter. Some conditions information will change quickly and some will be constant over long periods of time. Information within the Conditions system is labeled with an Interval of Validity (IOV). Mu2e has made the choice that IOVs will be a range of SubRuns. You will not encounter the Conditions system in the early tutorials.

When the Mu2e DAQ system starts a new Run it will record information about the configuration of the run and at the end of the run it will record statistics and status about the run. Similarly, when the Mu2e DAQ starts or ends a SubRun it will record information about the configuration, statistics and status of the SubRun. Some of this information will be available in the offline world via databases; other information will be added to the computer file that holds the event information.

Analogous to the EventID, the art SubRunID is a 2-tuple of non-negative integers with parts named run number and subrun number. For completeness, art provides a RunID class that is just a non-negative integer.
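
As a purely conceptual illustration (these are not the real art classes; take the actual interfaces from the art documentation or from existing Mu2e code), the three ID types can be pictured as nested tuples of integers:

  #include <cstdint>

  // Illustrative analogues of art::RunID, art::SubRunID and art::EventID.
  struct RunID    { std::uint32_t run; };
  struct SubRunID { std::uint32_t run, subRun; };
  struct EventID  { std::uint32_t run, subRun, event; };

  // Example: the 7th event of SubRun 3 of Run 1200.
  // EventID id{1200, 3, 7};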

The DAQ system will ensure that EventIDs, SubRunIDs and RunIDs are monotonically increasing in time.

art Events

An art event is a C++ class, art::Event, that is a C++ representation of the information in a Mu2e event. Conceptually it is an art::EventID plus a collection of Mu2e defined information that we have created. You will learn more about this in the ComputingTutorials. When someone uses the word event they may be speaking about the abstract concept of a Mu2e event or they may be speaking of its specific representation within art. When it's important to distinguish the two ideas, the remainder of this page will use event for the former and art::Event for the latter.

Raw Data

For each event, each Mu2e detector subsystem produces raw data that is sent through the DAQ system and is stored in the art::Event object for that event. The raw data is stored within the art::Event in a format that is designed to be efficient for use by the subsystem readout firmware and by the DAQ system. Synonyms that you may hear for raw data are byte-stream data and binary data. These designs are documented at: https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=4914 . This contains a lot of detail that beginners can safely skip the first time through.

Reconstruction

Reconstruction (reco) is the process of transforming raw data into quantities that are the inputs to physics analysis. These are sometimes called "Physics Objects". While that name does not yet have wide use within Mu2e, we expect its use will grow.

The first step in using the raw data is to unpack it into a form that is convenient for use by the algorithms that will use it. The unpacked objects are called “digis” and are implemented as simple C++ classes or structs; the name digi is in common use throughout HEP. Typically the transformation from raw data to digis is loss-less; that is, the information is the same, just reorganized into a format that is matched to the needs of downstream processing.

The code that transforms raw data into digis is an example of a module, which you will learn more about in the ComputingTutorials. The module will read the raw data from the art::Event object and write the digis back to the art::Event object.

Typically a digi contains a channel identifier, one or more TDC values, and an array of ADC values that represent a waveform. For the tracker system the channel identifier tells you which straw the data belongs to; similarly, for the calorimeter or CRV it tells you which SiPM the data belongs to. Typically a channel ID is a 16 bit integer. A TDC value is the output of a Time to Digital Converter; the datum is typically a 16 bit integer that tells you the time of the measurement in clock tick units relative to a reference time; the reference time may be the start of the event or some other reference that has a known offset relative to the start of the event. Different subsystems may have different clock tick units and different offsets. An ADC value is the output of an Analog to Digital Converter that measures a pulse height; the datum is typically an 8 or 16 bit integer, in units specific to that subsystem.

The next step in data processing transforms digis into "hits". Hits are also represented by simple C++ classes or structs. A typical hit contains the same information as a digi but transformed into physical units; the TDC values have been transformed into times in ns relative to the start of the event; and the ADC values have been transformed into voltages or charges. Some hit objects contain additional derived information that is used by downstream algorithms.
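
The following sketch illustrates both ideas; the class names, fields and calibration constants here are invented for illustration and do not match the real Mu2e classes:

  #include <cstdint>
  #include <vector>

  // Sketch of a generic digi: raw, subsystem-specific units.
  struct ExampleDigi {
    std::uint16_t channel;            // which straw or SiPM
    std::uint16_t tdc;                // time in clock ticks
    std::vector<std::uint16_t> adc;   // waveform samples, in ADC counts
  };

  // Sketch of the corresponding hit: physical units.
  struct ExampleHit {
    std::uint16_t channel;
    double time_ns;                   // relative to the start of the event
    std::vector<double> charge;       // e.g. charge or voltage per sample
  };

  // Illustrative digi-to-hit conversion; the constants are made up and
  // would really come from the Conditions system.
  ExampleHit makeHit(ExampleDigi const& d) {
    constexpr double tickNs   = 5.0;   // hypothetical ns per clock tick
    constexpr double offsetNs = 0.0;   // hypothetical channel time offset
    constexpr double gain     = 0.01;  // hypothetical charge per ADC count
    ExampleHit h{d.channel, d.tdc * tickNs + offsetNs, {}};
    h.charge.reserve(d.adc.size());
    for (auto a : d.adc) h.charge.push_back(a * gain);
    return h;
  }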

The word "hit" is widely used throughout HEP but it is heavily overloaded. It is often used in the precise sense defined here but it is sometimes used as a collective noun meaning digis or hits. It is also used to mean any interaction that deposits energy that might eventually produce a digi and later a hit. For example we say that a charged particle passing through the tracker makes hits in the straws and a particle entering the calorimeter makes hits in the crystals. Once you get some experience the meaning is usually clear from the context.

The straw hits have one other feature of interest to this discussion. The straws have electronics at both ends of the straw; in an ideal straw, if a charged particle passes through the straw at exactly the halfway point along the straw, the electronic signal created by the passing particle will arrive at both ends of the straw at the same time. If the charged particle passes at any other point along the length of the straw the electronic signals will arrive at the two ends at different times. We can invert this and use the measured time difference to infer the position along the straw at which the charged particle passed through. This is called Time Division.
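
A minimal sketch of the idea, for an ideal straw: let L be the straw length, v the signal propagation speed along the wire, t_0 the time at which the particle crosses the straw, and x the crossing position measured from the straw center (these symbols are illustrative, not the names used in the Mu2e code). The signal arrival times at the two ends are then

  \[ t_1 = t_0 + \frac{L/2 - x}{v}, \qquad t_2 = t_0 + \frac{L/2 + x}{v}, \]

so the measured time difference gives the position along the straw,

  \[ x = \frac{v\,(t_2 - t_1)}{2}. \]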

Some Mu2e subsystems have a hierarchy of hit classes. The details are outside of the scope of this note. The short version is that there are sometimes good reasons to join two hits together and for the downstream code to treat them as a unit. You will encounter some of these in the tutorials.

At this stage reconstruction becomes very subsystem dependent and the details are outside of the scope of this note. A broad outline, however, is within the scope of this note.

The calorimeter reconstruction code looks for groups of crystals that are near each other and have hits that are close in time. It collects these together into an object called a cluster and estimates the total energy of the cluster. A cluster is interpreted as the energy left in the calorimeter by the interaction of a single particle that enters the calorimeter. Clusters can be produced by different types of particles; for example clusters produced by through-going muons look very different than clusters produced by electrons or photons. Usually high energy clusters hit a larger group of crystals than low energy clusters. Information about each cluster is stored in the art::Event.
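
The sketch below shows only the flavor of such a grouping: it greedily collects hits that are close in time to a seed hit and sums their energy. It is not the actual Mu2e calorimeter clustering code, which also uses crystal adjacency, seed selection and energy thresholds; the names and the time window are invented:

  #include <cmath>
  #include <cstddef>
  #include <vector>

  // Hypothetical stand-in for a reconstructed calorimeter crystal hit.
  struct CrystalHit { int crystalId; double time_ns; double energy_MeV; };
  struct Cluster    { std::vector<CrystalHit> hits; double energy_MeV = 0.; };

  // Greedy time-window grouping; a real algorithm would also require the
  // crystals to be spatially adjacent.
  std::vector<Cluster> makeClusters(std::vector<CrystalHit> const& hits,
                                    double timeWindow_ns = 10.) {
    std::vector<Cluster> clusters;
    std::vector<bool> used(hits.size(), false);
    for (std::size_t i = 0; i < hits.size(); ++i) {
      if (used[i]) continue;
      Cluster c;
      c.hits.push_back(hits[i]);
      c.energy_MeV += hits[i].energy_MeV;
      used[i] = true;
      for (std::size_t j = i + 1; j < hits.size(); ++j) {
        if (!used[j] && std::abs(hits[j].time_ns - hits[i].time_ns) < timeWindow_ns) {
          c.hits.push_back(hits[j]);
          c.energy_MeV += hits[j].energy_MeV;
          used[j] = true;
        }
      }
      clusters.push_back(c);
    }
    return clusters;
  }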

The tracker reconstruction code measures the trajectories of charged particles that traversed the tracker and provides a high quality estimator of each trajectory. In particular, the tracker must measure the momentum of these particles with a precision and accuracy of about 0.1% for charged particles with a momentum near 105 MeV/c. The objects created by this software are called “tracks”; in this usage a track is always a reconstructed object and a charged particle is the entity that produced energy deposits in the straw gas. In practice the word track is often used for both. In this writeup we will keep the distinction.

When a charged particle traverses the tracker, it follows an approximately helical trajectory and creates hits in the straws that it passes through. There are several reasons why the trajectory deviates from a helix:

  • The magnetic field is not exactly uniform; it has a designed gradient of about 1% per meter along the z direction. It also has other, smaller irregularities.
  • The particle loses energy and scatters as it traverses the straw materials.
  • If the charged particle is an electron, it bremsstrahlungs in the straw materials.

The early steps in tracker reco include several hit level classification/filtering steps that are followed by several pattern recognition steps; the latter look to find sets of tracker hits that are consistent with being on a helical trajectory. These steps do not use the full information available in each hit so a helix is a good enough approximation to the real trajectory. The output of the pattern recognition is an object called a HelixSeed that represents the reconstructed track. There may be zero, one or many HelixSeeds found in a given art::Event.

The final step in the tracker reco takes each HelixSeed and uses track fitting code called KinKal (for Kinematic Kalman Filter, https://mu2e-docdb.fnal.gov/cgi-bin/sso/ShowDocument?docid=45547 ) to produce a high quality estimator for the trajectory. This is the code that achieves the required momentum resolution of 0.1% at 105 MeV/c. Unlike Kalman filters used by other experiments, this code produces both an estimator of the trajectory and an estimator of the time that the track passed the central plane (perpendicular to z) of the tracker; this time is known as the track t0, and the output of KinKal includes the full covariance matrix of t0 with the trajectory parameters. Like other Kalman filters this code can be configured to produce estimators of the trajectory for many points along the track. The Kalman filter code records its output as objects of type KalSeed in the art::Event.

It is foreseen that the use of KinKal will be extended to extrapolate the measured trajectory upstream from the tracker, through the Inner Proton Absorber (IPA), to the stopping target foils. This feature is not available at this time; it is offered as an R&D project, a very challenging one, to interested people.

Whenever possible, the KinKal code associates a cluster in the calorimeter with a track in the tracker and includes that cluster in the fit. The calorimeter cluster provides precise time information that greatly reduces the uncertainty in the measured track t0; the spatial information available from the calorimeter cluster is not used in the fit.

When a user tells KinKal to perform a fit, the user must specify a mass hypothesis (e,mu,pi,k,p,d) and a direction hypothesis (upstream going or downstream going). Most tracks that Mu2e will fit are, in fact, downstream moving electrons. When it is needed, the Mu2e code will fit the track with multiple hypotheses. In these cases, all fit variants will be stored in the art::Event.

The CRV reco looks for groups of nearby CRV counters that have hits that are close in time. These groups are called clusters. This notion of a cluster is entirely independent of calorimeter clusters; the two subsystems just happen to use the same word. Each CRV module has 4 layers of counters and a cluster must contain counters from at least 3 of the 4 layers; these are called coincidence clusters. During analysis, one of the steps is to ask if any of the CRV coincidence clusters are close in time to a track reconstructed in the tracker. If so, the track cannot be considered a signal candidate.
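
A tiny sketch of the layer-coincidence requirement (illustrative only; the real CRV reconstruction also applies time windows, photo-electron thresholds and spatial requirements, and the type names here are invented):

  #include <set>
  #include <vector>

  // Hypothetical stand-in for a CRV counter hit: which of the 4 layers
  // (0-3) of the module it is in, and its time.
  struct CrvCounterHit { int layer; double time_ns; };

  // A group of nearby, in-time counter hits qualifies as a coincidence
  // cluster only if it contains counters from at least 3 of the 4 layers.
  bool isCoincidenceCluster(std::vector<CrvCounterHit> const& group) {
    std::set<int> layers;
    for (auto const& h : group) layers.insert(h.layer);
    return layers.size() >= 3;
  }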

Implicit in the above description of the CRV is that the information is not used to veto the event at TDAQ time; it is only used during physics analysis.

The above story describes data processing for data from the Mu2e detector. The early stages of the story also describe the data flow from the Tracker Vertical Slice Test (VST). When the test stands and VSTs for the other subsystems are integrated into the Mu2e offline software, their stories will also be similar.

Mu2e has algorithms, developed on simulated events, to perform the above steps. All of these algorithms will need retuning once they encounter experimental data. Experimental data will also expose corner cases and imperfections that were not present in the simulated events. So there is a lot of work still to be done and many projects for grad students and post docs.

Analysis Formats

Mu2e plans to process data using art format files through to the end of calibration and reconstruction. In order to do final analysis of the data we will produce datasets in a format that is more user friendly for fast turn around analysis. To date, this has meant producing root TTrees and the two commonly used TTree formats are:

  • TrkAna
  • Stntuple

The traditional way to analyze TTrees is using root. In recent years, more and more people have been using Python based tools to analyze TTrees. There is a TrkAna Tutorial included in the ComputingTutorials. Information on Stntuple is available at: https://github.com/Mu2e/Stntuple/blob/old_master/doc/Stntuple.org .

We expect that there will be significant evolution of analysis formats and tools as Mu2e approaches data taking. We invite everyone with experience with modern analysis tools to participate. Keep your eyes open for announcements. The keyword is likely to be "Analysis Model".

Simulations

The previous section described data processing for data from the Mu2e detector once it comes online. To date, the available samples of Mu2e events have all been produced by simulations. Over the years, simulations have been used to develop triggering, reconstruction and analysis algorithms, to predict the physics reach of the experiment and to study design variations and design tradeoffs. Simulations, and reconstruction of simulated events, played a critical role in convincing ourselves, the HEP community and the review committees that the Mu2e design is capable of achieving the physics goals of the experiment. There have been many iterations of the simulation over the years, each adding more detail and improved fidelity.

The following will present a compact overview of how Mu2e simulates events that contain one conversion electron (CE).

In HEP, simulations rely heavily on Monte Carlo methods ( https://en.wikipedia.org/wiki/Monte_Carlo_method ) and simulations are often referred to as Monte Carlos or simply as MC. A useful review of some commonly used Monte Carlo methods is maintained by the Particle Data Group; see page 5 of https://pdg.lbl.gov/2022/reviews/mathematical_tools.html .

The workhorse of HEP simulations is a program named Geant4 (https://geant4.web.cern.ch) which originated at CERN and which is now maintained by a world wide collaboration of physicists. It is also used in such fields as nuclear, medical, accelerator and space physics.

Geant4 (G4) is a toolkit for simulation of the passage of particles through matter. To use it, Mu2e has prepared a 3D volume-based description of the geometry of the Mu2e apparatus; the model includes the material from which each volume is made. We have also specified the magnetic field, including the field outside of the coils that extends into the Mu2e Hall. The geometry model includes enough detail of the Mu2e building and surrounding area that it can be used as the input for radiation safety studies. The description of the building and surrounding area is also used when we predict how often cosmic rays can produce false signals.

To start Geant4, we give it a stack of particles to process. Each particle in the stack is specified by its particle species (electron, muon, proton, neutron … ), position, 4-momentum and the time at which it is created. The particles in the stack that we give to Geant4 are called “primary particles” or simply “primaries”. To study conversion electrons, the only particle on the stack is the conversion electron.

The information that is needed to put a particle on the stack is its particle species, electron in this case, its position in space, the time at which it was created and its initial 4-momentum. For the case of conversion electrons we use a previous run of G4 to compute the times and positions at which muons stop in the stopping target foils. When we generate conversion electrons as primary particles we sample the output of this earlier G4 run.
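
The sketch below shows the flavor of this generation step, following the recipe spelled out in the Simulations with Pileup section below (choose a recorded muon stop, pick an isotropic direction, add an exponentially distributed decay time). All of the names are invented, and the numerical values (a muonic-Al lifetime of roughly 864 ns and a conversion momentum of about 105 MeV/c) are approximate, assumed values used only for illustration:

  #include <cmath>
  #include <random>
  #include <vector>

  // Illustrative sketch of generating a conversion-electron primary by
  // sampling a previously simulated muon stop; names are hypothetical.
  struct MuonStop { double x, y, z, t; };          // from an earlier G4 run
  struct Primary  { double x, y, z, t, px, py, pz; };

  Primary makeConversionElectron(std::vector<MuonStop> const& stops,
                                 std::mt19937& rng) {
    // Pick one recorded muon stop at random (assumes stops is not empty).
    std::uniform_int_distribution<std::size_t> pick(0, stops.size() - 1);
    MuonStop const& s = stops[pick(rng)];

    // Direction uniform on 4pi.
    std::uniform_real_distribution<double> flat(0., 1.);
    double cosTheta = 2. * flat(rng) - 1.;
    double sinTheta = std::sqrt(1. - cosTheta * cosTheta);
    double phi = 2. * std::acos(-1.) * flat(rng);

    // Momentum magnitude: the conversion momentum, about 105 MeV/c.
    double p = 105.0;  // MeV/c, approximate

    // Decay time of the muonic Al atom: exponential with a lifetime of
    // roughly 864 ns (assumed value, for illustration only).
    std::exponential_distribution<double> decay(1. / 864.);  // rate per ns
    double t = s.t + decay(rng);

    return {s.x, s.y, s.z, t,
            p * sinTheta * std::cos(phi),
            p * sinTheta * std::sin(phi),
            p * cosTheta};
  }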

Geant4 knows how to transport particles in an arbitrary set of electric and magnetic fields; It also has detailed models of how particles interact with many different kinds of matter; these models are called “physics processes”. Geant4 knows that particles lose energy and scatter as they traverse material; it knows that the details are different for different types of particles. It knows that charged particles may produce delta rays and that photons may Compton scatter or convert to an e+ e- pair. It knows that neutrons can capture on hydrogen, releasing a 2.2 MeV gamma ray. It knows that some particles may have nuclear interactions with the nuclei that they encounter and it knows the cross-sections for each sort of interaction. It knows that some particles decay. It knows that when a particle loses all of its kinetic energy it will come to a stop in the material and that some particle species are captured on atoms when they stop. And so on.

Geant4 breaks up the transport of a particle into many steps. One step ends and a new step starts whenever a particle crosses a geometry boundary or whenever a physics process tells it to.

Many of the physics processes modeled by Geant4 produce secondary particles; examples include decays, delta rays, conversion pairs, Compton electrons and many particle species created by nuclear interactions. As Geant4 produces these particles, it adds them to its stack of particles to process. When it finishes with the first particle it takes a particle off of its stack and processes it. When Geant4 has drained the stack, it has finished processing one event.

When discussing Geant4, the jargon “secondary particles” or “secondaries” refers to all generations of particles produced by Geant4, not just to the generation produced directly by the primary particles. This is common use throughout HEP.

Mu2e extracts information from Geant4 to record a complete parent-child history of primary and secondary particles, including the physics process that caused a particle to be created or to stop. Optionally Mu2e can record a detailed model of the trajectory, suitable for event displays.

Mu2e also records what happens during G4 steps that occur in interesting volumes. Mu2e records steps taken within the gas volume of each straw, within each calorimeter crystal and within each CRV counter; in particular, Mu2e records the energy deposited by each step and the time at which it was deposited. These volumes are referred to as “sensitive detectors” or “sensitive volumes”. This information will later be used as an input towards producing simulated digis. Sometimes we record steps in the readout electronics that are mounted close to the detector; we use this to compute the radiation dose on those materials so that we know how to specify the radiation-hardness of the electronics that we buy. We have also studied radiation doses in many parts of the Mu2e Hall and surrounding areas.

During the G4 processing the Mu2e code adds MC-truth information to the art::Event. This includes information about each particle that was popped off of the stack, the parent-child information among these particles and information about energy deposits in sensitive detectors. Optionally, it may include the true trajectory of selected particles.

The Mu2e simulation process takes the output of G4 and uses algorithms written by Mu2e to transform energy depositions in sensitive volumes into digis. These algorithms must know about the energy transport processes within the sensitive materials, the response of the readout electronics to this energy and processes that introduce noise into the system. Mu2e code writes these digis to the art::Event.

Reconstruction of simulated events starts by reading the digis from the art::Event and proceeds using the same algorithms as will be used on data. These algorithms never look at the MC truth information, only at the data-like information. The reconstructed information is written to the art::Event.

There are missing steps in this process: the transformation of MC digis into the raw data format and back into reco digis. Since this process is lossless in both directions we do not compromise the physics integrity by skipping these steps.

Following reconstruction of simulated events, Mu2e runs code that matches reconstructed objects to their MC precursors. This information is also added to the art::Event. This information can be used to characterize the correctness of the reconstruction algorithms and as a guide to improving their correctness.

MC truth information is usually also added to Analysis Format files that are made from simulated events.

Simulations with Pileup

The previous section described how Mu2e simulates a single conversion electron with no additional activity in the detector. This is not realistic because real events will contain contributions from many other sources, collectively called pileup. These include:

  1. Low energy electrons from the muon beam that bremsstrahlung in the target to create photons. Some photons will produce hits in the calorimeter. Some photons will produce photoelectrons that produce hits in the tracker. Other photons will interact in the tracker materials to produce a Compton electron. Often Compton electrons have enough energy that they spiral along the magnetic field lines and leave hits in many straws; sometimes they reach the calorimeter.
  2. Electrons from muon Decay in Orbit (DIO) in the stopping target that have a momentum too low to make a reconstructable track but high enough to make hits in the innermost tracker straws or innermost calorimeter crystals. DIO electrons that have enough energy to produce a reconstructable track are rare and are very unlikely to occur in the same event as a conversion electron. For more information about DIO, see BackgroundsPhysIntro#Decay_In_Orbit.
  3. Protons and deuterons from muon nuclear capture. Some will produce reconstructable tracks. Others will make hits in the innermost parts of the detector but not produce a reconstructable track.
  4. There are many sources of neutrons; these include the primary proton interactions in the production target, secondary hadrons that interact in the shielding materials or collimators, and neutrons produced by muon nuclear capture in any material, including, but not limited to, the stopping target. A neutron may interact in or near a CRV counter to produce secondaries that make hits in that counter.
  5. Some beam muons stop in the Inner Proton Absorber (IPA) and form muonic atoms. Some of these muonic atoms will produce DIO electrons. Because the IPA is far off of the axis of the tracker, the main body of Michel electrons from these decays will travel through the tracker and make hits. Most of the reconstructable tracks in the tracker come from this source.
  6. Photons and neutrons produced in the stopping target by muon nuclear capture can interact in the DS cryostat and the albedo from these interactions can produce hits in the tracker or calorimeter.

The Mu2e workflow for creating a simulated event with a conversion electron plus pileup is given below. This description is conceptual and skips some steps that are critical for computational efficiency but introduce no new physics content.

  1. Use Geant4 to:
    1. Simulate a proton beam hitting the production target.
    2. Record the times and positions at which muons stop in the stopping target.
    3. Build a model of pileup energy depositions in the tracker, calorimeter and CRV. Do not process these to digis; just leave them as energy depositions.
  2. Create a G4 primary particle that models the conversion electron:
    1. Choose a position from the output of step 1.2;
    2. Set the energy to the known value of the conversion energy and generate a direction that is uniform on 4pi;
    3. Set the time to the time from step 1.2 plus a time randomly chosen from the exponential decay of a muonic Al atom.
  3. Process this primary particle with Geant4.
  4. Overlay the pileup model (the output of step 1.3) onto the output of step 3; see the sketch after this list. This means to store in memory all of the energy depositions in each sensitive detector, including those from the conversion electron and those from pileup.
  5. Process the output of step 4 through a model of the detector response, including the electronics response, to produce simulated digis.
  6. Write the simulated digis and the MC truth information to the art::Event.
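
A conceptual sketch of the overlay in step 4 (illustrative types only; the real event-mixing machinery in the Mu2e and art code is considerably more sophisticated):

  #include <map>
  #include <vector>

  // Hypothetical energy deposit in a sensitive volume: how much energy
  // was deposited and when, keyed by the volume (straw, crystal or CRV
  // counter) identifier.
  struct EnergyDeposit { double edep_MeV; double time_ns; };
  using DepositMap = std::map<int, std::vector<EnergyDeposit>>;

  // Merge the pileup deposits into the conversion-electron deposits so
  // that the downstream digitization step sees both together.
  void overlay(DepositMap& signal, DepositMap const& pileup) {
    for (auto const& [volumeId, deposits] : pileup) {
      auto& destination = signal[volumeId];
      destination.insert(destination.end(), deposits.begin(), deposits.end());
    }
  }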

In Mu2e data, and in simulated events, almost all hits in the tracker, calorimeter and CRV come from pileup processes.

Background vs Pileup

When one is speaking carefully, background and pileup have different meanings. Backgrounds are a source of particles that could be reconstructed as 105 MeV/c electrons, consistent with coming from the stopping target and passing all of the other selection criteria required of signal candidates; that is, they can produce false signal candidates. A summary of physics processes that may produce backgrounds is available elsewhere on the wiki, BackgroundsPhysIntro. And the meaning of pileup was defined in the previous section.

In casual conversation people sometimes use the word "background" to include both background, as just defined, and pileup. If you are uncertain about how someone is using the word "background", ask them.

Still Under Construction

The information below is correct. Some of it will remain on this page. Some of it will get moved elsewhere.

Coding

The Mu2e simulation and reconstruction code is written in C++. We write modules which create simulated data, or read data out of the event, process it, and write the results back into the event. The modules plug into a framework called art, and this framework calls the modules to do the actual work as it reads an input file and writes an output file. The primary data format is determined by the framework, so it is called the art format and the file will have the extension .art.
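
To give a flavor of what a module looks like, here is a minimal sketch of an art producer written in the classic style. Treat it only as a sketch: the class name and product type are invented, and the exact boilerplate (base-class initialization, build files, product declarations) should be copied from an existing Mu2e module rather than from here:

  // Sketch of an art producer module; names are hypothetical.
  #include "art/Framework/Core/EDProducer.h"
  #include "art/Framework/Core/ModuleMacros.h"
  #include "art/Framework/Principal/Event.h"
  #include "fhiclcpp/ParameterSet.h"

  #include <memory>
  #include <vector>

  class MyExampleProducer : public art::EDProducer {
  public:
    explicit MyExampleProducer(fhicl::ParameterSet const& pset)
      : EDProducer{pset} {
      produces<std::vector<int>>();   // declare what this module will add
    }

    void produce(art::Event& event) override {
      auto output = std::make_unique<std::vector<int>>();
      output->push_back(static_cast<int>(event.id().event()));  // toy content
      event.put(std::move(output));   // write the product into the event
    }
  };

  DEFINE_ART_MODULE(MyExampleProducer)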

We use the git code management system to store and version our code. Currently, we have one main git repository which contains all our simulation and reconstruction code. You can check out this repository, or a piece of it, and build it locally. In this local area you can make changes to code and read, write and analyze small amounts of data. We build the code with a make system called scons. The code may be built optimized (prof) or non-optimized and prepared for running a debugger (debug).

At certain times, the code is tagged, built, and published as a stable release. These releases are available on the /cvmfs disk area. cvmfs is a sophisticated distributed disk system with layers of servers and caches, but to us it just looks like a read-only local disk, which can be mounted almost anywhere. We often run large projects using these tagged releases. cvmfs is mounted on the interactive nodes, at remote institutions, on some desktops, and on all the many grid nodes we use.

You can read more about accessing and building the code, git, scons and cvmfs.

Executables

Which modules are run and how they are configured is determined by a control file, written in fcl (pronounced fickle). This control file can change the random seeds for the simulation and the input and output file names, for example. A typical run might be to create a new simulation file. For various reasons, we often do our simulation in several stages, writing out a file between each run of the executable, or stage, and reading it in to start the next stage. A second type of job might be to run one of the simulation stages with a variation of the detector design, for example. Another typical run might be to take a simulation file as input and test various reconstruction algorithms, and write out reconstruction results.

Data products

The data in an event in a file is organized into data products. Examples of data products include straw tracker hits, tracks, or clusters in the calorimeter. The fcl is often used to decide which data products to read, which ones to make, and which ones to write out. There are data products which contain the information of what happened during the simulation, such as the main particle list, SimParticles.

UPS Products

Disambiguation of "products": please note that we have both data products and UPS products, which unfortunately are both referred to as "products" at times. Please be aware of the difference, which you can usually determine from the context.

The art framework and fcl control language are provided as a product inside the UPS software release management system. There are several other important UPS products we use. This software is distributed as UPS products because many experiments at the lab use these utilities. You can control which UPS products are available to you (which you can recognize by a setup command like "setup root v6_06_08") but most of this is organized as defaults inside of a setup script.

You can read more about how UPS works.

Histogramming

Once you have an art file, how do you actually make plots and histograms of the data? There are many ways to do this, so it is important to consult with the people you work with, and make sure you are working in a style that is consistent with their expertise and preferences, so you can work together effectively.

In any case, we always use the root UPS product for making and viewing histograms. There are two main ways to approach it. The first is to insert the histogram code into a module and write out a file which contains the histograms. The second method is to use a module to write out an ntuple, also called a tree. This is a summary of the data in each event, so instead of writing out the whole track data product, you might just write out the momentum and the number of hits in the ntuple. The ntuple is very compact, so you can easily open it and make histograms interactively very quickly.
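
For example, a minimal ROOT macro in the second style might look like the sketch below. The file, tree and branch names ("myNtuple.root", "trackTree", "p") are hypothetical; substitute the ones from your own ntuple:

  // histo_example.C -- run with: root -l histo_example.C
  #include "TDirectory.h"
  #include "TFile.h"
  #include "TH1.h"
  #include "TTree.h"
  #include <cstdio>

  void histo_example() {
    TFile* file = TFile::Open("myNtuple.root");
    if (!file || file->IsZombie()) { std::printf("could not open file\n"); return; }

    TTree* tree = nullptr;
    file->GetObject("trackTree", tree);
    if (!tree) { std::printf("tree not found\n"); return; }

    // Fill a momentum histogram directly from the tree.
    tree->Draw("p >> hMom(100, 90., 110.)");

    // Retrieve the histogram created by Draw and save it to a file.
    TH1* hMom = static_cast<TH1*>(gDirectory->Get("hMom"));
    if (hMom) hMom->SaveAs("hMom.root");
  }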

Read more about ways to histogram or ntuple data for analysis.

Workflows

Designing larger jobs

After understanding data on a small level by running interactive jobs, you may want to run on larger datasets. If a job is working interactively, it is not too hard to take that workflow and adapt it for running on large datasets on the compute farms. First, you will need to understand the mu2egrid UPS product, which is a set of scripts to help you submit jobs and organize the output. mu2egrid will call the jobsub UPS product to start your job on the farm. Your data will be copied back using the ifdh UPS product, which is a wrapper to data transfer software. The output will go to dCache, which is a high-capacity and high-throughput distributed disk system. We have hundreds of terabytes of disk space here, divided into three types (a scratch area, a persistent disk area, and a tape-backed area). Once the data is written, there are procedures to check it and optionally concatenate the files and write them to tape. We track our files in a database that is part of the SAM UPS product. You can see the files in dCache by looking under the /pnfs filesystem. Writing and reading files to dCache can have consequences, so please understand how to use dCache and also consult with an experienced user before running a job that uses this disk space.

Grid resources

Mu2e has access to a compute farm at Fermilab, called Fermigrid. This farm is several thousand nodes and Mu2e is allocated a portion of the nodes (our dedicated nodes). Once you have used the interactive machines to build and test your code, you can submit a large job to the compute farms. You can typically get 1000 nodes for a day before your priority goes down and you get fewer. If the farm is not crowded, which is not uncommon, you can get several times that by running on the idle or opportunistic nodes.

Mu2e also has access to compute farms at other institutions through a collaboration called the Open Science Grid (OSG). It is easy to modify your submit command to use these resources. We do not have a quota here; we can only access opportunistic nodes, so we don't really know how many nodes we can get, but it is usually at least as much as we can get on Fermigrid. This system is less reliable than Fermigrid, so we often see unusual failure modes or jobs restarting.

Your workflow

Hopefully you now have a good idea of the concepts and terminology of the Mu2e offline. What part of the offline system you will need to be familiar with will depend on what tasks you will be doing. Let's identify four nominal roles. In all cases, you will need to understand the accounts and authentication.

  1. ntuple user. This is the simplest case. You probably will be given an ntuple, or a simple recipe to make an ntuple, then you will want to analyze the contents. You will need to have a good understanding of C++ and root, but not much else.
  2. art user. In this level you would be running art executables, so you will also need to understand modules, fcl, and data products. Probably also how to make histograms or ntuples from the art file.
  3. farm user. In this level you would be running art executables on the compute farms, so you will also need to understand the farms, workflows, dCache, and possibly uploading files to tape.
  4. developer. In this case, you will be writing modules and algorithms, so you need to understand the art framework, data products, geometry, C++, and standards in some detail, as well as the detector itself.