CodeDebugging
Introduction
The two main tools we use to debug code are the standard GNU debugger, gdb, and the memory debugger, valgrind. Some people use gprof or openspeedshop for profiling.
art tools
art has some debugging tools built in, including time and memory checking systems; see the art tools page.
gdb
gdb is the standard GNU text-only debugger. A compatible version is set up when you set up an Offline version.
Here are a few simple commands to get started; there is also an online manual.
If your command is
mu2e -s somefile.art -c Print/fcl/print.fcl
Invoke gdb with:
gdb --args mu2e -s somefile.art -c Print/fcl/print.fcl
Start execution:
(gdb) run
You can also restart execution by typing "run" again. Before the first run, not all of the libraries are loaded; after you have run once, they are, so their symbols are available when you re-run.
Show the call stack:
(gdb) where
Select second frame in stack:
(gdb) frame 2
Typically, the exe will have been built on another machine, so gdb can't find the source code. You can tell it about source directories like:
(gdb) dir /cvmfs/mu2e.opensciencegrid.org/Offline/v7_0_4/SLF6/prof/Offline/Print/src
These commands can be put in a .gdbinit file.
Step one line, stepping over function calls:
(gdb) n
Step one line, stepping into function calls:
(gdb) s
Set a breakpoint by function (tab completion is available if the libraries are loaded):
(gdb) break 'mu2e::CaloHitPrinter::Print(art::Event const& event, std::ostream& os)'
Set a breakpoint by line number:
(gdb) break 'CaloHitPrinter.cc:102'
Continue after a break:
(gdb) cont
Run until the current function (the selected stack frame) returns:
(gdb) finish
Print local variable "x":
(gdb) p x
Print the contents of the STL vector "myVector":
(gdb) print *(myVector._M_impl._M_start)@myVector.size()
List the source code around line n:
(gdb) list n
Stop whenever an exception is thrown, which catches art exceptions:
(gdb) catch throw
It is possible to do very much more, such as breaking on a write to a memory location, attaching to a running process, examining threads, calling functions, setting values, breaking only under certain conditions, etc.
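The commands above can be collected in a command file so that every session starts the same way. The following is a minimal sketch, assuming a bash-like shell. The file name mu2e.gdb and the GDB_CMDS variable are hypothetical choices, while the source directory and breakpoint are the ones used in the examples above. "set breakpoint pending on" is included because the mu2e libraries are not yet loaded when gdb reads the file.

```shell
# Sketch: collect the gdb commands from this section in one command file.
# GDB_CMDS and the name mu2e.gdb are illustrative, not a Mu2e convention.
GDB_CMDS="${GDB_CMDS:-mu2e.gdb}"

cat > "$GDB_CMDS" <<'EOF'
# where to find the source for a release built on another machine
dir /cvmfs/mu2e.opensciencegrid.org/Offline/v7_0_4/SLF6/prof/Offline/Print/src
# accept breakpoints in libraries that are not loaded yet
set breakpoint pending on
# stop whenever a C++ exception is thrown
catch throw
break CaloHitPrinter.cc:102
EOF

# Print the command that starts gdb with this file; -x reads commands from a file.
echo "gdb -x $GDB_CMDS --args mu2e -s somefile.art -c Print/fcl/print.fcl"
```

Recent gdb versions read a .gdbinit in the working directory only if the auto-load safe path allows it, which is why this sketch passes the file explicitly with -x instead.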
ddt
valgrind
valgrind is a memory debugger which largely works by inserting itself into heap memory access. It can detect:
- use of uninitialized variables
- accessing memory freed or never allocated
- double deletion
- memory leaks
It is installed on all interactive machines, or you can set up a particular version with UPS.
Here is a typical command:
setup valgrind v3_13_0
valgrind --leak-check=yes --error-limit=no -v \
  --demangle=yes --show-reachable=yes --num-callers=20 mu2e -c my.fcl
The additional memory checking causes the exe to run much, much slower.
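Since each valgrind run is slow, it can help to save the report to a file once (valgrind's --log-file option does this) and then inspect it offline. The following is a minimal sketch, assuming a bash-like shell. The helper name summarize_valgrind_log is hypothetical, and the pattern covers only a few common memcheck messages, so extend it as needed.

```shell
# Sketch: count how often each kind of memcheck error appears in a saved log.
# summarize_valgrind_log is a hypothetical helper, not part of valgrind or Mu2e.
summarize_valgrind_log() {
  # $1: a log file written with valgrind --log-file=<file>
  grep -E -o 'Conditional jump or move depends on uninitialised value\(s\)|Invalid (read|write) of size [0-9]+|Use of uninitialised value of size [0-9]+' "$1" \
    | sort | uniq -c | sort -rn   # most frequent message first
}
```

A possible use is to add --log-file=valgrind.log to the valgrind command and then call "summarize_valgrind_log valgrind.log" to see which messages dominate the report.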
You may find that packages like the ROOT libraries produce so many (probably inconsequential) errors that they drown out the useful messages. "Conditional jump or move depends on uninitialised value(s)" is a common ROOT error. valgrind allows you to suppress errors that are not important to you. We have an example suppressions file which can be added with
valgrind --leak-check=yes --error-limit=no -v \
  --suppressions=$MU2E_BASE_RELEASE/scripts/valgrind/all_including_geant_callbacks.supp \
  --demangle=yes --show-reachable=yes --num-callers=20 mu2e -c my.fcl
This file is aggressive in removing benign errors, and it may also remove some useful errors. For example, the easiest way to suppress the thousands of errors from geant is to suppress errors that include the geant paths in the call stack. Unfortunately, the way geant is set up, we register our simulation code with geant, and geant calls our code at appropriate times. To valgrind, this makes our code look like a part of geant, and the naive suppression method may also suppress errors from our simulation code. There is no easy way around this, but it may be possible with sufficient effort.
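A narrower alternative is to suppress an error only when its innermost stack frame lies inside a given library, which leaves errors in our own code visible even when another package appears further up the call stack. The following is a minimal sketch, assuming a bash-like shell. The file name my_root.supp, the SUPP variable, the suppression name, and the choice of libCore.so as the ROOT library to silence are all illustrative assumptions, not part of the Mu2e release.

```shell
# Sketch: write a small, targeted valgrind suppression file.
# SUPP and the name my_root.supp are illustrative choices.
SUPP="${SUPP:-my_root.supp}"

cat > "$SUPP" <<'EOF'
# Suppress "Conditional jump or move depends on uninitialised value(s)"
# only when the innermost stack frame is inside ROOT's libCore.
{
   root_libcore_uninitialised_cond
   Memcheck:Cond
   obj:*/libCore.so*
}
EOF

# Print a valgrind command that uses the new file.
echo "valgrind --leak-check=yes --error-limit=no --suppressions=$SUPP --num-callers=20 mu2e -c my.fcl"
```

valgrind's --gen-suppressions=all option prints a ready-made suppression block for each error it reports, which is a convenient starting point for entries like this one, and --suppressions can be given more than once to combine files.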
art floating point service
See the art reference for details. An example that enables floating point exceptions:
services.FloatingPointControl : {
  enableDivByZeroEx: true
  enableInvalidEx: true
  #enableUnderFlowEx: true # usually harmless
  enableOverFlowEx: true
}
gprof
openspeedshop
This profiling package needs to be installed from documents.
osspcsamp "mu2e -s somefile.art -c Print/fcl/print.fcl" >& log