CodeDepencyGraph: Difference between revisions

From Mu2eWiki
Latest revision as of 02:48, 2 June 2024

Instructions to make a code dependency graph.

These instructions will only work on machines that have ack and graphviz installed.

These instructions describe how to run code-dep-graph, a tool that creates a graph of compile-time dependence relationships between packages. The tool starts from the current working directory and works down from there. This tool has its own notion of a package, which is slightly different from other notions in common use. Roughly speaking, any directory that contains code is its own package; if a directory has subdirectories named inc and src, those names are elided. Here are some examples of packages taken from a recent version of Mu2e Offline: RecoDataProducts, ExtinctionMonitorFNAL/Utilities, ExtinctionMonitorFNAL/Reconstruction; the slash is part of the package name.
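The naming rule above can be illustrated with a small shell sketch. This is not code from code-dep-graph; the helper name package_of and the file names Example.hh, Helper.cc and Tracker.cc are hypothetical, and the real tool may handle corner cases differently.

```shell
# Hypothetical helper: map a source file path to a package name by
# dropping the file name and then a trailing inc or src directory.
package_of() {
  local dir
  dir=$(dirname "$1")
  case "$dir" in
    */inc|*/src) dir=${dir%/*} ;;   # elide the inc/src level
  esac
  printf '%s\n' "$dir"
}

package_of RecoDataProducts/inc/Example.hh                  # prints RecoDataProducts
package_of ExtinctionMonitorFNAL/Utilities/src/Helper.cc    # prints ExtinctionMonitorFNAL/Utilities
package_of ExtinctionMonitorFNAL/Reconstruction/Tracker.cc  # prints ExtinctionMonitorFNAL/Reconstruction
```

The slash stays in the package name, as in the examples given in the text.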

code-dep-graph walks the directory tree looking for C++ header and implementation files. It parses these files and extracts the #include directives. It builds the package dependency tree based on the #include directives.
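To make the idea concrete, here is a rough sketch of the same approach using only find, grep and sed on a tiny throw-away tree. It is an illustration, not the tool's actual implementation: the packages PkgA and PkgB, the file names, and the helpers pkg_of and list_edges are all hypothetical, and only quoted #include lines are considered.

```shell
# Build a tiny throw-away tree with two hypothetical packages.
tmp=$(mktemp -d)
mkdir -p "$tmp/PkgA/src" "$tmp/PkgB/inc"
printf '#include "PkgB/inc/Thing.hh"\n#include <vector>\n' > "$tmp/PkgA/src/User.cc"
printf '// no includes here\n' > "$tmp/PkgB/inc/Thing.hh"

# Path -> package: drop the file name, then a trailing inc or src.
pkg_of() {
  printf '%s\n' "$1" | sed -E 's#/[^/]*$##; s#/(inc|src)$##'
}

# Print one "includer -> included" line per distinct package pair.
list_edges() {
  ( cd "$1" || exit 1
    # Find C++ files and keep only lines of the form: #include "..."
    find . -type f \( -name '*.hh' -o -name '*.cc' \) \
         -exec grep -H -E '^[[:space:]]*#[[:space:]]*include[[:space:]]*"' {} + |
    # Reduce "./file:#include "path"" to "file path".
    sed -E 's|^\./([^:]*):[^"]*"([^"]*)".*|\1 \2|' |
    while read -r from to; do
      a=$(pkg_of "$from"); b=$(pkg_of "$to")
      # Skip includes that stay inside one package.
      [ "$a" != "$b" ] && printf '%s -> %s\n' "$a" "$b"
    done | sort -u )
}

list_edges "$tmp"   # prints: PkgA -> PkgB
```

The system header in angle brackets is ignored here by construction; whether code-dep-graph treats such includes the same way is not stated in these instructions.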

Here is a cookbook to run the tool.

  1. Make a clean working directory and cd to it.
       mkdir work
       cd work

  2. Clean checkout. This will be a throw-away copy of the code; don't use a copy with changes you intend to commit and push.
       git clone https://github.com/Mu2e/Offline

  3. Optional but highly recommended: remove files that unnecessarily complicate the picture. This includes all modules, which must sit at the top of the dependence hierarchy, and a few directories like Sandbox. This step deletes files, hence the warning about using a throw-away copy. By agreement, test directories contain no production code, so remove them too.
       find Offline -name \*_module.cc -delete
       find Offline -name \*_source.cc -delete
       find Offline -name \*_tool.cc -delete
       find Offline -name \*_service.cc -delete
       rm -r Offline/Sandbox Offline/HelloWorld
       rm -rf Offline/.git
       rm -rf $(find Offline -maxdepth 2 -type d -name "test")

  4. Discover what versions of cetbuildtools are available. Choose a recent version. As of October 22, 2013, version v7_04_00 is known to work.
       ups list -aK+ cetbuildtools

  5. Set up an appropriate version of cetbuildtools.
       setup ack
       setup cetbuildtools v7_04_00
       # or, on al9:
       spack load cetmodules/xyn2j2a
       spack load ack/x32hbd2

  6. Run code-dep-graph from the top directory of Offline. The output is a "dot file" that contains dependency pairs; you can read this file with a text editor. Dot files are the standard format of the graphviz package; the syntax is intuitive and you can find documentation online by searching for graphviz or "dot file format". The text represents a directed graph. The log file is verbose and I have not found anything useful in it if the code is working properly. The sed command removes the redundant "Offline/" everywhere.
       time ${CETBUILDTOOLS_DIR}/bin/code-dep-graph -v -o trimmed.dot >& trimmed.log
       sed -i 's/Offline\///g' trimmed.dot

  7. The previous step should take ~4 minutes on an unloaded machine with local disks. The names trimmed.dot and trimmed.log are illustrative; they have no special meaning.

  8. Do a transitive reduction of the directed graph and create the reduced directed graph as a png file. A transitive reduction means the following. Suppose that we have 3 packages, A, B, C; also suppose that A depends directly on both B and C, while B depends on C. In the transitive reduction the dependence of A on C will be elided since it is implied by the dependence of A on B and of B on C. If you don't do this step on Offline, the resulting graph is so busy that it is not readable. The commands dot and tred are part of the graphviz package: https://www.graphviz.org
       setup graphviz
       # (on al9, graphviz is installed as an rpm)
       tred trimmed.dot | dot -Tpng -o trimmed.png

  9. If there are loops in the directed graph, tred will issue a diagnostic like the following:
       warning: %1 has cycle(s), transitive reduction not unique
       cycle involves edge MCDataProducts -> RecoDataProducts

  10. Additional comments on tred
    1. If there are several loops in the graph, tred will only report the first one that it finds. So you need to fix the problem and iterate until there are no more diagnostics.
    2. The packages named in the "cycle involves edge" line can sometimes be a few hops from the real problem, so you may need to hunt.
    3. The output of tred is just another dot file; you can capture it to a file instead of piping to dot.
    4. If you prefer a format different from png, dot has a lot of options: dot --help
    5. Sometimes the transitive reduction can be confusing - it's hard to see a dependence that you know should be there. In that case just grep the original dot file and you can verify that the dependence is there.
    6. Hint: when you are in the process of discovering loops, you can identify the offending edge and hand edit the dot file to remove it; then rerun tred to get the new graph. This is much, much faster than rerunning code-dep-graph.

  11. View the graph using your favorite png viewer, such as a web browser, Preview on a Mac, display on SLF, or eog on al9.
      An example of a dependency graph is here: Media:Trimmed.png. It was made from git commit 5c4868af, from Oct 24, 2018. A revised version of the dependency graph with 1 edge (RecoDataProducts -> MCDataProducts) hand removed from the dot file: Media:Trimmed4.png. This is the edge that will be removed when the one outstanding problem is fixed.

  12. Some comments on the graph.
    1. Packages with no dependencies are at the bottom of the graph; packages that have no packages depending on them are at the top.
    2. code-dep-graph gets a little confused with tools; it forgets to draw a box around the name of the tool plugin; I will speak with the author. This feature is not shown in this figure but it is a known issue.
    3. A typical service package makes two shared libraries: the service plugin and the library that contains all of the other code in the package. Both are represented in the graph; the service plugin is colored blue and the other library is uncolored.
    4. One obvious problem with the first graph is that MCDataProducts and RecoDataProducts depend on each other. The problem is understood and has been assigned.

  13. Optional: create the visualization of the graph without transitive reduction; the result is too busy to read.
       dot trimmed.dot -Tpng -o trimmed_notred.png
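The A, B, C example from the transitive-reduction step, the grep check, and the hand-editing hint can all be tried on a toy dot file before working on the real graph. The sketch below is only an illustration: the helpers edge_re, has_edge and drop_edge are hypothetical, and they assume one edge per line written as `A -> B` (optionally quoted), which may not match every detail of the file that code-dep-graph writes. The tred call is skipped if graphviz is not installed.

```shell
work=$(mktemp -d)

# The example from the text: A depends on B and C, and B depends on C.
cat > "$work/example.dot" <<'EOF'
digraph example {
  A -> B;
  A -> C;
  B -> C;
}
EOF

# Regular expression for a direct edge FROM -> TO, one edge per line.
edge_re() {
  printf '(^|[[:space:]"])%s"?[[:space:]]*->[[:space:]]*"?%s("|[[:space:];[]|$)' "$1" "$2"
}

# has_edge FILE FROM TO: succeed if the direct edge is present.
has_edge()  { grep -q -E "$(edge_re "$2" "$3")" "$1"; }

# drop_edge FILE FROM TO: print FILE without that edge (the "hand edit").
drop_edge() { grep -v -E "$(edge_re "$2" "$3")" "$1"; }

# The direct dependence is visible in the unreduced file.
has_edge "$work/example.dot" A C && echo "A -> C is in the original graph"

# After transitive reduction the implied edge A -> C should be gone.
if command -v tred >/dev/null 2>&1; then
  tred "$work/example.dot" > "$work/reduced.dot"
  has_edge "$work/reduced.dot" A C || echo "A -> C was elided by tred"
fi

# Remove one edge by hand, as you would while hunting for a loop.
drop_edge "$work/example.dot" A C > "$work/edited.dot"
```

On the real graph the same pattern applies with trimmed.dot and package names such as MCDataProducts and RecoDataProducts; rerunning tred on the edited file is then enough to see whether the cycle diagnostic goes away.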