CppFAQ: Difference between revisions
Latest revision as of 15:37, 25 July 2023
What C++ References are Recommended?
There are four sorts of references you will need:
- A C++ Language Reference
- A reference that describes the Standard Library
- A Tutorial
- As your skills develop you will also want references that describe current ideas about best practices.
Most good tutorial books are, by design, incomplete language references and incomplete standard library references. If you are a beginner it is wise to borrow or buy each of the first two references at the same time you acquire your first tutorial. The section below recommends some specific books and some online resources.
C++ Language References
- Stroustrup, Bjarne: "The C++ Programming Language, Special Third Edition", Addison-Wesley, 2000. ISBN 0-201-70073-5.
- cplusplus.com Language Tutorial
Standard Library References
- Josuttis, Nicolai M., "The C++ Standard Library: Tutorial and Reference", Addison-Wesley, 1999. ISBN 0-201-37926-0.
- cplusplus.com Library Reference
Tutorials
- Andrew Koenig and Barbara E. Moo, "Accelerated C++: Practical Programming by Example" Addison-Wesley, 2000. ISBN 0-201-70353-X.
Best Practices
- Herb Sutter and Andrei Alexandrescu, "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices.", Addison-Wesley, 2005. ISBN 0-321-11358-6.
- Any of the books on Walter Brown's book recommendation page - broken link!, particularly those under the heading "C++ Programming Practices".
Bob Bernstein has used "Thinking in C++" Volumes 1 and 2 from
http://www.ibiblio.org/pub/docs/books/eckel/ .
I have not read these carefully but, at first glance, they appear to be a very formal introductory
tutorial mixed with advanced tutorials.
C++11
In 2011 the International Standards committee defined a new C++ standard, named C++11. The section below contains links to places that discuss what features are new in C++11 and how to use them.
C++20
The art team will upgrade art to C++20 soon after ROOT has upgraded. This is estimated for late 2020 or early 2021. In preparation for this Kyle will prepare a series of presentations about C++20 that he will give at art stakeholders meetings:
- Part 1: Overview: https://indico.fnal.gov/event/43987/contributions/189256/attachments/130873/159936/cpp-20.pdf
In the Mu2e trigger, rare events with very long execution times can back up the pipelines, causing loss of data. In C++17 and earlier there is no way to interrupt a module that is taking too long, flag the event as pathological and continue with the next module, next path or next event. If we had this feature we could decide to route pathological events to the regular output stream, to a separate output stream or simply to discard them. At present the only option is to kill the offending art process and restart it; this will result in the loss of the event being processed.
There is a new feature in C++20, named jthreads, that may provide the tools needed to implement interruptible modules. Kevin Lynch sent an email with a heads up about jthreads:
However - and I'm certainly not an expert here - C++20 also added a fully back compatible replacement for std::thread called std::jthread for "joining thread" ... if the thread is not joined manually, it joins in its destructor, removing much of the pain of cleaning up after abandoned threads. The important bit for your question, however, is that it _also_ added a cancellation mechanism via std::stop_source/std::stop_token/std::stop_callback to allow thread safe, cooperative thread cancellation. I don't have a really good reference for this, but there is the currently incomplete documentation on https://en.cppreference.com/w/cpp/thread, and I watched a cppcon 2019 video on this:
Bryce Adelstein Lelbach “The C++20 Synchronization Library” https://www.youtube.com/watch?v=Zcqwb3CWqs4
That's of course a 1hr+ video with the bit you want buried somewhere in the middle.
Finally, Anthony Williams (who literally wrote the book on C++ concurrency "C++ Concurrency in Action") gave a talk that I have not watched, but his slides are available here, and he starts off quite quickly with discussion of stop_token and friends. https://www.justsoftwaresolutions.co.uk/files/cppcon_2019_concurrency_in_cpp20.pdf
Where may I use: static const Type = value;
The code fragment:
static const Type name = value;
defines a variable whose constant value is known at compile-time; this allows, but does not require, the compiler to perform various optimizations at compile-time. There are, however, some very non-intuitive restrictions on where this syntax can be used. The C++ standard states (I am paraphrasing):
- With one exception, one must never initialize a static const data member inside of the class definition; it must be initialized outside of the class declaration. See below for how to do this. A class definition is the code that you normally find inside a .h or .hh file.
- The exception is that you may initialize a static const data member if and only if "Type" is an "integral-type", which includes things like int, short, long and any enum type.
There are two places in which you may use the above syntax:
- Outside of any class declaration; that is at global scope, namespace scope or file scope.
- Within the body of a function, either a member function or a free function.
In both of these cases, the value of the variable is known at compile time so optimizations are possible but not required. Why "not required"? For example the standard states that compilers are not required to be able to do floating point arithmetic at compile time; so compilers may defer all floating point arithmetic to run time, even if the information is available at compile time. Were compile-time floating point arithmetic required, it would be very difficult, even impossible, to write a standard-compliant cross-compiler.
In the first case above, the scope of the variable may
be broader than the compilation unit so the compiler will
definitely allocate memory for the variable. In the second
case, the scope of the variable is a subset of the
compilation unit so the compiler can choose to make the variable a true compile time constant.
The g++ compiler, with its default options, will let you initialize a static const double data
member within a class declaration. We have, however, encountered a situation in which it produces
incorrect code. This non-standard usage can be identified by using the -pedantic flag on
the compiler. For many years, headers from code that Mu2e
depends on, such as the framework, ROOT and G4, were not compliant with -pedantic.
Therefore we could not use that flag. As of the summer of 2023 we noticed that
headers in non-Mu2e code have been modernized to be compliant with -pedantic. We are
now working to make our code base compliant with -pedantic.
When you need to have a static const data member of non-integral type, the way to initialize it is as follows. In the header file:
struct MyClass{
  // ...
  static const double member_datum;
};
In the implementation file ( the .cc file ):
const double MyClass::member_datum=123.4;
You really should put the last line in a .cc file, not in the .hh file after the class declaration. If you put it in the header, the definition is repeated in every compilation unit that includes the header; depending on how the code is linked, this either fails with a multiple-definition error or leaves multiple copies of the object in memory.
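The two cases can be put side by side in a short sketch. The struct Widget and its members are hypothetical names chosen for illustration; the two "files" are shown in one listing only to keep the example compact.

```cpp
// Widget.hh (hypothetical header)
struct Widget {
    static const int nChannels = 96;  // integral type: in-class initializer is allowed
    static const double pitch;        // non-integral type: declaration only
};

// Widget.cc (hypothetical implementation file)
const double Widget::pitch = 123.4;   // the single out-of-class definition

// Either member can then be used as an ordinary constant.
double totalWidth() {
    return Widget::nChannels * Widget::pitch;
}
```

Compiling this with -pedantic should produce no diagnostic, whereas moving the initializer 123.4 into the class body is the non-standard usage described above.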
Do not use Exception Specifications
In the following code fragment, the segment throw(art::Exception), which follows the argument list, is known as an exception specification:
void ClassName::methodName(int idx) const throw(art::Exception) {
  // body of the function
  if ( badThingHappens ){
    throw art::Exception("category") << "Informational message.";
  }
}
An exception specification is not required to be present when your code throws an exception and, when writing code for Mu2e, you should never write an exception specification. (Actually there is only one very obscure situation in which you must use an exception specification; the obscure situation is described at the end of this section.)
This is in contrast to the Java programming language in which exception specifications are required. The confusion arises because exception specifications do very different things in C++ and Java; these are both described below.
As a preamble to the explanation, you need to remember that code can throw in two ways:
- It can have an explicit throw statement, as in the above code fragment.
- It can call code that throws.
In the following, the phrase "if a method throws" is true if it does either of the above.
In Java, if a method throws an exception without catching it,
then the code must have an exception specification of the correct type or the code will not compile.
Conversely, if the method has an exception specification,
then the body of the method may throw an exception of the specified type that is not caught.
A consequence of this rule is as follows: consider method A, which calls method B, which calls method C,
which calls method D.
If method D throws an exception and we want it to be caught by method A, all of methods B, C and D must
have an exception specification and method A must not. Alternatively, if we expect method C to catch
the exception, then
only method D must have an exception specification and all others must not.
All of these rules are enforced at compile time.
In C++, on the other hand, the behaviour is very different. If a method has an exception specification, then the compiler will insert code that checks, at run time, the type of any exception thrown by the code (either directly or propagated up from code called by the method). If this code detects that the thrown exception matches that in the exception specification then nothing special happens. If, on the other hand, it detects that a different exception has been thrown, then the program will immediately call terminate(). Note that this form of exception specification was deprecated in C++11 and removed from the language in C++17.
The Mu2e framework is designed to catch most exceptions. When it does catch an exception its default behaviour is to shutdown as gracefully as possible; under most circumstances it will properly close all of the histogram files and event-data output files and it will flush the log files. The framework can also be configured to do things like write the offending event to a separate output file and to continue with the next event. Using an exception specification is either useless or it will produce a hard termination that leaves corrupted and incomplete histogram files, event-data output files and log files.
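The recommended pattern, throwing without an exception specification and letting a higher layer catch, can be sketched as below. This is not the art framework's code: runOneEvent is a hypothetical stand-in for the framework's event loop, analyzeEvent is a hypothetical module method, and std::runtime_error is used in place of art::Exception so that the example is self-contained.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical module code: no exception specification, it simply throws.
void analyzeEvent(int eventId) {
    if (eventId < 0) {
        throw std::runtime_error("negative event id");
    }
}

// Hypothetical stand-in for the framework: it catches the exception,
// records what happened and is free to continue with the next event.
std::string runOneEvent(int eventId) {
    try {
        analyzeEvent(eventId);
        return "ok";
    } catch (const std::exception& e) {
        return std::string("skipped: ") + e.what();
    }
}
```

Because the exception propagates normally, the caller keeps control and can close files and flush logs before deciding what to do next.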
The obscure circumstance in which an exception specification is required is this: if your code inherits from a Standard Library class, and that class has functions with nothrow specifications, then your derived class's functions must too.
The links below take you to discussions of this in the Computer Science literature.
- http://www.linuxprogrammingblog.com/cpp-exception-specifications-are-evil
- http://www.gotw.ca/publications/mill22.htm
- http://www.gotw.ca/gotw/082.htm
- http://www.boost.org/development/requirements.html#Exception-specification
- The presentation in Item #75 (pp. 146-7) from Sutter and Alexandrescu's book "C++ Coding Standards".
Compiler Generated Methods
Consider the following class:
// Hit.hh
class Hit{
public:
  Hit():_pos(), _dir(){}
  Hit( Hep3Vector pos, Hep3Vector dir): _pos(pos), _dir(dir){}

  // Accept compiler generated, d'tor, copy c'tor and assignment operator.

  const Hep3Vector& position()  const { return _pos; }
  const Hep3Vector& direction() const { return _dir; }

private:
  Hep3Vector _pos, _dir;
};
As the comment suggests, this class has three methods that are generated by the compiler:
- The destructor.
- A copy constructor.
- The assignment operator (= operator).
For this particular class these three methods will do the correct thing: the compiler generated destructor will call the destructor of each data member; the compiler written copy and assignment operators will call the corresponding methods of each data member.
With these additional methods, the above class allows code like:
Hit A, B( Hep3Vector(0.,0.,0.), Hep3Vector(0.,0.,1.) );
Hit C(B);   // Copy constructor
A = B;      // Assignment operator
If you know that the compiler generated methods will do the correct thing, then we
recommend that you let the compiler generate these methods.
The compiler written methods will do the correct thing if the data members are "Plain Old Data" (POD); they will also do the correct thing if the data members are objects which themselves have only data members that are PODs; and so on, recursively. In the above example, Hep3Vector is a POD. So it is safe to have data members of type Hep3Vector. If a data member is a std::vector<T>, where T satisfies the previous constraints, the compiler written code will do a deep copy of the vector. If this is the required behaviour, then let the compiler write the code.
In general the compiler written methods will do the wrong thing if your data members include objects that manage external resources, for example most kinds of pointers, or objects that are file streams.
A useful rule of thumb is called the "Rule of Three": if you discover that you need to write any of these three functions (presumably because the compiler would do the wrong thing), then write all three of them.
With the release of C++11, classes can have move-aware behaviour. For such classes the compiler is also able to write two additional methods, a move-aware constructor and a move-aware assignment operator. The rule of three then becomes the rule of five: if you need to write any of the five functions, then write all five.
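As an illustration of the Rule of Three, here is a minimal sketch of a class that owns a raw array, a case in which the compiler generated methods would copy the pointer rather than the data. The class Buffer and its members are hypothetical; in real code a std::vector<double> data member would make all three hand-written functions unnecessary.

```cpp
#include <algorithm>
#include <cstddef>

class Buffer {
public:
    explicit Buffer(std::size_t n) : _size(n), _data(new double[n]()) {}

    // 1. Destructor: release the owned array.
    ~Buffer() { delete[] _data; }

    // 2. Copy constructor: deep copy, so the two objects do not share storage.
    Buffer(const Buffer& rhs) : _size(rhs._size), _data(new double[rhs._size]) {
        std::copy(rhs._data, rhs._data + rhs._size, _data);
    }

    // 3. Assignment operator: allocate and copy first, then release the old array.
    Buffer& operator=(const Buffer& rhs) {
        if (this != &rhs) {
            double* fresh = new double[rhs._size];
            std::copy(rhs._data, rhs._data + rhs._size, fresh);
            delete[] _data;
            _data = fresh;
            _size = rhs._size;
        }
        return *this;
    }

    double& at(std::size_t i) { return _data[i]; }
    double at(std::size_t i) const { return _data[i]; }
    std::size_t size() const { return _size; }

private:
    std::size_t _size;
    double* _data;
};
```

With only the compiler generated versions, two Buffer objects would end up deleting the same array, which is the kind of wrong behaviour the rule is meant to prevent.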
Comparison between Signed and Unsigned Types
When the appropriate compiler warnings are enabled, the compiler will warn about comparisons between signed and unsigned integer types. Such code will work correctly so long as the expression of signed type is guaranteed to never be negative and the expression of unsigned type is guaranteed to never exceed the maximum value of the signed type.
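A small sketch shows what goes wrong when the signed value is negative. The function names are hypothetical; naiveLess spells out, with an explicit cast, the conversion that the compiler applies implicitly in a mixed comparison, and safeLess is one possible way to guard against it.

```cpp
// Mimics the implicit conversion in (s < u): the signed value is converted
// to unsigned before the comparison, so -1 becomes a very large number.
bool naiveLess(int s, unsigned int u) {
    return static_cast<unsigned int>(s) < u;
}

// Checks the sign first; the conversion is only applied to non-negative values.
bool safeLess(int s, unsigned int u) {
    return s < 0 || static_cast<unsigned int>(s) < u;
}
```

For s = -1 and u = 1 the naive form reports that -1 is not less than 1, while the guarded form gives the mathematically expected answer.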
Loop Indices
The following code fragment,
std::vector<T> v;
// code to fill v
for ( int i=0; i<v.size(); ++i){
  // do something with each element
}
will generate a compiler diagnostic of the form,
warning: comparison between signed and unsigned integer expressions
The issue is that v.size() returns an unsigned type while i is a signed type. The recommended solution is to change the type of i:
for ( std::size_t i=0; i<v.size(); ++i){ }
To be pedantic, the correct data type for i is std::vector<T>::size_type. But in all implementations we know of, this is just a typedef to std::size_t and the visual impact of the full type is sufficiently distracting that we recommend just std::size_t.
String Lengths in Indices
The following code fragment is looking inside a string for the presence of a substring delimited by a pair of open and close braces.
int iopen  = Value.find("{");
int iclose = Value.find_last_of("}");
if ( ( iopen  == string::npos ) ||
     ( iclose == string::npos ) ||
     ( iclose < (iopen+1) ) ){
}
It will generate a compiler diagnostic of the form,
warning: comparison between signed and unsigned integer expressions
There are several issues here. The two find methods return an unsigned type but the compiler will correctly convert it, without diagnostic, to a signed type as requested in the first two lines. The diagnostic is generated by the comparison to string::npos, which is an unsigned type. The recommended form is,
std::string::size_type iopen  = Value.find("{");
std::string::size_type iclose = Value.find_last_of("}");
if ( ( iopen  == string::npos ) ||
     ( iclose == string::npos ) ||
     ( iclose < (iopen+1) ) ){   // Correct. See below!!
}
It would also be acceptable to write the type of iopen and iclose as std::size_t.
This example illustrates another point. One might have written the last line as
( iclose-1 < iopen ) // Unsafe
but that would have been a mistake. Consider the case that iclose=0 and iopen is either zero or almost any positive value. This code will fail because, under the rules of arithmetic with unsigned variables, the expression (iclose-1) evaluates to a large positive value! One can always avoid subtraction of unsigned types by changing to addition on the other side of the comparison. If previous logic has ensured that the subtraction is safe, and if the code reads much more naturally with subtraction, there is a case for writing the code using the subtraction. I strongly prefer we not do this - what if someone unwittingly removes the safety checks? If you decide there is a good reason to write your code this way, add a comment to explain why it is safe and add a comment to the safety checks to say that downstream code depends on them. Having said this, I expect that I have old code around that violates this rule; I will fix these as I encounter them.
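The difference between the two forms of the last comparison can be checked with a short sketch. The function names are hypothetical; both functions are meant to return true only when the string contains an opening brace followed later by a closing brace.

```cpp
#include <string>

// Safe form: uses addition, so no unsigned subtraction can wrap around.
bool hasBracedField(const std::string& value) {
    std::string::size_type iopen  = value.find("{");
    std::string::size_type iclose = value.find_last_of("}");
    if ( ( iopen  == std::string::npos ) ||
         ( iclose == std::string::npos ) ||
         ( iclose < (iopen+1) ) ){
        return false;
    }
    return true;
}

// Unsafe form: when iclose is 0, (iclose-1) wraps to a huge positive value
// and the malformed string is not rejected.
bool hasBracedFieldUnsafe(const std::string& value) {
    std::string::size_type iopen  = value.find("{");
    std::string::size_type iclose = value.find_last_of("}");
    if ( ( iopen  == std::string::npos ) ||
         ( iclose == std::string::npos ) ||
         ( iclose-1 < iopen ) ){
        return false;
    }
    return true;
}
```

For the input "}ab{" the closing brace is at index 0 and the opening brace at index 3: the safe form rejects the string, while the unsafe form accepts it.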
Unused Variables
In order to get clean builds we must not have any unused variables in our code. This section discusses a few things we can do to avoid this. One large class of unused variables is those used for debugging; it is acceptable practice to simply comment these out. There are lots of other alternatives involving compile time flags. Whatever you choose, do it consistently and make sure that the production code compiles without diagnostics.
Another large class of warnings comes from code like:
G4Material* vacuum = new G4Material( mat.name, 1., 1.01 *g/mole, density,
                                     kStateGas, temperature, pressure);
where the variable vacuum is never used. In this example, when the G4Material object is created, the object registers itself with G4's material store. The material store then takes ownership of the object and manages its lifetime.
The recommended solution to this situation is to use a bare new and to comment the unusual choice:
// G4 takes ownership of this object.
new G4Material( mat.name, 1., 1.01 *g/mole, density,
                kStateGas, temperature, pressure);
At a future date we may develop a different solution for documenting this behaviour; one option is to have a registry to hold pointers to objects that are really owned by G4. The registry would never do anything with the pointers it holds; it would not even have accessor methods. The act of registering would get rid of the compiler diagnostic and document the transfer of ownership to G4.
The Many Meanings of const
Under construction
Bernstein recommends:
http://www.ibiblio.org/pub/docs/books/eckel/
Volume 1 Chapter 8
The short answer is that const is a contract between two pieces of code that one piece
of code will not modify an object that is owned by another piece of code.
If code breaks the contract then the result will usually be a
compile-time error but the error may sometimes be delayed until link-time or load-time.
The two pieces of code may even be in separate files. The long answer is below.
The Basics
The basic example of const is to show how it applies to objects:
double x = 5.;
const double y = 6.;
// ...
x = 7.;       // OK; you may modify x.
y = 6.;       // Compiler error; you are not permitted to modify y.
x = y + 1.;   // OK; you are only using y, not modifying it.
There is an alternate syntax for the second line; the position of the const has changed. Both syntaxes produce exactly the same code.
double const y = 6.;
Const also applies to references to objects:
double x = 5.;
double& y = x;
const double& z = x;
y = 7;    // OK. This changes the value of both x and y.
z = 9.;   // Compiler error; you are not permitted to modify z.
There is an alternate syntax for the third line; the position of the const has changed. Both syntaxes produce exactly the same code.
double const& z = x;
With references, you may not make a non-const reference to a const object:
const double x = 5.;
const double& y = x;   // OK.
double& z = x;         // Compiler error.
With pointers, there are four permutations of const:
double x=42.;
double * y1 = &x;               // y1 is a non-const pointer to a non-const pointee.
double const * y2 = &x;         // y2 is a non-const pointer to a const pointee.
double * const y3 = &x;         // y3 is a const pointer to a non-const pointee.
double const * const y4 = &x;   // y4 is a const pointer to a const pointee.

double z=23.;
y1 = &z;     // Ok. y1 now points to z.
*y1 = 13.;   // Ok. pointee now has the value 13.
y2 = &z;     // Ok. y2 now points to z.
*y2 = 13.;   // Compiler error. pointee is const.
y3 = &z;     // Compiler error. y3 is a const pointer.
*y3 = 13.;   // OK. pointee now has the value 13.
y4 = &z;     // Compiler error. y4 is a const pointer.
*y4 = 13.;   // Compiler error. pointee is const.

// As with references, pointers must obey the constness of their pointee.
const double w = 99.;
double * t1 = &w;               // Compiler error. Pointee is const and pointer must obey that.
double const * t2 = &w;         // OK
double * const t3 = &w;         // Compiler error. Pointee is const and pointer must obey that.
double const * const t4 = &w;   // OK.
The way to parse pointer constness is to read right to left:
"y1 is a non-const pointer to a non-const double"
"y4 is a const pointer to a const double"
Finally, there is an alternate syntax for a const pointee:
double x=42.;
const double * y2 = &x;         // y2 is a non-const pointer to a const pointee.
const double * const y4 = &x;   // y4 is a const pointer to a const pointee.
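The four permutations, and the equivalence of the two spellings of a const pointee, can be checked at compile time with the standard type traits. This is a sketch that assumes a C++17 compiler; the helper names pointerIsConst and pointeeIsConst are hypothetical.

```cpp
#include <type_traits>

// True if the pointer itself is const (it cannot be re-pointed).
template <class P>
constexpr bool pointerIsConst = std::is_const_v<P>;

// True if the pointee is const (it cannot be modified through the pointer).
template <class P>
constexpr bool pointeeIsConst = std::is_const_v<std::remove_pointer_t<P>>;

// The types of y1 through y4 from the example above.
static_assert(!pointerIsConst<double*> && !pointeeIsConst<double*>, "y1");
static_assert(!pointerIsConst<double const*> && pointeeIsConst<double const*>, "y2");
static_assert(pointerIsConst<double* const> && !pointeeIsConst<double* const>, "y3");
static_assert(pointerIsConst<double const* const> && pointeeIsConst<double const* const>, "y4");

// The two spellings of a const pointee name the same type.
static_assert(std::is_same_v<const double*, double const*>, "same type");
```

If any of the statements in the comments above were wrong, the corresponding static_assert would stop the compilation.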
Why both "const double x" and "double const x"?
The two syntaxes are historical and come from backwards compatibility with C. To my mind the natural way to write values and references is with the const up front, while the natural way to write const-pointees is with the const in the second place:
const double x=42.;
const double& y=x;
double const * z = &x;
My thinking about pointers is that the "parse from the right rule" is easiest to use if the constness of the pointee is in the second position. In a quest for uniformity, I started to write values and references with the const in the second position. However I found that this confused too many people; because Mu2e standards and practices strongly discourages the use of bare pointers, this convention is a case of the tail wagging the dog. A secondary consideration is that the version of xemacs available on some of the Fermilab managed machines does not perform correct syntax highlighting if const is in the second place!
The Mu2e coding standard does not specify which position of const to use but does request that you be consistent within one file!
In some of the early Mu2e code, I always wrote const in the second place. I am now writing const in first place for values and references.
As an Argument to a Function
This is an example of legal code:
// func.hh
using CLHEP::Hep3Vector;
void func ( Hep3Vector& v );

// main.cc
#include <iostream>
#include "func.hh"
int main(){
  Hep3Vector a(0.,0.,1.);
  func(a);
  std::cout << "a = " << a << std::endl;
}

// func.cc
#include "func.hh"
void func( Hep3Vector& v ){
  v = Hep3Vector(1.,0.,0.);
}
In this example the function func receives its argument by reference, modifies the argument and returns. In the main program the variable a is created with some value, which is modified in func. The program will print out:
a = (1.,0.,0.)
Enforcing the contract: Case 1
Now consider modifying main.cc to make the variable "a" const:
const Hep3Vector a(0.,0.,1.);
This will generate a compiler error that looks something like:
main.cc: In function `int main()':
main.cc:8: error: invalid initialization of reference of type 'Hep3Vector&' from expression of type 'const Hep3Vector'
func.hh:1: error: in passing argument 1 of `void func(Hep3Vector&)'
This error occurs when compiling main.cc. At this time the compiler only knows about main.cc and func.hh; it does not know whether or not func.cc actually modifies its argument; it only knows, from the header, that func is permitted to modify its argument. The compiler also knows that a, once created, may never be modified. So it will refuse to call func.
Breaking the contract: Case 2
// func.hh
using CLHEP::Hep3Vector;
void func ( const Hep3Vector& v );

// main.cc
#include "func.hh"
int main(){
  Hep3Vector a1(0.,0.,1.);
  func(a1);
  const Hep3Vector a2(0.,0.,1.);
  func(a2);
}

// func.cc
#include "func.hh"
void func( const Hep3Vector& v ){
  v = Hep3Vector(1.,0.,0.);   // Compile time error; you are not permitted to modify v.
}
In this case, main.cc will compile without incident. The header func.hh tells the compiler that func does not modify its arguments. So the compiler may pass either a1 or a2 as arguments to func, even though one is const and one is not. On the other hand, when it tries to compile func.cc, the compiler will issue an error like the following:
func.cc: In function `void func(const Hep3Vector&)':
func.cc:4: error: assignment of read-only reference `v'
This says that the function func broke its contract by trying to modify a const argument.
As in the previous section, the const may come before or after the type:
// func.hh
using CLHEP::Hep3Vector;
void func ( Hep3Vector const& v );
Similar comments apply if the argument is passed into func as a pointer to const Hep3Vector:
// func.cc
#include "func.hh"
void func( Hep3Vector const *v ){
  *v = Hep3Vector(1.,0.,0.);   // Compile time error; you are not permitted to modify *v.
}
Mismatched declaration and definition: Case 3
// func.hh
using CLHEP::Hep3Vector;
void func ( const Hep3Vector& v );

// main.cc
#include "func.hh"
int main(){
  Hep3Vector a(0.,0.,1.);
  func(a);
}

// func.cc
#include "func.hh"
void func( Hep3Vector& v ){
  v = Hep3Vector(1.,0.,0.);
}
This will generate a link-time (or possibly load-time) error that looks something like,
In function `main':
: undefined reference to `func(Hep3Vector const&)'
This message is generated because the linker understands the following two functions to be distinct.
void func ( const Hep3Vector& t );
void func ( Hep3Vector& t );
In the example, the main program knows about the first function, because its header is included. It does not, however, know about the second function because that function is never declared within main.cc ( either directly or by being included ). The main program understands that the first function will satisfy its needs and the compiled file main.o will contain a request that the linker find the first function and link it in. When the linker looks within func.o it can only find the second function; therefore it reports an error.
Why did func.cc compile successfully? To answer this, consider the file,
// func.cc
#include "CLHEP/Vector/ThreeVector.h"
using CLHEP::Hep3Vector;
void func( Hep3Vector& v ){
  v = Hep3Vector(1.,0.,0.);
}
This will compile successfully. There is no rule that says that a function definition must be preceded by a matching function declaration. If this file happens to include a header that declares functions that are not used by func, that is also OK; they are simply ignored.
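The point that the two signatures name distinct functions can be checked directly. The following is a minimal sketch, assuming a stand-in struct Vec3 in place of CLHEP::Hep3Vector; Vec3 and which are hypothetical names used only for illustration. Both overloads coexist in one file, and the compiler picks between them based on the constness of the argument, which is why the linker in Case 3 cannot substitute one for the other.

```cpp
#include <string>

// Hypothetical stand-in for CLHEP::Hep3Vector.
struct Vec3 {
  double x, y, z;
};

// The const qualifier is part of the parameter type, so these are two
// different functions with two different linker symbols.
std::string which(const Vec3&) { return "const Vec3&"; }
std::string which(Vec3&) { return "Vec3&"; }
```

Calling which(a) with a non-const Vec3 selects the second overload, while calling it with a const Vec3 selects the first.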
Return types and const Member Functions: Case 4
// Hit.hh
class Hit{
public:
  Hit():_channelNumber(-1),_pos(){}
  Hit( int chan, const Hep3Vector& pos):
    _channelNumber(chan),
    _pos(pos){}

  // Accept compiler generated, d'tor, copy c'tor and assignment operator.

  // Accessors.
  int channel() const { return _channelNumber;}
  const Hep3Vector& position() const { return _pos; }

  void setPosition( const Hep3Vector& pos){ _pos=pos; }

private:
  int _channelNumber;
  Hep3Vector _pos;
};

// main.cc
#include "Hit.hh"
int main() {
  Hit hit1( 1, Hep3Vector(0.,0.,1.) );
  const Hep3Vector& v1 = hit1.position();    // OK
  Hep3Vector& v2 = hit1.position();          // Compiler error
  Hep3Vector v3 = hit1.position();           // OK since it makes a copy.
  hit1.setPosition ( Hep3Vector(1.,0.,0.) ); // OK

  const Hit hit2( 2, Hep3Vector(0.,0.,1.) );
  const Hep3Vector& v4 = hit2.position();    // OK
  hit2.setPosition ( Hep3Vector(1.,0.,0.) ); // Compiler error
}
This example contains a toy class that might represent a hit in some detector; it holds a channel number and a position in 3 space. This example illustrates two more uses of const:
- The accessor function position() returns its information by a const reference to a private data member.
- The two accessor functions have a const after the () and before the opening {.
Note that the channel number is returned by value, not by const reference. The short explanation is that an int is a "small" object while a Hep3Vector is a "large" object for which we have to consider the overhead of copying it. This is discussed in more detail in the next section.
The first use of const is necessary because we want users of a const Hit object to be able to see, but not modify, its internal state.
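The two uses of const in Case 4 can be verified at compile time. This is a minimal sketch, assuming C++17 and a stand-in struct Vec3 in place of CLHEP::Hep3Vector; Vec3 is a hypothetical name, and the Hit class is a trimmed copy of the toy class above. The static_asserts check that position() returns a reference to const, and that the non-const member function setPosition() can be invoked on a Hit but not on a const Hit.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for CLHEP::Hep3Vector.
struct Vec3 {
  double x, y, z;
};

class Hit {
public:
  Hit(int chan, const Vec3& pos) : _channelNumber(chan), _pos(pos) {}

  // const member functions: callable on both const and non-const objects.
  int channel() const { return _channelNumber; }
  const Vec3& position() const { return _pos; }

  // Non-const member function: callable only on non-const objects.
  void setPosition(const Vec3& pos) { _pos = pos; }

private:
  int _channelNumber;
  Vec3 _pos;
};

// position() hands out a reference to const, so callers cannot modify _pos.
static_assert(
    std::is_same<decltype(std::declval<Hit&>().position()), const Vec3&>::value,
    "position() returns a reference to const");

// setPosition() may be called on a Hit but not on a const Hit.
static_assert(
    std::is_invocable<decltype(&Hit::setPosition), Hit&, const Vec3&>::value,
    "setPosition() is callable on a non-const Hit");
static_assert(
    !std::is_invocable<decltype(&Hit::setPosition), const Hit&, const Vec3&>::value,
    "setPosition() is not callable on a const Hit");
```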
More about return types: Case 5
// Hit.hh
class Hit{
public:
  Hit():_channelNumber(-1),_pos(), _dir(){}
  Hit( int chan, Hep3Vector pos, Hep3Vector dir):
    _channelNumber(chan),
    _pos(pos),
    _dir(dir){}

  // Accept compiler generated, d'tor, copy c'tor and assignment operator.

  const Hep3Vector& position() const { return _pos; }
  const Hep3Vector& direction() const { return _dir; }
  int channel() const { return _channelNumber;}

private:
  int _channelNumber;
  Hep3Vector _pos, _dir;
};

// main.cc
#include "Hit.hh"
int main() {
  Hit hit( 1, Hep3Vector(0.,0.,0.), Hep3Vector(0.,0.,1.) );
  const Hep3Vector& v1 = hit.position(); // OK
  Hep3Vector& v2 = hit.position();       // Compiler error
  Hep3Vector v3 = hit.position();        // OK. Makes a copy.
}
In this example the class Hit allows users to look at its member data but not to modify them. This is a very common pattern found in Mu2e classes that are part of the event model, the geometry data or the conditions data. In the case of event data, once data has been added to the event it may never be modified; this is required so that the audit trail of who created which data product is not violated. In the case of geometry data, the geometry service is responsible for keeping the geometry up to date; users of the geometry information may only view the geometry data, not modify it. A similar situation is true for conditions data. The Mu2e classes have been designed so that the compiler will spot attempts to evade these rules and will give compile time errors for illegal operations.
There is a second use of const in this example: the const that follows the parameter list of the accessor functions, such as position() and direction(). This will be discussed in the next section.
There are a variety of choices for the return type of the accessor functions:
- Return by value ( make a copy ).
- Return by const reference.
- Return by pointer to const.
- Return by some sort of smart pointer to const.
This list excludes things like non-const reference and pointer to non-const that would allow the user to modify the data inside of the class Hit.
The first option was chosen for returning the channel number, while the second option was chosen for returning the position and direction. The reasoning is that, if an object is "small", then return it by value and, if an object is "large", then return it by one of the other three types. This is an efficiency argument: it can be expensive in both memory and CPU to make a copy of a large object; therefore, grant access via some sort of pointer type ( a reference qualifies as a pointer type in this sense ) and use const to ensure that the member data cannot be modified. In usual practice, small objects include the built in data types plus objects that use no more memory than does a pointer (4 bytes on a 32 bit machine and 8 bytes on a 64 bit machine).
This brings us to the decision of which pointer type to use. If this Hit class were a top-level object, something that one gets directly either from a Service or from the Event Data Model (EDM), then it would be appropriate to use some sort of smart pointer type that protects the user from the object not being there at all. Instead I imagine that a class representing a single hit will be inside a top-level object.
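The difference between the first two return options can be made concrete. The following is a minimal sketch, assuming a stand-in struct Vec3 in place of CLHEP::Hep3Vector; the names Vec3, refersToMember, copyIsSeparate, intIsSmall and vecIsLarge are hypothetical. It shows that a reference obtained from position() refers to the object stored inside the Hit, while initializing a plain Vec3 from it creates a separate copy, and it records the size comparison behind the "small" versus "large" rule of thumb.

```cpp
// Hypothetical stand-in for CLHEP::Hep3Vector.
struct Vec3 {
  double x, y, z;
};

class Hit {
public:
  Hit(int chan, const Vec3& pos, const Vec3& dir)
      : _channelNumber(chan), _pos(pos), _dir(dir) {}

  int channel() const { return _channelNumber; }   // "small": return by value
  const Vec3& position() const { return _pos; }    // "large": return by const reference
  const Vec3& direction() const { return _dir; }

private:
  int _channelNumber;
  Vec3 _pos, _dir;
};

// True if a reference bound to the return value refers to the member itself,
// meaning that no copy was made.
bool refersToMember(const Hit& hit) {
  const Vec3& ref = hit.position();
  return &ref == &hit.position();
}

// True if initializing a Vec3 from the return value creates a distinct object.
bool copyIsSeparate(const Hit& hit) {
  Vec3 copy = hit.position();
  return &copy != &hit.position();
}

// The size comparison behind the rule of thumb; both values are expected to be
// true on common 32 bit and 64 bit platforms, but this is not guaranteed.
constexpr bool intIsSmall = sizeof(int) <= sizeof(void*);
constexpr bool vecIsLarge = sizeof(Vec3) > sizeof(void*);
```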
Pedantic checks
In July 2023, we turned on the flag that turns pedantic compiler warnings into errors. The GNU compiler allows various extensions to the C++ standard, and requesting pedantic checks enforces the standard itself. This can help find typos and makes the code more uniform, less likely to cause problems for alternative compilers, and a little easier to read. The flag checks for many things, but the most common issues we found in our code are the following:
- two semicolons at the end of a line
- a semicolon after the closing brace of a namespace block
- a semicolon after the closing brace of a function definition
- a semicolon after a macro (like the art module macros)
- a variable length c-style array created on the stack; use std::vector instead (or std::array if the length is known at compile time).
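The last item in the list can be illustrated with a short sketch of the suggested replacements. The namespace sketch and the functions sumOfFirst and sumOfThree are hypothetical names chosen for illustration; they are not taken from the Mu2e code. The comments also mark where the stray semicolons from the earlier items would have appeared.

```cpp
#include <array>
#include <cstddef>
#include <numeric>
#include <vector>

namespace sketch {

// Length known only at run time: std::vector replaces "double buf[n];",
// which is a compiler extension in C++ and is rejected by pedantic checks.
double sumOfFirst(std::size_t n) {
  std::vector<double> buf(n);
  std::iota(buf.begin(), buf.end(), 1.0);  // fill with 1, 2, ..., n
  return std::accumulate(buf.begin(), buf.end(), 0.0);
}  // no semicolon after the closing brace of a function definition

// Length known at compile time: std::array is sufficient.
double sumOfThree() {
  std::array<double, 3> buf{{1.0, 2.0, 3.0}};
  return std::accumulate(buf.begin(), buf.end(), 0.0);
}

}  // no semicolon after the closing brace of a namespace block
```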
Best Practices
Some useful references:
- Indico page for the Programming Video Journal Club series
- Marc Paterno's talk from the August 2012 Workshop
- video on globals and mixed static/dynamic linking
Still to come:
- return argument constness
- Pointless-ness of const values as return types.
- const member functions
- const and stl containers. const vector<T> can return only a const T& and const T*.
- const is not deep
- const is viral
- mutable - use sparingly.
- Advanced topic: reference to const temp as an argument or a return type. But not ref to non-const temp.
- Advanced topic: overloaded pairs of functions. Signature does not include the return type but does include method constness.
// func.hh
using CLHEP::Hep3Vector;
void func ( Hep3Vector v );
void func ( Hep3Vector& v );
void func ( const Hep3Vector& v );

// main.cc
#include "func.hh"
int main(){
  Hep3Vector v(0.,0.,1.);
  func(v);
  func(v);
  func(v);
}
gcc Optimization Levels
This link describes the meaning of different gcc optimization levels: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html