C++ Tips and Tricks
• You run it, it works fine, then you make a release build and after a few days in production
you have a bug. doSomething() is called apparently randomly when it shouldn’t have been.
• x is uninitialized, which means that on release builds it will contain whatever value that
memory location happened to hold (debug builds often fill fresh memory with zero or a fixed
pattern, which can mask the bug). So it’s just a matter of time until that memory holds 3,
which will trigger the bug.
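The snippet these bullets refer to is not reproduced here. A minimal sketch of the kind of code being described might look like the following; `maybeDoSomething`, `flag` and the call counter `g_calls` are hypothetical names, and `doSomething()` is only a stand-in for the function mentioned above. The buggy variant is kept in a comment because reading an uninitialized variable is undefined behavior.

```cpp
// Hypothetical stand-in for the doSomething() mentioned above;
// it only counts how often it was called.
int g_calls = 0;
void doSomething() { ++g_calls; }

// Buggy variant (do not use):
//     int x;                          // uninitialized: holds whatever was in memory
//     if (flag) { x = 3; }
//     if (x == 3) { doSomething(); }  // may fire "randomly" when flag is false
//
// Fixed variant: x always starts from a known value.
void maybeDoSomething(bool flag) {
    int x = 0;  // always initialized
    if (flag) {
        x = 3;
    }
    if (x == 3) {
        doSomething();
    }
}
```

With the initialization in place, `doSomething()` can only run when `flag` was true, in debug and release builds alike.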
Always initialize variables
• This problem is especially bad when leaving pointers uninitialized. For
example:
// wrong
SomeClass *pC;
if (someCondition) {
    pC = new SomeClass();
}
…
if (pC != nullptr) {
    pC->someMethod();
}
// correct
SomeClass *pC = nullptr;
if (someCondition) {
    pC = new SomeClass();
}
…
if (pC != nullptr) {
    pC->someMethod();
}
• The version marked “wrong” will most likely crash the first time you run a
release build, since the uninitialized pointer contains a junk value which is
not zero (nullptr), so the condition will pass and the method will be invoked
on an object which doesn’t exist.
Always initialize variables
• This also applies to fields within a class. Either make the constructor
require and initialize the fields, or assign them default values:
• There’s no need to do any initialization now on an instance of A since it takes care of its own
initialization in the constructor:
A a; // totally fine, A does its own initialization of every field
• Class types have constructors that run automatically upon instantiation and should initialize
all class fields, thus there should be no need to ever initialize a class type’s fields manually.
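The definition of A is not shown in this text. As a sketch of both options mentioned above (a constructor that requires values, or default values on the fields), a class could look like this; the field names `count_`, `name_` and `ptr_` are made up for illustration.

```cpp
#include <string>

// Hypothetical class: the field names are illustrative only.
class A {
public:
    A() = default;                            // relies on the default values below
    explicit A(int count) : count_(count) {}  // or require a value from the caller

    int count() const { return count_; }
    const std::string& name() const { return name_; }
    const int* ptr() const { return ptr_; }

private:
    int count_ = 0;                 // default member initializer
    std::string name_ = "unnamed";
    int* ptr_ = nullptr;            // pointers start as nullptr, never as junk
};
```

Either way, `A a;` leaves every field in a known state without the caller doing anything.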
Argument passing hints
• Raw data types, such as integers, pointers, booleans, etc., should
almost always be passed by value – the exception being when they
are “out” parameters, in which case it’s a good idea to make this
explicit. For example:
void foo(int x, char c, void* ptr, float &out_f)
{
    // out_f is an out parameter passed by reference
    // and is explicitly named so
}
• Large data types (such as containers or other classes) should, for
performance reasons, almost always be passed by reference – see
next page.
Argument passing hints
• Always prefer const reference above mutable reference above pointer
to const above pointer:
• const& > & > const * > *
• This means you should rarely use pass-by-pointer, and only in those
situations where you require the parameter to be nullable.
• Passing by reference guarantees the reference cannot be null, while
passing by pointer requires you to check for a null pointer.
• Passing by const reference (const&) guarantees the caller that your
function will not modify their data, and allows the caller to pass const
values as inputs to your function. More on constness later.
Argument passing hints
const reference: void foo(Bar const & b);
• a Bar is passed by reference
• foo() guarantees not to modify b – in fact in C++ this guarantee is hard-enforced by the
compiler, so attempting to modify b will result in a compile error
• The caller has no second thoughts about the safety of the data passed as argument
(mutable) reference: void foo(Bar & b);
• a Bar is passed by reference
• b may be modified by the function foo() – there’s no guarantee that it won’t, unless we
know exactly what foo() does
• The caller may be reluctant to pass their data directly for fear of it being modified and
may be tempted to make a copy first, negating all performance benefits of passing by reference
pointer to const: void foo(const Bar *b);
• a Bar is passed by pointer (b is a pointer to Bar)
• the function foo() guarantees not to modify the Bar
• the function foo() is required to check b for null or else it may crash – the function
foo() cannot trust the caller not to pass a null pointer
pointer: void foo(Bar *b);
• a Bar is passed by pointer
• there are no guarantees on the part of either foo() or the caller – the pointer may be
null and the function may freely modify the object pointed to.
• Avoid this in all cases except when b is an out value, in which case you should name it
“out_b”, but also check it for null
Valid use case for passing-by-const-pointer
• The only case when you should use pass-by-const-pointer is when a
complex argument is optional (you could also give it a default value to
make this clear):
void foo(int arg1, int arg2, const std::vector<std::string> *optionalArg = nullptr) {
// optionalArg is optional and has a nullptr default value
// MUST check for null:
if (optionalArg) {
size_t size = optionalArg->size();
}
}
// calling foo:
foo(1, 2); // optional argument omitted
std::vector<std::string> v {…};
foo(3, 4, &v); // optional argument passed as a pointer to const
Valid use case for pass-by-pointer (mutable)
• The only case when you should use pass-by-pointer (non const) is when you are
passing an optional “out” parameter into which you’ll receive a value. Note the
word optional – unless the parameters are optional, use pass-by-reference. These
parameters should be clearly named with the “out_” prefix:
unsigned countOddNumbers(std::vector<int> const& numbers,
                         std::vector<int> *out_odds = nullptr) {
// out_odds is optional; if provided we populate it with the odd numbers
unsigned count = 0;
for (unsigned i=0; i < numbers.size(); i++) {
if (numbers[i] % 2) {
// number is odd
count++;
if (out_odds) { // MUST check for null
out_odds->push_back(numbers[i]);
}
}
}
return count;
}
Argument passing hints
• So to recap:
• pass simple input parameters by value
• pass complex input parameters by const reference
• pass optional arguments by pointer
• optional input arguments by const pointer
• optional out arguments by non-const pointer
• Always check pointer arguments for null! (references make your life easier
since they can’t be null)
struct vs class
• The only difference (from the perspective of the compiler) between
struct and class is the default visibility of fields: for struct it’s public,
for class it’s private.
• That said, there’s a difference between them to us humans – they are
different words, so we can use one for some things and the other for
other things to make it easier to read the code:
• Use class for types that have behavior
• Use struct for plain data objects
• In this way when you look at the code you get a hint of what each
type should be used for.
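As an illustration of this convention, the sketch below uses a struct for a plain data object and a class for a type with behavior. `Point` and `Circle` are hypothetical names chosen for the example.

```cpp
// Plain data object: public fields, nothing to protect.
struct Point {
    double x;
    double y;
};

// Type with behavior: keeps its state private and exposes operations.
class Circle {
public:
    Circle(Point center, double radius) : center_(center), radius_(radius) {}

    // True if p lies inside or on the circle (compares squared distances).
    bool contains(Point const& p) const {
        double dx = p.x - center_.x;
        double dy = p.y - center_.y;
        return dx * dx + dy * dy <= radius_ * radius_;
    }

private:
    Point center_;
    double radius_;
};
```

A reader who sees `struct Point` expects to poke at its fields directly, while `class Circle` signals that it should be used through its methods.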
Assert instead of assuming
• At some point you’ll be writing a complex algorithm which, when it reaches a certain point, you assume some
conditions hold true. Make this assumption explicit by asserting it. In this case, if the conditions happen to be
false, the program will crash with an error rather than moving on in mysterious and buggy ways that you did
not anticipate:
#include <cassert>
void foo() {
if (someCondition) {
for (int i=0; i< …; i++) {
if (anotherCondition) { …
} else if (yetAnotherCondition) {
// at this point, i should be odd and s should be null
assert(i % 2 && "i should be odd"); // the string is there just to make it easier to read the error
assert(s == nullptr && "s should be null");
… // do your stuff safely now
}
}
}
}
• If the assertion fails (the condition is not met), then the program will crash with an error similar to this:
• assertion failed at file/path.cpp:24: i % 2 && "i should be odd"
Assertions
• assert() is a macro that is compiled out when NDEBUG is defined, which
release build configurations typically do, so release builds are free from
these extra checks in order to make them fast.
• Thus, you need not worry about the performance of the expression
that is passed into assert(). You can even call a function that does
expensive computations within an assert just to make sure
everything’s alright with zero costs in production.
• Be generous with asserts, there’s no reason not to be, and you’ll
reduce the risk of obscure bugs considerably.
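One caveat follows from the fact that the whole expression disappears when NDEBUG is defined: the expression must not do work the program depends on. The sketch below shows an expensive check that is safe to put in an assert, plus the pitfall in a comment. `findSorted` and `initSubsystem` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns the index of value in a sorted vector, or -1.
int findSorted(std::vector<int> const& sorted, int value) {
    // O(n) sanity check; it costs nothing in builds that define NDEBUG.
    assert(std::is_sorted(sorted.begin(), sorted.end()) && "input must be sorted");

    auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
    if (it == sorted.end() || *it != value) {
        return -1;
    }
    return static_cast<int>(it - sorted.begin());
}

// Pitfall: never put required work inside assert().
//     assert(initSubsystem());   // the call vanishes entirely under NDEBUG
// Instead, do the work first and assert on the result:
//     bool ok = initSubsystem();
//     assert(ok);
```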
Constness
• This is an especially important concept in the world of multithreading,
but is also very useful in controlling write access to data and in
implementing pure methods.
• Ideally everything should be const (in some languages it actually is) but
of course that’s a lot of “const” to write, so we indulge a few omissions.
• Things that can be const:
• variables, class fields
• parameters to functions
• methods / operators
• So apart from functions, pretty much everything can be const
Const methods
• This concept is somewhat specific to C++, but it’s the most useful of all
• A const method is a method that guarantees not to modify the state of the object
it is called on (the “this”).
• Const methods are the only kind of methods that can be called on a const object.
• In a multithreaded environment, you are free to mix const and non-const
methods, BUT in order to decrease the chances of shooting yourself in the foot,
ALWAYS prefer const methods when possible. The non-const methods – if any –
should be used sparingly and guarded by access synchronization mechanisms such
as mutexes.
• Calling only const methods on an object is thread safe and requires no
synchronization (on the part of the object, but of course if the methods modify
external data, that’s a different problem)
Const methods
class A {
int x = 3;
public:
int getX() const { return x; }
void setX(int _x) { x = _x; }
};
A a;
foo(a);
By declaring getX() method const, we allow others who possess a const& to our object to call this method,
in a read-only manner.
However, they can’t change “a” in any way since the compiler forbids that.
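The definition of foo() is not shown in this text. A minimal sketch of what such a function might look like, assuming it takes the object by const reference, is given below; the body of `foo` is an assumption made for illustration.

```cpp
class A {
    int x = 3;
public:
    int getX() const { return x; }
    void setX(int _x) { x = _x; }
};

// Hypothetical foo(): it only receives a const reference,
// so it can read the object but not write to it.
int foo(A const& a) {
    // a.setX(5);     // compile error: setX() is not a const method
    return a.getX();  // fine: getX() is const
}
```

The owner of the object can still call `setX()` on it; only code holding the const reference is restricted.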
Const methods in multithreading
• Marking methods as const makes it way easier to implement safe
multithreading, since you know exactly when program state can be
altered and can control that with synchronization mechanisms.
• The idea is to minimize the number of non-const methods and only
use those where strictly necessary.
• If a method guarantees not to alter state, it’s (almost) safe to call from
any thread at any time – almost because the state it reads can still be
written to by something else.
• If all methods that are being called in a multithreaded environment are
const (and none of them writes to mutable fields or external data), then
thread safety is achieved without need for synchronization.
Mixing non-const and const
methods
• Bear in mind that doing something like this is still not thread safe:
class A {
int x = 1;
public:
int getX() const { return x; }
void setX(int _x) { x = _x; }
};
A sharedA;
// Thread #1
sharedA.getX(); // this may return garbage if called while the other thread is writing
// Thread #2
sharedA.setX(13);
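One possible way to make this mix of const and non-const methods safe is to guard both with a mutex. This is a sketch, not the only option (for a single int, `std::atomic<int>` would also work); the class name `SafeA` is hypothetical.

```cpp
#include <mutex>

// Hypothetical thread-safe variant of A: every access to x_ takes the lock.
class SafeA {
public:
    int getX() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return x_;
    }
    void setX(int x) {
        std::lock_guard<std::mutex> lock(mutex_);
        x_ = x;
    }

private:
    mutable std::mutex mutex_;  // mutable so that the const getter can lock it
    int x_ = 1;
};
```

The reader still has to lock, even though it is const, because a writer may be active at the same time.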
Compound data types – don’t abuse them
• It’s ok to use compound types such as std::pair and std::tuple as long as
the overall type doesn’t get too complicated AND you’re not passing them
around. So restrict them to local, relatively short scopes where it’s clear
what they are and what they do.
• You may come across something like this:
std::pair<std::string, std::vector<std::pair<const SomeClass *, std::string>>>
• Or even worse:
std::pair<std::string, std::vector<std::vector<std::tuple<const SomeClass *,
std::map<std::string, SomeOtherClass::SomeSubType>, int, void*>>>> x;
auto length = std::get<1>(x.second[i][j]).begin()->first.size(); // hard to understand
• That’s a single data type, but it’s quite hard to get your head around
• And worse, imagine having to pass this as parameters to several functions
• So, as a rule: only use compound types in locally isolated scopes, when the type
is simple enough and its purpose is clear. If you have to nest more than one
pair<> or tuple<> it’s time to rethink the strategy.
Compound types – what to do instead
• Instead of this
std::pair<std::string, std::vector<std::vector<std::tuple<const SomeClass *,
std::map<std::string, SomeOtherClass::SomeSubType>, int, void*>>>> x;
auto length = std::get<1>(x.second[i][j]).begin()->first.size(); // hard to understand
• Do this:
using SomeSubtypeByNameMap = std::map<std::string, SomeOtherClass::SomeSubType>; // this is a type alias
struct Y {
const SomeClass* pSomeClass;
SomeSubtypeByNameMap subTypeByNameMap;
int aNumber;
void* aPointer;
};
using YList = std::vector<Y>;
struct X {
std::string someString;
std::vector<YList> yLists;
};
X x; // now this is much easier to understand and modify later
auto length = x.yLists[i][j].subTypeByNameMap.begin()->first.size();
Lambda Functions
• In C++, lambda functions have an additional property compared to other
languages – the capture list.
• By default, a lambda function doesn’t capture any context, it can only
operate on the arguments it receives.
• The programmer can give the lambda access to context selectively, in
a finely grained manner, for each individual variable controlling both
visibility and the way it’s captured – by value, by reference or const
reference.
Lambda Functions
int x = 0; std::string s;
auto lambdaSum = [](int a, int b) { return a+b; }; // lambda function with no capture
• To avoid dangling references, only capture by reference when you are sure that the
lambda does not outlive the scope it’s defined in.
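The capture examples themselves are not reproduced in this text. The following sketch shows capture by value, capture by reference, and (in a comment) the dangling-reference hazard. The function names `makeAdder`, `sumUpTo` and `bad` are hypothetical.

```cpp
#include <functional>

// Capture by value: the lambda keeps its own copy of base,
// so it is safe to return it from the function.
std::function<int(int)> makeAdder(int base) {
    return [base](int a) { return a + base; };
}

// Capture by reference: fine here because the lambda never outlives total.
int sumUpTo(int n) {
    int total = 0;
    auto add = [&total](int v) { total += v; };
    for (int i = 1; i <= n; ++i) {
        add(i);
    }
    return total;
}

// Dangerous: capturing a local by reference and returning the lambda
// leaves it with a dangling reference once the function returns.
//     std::function<int()> bad() {
//         int local = 42;
//         return [&local]() { return local; };  // undefined behavior when called
//     }
```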
Keep your interface lean & reduce inter-dependencies
• There are two benefits to doing this:
1. The interface will be more readable – the consumer can easily
understand how to use it
2. Keeping inter-dependencies between headers and the amount of
exposed data types to a minimum greatly reduces compile times.
For example, if a header “a.h” is included in three other headers
“b.h”, “c.h”, and “d.h” and each of those is included in two “cpp”
files, the “a.h” header is indirectly included in 6 “cpp” files; any
modification to the “a.h” file will trigger the recompilation of all 6
“cpp” files, which will take more time.
Keep your interface lean - #1
Classes
• It’s very important to write a clean and minimal public interface to any
class, library, API or whatever. In this way the consumer will not be
confused as to how to use your code. To help with this, keep an eye on
these guidelines:
• Start your class’ definition with the public interface – burying it deep below
dozens of lines of private stuff is not nice
• Make the public interface contain only the functions / fields relevant for the
consumer. If unsure about a function, make it private, you can promote it later if
need be.
• Hide internal data types not relevant for the consumer
• Avoid exposing class fields publicly unless you’re only offering const references
to those objects – this can easily break thread safety – instead use accessors.
Start your class’ definition with the public interface
Good
class A {
public:
    A();
    int publicGetter();
private:
    int x;
    std::string s;
    std::vector<bool> v;
    void privateMethod();
    int someOtherPrivateMethod();
};
Bad
class A {
    int x;
    std::string s;
    std::vector<bool> v;
    void privateMethod();
    int someOtherPrivateMethod();
public:
    A();
    int publicGetter();
};
Hide internal data types not relevant for the consumer
• If some of your class’s methods use a data type that is never used by the public methods, avoid
defining or including that data type into your class’s header, instead either:
• define it directly into the cpp file if it’s a simple data type, or:
• define it in a private header included only in the cpp, and forward-declare it into your public header
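The second option can be sketched as follows. The three files are shown in one listing and separated by comments so that the example compiles as a single unit; `Widget`, `WidgetCache` and the file names are hypothetical.

```cpp
#include <memory>

// ---- widget.h (public header) ----
class WidgetCache;  // forward declaration: consumers never see the definition

class Widget {
public:
    Widget();
    ~Widget();  // defined in the cpp, where WidgetCache is a complete type
    void lookup();
    int lookups() const;

private:
    std::unique_ptr<WidgetCache> cache_;  // a pointer to an incomplete type is fine
};

// ---- widget_cache_private.h (included only by widget.cpp) ----
class WidgetCache {
public:
    int hits = 0;
};

// ---- widget.cpp ----
Widget::Widget() : cache_(new WidgetCache()) {}
Widget::~Widget() = default;
void Widget::lookup() { ++cache_->hits; }
int Widget::lookups() const { return cache_->hits; }
```

Changing `WidgetCache` now only forces widget.cpp to recompile, not every file that includes widget.h.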
header.h
#include <vector>
namespace MATH {
std::vector<int> range(int start, int end);
} // namespace MATH
code.cpp
#include "header.h"
using namespace MATH; // import everything from namespace MATH into the global namespace
// in this way, we can directly refer to the range() function without prefixing it with MATH::
struct vector {
    float x;
    float y;
};
header.h
#include <vector>
…
} // namespace MATH
code.cpp
#include "header.h"
// in the best case the following will simply produce a confusion – which vector is this?
// our local struct or from std:: ?
// in the worst case this will produce a compilation error since the compiler will think
// it's the local struct and it takes no template arguments
vector<int> r = range(1, 10);
“using namespace” directives
• Another problem may arise when a symbol with the same name is
defined in two separate namespaces which are both imported into
the global namespace with “using namespace” directives.
• Conclusion: prefer being explicit about namespaces and avoid using
“using namespace” directives, even in cpp files, but especially in
header files which can indirectly affect unsuspecting cpp files that
include them. It’s a bit extra typing for writing “std::” everywhere but
it’s safe and clear. Prefer safe and clear.
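As a sketch of the explicit style, the code below keeps a local `vector` struct next to `std::vector` without any ambiguity, because every name carries its namespace. The body of `MATH::range()` is an assumption (a half-open range from start up to but not including end), and `countRange` is a hypothetical helper.

```cpp
#include <cstddef>
#include <vector>

namespace MATH {
// Assumed semantics: returns start, start+1, ..., end-1.
std::vector<int> range(int start, int end) {
    std::vector<int> out;
    for (int i = start; i < end; ++i) {
        out.push_back(i);
    }
    return out;
}
} // namespace MATH

// A local type that happens to share its name with std::vector.
struct vector {
    float x;
    float y;
};

// With explicit prefixes it is always clear which "vector" is meant.
std::size_t countRange() {
    std::vector<int> r = MATH::range(1, 10);  // the standard container
    vector v = {1.0f, 2.0f};                  // the local struct
    (void)v;                                  // unused, shown only for the name
    return r.size();
}
```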
Isolating local data and functions
• Local data and functions represent data and functions that are defined in a
cpp file. Those are “local” in the sense that they’re not directly visible
outside the cpp file.
• However, those objects are still linked together with the rest of the
program, so, by exploiting knowledge of the internals of that cpp file, they
can easily be exposed into another cpp file – see example on the next page.
• In order to avoid this possibility and make sure these objects cannot be
accessed from anywhere else, we need to isolate them.
• To isolate an object (data or function) within a cpp file, we declare it as
“static”. The keyword “static”, when applied to a global object, makes it
inaccessible outside that cpp file.
Isolating local data and functions
BAD
file1.cpp
// everything declared here is not directly visible to another cpp file,
// but this can be changed with a bit of inside knowledge
int localVal = 3;
void localFunction(int param) {
    …
}
file2.cpp
// we know there's an int called "localVal" in another cpp file,
// so we forward declare it here, exposing it
extern int localVal;
// same for the function:
extern void localFunction(int param);
// we can now use them:
localFunction(localVal); // compiles OK
GOOD
file1.cpp
// We can enforce isolation by declaring these "static"
static int localVal = 3;
static void localFunction(int param) {
    …
}
file2.cpp
// This won't work any more: using these now fails at link time
extern int localVal;
extern void localFunction(int param);
Always use this guy unless you have good reasons to require the other one
Custom types as map keys #1
• For std::map, you can use most standard types as key, since the “<“ operator is
defined for them already – integer types, floating-point types, std::string
• If you want to use a custom type as a key, you need to define the less-than
operator for it, so that the map knows how to order the elements. There are two
ways to do this:
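The two ways are not listed in this text. Two common options are sketched below as an assumption about what was meant: a free operator< for the key type, and a separate comparator type passed to std::map. `MyKey` and its fields s and x are hypothetical; `MyKeyByXFirst` and the map aliases are names invented for the example.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical key type with a string field s and an int field x.
struct MyKey {
    std::string s;
    int x;
};

// Option 1: a free operator< that orders keys lexicographically by (s, x).
bool operator<(MyKey const& a, MyKey const& b) {
    return std::tie(a.s, a.x) < std::tie(b.s, b.x);
}

// Option 2: a comparator type passed as the map's third template argument.
// This one orders by (x, s) to show that the ordering is independent of operator<.
struct MyKeyByXFirst {
    bool operator()(MyKey const& a, MyKey const& b) const {
        return std::tie(a.x, a.s) < std::tie(b.x, b.s);
    }
};

using DefaultOrderMap = std::map<MyKey, int>;                 // uses operator<
using CustomOrderMap = std::map<MyKey, int, MyKeyByXFirst>;   // uses the comparator
```

Comparing through `std::tie` keeps the ordering consistent across all fields, which std::map requires.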
namespace std {
template<>
struct hash<MyKey> {
std::size_t operator()(MyKey const& key) const {
return (
hash<std::string>()(key.s) // sub-hash for “.s”
^ // this operator combines sub-hashes together
(hash<int>()(key.x) << 1) // sub-hash for “.x”, left-shifted
) >> 1;
}
};
}