Debugging Techniques in Modern C++
Programming
14. C++ Ecosystem I
Debugging
Federico Busato
2023-11-14
Table of Contents
1 Debugging
2 Assertions
3 Execution Debugging
Breakpoints
Watchpoints / Catchpoints
Control Flow
Stack and Info
Print
Disassemble
4 Memory Debugging
valgrind
Stack Protection
5 Sanitizers
Address Sanitizer
Leak Sanitizer
Memory Sanitizers
Undefined Behavior Sanitizer
6 Debugging Summary
7 Compiler Warnings
8 Static Analysis
9 Code Testing
Unit Testing
Test-Driven Development (TDD)
Code Coverage
Fuzz Testing
10 Code Quality
clang-tidy
Feature Complete
Debugging
Is this a bug?
Cost of Software Defects
Types of Software Defects
• C++ is a very error-prone language, see 60 terrible tips for a C++
developer
• Human behavior, e.g. copying and pasting code is a very common practice and can
introduce subtle bugs → check the code carefully and understand its behavior
in depth
Dealing with Software Defects
Static Analysis A proactive strategy that examines the source code for (potential)
errors.
Techniques: Warnings, static analysis tools, compile-time checks
Limitations: Turing’s undecidability theorem, exponential code paths
Program Errors
Assertion
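# include <cassert>      // assert
# include <cmath>        // std::isfinite
# include <type_traits>  // std::is_arithmetic_v

template<typename T>
T sqrt(T value) {
    static_assert(std::is_arithmetic_v<T>,       // precondition (compile time)
                  "T must be an arithmetic type");
    assert(std::isfinite(value) && value >= 0);  // precondition (run time)
    T ret = ... // sqrt computation
    // postcondition: for value in (0, 1) the root is larger than value
    assert(std::isfinite(ret) && ret >= 0 &&
           (ret == 0 || ret == 1 || value < 1 || ret < value));
    return ret;
}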
Assertions may slow down the execution. They can be disabled by defining the NDEBUG
macro before including <cassert>
# define NDEBUG // or with the flag "-DNDEBUG"
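The same pattern can be sketched on a function that is complete enough to compile. The helper `isqrt` below is a hypothetical example, not taken from the slides: `static_assert` rejects unsupported types at compile time, while the two `assert` calls check the pre- and postcondition at run time and disappear when `NDEBUG` is defined.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper for illustration: floor of the integer square root.
template<typename T>
T isqrt(T value) {
    static_assert(std::is_integral_v<T>, "T must be an integral type");
    assert(value >= 0);                     // precondition
    T root = 0;
    // (root + 1)^2 <= value, written as a division to avoid overflow
    while (root + 1 <= value / (root + 1))
        ++root;
    assert(root * root <= value);           // postcondition
    return root;
}
```

Calling `isqrt(-1)` in a debug build stops at the precondition; the same call in a build with `-DNDEBUG` skips both checks.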
Execution Debugging
Execution Debugging (gdb)
-O0 Disables code optimizations to help the debugger. It is the default for most
compilers
-g Enables debugging information
- stores the symbol table information in the executable (mapping between assembly
and source code lines)
- for some compilers, it may disable certain optimizations
- slows down the compilation phase and may slow down the execution
-g3 Produces enhanced debugging information, e.g. macro definitions. Available for
most compilers. Suggested instead of -g
gdb - Breakpoints
gdb - Watchpoints / Catchpoints
gdb - Control Flow
gdb - Disassemble
Command            Description
x/nfu <address>    examine address
                   n number of elements,
                   f format (d: int, f: float, etc.),
                   u data size (b: byte, w: word, etc.)
gdb - Notes
Terms like buffer overflow, race condition, page fault, null pointer, stack exhaustion,
heap exhaustion/corruption, use-after-free, or double free – all describe memory
safety vulnerabilities
Solutions:
• Run-time check
• Static analysis
• Avoid unsafe language constructs
valgrind 1/9
$ wget [Link]
$ tar xf [Link].bz2
$ cd valgrind-3.21
$ ./configure --enable-lto
$ make -j 12
$ sudo make install
$ sudo apt install libc6-dbg # if needed
Some Linux distributions provide the package through apt install valgrind, but it may be an old version
valgrind 2/9
Basic usage:
• compile with -g
Output example 1:
==60127== Invalid read of size 4 !!out-of-bound access
==60127== at 0x100000D9E: f(int) ([Link])
==60127== by 0x100000C22: main ([Link])
==60127== Address 0x10042c148 is 0 bytes after a block of size 40 alloc'd
==60127== at 0x1000161EF: malloc (vg_replace_malloc.c:236)
==60127== by 0x100000C88: f(int) ([Link])
==60127== by 0x100000C22: main ([Link])
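A report like the one above could come from a program shaped as follows. This is a hypothetical reconstruction (the original source file is not shown), assuming a 4-byte `int`: `f` allocates a 40-byte block with `malloc` and reads one element from it, so `f(10)` reads 4 bytes located 0 bytes after the block.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical reconstruction: reads element `index` of a 10-int heap block.
// Valid indices are 0..9; index 10 lies just past the end of the block.
int f(int index) {
    int* data = static_cast<int*>(std::malloc(10 * sizeof(int)));
    if (data == nullptr)
        return -1;                 // allocation failed
    for (int i = 0; i < 10; ++i)
        data[i] = i;
    int value = data[index];       // f(10): out-of-bound read
    std::free(data);
    return value;
}
```

Calling `f(10)` from `main` is undefined behavior and often goes unnoticed in a normal run, which is why running the binary under valgrind is useful here.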
valgrind 3/9
Output example 2:
!!memory leak
==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==19182== at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182== by 0x8048385: f ([Link])
==19182== by 0x80483AB: main ([Link])
• Definitely lost
• Indirectly lost
• Still reachable
• Possibly lost
When a program terminates, it releases all heap memory allocations. Despite this,
leaving memory leaks is considered a bad practice and makes the program unsafe with
respect to multiple internal iterations of a functionality. If a program has memory leaks
for a single iteration, is it safe for multiple iterations?
A robust program prevents any memory leak even when abnormal conditions occur
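One common way to get this robustness in C++ is to let an owning object release the memory. The sketch below is an illustration under assumptions, not the author's code: `Tracked`, `process`, and `leaked_after` are hypothetical names, and the instance counter exists only to make leaks observable.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical type that counts live instances so that leaks are observable.
struct Tracked {
    static inline int live = 0;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};

// The allocation is owned by std::unique_ptr: its destructor runs during
// stack unwinding, so the memory is released even when an exception is thrown.
void process(bool fail) {
    auto resource = std::make_unique<Tracked>();
    if (fail)
        throw std::runtime_error("abnormal condition");
}

// Runs `process` repeatedly and returns the number of objects still alive.
int leaked_after(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        try {
            process(i % 2 == 0);   // every other iteration fails
        } catch (const std::runtime_error&) {
            // error handled; nothing to clean up manually
        }
    }
    return Tracked::live;
}
```

With a raw `new` and a `delete` placed after the `throw`, every failing iteration would leak one object, so the leak would grow with the number of iterations.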
valgrind 5/9
Definitely lost indicates blocks that are not deleted at the end of the program (return
from the main() function). The common case is local variables pointing to newly
allocated heap memory
void f() {
    int* y = new int[3]; // 12 bytes definitely lost
}

int main() {
    int* x = new int[10]; // 40 bytes definitely lost
    f();
}
valgrind 6/9
Indirectly lost indicates blocks pointed to by other heap blocks that are not deleted.
The common case is global variables pointing to newly allocated heap memory
struct A {
    int* array;
};

int main() {
    A* x = new A;          // 8 bytes definitely lost
    x->array = new int[4]; // 16 bytes indirectly lost
}
valgrind 7/9
Still reachable indicates blocks that are not deleted but they are still reachable at the
end of the program
int* array;

int main() {
    array = new int[3];
}
// 12 bytes still reachable (global static class could delete it)

# include <cstdlib>

int main() {
    int* array = new int[3];
    std::abort(); // early abnormal termination
    // 12 bytes still reachable
    ... // maybe it is deleted here
}
valgrind 8/9
Possibly lost indicates blocks that are still reachable but pointer arithmetic makes the
deletion more complex, or even not possible
# include <cstdlib>

int main() {
    int* array = new int[3];
    array++;      // pointer arithmetic
    std::abort(); // early abnormal termination
    // 12 bytes possibly lost
    ... // maybe it is deleted here, but the pointer arithmetic
        // must be reverted first
}
valgrind 9/9
Advanced flags:
• --leak-check=full print details for each “definitely lost” or “possibly lost”
block, including where it was allocated
• --show-leak-kinds=all to combine with --leak-check=full. Print all leak kinds
• --track-fds=yes list open file descriptors on exit (not closed)
• -fstack-usage Makes the compiler output stack usage information for the
program, on a per-function basis
By adding the _FORTIFY_SOURCE define, the compiler provides buffer overflow checks for the
following functions:
memcpy , mempcpy , memmove , memset , strcpy , stpcpy , strncpy , strcat ,
strncat , sprintf , vsprintf , snprintf , vsnprintf , gets .
Recent compilers (e.g. GCC 12) can detect buffer overflows with enhanced
coverage, e.g. dynamic pointers, with _FORTIFY_SOURCE=3
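As a rough illustration of what these checks protect, consider the hypothetical function `copy_name` below (not from the slides). It is correct only for inputs shorter than the buffer. Assuming GCC or Clang with glibc and optimizations enabled (for example `-O2 -D_FORTIFY_SOURCE=2`), the `strcpy` call is expected to be replaced by a checked variant that aborts the program on overflow instead of silently corrupting the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical example: copies `name` into a fixed 8-byte buffer and returns
// the copied length. Safe only when strlen(name) < 8.
std::size_t copy_name(const char* name) {
    char buffer[8];
    std::strcpy(buffer, name);   // destination size is known: can be checked
    return std::strlen(buffer);
}
```

`copy_name("abc")` behaves the same with or without the define; the difference shows up only with inputs of 8 characters or more.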
Sanitizers
Address Sanitizer
Sanitizers are used during development and testing to discover and diagnose memory
misuse bugs and potentially dangerous undefined behavior
Sanitizers are implemented in Clang (from 3.1), gcc (from 4.8) and Xcode
Projects using Sanitizers:
• Chromium
• Firefox
• Linux kernel
• Android
Memory error checking in C and C++: Comparing Sanitizers and Valgrind
Address Sanitizer
• [Link]/docs/[Link]
• [Link]/google/sanitizers/wiki/AddressSanitizer
• [Link]/onlinedocs/gcc/[Link]
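A minimal sketch of the kind of bug Address Sanitizer targets, assuming a build with `-fsanitize=address -g`. The function `read_value` and its `trigger_bug` switch are hypothetical and only serve to keep the faulty path optional.

```cpp
#include <cassert>

// Hypothetical example: when `trigger_bug` is true, the function reads heap
// memory after it has been released (undefined behavior).
int read_value(bool trigger_bug) {
    int* value = new int(42);
    int result = *value;
    delete value;
    if (trigger_bug)
        result = *value;   // heap-use-after-free
    return result;
}
```

Without instrumentation, `read_value(true)` may appear to work; in an instrumented build the invalid read is expected to be reported as a heap-use-after-free together with the allocation and deallocation stack traces.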
Leak Sanitizer
• [Link]/docs/[Link]
• [Link]/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
• [Link]/onlinedocs/gcc/[Link]
Memory Sanitizers
-fsanitize-memory-track-origins=2
track origins of uninitialized values
• [Link]/docs/[Link]
• [Link]/google/sanitizers/wiki/MemorySanitizer
• [Link]/onlinedocs/gcc/[Link]
Undefined Behavior Sanitizer
-fsanitize=integer Checks for undefined or suspicious integer behavior (e.g. unsigned integer
overflow)
-fsanitize=nullability Checks passing null as a function parameter, assigning null to an lvalue, and
returning null from a function
• [Link]/docs/[Link]
• [Link]/onlinedocs/gcc/[Link]
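The "suspicious" part of -fsanitize=integer can be illustrated with unsigned wrap-around, which is well-defined in C++ but frequently unintended. The function `next_id` is a hypothetical example; to my knowledge the `integer` check group is specific to Clang.

```cpp
#include <cassert>
#include <limits>

// Unsigned arithmetic wraps modulo 2^N: legal C++, but often a logic error.
unsigned next_id(unsigned current) {
    return current + 1;   // wraps to 0 when `current` is the maximum value
}
```

A plain build returns 0 for the maximum input without any diagnostic, whereas a build with the integer checks enabled is expected to report the overflow at run time.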
Sanitizers vs. Valgrind
Valgrind - A neglected tool from the shadows or a serious debugging tool?
Debugging Summary
How to Debug Common Errors
Segmentation fault
• gdb, valgrind, sanitizers
• Segmentation fault right after entering a function → stack overflow
Infinite execution
• gdb + (CTRL + C)
Incorrect results
• valgrind + assertion + gdb + sanitizers
Compiler Warnings
-Wextra Enables some extra warning flags that are not enabled by -Wall (∼15 warnings)
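One example of a diagnostic that GCC and Clang enable with -Wextra but not with -Wall alone is -Wunused-parameter. The function `scale` below is hypothetical and contains a deliberate mistake.

```cpp
#include <cassert>

// Hypothetical function: `factor` is accidentally ignored and a constant is
// used instead. -Wall accepts this silently; -Wextra reports the parameter
// as unused, which points directly at the mistake.
int scale(int value, int factor) {
    return value * 2;
}
```

The code compiles and runs, so only the warning (or a test with a factor other than 2) reveals the defect.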
Static Analyzers - clang static analyzer
It finds bugs by reasoning about the semantics of code (may produce false positives)
Example:
void test() {
    int i, a[10];
    int x = a[i]; // warning: array subscript is undefined
}
How to use:
scan-build make
cppcheck provides code analysis to detect bugs, undefined behavior and dangerous
coding constructs. The goal is to detect only real errors in the code (i.e. have very few
false positives)
cppcheck --enable=warning,performance,style,portability,information,error <src_file/directory>
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
cppcheck --enable=<enable_flags> --project=compile_commands.json
Static Analyzers - PVS-Studio, FBInfer
Customers: IBM, Intel, Adobe, Microsoft, Nvidia, Bosch, id Software, Epic Games, etc.
Static Analyzers - DeepCode, SonarSource
The SonarLint plugin is available for Visual Studio, Visual Studio Code, Eclipse, and IntelliJ
IDEA
see also A curated list of static analysis tools
Code Testing
see Case Study 4: The $440 Million Software Error at Knight Capital
from: Kat Maddox (on Twitter)
Code Testing
Unit Test A unit is the smallest piece of code that can be logically isolated in a
system. A unit test verifies a single unit. It assumes full knowledge of the
code under test (white-box testing)
Goals: meet specifications/requirements, fast development/debugging
Functional Test Validates the output instead of the internal structure (black-box testing)
Goals: performance, regression (same functionality as the previous
version), stability, security (e.g. sanitizers), composability (e.g.
integration test)
Unit Testing 1/3
Unit testing involves breaking your program into pieces, and subjecting each piece to
a series of tests
Unit testing should observe the following key features:
• Isolation: Each unit test should be independent and avoid external interference
from other parts of the code
• Automation: No user interaction, easy to run and manage
• Small Scope: Unit tests focus on small portions of code or specific
functionalities, making it easier to identify bugs
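These three properties can be sketched without any framework, using only `assert`. The unit under test (`clamp_to_range`) and the test names are hypothetical; a real project would typically rely on a testing framework for registration and reporting.

```cpp
#include <cassert>

// Unit under test (hypothetical): clamps `value` to the range [low, high].
int clamp_to_range(int value, int low, int high) {
    if (value < low)  return low;
    if (value > high) return high;
    return value;
}

// Isolation and small scope: each test checks one behavior and shares no state.
void test_value_inside_range() { assert(clamp_to_range(5, 0, 10) == 5); }
void test_value_below_range()  { assert(clamp_to_range(-3, 0, 10) == 0); }
void test_value_above_range()  { assert(clamp_to_range(42, 0, 10) == 10); }

// Automation: one entry point runs every test with no user interaction and
// returns the number of tests executed.
int run_all_tests() {
    test_value_inside_range();
    test_value_below_range();
    test_value_above_range();
    return 3;
}
```

Because each test covers a single behavior, a failing assertion identifies the broken case by its function name.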
Unit Testing 3/3
JetBrains C++ Developer Ecosystem 2022
Test-Driven Development (TDD)
Test-Driven Development (TDD) - Main advantages
• Understandable behavior. New users can learn how the system works and its
properties from the tests
• Increase confidence. Developers are more confident that their code will work as
intended because it has been extensively tested
• [Link]/catchorg/Catch2
• The Little Things: Testing with Catch2
catch 2/2
Code coverage is a measure used to describe the degree to which the source code of
a program is executed when a particular execution/test suite runs
gcov and llvm-profdata/llvm-cov are tools used in conjunction with compiler
instrumentation (gcc, clang) to interpret and visualize the raw code coverage
generated during the execution
gcovr and lcov are utilities for managing gcov/llvm-cov at higher level and
generating code coverage results
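What a coverage report reveals can be sketched with a function that has two branches and a test that reaches only one of them. The names `classify` and `test_classify` are hypothetical; with GCC, the usual workflow is to compile with `--coverage`, run the tests, and then run `gcov` on the source file.

```cpp
#include <cassert>
#include <string>

// Hypothetical function with two branches.
std::string classify(int value) {
    if (value < 0)
        return "negative";       // never executed by test_classify()
    return "non-negative";
}

// A test that exercises only the second branch: all assertions pass, yet a
// coverage report would mark the "negative" line as not executed.
bool test_classify() {
    return classify(7) == "non-negative" && classify(0) == "non-negative";
}
```

A passing test suite therefore says nothing about the untested branch, which is the gap that line and branch coverage make visible.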
[Link]:
# include <iostream>
# include <string>
Coverage-Guided Fuzz Testing
A fuzzer is a specialized tool that tracks which areas of the code are reached, and
generates mutations on the corpus of input data in order to maximize the code
coverage
LibFuzzer is the library provided by LLVM and feeds fuzzed inputs to the library via a
specific fuzzing entrypoint
The fuzz target function accepts an array of bytes and does something interesting with these
bytes using the API under test:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data,
                                      size_t Size) {
    DoSomethingInterestingWithMyAPI(Data, Size);
    return 0;
}
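A self-contained variant of the entry point above, with a hypothetical API under test (`has_magic_header`) in place of the placeholder call. It assumes a Clang build with `-g -fsanitize=fuzzer,address`; libFuzzer supplies its own `main`, so the file defines none.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical API under test: true if the buffer starts with "FUZZ".
bool has_magic_header(const std::uint8_t* data, std::size_t size) {
    return size >= 4 && data[0] == 'F' && data[1] == 'U' &&
           data[2] == 'Z' && data[3] == 'Z';
}

// libFuzzer entry point: called repeatedly with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data,
                                      std::size_t size) {
    // The result is ignored: the fuzzer looks for crashes and sanitizer
    // reports, not for particular return values.
    has_magic_header(data, size);
    return 0;
}
```

The size check before indexing is exactly the kind of guard a fuzzer tends to find missing, because very short inputs are among the first it generates.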
Code Quality
Linters - clang-tidy 1/2
lint: The term was derived from the name of the undesirable bits of fiber
clang-tidy provides an extensible framework for diagnosing and fixing typical
programming errors, like style violations, interface misuse, or bugs that can be deduced
via static analysis
$ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
$ clang-tidy -p .
clang-tidy searches for the configuration file .clang-tidy located in the closest
parent directory of the input file
• Fuchsia • Performance
• Google • Readability