Callgrind: a call-graph generating cache and branch prediction profiler
code. This is because there are no explicit call or return instructions in these instruction sets, so Callgrind has to rely
on heuristics to detect calls and returns.
6.1.2. Basic Usage
As with Cachegrind, you probably want to compile with debugging info (the -g option) and with optimization turned on.
To start a profile run for a program, execute:
valgrind --tool=callgrind [callgrind options] your-program [program options]
While the simulation is running, you can observe execution with:
callgrind_control -b
This prints the current backtrace. To annotate the backtrace with event counts, run:
callgrind_control -e -b
After program termination, a profile data file named callgrind.out.<pid> is generated, where pid is the process
ID of the program being profiled. The data file contains information about the calls made in the program among the
functions executed, together with Instruction Read (Ir) event counts.
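When the profile run is driven from a script, the command line and the expected data file name can be assembled programmatically. The following Python sketch is illustrative only: the helper names callgrind_argv and profile_file_name, the program name ./myprog, and its argument are hypothetical and are not part of Callgrind.

```python
def callgrind_argv(program, program_args=(), callgrind_options=()):
    """Build: valgrind --tool=callgrind [callgrind options] your-program [program options]."""
    return ["valgrind", "--tool=callgrind", *callgrind_options, program, *program_args]


def profile_file_name(pid):
    """Name of the profile data file for the profiled process with this ID."""
    return f"callgrind.out.{pid}"


# Hypothetical program and argument, for illustration only.
print(callgrind_argv("./myprog", ["data.txt"]))
print(profile_file_name(12345))
```

The returned list can be handed to a process launcher such as Python's subprocess.run. The process ID passed to profile_file_name must be that of the profiled program.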
To generate a function-by-function summary from the profile data file, use:
callgrind_annotate [options] callgrind.out.<pid>
This summary is similar to the output you get from a Cachegrind run with cg_annotate: the functions are listed in order
of their exclusive cost, and the exclusive cost is also what is shown for each function. Two options are important for
the additional features of Callgrind:
• --inclusive=yes: Instead of using the exclusive cost of functions as the sorting order, use and show inclusive cost.
• --tree=both: Interleave information on the callers and the callees of each function into the top-level list of
functions. In these lines, which represent executed calls, the cost gives the number of events spent in the call.
Indented above each function is the list of its callers, and below it the list of its callees. The sum of events in
calls to a given function (caller lines) gives the total inclusive cost of the function, as does the sum of events in
calls from the function (callee lines) together with its self cost.
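The relationship between self cost, caller lines, and callee lines can be checked with a small worked example. The Python sketch below uses made-up event counts and hypothetical function names (main, foo, bar). It assumes a call graph without recursion and does not parse real callgrind_annotate output.

```python
# Self (exclusive) cost of each function, in events. Numbers are invented.
self_cost = {"main": 10, "foo": 50, "bar": 60}

# Executed calls as (caller, callee, events spent in the call).
calls = [
    ("main", "foo", 90),
    ("main", "bar", 20),
    ("foo", "bar", 40),
]


def inclusive_from_callees(name):
    """Self cost plus the cost of all calls made from the function (callee lines)."""
    return self_cost[name] + sum(cost for caller, _, cost in calls if caller == name)


def inclusive_from_callers(name):
    """Cost of all calls made to the function (caller lines)."""
    return sum(cost for _, callee, cost in calls if callee == name)


for name in ("foo", "bar"):
    # Both views agree for any function that is only reached through recorded calls.
    print(name, inclusive_from_callees(name), inclusive_from_callers(name))

# main has no caller lines in this example, so only the callee view applies.
print("main", inclusive_from_callees("main"))
```

Here foo has an inclusive cost of 90 in both views (50 self plus 40 spent in its call to bar, and 90 in the call from main), bar has 60 (60 self, and 20 plus 40 from its two callers), and main has 120.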