Callgrind: a call-graph generating cache and branch prediction profiler
quite capable of avoiding cycles, it has to be used carefully to not cause symbol explosion. The latter imposes large
memory requirements on Callgrind, with possible out-of-memory conditions, and produces big profile data files.
A further possibility to avoid cycles in Callgrind's profile data output is to simply leave out given functions in the
call graph. Of course, this also skips any call information from and to an ignored function, and thus can break
a cycle. Candidates for this typically are dispatcher functions in event-driven code. The option to ignore calls
to a function is --fn-skip=function. Aside from possibly breaking cycles, this is used in Callgrind to skip
trampoline functions in the PLT sections for calls to functions in shared libraries. You can see the difference if you
profile with --skip-plt=no. If a call is ignored, its cost events will be propagated to the enclosing function.
If you have a recursive function, you can distinguish the first 10 recursion levels by specifying
--separate-recs10=function. To do the same for all functions, use --separate-recs=10, but this will give
you much bigger profile data files. In the profile data, you will see the recursion levels of "func" as different
functions with names "func", "func'2", "func'3" and so on.
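As a minimal sketch of a program to try this on, the recursive function below could be profiled with recursion levels separated. The name fact and the build details are assumptions made for illustration; the command line in the comment uses only the options described above.

```c
#include <assert.h>

/*
 * Hypothetical recursive function for experimenting with recursion levels.
 * Assumed invocation (build without optimization so the recursion is not
 * turned into a loop by the compiler):
 *
 *   valgrind --tool=callgrind --separate-recs10=fact ./a.out
 *
 * 20! is the largest factorial that fits in an unsigned 64-bit value.
 */
unsigned long long fact(unsigned n)
{
    assert(n <= 20);
    return n <= 1 ? 1ULL : n * fact(n - 1);
}
```

Per the naming scheme described above, the recursion levels of fact should then appear in the profile data as separate functions.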
If you have call chains "A > B > C" and "A > C > B" in your program, you usually get a "false" cycle "B <> C". Use
--separate-callers2=B --separate-callers2=C, and functions "B" and "C" will be treated as different
functions depending on the direct caller. Using the apostrophe for appending this "context" to the function name, you
get "A > B'A > C'B" and "A > C'A > B'C", and there will be no cycle. Use --separate-callers=2 to get a
2-caller dependency for all functions. Note that doing this will increase the size of profile data files.
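The situation can be reproduced with a few lines of C. This is only a sketch: the functions A, B and C mirror the names used above, while the flag parameters and the work counter are assumptions added so that the two call chains terminate.

```c
/*
 * Produces the call chains A > B > C and A > C > B.
 * Without caller separation, a profiler that merges contexts sees both
 * B -> C and C -> B, which looks like a cycle although no real recursion occurs.
 * Assumed invocation (build without optimization to avoid inlining):
 *
 *   valgrind --tool=callgrind --separate-callers2=B --separate-callers2=C ./a.out
 */
volatile int work = 0;   /* counts calls, so the functions have an observable effect */

void C(int call_b);

void B(int call_c)
{
    work++;
    if (call_c)
        C(0);            /* chain A > B > C stops here */
}

void C(int call_b)
{
    work++;
    if (call_b)
        B(0);            /* chain A > C > B stops here */
}

void A(void)
{
    B(1);                /* A > B > C */
    C(1);                /* A > C > B */
}
```

Calling A() once from main runs each of B and C twice, once directly from A and once from the other function.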
6.2.5. Forking Programs
If your program forks, the child will inherit all the profiling data that has been gathered for the parent. To start with
empty profile counter values in the child, the client request CALLGRIND_ZERO_STATS; can be inserted into code
to be executed by the child, directly after fork.
However, you will have to make sure that the output file format string (controlled by --callgrind-out-file)
contains %p (which is true by default). Otherwise, the outputs from the parent and child will overwrite each other
or will be intermingled, which almost certainly is not what you want.
You will be able to control the new child independently from the parent via callgrind_control.
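The following sketch shows where the client request would go. The helper run_child_with_fresh_counters and the function sample_child_work are hypothetical names; the preprocessor fallback is an assumption that lets the code build on systems without the Valgrind headers, where the request becomes a no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Use the real client request if the Valgrind headers are installed. */
#if defined(__has_include)
#  if __has_include(<valgrind/callgrind.h>)
#    include <valgrind/callgrind.h>
#  endif
#endif
#ifndef CALLGRIND_ZERO_STATS
#  define CALLGRIND_ZERO_STATS ((void)0)   /* fallback: no-op without the headers */
#endif

/* Hypothetical stand-in for the work done by the child process. */
int sample_child_work(void)
{
    return 7;
}

/*
 * Forks, zeroes the counters inherited by the child directly after fork,
 * runs child_work in the child, and returns the child's exit status to the
 * parent (or -1 on error).
 */
int run_child_with_fresh_counters(int (*child_work)(void))
{
    assert(child_work != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        CALLGRIND_ZERO_STATS;              /* child starts with empty counters */
        _exit(child_work());
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Outside of Valgrind the client request has no effect, so the same binary can be run with and without profiling.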
6.3. Callgrind Command-line Options
In the following, options are grouped into classes.
Some options allow the specification of a function/symbol name, such as --dump-before=function, or
--fn-skip=function. All these options can be specified multiple times for different functions. In addition, the
function specifications are actually patterns: they support the wildcards '*' (zero or more arbitrary characters)
and '?' (exactly one arbitrary character), similar to file name globbing in the shell. This feature is especially
important for C++, as without wildcards the function would have to be specified in full, including its parameter
signature.
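To make the wildcard semantics concrete, here is a small matcher that implements the two rules stated above. It is an illustrative sketch only, not Callgrind's own matching code; the name wildcard_match is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if name matches pattern, where '*' matches zero or more
 * arbitrary characters and '?' matches exactly one arbitrary character.
 * Uses the usual single-backtrack-point technique for glob matching.
 */
bool wildcard_match(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);

    const char *star = NULL;    /* most recent '*' seen in pattern */
    const char *retry = NULL;   /* position in name where that '*' started matching */

    while (*name) {
        if (*pattern == '*') {
            star = pattern++;   /* remember the '*' and first let it match nothing */
            retry = name;
        } else if (*pattern == '?' || *pattern == *name) {
            pattern++;
            name++;
        } else if (star) {
            pattern = star + 1; /* mismatch: let the last '*' absorb one more character */
            name = ++retry;
        } else {
            return false;
        }
    }
    while (*pattern == '*')     /* trailing '*' may match the empty string */
        pattern++;
    return *pattern == '\0';
}
```

With these rules, a pattern such as std::vector<*>::push_back* matches std::vector<int>::push_back(int const&) without the full parameter signature being spelled out.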
6.3.1. Dump creation options
These options influence the name and format of the profile data files.
--callgrind-out-file=<file>
Write the profile data to file rather than to the default output file, callgrind.out.<pid>. The %p and %q
format specifiers can be used to embed the process ID and/or the contents of an environment variable in the name, as
is the case for the core option --log-file. When multiple dumps are made, the file name is modified further; see
below.