Using and understanding the Valgrind core
• Machine instructions and system calls have been implemented on demand, so it is possible, although unlikely,
that a program will fall over with a message to that effect. If this happens, please report all the details printed out,
so that we can try to implement the missing feature.
• Memory consumption of your program increases substantially whilst running under Valgrind’s Memcheck tool. This
is due to the large amount of administrative information maintained behind the scenes. Another cause is that
Valgrind dynamically translates the original executable. Translated, instrumented code is 12-18 times larger than
the original, so you can easily end up with 150+ MB of translations when running (e.g.) a web browser.
• Valgrind can handle dynamically-generated code just fine. If you regenerate code over the top of old code (i.e.
at the same memory addresses) and the code is on the stack, Valgrind will realise the code has changed and work
correctly. This is necessary to handle the trampolines GCC uses to implement nested functions. If you regenerate
code somewhere other than the stack, and you are running on a 32- or 64-bit x86 CPU, you will need to use the
--smc-check=all option, and Valgrind will run more slowly than normal. Alternatively, you can add client requests
that tell Valgrind when your program has overwritten code.
On other platforms (ARM, PowerPC) Valgrind observes and honours the cache invalidation hints that programs are
obliged to emit to notify new code, and so self-modifying-code support should work automatically, without the need
for --smc-check=all.
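As an illustration of the client-request route, here is a minimal sketch. It assumes that the request meant above is VALGRIND_DISCARD_TRANSLATIONS from valgrind.h, and that the header is installed as <valgrind/valgrind.h>. The helper name overwrite_generated_code is hypothetical, and the no-op fallback macro is only an assumption made so that the sketch still builds where the header is missing.

```c
#include <stddef.h>
#include <string.h>

/* Use the real client-request header when the compiler can find it. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif

/* Assumption: without the header, fall back to a no-op so the sketch builds. */
#ifndef VALGRIND_DISCARD_TRANSLATIONS
#  define VALGRIND_DISCARD_TRANSLATIONS(addr, len) ((void)(addr), (void)(len))
#endif

/* Hypothetical helper: overwrite `len` bytes of previously generated code at
   `code` with `new_code`, then ask Valgrind to discard any translations it
   made of the old bytes in that address range. */
void overwrite_generated_code(unsigned char *code,
                              const unsigned char *new_code,
                              size_t len)
{
    memcpy(code, new_code, len);
    VALGRIND_DISCARD_TRANSLATIONS(code, len);
}
```

Outside Valgrind the request does nothing useful, so a helper like this can stay in the program permanently. A code generator would call it each time it rewrites a buffer that has already been executed.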
• Valgrind has the following limitations in its implementation of x86/AMD64 floating point relative to IEEE754.
Precision: There is no support for 80-bit arithmetic. Internally, Valgrind represents all such "long double" numbers
in 64 bits, and so there may be some differences in results. Whether or not this is critical remains to be seen. Note,
the x86/amd64 fldt/fstpt instructions (read/write 80-bit numbers) are correctly simulated, using conversions to/from
64 bits, so that in-memory images of 80-bit numbers look correct if anyone wants to see.
The impression observed from many FP regression tests is that the accuracy differences aren’t significant. Generally
speaking, if a program relies on 80-bit precision, there may be difficulties porting it to non-x86/amd64 platforms
which only support 64-bit FP precision. Even on x86/amd64, the program may get different results depending on
whether it is compiled to use SSE2 instructions (64-bit only) or x87 instructions (80-bit). The net effect is to
make FP programs behave as if they had been run on a machine with 64-bit IEEE floats, for example PowerPC.
On amd64, FP arithmetic is done by default using SSE2, so amd64 looks more like PowerPC than x86 from an FP
perspective, and there are far fewer noticeable accuracy differences than with x86.
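The kind of program that is sensitive to the precision limitation can be sketched as follows. The function name extra_precision_lost_in_double is hypothetical. The sketch assumes a compiler whose long double is wider than double, as is usual on x86/amd64 Linux; where the two types are the same width it simply returns 0.

```c
#include <float.h>

/* Returns 1 if the sum 1 + LDBL_EPSILON is distinguishable from 1 in
   long double arithmetic but collapses to exactly 1.0 once rounded to a
   64-bit double, and 0 otherwise. The volatile qualifiers discourage the
   compiler from evaluating the expressions at compile time. */
int extra_precision_lost_in_double(void)
{
    volatile long double one = 1.0L;
    volatile long double sum = one + LDBL_EPSILON;  /* long double add */
    volatile double narrowed = (double)sum;         /* round to 64 bits */

    return (sum != one) && (narrowed == 1.0);
}
```

Run natively with an 80-bit long double, the function returns 1. Since Valgrind represents long double values in 64 bits internally, the addition may already yield exactly 1 under Valgrind, in which case the function returns 0. A program whose results depend on such a comparison is one that relies on 80-bit precision.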
Rounding: Valgrind does observe the 4 IEEE-mandated rounding modes (to nearest, to +infinity, to -infinity, to
zero) for the following conversions: float to integer, integer to float where there is a possibility of loss of precision,
and float-to-float rounding. For all other FP operations, only the IEEE default mode (round to nearest) is supported.
Numeric exceptions in FP code: IEEE754 defines five types of numeric exception that can happen: invalid operation
(sqrt of negative number, etc), division by zero, overflow, underflow, inexact (loss of precision).
For each exception, two courses of action are defined by IEEE754: either (1) a user-defined exception handler may
be called, or (2) a default action is defined, which "fixes things up" and allows the computation to proceed without
throwing an exception.
Currently Valgrind only supports the default fixup actions. Again, feedback on the importance of exception support
would be appreciated.
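The default fixup actions can be illustrated with a short sketch. It assumes IEEE754 doubles and that no exception handlers (traps) have been enabled. The struct and function names are hypothetical.

```c
#include <float.h>

/* Results of three operations that raise IEEE754 exceptions. */
struct default_fixups {
    double div_by_zero;  /* 1.0 / 0.0 */
    double invalid;      /* 0.0 / 0.0 */
    double overflow;     /* DBL_MAX * 2.0 */
};

/* Perform the operations at run time (volatile prevents constant folding)
   and return whatever values the default fixup actions produce. */
struct default_fixups run_default_fixups(void)
{
    volatile double zero = 0.0, one = 1.0, huge = DBL_MAX;
    struct default_fixups r;

    r.div_by_zero = one / zero;   /* default fixup: +infinity */
    r.invalid     = zero / zero;  /* default fixup: a NaN */
    r.overflow    = huge * 2.0;   /* default fixup: +infinity */
    return r;
}
```

None of these operations stops the program: each yields a special value and the computation carries on. A program written this way stays within what Valgrind supports, whereas one that installs a handler for these exceptions does not.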
When Valgrind detects that the program is trying to exceed any of these limitations (setting exception handlers,
rounding mode, or precision control), it can print a message giving a traceback of where this has happened, and
continue execution. This behaviour used to be the default, but the messages are annoying and so showing them is
now disabled by default. Use --show-emwarns=yes to see them.
The above limitations define precisely the IEEE754 "default" behaviour: default fixup on all exceptions,
round-to-nearest operations, and 64-bit precision.