OLDER NEWS
typically reduced by 15-30%, averaging about 24% for SPEC CPU2000.
The other tools have smaller but noticeable speed improvements. We
are interested to hear what improvements users get.
Memcheck uses less memory due to the introduction of a compressed
representation for shadow memory.
The space overhead has been
reduced by a factor of up to four, depending on program behaviour.
This means you should be able to run programs that use more memory
than before without hitting problems.
- Addrcheck has been removed.
It has not worked since version 2.4.0,
and the speed and memory improvements to Memcheck make it redundant.
If you liked using Addrcheck because it didn’t give undefined value
errors, you can use the new Memcheck option --undef-value-errors=no
to get the same behaviour.
- The number of undefined-value errors incorrectly reported by
Memcheck has been reduced (such false reports were already very
rare).
In particular, efforts have been made to ensure Memcheck
works really well with gcc 4.0/4.1-generated code on X86/Linux and
AMD64/Linux.
- Josef Weidendorfer’s popular Callgrind tool has been added.
Folding
it in was a logical step given its popularity and usefulness, and
makes it easier for us to ensure it works "out of the box" on all
supported targets.
The associated KDE KCachegrind GUI remains a
separate project.
- A new release of the Valkyrie GUI for Memcheck, version 1.2.0,
accompanies this release.
Improvements over previous releases
include improved robustness, many refinements to the user interface,
and use of a standard autoconf/automake build system.
You can get
it from http://www.valgrind.org/downloads/guis.html.
- Valgrind now works on PPC64/Linux.
As with the AMD64/Linux port,
this supports programs using up to 32G of address space.
On 64-bit
capable PPC64/Linux setups, you get a dual architecture build so
that both 32-bit and 64-bit executables can be run.
Linux on POWER5
is supported, and POWER4 is also believed to work.
Both 32-bit and
64-bit DWARF2 are supported.
This port is known to work well with
both gcc-compiled and xlc/xlf-compiled code.
- Floating point accuracy has been improved for PPC32/Linux.
Specifically, the floating point rounding mode is observed on all FP
arithmetic operations, and multiply-accumulate instructions are
preserved by the compilation pipeline.
This means you should get FP
results which are bit-for-bit identical to a native run. These
improvements are also present in the PPC64/Linux port.
- Lackey, the example tool, has been improved:
* It has a new option --detailed-counts (off by default) which
causes it to print out a count of loads, stores and ALU operations
done, and their sizes.