NEWS
* Support for ARM/Linux.
Valgrind now runs on ARMv7-capable CPUs
running Linux.
It is known to work on Ubuntu 10.04, Ubuntu 10.10,
and Maemo 5, so you can run Valgrind on your Nokia N900 if you want.
This requires a CPU capable of running the ARMv7-A instruction set
(Cortex A5, A8 and A9).
Valgrind provides fairly complete coverage
of the user space instruction set, including ARM and Thumb integer
code, VFPv3, NEON and V6 media instructions.
The Memcheck,
Cachegrind and Massif tools work properly; other tools work to
varying degrees.
* Support for recent Linux distros (Ubuntu 10.10 and Fedora 14), along
with support for recent releases of the underlying toolchain
components, notably gcc-4.5 and glibc-2.12.
* Support for Mac OS X 10.6, both 32- and 64-bit executables.
64-bit
support also works much better on OS X 10.5, and is as solid as
32-bit support now.
* Support for the SSE4.2 instruction set.
SSE4.2 is supported in
64-bit mode.
In 32-bit mode, support is only available up to and
including SSSE3.
Some exceptions: SSE4.2 AES instructions are not
supported in 64-bit mode, and 32-bit mode does in fact support the
bare minimum SSE4 instructions needed to run programs on Mac OS X
10.6 on 32-bit targets.
* Support for IBM POWER6 CPUs has been improved.
The Power ISA up to
and including version 2.05 is supported.
* ==================== TOOL CHANGES ====================
* Cachegrind has a new processing script, cg_diff, which finds the
difference between two profiles.
It’s very useful for evaluating
the performance effects of a change in a program.
Related to this change, the meaning of cg_annotate’s (rarely-used)
--threshold option has changed. This is unlikely to affect many
people; if you do use it, please see the user manual for details.
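
As a rough illustration of what "the difference between two profiles"
means, the following Python sketch subtracts per-function event counts
from two runs. It is a conceptual sketch only, not cg_diff's
implementation or file format; the function names and counts are
hypothetical.

```python
def diff_profiles(before, after):
    """Return per-function count differences (after minus before)."""
    names = set(before) | set(after)
    delta = {name: after.get(name, 0) - before.get(name, 0) for name in names}
    # Keep only functions whose count actually changed.
    return {name: d for name, d in delta.items() if d != 0}


# Hypothetical instruction counts per function for two runs of a program.
before = {"main": 1000, "parse": 5000, "render": 2000}
after = {"main": 1000, "parse": 3500, "emit": 400}

result = diff_profiles(before, after)
for name in sorted(result):
    print(f"{name:10s} {result[name]:+d}")
```

Negative values mark functions that became cheaper after the change,
positive values mark functions that became more expensive.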
* Callgrind can now do branch prediction simulation, similar to
Cachegrind.
In addition, it can optionally count the number of
executed global bus events.
Both can be used to better
approximate "Cycle Estimation" as a derived event (you need to
update the event formula in KCachegrind yourself).
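
A derived event of this kind can be thought of as a weighted sum of
the collected event counts. The Python sketch below shows the idea.
The event labels, weights and counts are illustrative assumptions,
not the formula shipped with KCachegrind.

```python
def cycle_estimate(events, weights):
    """Weighted sum of event counts; events missing from a run count as 0."""
    return sum(weights[name] * events.get(name, 0) for name in weights)


# Hypothetical weights: cost in cycles attributed to each event type.
weights = {"Ir": 1, "L1m": 10, "LLm": 100, "Bm": 10, "Ge": 20}

# Hypothetical counts: instructions, L1 misses, LL misses,
# mispredicted branches, global bus events.
events = {"Ir": 1_000_000, "L1m": 2_000, "LLm": 100, "Bm": 500, "Ge": 50}

estimate = cycle_estimate(events, weights)
print(estimate)
```

Adding the branch misprediction and bus event terms to such a formula
is what lets the estimate reflect the newly collected events.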
* Cachegrind and Callgrind now refer to the LL (last-level) cache
rather than the L2 cache.
This is to accommodate machines with
three levels of caches -- if Cachegrind/Callgrind auto-detects the
cache configuration of such a machine it will run the simulation as
if the L2 cache isn’t present.
This means the results are less
likely to match the true result for the machine, but
Cachegrind/Callgrind’s results are already only approximate, and
should not be considered authoritative.
The results are still