Helgrind: a thread error detector
• Qt version 4.X. Qt 3.X is harmless in that it only uses POSIX pthreads primitives. Unfortunately Qt 4.X has its
own implementation of mutexes (QMutex) and thread reaping. Helgrind 3.4.x contains direct support for Qt 4.X
threading, which is experimental but is believed to work fairly well. A side effect of supporting Qt 4 directly is
that Helgrind can be used to debug KDE4 applications. As this is an experimental feature, we would particularly
appreciate feedback from folks who have used Helgrind to successfully debug Qt 4 and/or KDE4 applications.
• Runtime support library for GNU OpenMP (part of GCC), at least for GCC versions 4.2 and 4.3. The GNU
OpenMP runtime library (libgomp.so) constructs its own synchronisation primitives using combinations of
atomic memory instructions and the futex syscall, which causes total chaos in Helgrind since it cannot
"see" those.
Fortunately, this can be solved using a configuration-time option (for GCC). Rebuild GCC from source, and
configure using --disable-linux-futex. This makes libgomp.so use the standard POSIX threading
primitives instead. Note that this was tested using GCC 4.2.3 and has not been re-tested using more recent
GCC versions. We would appreciate hearing about any successes or failures with more recent versions.
If you must implement your own threading primitives, there is a set of client request macros in helgrind.h to
help you describe your primitives to Helgrind. You should be able to mark up mutexes, condition variables, etc.,
without difficulty.
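As an illustration of marking up a home-grown mutex, here is a minimal sketch of a spinlock built on std::atomic_flag. It assumes that the installed helgrind.h provides the VALGRIND_HG_MUTEX_* client request macros; where the header or a macro is missing, the sketch falls back to no-op definitions so that it still builds. The names AnnotatedSpinLock and run_spinlock_demo are hypothetical and exist only for this example.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Pull in the real client requests when Valgrind's headers are installed.
#if defined(__has_include)
#  if __has_include(<valgrind/helgrind.h>)
#    include <valgrind/helgrind.h>
#  endif
#endif
// Fallback no-ops (an assumption of this sketch) so it builds without Valgrind.
#ifndef VALGRIND_HG_MUTEX_INIT_POST
#  define VALGRIND_HG_MUTEX_INIT_POST(mutex, isRecursive) ((void)0)
#endif
#ifndef VALGRIND_HG_MUTEX_LOCK_PRE
#  define VALGRIND_HG_MUTEX_LOCK_PRE(mutex, isTryLock) ((void)0)
#endif
#ifndef VALGRIND_HG_MUTEX_LOCK_POST
#  define VALGRIND_HG_MUTEX_LOCK_POST(mutex) ((void)0)
#endif
#ifndef VALGRIND_HG_MUTEX_UNLOCK_PRE
#  define VALGRIND_HG_MUTEX_UNLOCK_PRE(mutex) ((void)0)
#endif
#ifndef VALGRIND_HG_MUTEX_UNLOCK_POST
#  define VALGRIND_HG_MUTEX_UNLOCK_POST(mutex) ((void)0)
#endif
#ifndef VALGRIND_HG_MUTEX_DESTROY_PRE
#  define VALGRIND_HG_MUTEX_DESTROY_PRE(mutex) ((void)0)
#endif

// Hypothetical user-level lock that Helgrind cannot "see" without markup.
class AnnotatedSpinLock {
public:
    AnnotatedSpinLock()  { VALGRIND_HG_MUTEX_INIT_POST(this, 0); }  // 0: not recursive
    ~AnnotatedSpinLock() { VALGRIND_HG_MUTEX_DESTROY_PRE(this); }

    void lock() {
        VALGRIND_HG_MUTEX_LOCK_PRE(this, 0);  // 0: blocking lock, not a trylock
        while (mFlag.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();        // let the current holder make progress
        }
        VALGRIND_HG_MUTEX_LOCK_POST(this);    // the lock is now held
    }

    void unlock() {
        VALGRIND_HG_MUTEX_UNLOCK_PRE(this);
        mFlag.clear(std::memory_order_release);
        VALGRIND_HG_MUTEX_UNLOCK_POST(this);
    }

private:
    std::atomic_flag mFlag = ATOMIC_FLAG_INIT;
};

// Several threads increment a plain (non-atomic) counter under the lock.
long run_spinlock_demo(int nthreads, int iterations) {
    AnnotatedSpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < nthreads; ++i) {
        workers.emplace_back([&lock, &counter, iterations] {
            for (int k = 0; k < iterations; ++k) {
                std::lock_guard<AnnotatedSpinLock> guard(lock);
                ++counter;
            }
        });
    }
    for (auto& t : workers) t.join();
    return counter;
}
```

Each PRE/POST pair brackets the operation it describes, so Helgrind can treat the spinlock like an ordinary mutex when it checks accesses to the counter.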
It is also possible to mark up the effects of thread-safe reference counting using the
ANNOTATE_HAPPENS_BEFORE, ANNOTATE_HAPPENS_AFTER and ANNOTATE_HAPPENS_BEFORE_FORGET_ALL macros. Thread-safe
reference counting using an atomically incremented/decremented refcount variable causes Helgrind problems
because a one-to-zero transition of the reference count means the accessing thread has exclusive ownership of the
associated resource (normally, a C++ object) and can therefore access it (normally, to run its destructor) without
locking. Helgrind doesn’t understand this, and markup is essential to avoid false positives.
Here are recommended guidelines for marking up thread-safe reference counting in C++. You only need to mark
up your release methods -- the ones which decrement the reference count. Given a class like this:
   class MyClass {
      unsigned int mRefCount;

      void Release ( void ) {
         unsigned int newCount = atomic_decrement(&mRefCount);
         if (newCount == 0) {
            delete this;
         }
      }
   };
the release method should be marked up as follows:
   void Release ( void ) {
      unsigned int newCount = atomic_decrement(&mRefCount);
      if (newCount == 0) {
         ANNOTATE_HAPPENS_AFTER(&mRefCount);
         ANNOTATE_HAPPENS_BEFORE_FORGET_ALL(&mRefCount);
         delete this;
      } else {
         ANNOTATE_HAPPENS_BEFORE(&mRefCount);
      }
   }
There are a number of complex, mostly-theoretical objections to this scheme. From a theoretical standpoint it
appears to be impossible to devise a markup scheme which is completely correct in the sense of guaranteeing to
remove all false races. The proposed scheme, however, works well in practice.
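For readers who want to try the release-method markup in a complete program, here is a minimal sketch. It replaces the unspecified atomic_decrement with std::atomic's fetch_sub, which is a substitution made for this example. The class Counted, its AddRef method and the helper run_refcount_demo are hypothetical. If helgrind.h or any of the three macros is unavailable, the sketch defines them as no-ops so that it still builds.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Pull in the real annotations when Valgrind's headers are installed.
#if defined(__has_include)
#  if __has_include(<valgrind/helgrind.h>)
#    include <valgrind/helgrind.h>
#  endif
#endif
// Fallback no-ops (an assumption of this sketch) so it builds without Valgrind.
#ifndef ANNOTATE_HAPPENS_BEFORE
#  define ANNOTATE_HAPPENS_BEFORE(obj) ((void)(obj))
#endif
#ifndef ANNOTATE_HAPPENS_AFTER
#  define ANNOTATE_HAPPENS_AFTER(obj) ((void)(obj))
#endif
#ifndef ANNOTATE_HAPPENS_BEFORE_FORGET_ALL
#  define ANNOTATE_HAPPENS_BEFORE_FORGET_ALL(obj) ((void)(obj))
#endif

// Hypothetical reference-counted class; it counts destructor runs in *destroyed.
class Counted {
public:
    explicit Counted(std::atomic<int>* destroyed)
        : mRefCount(1), mDestroyed(destroyed) {}

    void AddRef() { mRefCount.fetch_add(1, std::memory_order_relaxed); }

    void Release() {
        unsigned int oldCount = mRefCount.fetch_sub(1, std::memory_order_acq_rel);
        assert(oldCount > 0);                  // releasing more often than acquiring is a bug
        unsigned int newCount = oldCount - 1;  // fetch_sub returns the previous value
        if (newCount == 0) {
            // Last owner: everything other threads did before their Release
            // happens before the destructor runs.
            ANNOTATE_HAPPENS_AFTER(&mRefCount);
            ANNOTATE_HAPPENS_BEFORE_FORGET_ALL(&mRefCount);
            delete this;
        } else {
            // Only the address is used here; the object is not touched again.
            ANNOTATE_HAPPENS_BEFORE(&mRefCount);
        }
    }

private:
    ~Counted() { mDestroyed->fetch_add(1); }  // private: destruction only via Release

    std::atomic<unsigned int> mRefCount;
    std::atomic<int>* mDestroyed;
};

// Hands one reference to each worker thread and returns how many times the
// destructor ran once every reference has been released.
int run_refcount_demo(int nthreads) {
    std::atomic<int> destroyed{0};
    Counted* obj = new Counted(&destroyed);    // the creator holds one reference
    std::vector<std::thread> workers;
    for (int i = 0; i < nthreads; ++i) {
        obj->AddRef();                         // taken before the worker starts
        workers.emplace_back([obj] { obj->Release(); });
    }
    obj->Release();                            // drop the creator's reference
    for (auto& t : workers) t.join();
    return destroyed.load();
}
```

Whichever thread performs the one-to-zero transition runs the destructor, so the helper should report exactly one destruction regardless of the number of workers.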