Helgrind: a thread error detector
The following section explains Helgrind’s race detection algorithm in more detail.
7.4.2. Helgrind’s Race Detection Algorithm
Most programmers think about threaded programming in terms of the basic functionality provided by the threading
library (POSIX Pthreads): thread creation, thread joining, locks, condition variables, semaphores and barriers.
The effect of using these functions is to impose constraints upon the order in which memory accesses can happen.
This implied ordering is generally known as the "happens-before relation". Once you understand the happens-before
relation, it is easy to see how Helgrind finds races in your code. Fortunately, the happens-before relation is itself easy
to understand, and is by itself a useful tool for reasoning about the behaviour of parallel programs. We now introduce
it using a simple example.
Consider first the following buggy program:
Parent thread:
Child thread:
int var;
// create child thread
pthread_create(...)
var = 20;
var = 10;
exit
// wait for child
pthread_join(...)
printf("%d\n", var);
The parent thread creates a child. Both then write different values to some variable
var
, and the parent then waits
for the child to exit.
What is the value of
var
at the end of the program, 10 or 20? We don’t know. The program is considered buggy (it
has a race) because the final value of
var
depends on the relative rates of progress of the parent and child threads. If
the parent is fast and the child is slow, then the child’s assignment may happen later, so the final value will be 10; and
vice versa if the child is faster than the parent.
The relative rates of progress of parent vs child is not something the programmer can control, and will often change
from run to run. It depends on factors such as the load on the machine, what else is running, the kernel’s scheduling
strategy, and many other factors.
The obvious fix is to use a lock to protect
var
.
It is however instructive to consider a somewhat more abstract
solution, which is to send a message from one thread to the other:
111