Helgrind: a thread error detector
Thread #1 is the program's root thread

Thread #2 was created
   at 0x511C08E: clone (in /lib64/libc-2.8.so)
   by 0x4E333A4: do_clone (in /lib64/libpthread-2.8.so)
   by 0x4E33A30: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.8.so)
   by 0x4C299D4: pthread_create@* (hg_intercepts.c:214)
   by 0x400605: main (simple_race.c:12)

Possible data race during read of size 4 at 0x601038 by thread #1
Locks held: none
   at 0x400606: main (simple_race.c:13)

This conflicts with a previous write of size 4 by thread #2
Locks held: none
   at 0x4005DC: child_fn (simple_race.c:6)
   by 0x4C29AFF: mythread_wrapper (hg_intercepts.c:194)
   by 0x4E3403F: start_thread (in /lib64/libpthread-2.8.so)
   by 0x511C0CC: clone (in /lib64/libc-2.8.so)

Location 0x601038 is 0 bytes inside global var "var"
declared at simple_race.c:3

This is quite a lot of detail for an apparently simple error. The main error message is the block beginning "Possible data race". It says there is a race as a result of a read of size 4 (bytes), at 0x601038, which is the address of var, happening in function main at line 13 in the program.

Two important parts of the message are:
• Helgrind shows two stack traces for the error, not one. By definition, a race involves two different threads accessing the same location in such a way that the result depends on the relative speeds of the two threads.

  The first stack trace follows the text "Possible data race during read of size 4 ..." and the second trace follows the text "This conflicts with a previous write of size 4 ...". Helgrind is usually able to show both accesses involved in a race. At least one of these will be a write (since two concurrent, unsynchronised reads are harmless), and they will of course be from different threads.

  By examining your program at the two locations, you should be able to get at least some idea of what the root cause of the problem is. For each location, Helgrind shows the set of locks held at the time of the access. This often makes it clear which thread, if any, failed to take a required lock. In this example neither thread holds a lock during the access.

• For races which occur on global or stack variables, Helgrind tries to identify the name and defining point of the variable. Hence the text "Location 0x601038 is 0 bytes inside global var "var" declared at simple_race.c:3".

  Showing names of stack and global variables carries no run-time overhead once Helgrind has your program up and running. However, it does require Helgrind to spend considerable extra time and memory at program startup to read the relevant debug info. Hence this facility is disabled by default. To enable it, you need to give the --read-var-info=yes option to Helgrind.
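
Since the report shows "Locks held: none" for both accesses, one plausible fix is to guard every access to var with a single mutex. The following is a minimal sketch under stated assumptions, not the actual contents of simple_race.c, which this section does not show. The names var and child_fn come from the report above. The names var_lock and run_locked_example are hypothetical and chosen only for illustration.

```c
#include <pthread.h>

/* var and child_fn mirror the names in the Helgrind report.
   var_lock and run_locked_example are hypothetical names. */
static int var = 0;
static pthread_mutex_t var_lock = PTHREAD_MUTEX_INITIALIZER;

/* Child thread: updates var while holding var_lock. */
static void *child_fn(void *arg)
{
    (void)arg;                       /* unused */
    pthread_mutex_lock(&var_lock);
    var++;
    pthread_mutex_unlock(&var_lock);
    return NULL;
}

/* Starts one child, updates var under the same lock, joins the child,
   and returns the final value of var (or -1 if the thread could not
   be created). */
int run_locked_example(void)
{
    pthread_t child;

    var = 0;                         /* no other thread exists yet */
    if (pthread_create(&child, NULL, child_fn, NULL) != 0)
        return -1;

    pthread_mutex_lock(&var_lock);
    var++;                           /* same lock as the child uses */
    pthread_mutex_unlock(&var_lock);

    pthread_join(child, NULL);       /* child has finished after this */
    return var;
}
```

With both accesses made under var_lock, the "Locks held" lines for the two threads would no longer be empty, and Helgrind would be expected to stop reporting a race on var. To check, one could call run_locked_example() from main, build with debug info, and run the result under valgrind --tool=helgrind, adding --read-var-info=yes if variable names are wanted in any remaining reports.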