Since my last example of fixing a bug received some interesting feedback (although I notice no upload of the package in question ..) we'll have another go.
Looking over my ~/.bash_history file, one command I use multiple times a day is make. Happily GNU make has at least one interesting bug open.
I verified this bug by saving the Makefile in the report and running make:
make: file.c:84: lookup_file: Assertion `*name != '\0'' failed.
(OK so this isn't a segfault; but an assertion failure is just as bad. Honest!)
So I downloaded the source to make, and rebuilt it. This left me with a binary with debugging symbols. The execution was much more interesting this time round:
*** glibc detected *** /home/skx/./make: double free or corruption (fasttop): 0x00000000006327b0 ***
======= Backtrace: =========
[snip mucho texto]
And once I'd allowed core-file creation ("ulimit -c 9999999") I found I had a core file to help debugging.
Running the unstripped version under gdb showed this:
#5 0x00000000004120a5 in multi_glob (chain=0x1c, size=40) at read.c:3106
3106 free (memname);
So it seems likely that this free is causing the abort. There are two simple things to do here:
- Comment out the free() call - to see if the crash goes away (!)
- Understand the code to see why this pointer might be causing us pain.
To get started I did the first of these. Commenting out the free() call did indeed make the crash go away, or at least masked it (at the cost of a memory leak):
make: *** No rule to make target `Erreur_Lexicale.o', needed by `compilateur'. Stop.
So, now we need to go back to read.c and see why that free was causing problems.
The function containing the free() is "multi_glob". It has scary pointer magic in it, and it took me a lot of tracing to determine the source of the bug. In short, we need to reset the pointer once it has been freed, by adding this immediately after the free() call:
memname = 0;
Otherwise the memory is freed multiple times (once each time through the loop in that function; see the source for details).
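
To make the pattern concrete, here is a minimal sketch of the same kind of loop. It is not the real multi_glob code: the function name count_members and the "archive(member)" parsing are made up for illustration, and I'm assuming only that the pointer is allocated on some passes through the loop and freed on every pass.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the loop in multi_glob; NOT the real make source.
   Counts entries of the form "archive(member)". */
size_t count_members(const char *const *names, size_t n)
{
    char *memname = NULL;   /* only allocated for some entries */
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        const char *open = strchr(names[i], '(');
        if (open != NULL) {
            /* Copy the text between '(' and ')' into a fresh buffer. */
            size_t len = strcspn(open + 1, ")");
            memname = malloc(len + 1);
            if (memname == NULL)
                return found;
            memcpy(memname, open + 1, len);
            memname[len] = '\0';
            found++;
        }

        /* ... work with memname would happen here ... */

        free(memname);      /* free(NULL) is a harmless no-op */
        memname = NULL;     /* without this, a later pass that skips the
                               allocation would free the same block again */
    }
    return found;
}
```

Given a list such as "libfoo.a(bar.o)", "plain.o", "other.o", the second and third passes never allocate anything. With the reset in place they end up calling free(NULL), which is fine; without it they would call free() on the block that the first pass already released, which is exactly the sort of thing glibc complains about.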