Some people get by with a little understanding

Sunday, 9 March 2008

Since my last example of fixing a bug received some interesting feedback (although I notice no upload of the package in question ..) we'll have another go.

Looking over my ~/.bash_history file one command I use multiple times a day is make. Happily GNU make has at least one interesting bug open:

I verified this bug by saving the Makefile in the report and running make:

skx@gold:~$ make
make: file.c:84: lookup_file: Assertion `*name != '\0'' failed.
Aborted

(OK so this isn't a segfault; but an assertion failure is just as bad. Honest!)

So I downloaded the source to make, and rebuilt it. This left me with a binary with debugging symbols. The execution was much more interesting this time round:

skx@gold:~$ ./make
*** glibc detected ***
  /home/skx/./make: double free or corruption (fasttop): 0x00000000006327b0 ***
======= Backtrace: =========
/lib/libc.so.6[0x2b273dbdd8a8]
/lib/libc.so.6(cfree+0x76)[0x2b273dbdf9b6]
/home/skx/./make[0x4120a5]
/home/skx/./make[0x4068ee]
/home/skx/./make[0x406fb2]
...
[snip mucho texto]

And once I'd allowed core-file creation ("ulimit -c 9999999") I found I had a core file to help debugging.

Running the unstripped version under gdb showed this:

(gdb) up
#5  0x00000000004120a5 in multi_glob (chain=0x1c, size=40) at read.c:3106
3106			    free (memname);

So it seems likely that this free is causing the abort. There are two simple things to do here:

  • Comment out the free() call - to see if the crash goes away (!)
  • Understand the code to see why this pointer might be causing us pain.

To get started I did the first of these: Commenting out the free() call did indeed fix the problem, or at least mask it (at the cost of a memory leak):

skx@gold:~$ ./make
make: *** No rule to make target `Erreur_Lexicale.o', needed by `compilateur'.  Stop.

So, now we need to go back to read.c and see why that free was causing problems.

The function containing the free() is "multi_glob". It has scary pointer magic in it, and it took me a lot of tracing to determine the source of the bug. In short we need to change this:

free (memname);

To this:

free (memname);
memname = 0;

Otherwise the memory is freed multiple times, (once each time through the loop in that function. See the source for details).

Patch mailed.

| 5 comments.

 

Comments On This Entry

[gravitar] joeyo

Submitted at 16:40:24 on 11 march 2008

Nicely done. I often end up with something like this in my C code:
#define safe_free(a) { if(a) free(a); a=0; }
I never really understood why this (or something better) isn't in the standard C libs.
[gravitar] Marius Gedminas

Submitted at 17:52:30 on 11 march 2008

I love these posts about bug fixing.
joeyo: the if(a) check is redundant; free(0) is perfectly valid C.
(Meh. I should curb my http://xkcd.com/386/ instinct. Especially when my corrections aren't always correct.)
[author] Steve Kemp

Submitted at 20:35:49 on 11 march 2008

Thanks for the feedback!

I agree with Marius. It is valid to free a NULL pointer; but it isn't valid to free a pointer which has previously been freed. (Hence this bug!)

[gravitar] joeyo

Submitted at 20:09:23 on 12 march 2008

Marius, that's a good point. I was just using the check for null NULL as a poor-man's this-memory-has-already-been-freed flag. But you are right, K&R says free(NULL) "does nothing". Thanks!
[gravitar] Ondrej

Submitted at 00:29:55 on 14 march 2008

excellent post. keep sending more of these! :)

 

Comments are closed on posts which are more than ten days old.

Recent Posts

Recent Tags