Entries tagged bugs

With many a winding turn

25 September 2007 21:50

GNU Screen rocks, in general, but recently I've been using it a lot for custom applications and have discovered a pair of annoying bugs.

If you're not familiar with it then please read this GNU Screen tutorial - it really is worth getting to know!

Anyway onto the bugs:

  • Session names may not start with numbers.
  • Session names must be globally unique.

The two are related, but I'm not yet sure whether I should report bugs against the Debian package, and the code is sufficiently cryptic that I cannot create a fix yet.

Taking the bugs in reverse order please try this:

screen -S foo
[detach]
screen -S foo2
[detach]

At this point you should have two screen sessions "foo" and "foo2". You should be able to attach to them by running "screen -R foo" or "screen -R foo2". Let's try that:

skx@vain:~$ screen -R foo
There are several suitable screens on:
        22317.foo       (Detached)
        22342.foo2      (Detached)
Type "screen [-d] -r [pid.]tty.host" to resume one of them.

Even though foo should be sufficient to identify a unique screen, the first one, it doesn't let you attach. Nasty.

Yes, you can attach to it if you use the number, or the number plus the name:

screen -R 22317.foo

The second issue is related. Create a screen session with "screen -S 222". Now try to attach to it with "screen -R 222" - instead of attaching it gives you a brand new screen.

Ugh.


You see I'm falling in the vast abyss

24 October 2007 21:50

A while back I posted about a couple of my irritations with GNU Screen.

One of my irritations was the failure to reattach to sessions by name, if common prefixes were in use. For example with the following two (detached) sessions:

There are screens on:
        24419.abc       (Detached)
        24395.abcd      (Detached)
2 Sockets in /var/run/screen/S-skx.

The naive "screen -R abc" fails.
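The behaviour I'd hope for can be sketched in a few lines of C. This is only an illustration of the matching rule, not screen's code and not the patch discussed here: the function names are made up, and it assumes session identifiers look like "pid.name", as in the listing above. An exact name match wins outright, and a prefix is only accepted if it is unique.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the part of "pid.name" after the first dot, or the whole string. */
static const char *session_name(const char *id)
{
    const char *dot = strchr(id, '.');
    return dot ? dot + 1 : id;
}

/*
 * Hypothetical matcher: return the index of the session to attach to,
 * or -1 if nothing matches or the request is ambiguous.
 * If two sessions carry the very same name, the first one is returned.
 */
int pick_session(const char *const ids[], size_t count, const char *wanted)
{
    assert(ids != NULL && wanted != NULL);

    size_t len = strlen(wanted);
    int prefix_hit = -1;
    int prefix_hits = 0;

    for (size_t i = 0; i < count; i++) {
        const char *name = session_name(ids[i]);

        if (strcmp(name, wanted) == 0)
            return (int)i;              /* exact match: no ambiguity */

        if (strncmp(name, wanted, len) == 0) {
            prefix_hit = (int)i;        /* remember it, keep counting */
            prefix_hits++;
        }
    }

    /* Fall back to a prefix only when exactly one session has it. */
    return prefix_hits == 1 ? prefix_hit : -1;
}
```

With the two sessions above, asking for "abc" would pick 24419.abc, asking for "abcd" would pick 24395.abcd, and asking for "ab" would still be refused as ambiguous.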

Yesterday whilst looking over the screen bug list I came up with a patch. It isn't ideal as it introduces a new failure case, but I believe it is a step in the right direction and better than the current situation. See attachment to #361274 for the code.

Also I patched screen so that #330036 is now fixed, and the blankerprg primitive works as expected.

Finally I closed #317450 (with a version) as it has been fixed since Etch.

Fun stuff.

In the spirit of completeness I should say I had a stab at #447210 which is tilde (~) expansion in the chdir primitive, but gave up after a while as the code got too messy even for me.

The trivial s/~/getenv("HOME")/ approach works fine for the simple case, but dealing with the expansion of strings such as ~foo/bar/ gets messy quickly. I can offer my patch if there is any interest though as a stop-gap measure.
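To show why the ~foo/bar/ case is the awkward one, here is a rough sketch of the two cases side by side. It is not the patch attached to the bug, just an illustration: expand_tilde is a made-up name, the fixed-size buffers are arbitrary, and it assumes a POSIX system where getpwnam() can look up the named user.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Expand a leading "~" or "~user" in path, writing the result to buf.
 * Returns 0 on success, -1 if the user is unknown, $HOME is unset,
 * or the result does not fit in size bytes.
 */
int expand_tilde(const char *path, char *buf, size_t size)
{
    assert(path != NULL && buf != NULL);

    const char *home = "";
    const char *rest = path;

    if (path[0] == '~') {
        /* Everything between '~' and the first '/' is a user name. */
        rest = strchr(path, '/');
        if (rest == NULL)
            rest = path + strlen(path);

        size_t ulen = (size_t)(rest - (path + 1));
        if (ulen == 0) {
            home = getenv("HOME");      /* the simple "~/..." case */
        } else {
            char user[256];
            if (ulen >= sizeof user)
                return -1;
            memcpy(user, path + 1, ulen);
            user[ulen] = '\0';

            struct passwd *pw = getpwnam(user);   /* the "~foo/..." case */
            home = pw ? pw->pw_dir : NULL;
        }
        if (home == NULL)
            return -1;
    }

    int n = snprintf(buf, size, "%s%s", home, rest);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Paths that do not start with a tilde are copied through unchanged, so a caller could run every argument through the same function.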

Now I'm almost tempted to look over another package's bugs, but I think I'd rather eat pie & drink beer...

must. stop. talking. about. pies.

Update: Patch for tilde expansion submitted to #447210 - tested and seemed to cover all cases. Now time for beer!


Listen to me when I'm telling you

14 February 2008 21:50

So today I'm a little bit lazy and I've got the day off work. As my previous plan suggested I wanted to spend at least some of the day tackling semi-random bugs. Earlier I picked a victim: less.

less rocks, and I use it daily. I even wrote an introduction to less once upon a time.

So let's take a look at two bugs from the long-neglected pile. These two issues are basically the same.

They seem like simple ones to fix, with the same root cause. Here's an example if you want to play along at home:

 cp /dev/null testing
 gzip testing
 zless testing.gz

What do you see? I see this:

"testing.gz" may be a binary file.  See it anyway?

When I select "y" I see the raw binary of the compressed file.

So, we can reproduce it. Now to see why it happens. /bin/zless comes from the gzip package and is a simple shell script:

#!/bin/sh
# snipped a lot of text
LESSOPEN="|gzip -cdfq -- %s"; export LESSOPEN
exec less "$@"

So what happens if we run that?

$ LESSOPEN="|gzip -cdfq -- ~/testing.gz" /usr/bin/less ~/testing.gz
"/home/skx/testing.gz" may be a binary file.  See it anyway?

i.e. it fails in the same way. Interestingly this works just fine:

gzip -cdfq -- ~/testing.gz | less

So we've learnt something interesting and useful. We've learnt that when LESSOPEN is involved we get the bug. Which suggests we should "apt-get source less" and then "rgrep LESSOPEN ~/less-*/".

Doing so reveals the following function in filename.c:

	public char *
open_altfile(filename, pf, pfd)
	char *filename;
	int *pf;
	void **pfd;
{

/* code to test whether $LESSOPEN is set, and attempt to run the
   command if it is */

		/*
		 * Read one char to see if the pipe will produce any data.
		 * If it does, push the char back on the pipe.
		 */
		f = fileno(fd);
		SET_BINARY(f);

		if (read(f, &c, 1) != 1)
		{
			/*
			 * Pipe is empty.  This means there is no alt file.
			 */
			pclose(fd);
			return (NULL);
		}
		ch_ungetchar(c);
		*pfd = (void *) fd;
		*pf = f;
		return (save("-"));

That might not be digestible, but basically less runs the command specified in $LESSOPEN. If it can read a single character of output from that command, it replaces the file it was going to read with the output of the command instead!

(i.e. here less failed to read a single character, because the decompressed contents of our gzipped file were zero bytes long! So instead it reverted to showing the binary gzipped file.)
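The same "read one character" test is easy to try in isolation. The following is a sketch of the idea rather than the code from less: pipe_has_data is a made-up name, it uses popen() and fgetc() instead of less's own read() and ch_ungetchar(), and it throws the character away rather than pushing it back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/*
 * Run command through the shell and try to read one byte of its output.
 * Returns 1 if the command produced data, 0 if its output was empty,
 * and -1 if the command could not be started at all.
 */
int pipe_has_data(const char *command)
{
    assert(command != NULL);

    FILE *fp = popen(command, "r");
    if (fp == NULL)
        return -1;

    int c = fgetc(fp);      /* EOF here means "no output at all" */
    pclose(fp);

    return c != EOF;
}
```

A command that legitimately prints nothing (decompressing an empty file, say) and a command that has nothing to offer both return 0 here, which is exactly the ambiguity behind the bug.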

So we have a solution: If we want this to work we merely remove the "read a single character test". I can't think of a circumstance in which that would do the wrong thing, so I've submitted a patch to do that.

Bug(s) fixed.

Incidentally if you like this kind of "debugging by example" post, or hate it, do let me know, so I'll know whether to take notes next time or not.


Some people get by with a little understanding

9 March 2008 21:50

Since my last example of fixing a bug received some interesting feedback (although I notice no upload of the package in question ..) we'll have another go.

Looking over my ~/.bash_history file one command I use multiple times a day is make. Happily GNU make has at least one interesting bug open:

I verified this bug by saving the Makefile in the report and running make:

skx@gold:~$ make
make: file.c:84: lookup_file: Assertion `*name != '\0'' failed.
Aborted

(OK so this isn't a segfault; but an assertion failure is just as bad. Honest!)

So I downloaded the source to make, and rebuilt it. This left me with a binary with debugging symbols. The execution was much more interesting this time round:

skx@gold:~$ ./make
*** glibc detected *** /home/skx/./make: double free or corruption (fasttop): 0x00000000006327b0 ***
======= Backtrace: =========
/lib/libc.so.6[0x2b273dbdd8a8]
/lib/libc.so.6(cfree+0x76)[0x2b273dbdf9b6]
/home/skx/./make[0x4120a5]
/home/skx/./make[0x4068ee]
/home/skx/./make[0x406fb2]
...
[snip mucho texto]

And once I'd allowed core-file creation ("ulimit -c 9999999") I found I had a core file to help debugging.

Running the unstripped version under gdb showed this:

(gdb) up
#5  0x00000000004120a5 in multi_glob (chain=0x1c, size=40) at read.c:3106
3106			    free (memname);

So it seems likely that this free is causing the abort. There are two simple things to do here:

  • Comment out the free() call - to see if the crash goes away (!)
  • Understand the code to see why this pointer might be causing us pain.

To get started I did the first of these: Commenting out the free() call did indeed fix the problem, or at least mask it (at the cost of a memory leak):

skx@gold:~$ ./make
make: *** No rule to make target `Erreur_Lexicale.o', needed by `compilateur'.  Stop.

So, now we need to go back to read.c and see why that free was causing problems.

The function containing the free() is "multi_glob". It has scary pointer magic in it, and it took me a lot of tracing to determine the source of the bug. In short we need to change this:

free (memname);

To this:

free (memname);
memname = 0;

Otherwise the memory is freed multiple times (once each time through the loop in that function; see the source for details).
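The pattern is easier to see in a stripped-down form. This is not the code from read.c, only a made-up loop with the same shape: the pointer is allocated on some iterations but freed on every one. The function name and the "non-empty items need a copy" condition are invented for the illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the loop: memname is only allocated for
 * some items, but free() runs at the end of every iteration.
 * Returns the number of allocations made, or -1 if malloc fails.
 */
int process_items(const char *const items[], size_t count)
{
    assert(items != NULL);

    char *memname = NULL;
    int allocations = 0;

    for (size_t i = 0; i < count; i++) {
        if (items[i][0] != '\0') {      /* only some items need a copy */
            memname = malloc(strlen(items[i]) + 1);
            if (memname == NULL)
                return -1;
            strcpy(memname, items[i]);
            allocations++;
        }

        /* ... work with memname would go here ... */

        free(memname);
        memname = NULL;     /* without this line, an iteration that skips
                               the malloc frees the previous block again */
    }
    return allocations;
}
```

Resetting the pointer works because free(NULL) is defined to do nothing, so the iterations that never allocated become harmless.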

Patch mailed.


Looks like me and Vincent caught you boys at breakfast

27 September 2009 21:50

It is interesting that François Marier recently posted a brief "howto" document on debugging problems caused by overly-aggressive filtering with privoxy, as I've recently been having problems with that tool.

My home network frequently changes configuration depending on what I'm concentrating upon, but every few months I'll start/cease using the following tools:

  • squid - The caching proxy server.
  • tor - The onion router.
  • privoxy - The filtering proxy.

Recently I was experimenting with XSS attacks against various browsers, which meant using them for real. As not all browsers have the same anti-advert setups I was running privoxy to filter out web-annoyances, and I spotted a major flaw with it.

Unfortunately I can only describe the problem, not reproduce it, or track it down. I'm 80% certain the bug is in privoxy, but the stack is suitably high that determining that for sure is problematic.

In short the issue is that HTTP requests would end up being sent to the wrong host:

  • I load my start page in one tab: http://www.steve.org.uk/start/
  • I click to open the following URL in another tab: http://www.perlmonks.org/?node=Newest Nodes.
  • The request gets sent to http://steve.org.uk/?node=...

After that clicking around consistently sends requests to the first HTTP host which was accessed successfully. So, for example, attempting to visit http://foo.com/bar/ will send the request to http://steve.org.uk/bar - which then gives a 404.

In terms of setup I use a dnsmasq DNS cache, privoxy and iceweasel from Debian unstable. From the symptoms I'm not sure if iceweasel's "KeepAlive" system is to blame, or if privoxy has a bad cache of hosts. Perhaps it is dnsmasq returning bogus DNS data, or my cable connection itself having DNS issues.

Anyway once the symptoms present themselves closing the browser and restarting the cache fixes it. Until the next time which might be hours or days later.

I'd report it as a bug - but I don't know where it should be. Privoxy caching things it shouldn't? iceweasel having keepalive issues? dnsmasq returning wrong DNS entries?

I'd ask "Have you seen this before, internet world?" but I guess if you have tracked it down it'd be fixed by now, and it clearly isn't!

Anyway for the moment I've uninstalled privoxy.

ObFilm: Pulp Fiction


There must be a name for bugs you only find post-release

18 July 2013 21:50

This week I made two releases of my mail client. Immediately after both releases I found bugs, despite having used the GitHub source tree on my box to read mail for days.

There must be a name for bugs that come up immediately after you've just made a release.

I'm torn between wanting to make a new release right now to fix the thing I spotted, or wait a few more days to fix a few other niggles.

Still I did write some cool code today:

  1. If a mail is received on the list debian.security-announce
  2. And the package in the Subject: is not installed on the current machine.
  3. The mail is marked as read.

Sure this means that a package on my webserver won't be visible to me, but my upgrade tool will see that. It just decreases the odds I read about an update that doesn't apply to me.
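For the curious, the rule boils down to something like the sketch below. This is not the code I run in my mail client, just the logic written out in C. It assumes the Subject: looks like "[SECURITY] [DSA nnnn-n] package security update", it takes the list of installed packages as a plain array instead of asking dpkg, and both function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the first word after the last ']' in the subject into pkg.
 * Returns 0 on success, -1 if no word is found or it does not fit.
 */
static int package_from_subject(const char *subject, char *pkg, size_t size)
{
    const char *p = strrchr(subject, ']');
    p = p ? p + 1 : subject;
    while (*p == ' ')
        p++;

    size_t len = strcspn(p, " ");
    if (len == 0 || len >= size)
        return -1;

    memcpy(pkg, p, len);
    pkg[len] = '\0';
    return 0;
}

/*
 * Return 1 if the mail should be marked as read: it arrived on
 * debian.security-announce and names a package that is not installed.
 * Anything we cannot parse is left unread.
 */
int should_mark_read(const char *list, const char *subject,
                     const char *const installed[], size_t count)
{
    assert(list != NULL && subject != NULL);

    char pkg[128];

    if (strcmp(list, "debian.security-announce") != 0)
        return 0;
    if (package_from_subject(subject, pkg, sizeof pkg) != 0)
        return 0;

    for (size_t i = 0; i < count; i++)
        if (strcmp(installed[i], pkg) == 0)
            return 0;       /* installed here: keep it unread */

    return 1;
}
```

Leaving unparseable subjects unread is the cautious choice: the worst case is reading an announcement that didn't apply.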

ObQuote: "don't pay heed to temptation
for his hands are so cold"
