Entries posted in December 2010

It would be nice if we could record which files populate or read

Saturday, 1 January 2011

It would be really neat if there were some tool which recorded which dotfiles an application read, used, or created.

As an example emacs uses .emacs, but won't create it. However firefox will create and fill ~/.mozilla if it isn't present, and links will create ~/.links2.

What would we do with that data? I'm not sure off the top of my head, but I think it is interesting to collect regardless. Perhaps a simple tool such as apt-file to download the data and let you search:

who-creates ~/.covers
who-creates ~/.dia

Obviously the simple use is to purge user-data when matching packages are removed - e.g. dpkg-posttrigger hook. But that's a potentially dangerous thing to do.

Anyway I'm just pondering - I expect that over time applications will start switching to using "centralised" settings such as ~/.gconf2 etc.

In the menatime I've started cleaning up ~/ on my own machines - things like ~/.spectemurc, ~/.grip, etc.

ObQuote: What a long sword. I like that in a man - Blood of the Samurai (Don't be tempted; awful film.)



Minor success has been achieved in reimplimenting mod_rewrite

Tuesday, 28 December 2010

So yesterday I mentioned the mod_rewrite-compatible proxy server. Today I've spent several hours getting to grips with that.

I've progressed far enough along that the trivial cases are handled, as the following test case shows:

TestParamteredMatch (CuTest * tc)
     * We pretend this came in via the network.
    char *request = "GET /login/steve.kemp/secret#password HTTP/1.1\n\n";

     * This is the mod_rewrite rule we're going to test.
    char *rule    = "RewriteRule

    int res = 0;

    /* Parse the HTTP request */
    struct http_request *req = http_request_new (request);
    CuAssertPtrNotNull (tc, req);

    /* Ensure it looks sane. */
    CuAssertStrEquals(tc, "/login/steve.kemp/secret#password", req->path );

    /* Create the rewrite rule */
    struct rewrite_rule *r = rewrite_rule_new (rule);
    CuAssertPtrNotNull (tc, r);

    /* Assert it contains what we think it should. */
    CuAssertStrEquals(tc, "^/login/(.*)/(.*)/*$", r->pattern );

    /* Apply - expect success (==1) */
    res = rewrite_rule_apply( r, req );
    CuAssertIntEquals (tc, 1, res );

    /* Ensure path is updated. */
    CuAssertStrEquals(tc, "/cgi-bin/index.cgi?mode=login;lname=steve.kemp;lpass=secret#password", req->path );

    free_http_request (req);
    free_rewrite_rule (r);

So all is good? Sadly not.

I was expecting to handle a linked list of simple rules, but I've now realised that this isn't sufficient. Consider the following two (real) examples:

#  If the path is /robots.txt and the hostname isn't repository.steve.org.uk
# then redirect to the master one.
RewriteCond %{http_host} !^repository\.steve\.org\.uk
RewriteRule /robots.txt$  http://repository.steve.org.uk/robots.txt [R=permanent,L]

#  Request for :  http://foo.repository.steve.org.uk 
#  becomes:  http://repository.steve.org.uk/cgi-bin/hgwebdir.cgi/foo/file/tip
RewriteCond %{http_host} .
RewriteCond %{http_host} !^repository.steve.org.uk [NC]
RewriteCond %{http_host} ^([^.]+)\.repository.steve.org.uk [NC]
RewriteRule ^/$ http://repository.steve.org.uk/cgi-bin/hgwebdir.cgi/%1/file/tip [QSA,L]

So rather than having a simple linked list of rules for each domain I need to have a list of rules - each of which might in turn contain sub-rules. In terms of parsing this is harder than I'd like because it means I need to maintain state to marry up the RewriteCond & RewriteRules.

Still the problem isn't insurmountable and I'm pleased with the progress I've made. Currently I can implement enough of mod_rewrite that I could handle all of my existing sites except the single site I have with the complex rule demonstrated above.

(In all honesty I guess I could simplify my setup by dropping the wildcard hostname handling for the repository.steve.org.uk name, but I do kinda like it, and it makes for simple canonical mercurial repositories.)

ObQuote: - 300

| No comments


This weekend I have mostly been parsing HTTP

Monday, 27 December 2010

A few weeks ago I was lamenting the lack of of a reverse proxy that understood Apache's mod_rewrite syntax.

On my server I have a bunch of thttpd processes, each running under their own UID, each listening on local ports. For example I have this running:

thttpd -C /etc/thttpd/sites.enabled/blog.steve.org.uk

This is running under the Unix-user "s-blog", with UID 1015, and thus is listening on

steve@steve:~$ id s-blog
uid=1015(s-blog) gid=1016(s-blog) groups=1016(s-blog)

steve@steve:~$ lsof -i :1015
thttpd  26293 s-blog    0u  IPv4 1072632       TCP localhost:1015 (LISTEN)

Anyway in front of these thttpd processes I have Apache. It does only two things:

  • Expands mod_rewrite rules.
  • Serves as a proxy to the back-end thttpd processes.

Yes other webservers could be run in front of these processes, and yes other webservers have their own rewrite-rule-syntax. I'm going to say "Lala lala can't hear you". Why? Because mod_rewrite is the defacto standard. It is assumed, documented, and included with a gazillion projects from wikipedia to wordpress to ...

So this weekend I decided I'd see what I needed to do to put together a simple proxy that only worked for reverse HTTP, and understood Apache's mod_rewrite rules.

I figure there are three main parts to such a beast:

Be a network server

Parse configuration file, accept connections, use fork(), libevevent(), or similar such that you ultimately receive HTTP requests..

Process HTTP Requests, rewriting as required

Once you have a parsed HTTP-request you need to test against each rule for the appropriate destination domain. Rewriting the request as appopriate.


Send your (potentially modified) request to the back-end, and then send the response back to the client.

Thus far I've written code which takes this:

GET /etc/passwd?root=1 HTTP/1.0
Host: foo.example.com:80
Accept-Language: en-us,en-gb?q=0.5
Refer: http://foo.bar.com/
Keep-Alive: 300
Connection: keep-alive

Then turns it into this:

struct http_request
   * Path being requested. "/index.html", etc.
  char *path;

   * Query string
  char *qs;

   * Method being requested "GET/POST/PUT/DELETE/HEAD/etc".
  char *method;

   * A linked list of headers such as "Referer: foo",
   * "Connection: close"
  struct http_headers *headers;

There are methods for turning that back to a string, (so that you can send it on to the back-end), finding headers such as "Referer:", and so on.

The parser is pretty minimal C and takes only a "char *buffer" to operate on. It has survived a lot of malicious input, as proved by a whole bunch of test cases.

My next job is to code up the mod_rewrite rule-processor to apply a bunch of rules to one of these objects - updating the struct as we go. Assuming that I can get that part written cleanly over the next week or two then I'll be happy and can proceed to write the networking parts of the code - both the initial accepting, and the proxying to the back-end.

In terms of configuration I'm going to assume something like:

/etc/proxy/global.conf                    : Global directives.

/etc/proxy/steve.org.uk/back-end          : Will contain
/etc/proxy/steve.org.uk/rewrite.conf      : Will contain the domain-wide rules

/etc/proxy/blog.steve.org.uk/back-end     : Will contain
/etc/proxy/blog.steve.org.uk/rewrite.conf : Will contain the domain-wide rules


That seems both sane & logical to me.

ObQuote: "I shall control the fate of the world... " - Day Watch.

ObRandom: Coding in C is pleasant again.

| No comments


The remote root hole in exim4 is painful

Sunday, 12 December 2010

Recently I noticed a report of an alleged remote root security compromise of a machine, via the exim mailserver.

At the time I wasn't sure how seriously to take it, but I followed updates on the thread and it soon became clear that there was a major problem on our hands.

It later became obvious that there were two problems:


A remote buffer overflow, allowing the execution of arbitrary code as the user Debian-exim.


A privilege escelation allowing the attacker to jump from running code as Debian-exim to running code as root.

Trivial exploits are floating around the internet - and we were seeing this bug be exploited in the wild as early as yesterday afternoon.

Although I can feel somewhat smug that my own personal server is running qpsmtpd ahead of exim it's still a wake-up call, and this hole has the potential to significantly expand available botnets - it is probably only a matter of days hours until we see worms taking advantage of the flaw.

ObPlug: I've put together an updated version of exim4 for etch - if you're still running etch then you don't have any official security support (timely upgrading is obviously preferred) and it might be useful to have more folk pointed at that..

ObQuote: "We're all going to die down here" - Resident Evil.



Recent Posts

Recent Tags