

Entries posted in March 2011

I updated my redis-based filesystem

2 March 2011 21:50

In July last year I made a brief post about a simple filesystem I'd put together which used Redis for the storage.

At that time I thought it was a cute hack, and didn't spend too much time with it. But recently I found a use for it, so I cleaned it up, synced up the C client for Redis which I used, and generally started to care again.

If it is useful you can now find it online:

The basic idea is the same as it was before, except I did eventually move to an INODE-like system. Each file/directory entry receives a unique identifier (integer) - and then I store the meta-data in keys based off that identifier.

This means for a file I might have keys and values like this:

INODE:1:NAME    The name of the file (e.g. "passwd").
INODE:1:SIZE    The size of the file (e.g. "1661").
INODE:1:GID     The group ID of the file's owner (e.g. "0").
INODE:1:UID     The user ID of the file's owner (e.g. "0").
INODE:1:MODE    The mode of the file (e.g. "0755").

To store these things I use a Redis "SET" which allows me to easily iterate over all the entries in each directory.
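As a rough sketch of that layout, the snippet below uses a plain Map and Set as stand-ins for Redis strings and sets. The helper names (inodeKey, createEntry, listDirectory), the in-process counter, and the assumption that each directory's set holds inode numbers are illustrative guesses, not the names or exact scheme redisfs uses.

```javascript
// In-memory stand-ins for Redis: strings live in a Map, sets in Sets.
// All names here are illustrative assumptions, not redisfs's own code.
const strings = new Map();
const sets = new Map();
let nextInode = 0; // stands in for an atomically incremented counter

function inodeKey(inode, field) {
  return 'INODE:' + inode + ':' + field;
}

// Create a file entry inside a directory and return its inode number.
function createEntry(dir, name, meta) {
  const inode = ++nextInode;
  strings.set(inodeKey(inode, 'NAME'), name);
  strings.set(inodeKey(inode, 'SIZE'), String(meta.size));
  strings.set(inodeKey(inode, 'UID'), String(meta.uid));
  strings.set(inodeKey(inode, 'GID'), String(meta.gid));
  strings.set(inodeKey(inode, 'MODE'), meta.mode);

  // The directory is modelled as a set of inode numbers (SADD in Redis terms).
  if (!sets.has(dir)) sets.set(dir, new Set());
  sets.get(dir).add(inode);
  return inode;
}

// List a directory by walking its set (SMEMBERS) and reading each name.
function listDirectory(dir) {
  const members = sets.get(dir) || new Set();
  return Array.from(members, (inode) => strings.get(inodeKey(inode, 'NAME')));
}

createEntry('/etc', 'passwd', { size: 1661, uid: 0, gid: 0, mode: '0755' });
createEntry('/etc', 'hosts', { size: 120, uid: 0, gid: 0, mode: '0644' });
console.log(listDirectory('/etc'));
```

Keeping one small key per attribute means a stat-like lookup only touches the fields it needs, at the cost of several round trips per file.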

ObQuote: "They fuck up, they get beat. We fuck up, they give us pensions. " - The Wire



A final update on redisfs

4 March 2011 21:50

I think I'm done with the redis filesystem for the moment. It does everything I need it to do. I am curious to see how much faster it could go if it were to use non-blocking writes, but that isn't a major concern.

There is only one missing feature I'm planning to play with - and that is the ability to implement snapshots.

As a refresher, redis is a key+value store, which mostly uses system memory. I've built a simple FUSE filesystem on top of that (now with symlink support!) and so all the file contents, meta-data, and similar are stored in memory.

Implementing snapshots could be done by two different routes:

Copying all keys and their values, under a new name.

For example right now, by default, all the filesystem entries in the root directory are stored beneath the key "SKX:/" - where "SKX" is the key prefix.

Assume I copy each existing key, and the associated value(s), giving them a new prefix such as "SKX2:" I can mount the filesystem against that prefix - and we've got a point-in-time snapshot.

Serialising all keys & values

Redis has a primitive which allows you to determine the names of keys at runtime. Given that all my filesystem keys have a prefix ("SKX:" by default) it wouldn't be difficult to find them, and serialise them.

This would require more effort to re-import and re-mount, but it should be portable across hosts.
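Both routes can be sketched against an in-memory Map standing in for the Redis keyspace. The function names, the JSON layout, and the exact key names below are assumptions for illustration; only the "SKX:" and "SKX2:" prefixes come from the description above.

```javascript
// A Map stands in for the Redis keyspace; names are illustrative only.
function keysWithPrefix(store, prefix) {
  // Redis offers KEYS for this kind of pattern lookup.
  return Array.from(store.keys()).filter((k) => k.startsWith(prefix));
}

// Route 1: copy every key under oldPrefix to the same name under newPrefix.
function snapshotByCopy(store, oldPrefix, newPrefix) {
  for (const key of keysWithPrefix(store, oldPrefix)) {
    const value = store.get(key);
    // Sets are copied so the snapshot does not share state with the original.
    const copy = value instanceof Set ? new Set(value) : value;
    store.set(newPrefix + key.slice(oldPrefix.length), copy);
  }
}

// Route 2: serialise every key under the prefix into a portable JSON string.
function serialise(store, prefix) {
  const out = {};
  for (const key of keysWithPrefix(store, prefix)) {
    const value = store.get(key);
    out[key] = value instanceof Set
      ? { type: 'set', members: Array.from(value) }
      : { type: 'string', value: value };
  }
  return JSON.stringify(out);
}

const store = new Map([
  ['SKX:/', new Set(['1'])],
  ['SKX:INODE:1:NAME', 'passwd'],
  ['OTHER:key', 'untouched'],
]);
snapshotByCopy(store, 'SKX:', 'SKX2:');
console.log(keysWithPrefix(store, 'SKX2:'));
console.log(serialise(store, 'SKX:'));
```

Against a live server the copy is not atomic, so writes would need to be paused while it runs for the result to be a true point-in-time snapshot.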

Anyway assuming I get this right we'll have a filesystem which is replication-friendly and snapshot-able. A fun combination.

ObQuote: "We had been everywhere. We had really seen nothing. " - Lolita



A few forks.

13 March 2011 21:50

Although I promised in my previous entry that I'd made my last mention of redisfs, the replication-friendly redis-based filesystem, I have to disappoint.

Ben Sykes informed me that he'd made a fork of redisfs, called shredisfs. I like forks. Forks are good, and this particular one was very welcome - from the README:

Steve's original managed around 250k per second for writes on my test machine, this version does about 15MB a second writes.


Read speed on the test machine here now is around 180MB/sec..

Needless to say I "stole" the improvements and rolled them into my original release. Thanks all round to those of you that submitted bug reports, suggestions, and codez.

I also had a release out briefly which used zlib to compress the values of keys stored in memory. Unfortunately that led to a net slowdown for files which were bigger than a single block, due to the overhead of file-system appends, which translated to: "fetch", "decompress", "append", and "compress".
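To see why appends get expensive once values are compressed, here is a minimal sketch using the zlib module bundled with current Node.js releases. It is not the C code from redisfs: the Map stands in for Redis, and appendCompressed and readCompressed are hypothetical helper names.

```javascript
const zlib = require('zlib');

// A Map stands in for Redis; values are stored zlib-compressed.
const store = new Map();

// Appending to a compressed value cannot be done in place: the whole
// value has to be fetched, inflated, extended and deflated again.
function appendCompressed(key, chunk) {
  const existing = store.has(key)
    ? zlib.inflateSync(store.get(key))                            // fetch + decompress
    : Buffer.alloc(0);
  const combined = Buffer.concat([existing, Buffer.from(chunk)]); // append
  store.set(key, zlib.deflateSync(combined));                     // compress
}

function readCompressed(key) {
  return zlib.inflateSync(store.get(key)).toString();
}

appendCompressed('block', 'hello ');
appendCompressed('block', 'world');
console.log(readCompressed('block'));
```

Each append reprocesses everything already written, so the cost grows with the size of the value rather than the size of the chunk.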

Finally on the subject of qpsmtpd, my favourite SMTP-server - the exceptionally talented Matt Sergeant released a proof of concept rewrite in Javascript. Using nodejs this new SMTP server is heavily asynchronous and you can find it online under the name Haraka.

Although my javascript-fu is weak I'm very impressed by the codebase out there so far, and I'd expect good things of it in the future. Even if it does use Javascript and not my beloved Perl.

Finally I'm a year older. But birthdays aren't important.

ObQuote: "Do you wanna know what makes all my candy taste so special? " - Epic Movie.



nodejs is fun

18 March 2011 21:50

A while back I was working on a mod_rewrite compatible proxy server written in C. The reason for this is that my current webhost uses Apache2 in front of a number of thttpd processes, and I'd like to remove apache and use something smaller/faster/neater.

Dividing things up I'm running about ten domains, and only around half of them use mod_rewrite rules - small enough perhaps to port the rules, large enough to make it annoying.

Upon reflection I think the thing to do is to replace apache with javascript - via node.js. Writing proxies with node.js is almost ridiculously simple - in fact doing anything HTTP-like is very very simple thanks to the bundled libraries.

Time will tell whether this is a waste of time or not, but I'm confident I could listen upon *:80 and route requests to localhost:N for a few domains.
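A minimal sketch of that idea, using only the bundled http module: look up the Host header in a table and forward the request to a port on the loopback adapter. The host names, port numbers and function names are hypothetical, and this is not the code of the proxy described in these posts.

```javascript
const http = require('http');

// Hypothetical routing table: Host header -> backend port on localhost.
const backends = {
  'example.com': 8001,
  'blog.example.com': 8002,
};

// Pick a backend port for a request, ignoring any ":port" suffix and case.
function pickBackend(hostHeader, table) {
  const host = String(hostHeader || '').split(':')[0].toLowerCase();
  return Object.prototype.hasOwnProperty.call(table, host) ? table[host] : null;
}

// Build (but do not start) a proxy that forwards each request to its backend.
function createProxy(table) {
  return http.createServer((req, res) => {
    const port = pickBackend(req.headers.host, table);
    if (port === null) {
      res.writeHead(404, { 'Content-Type': 'text/plain' });
      res.end('Unknown virtual host\n');
      return;
    }
    const upstream = http.request(
      { host: '127.0.0.1', port: port, method: req.method,
        path: req.url, headers: req.headers },
      (upstreamRes) => {
        // Relay the backend's status, headers and body to the client.
        res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(res);
      }
    );
    upstream.on('error', () => {
      if (!res.headersSent) res.writeHead(502);
      res.end();
    });
    req.pipe(upstream);
  });
}

// createProxy(backends).listen(80);  // binding to port 80 usually needs root
```

The listen call is left commented out so the sketch can be loaded without binding a privileged port.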

As a trivial toy I wrote a simple transforming proxy last night which performed simple rewrites as it passed traffic about.

Another fun, and possibly useful, thing I put together was a node.js port of the simple httpstat.us website. That server would have been easy to write in any number of languages, but in node it was almost too easy.
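For a flavour of how little code such a status server needs, here is a hedged sketch: it answers a request for /404 with a 404, /503 with a 503, and so on. The function names and the usage message are made up for this example, and it is not the port mentioned above.

```javascript
const http = require('http');

// Map a path such as "/404" to a known HTTP status code, or null.
// Codes below 200 are skipped because they are not final responses.
function statusFromPath(path) {
  const match = /^\/(\d{3})$/.exec(path);
  if (!match) return null;
  const code = Number(match[1]);
  return code >= 200 && http.STATUS_CODES[code] ? code : null;
}

// Build (but do not start) a server that replies with the requested status.
function createStatusServer() {
  return http.createServer((req, res) => {
    const code = statusFromPath(req.url.split('?')[0]);
    if (code === null) {
      res.writeHead(400, { 'Content-Type': 'text/plain' });
      res.end('Usage: GET /<status code>\n');
      return;
    }
    res.writeHead(code, { 'Content-Type': 'text/plain' });
    res.end(code + ' ' + http.STATUS_CODES[code] + '\n');
  });
}

// createStatusServer().listen(8080);  // hypothetical port
```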

Update - here is my rewriting node.js-based reverse-proxy.

ObQuote: "I just need you to stop being nice to me unless you're gonna marry me. " - He's just not that into you.



My node-reverse-proxy is both stable and public

20 March 2011 21:50

I posted a brief snippet of code on Friday which was my initial stab at a reverse HTTP proxy in Javascript (using node.js).

Over the past couple of days I've tidied it up, added a command line parser, and made it flexible enough that it works for me.

My node reverse HTTP proxy is now both documented ( a little ) and available for further eyeballs.

Usage is pretty much:

$ node ./node-reverse-proxy.js --config ./path/to/config.file.js

The configuration file defines lists of virtual hosts along with the destination back-ends to proxy to - which is usually going to be a server running upon a high port on the loopback adapter, but might not be.

In addition to that we can perform rewrites such as:

    /**
     * Handler for wildcard host: *.repository.steve.org.uk
     *
     * Rewrites for static files - these will be handled via a
     * separate virtual host.
     */
        'rules': {
            '^/robots.txt':  'http://repository.steve.org.uk/robots.txt',
            '^/favicon.ico': 'http://repository.steve.org.uk/favicon.ico',
        },

That says requests for http://chronicle.repository.steve.org.uk/robots.txt will be redirected to http://repository.steve.org.uk/robots.txt.
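One way such a rules table could be applied is sketched below: treat each key as a regular expression, test it against the request path, and answer with a redirect on the first match. The function names and the choice of a 302 status are assumptions; the real proxy may do this differently.

```javascript
// Rules in the same shape as the configuration above: pattern -> target URL.
const rules = {
  '^/robots.txt': 'http://repository.steve.org.uk/robots.txt',
  '^/favicon.ico': 'http://repository.steve.org.uk/favicon.ico',
};

// Return the redirect target for the first rule whose pattern matches the
// request path, or null when no rule applies.
function findRedirect(path, ruleSet) {
  for (const pattern of Object.keys(ruleSet)) {
    if (new RegExp(pattern).test(path)) {
      return ruleSet[pattern];
    }
  }
  return null;
}

// Answer a request with a redirect if a rule matches; returns true if handled.
function applyRules(req, res, ruleSet) {
  const target = findRedirect(req.url, ruleSet);
  if (target === null) return false;
  res.writeHead(302, { Location: target }); // status code is an assumption
  res.end();
  return true;
}

console.log(findRedirect('/robots.txt', rules));
```

Because the keys are regular expressions, the unescaped "." in a pattern matches any character, which is usually harmless for paths like these.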

Alternatively we can invoke javascript for each request matching a pattern:

    /**
     * static.steve.org.uk will mostly proxy to
     * but files beneath /private/ have an IP-based ACL.
     */
        host: 'localhost',
        port: '1008',

        'functions': {
            '/private': (function(orig_host, vhost, req, res) {
                var remote = req.connection.remoteAddress;

                if ( ( remote != "" ) &&
                     ( remote != "" ) &&
                     ( remote != "" ) &&
                     ( remote != "" ) )
                {
                    res.write( "Denied access to " + req.url  + " from " + remote );
                }
            }),
        },
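The chain of comparisons above can also be written as a lookup in an allow-list. In this sketch the addresses are documentation-only placeholders (RFC 5737), and the function names, the 403 status, and the true/false return convention are all assumptions, not the proxy's actual contract.

```javascript
// Hypothetical allow-list; these are documentation-only addresses (RFC 5737),
// not the ones used on the real server.
const allowed = ['192.0.2.10', '198.51.100.7'];

function isAllowed(remote, allowList) {
  return allowList.indexOf(remote) !== -1;
}

// Returns true when the request was rejected here and needs no proxying.
function denyUnlessAllowed(req, res, allowList) {
  // req.connection matches the snippet above; newer Node also offers req.socket.
  const remote = req.connection.remoteAddress;
  if (isAllowed(remote, allowList)) return false;
  res.writeHead(403, { 'Content-Type': 'text/plain' });
  res.write('Denied access to ' + req.url + ' from ' + remote);
  res.end();
  return true;
}
```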

Fun stuff. It was live for my server, replacing apache, for a few hours today. I need to add some trivial HTTP Basic-Auth handling then it will go back.

Otherwise I hope it is vaguely useful to others, and that the provided examples explain things neatly.

ObQuote: "Only one thing alive with less than four legs can hear this frequency" - Superman.



New software always causes surprises

21 March 2011 21:50

I recently deployed my node.js proxy server, removing all traces of Apache2 from my main server. During the course of this transition I discovered:

Bugs in my code

Not unexpected, in all honesty.

There were two main issues: the first related to how I handled the 304 response, the second to how I performed rewrites for my mercurial repository vhost.

Bugs in node.js

Given how new node.js is there wasn't a huge surprise here either, although I thought I'd been good testing against 0.2.x. As it turned out I needed to run the more recent 0.4.x to avoid a couple of issues:

In short I have a backported node.js package for Squeeze which is almost worthless. I'll update it in the near future. For the moment:

cd node-v0.4.3/
./configure --without-ssl --prefix=/opt/node-0.4.3 && make && make install
ln -fs /opt/node-0.4.3  /opt/node

Oddities in thttpd

thttpd is what actually runs my websites and I discovered during some extended debugging sessions that it just does not like HTTP requests starting with a doubled "/" character.

For example this works fine:

wget http://www.acme.com/software/thttpd/

But this fails:

wget http://www.acme.com//software/thttpd/

Previously it seems that Apache was (silently) fixing this up before it proxied requests. Now I have to do it myself, no big thing, but still a surprise.
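A fix-up of that kind can be very small. The sketch below collapses a run of leading slashes in the request path before it is handed to the backend, leaving any query string alone; the function name is hypothetical and this is not necessarily how the proxy itself does it.

```javascript
// Collapse a run of leading "/" characters in the path part of a request
// URL into a single "/", leaving any query string untouched.
function fixLeadingSlashes(url) {
  const q = url.indexOf('?');
  const path = q === -1 ? url : url.slice(0, q);
  const query = q === -1 ? '' : url.slice(q);
  return path.replace(/^\/+/, '/') + query;
}

console.log(fixLeadingSlashes('//software/thttpd/'));
```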

All in all it was worth it to be able to run:

dpkg --purge libapache2-mod-rpaf \
             apache2.2-common \
             apache2.2-bin \
             apache2-utils \
             apache2-mpm-prefork \
             apache2 \
             libaprutil1-dbd-sqlite3 \
             libaprutil1-ldap \
             libaprutil1

ObQuote: "Bad news. The fog's getting thicker." - Airplane!