About Archive Tags RSS Feed


Entries tagged caching

When you want to go to it

4 October 2007 21:50

Here's a quick question - does there exist a stable and reliable caching proxy for APT?

Both apt-proxy and approx cause me problems on a regular basis - with MD5 sum mismatches on Release files. And general hangs, stalls and timeouts.

I've just installed and configured apt-cacher but #437166 doesn't fill me with confidence..

| No comments


That is it, I'm going to do it

22 March 2014 21:50

That's it, I'm going to do it: I have now committed myself to writing a scalable, caching, reverse HTTP proxy.

The biggest question right now is implementation language; obviously "threading" of some kind is required so it is a choice between Perl's anyevent, Python's twisted, Rubys event machine, or node.js.

I'm absolutely, definitely, not going to use C, or C++.

Writing a a reverse proxy in node.js is almost trivial, the hard part will be working out which language to express the caching behaviour, on a per type, and per-resource basis.

I will ponder.



Some things on DNS and caching

29 March 2014 21:50

Although there wasn't too many comments on my what would you pay for? post I did get some mails.

I was reminded about this via Mario Langs post, which echoed a couple of private mails I received.

Despite being something that I take for granted, perhaps because my hosting comes from the Bytemark, people do seem willing to pay money for DNS hosting.

Which is odd. I mean you could do it very very very cheaply if you had just four virtual machines. You can get complex and be geo-fancy, and you could use anycast on a small AS, but really? You could just deploy four virtual machines0 to provide a.ns, b.ns, c.ns, d.ns, and be better than 90% of DNS hosters out there.

The thing that many people mentioned was Git-backed, or Git-based DNS. Which would be trivial if you used tinydns, and no much harder if you used bind.

I suspect I'm "not allowed" to do DNS-things for a while, due to my contract with Dyn, but it might be worth checking...

ObRandom: Beat me to it. Register gitdns.io, or similar, and configure hooks from github to compile tinydns records.

In other news I started documenting early thoughts about my caching reverse proxy, which has now got a name stockpile.

I wrote some stub code using node.js, and although it was functional it soon became callback hell:

  • Is this resource cachable?
  • Does this thing exist in the cache already?
  • Should we return the server's response to the client, archive to memcached, or do both?

Expressing the rules neatly is also a challenge. I want the server core to be simple and the configuration to be something like:

is_cachable ( vhost, source, request, backened )
     * If the file is static, then it is cachable.
    if ( request.url.match( /\.(jpg|png|txt|html?|gif)$/i ) ) {
        return true;

     * If there is a cookie then the answer is false.
    if ( request.has_cookie? ) { return false ; }

     * If the server is alive we'll now pass the remainder through it
     * if not then we'll serve from the cache.
    if ( backend.alive? ) {
        return false;
    else {
        return true;

I can see there is value in judging the cachability based on the server response, but I plan to ignore that except for "Expires:", "Etag", etc ,etc)

Anyway callback hell does make me want to reexamine the existing C/C++ libraries out there. Because I think I could do better.