About Archive Tags RSS Feed

 

Entries tagged haproxy

Redesigning my clustered website

7 February 2016 21:50

I'm slowly planning the redesign of the cluster which powers the Debian Administration website.

Currently the design is simple, and looks like this:

In brief there is a load-balancer that handles SSL-termination and then proxies to one of four Apache servers. These talk back and forth to a MySQL database. Nothing too shocking, or unusual.

(In truth there are two database servers, and rather than a single installation of HAProxy it runs upon each of the webservers - One is the master which is handled via ucarp. Logically though traffic routes through HAProxy to a number of Apache instances. I can lose half of the servers and things still keep running.)

When I setup the site it all ran on one host, it was simpler, it was less highly available. It also struggled to cope with the load.

Half the reason for writing/hosting the site in the first place was to document learning experiences though, so when it came to time to make it scale I figured why not learn something and do it neatly? Having it run on cheap and reliable virtual hosts was a good excuse to bump the server-count and the design has been stable for the past few years.

Recently though I've begun planning how it will be deployed in the future and I have a new design:

Rather than having the Apache instances talk to the database I'll indirect through an API-server. The API server will handle requests like these:

  • POST /users/login
    • POST a username/password and return 200 if valid. If bogus details return 403. If the user doesn't exist return 404.
  • GET /users/Steve
    • Return a JSON hash of user-information.
    • Return 404 on invalid user.

I expect to have four API handler endpoints: /articles, /comments, /users & /weblogs. Again we'll use a floating IP and a HAProxy instance to route to multiple API-servers. Each of which will use local caching to cache articles, etc.

This should turn the middle layer, running on Apache, into simpler things, and increase throughput. I suspect, but haven't confirmed, that making a single HTTP-request to fetch a (formatted) article body will be cheaper than making N-database queries.

Anyway that's what I'm slowly pondering and working on at the moment. I wrote a proof of concept API-server based CMS two years ago, and my recollection of that time is that it was fast to develop, and easy to scale.

| 8 comments