

Dynamically discovering settings for a cluster?

6 September 2013 21:50

Pretend I run a cluster for hosting a site. Pretend that I have three to six web-nodes, and each one needs to know which database host to contact.

How do I control that?

Right now I have a /etc/settings.conf file, more or less, deployed by Slaughter. That works. Another common pattern is to use a hostname - for example pmaster.example.org.

However, failover isn't considered here. If I wanted to point at a secondary database instead, I'd need to either:

  • Add code to retry the second host on failure.
    • Worry about divergence if some hosts used DB1, then DB2, then DB1 came back online.
    • Failover is easy. Fail-back is probably best avoided.
  • Worry about DNS caches and TTL.

In short, I'm imagining there are several situations where you want to abstract away configuration in a cluster-wide manner. (A real solution is obviously floating per-service IPs, via HAProxy, Keepalived, ucarp, etc. People do that quite often for databases specifically, but not for redis-servers, etc.)

So I'm pondering what is essentially a multicast-accessible key-value storage system.

Have a daemon on the VLAN which will respond to multicast questions like "get db", or "get cache", with a hostname/IP/result.

Suddenly your code would read:

  • Send mcast question ("which db?").
  • Get mcast reply ("db1").
  • Connect to db1.
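The steps above can be sketched in a few dozen lines of Python. This is only an illustration of the idea, not the real daemon: the group address, port, and question format ("get db") are all invented here.

```python
# Sketch of a multicast question/answer daemon for cluster settings.
# The group, port, and wire format below are invented for illustration.
import socket
import struct
import threading

MCAST_GROUP = "239.1.1.1"   # assumed site-local multicast group
MCAST_PORT = 5007

ANSWERS = {"get db": "db1", "get cache": "cache1"}   # example data

def serve(sock):
    """Answer each question datagram with the configured value."""
    while True:
        question, peer = sock.recvfrom(1024)
        answer = ANSWERS.get(question.decode(), "unknown")
        sock.sendto(answer.encode(), peer)

def start_server():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MCAST_PORT))
    try:
        # Join the group so questions sent to it reach us; this can fail
        # on hosts with no multicast route, but the daemon still answers
        # plain unicast datagrams sent to its port.
        mreq = struct.pack("4sl", socket.inet_aton(MCAST_GROUP),
                           socket.INADDR_ANY)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    except OSError:
        pass
    threading.Thread(target=serve, args=(sock,), daemon=True).start()

def ask(question, target=(MCAST_GROUP, MCAST_PORT), timeout=2.0):
    """Send a question and return the first reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.sendto(question.encode(), target)
    reply, _ = sock.recvfrom(1024)
    sock.close()
    return reply.decode()
```

A client then does `ask("get db")` and connects to whatever host comes back, with no per-node configuration file at all.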

To me that seems like it should be genuinely useful. But I'm unsure if I'm trading one set of problems for another.

I can't find any examples of existing tools/daemons in this area, which either means I'm being novel, innovative, and interesting. Or I'm over-thinking...



Comments on this entry

Steve Kemp at 10:20 on 6 September 2013

A colleague has pointed me at etcd - which looks like a similar system I'd never heard of.

Maybe I do need to look at Go after all. Or learn how zeroconf/avahi work.

Steven C. at 12:12 on 6 September 2013

With CARP you might be doing basically the same thing with ARP packets. Wanting to know where host 'db' is, a client asks with a broadcast on that subnet, "at which Ethernet address is this IP?". It gets a reply from db1, and so uses that.

Of course if your network doesn't allow the floating IP, or your DB nodes are not all on the same subnet, you might have to reinvent this at a higher layer like with avahi.

That's ARP balancing mode. In IP balancing mode, all traffic is multicast to all DB nodes. Exactly one node will answer, (hopefully) knowing which other DB nodes are down, and standing in for them.

In both cases you have a choice of a designated 'active' DB with the others on standby, or a load-balanced setup, obtained by taking a unique ID (or the source address) of the client, modulo the number of active servers.
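The modulo rule in that comment is small enough to show directly. A sketch, with invented names; the hash function is arbitrary as long as every node computes it the same way:

```python
# Sketch of the "source address modulo active servers" balancing rule.
# Any stable hash works; crc32 is used here purely for determinism.
import zlib

def pick_server(client_addr, active_servers):
    """Deterministically map a client onto one of the active servers.

    The choice is stable for a given client as long as the set of
    active servers doesn't change.
    """
    index = zlib.crc32(client_addr.encode()) % len(active_servers)
    return active_servers[index]
```

The caveat is the one you'd expect: when the set of active servers changes, most clients get remapped to a different server.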

Steve Kemp at 13:53 on 6 September 2013

Interesting answer, and yes you're right that's almost exactly how ARP stuff works.

I'm only considering the case here of a private VLAN, because that means you're not having to cope with malicious local clients.

For the moment I've experimented with reimplementing a small subset of etcd - A node HTTP server that spits out JSON, and can be updated via multicast broadcasts.

That solves my immediate problem and looks like it will be an interesting tool for the future - even with the obvious caveats it possesses.
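The shape of that experiment can be sketched roughly: an HTTP server that serves its key/value store as JSON, with writes arriving as UDP datagrams. The ports and the "set key value" wire format here are invented; the real tool may differ.

```python
# Rough sketch of a "small subset of etcd": an HTTP server that spits
# out its store as JSON, updated via UDP datagrams ("set key value").
# Ports and wire format are invented for illustration.
import json
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = {}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(STORE).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet

def udp_updates(port):
    """Apply 'set key value' datagrams to the in-memory store."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    while True:
        data, _ = sock.recvfrom(1024)
        parts = data.decode().split(None, 2)
        if len(parts) == 3 and parts[0] == "set":
            STORE[parts[1]] = parts[2]

def start(http_port=8080, udp_port=8081):
    threading.Thread(target=udp_updates, args=(udp_port,),
                     daemon=True).start()
    server = HTTPServer(("127.0.0.1", http_port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

An update becomes a single datagram to every node at once, and any client can then read the current state with a plain HTTP GET.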

Stephen Gran at 14:06 on 6 September 2013

This idea is called 'service discovery', and there's lots of ways to skin this cat. At one end of the spectrum are things like avahi/zeroconf, and at the other are things like zookeeper. It depends on whether you're happy to have anything on the network announcing a service be canonical, or whether you'd prefer to have a gatekeeper in the form of an API driven service-discovery mechanism.


Steven C. at 17:59 on 6 September 2013

If mstore ran on every client, could it proxy connections/requests through to the 'active' DB node? That way your clients merely try to connect to a DB at 'localhost', which has magically become an HA cluster.

Steve Kemp at 18:20 on 6 September 2013

It isn't a proxy.

But yes, in the general case you could wrap arbitrary TCP/IP services via something like rinetd, so that all clients connected to localhost.
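For example, a rinetd rule forwarding the local MySQL port to the current master might look like this (the hostname is a placeholder):

```
# /etc/rinetd.conf: bindaddress bindport connectaddress connectport
# Forward local port 3306 to the current master, here assumed to be
# db1.example.org; this line must be edited when the master changes.
127.0.0.1 3306 db1.example.org 3306
```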

The big downside is, obviously, you'd have to update the configuration of that proxy when the master changed.

I don't see it as a great solution, and it seems tangential to the problem that mstore is solving.

Steven C. at 19:14 on 6 September 2013

I didn't mean for it to be configured statically. I meant that instead of mstore keeping track of the active node and just telling clients where to find it, mstore could proxy TCP connections there directly - or, even better, proxy individual DB queries (if it understands the protocol) through to the node it knows is active at that exact moment, during a single uninterrupted connection from the client. The client would need no modifications that way, not knowing what's happening behind the scenes.
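The connection-level version of that suggestion is a small TCP forwarder that resolves the active node at connect time. A sketch, where `get_active_node()` stands in for whatever lookup mstore would perform at that moment:

```python
# Sketch of a local proxy that resolves the active DB node per incoming
# connection and pipes bytes both ways, so clients only ever talk to
# localhost.  get_active_node() is a placeholder for the real lookup.
import socket
import threading

def get_active_node():
    return ("db1.example.org", 5432)  # placeholder lookup

def pipe(src, dst):
    """Copy bytes until EOF, then shut down the write side."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def serve(listen_port):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", listen_port))
    listener.listen(5)
    while True:
        client, _ = listener.accept()
        # Resolve the active node at connect time, not at start-up.
        upstream = socket.create_connection(get_active_node())
        threading.Thread(target=pipe, args=(client, upstream),
                         daemon=True).start()
        threading.Thread(target=pipe, args=(upstream, client),
                         daemon=True).start()
```

Note this only repoints *new* connections; keeping a single uninterrupted client connection alive across a master change would require the query-level proxying described above.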

Steven C. at 19:50 on 6 September 2013

And just to widen the scope even more: why aren't DB queries themselves multicast, where one node of the cluster of DBs chooses to answer, and/or replicate to the others...

Steve Kemp at 20:39 on 6 September 2013

I understood your point; I just was using "get the database hostname" as shorthand for querying many different types of data.

Yes, having a clever auto-updating proxy for MySQL would be useful, but that isn't the problem I'm trying to solve here :)

PS: See also mysql-proxy, spockproxy, etc.