
Discovering back-end servers automatically?

11 March 2014 21:50

Recently I've been pondering how to do service discovery.

Pretend you have a load balancer which accepts traffic and routes incoming requests to different back-ends. The load-balancer might be pound, varnish, haproxy, nginx, or similar. The back-ends might be node applications, apache, or similar.

The typical configuration of the load-balancer will read:

# forward

# backends
backend web1  { .host = ""; }
backend web2  { .host = ""; }
backend web3  { .host = ""; }

#  afterword

I've seen this same setup in many situations, and while you can easily imagine "random HTTP servers" on your (V)LAN which shouldn't receive connections, it seems like a pain to keep updating the list of back-ends by hand.

Using UDP/multicast broadcasts it is trivial to announce "Hey, I'm an HTTP server with the name 'foo'", and it seems to me that this should allow seamless HTTP load-balancing.
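As a minimal sketch of that announcement (the multicast group, port, and JSON message format here are all invented for illustration, not part of any existing protocol):

```python
import json
import socket
import time

# Hypothetical multicast group/port for announcements; any unused
# administratively-scoped group would do.
GROUP, PORT = "239.255.0.1", 4242

def make_announcement(tag, host, port):
    """Build the "Hey, I'm an HTTP server with the name 'foo'" message."""
    return json.dumps({"tag": tag, "host": host, "port": port}).encode()

def announce_forever(tag, host, port, interval=5):
    """Broadcast the announcement periodically so it doubles as a heartbeat."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    while True:
        sock.sendto(make_announcement(tag, host, port), (GROUP, PORT))
        time.sleep(interval)
```

Repeating the announcement (rather than sending it once) is what lets the load-balancer notice when a back-end disappears.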

To be more explicit - this is normal:

  • The load-balancer listens for HTTP requests, and forwards them to back-ends.
  • When back-ends go away they stop receiving traffic.

What I'd like to propose is another step:

  • When a new back-end advertises itself with the tag "foo" it should be automatically added and start to receive traffic.

i.e. This allows back-ends to be removed from service when they go offline, but also to be added when they come online, without the load-balancer's configuration needing to be updated.

This means you'd not give a static list of back-ends to your load-balancer; instead you'd say "Route traffic to any service that advertises itself with the tag 'foo'.".

VLANs, firewalls, multicast, and UDP all come into play, but in theory this strikes me as being useful, obvious, and simple.

(Failure cases? Well if the "announcer" dies then the backend won't get traffic routed to it. Just like if the backend were offline. And clearly if a backend is announced, but not receiving HTTP-requests it would be dropped as normal.)
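The load-balancer side of the scheme can be sketched as a pool where each announcement refreshes a timestamp, and back-ends that stop announcing are aged out (the class name and timeout are invented for illustration):

```python
import time

class BackendPool:
    """Track announced back-ends; silence for `timeout` seconds drops one."""

    def __init__(self, timeout=15):
        self.timeout = timeout
        self.backends = {}  # (host, port) -> last time we heard an announcement

    def heard_from(self, host, port, now=None):
        # New back-ends are added automatically; known ones are refreshed.
        self.backends[(host, port)] = now if now is not None else time.time()

    def live(self, now=None):
        now = now if now is not None else time.time()
        # Drop anything we haven't heard from recently - the "announcer
        # died" failure case described above.
        self.backends = {k: t for k, t in self.backends.items()
                         if now - t < self.timeout}
        return sorted(self.backends)
```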

If I get the time this evening I'll sit down and look at some load-balancer source code to see if any are written in such a way that I could add this "broadcast discovery" as a plugin/minor change.



Comments on this entry

Matt at 14:55 on 11 March 2014

The way I've seen recently is to use something like etcd [1] to hold the location of your backends. The backend itself would register its location in etcd with a key that has a lifetime, and re-register on some sort of heartbeat. etcd can then be configured to call a script when a value is added or removed, and update your haproxy configuration with the valid backends.


[1]: https://github.com/coreos/etcd
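The register-with-a-lifetime-and-heartbeat scheme Matt describes can be sketched against etcd's v2 HTTP API; the endpoint, key layout, and TTL below are assumptions for illustration:

```python
import time
import urllib.parse
import urllib.request

# Assumed local etcd endpoint and key layout.
ETCD = "http://127.0.0.1:4001"

def registration_request(name, host, port, ttl=10):
    """Build a PUT that registers a back-end under a key with a TTL."""
    url = "%s/v2/keys/backends/%s" % (ETCD, name)
    body = urllib.parse.urlencode({"value": "%s:%d" % (host, port),
                                   "ttl": ttl}).encode()
    return urllib.request.Request(url, data=body, method="PUT")

def heartbeat(name, host, port, ttl=10):
    # Re-register at half the TTL so the key never expires while we're
    # alive; if this process dies, the key expires and the back-end
    # drops out of the configuration on its own.
    while True:
        urllib.request.urlopen(registration_request(name, host, port, ttl))
        time.sleep(ttl / 2)
```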

Steve Kemp at 14:58 on 11 March 2014

Yes, that does seem to be all the rage. I was pondering storing back-ends in redis, or some other system, but I suspect that's not the hard part.

The hard part is getting the load-balancer to poll backends from an external source - all the tools I've looked at so far assume they are static and come from the configuration file.

(Thus far I've glanced at nginx, varnish, pen, pound, haproxy, and fair.)

Anonymous at 16:06 on 11 March 2014

Rather than doing broadcasts, why not have the backends connect to the load balancer and notify it?

Bonus: you could even add security to that by making an authenticated connection to the load balancer, so it would work across an untrusted network (such as that of a hosting provider).

Steve Kemp at 16:11 on 11 March 2014

That would also work.

I guess the key thing is that I don't want a cronjob to update the configuration file and restart/reload the load-balancer. That is too prone to problems in my experience.

Instead of that I'd be thinking of something like one of the following:

  • As you say, the back-ends alerting the load-balancer, via an HTTP POST or some other mechanism.
  • The back-ends broadcasting, and the load-balancer noticing these broadcasts.
  • The load-balancer and the back-ends each using a mutually agreeable intermediary store such as redis, etcd, or similar.

There are probably more options that I'm missing, but from my point of view the key is making something that can allow the dynamic discovery of back-ends.
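The first of those options can be sketched as below. The /register endpoint is invented; a real load-balancer would need a small control API like this added, and as the anonymous commenter suggests you could authenticate it (a shared-secret header is one cheap option):

```python
import urllib.parse
import urllib.request

def register_with_balancer(balancer, tag, host, port, token=None):
    """Build the POST a back-end would send to announce itself.

    `balancer` is "host:port" of the load-balancer's (hypothetical)
    control API; `token` is an optional shared secret for authentication.
    """
    body = urllib.parse.urlencode({"tag": tag, "host": host,
                                   "port": port}).encode()
    headers = {"X-Auth-Token": token} if token else {}
    return urllib.request.Request("http://%s/register" % balancer,
                                  data=body, headers=headers, method="POST")
```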

Stephen Gran at 16:23 on 11 March 2014

Lots of ways to skin this cat. Cloud software like AWS and OpenStack have autoscaling groups where the members are poked into the load balancer automagically. Apache ZooKeeper is a service discovery tool designed to track things that announce what they do. There's also a new tool, serf (http://www.serfdom.io/intro/), that looks like it might be interesting.

Steve Kemp at 16:32 on 11 March 2014

serf seems like a very well thought out tool, thanks for sharing.

(The only downside I see is you need to configure the client to talk to one existing node. If it could auto-discover peers that would be perfect.)

cargill at 17:55 on 11 March 2014

For the discovery part, there are a couple of standardised protocols you could use; some of them are quite simple to set up (mDNS), some are quite complex (SLP):
- https://en.wikipedia.org/wiki/Service_Location_Protocol
- https://en.wikipedia.org/wiki/DNS-SD#DNS-SD
- https://en.wikipedia.org/wiki/Simple_Service_Discovery_Protocol

Steve Kemp at 18:47 on 11 March 2014

So pen looked interesting:

  • You can launch it as a load-balancer, and configure it to create a control-socket.
  • Using the control socket you can add/remove back-ends on the fly.
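A sketch of driving that control socket from Python. The command syntax mirrors what penctl sends ("server <slot> address <ip> port <port>"), but treat both the syntax and the port numbers as assumptions:

```python
import socket

def pen_command(slot, host, port):
    """Build a pen control command to point back-end slot `slot` at host:port."""
    return "server %d address %s port %d\n" % (slot, host, port)

def add_backend(control_host, control_port, slot, host, port):
    """Send the command to pen's control socket and return its reply."""
    with socket.create_connection((control_host, control_port)) as s:
        s.sendall(pen_command(slot, host, port).encode())
        return s.recv(4096).decode()
```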

Unfortunately I cannot trust it, because if you configure a control-socket anybody who can access it can overwrite arbitrary files on your server. Meh.

(Plus of course the damn thing runs as root, to allow it to bind to :80/:443. Allowing root-owned files to be trashed.)

Steven C. at 21:53 on 11 March 2014

Not what you wanted to do here, but an example of doing this at a lower layer would be for all foo-servers to share a CARP IP address (or some Linux equivalent of this feature). Each Internet-routed source IP maps to one of the alive machines in the pool, so you wouldn't want (or need) the separate load balancer machine[s] here. You'd probably run a separate Nginx/varnish instance on each machine instead.

Steve Kemp at 21:56 on 11 March 2014

Steven C: Doesn't that cut down on throughput though?

i.e. If I use ucarp then only one of the hosts would be serving traffic at once because only one host could have the virtual IP up at any given time?

Sure, you do gain redundancy, or hot-failover, but I've had to go to fiddly lengths to have all nodes in a cluster handle traffic when using ucarp, rather than having them sit idle except in the atypical event of an outage of one host.

Steven C. at 00:57 on 12 March 2014

Steve: I'm not sure about ucarp, but traditional carp should distribute traffic among all active nodes, in an already somewhat load-balanced way according to a hash of the source IP address.

That's why it's preferable not to have a load balancer for a carp setup. It might forward traffic from its own, single IP address and always get the same node. (I think OpenBSD designed their load balancer to spoof the original source IP on forwarded traffic, probably for that and other reasons [easier logging, or firewalling on the nodes if needed]). We want carp to see the original, Internet-routed source IP address to get proper load balancing.

The hash should give a static mapping for all traffic flows from a given source IP to go to the same node - unless the node goes down - then the next (failover) node won't recognise the established TCP connections and reset them.
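The static source-IP-to-node mapping Steven describes can be sketched like this (CARP's real hash is different; this just illustrates the property that a given source IP always lands on the same node while the node set is stable):

```python
import hashlib

def node_for(src_ip, nodes):
    """Map a source IP to one node, deterministically, via a hash."""
    digest = hashlib.sha1(src_ip.encode()).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]
```

Because the mapping depends only on the source IP and the node list, removing a node remaps some flows, and (as the comment notes) the new node will reset any established TCP connections it inherits.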

Steven C. at 01:08 on 12 March 2014

I find the cache invalidation issue interesting when you have more than one instance; I think this is something more aptly handled at the application layer. The backends might either need to do service discovery of all caches, or send some multicast purge request to all of them.

(You'd also get the slight inefficiency of having separate caches, but I doubt it's worth the effort of trying to exchange cacheable objects with each other.)

Steve Kemp at 08:30 on 12 March 2014

Yeah caching is hard as soon as you have more than one back-end.

I wrestled with the problem and decided there were two main approaches:

  • Application layer-caching; have all nodes use a common redis/memcached instance.
  • Proxy-level caching, using varnish, squid, nginx, or similar.

If you have control of your application you get a lot of gain via a shared storage area; the downside is that you suddenly have a single point of failure.

As documented in the link I shared previously I went with varnish, but the caching is naive. I cache almost everything - but then blow it away whenever I get a successful "HTTP POST".

(More or less. Every server-side operation that changes state calls "Cache::Flush" when it thinks the state has changed. So a handler serving an edit-profile form doesn't flush the cache, but when a submission occurs it does. In practice I could be way way more careful, but the programmer complexity goes away in my scheme and the hit rate is over 80% because most people are read-only users of the site, they don't post comments, blog entries, or vote.)
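The naive cache-everything-and-flush-on-write scheme described above can be sketched as follows (the class and method names are invented; the blog's actual code is not shown here):

```python
class NaiveCache:
    """Cache every built page; wipe everything when state changes."""

    def __init__(self):
        self.store = {}

    def get(self, key, builder):
        # Build the page once, then serve the cached copy.
        if key not in self.store:
            self.store[key] = builder()
        return self.store[key]

    def flush(self):
        # The equivalent of the blog's "Cache::Flush" - coarse but simple.
        self.store.clear()

def handle_post(cache, apply_change):
    """A state-changing handler: apply the change, then flush the cache."""
    apply_change()
    cache.flush()
```

The appeal is exactly what the comment says: no per-key invalidation logic, and with a mostly read-only audience the hit rate stays high anyway.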

Christian at 14:18 on 12 March 2014

We manage our caching server backend configs through saltstack. We provision apache nodes through salt and tag them with roles like production and balancePool1. Then salt, as the final stage of building a new apache node, kicks off the refresh varnish backends state. That job iterates through the known webservers and adds them to varnish as appropriate.

I like this scheme as it allows us to configure what our environment should look like, versus relying on broadcasts, which configure it based on what it currently is. I would rather define how I want it to look, and get alerted if the environment deviates from that.