by weddpros 3719 days ago
At clanofthecloud.com (Gaming backend as a service), we needed to implement autoscaling for DigitalOcean (DO), in node.js. We used consul.io for service discovery. CPU load was the central metric: each service instance stored its own load in consul.io, and we triggered the creation/destruction of instances when it crossed thresholds.
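
For anyone wondering what the threshold part looks like, here is a minimal sketch, not our actual code. The threshold values, the instance limits, and the names `decide` and `config` are all made up for illustration, and in a real setup the load values would be read from consul.io rather than passed in as an array.

```javascript
// Hypothetical settings: the real thresholds and limits are not given in this post.
const config = {
  scaleUpAt: 0.75,   // average CPU load above this => add an instance
  scaleDownAt: 0.25, // average CPU load below this => remove an instance
  minInstances: 2,
  maxInstances: 10,
};

// loads: one CPU load value (0..1) per running service instance,
// as each instance would have reported it to the shared store.
function decide(loads, cfg) {
  const count = loads.length;
  if (count < cfg.minInstances) return "up";

  const average = loads.reduce((sum, load) => sum + load, 0) / count;

  if (average > cfg.scaleUpAt && count < cfg.maxInstances) return "up";
  if (average < cfg.scaleDownAt && count > cfg.minInstances) return "down";
  return "hold";
}

console.log(decide([0.9, 0.8], config));      // high load on two instances
console.log(decide([0.1, 0.1, 0.1], config)); // low load on three instances
```

Keeping a gap between the two thresholds avoids flapping, where an instance is created and destroyed over and over because the load hovers around a single cutoff.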

In the end, we only used it for our sandbox environment, as the production env runs on bare metal (more capacity, cheaper at scale, easier to administer).

So I'd say, from our experience:

- DO's API was quite easy to work with

- consul.io was used as a reliable distributed source of information, for leader elections and health monitoring... Changing the autoscaler configuration in consul.io produced immediate results like starting/stopping new instances... Cool "remote control" effect ;-)

- haproxy/nginx load balancers use consul.io templates to update their configuration

- our autoscaler was HA, through a leader election. The instances managed themselves (no single point of failure). There were at least 2 instances running.

- you should expect a few "surprises" if you're running consul.io on DigitalOcean: heartbeats are delayed quite often (how often depends on the datacenter), which makes failure detection hard

- and of course, we used DO custom images to start new instances
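
To illustrate the leader election point, here is a rough sketch of the idea, not our implementation. It uses an in-memory stand-in for a lock-capable key/value store; `FakeLockStore`, `electionTick`, and the key name are hypothetical. With consul.io the same pattern is usually built on sessions and the KV store's acquire operation, with the lock released when the holder's session is invalidated.

```javascript
// In-memory stand-in for a KV store with lock semantics (NOT the consul.io client).
class FakeLockStore {
  constructor() {
    this.holders = new Map(); // key -> session id currently holding the lock
  }

  // Succeeds if the key is free or already held by this session.
  acquire(key, sessionId) {
    const holder = this.holders.get(key);
    if (holder === undefined || holder === sessionId) {
      this.holders.set(key, sessionId);
      return true;
    }
    return false;
  }

  // Simulates a session expiring (e.g. missed health checks): its locks are released.
  invalidateSession(sessionId) {
    for (const [key, holder] of this.holders) {
      if (holder === sessionId) this.holders.delete(key);
    }
  }
}

// Each autoscaler instance calls this periodically; only the leader acts on scaling decisions.
function electionTick(store, key, sessionId) {
  return store.acquire(key, sessionId) ? "leader" : "follower";
}

const store = new FakeLockStore();
const key = "autoscaler/leader"; // hypothetical key name
console.log(electionTick(store, key, "instance-a")); // first caller takes the lock
console.log(electionTick(store, key, "instance-b")); // second caller has to wait
store.invalidateSession("instance-a");               // leader dies or is partitioned
console.log(electionTick(store, key, "instance-b")); // standby takes over
```

With at least two instances polling like this, losing the leader only costs the time it takes for its session to expire, which is also why delayed heartbeats hurt: a paused leader can lose the lock while it is still alive.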

2 comments

> you should expect a few "surprises" if you're running consul.io on Digital Ocean, heartbeats are delayed quite often (depends on datacenters), which makes failure detection hard

Isn't that true for any CP store on any "cloud" provider?

I'd say it's more true of cloud providers who don't schedule your VM for 15+ seconds, from time to time.

As I said, we've deployed our production environment on bare metal (in reaction to this exact problem...)

Thank you. I'll look at consul.io more closely.