by ytsb 3630 days ago
How do you take a server out of rotation if one goes down? For example, if I hit a server in the round robin that isn't online, I could try force-refreshing the page to initiate a new lookup, but in that case 1 in every 3 requests to the site would still fail (assuming only one server is down, and hoping my client doesn't cache the initial resolution). In your post you mentioned that a browser can handle this transparently. Did you mean that if the client sees multiple A records for the domain and the connection to the first IP times out, it will automatically try the next IP in the round robin? Is this a browser standard, or does the behaviour differ between browsers?

Either way interesting project, thanks for sharing.

EDIT: Also it'd be cool if the code wasn't in a tarball. I'm on a mobile device (as I'm sure many of your users are too) that doesn't allow me to save/extract the archive. Would've liked to have had a browse through it! Maybe consider uploading to a service like GitHub, or having an extracted version available so we can view the contents directly in a browser? :)


I don't have to do anything to take a server out of rotation. Browsers automatically try all IPs, and then stick with the first IP they find that works. Even if some servers are down, your HTTP requests all continue to work. This is standard browser behavior. For a really extensive outage (2+ days), I would probably bother to manually update the DNS records.
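
To make the fallback concrete, here is a rough Python sketch of the "try each A record, stick with the first that works" idea. It is only an illustration of the behavior described above, not how any particular browser implements it; the function names `resolve_ipv4` and `connect_first_working` and the 3-second timeout are assumptions made up for this example.

```python
import socket


def resolve_ipv4(host, port):
    """Return the host's IPv4 addresses in resolver order, without duplicates."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    addrs = []
    for info in infos:
        ip = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if ip not in addrs:
            addrs.append(ip)
    return addrs


def connect_first_working(addrs, port, timeout=3.0,
                          connect=socket.create_connection):
    """Try each address in turn; return (ip, sock) for the first that connects.

    `connect` is injectable so the logic can be exercised without a network.
    """
    last_error = None
    for ip in addrs:
        try:
            sock = connect((ip, port), timeout=timeout)
            return ip, sock
        except OSError as exc:  # refused (RST), timed out, unreachable, ...
            last_error = exc
    raise ConnectionError(f"no address accepted a connection: {last_error}")
```

Calling `connect_first_working(resolve_ipv4("example.com", 80), 80)` would return the first IP that accepts a TCP connection. A client that remembers that IP for later requests gets the "stick with it" part. Note that this only covers servers that refuse or never answer the connection attempt.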

Thanks for the feedback about making the code browsable. Will consider.

That's only true if the origin doesn't respond at all. If one of your IPs accepts connections and then hangs, the browser will display an error.
The browser would time out after a few minutes. However, if the user stops loading the page and hits Reload, the browser will try another IP and the page will load successfully. I verified this behavior with Chrome. For a personal blog, that's definitely "good enough" HA.
If a webpage I view hangs on loading, I'm going to bounce and not come back. The fact that it's just a blog makes me even less inclined to wait it out.
When you think back on it, this was the true genius behind RSS aggregators. I could still read your blog when your blog was not working, because Google Reader (or your favorite aggregator) had already downloaded it once. It's too bad that RSS died.
Yes! Absolutely.
True. However, this specific type of failure should be relatively rare. I expect most of my outages to be either network issues or a powered-off machine (TCP timeout), or the web service not running (TCP RST). Accepting a connection and then not replying is just rare for something as dumb as an HTTP request for a static file.
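
For anyone who wants to tell these failure modes apart, here is a small Python sketch of a probe that separates a connect timeout, a refused connection (RST), and the "accepts but never replies" case. The function name `probe`, the result labels, and the default timeout are invented for this example; it is a sketch under those assumptions, not a monitoring tool.

```python
import socket


def probe(ip, port=80, host="example.com", timeout=5.0):
    """Classify how a server at (ip, port) responds to a trivial HTTP request."""
    try:
        sock = socket.create_connection((ip, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"          # TCP RST: nothing is listening on the port
    except socket.timeout:
        return "connect-timeout"  # no answer at all: machine or network down
    except OSError:
        return "unreachable"      # any other connection-level failure

    with sock:
        request = f"HEAD / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
        try:
            sock.sendall(request)
            data = sock.recv(1)   # wait for the first byte of a reply
        except socket.timeout:
            return "accepted-but-hung"  # connection accepted, but no reply
        except OSError:
            return "reset"
        return "ok" if data else "closed-without-reply"
```

Only the first three results are ones where a client can fall through to the next IP at connect time. With "accepted-but-hung", the connection has already succeeded, which is why that case needs a read timeout or a manual reload.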