by throw5323446 814 days ago
> Instead of messing around trying to repair it, simply kill the machine, or take it out of the pool. Get a new one.

"4:10pm the new machine still has the same performance issue"


Sure, but more often than not, esp in cloud scenarios, you just get a machine that is having a bad day, and it’s quicker to eject it, let the rest of the infra pick up the slack, and then debug from there. Additionally, if you’ve axed a machine and got the same issue, you know it’s not a machine issue, so go look at your networking layer or whatever configs you’re using to boot your machines from…
> esp in cloud scenarios

... so the nice thing about the cloud is that you can work around cloud-specific issues?

4:20pm Turns out it was DNS
That made me laugh. Thank you. Of course, it is not DNS. DNS has become the new cabling. DNS is not especially complicated, but neither is cabling. Yet during the dot-com era and the years after, cabling caused so many problems that we got used to checking the cabling first. It only took a few more years to realize that it is not always cabling; actually, failures are normally distributed.

Is it wrong to check DNS first? No, but please realize that DNS misconfiguration is no more common than other SNAFUs.

    It’s not DNS
    There’s no way it’s DNS
    It was DNS
Certificates are the new DNS for service breakages
That's actually amazing, a reproducible problem is a 90% solved problem!