HN Mirror

by chronid 3741 days ago

DNS is hard. Very hard.

It may seem trivial when it works (hint: it's not), but some of the biggest fuck-ups I've seen in my professional life were caused by strange DNS things happening or DNS servers going kaboom.

I feel the pain of the DO engineers trying to mitigate this issue. I really do.

3 comments

johansch 3741 days ago

BS. DNS is a trivial thing to scale, compared to most other web-scale efforts.

Things break when people don't use 20-year-old best practices. There is no defense against inexperience and ignorance.

takeda 3741 days ago

I took the OP's comment as "it's hard to understand DNS, and the biggest fuck-ups happen because people think they understand DNS when they actually don't".

The problem with DNS is that it can work even when it is configured incorrectly. This makes people who have no idea what they are doing think that they actually understand it. The strange issues with DNS only happen with strange configurations. When you follow best practices, everything is predictable.

johansch 3741 days ago

All right. This I can agree with.

thyrsus 3741 days ago

Please help the ignorant and provide a link to a description of those best practices.

dividuum 3741 days ago

> I feel the pain of the DO engineers trying to mitigate this issue. I really do.

Me too. Just last week they had another problem with DNS on the client side of things: Resolving with the Google Public DNS, which most droplets use by default, didn't work reliably. I hope that they post a combined post mortem for both of those incidents.

Thaxll 3741 days ago

It's not hard; the problem is that everything relies on DNS, so when DNS goes down or has problems you get cascading failures.

bitJericho 3741 days ago

That's why you use multiple providers.

scurvy 3741 days ago

I'll bite. You can have multiple NS records but only a single SOA. The .com registry minimum TTL for SOA is a day.

How in the world would "multiple providers" help you in a 6 hour outage?

bitJericho 3741 days ago

You can have multiple NS records. You should have NS records that point to different companies' DNS servers, and preferably different continents.
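
A rough sketch of how one might sanity-check the "different companies" part, assuming you already have the list of NS hostnames for a zone. The hostnames below are made up, and treating the last two labels of a hostname as "the provider" is only a heuristic (it breaks for suffixes like `co.uk`); `group_by_provider` and `has_provider_diversity` are hypothetical helper names, not part of any DNS library.

```python
from collections import defaultdict


def group_by_provider(nameservers):
    """Group NS hostnames by their last two labels (a crude stand-in for 'provider')."""
    groups = defaultdict(list)
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        provider = ".".join(labels[-2:])
        groups[provider].append(ns)
    return dict(groups)


def has_provider_diversity(nameservers, minimum=2):
    """True if the NS set spans at least `minimum` distinct providers."""
    return len(group_by_provider(nameservers)) >= minimum


# Hypothetical NS sets for illustration only.
single = ["ns1.provider-a.example.", "ns2.provider-a.example."]
mixed = ["ns1.provider-a.example.", "ns1.provider-b.example."]

print(has_provider_diversity(single))  # False: both names belong to one provider
print(has_provider_diversity(mixed))   # True: two distinct providers
```

This says nothing about whether the providers actually serve identical zone data, which is the harder part of running more than one.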

scurvy 3741 days ago

That's great for NS records. What about SOA?

bitJericho 3741 days ago

SOA isn't used for resolving names afaik.

vidarh 3741 days ago

Unless you use a better resolver than the standard glibc resolver on Linux (e.g. dnsmasq, bind or similar running locally, with resolv.conf pointing at it), you appear doomed to slow lookups etc. if your first resolv.conf entry fails, as most of the resolv.conf options that might have helped (if you'd set them) simply don't work or don't do anything particularly useful in the versions used in the Linux distros I've tested it on.
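
For reference, the options in question are `timeout`, `attempts` and `rotate`, as documented in resolv.conf(5). Below is a minimal sketch that parses them out of resolv.conf text and estimates the stall when the first nameserver is dead. The model is deliberately simplified: it assumes the resolver waits one full `timeout` on the first server before moving to the second, and it ignores `rotate`. Whether a given glibc version actually honours these options is exactly what the comment above disputes, so treat the number as what the documentation suggests, not what you will measure. The addresses and the helper names are made up for illustration.

```python
def parse_resolv_conf(text):
    """Extract nameservers plus the timeout/attempts/rotate options from resolv.conf text."""
    # Defaults follow resolv.conf(5): timeout 5 seconds, attempts 2.
    config = {"nameservers": [], "timeout": 5, "attempts": 2, "rotate": False}
    for raw in text.splitlines():
        # Drop comments (leniently: anything after '#' or ';') and surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            config["nameservers"].append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                name, _, value = opt.partition(":")
                if name in ("timeout", "attempts") and value.isdigit():
                    config[name] = int(value)
                elif name == "rotate":
                    config["rotate"] = True
    return config


def delay_before_fallback(config):
    """Simplified model: seconds spent on a dead first nameserver before the second is tried."""
    if len(config["nameservers"]) < 2:
        return None  # no second server to fall back to
    return config["timeout"]


# Hypothetical configurations (192.0.2.0/24 is a documentation-only range).
defaults = "nameserver 192.0.2.1\nnameserver 192.0.2.2\n"
tuned = """\
# hypothetical resolv.conf
nameserver 192.0.2.1
nameserver 192.0.2.2
options timeout:1 attempts:2 rotate
"""

print(delay_before_fallback(parse_resolv_conf(defaults)))  # 5
print(delay_before_fallback(parse_resolv_conf(tuned)))     # 1
```

Even under this optimistic model the stall is paid on every lookup that starts with the dead server, which is why a local caching resolver tends to be the recommended fix.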

pbarnes_1 3741 days ago

unbound is great for this.

jbaptiste 3741 days ago

Even with multiple providers, DNS issues are a cluster fuck.

takeda 3741 days ago

Only if you don't know what you're doing. The problem with DNS is that it might work even when it is misconfigured, and misconfiguration is the source of strange issues.

camikazeg 3741 days ago

I think that we all have areas where we don't know what we're doing. This is one of mine. With all the talk of how obvious/important/easy it is to have a failover in place in case this happens, I'm having trouble finding a good resource about setting up redundant DNS. I'm running a droplet on Digital Ocean with Debian and Nginx.

zaroth 3741 days ago

Sounds like we're begging for someone to write a nice blog post on how they set up redundant DNS across multiple providers "the right way"... Sounds like it would hit the front page in short order if anyone is willing to share how they think this should be mitigated, and specifically how to expect common clients to behave when faced with the various types of outages that may occur!

bitJericho 3741 days ago

Yeah I mean it takes work to do it correctly. I wouldn't call it a cluster fuck.

dsr_ 3741 days ago

Suppose you have multiple providers, but one of them screws up and authoritatively denies the existence of all of your hosts?

bitJericho 3741 days ago

That's what you keep an extremely low TTL for.

Karunamon 3741 days ago

Which doesn't mean much when a nontrivial number of ISPs out there don't respect the TTL settings.

Source: Days-long service degradation caused by customer ISP's caching bad DNS information well beyond the 10 minute TTL we had set.
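
To make the failure mode concrete, here is a toy model of a resolver cache, assuming a hypothetical `min_ttl` knob that stands in for "the ISP holds records longer than the zone asked for". The 600-second TTL matches the 10 minutes mentioned above; the one-day clamp and the record values are invented for illustration, and `TtlCache` is not any real resolver's API.

```python
class TtlCache:
    """Toy resolver cache; min_ttl models a resolver that overrides short TTLs."""

    def __init__(self, min_ttl=0):
        self.min_ttl = min_ttl
        self.entries = {}  # name -> (value, expires_at)

    def put(self, name, value, ttl, now):
        # A well-behaved cache uses the record's TTL; a clamping one raises it to min_ttl.
        effective = max(ttl, self.min_ttl)
        self.entries[name] = (value, now + effective)

    def get(self, name, now):
        entry = self.entries.get(name)
        if entry is None or now >= entry[1]:
            return None  # miss or expired: a real resolver would re-query upstream here
        return entry[0]


honest = TtlCache()                 # respects the zone's TTL
clamping = TtlCache(min_ttl=86400)  # hypothetical ISP cache that keeps everything for a day

# Both caches pick up a bad answer at t=0 that was published with a 10 minute TTL.
for cache in (honest, clamping):
    cache.put("www.example.com", "bad-answer", ttl=600, now=0)

print(honest.get("www.example.com", now=601))    # None: expired, so the fix gets picked up
print(clamping.get("www.example.com", now=601))  # bad-answer: still served long after the TTL
```

A low TTL only bounds the damage for caches that behave like `honest`; for the rest, the bad answer lives as long as the cache decides it should.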
