Hacker News
by shamsalmon 2880 days ago
I know HPA is a legit issue, but DNS failures seem to be fairly normal in Kubernetes. Scaling up kube-dns helped us resolve that particular issue, as did moving away from Alpine to minimal Debian images. Alpine has its own DNS issues that caused us much pain.
2 comments

We've had issues with KubeDNS, too. Lots of retries and timeouts on the client side, and lots of conntrack entries.

glibc's resolver retries are pretty slow by default (a 5s timeout, I think), and until 1.11 lands you can't easily set up resolver configs, though you can inject an env var into each container separately. And musl-based distros like Alpine don't even support some of glibc's resolver options, iirc.
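To make the retry defaults concrete, here is a minimal sketch that works out the effective resolver timeout and attempt count for a container. It assumes the env var in question is glibc's RES_OPTIONS and that it is applied after /etc/resolv.conf; the function name is hypothetical, and glibc's upper caps on these values are not modelled.

```python
import os

# glibc defaults documented in resolv.conf(5): 5 second timeout, 2 attempts.
DEFAULT_TIMEOUT_S = 5
DEFAULT_ATTEMPTS = 2


def effective_resolver_options(resolv_conf_text, env=None):
    """Return (timeout_s, attempts) after applying resolv.conf and RES_OPTIONS."""
    env = os.environ if env is None else env
    timeout, attempts = DEFAULT_TIMEOUT_S, DEFAULT_ATTEMPTS

    # Collect every token from "options" lines in the file.
    option_tokens = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "options":
            option_tokens.extend(fields[1:])

    # Assumption: RES_OPTIONS is processed last, so it wins over the file.
    option_tokens.extend(env.get("RES_OPTIONS", "").split())

    for token in option_tokens:
        name, _, value = token.partition(":")
        if name == "timeout" and value.isdigit():
            timeout = int(value)
        elif name == "attempts" and value.isdigit():
            attempts = int(value)
    return timeout, attempts
```

With the defaults, a single unresponsive nameserver costs roughly timeout times attempts before the lookup gives up, which is why setting something like RES_OPTIONS="timeout:1" per container can make failures much less painful on glibc images.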

We ended up scaling up KubeDNS to 2 replicas and moving them to a dedicated node pool just to make sure they weren't competing with other workloads. That has fixed our issues for now.

Kube-dns (or CoreDNS in newer clusters) is pretty stable in my experience. It's still a very good idea to run more than one replica so that you can tolerate a single node failure, but if DNS failures are "fairly normal" that definitely warrants some additional investigation.
Most DNS problems in Kubernetes, in my experience, can be traced to UDP failures due to the iptables kube-proxy backend.
Thanks, we did look into it, but probably not as thoroughly as needed. Switching away from Alpine fixed pretty much all our issues.
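As a starting point for that kind of investigation, here is a rough sketch of a probe that resolves a name repeatedly from inside a pod and counts failed and slow lookups. The function name, the thresholds, and the injectable `resolve` parameter are all hypothetical; by default it goes through `socket.getaddrinfo`, so it exercises whatever libc resolver the image ships.

```python
import socket
import time


def probe_dns(hostname, attempts=20, slow_threshold_s=1.0, resolve=socket.getaddrinfo):
    """Resolve hostname repeatedly; return counts of failed and slow lookups."""
    failures = 0
    slow = 0
    worst_s = 0.0
    for _ in range(attempts):
        start = time.monotonic()
        try:
            resolve(hostname, None)
        except socket.gaierror:
            # Resolution failed outright (timeout, NXDOMAIN, etc.).
            failures += 1
        elapsed = time.monotonic() - start
        worst_s = max(worst_s, elapsed)
        if elapsed >= slow_threshold_s:
            # Lookups near the resolver timeout hint at dropped packets and retries.
            slow += 1
    return {"attempts": attempts, "failures": failures, "slow": slow, "worst_s": worst_s}


if __name__ == "__main__":
    # Example target: the in-cluster API service name.
    print(probe_dns("kubernetes.default.svc.cluster.local"))
```

A cluster of lookups that take almost exactly the resolver timeout would be consistent with dropped UDP packets rather than an overloaded DNS pod, though this probe alone can't prove which it is.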