by tbrock
2299 days ago
This happened to us at Hustle years ago. Basically, if you run on AWS there’s a DNS server provided inside each VPC that usually works fine but has no observable load metrics, so you don’t really know you are slamming it and are about to have a problem unless you audit your entire codebase. Why? Well, that tiny DNS server has certain capacity constraints, and if you don’t cache DNS lookups, for example by using an http/https agent in Node.js, you wind up looking up the same DNS info over and over and churning sockets like it’s going out of style. If you run really, really hot, the poor thing falls over (rightly so). The limits are high and DNS is fast, so you usually don’t notice, but when you are under load bugs like this come out of the woodwork. When it falls down you look up the AWS docs, lean back in your chair upon finding this isn’t an “elastic” part of AWS, and say “FUUUUUUUUCK” so loud it can be heard from outer space. If you are Robinhood, though, don’t you have some former Netflix SRE/DevOps beast on staff who knows this, so that you run your own DNS and monitor it?
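For anyone wondering what the Node.js side of that might look like, here is a rough sketch, not what we actually ran. It assumes two things: keep-alive agents, so open sockets get reused instead of re-resolved, and a small in-process TTL cache wrapped around `dns.lookup`. The `cachedLookup` helper, the 30-second TTL, and the `maxSockets` value are all made up for illustration.

```javascript
const http = require('node:http');
const https = require('node:https');
const dns = require('node:dns');

// Keep-alive agents reuse open sockets, so repeat requests to the same host
// skip the DNS lookup and the TCP/TLS handshake. maxSockets is an assumed value.
const httpAgent = new http.Agent({ keepAlive: true, maxSockets: 50 });
const httpsAgent = new https.Agent({ keepAlive: true, maxSockets: 50 });

// Hypothetical helper: wraps a dns.lookup-style function with a TTL cache.
// Only successful answers are cached; errors always go back to the resolver.
function cachedLookup(lookup = dns.lookup, ttlMs = 30000) {
  const cache = new Map();
  return (hostname, options, callback) => {
    if (typeof options === 'function') {
      callback = options;
      options = {};
    }
    const key = `${hostname}|${JSON.stringify(options)}`;
    const hit = cache.get(key);
    if (hit && hit.expires > Date.now()) {
      // Answer from memory, asynchronously, like the real lookup would.
      process.nextTick(() => callback(null, ...hit.args));
      return;
    }
    lookup(hostname, options, (err, ...args) => {
      if (!err) cache.set(key, { args, expires: Date.now() + ttlMs });
      callback(err, ...args);
    });
  };
}

// Usage (not executed here):
// https.get('https://example.com/', { agent: httpsAgent, lookup: cachedLookup() }, (res) => res.resume());
```

The agent alone cuts most of the churn because it stops opening a fresh socket per request; the lookup cache only matters for the connections you still have to open. A fixed TTL like this ignores the record's real TTL, so keep it short.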
Apparently not on Linux! https://stackoverflow.com/questions/11020027/dns-caching-in-...