by 1vuio0pswjnm7 1716 days ago
"I'm not sure why Instagram's frontend servers returned 503, though."

One explanation is that Facebook uses a proxy configuration that requires DNS to resolve the internal IP addresses of the backend servers. High-availability proxy servers like haproxy can easily do lookups from files loaded into memory instead of making DNS requests. Apparently Facebook had no backup plan in case the DNS method started failing. Facebook remained down until their DNS servers became available. The proxies continued to work, and no doubt the backend servers were available the entire time, but the proxies could not connect to them because the DNS lookups for their internal IP addresses (serv)failed. After the retried DNS queries finally time out, a 503 is returned.
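
To make the fallback idea concrete, here is a rough Python sketch of a proxy-side resolver that tries DNS first and falls back to a map held in memory. It is only an illustration under my own assumptions, not how Facebook's proxies or haproxy actually work: the hostname, the address, and the function names `resolve_backend` and `proxy_status` are all made up.

```python
import socket

# Hypothetical static map loaded into memory at startup.
# The hostname and address are invented for illustration.
STATIC_BACKENDS = {"backend.internal.example": "10.0.0.12"}


def resolve_backend(name, dns_lookup=socket.gethostbyname, static_map=STATIC_BACKENDS):
    """Return (ip, source). Try DNS first, then the in-memory map."""
    try:
        return dns_lookup(name), "dns"
    except OSError:
        # socket.gaierror (raised on failed lookups) is a subclass of OSError.
        ip = static_map.get(name)
        if ip is None:
            return None, "unresolved"
        return ip, "static"


def proxy_status(name, **kwargs):
    """Status the proxy would hand back: 503 if no backend address is known."""
    ip, _source = resolve_backend(name, **kwargs)
    return 200 if ip is not None else 503
```

With an empty static map and a failing DNS lookup, `proxy_status` returns 503, which is the failure mode described above. With the map populated, the same DNS failure still yields a usable backend address.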

"Maybe their backend fleet was included in the withdrawn prefixes..."

According to Cloudflare's writeup the only prefixes withdrawn were for DNS servers.

1 comment

Another possibility is that failing to announce the prefixes for their DNS server IPs was just a symptom of a larger problem, like misconfigured routers.