by nrmitchi 1913 days ago
> These customers are not taking advantage of our DNS management

I think I understand the point you are trying to make, that customers who are utilizing Netlify DNS Management are unaffected because reasons, but this is phrased in a way that implies that it is your users' fault for this downtime because they didn't choose to use your related service.


Full RCA with the steps the team has taken to improve this setup will be coming soon. The main issue with AWS's DNS solution, in this context, is that they don't support ALIAS records or similar techniques (CNAME flattening, etc.) for A records pointing to any external provider. That limits our options a lot in terms of what we can do, since anyone using this setup needs to point all their traffic to one or more fixed IP addresses.

Our current solution for the free/self-serve tier of Netlify has been to rely on Google's load balancer product to give people a stable IP pointing to a highly available solution. In light of recent issues, our team has set up a new permanent IP for A records (75.2.60.5) backed by a different solution, but due to the way DNS providers with no ALIAS record support work, it does require our customers to manually change their A records.
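
For anyone who wants to check whether their apex domain already points at the new address, here is a rough Python sketch using only the standard library. The IP is the one quoted above; the function names and the `example.com` domain are my own placeholders, and the answer you get depends on what your local resolver has cached.

```python
import socket

# IP address quoted in the comment above; everything else here is illustrative.
NEW_APEX_IP = "75.2.60.5"


def resolve_a_records(hostname: str) -> set[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # For AF_INET each entry ends with a (address, port) tuple.
    return {info[4][0] for info in infos}


def needs_manual_update(current_ips: set[str], expected_ip: str = NEW_APEX_IP) -> bool:
    """True when the name does not resolve solely to the expected IP."""
    return current_ips != {expected_ip}


# Usage (performs a real DNS lookup, so it is left commented out):
# ips = resolve_a_records("example.com")  # hypothetical apex domain
# print(sorted(ips), "update needed:", needs_manual_update(ips))
```

If `needs_manual_update` returns `True`, the A record at your DNS provider is probably still the old one, or the change has not propagated to your resolver yet.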

I totally get that moving DNS providers is a big deal and we want to give the best experience we can regardless of what provider you're on, but we have to work within the technical limitations of those providers and it's the nature of things that we do have more options to deliver a completely seamless experience when we operate both the DNS and the edge layer for customers.

Route 53 General Manager here. Flattening of external provider CNAMEs has a number of availability and accuracy risks. Route 53 offers a 100% availability SLA, and we really mean it. We’ve heard over and over from customers that reliability is our most valuable feature. We can’t provide that same reliability when external queries are in the mix; if we query asynchronously then features such as geo-based routing don’t work as expected for customers. If we query synchronously, then latency and availability are impacted directly.

We do offer ALIAS records between Route 53 hosted zones, and this capability is open to providers such as Netlify. We’d be happy to have customers ALIAS to a hosted zone managed and updated by Netlify. It sounds like your IP addresses are relatively stable, so keeping these in sync doesn’t sound like it would be a big deal, and it would give you a lever you could pull to change your customer DNS quickly in an event such as this. You could also configure health checks on your own DNS records, which any customer ALIAS records that point to your DNS records in Route 53 would inherit.

If you’re interested in going this route, please contact me at alecpete <at> amazon <dot> com.

If each Route 53 POP is already close to the querying DNS client, then things like geo routing with cached answers might just work well enough in most cases? With each POP having its own cache.

Auto-refreshing the popular records in the background before the TTL expires to help smooth over any temporary issues?
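
To make the refresh idea concrete, here is a minimal Python sketch of a refresh-ahead cache. It is not how Route 53 or any other provider works; `RefreshAheadCache`, `refresh_fraction`, and the `resolve` callback are all names I made up. Two simplifications to note: it refreshes inline on lookup rather than in a background task, and it assumes serving a stale answer is acceptable when the upstream lookup fails.

```python
import time
from typing import Callable


class RefreshAheadCache:
    """Cache answers per name and re-resolve once most of the TTL has elapsed."""

    def __init__(
        self,
        resolve: Callable[[str], tuple[str, float]],  # returns (answer, ttl_seconds)
        refresh_fraction: float = 0.8,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._refresh_fraction = refresh_fraction
        self._clock = clock
        # name -> (answer, time stored, ttl in seconds)
        self._entries: dict[str, tuple[str, float, float]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is None:
            # Cold miss: nothing to fall back on, so errors propagate.
            return self._store(name, now)
        answer, stored_at, ttl = entry
        if now - stored_at >= ttl * self._refresh_fraction:
            try:
                return self._store(name, now)
            except Exception:
                # Upstream is failing: keep serving the last known answer.
                return answer
        return answer

    def _store(self, name: str, now: float) -> str:
        answer, ttl = self._resolve(name)
        self._entries[name] = (answer, now, ttl)
        return answer
```

With `refresh_fraction=0.8` and a 100 second TTL, a lookup at 85 seconds triggers a refresh, so a short upstream outage is absorbed by the cached answer instead of turning into a failed response.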

Other big name DNS providers have ALIAS type records. I imagine according to the SLA, AWS Route 53 is still "available", even if it can't resolve a "target address record" (as the ANAME draft calls them) but Route 53 is still able to respond.

Phrasing can always be better but the point is that there's a way to map your DNS to Netlify which is risky and Netlify hasn't made the aggressive decision of blocking it. They outline in their docs all the reasons why you shouldn't do it, provide instructions for how to avoid it and also offer (but do not require) a hosted DNS setup which avoids this pitfall by design.

Some folks still choose to use this way, some have no other choice for various reasons and some don't care/comprehend the potential pitfalls. I do believe most users avoid using a root domain name for their website.

> I do believe most users avoid using a root domain name for their website.

This is where you're definitely wrong.

I could be. Are you saying this based on data or intuition?

As someone who is a little clueless about network infrastructure: if I own "dwrodri.com", and I'm not running a bunch of other services which need to point to this domain, is there any reason why I wouldn't have my root domain pointed to my personal website?

I would personally imagine that any individual or SOHO business hosting their website on GitHub/GitLab would just buy "MomAndPopShop.com" and point it there. I guess I don't know off the top of my head how many of those sorts of places on the web still exist...

The problem is not that they're pointing their apex domain to a personal website; the problem is that they have a CNAME record in place for their apex domain, which is not actually allowed per the DNS standards.
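
The rule in question comes from RFC 1034: if a CNAME record is present at a name, no other data should be present there, and the apex always carries SOA and NS records. A small Python sketch of that check follows. The `(name, type, value)` record tuples and the function name are my own illustration, not any provider's format, and the sketch does not special-case DNSSEC record types.

```python
from collections import defaultdict

Record = tuple[str, str, str]  # (owner name, record type, value)


def cname_conflicts(records: list[Record]) -> dict[str, set[str]]:
    """Map each name that has a CNAME to the other record types found at that name."""
    types_by_name: dict[str, set[str]] = defaultdict(set)
    for name, rtype, _value in records:
        # Normalize case and the trailing dot so "Example.com." matches "example.com".
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())
    return {
        name: types - {"CNAME"}
        for name, types in types_by_name.items()
        if "CNAME" in types and types - {"CNAME"}
    }


# Hypothetical zone: a CNAME at the apex collides with the mandatory SOA and NS.
zone = [
    ("example.com.", "SOA", "ns1.example.com. admin.example.com. 1 7200 900 1209600 86400"),
    ("example.com.", "NS", "ns1.example.com."),
    ("example.com.", "CNAME", "site.example.net."),
    ("www.example.com.", "CNAME", "site.example.net."),
]
conflicts = cname_conflicts(zone)
```

Here `conflicts` flags only `example.com`, while the `www` CNAME is fine because nothing else lives at that name. This is the gap that ALIAS/ANAME records and CNAME flattening try to fill.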

Sadly, even after switching to their DNS I am still affected.

This should not be the case; if you'd like, Netlify's Support team will be happy to review your settings to help discover why it didn't help you out (start from https://netlify.com/support) and ensure that you are "futureproofed"!

I can heartily recommend contacting _fool for support at Netlify. Always an absolute pleasure.