by jallmann 336 days ago
Good writeup.

> It’s worth noting that DoH (DNS-over-HTTPS) traffic remained relatively stable as most DoH users use the domain cloudflare-dns.com, configured manually or through their browser, to access the public DNS resolver, rather than by IP address.

Interesting, I was affected by this yesterday. My router (supposedly) had Cloudflare DoH enabled but nothing would resolve. Changing the DNS server to 8.8.8.8 fixed the issues.

I disagree. The actual root cause here is shrouded in jargon that even experienced admins such as myself have to struggle to parse.

It’s corporate newspeak. “Legacy” isn’t a clear term; it’s used to abstract and obfuscate.

> Legacy components do not leverage a gradual, staged deployment methodology. Cloudflare will deprecate these systems which enables modern progressive and health mediated deployment processes to provide earlier indication in a staged manner and rollback accordingly.

I know what this means, but there’s absolutely no reason for it to be written in this inscrutable corporatese.

I disagree. The target audience also includes less technical people, and the gist is clear to everyone: they just deploy this config from 0 to 100% to production, without feature gates or rollback. And they made changes to the config that weren’t deployed for weeks until some other change was made, which also smells like a process error.

I will not say whether or not it’s acceptable for a company of their size and maturity, but it’s definitely not hidden in corporate lingo.

I do believe they could have elaborated more on the follow-up steps they will take to prevent this from happening again. I don’t think staggered rollouts are the only answer to this; they’re just a safety net.
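
For anyone unsure what a staged, health-gated rollout looks like in practice, here is a minimal sketch. It is not Cloudflare's process; the function name, the stage percentages, and the `apply`/`healthy`/`rollback` callbacks are all hypothetical placeholders for whatever a real deployment system provides.

```python
from typing import Callable, Sequence


def staged_rollout(
    stages: Sequence[int],
    apply: Callable[[int], None],
    healthy: Callable[[], bool],
    rollback: Callable[[], None],
) -> int:
    """Push a change out in increasing percentages (hypothetical helper).

    After each stage the health check runs; on the first failure the
    change is rolled back and the function returns 0. Otherwise it
    returns the last percentage reached.
    """
    reached = 0
    for percent in stages:
        apply(percent)        # expose the change to `percent`% of traffic
        if not healthy():     # health gate between stages
            rollback()        # undo before the blast radius grows
            return 0
        reached = percent
    return reached
```

Going "from 0 to 100%" corresponds to calling this with `stages=[100]`: the health check only runs once everything is already affected.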

If you carry on reading, it's quite obvious they misconfigured a service and routed production traffic to that instead of the correct service, and the system used to do that was built in 2018 and is considered legacy (probably because you can easily deploy bad configs). Given that, I wouldn't say the summary is "inscrutable corporatese", whatever that is.
I agree it's not "inscrutable corporatese"

It's carefully written so my boss's boss thinks he understands it, and that we cannot possibly have that problem because we obviously don't have any "legacy components" because we are "modern and progressive".

It is, in my opinion, closer to "intentionally misleading corporatese".

Joe Shmo committed the wrong config file to production. Innocent mistake. Sally caught it in 30 seconds. We were back up inside 2 minutes. Sent Joe to the margarita shop to recover his shattered nerves. Kid deserves a raise. Etc.
Yeah, the "timeline" indicating impact start/end is entirely false when you look at the traffic graph shared later in the post.

Or they have a different definition of impact than I do

How does DoH work? Somehow you need to know the IP of cloudflare-dns.com first. Maybe your router uses 1.1.1.1 for this.
Yeah, your operating system will first need to resolve cloudflare-dns.com. This initial resolution will likely occur unencrypted via the network's default DNS. Only then will your system query the resolved address for its DoH requests.

Note that this adds one extra query per DNS request whenever the cached address of the DoH server has expired. For this reason, I've been using https://1.1.1.1/dns-query instead.

In theory, this should eliminate that overhead. Your operating system can validate the DoH server's IP address against the Subject Alternative Name (SAN) field of the TLS certificate the server presents: https://g.co/gemini/share/40af4514cb6e
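
A rough sketch of the SAN check described above, assuming the certificate is given in the dictionary form that Python's `ssl.SSLSocket.getpeercert()` returns. The helper name `cert_covers_ip` is made up for illustration, and a TLS library normally performs this matching for you during the handshake.

```python
import ipaddress


def cert_covers_ip(cert: dict, ip: str) -> bool:
    """Return True if `ip` appears as an IP entry in the cert's SAN list.

    `cert` is assumed to look like the dict from ssl's getpeercert(),
    where subjectAltName is a sequence of (type, value) pairs such as
    ("DNS", "example.com") or ("IP Address", "192.0.2.1").
    """
    target = ipaddress.ip_address(ip)
    for kind, value in cert.get("subjectAltName", ()):
        if kind != "IP Address":
            continue  # DNS names cannot vouch for an IP literal
        try:
            if ipaddress.ip_address(value.strip()) == target:
                return True
        except ValueError:
            continue  # skip entries that are not parseable addresses
    return False
```

Comparing parsed addresses rather than strings means a compressed IPv6 literal still matches an uncompressed SAN entry.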

And even if you have already resolved it the TTL is only 5 minutes
Are we meant to use a domain? I've always just used the IP.
You need a domain in order to get the s in https to work
That's not correct.

Let's Encrypt is trialling IP address HTTPS/TLS certificates right now:

https://letsencrypt.org/2025/07/01/issuing-our-first-ip-addr...

They say:

"In principle, there’s no reason that a certificate couldn’t be issued for an IP address rather than a domain name, and in fact the technical and policy standards for certificates have always allowed this, with a handful of certificate authorities offering this service on a small scale."

Right, this was announced about two weeks ago to some fanfare. So in principle there was no reason not to do it two decades ago? It would've been nice back then. I never heard of any certificate authority offering that.
> I never heard of any certificate authority offering that.

DigiCert does. That is where 1.1.1.1 and 9.9.9.9 get their valid certificates from

In the beginning of HTTPS you were supposed to look for the padlock to prove it was a safe site. Scammers wouldn’t take the time and money to get a cert, after all!

So certs were often tied to identity, which an IP really isn’t, so few providers offered them.

Nope. That is not correct. https://1.1.1.1/dns-query is a perfectly valid DoH resolver address I've been using for months.

Your operating system can validate the DoH server's IP address against the Subject Alternative Name (SAN) field of the TLS certificate the server presents: https://g.co/gemini/share/40af4514cb6e
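
To make the hostname-vs-IP distinction concrete, here is a small sketch that builds an RFC 8484 style GET URL and reports whether the endpoint needs a bootstrap lookup first. The function names are invented for this example, and no network request is actually sent.

```python
import base64
import ipaddress
import struct
from urllib.parse import urlsplit


def build_query(name: str, qtype: int = 1) -> bytes:
    """Encode a minimal DNS query in wire format (ID 0, recursion desired)."""
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def doh_get_url(endpoint: str, name: str) -> str:
    """Build the GET form: base64url without padding in the `dns` parameter."""
    encoded = base64.urlsafe_b64encode(build_query(name)).rstrip(b"=")
    return f"{endpoint}?dns={encoded.decode('ascii')}"


def needs_bootstrap(endpoint: str) -> bool:
    """True if the endpoint host is a name that must itself be resolved."""
    host = urlsplit(endpoint).hostname or ""
    try:
        ipaddress.ip_address(host)
        return False  # IP literal: connect directly
    except ValueError:
        return True   # hostname: needs a plain DNS lookup first
```

With these helpers, `needs_bootstrap("https://cloudflare-dns.com/dns-query")` is true, while the `https://1.1.1.1/dns-query` form skips the extra lookup.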

What about a certificate for an IP address?
What about a route that gets hijacked? There is no HSTS for IP addresses.
Presumably the route hijacker wouldn't have a valid private key for the certificate so they wouldn't pass validation
What about a reverse DNS lookup?
Yeah I don’t understand this part either, maybe it’s supposed to be bootstrapped using your ISP’s DNS server?
Pretty much that. You set up a bootstrap DNS server (could be your ISP's or any other server), which resolves the IP of the DoH server, which can then be used for all future requests.
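
The bootstrap step can be sketched roughly like this. The class name and the injectable `resolve`/`clock` parameters are hypothetical, and the 300-second default simply mirrors the 5-minute TTL mentioned elsewhere in this thread rather than a value read from a real DNS response.

```python
import socket
import time
from typing import Callable, Dict, Tuple


def system_resolve(host: str) -> str:
    """Resolve via the OS resolver, i.e. the bootstrap DNS server."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return infos[0][4][0]  # first sockaddr, address component


class BootstrapCache:
    """Remember the DoH server's address until an assumed TTL expires."""

    def __init__(
        self,
        resolve: Callable[[str], str] = system_resolve,
        ttl_seconds: float = 300.0,  # assumption: 5 minutes
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def address_for(self, host: str) -> str:
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and entry[1] > now:
            return entry[0]            # still fresh: no bootstrap query
        address = self._resolve(host)  # expired or unseen: one plain lookup
        self._entries[host] = (address, now + self._ttl)
        return address
```

Only the lookup of the DoH hostname itself goes through the bootstrap server; everything after that would go to the cached address over HTTPS.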
Funny. I was configuring a new domain today, and for about 20 minutes I could only reach it through Firefox on one laptop. Google's DNS tools showed it active. I SSHed into an Amazon server, and that could resolve it. My local network had no idea of it, even after flushing the cache and all. Turns out I had that one FF browser set up to use Cloudflare's DoH.
My (Unifi) router is set to automatic DoH, and I think that means it's using Cloudflare and Google. Didn't notice any disruptions so either the Cloudflare DoH kept working or it used the Google one while it was down.
Check Jallmann’s response https://news.ycombinator.com/item?id=44578490#44578917

TL;DR: DoH was working

AFAICS, Jallmann just left 1 comment and it was top-level. I'm not sure what you mean by "Jallmann’s response".
Good writeup except the entirely false timeline shared at the beginning of the post
You need to clarify such a statement, in my opinion.