It should, but like the sibling, I haven't seen what Go does. I've seen it happen elsewhere. Exchange used to cache any answer it got until it restarted. Java has had that behavior from time to time if you're not careful as well.
Querying DNS can be expensive, so it makes sense to build a cache to avoid querying again when you don't need to, but typical APIs for name resolution such as gethostbyname / getaddrinfo don't return the TTL, so people just assume forever is a good TTL. Especially for a persistent (HTTP) connection, it kind of makes sense to never query DNS again while you already have a working connection that you made with that name, and if it's TLS, it's quite possible that you don't check whether the certificate has expired while you're connected or when you do a session resumption.
But innocent things like this add up to make operating services tricky. Many times, if you start refusing connections, clients figure it out, but sometimes the caches still don't get cleared.
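To make the "forever is a good TTL" trap concrete, here is a rough Go sketch of the alternative: since the lookup call hands back only addresses and no TTL, the cache picks its own upper bound on entry age. The names cappedCache, newCappedCache and Resolve, and the 30-second bound, are made up for illustration; net.LookupHost is the real standard library call.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"time"
)

// lookupFunc has the same shape as net.LookupHost: addresses only, no TTL.
type lookupFunc func(host string) ([]string, error)

type cacheEntry struct {
	addrs   []string
	fetched time.Time
}

// cappedCache is a hypothetical resolver cache. Because the lookup API does
// not report a TTL, it applies its own maximum age instead of caching forever.
type cappedCache struct {
	mu     sync.Mutex
	maxAge time.Duration
	lookup lookupFunc
	items  map[string]cacheEntry
}

func newCappedCache(maxAge time.Duration, lookup lookupFunc) *cappedCache {
	return &cappedCache{maxAge: maxAge, lookup: lookup, items: make(map[string]cacheEntry)}
}

// Resolve returns cached addresses while they are younger than maxAge and
// otherwise performs a fresh lookup. The lock is held across the lookup for
// simplicity, which serializes concurrent callers.
func (c *cappedCache) Resolve(host string) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if e, ok := c.items[host]; ok && time.Since(e.fetched) < c.maxAge {
		return e.addrs, nil
	}
	addrs, err := c.lookup(host)
	if err != nil {
		return nil, err
	}
	c.items[host] = cacheEntry{addrs: addrs, fetched: time.Now()}
	return addrs, nil
}

func main() {
	// 30 seconds is an arbitrary bound chosen for this sketch.
	cache := newCappedCache(30*time.Second, net.LookupHost)
	addrs, err := cache.Resolve("localhost")
	fmt.Println(addrs, err)
}
```

A fixed bound is still a guess: it can be longer or shorter than what the zone owner actually published, which is exactly the information the API throws away.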
I don't know about Golang but I swear I've seen this before as well - clients holding on to an old IP address without ever re-resolving the domain name. It makes me wary of using DNS for load balancing or blue-green deployments. I feel like I can't trust DNS clients.
It's been 8-10 years but when I was serving tracking pixels we were astonished how long we still got requests from residential IPs for whole hostnames we had deprecated. That means I would not trust DNS caching anyway. I'm not talking days here, but months, with a TTL set to mere days.
The other reason: you have an open TCP socket that you're actively using. Unless you finish with that connection or it breaks, why would you re-resolve the name when you're not running connect() a second time? The failure mode we noticed most when looking into why clients weren't following DNS changes was that they were long-lived connections, like a server copying a large file or streaming logs. Which isn't unusual if you think about it, just not a short-lived web browser or curl-esque connection.
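You can watch this happen in Go with a small experiment. This is only a sketch: countDials is a made-up helper that wraps the transport's DialContext hook with a counter and sends a few sequential requests to a throwaway local httptest server. Name resolution would happen inside the dial, so requests that reuse the connection never get a chance to see a DNS change.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countDials sends n sequential GET requests to a local test server and
// reports how many times the transport dialed a new TCP connection.
func countDials(n int) (int64, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	var dials int64
	dialer := &net.Dialer{}
	tr := &http.Transport{
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			// Any hostname lookup would happen as part of this dial.
			atomic.AddInt64(&dials, 1)
			return dialer.DialContext(ctx, network, addr)
		},
	}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return atomic.LoadInt64(&dials), err
		}
		// Drain and close the body so the connection returns to the idle pool.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return atomic.LoadInt64(&dials), nil
}

func main() {
	dials, err := countDials(5)
	fmt.Println("requests: 5, dials:", dials, "err:", err)
}
```

With keep-alives on (the default), the dial count should stay well below the request count, since each request after the first picks up the pooled connection instead of connecting again.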
TTL isn't universally respected. Consider the following path:
Your machine -> Local router -> Configured upstream DNS Server (ISP/CF/Quad8/etc) -> ? -> Authoritative DNS Server
Any one of those layers can override/mess with/cache in a variety of ways including TTL. This is why Cloudflare and a variety of other providers use IP anycast. They accepted DNS for what it is and worked around it.
Not only is the IP always the IP, the "global" BGP routing table actually universally and consistently updates much faster than DNS. Then whatever routers, machines, etc downstream from that don't matter.
If I’m reading the code right, round trips (HTTP requests) go through queueForIdleConn, which picks up any pre-existing connections to a host. The only times these connections are cleaned up (in HTTP2) are if keepalives are turned off and the connection has been idle for too long, OR the connection breaks in some way, OR the max number of connections is hit and LRU cache evictions take place.
Furthermore, the golang dnsclient doesn’t even expose record TTLs to callers so how could the HTTP2 transport know when an entry is stale? https://github.com/golang/go/blob/master/src/net/dnsclient_u...
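Since the transport can't know when a record went stale, one blunt workaround is to drop pooled connections on a timer so that later requests have to dial (and resolve) again. A minimal sketch, assuming that is acceptable for your traffic: newRecyclingClient is a hypothetical helper, while IdleConnTimeout and CloseIdleConnections are real http.Transport features.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newRecyclingClient returns an HTTP client whose idle connections are closed
// every recycle interval, forcing later requests to dial again. The returned
// stop function ends the background goroutine. recycle must be positive.
func newRecyclingClient(recycle time.Duration) (*http.Client, func()) {
	tr := &http.Transport{
		// Also let the transport itself expire connections that sit idle.
		IdleConnTimeout: recycle,
	}
	done := make(chan struct{})
	go func() {
		ticker := time.NewTicker(recycle)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				// Only idle connections are closed; in-flight requests are untouched.
				tr.CloseIdleConnections()
			case <-done:
				return
			}
		}
	}()
	stop := func() { close(done) }
	return &http.Client{Transport: tr, Timeout: 30 * time.Second}, stop
}

func main() {
	// The 30-second interval is an arbitrary choice for this sketch.
	client, stop := newRecyclingClient(30 * time.Second)
	defer stop()
	fmt.Println("client timeout:", client.Timeout)
}
```

This only helps for connections that actually go idle. A connection that is constantly busy never qualifies, so the long-lived transfer case still needs the client to reconnect deliberately.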