Yeah, you basically do. Sure, you can reroute the traffic internally over the private global network to the relevant server, but that's going to use unnecessary bandwidth and add cost.
By sharding/routing with DNS, the client and public internet deal with that and allow AWS to save some cash.
Bear in mind, S3 is not a CDN. It doesn't have anycast, PoPs, etc.
In fact, even _with_ the subdomain setup, you'll notice that before the bucket has fully propagated into their DNS servers, it will initially return 307 redirects to https://<bucket>.s3-<region>.amazonaws.com
I'm not sure you understand how anycast works. It would be very shocking if Amazon didn't make use of it and it's likely the reason they do need to split into subdomains.
Anycast will pull in traffic to the closest (hop distance) datacenter for a client, which won't be the right datacenter a lot of the time if everything lives under one domain. In that case they will have to route it over their backbone or re-egress it over the internet, which does cost them money.
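To make that concrete, here is a toy sketch of the mismatch. The hop counts and the helper names (`anycast_entry`, `needs_backbone_hop`) are made up for illustration, and real anycast follows BGP path selection rather than a literal hop count, so treat this as a simplification:

```python
from typing import Dict

def anycast_entry(hops: Dict[str, int]) -> str:
    """Return the datacenter a client would land on under this toy model:
    whichever one has the smallest hop count from the client."""
    return min(hops, key=hops.get)

def needs_backbone_hop(hops: Dict[str, int], bucket_region: str) -> bool:
    """True when the entry datacenter is not where the bucket's data lives,
    so the request has to be carried onward (backbone or re-egress)."""
    return anycast_entry(hops) != bucket_region

# Hypothetical client in Europe; the numbers are invented.
client_hops = {"us-east-1": 12, "eu-west-1": 4, "ap-southeast-2": 19}
entry = anycast_entry(client_hops)
print(entry, needs_backbone_hop(client_hops, "us-east-1"))
```

With per-bucket subdomains, DNS can hand the client the right region's address up front, so the second function never has to return True.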
Google Cloud took a different approach based on their existing GFE infrastructure. It does not really seem to have worked out: there have been a couple of global outages due to bad changes to this single point of failure, and they introduced a cheaper networking tier that is more like AWS.
I don't think that's true. Route53 has been using Anycast since its inception [0].
The Twitter thread you linked simply points out that fault isolation is tricky with Anycast, and so I am not sure how you arrived at the conclusion that you did.
Got it, thanks. Are there research papers or blog posts by Google that reveal how they resume transport layer connections when network layer routing changes underneath them (a problem inherent to Anycast)?
I do understand how it works and can confirm that AWS does not use it for the IPs served for the subdomain-style S3 hostnames.
Their DNS nameservers, which resolve those subdomains, do of course.
S3 isn't designed to be super low latency. It doesn't need to be the closest endpoint to the client - all that would do is cost AWS more to handle the traffic. (Since the actual content only lives in specific regions.)
Added to my comment, but basically S3 is not a CDN - it doesn't have PoPs/anycast.
They _do_ use anycast and PoPs for the DNS services though. So that's basically how they handle the routing for buckets - but it relies entirely on having separate subdomains.
What you're saying is correct for CloudFront though.
They could do that, but they have absolutely no incentive to do so - all it would do is cost them more. S3 isn't a CDN and isn't designed to work like one.
By sharding/routing with DNS, the client and public internet deal with that and allow AWS to save some cash.
Bear in mind, S3 is not a CDN. It doesn't have anycast, PoPs, etc.
In fact, even _with_ the subdomain setup, you'll notice that before the bucket has fully propagated into their DNS servers, it will initially return 307 redirects to https://<bucket>.s3-<region>.amazonaws.com
This is for exactly the same reason - S3 doesn't want to be your CDN and it saves them money. See: https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosti...
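If you want to see which region a bucket is being bounced to, a minimal sketch like the one below pulls the region out of a redirect target. It assumes the Location header has the `<bucket>.s3-<region>.amazonaws.com` shape quoted above; `region_from_redirect` is a hypothetical helper, not part of any AWS SDK, and the example URLs are invented:

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Matches hosts shaped like <bucket>.s3-<region>.amazonaws.com
# (assumption: this is the only redirect form we care about here).
_REGIONAL_HOST = re.compile(
    r"^(?P<bucket>.+)\.s3-(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)

def region_from_redirect(location: str) -> Optional[str]:
    """Return the region named in a 307 Location URL, or None if the
    host does not look like a regional S3 endpoint."""
    host = urlsplit(location).hostname or ""
    match = _REGIONAL_HOST.match(host)
    return match.group("region") if match else None

# Hypothetical Location value from a 307 response.
print(region_from_redirect("https://my-bucket.s3-eu-west-1.amazonaws.com/key"))
```

A client that follows the 307 ends up talking to the regional endpoint directly, which is the same outcome the DNS-based routing gives you once the bucket's record has propagated.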