by winter_blue 270 days ago
Does S3 really have that high of a latency? So high that if you ran a static file server on an EC2 instance, it would be faster than S3?

Yes, definitely. S3 has a time to first byte of 50-150ms (depending on how lucky you are). If you're serving from memory that goes to ~0, and if you're serving from disk, that goes to 0.2-1ms.
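For a rough sense of what that gap means, here is a back-of-envelope sketch in Python. The latency ranges are the ones quoted above; the helper name and the assumption that reads are issued strictly one after another (no parallelism, no pipelining) are just for illustration.

```python
# Per-read latency ranges quoted in the comment above, in milliseconds.
# Serving from memory is treated as ~0 and left out.
LATENCY_MS = {
    "s3": (50.0, 150.0),        # time to first byte from S3
    "local_disk": (0.2, 1.0),   # read from a local disk cache
}


def sequential_read_time_ms(tier: str, reads: int) -> tuple[float, float]:
    """Best-case and worst-case total latency for `reads` back-to-back reads.

    Hypothetical helper: assumes each read waits for the previous one,
    which is the worst case for a high-latency store.
    """
    low, high = LATENCY_MS[tier]
    return (low * reads, high * reads)


if __name__ == "__main__":
    for tier in LATENCY_MS:
        low, high = sequential_read_time_ms(tier, 100)
        print(f"{tier}: {low:.0f}-{high:.0f} ms for 100 dependent reads")
```

Under those assumptions, 100 dependent reads cost 5 to 15 seconds against S3 and 20 to 100 milliseconds against local disk. Issuing requests in parallel narrows the gap for independent reads, but not for reads that depend on each other.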

It will depend on your needs though, since some use cases won't want to give up S3's ability to scale to arbitrary amounts of throughput.

In that case you run the proxy service behind a load balancer to get the desired throughput, or run a sidecar/process in each compute instance where the data is needed.

You are limited anyway by the network capacity of the instance you are fetching the data from.
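A minimal sketch of what such a sidecar-style cache could look like, in Python. This is an illustration only, not how any particular tool implements it: the `ReadThroughCache` class is hypothetical, and `fetch` is a stand-in for whatever performs the actual S3 GET. It checks memory first, then local disk, and only goes to the origin on a miss.

```python
import hashlib
from pathlib import Path
from typing import Callable


class ReadThroughCache:
    """Memory -> local disk -> origin lookup.

    `fetch` is an assumed callable that retrieves an object from the
    slow origin (for example, an S3 GET); it is not a real library API.
    """

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]) -> None:
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch
        self.memory: dict[str, bytes] = {}

    def _path(self, key: str) -> Path:
        # Hash the key so arbitrary object names map to safe file names.
        return self.cache_dir / hashlib.sha256(key.encode()).hexdigest()

    def get(self, key: str) -> bytes:
        # 1. In-memory hit: effectively zero added latency.
        if key in self.memory:
            return self.memory[key]
        # 2. Disk hit: a local read instead of a network round trip.
        path = self._path(key)
        if path.exists():
            data = path.read_bytes()
        else:
            # 3. Miss: pay the origin latency once, then persist locally.
            data = self.fetch(key)
            path.write_bytes(data)
        self.memory[key] = data
        return data
```

The sketch leaves out eviction, invalidation, and concurrent access, so the in-memory map grows without bound. A real deployment would need size limits on both tiers.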

S3 has a low-latency offering[0] which promises single-digit millisecond latency, I’m surprised not to see it mentioned.

[0]: https://aws.amazon.com/s3/storage-classes/express-one-zone/

These are, effectively, different use cases. You want to use (and pay for) Express One Zone in situations in which you need the same object reused from multiple instances repeatedly, while it looks like this on-disk or in-memory cache is for when you may want the same file repeatedly used from the same instance.
Is it the same instance? RisingWave (and similar tools) are designed to run in production on a lot of distributed compute nodes for processing data, serving/streaming queries, and running control planes.

Even a single query will likely run on multiple nodes, with distributed workers gathering and processing data from the storage layer. That is the whole idea behind MapReduce, after all.

Also, aren't most people putting CloudFront in front of S3 anyway?
For CDN use-cases yes, but not for DB storage-compute separation use-cases as described here.