by twotwotwo 1902 days ago
When you have a lot of small files, the per-operation latency, though not huge in absolute terms, can add up to more time than the data transfer and whatever other work is being done on the content.
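To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it (100,000 files, 10 KB each, 50 ms of per-request latency, 100 MB/s of bandwidth) is an illustrative assumption, not a measurement from any real job, and `serial_time_breakdown` is a made-up helper name.

```python
def serial_time_breakdown(n_files, avg_bytes, latency_s, bandwidth_bytes_per_s):
    """Split the cost of a one-at-a-time transfer into its two parts.

    Returns (latency_total_s, transfer_total_s): the time spent waiting on
    per-request overhead versus the time spent actually moving bytes.
    """
    latency_total = n_files * latency_s
    transfer_total = n_files * avg_bytes / bandwidth_bytes_per_s
    return latency_total, transfer_total


# Illustrative, assumed numbers only.
latency_total, transfer_total = serial_time_breakdown(
    n_files=100_000,
    avg_bytes=10_000,                    # 10 KB per file
    latency_s=0.05,                      # 50 ms overhead per request
    bandwidth_bytes_per_s=100_000_000,   # 100 MB/s
)
print(f"latency: {latency_total:.0f} s, transfer: {transfer_total:.0f} s")
```

Under those assumptions the per-request overhead comes to about 5,000 seconds while the bytes themselves take about 10, so nearly all of the wall-clock time is spent waiting rather than transferring.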

(At work, we had an upload job with ~800k files, ranging from <1KB to >100KB. I looked at rearranging how we stored things to avoid small files, but it ended up a straighter shot to keep the little files and use a worker pool to run the transfers in parallel.)
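For anyone curious what the worker-pool approach might look like, here is a minimal sketch using Python's standard `concurrent.futures`. It is not the code from that job. `upload_all` and `upload_one` are hypothetical names, and `upload_one` stands in for whatever actually sends one file to storage. The pool size of 32 is an arbitrary assumption to tune.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_all(paths, upload_one, workers=32):
    """Call upload_one(path) for every path on a pool of threads.

    upload_one is a caller-supplied (hypothetical) function that uploads a
    single file and raises on failure. Returns (succeeded, failed), where
    failed is a list of (path, exception) pairs.
    """
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its path so failures can be reported.
        futures = {pool.submit(upload_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
                succeeded.append(path)
            except Exception as exc:
                failed.append((path, exc))
    return succeeded, failed


if __name__ == "__main__":
    import time

    def fake_upload(path):
        # Stand-in for a real upload: just simulate per-request latency.
        time.sleep(0.01)

    ok, bad = upload_all([f"file-{i}" for i in range(200)], fake_upload)
    print(len(ok), "uploaded,", len(bad), "failed")
```

Threads suit this case because each worker spends most of its time waiting on the network, so many requests can be in flight at once. One caveat of this sketch is that it submits every path up front; for hundreds of thousands of files, feeding the pool in batches would keep memory use lower.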

1 comment

(Er, some files >100MB not >100KB. If they maxed out around 100KB we probably wouldn't have picked S3 to store them!)