by Zvez 430 days ago
calling everything 'for AI' is the new standard

>if you're reading from, like, big Parquet files, that probably means lots of random reads

and it also usually means that you shouldn't use S3 in the first place for workloads like this, because it is usually very inefficient compared to a distributed FS for random reads. Unless you have some prefetch/cache layer, you will get both worse latency and higher costs
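To make the "prefetch/cache layer" point concrete, here is a minimal sketch of one way such a layer could work: round every small read up to a large aligned block and keep recent blocks in memory, so many random reads collapse into a few ranged requests. Everything here is an assumption for illustration: `fetch_range` is a hypothetical callable standing in for a single ranged GET against the object store, and the block size and cache size are arbitrary.

```python
from functools import lru_cache

BLOCK = 8 * 1024 * 1024  # assumed read-ahead block size (8 MiB); tune per workload


class BlockCachedReader:
    """Serves small random reads out of large cached blocks.

    fetch_range(start, end) is a hypothetical stand-in for one ranged GET
    against the object store; it must return the bytes in [start, end).
    """

    def __init__(self, fetch_range, size, block=BLOCK, max_blocks=32):
        self.size = size
        self.block = block
        self.requests = 0  # backend requests actually issued

        @lru_cache(maxsize=max_blocks)
        def load(idx):
            # One backend request per block that is not already cached.
            self.requests += 1
            start = idx * block
            return fetch_range(start, min(start + block, size))

        self._load = load

    def read(self, offset, length):
        end = min(offset + length, self.size)
        out = bytearray()
        while offset < end:
            idx, within = divmod(offset, self.block)
            chunk = self._load(idx)[within:within + (end - offset)]
            out += chunk
            offset += len(chunk)
        return bytes(out)


# Demo against an in-memory "object" instead of a real bucket.
data = bytes(range(256)) * 1024  # 256 KiB of fake object data
reader = BlockCachedReader(lambda s, e: data[s:e], len(data), block=64 * 1024)
for off in (10, 5000, 60000, 70000, 10):
    assert reader.read(off, 100) == data[off:off + 100]
print(reader.requests)  # five reads, but only two backend requests
```

The trade-off is that each miss pulls more bytes than the caller asked for, so this only helps when reads cluster enough for the blocks to get reused.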

1 comment

But a distributed FS is far more expensive than cloud blob storage would be, and I can't imagine most workloads would need the features of a POSIX filesystem.