by gberger 44 days ago
S3: Simple Storage Service. It's a building block, and it's only natural other abstractions are built on top of it.

Agree it doesn't have to be part of S3 itself. My point is that there is a missing semantic layer.

In practice, many teams use S3 directly without any layer on top. So without better organizational capabilities, they can't keep track of what they have stored where, who created it, whether it is still used, etc.

And when teams do use a catalog, it's usually detached from the storage layer itself, so you can't easily view a dataset in the catalog and know how much it costs, who accessed it, and so on.

Have you seen places that figured out a better way to handle this, without a ton of custom tooling?

No, but why doesn’t this object-storage primitive accommodate all my specific requirements already?

It should also accommodate my need for all the POSIX filesystem APIs, including cheap moves and renames!!!!!

/s

POSIX isn't the ask. Datasets are. The need to keep track of what data you have stored is universal, not my specific requirement.

I made the (glib) comment because it’s a similar argument to the one that was popular a few years ago.

S3 is an object store. Treat it more like a KV store. As other comments have pointed out, the solution here is to pick your favourite metadata store, be it Postgres or what Iceberg does, and keep the data itself on S3.
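
For what it's worth, here is a minimal sketch of that idea: a small catalog table that maps dataset names and owners to S3 prefixes, so any object key can be traced back to a dataset. Everything in it is an assumption for illustration: sqlite3 stands in for Postgres so the snippet runs on its own, the schema and function names are made up, the bucket name is hypothetical, and nothing here talks to S3.

```python
import sqlite3

# Assumption: sqlite3 stands in for Postgres; the schema below is hypothetical.
def open_catalog(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS datasets (
            name       TEXT PRIMARY KEY,
            owner      TEXT NOT NULL,
            s3_prefix  TEXT NOT NULL UNIQUE,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def register_dataset(conn, name, owner, s3_prefix):
    """Record which team owns which prefix. The objects themselves stay on S3."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO datasets (name, owner, s3_prefix) VALUES (?, ?, ?)",
            (name, owner, s3_prefix),
        )


def dataset_for_key(conn, key):
    """Return (name, owner) for the longest registered prefix of key, or None."""
    return conn.execute(
        """
        SELECT name, owner
        FROM datasets
        WHERE substr(?, 1, length(s3_prefix)) = s3_prefix
        ORDER BY length(s3_prefix) DESC
        LIMIT 1
        """,
        (key,),
    ).fetchone()


if __name__ == "__main__":
    conn = open_catalog()
    register_dataset(conn, "clickstream", "data-eng", "s3://example-lake/clickstream/")
    print(dataset_for_key(conn, "s3://example-lake/clickstream/2024/01/part-0.parquet"))
```

The prefix match uses `substr` rather than `LIKE` so that underscores and percent signs in key names are not treated as wildcards. Cost and access data would still have to be joined in from elsewhere, which is the "detached catalog" complaint upthread.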