by mgaunard 68 days ago
Zero mention of s3fs, which has already been doing this for decades.

This is pretty different from s3fs. s3fs is a FUSE file system that is backed by S3.

This means that all of the non-atomic operations that you might want to do on S3 (including edits to the middle of files, renames, etc.) are run on the machine running s3fs. As a result, if your machine crashes, it's not clear what's going to show up in your S3 bucket or whether it would corrupt things.

s3fs is also slow for the same reason: the next stop after your machine is S3 itself, which isn't suitable for many file-based applications.
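
To make the crash problem above concrete, here's a toy sketch of a client-side "rename" done as per-object copy then delete. It is not s3fs's actual code: the dict is just a stand-in for a bucket, and `rename_prefix`, `crash_after`, and `CrashBeforeDelete` are names made up for illustration.

```python
class CrashBeforeDelete(Exception):
    """Simulates the client machine dying partway through a rename."""


def rename_prefix(bucket, old_prefix, new_prefix, crash_after=None):
    """Emulate a directory rename as copy-then-delete, one object at a time.

    `bucket` is a plain dict standing in for an object store.
    `crash_after` is a hypothetical knob: "crash" once that many
    objects have been moved.
    """
    keys = sorted(k for k in bucket if k.startswith(old_prefix))
    for moved, key in enumerate(keys):
        if crash_after is not None and moved == crash_after:
            raise CrashBeforeDelete(f"died after {moved} of {len(keys)} objects")
        new_key = new_prefix + key[len(old_prefix):]
        bucket[new_key] = bucket[key]  # step 1: copy to the new key
        del bucket[key]                # step 2: delete the original


bucket = {"logs/a": b"1", "logs/b": b"2", "logs/c": b"3"}
try:
    rename_prefix(bucket, "logs/", "archive/", crash_after=2)
except CrashBeforeDelete:
    pass

# The "directory" is now split across both prefixes.
print(sorted(bucket))  # ['archive/a', 'archive/b', 'logs/c']
```

Nothing in the store records that a rename was in flight, so whoever reads the bucket next just sees a half-moved tree.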

What AWS has built here is different: using EFS as the middle layer means that there's a safe, durable place for your file system operations to go while they're being assembled into object operations. It also means that the performance should be much better than s3fs (it's talking to SSDs where data is 1 ms away instead of HDDs where data is 30 ms away).
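
Back-of-the-envelope for those two latency figures, assuming a workload of small operations issued strictly one after another (the operation count is an arbitrary illustrative number, and `total_seconds` is just a helper name for this sketch):

```python
def total_seconds(n_ops, latency_ms):
    """Wall-clock time for n_ops sequential operations at a fixed per-op latency."""
    return n_ops * latency_ms / 1000.0


N_OPS = 10_000  # hypothetical: e.g. stat'ing or opening 10k small files in a row

ssd_backed = total_seconds(N_OPS, 1)   # 1 ms per op, the SSD figure above
hdd_backed = total_seconds(N_OPS, 30)  # 30 ms per op, the HDD figure above

print(f"1 ms/op:  {ssd_backed:.0f} s")   # 10 s
print(f"30 ms/op: {hdd_backed:.0f} s")   # 300 s
```

Real workloads overlap requests, so the gap is smaller in practice, but serial metadata-heavy tools feel roughly this difference.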

It also means that you need to pay for EFS, which is outrageously expensive, to use S3, whose whole purpose is to be cheap.
Of course, you don't need to; this is just a way to opt in to getting file semantics on top of S3.

The purpose of S3 isn't to be cheap, it's to be simple.

You can also use something like JuiceFS to make using S3 as a shared filesystem more sane, but you're moving all the metadata to a shared database.
Or ZeroFS, which doesn't require a third-party database, just an S3 bucket!

https://github.com/Barre/ZeroFS

ZeroFS isn't a shared redundant filesystem.
It's definitely shared, and can be redundant.
A more solid (especially when it comes to caching) solution would be appreciated.

I thought that would be their https://github.com/awslabs/mountpoint-s3. But there's no mention of that one either.

S3 files does have the advantage of having a "shared" cache via EFS, but then that would probably also make the cache slower.

I'd assume you can still have a local cache in addition to that.
I was thinking: "No way this has existed for decades". But the earliest I can find it existing is 2008. Strictly speaking not decades but much closer to it than I expected.
There's also https://github.com/kahing/goofys, a Go equivalent. A bit of a dead project these days.
Yeah, that blog post was written as if sliced bread had been invented again.

Reading through it, I was only thinking "is this distinguished engineer TOC 2M aware that people have been doing this since forever?".