by nr378 103 days ago
Based on the docs and API surface, I think the filesystem abstraction is probably copy-on-mount backed by object storage.

I suspect it works as follows: when a task starts, filesystem contents sync down from S3/R2/GCS to a local directory, which gets bind-mounted into the container. The agent reads and writes normally - no FUSE, no network round-trips per file op. On task completion or explicit sync, changes flush back to object storage. The presigned URL support for upload/download is the giveaway that object storage is the source of truth.
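Here is a minimal sketch of the flow I'm guessing at, with a plain local directory standing in for the bucket prefix. The names `sync_down`, `flush_back`, and `run_task_demo` are mine, not from the product's API, and a real implementation would talk to S3/R2/GCS and handle deletions, which this does not.

```python
import shutil
import tempfile
from pathlib import Path


def sync_down(store: Path, workdir: Path) -> None:
    """Copy everything under the store prefix into the local working directory."""
    shutil.copytree(store, workdir, dirs_exist_ok=True)


def flush_back(workdir: Path, store: Path) -> list[str]:
    """Copy new or changed files back to the store; return the relative paths written."""
    written = []
    for path in sorted(workdir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(workdir)
        target = store / rel
        # Only upload files that are missing from the store or differ from it.
        if not target.exists() or target.read_bytes() != path.read_bytes():
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            written.append(rel.as_posix())
    return written


def run_task_demo() -> list[str]:
    """Simulate one task: sync down, edit locally, flush back."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "store"    # stand-in for the object storage prefix
        workdir = Path(tmp) / "work"   # stand-in for the bind-mounted directory
        store.mkdir()
        (store / "README.md").write_text("hello")

        sync_down(store, workdir)

        # The agent now reads and writes on local disk only.
        (workdir / "README.md").write_text("hello, edited")
        (workdir / "src").mkdir()
        (workdir / "src" / "main.py").write_text("print('hi')")

        return flush_back(workdir, store)
```

Calling `run_task_demo()` returns the list of files that were flushed; unchanged files are skipped, so the flush cost scales with what the agent touched rather than with the size of the filesystem.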

This makes way more sense than FUSE for agent workloads. Agents do thousands of small reads (find, grep, git status) that would each be a network call with FUSE. With copy-on-mount it's all local disk speed after initial sync.
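To put rough numbers on that, here is a back-of-envelope sketch. The op count and per-op latencies are made-up assumptions for illustration, not measurements of any real FUSE or object-store setup.

```python
def total_seconds(op_count: int, latency_ms_per_op: float) -> float:
    """Time spent if every small file op pays the given latency, run serially."""
    return op_count * latency_ms_per_op / 1000.0


# Hypothetical inputs: 5,000 small ops (find, grep, git status).
ASSUMED_OPS = 5_000
ASSUMED_NETWORK_MS = 1.0   # assumed per-op cost when each op is a network round trip
ASSUMED_LOCAL_MS = 0.02    # assumed per-op cost on local disk

network_total = total_seconds(ASSUMED_OPS, ASSUMED_NETWORK_MS)
local_total = total_seconds(ASSUMED_OPS, ASSUMED_LOCAL_MS)
print(f"network-backed: {network_total:.2f}s, local: {local_total:.2f}s")
```

The gap depends entirely on the per-op latency you plug in, and caching or parallelism on the FUSE side would shrink it.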

Cross-task sharing falls out naturally - two tasks mounting the same filesystem ID just means two containers syncing from the same S3 prefix. Probably last-write-wins rather than distributed locking, which is fine since agents rarely have concurrent writes to the same file.
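A toy model of what last-write-wins would look like, assuming each task flushes only the files it changed since mount. The `Mount` class and the dict standing in for the S3 prefix are hypothetical, purely to illustrate the guess.

```python
class Mount:
    """One task's local copy of a shared filesystem (store is a dict stand-in)."""

    def __init__(self, store: dict[str, bytes]) -> None:
        self.store = store
        self.base = dict(store)    # snapshot taken at mount time
        self.files = dict(store)   # local working copy the agent edits

    def flush(self) -> list[str]:
        """Write back only the files changed since the last sync; no locking."""
        changed = sorted(k for k, v in self.files.items() if self.base.get(k) != v)
        for key in changed:
            self.store[key] = self.files[key]  # blindly overwrite: last write wins
        self.base = dict(self.files)
        return changed


def two_task_demo() -> dict[str, bytes]:
    store = {"notes.txt": b"v0", "shared.cfg": b"cfg"}
    task_a = Mount(store)
    task_b = Mount(store)

    task_a.files["notes.txt"] = b"from A"
    task_a.files["a_only.txt"] = b"A output"
    task_b.files["notes.txt"] = b"from B"

    task_a.flush()
    task_b.flush()  # flushes later, so its notes.txt replaces A's
    return store
```

After `two_task_demo()`, `notes.txt` holds B's version, while `a_only.txt` and the untouched `shared.cfg` survive. Flushing only changed files matters here: if B pushed its whole stale copy, it would also clobber files it never edited.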

2 comments

That's a good analysis :) We want to go with FUSE, but the performance overhead, especially with the multiple calls needed to use files, is a constraint.
How have you determined that? You can easily push 6 GB/s+ with sub-ms TTFB on networked filesystems, and hundreds of thousands of IOPS through FUSE.
sprites.dev / fly.io has publicly said they are using a variant of JuiceFS for the object-storage-to-VM-filesystem stuff. It's cool tech.

* https://fly.io/blog/design-and-implementation/
* https://juicefs.com