by drbawb 1064 days ago
I am super interested in learning more about the storage subsystem! I figured they'd be using ZFS, given the people involved, but it appears they've also gone ahead and built a clustered FS (Crucible) on top of it? I figured something like that would be necessary to handle fault tolerance at the Gimlet level (losing an entire shelf, drive controller, etc.). Getting ZFS to go multi-node is surely a neat trick.

Second to that I just want to say the presentation of these docs is top notch. (I so desperately wish I was the target customer for these systems; reading these docs makes me want to do terrible things to my electrical service and play with one of these racks.)

2 comments

They use Crucible on top of ZFS. https://github.com/oxidecomputer/crucible I don't think they have anything for an S3-like service, but there are other options for that, e.g. https://garagehq.deuxfleurs.fr or MinIO. I am not sure whether they have their own SSDs or use off-the-shelf SSDs just with their own firmware or something.
They implemented a block store with replication from scratch? That's kinda brave, considering that that's a project big enough to justify full startups for!
However, the folks at Oxide are at the top of their game in this space, with decades of experience building and testing such systems. Also, Oxide's Crucible stack is written entirely from scratch in Rust, which dramatically reduces the failure modes common to such stacks, since those are often written in C / C++.
They're not the first company to do that. https://panzura.com/ did something similar.