by redserk
929 days ago
This depends on your use case, what types of storage you use, and your familiarity with tuning systems, setting up RAID layouts, etc. I love ZFS. It's incredibly powerful. It's also incredibly easy to screw up when designing how you want to set up your drives, especially if you intend to grow your storage. And that doesn't include the effort needed to figure out how to make your filesystem redundant across datacenters, or even just between racks in the same closet. At the end of the day, if I screw up a setting on EFS, I can always create a new EFS filesystem and move my data over. If I screw up a ZFS layout, I'm going to need a box of temporary drives to shuffle data onto while I remake the array.
True, but…
At EFS pricing, this seems like the wrong comparison. There’s no fundamental need to ever grow a local array to compete — buy an entirely new one instead. Heck, buy an entirely new server.
Admittedly, this means that the client architecture needs to support migration to a different storage backend. But, for a business where the price is at all relevant, using EFS for a single month will cost as much as that entire replacement server, and a replacement server comes with compute, too. And many more IOPS.
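To put rough numbers on that comparison, here is a back-of-envelope sketch. Every input in it is a placeholder chosen for illustration (the per-GB rate, the amount stored, the server price), and it counts storage fees only, so plug in the current EFS rate for your region and a real hardware quote before drawing conclusions.

```python
def efs_monthly_cost(stored_gb: float, price_per_gb_month: float) -> float:
    """Storage-only monthly bill; ignores throughput and request charges."""
    return stored_gb * price_per_gb_month


def months_to_match_server(server_price: float, stored_gb: float,
                           price_per_gb_month: float) -> float:
    """How many months of storage fees add up to one server purchase."""
    monthly = efs_monthly_cost(stored_gb, price_per_gb_month)
    if monthly <= 0:
        raise ValueError("monthly cost must be positive")
    return server_price / monthly


if __name__ == "__main__":
    # Hypothetical inputs, not quoted prices.
    stored_gb = 20_000          # assumed working set, in GB
    price_per_gb_month = 0.30   # assumed storage rate, USD per GB-month
    server_price = 6_000.0      # assumed price of a replacement server, USD

    monthly = efs_monthly_cost(stored_gb, price_per_gb_month)
    months = months_to_match_server(server_price, stored_gb, price_per_gb_month)
    print(f"monthly storage bill: ${monthly:,.0f}")
    print(f"months until fees equal the server price: {months:.1f}")
```

The smaller the result of `months_to_match_server`, the stronger the case for buying hardware; with a small working set the math can easily go the other way.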
In any case, AWS is literally pitching EFS for AI/ML. For that sort of use case, just replicate the data locally if you don’t have or need the absurdly fast network it would take to make remote storage perform well. Or use S3. I’m having trouble imagining any use case where EFS makes sense for this.
Keep in mind that the entire “pile” fits on ~$100 of NVMe SSD with better performance than EFS can possibly offer. Those fancy “10 trillion token” training sets fit in a single U.2 or EDSFF slot, on a device that speaks PCIe x4 and costs <$4000. Just replicate it and be done with it.
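The "fits in a single slot" claim is easy to sanity-check. This is a minimal sketch that assumes the training set is stored as fixed-width token IDs, which is my assumption and not something stated above: 2 bytes per token works for vocabularies up to 65,536 entries, 4 bytes for larger ones. The drive capacity used is just an example size, not a specific product.

```python
def dataset_terabytes(num_tokens: int, bytes_per_token: int) -> float:
    """Size of a tokenized dataset in decimal terabytes (10**12 bytes)."""
    return num_tokens * bytes_per_token / 10**12


def fits_on_drive(num_tokens: int, bytes_per_token: int, drive_tb: float) -> bool:
    """True if the tokenized dataset fits within the given drive capacity."""
    return dataset_terabytes(num_tokens, bytes_per_token) <= drive_tb


if __name__ == "__main__":
    tokens = 10 * 10**12   # "10 trillion tokens"
    drive_tb = 30.72       # example capacity for one enterprise SSD (assumption)

    for width in (2, 4):   # assumed storage widths: uint16 or uint32 token IDs
        size = dataset_terabytes(tokens, width)
        verdict = "fits" if fits_on_drive(tokens, width, drive_tb) else "does not fit"
        print(f"{width} bytes/token: {size:.0f} TB, {verdict} on a {drive_tb} TB drive")
```

So the claim depends on how the tokens are encoded: at 2 bytes per token the set fits on one such drive with room to spare, while at 4 bytes per token it would need a larger device or a second one.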