Hacker News
by RantyDave 3014 days ago
Have you considered getting a single large volume and sharing it over NFS? If this is a bad idea, why?

Another option is to use EFS (AWS's version of NFS). It's easy to set up and use. However, EFS costs 3x-12x more than EBS (gp2-sc1). https://www.fittedcloud.com/blog/aws-elastic-file-system-a-q....
NFS can be slow and can have consistency and locking issues under heavy write loads. Also, security often isn't top notch.

It really depends on the use case.

Instead of downvotes, please explain why I'm wrong. Not only do I want to know, but it's helpful to others.

These have been my experiences with NFS. It's great for certain things, especially read-heavy ones. It's not great for write-heavy workloads or ones that require locks.

None of these statements are inherently true about NFS. They may be applicable to some deployments or applications of NFS, but they don't hold generally speaking.

NFS is used heavily throughout HPC for workloads that don't require a parallel filesystem (and even then, pNFS is plugging along, even if it's largely ignored in favor of Lustre et al for accessing parallel filesystems.)

It is true that there are some very high performance NFS systems out there, some throughput-optimized like Isilon and some IOPS-optimized like Tintri. A lot of enterprise virtualization systems store their VM disks on NFS and seem to perform decently.

But in a general sense, when I've seen suggestions to use NFS, it hasn't been the correct solution. Take a system that is designed to run on a single server, where someone wants to make it active-active HA by pointing two servers at the same data over NFS. No, that won't work. Or the high-IOPS system that someone wants to move to a throughput-optimized NFS service. Or the team that decided it was a good idea to log from multiple servers to a single file over NFS, and then complained that their log messages were not being written in order. Or having multiple web servers (HA!) serve media off a single NFS server.

In this specific case, if someone is maximizing IOPS per server, it probably isn't a good idea to export a single large volume via NFS and share it across those servers. Max IOPS for gp2 volumes is 10,000; for io1 it is 32,000. Sharing a single 10,000 IOPS volume across a bunch of servers isn't going to get you a bunch of 10,000 IOPS volumes.
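
To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. It assumes the per-volume ceilings quoted above, a perfectly even split between servers, and zero NFS overhead, so it is a best case rather than a measurement. The function names are made up for illustration.

```python
# Per-volume IOPS ceilings as quoted in the comment above.
GP2_MAX_IOPS = 10_000
IO1_MAX_IOPS = 32_000


def per_server_iops_shared(volume_iops: int, servers: int) -> float:
    """Best-case share each server gets when one volume is shared by all.

    Assumes an even split and ignores NFS protocol overhead (hypothetical model).
    """
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return volume_iops / servers


def aggregate_iops_dedicated(volume_iops: int, servers: int) -> int:
    """Total IOPS when every server has its own volume of the same type."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return volume_iops * servers


if __name__ == "__main__":
    for n in (1, 4, 10):
        shared = per_server_iops_shared(GP2_MAX_IOPS, n)
        dedicated = aggregate_iops_dedicated(GP2_MAX_IOPS, n)
        print(f"{n} servers: {shared:.0f} IOPS each if shared, "
              f"{dedicated} IOPS total with one gp2 volume per server")
```

With 10 servers, the shared gp2 volume leaves each server at most about 1,000 IOPS, while one volume per server keeps every server at its own 10,000 ceiling.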

Based on my experience, I instinctively question the use of NFS; it is so rarely the right solution in a cloud environment. Sometimes it is the right solution. I'm just saying that is pretty rare.

I used to work at a supercomputing center, and NFS wasn't the choice for much of anything. Lustre or AFS were the normal options, depending on use case. That was a decade ago.

Also, NFS is bad for applications requiring locks. Perhaps some implementations aren't, but I haven't seen one.

We used pnfs at FNAL across 5000+ physical servers doing analysis of CMS detector data. Worked well.
Interesting. I hadn't used that. Thanks!