Hacker News
by mschuster91 3113 days ago
> So once you add a load balancer and several PHP servers, you need to use a network mount for most of your files, which has a big impact on IO performance.

That depends on whether you set up cachefilesd correctly. Many people think it's enough to do the NFS mount and that's it, but it's not - NFS's mount parameters have a big impact on performance, and having cachefilesd enabled can make performance go through the roof, especially with big files. Without cachefilesd the only caching is in kernel memory, which can and will lead to files being evicted from the kernel cache...
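A quick way to check the point about mount parameters: cachefilesd only caches NFS mounts that carry the `fsc` mount option. The sketch below scans text in `/proc/mounts` format and lists NFS mount points without it. The function name is made up for illustration, and it assumes the kernel reports `fsc` in the option list it prints.

```python
from typing import List


def nfs_mounts_without_fsc(mounts_text: str) -> List[str]:
    """Return mount points of NFS mounts whose options lack 'fsc'.

    Expects text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in ("nfs", "nfs4") and "fsc" not in options.split(","):
            missing.append(mount_point)
    return missing
```

On a Linux client you would feed it `open("/proc/mounts").read()`. Any mount point it returns is relying on the in-kernel page cache only.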

Same for MySQL - configuring it correctly (or incorrectly!) can make a life-or-server-death difference.

All without having to touch PHP's configuration... there's a reason why a good operations guy is worth his weight in gold.


I’ve given up on NFS entirely and just use one of the big 3 cloud file storage services.

Out of 8 downtime events, 5 were related to NFS mounts.

Hmm. I don't trust any external CDN, to be honest. No matter which one you choose, you lock yourself into the vendor: should it decide to kick you off for whatever reason, you're toast. I'm especially afraid of making a tiny mistake in AWS that leads to accidental disclosure of private data.

Such "hacks" have hit too many large firms for my liking.

Out of interest, what issues have you had with NFS mounts? I run a fleet of virtual servers with their disks on an NFS-mounted NAS share backed by cachefilesd, and I've never had a problem with that setup.

Eventually, NFS ends up freaking out and consumes all I/O on the client machine until it is rebooted. I suspect that this occurs when the underlying network is saturated, but I don't have evidence of that. CentOS 6/7, NFS v3/v4, tuned every setting I could think of, and spent dozens of hours Googling and reading.

It may be worth mentioning that we had decent throughput with NFS. Roughly 10 writes per second (files from 100 KB to 250 MB) and 20 reads per second (IIRC).

We use rclone to do a daily backup from primary -> secondary cloud file store, and use a little wrapper function in the app to switch which host we're pulling files from, so it's not too hard to fail over during an outage.
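For anyone curious what such a wrapper might look like, here is a minimal sketch. It is not the actual code. It assumes each store is wrapped in a callable that returns the file's bytes and raises `OSError` on failure, and it tries the stores in order. All names are hypothetical, and a manual switch via a config flag would work just as well.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a function that fetches one file from one store.
Fetcher = Callable[[str], bytes]


def fetch_with_failover(path: str, fetchers: Sequence[Fetcher]) -> bytes:
    """Try each store in order and return the first successful result."""
    last_error: Optional[OSError] = None
    for fetch in fetchers:
        try:
            return fetch(path)
        except OSError as exc:  # store unreachable or file unreadable
            last_error = exc
    # Every store failed (or none was given); surface the last error as the cause.
    raise RuntimeError(f"all stores failed for {path!r}") from last_error
```

Usage would be something like `fetch_with_failover("uploads/a.jpg", [from_primary, from_secondary])`. Since the backup runs daily, the secondary can be up to a day stale, so the caller has to tolerate missing or old files.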

I do agree with you about vendor lock-in though, it's nasty stuff. At the end of the day it comes down to time allocation. I'm a one-man ops show with too many other things to do to be woken up by a Pingdom alarm at 3 AM because of NFS.

Whoa. I have never hit this one, to be honest, in years. Maybe it was something CentOS-specific; everything I control runs either Debian or Ubuntu... but I will keep this in mind in case I ever do hit this error.

Might have been worth a try to get a RHEL support contract, but if you're a one-man show and happy with CDN, then that's definitely the better solution for you ;)