by j0rd 3113 days ago
I've always been really skeptical of those language performance comparisons which refer to PHP.

As stated above PHP7 has similar performance when compared to HHVM which was made by Facebook. Additionally PHP has an amazing performance debugger by FB called xhprof.

PHP performance could always be increased hugely by making sure its cache buckets have enough memory (opcode cache, realpath_cache_size).

Secondly with any web framework your app will most likely be limited by IO (database, file lookups, networking) before it becomes limited by actual code execution performance.

If anyone's ever worked on a project where the performance problem was the language and not IO I'd be really interested in hearing about it, but in my career of making websites I've never run into this problem yet.


Being skeptical is a good attitude. I'm very skeptical of this Eldorado that some posters see in PHP7. My experience is quite different.

> Additionally PHP has an amazing performance debugger by FB called xhprof.

xhprof has been unmaintained for years. The official version does not compile with PHP7. Various forks exist, but the only stable fork has been rebranded and defaults to sending all the performance data to the branding company.

> PHP performance could always be increased hugely by making sure its cache buckets have enough memory (opcode cache, realpath_cache_size).

Always, really? The history of PHP's opcode caches is complex. Before PHP 5.6, where Zend published their opcache+, I've seen the various solutions (APC and others) cause vicious bugs.

> Secondly with any web framework your app will most likely be limited by IO (database, file lookups, networking) before it becomes limited by actual code execution performance.

I've seen quite a few PHP applications that were CPU-bound, and that were far from Facebook's scale. For instance, the learning platform Moodle is popular with universities, but getting it to handle hundreds of concurrent users cannot be done on a plain quad-core server.

Another poster suggested load balancing as an obvious solution, but adding this is not that easy. E.g. Moodle has to track various files, including uploads. So once you add a load balancer and several PHP servers, you need to use a network mount for most of your files, which has a big impact on IO performance.

I'm not saying that other languages are better, because I can't compare the exact same large application in two languages, but PHP7 isn't an Eldorado.

> So once you add a load balancer and several PHP servers, you need to use a network mount for most of your files, which has a big impact on IO performance.

That depends if you set up cachefilesd correctly. Many people think it's enough to do the NFS mount and that's it, but it's not - NFS's mount parameters have a big impact on performance, and having cachefilesd enabled can make performance go through the roof, especially with big files. Without cachefilesd the only caching is in-kernel memory, which can and will lead to files being evicted from the kernel cache...

Same for MySQL - having configured it correctly (or incorrectly!) can make a life-or-server-death difference.

All without having to touch PHP's configuration... there's a reason why a good operations guy is worth his weight in gold.

I’ve given up on NFS entirely and just use one of the big 3 cloud file storage services.

Out of 8 downtime events, 5 were related to NFS mounts.

Hmm. I don't trust any external CDN, to be honest. No matter which one you choose, you lock yourself into the vendor: should it decide to kick you off for whatever reason, you're toast. But I'm especially afraid of making a tiny mistake in AWS that leads to accidental disclosure of private data.

Such "hacks" have hit too many big firms for my taste.

Out of interest, what issues have you had with NFS mounts? I run a fleet of virtual servers with their disks on an NFS-mounted NAS share and backed by cachefilesd, never had a problem with that setup.

Eventually, NFS ends up freaking out and consumes all I/O on the client machine until it is rebooted. I suspect that this occurs when the underlying network is saturated, but don't have evidence of such. CentOS 6/7, NFS v3/v4, tuned every setting I could think of, and spent dozens of hours Googling and reading.

It may be worth mentioning that we had decent throughput with NFS. Roughly 10 writes per second (files from 100 KB to 250 MB) and 20 reads per second (IIRC).

We use rclone to do a daily backup from primary -> secondary cloud file store, and use a little wrapper function in app to switch which host we're pulling files from so it's not too hard to failover during an outage.
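A wrapper like the one described might look roughly like the sketch below. Everything in it is an assumption for illustration: the host URLs, the `FileHostSwitch` name, and the idea that files are addressed by URL are hypothetical, since the comment doesn't say how the actual wrapper works.

```python
# Hypothetical host URLs; the comment above does not name the actual stores.
FILE_HOSTS = {
    "primary": "https://files-primary.example.com",
    "secondary": "https://files-secondary.example.com",
}


class FileHostSwitch:
    """Builds file URLs against whichever host is currently active."""

    def __init__(self, hosts, active="primary"):
        self._hosts = dict(hosts)
        self._active = None
        self.use(active)

    def use(self, name):
        """Switch the active host, e.g. by hand during an outage."""
        if name not in self._hosts:
            raise KeyError(f"unknown file host: {name}")
        self._active = name

    def url_for(self, path):
        """Return the URL of a stored file on the active host."""
        return f"{self._hosts[self._active]}/{path.lstrip('/')}"


switch = FileHostSwitch(FILE_HOSTS)
normal_url = switch.url_for("uploads/report.pdf")

# During an outage on the primary, flip one setting and keep serving files.
switch.use("secondary")
failover_url = switch.url_for("uploads/report.pdf")
```

Because every read goes through `url_for`, failing over is a one-line change rather than a hunt through the codebase. It only helps, of course, if the daily backup means the secondary actually has the file.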

I do agree with you about vendor lock-in though, it's nasty stuff. At the end of the day it comes down to time allocation. I'm a one-man ops show with too many other things to do than to wake up with a Pingdom alarm at 3AM because of NFS.

Whoa. I have never hit this one, to be honest, in years. Maybe it was something CentOS-specific; everything I control runs on either Debian or Ubuntu... but I will keep this in mind in case I ever do hit this error.

Might have been worth a try to get a RHEL support contract, but if you're a one-man show and happy with CDN, then that's the better solution for you definitely ;)

> Always, really? The history of PHP's opcode caches is complex. Before PHP 5.6, where Zend published their opcache+, I've seen the various solutions (APC and others) cause vicious bugs.

First of all, realpath_cache_size is unrelated to opcache and has always worked well. The only problem with that configuration is that the default is way too low for modern frameworks.

Also, I think GP meant that now you can always fine-tune the opcode cache. The current opcache is built into PHP 5.5+ and just works. It works so well that there are almost no competitors because there is no need.

Tideways (xhprof fork) maintainer here.

We are working on separating the extension that powers our SaaS APM for PHP and the original xhprof callgraph profiler again into two extensions in the near future :-) Then the extension will still be branded, but it has no code related to sending data to our service and just the backwards compatible profiling API.

> Secondly with any web framework your app will most likely be limited by IO (database, file lookups, networking) before it becomes limited by actual code execution performance.

Latency is additive; it's a sum() operation, not a max() operation. Yes, a large fraction of most web requests get eaten by blocking on IO, but everything on top of that adds up, and small numbers add up surprisingly quickly. And the more you try to be smart about that stuff, whether it's optimizing what you query, doing extra caching, or trying to turn multiple DB roundtrips into a single trip, you often just trade IO time for CPU time (albeit lower). You still care about CPU performance.

> Latency is additive; it's a sum() operation, not a max() operation.

This! And you can only discard a component's latency when it disappears in the deviation of another component's much larger latency. In most PHP (or Ruby/Python/etc. for that matter) apps, this is not the case. They add some 10-150ms on top, and often the db calls are <10ms.
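To put rough numbers on the sum() versus max() point, here is a small sketch. The per-component latencies are made up for illustration (loosely in the ranges mentioned above), not measurements from any real app.

```python
def request_latency_ms(components):
    """Sequential steps in one request add up: total latency is the sum."""
    return sum(components.values())


def largest_component_ms(components):
    """What you would get if only the single slowest step mattered."""
    return max(components.values())


# Assumed, illustrative per-request timings in milliseconds.
request = {"db_calls": 10, "app_code": 50, "network": 5}

total = request_latency_ms(request)        # 65
slowest = largest_component_ms(request)    # 50
app_share = request["app_code"] / total    # about 0.77
```

With these assumed figures the language runtime accounts for roughly three quarters of the request, so tuning the database alone can't bring the total below 55ms.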

Performance does matter, and is hard to improve once you have many KLOC of code in a slow language. Hence FB spends big bucks on PHP performance improvements.

> and often the db calls are <10ms

Most are <10ms. Those nasty few ones with 100+ms (or sometimes 10+s!) are where the best money is spent optimizing. However good query/db schema optimization is a skill many developers do not possess... which is why a good DBA is worth his weight in gold.

> If anyone's ever worked on a project where the performance problem was the language and not IO I'd be really interested in hearing about it, but in my career of making websites I've never run into this problem yet.

Our company's product is CPU bound, and our language (Python) is definitely inhibiting our ability to meet basic performance goals. I've done a few crude benchmarks and I'm confident that a rewrite in Go would buy us at least a tenfold performance improvement, although the benchmarks themselves suggest it's closer to 100-200X. Besides being faster out of the box, Go is also easier to optimize--better profiling tooling but also the ability to control memory layout, allocations, and dispatch. And that's all without parallelism.

Or alternatively break out the big CPU bound bits into an FFI language, e.g. C, Rust, etc. That's why Python has these bindings. You can almost sort of do it with Go, but it's not easy, since Go has a VM as well.

Unfortunately, the bottleneck is largely traversing a massive, poorly-defined data structure. If you leave the structure in Python, then FFI doesn't help you much (probably doesn't justify the maintainability cost of FFI). Porting the data structure is a comparable amount of work to rewriting from scratch, so FFI doesn't gain us anything except maybe an iterative avenue toward rewriting.

In any case, the application is still CPU bound, thereby making it an example of a web application which is not IO bound, per the OP's request.
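One crude way to check which camp a piece of code falls into is to compare CPU time with wall-clock time. The sketch below is only a rough heuristic and has nothing to do with the product discussed above: the `cpu_fraction` helper and the two toy workloads are assumptions made up for illustration.

```python
import time


def cpu_fraction(work):
    """Run work() and return CPU time divided by wall-clock time.

    Values near 1.0 suggest the code is CPU bound; values near 0.0
    suggest it mostly waits (IO, sleeps, locks).
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy():
    # Stand-in for a CPU-heavy traversal: pure computation, no waiting.
    sum(i * i for i in range(2_000_000))


def waiting():
    # Stand-in for blocking IO: the process is idle while it waits.
    time.sleep(0.2)


busy_fraction = cpu_fraction(busy)
waiting_fraction = cpu_fraction(waiting)
```

`time.process_time()` counts CPU time for the current process only, so time spent inside a database server or on the network shows up as waiting rather than as CPU.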

Of course there are CPU bound web apps. That the OP thinks otherwise is just ignorance. But most web apps are not CPU bound, most are IO bound.

You can start with things like Cython, which will give you a modest boost over plain Python. I agree it may not be useful in your case, but full rewrites are generally awful, abysmal and a nightmare all combined. Sure, there are cases where you must for various reasons, but they should be avoided if possible.

That's the beauty of doing something like an FFI with C/Rust/etc. You can move in that direction iteratively: you get code deployed to production faster, you collect the performance gains as you go, and the full rewrite happens gradually. Because each piece is tested in production along the way, you make end users' lives better during the rewrite, instead of making them wait until the bitter end and then finding out that undocumented features X, Y and Z that customers relied on didn't get included.

Anyways all that said, it sounds like you know what you are doing for the most part, so my advice here may not apply to you directly at all, and that's certainly plausible. But I think in general it's fairly obvious to experienced devs that re-writes are an awful plan.

I agree--rewrites are painful, and most web apps are not CPU bound. FFI does permit iterative translation, which is a great bonus, although I don't want to end up owning a bunch of C or Cython. Rust appeals to me, but I don't think my peers would agree. Not sure what we will end up doing, ultimately. :/

What do you mean by VM?

I don't think there's a virtual machine in Go.

Go doesn't have a VM, but it has a garbage collector and scheduler (its runtime) which makes FFI more difficult than languages like Rust or C, to the OP's point.

There isn't. I meant GC (garbage collector). Sorry.

Depends on the use case. I've definitely worked with projects that required a ton of capacity, because the realtime search functionality was booting the full Symfony framework every time someone typed in a letter. To get rid of those performance problems, someone builds a fragile mess of PHP. In that case something like Go or Node.js is a much better fit to the problem.

But if you're building a big old MySQL CRUD app full of complicated business rules, PHP makes a lot more sense.

> fragile mess
> or Node.js

/s ?