Hacker News
by justindocanto 4037 days ago
Lead Developer of TheDirty.com here. Nice work. We're built on WordPress, running on a single machine, using Nginx + PHP-FPM, and currently serving ~20,000,000 pageviews a month.

Before I took over development we were running on Apache w/ vanilla PHP, and CPU usage would be > 75% on a regular basis. The site also frequently ran into the Apache 10k connections issue you mentioned in your paper. After getting the new server set up & optimizing all our MySQL queries, it's abnormal for our CPUs to be doing much of anything. Out of all our CPUs, we might have 5 sitting at < 10% currently.

One thing I've gathered from this project is that technique is a huge part of optimizing WordPress. One resource hog at this scale is greatly amplified and can bring everything crashing down. I remember the first time I coded a simple 'popular posts' widget and how hard it hit with 150,000 posts to run through and 300,000+ pages being generated throughout the day. We had to scrap it the same week it went up.

One big example of why thinking about your queries matters (let's ignore page caching for a moment): if you have a menu that queries all your categories (we have 1,000+) and you run that query every time you generate a page, that's ~650,000 times a day the menu query is run and the menu is generated. Running one query on a table with 1,000+ rows sounds like nothing. But when you have to do it 650,000+ times a day... it adds up.
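A rough check of that daily figure, sketched in Python. It assumes a 30-day month and exactly one menu query per pageview (page caching ignored, as above); the variable names are only illustrative.

```python
# Back-of-the-envelope estimate, not a measurement.
PAGEVIEWS_PER_MONTH = 20_000_000  # "~20,000,000 pageviews a month"
DAYS_PER_MONTH = 30               # assumption

menu_queries_per_day = PAGEVIEWS_PER_MONTH // DAYS_PER_MONTH
print(menu_queries_per_day)  # 666666, in line with the ~650,000 quoted
```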

By switching our menu to be generated once every 5 minutes, caching it as HTML each time, and loading that cached menu onto every page via JavaScript, we only have to query & generate the menu 288 times a day. Applying this "single query/generation + load in JS" technique to things like widgets within WordPress (all our related posts areas are loaded like this) has brought our resource usage down to less than 1% of what it used to be.
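To make the "generate once every 5 minutes" idea concrete, here is a minimal sketch in Python rather than PHP. It is not TheDirty.com's actual code: `TimedFragmentCache`, `build_menu_html` and `simulate_day` are hypothetical names, and it only covers the server-side regeneration, not the JavaScript loading step.

```python
import time
from typing import Callable, Optional


class TimedFragmentCache:
    """Regenerate an HTML fragment at most once per `ttl_seconds`."""

    def __init__(self, build: Callable[[], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._build = build      # expensive: runs the query and renders HTML
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the sketch can use a fake clock
        self._html: Optional[str] = None
        self._built_at = 0.0
        self.builds = 0          # how many times the fragment was regenerated

    def get(self) -> str:
        now = self._clock()
        if self._html is None or now - self._built_at >= self._ttl:
            self._html = self._build()
            self._built_at = now
            self.builds += 1
        return self._html


def build_menu_html() -> str:
    # Hypothetical stand-in for "query 1,000+ categories and render the menu".
    categories = [f"category-{i}" for i in range(1000)]
    return "<ul>" + "".join(f"<li>{c}</li>" for c in categories) + "</ul>"


def simulate_day(requests_per_day: int = 650_000) -> int:
    """Replay evenly spaced requests over one day; return the rebuild count."""
    fake_now = [0.0]
    cache = TimedFragmentCache(build_menu_html, ttl_seconds=300.0,
                               clock=lambda: fake_now[0])
    step = 86_400 / requests_per_day
    for i in range(requests_per_day):
        fake_now[0] = i * step
        cache.get()
    return cache.builds


print(simulate_day())  # 288, i.e. 24 * 60 / 5
```

The rebuild count is bounded by the TTL, not by traffic: 650,000 requests still trigger at most 288 regenerations a day.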

Although people get really excited about things like widgets & cool custom plugins... it's good to remember, if you can code, that sometimes hardcoding and/or coming up with your own technique will save you a ton of CPU/RAM usage if you truly want to build for scale. Every single query counts. Nginx is crucial above 10k connections & PHP-FPM is a great partner for Nginx as well. I'm going to print out your thesis and run through it a few times. Nice work my friend.

----

Also, idk why somebody said APC is dead. One of the most, if not the most, widely used caching plugins, W3TC, still supports APC caching and a lot of talks about WordPress optimization use APC. Personally, I find other ways to be better, but that's another topic.

2 comments

Thanks for the kind words Justin, they are the biggest endorsement of my work so far. :) I'm not getting this in uni, where everybody is about matrices and stuff.

Have you tried using HHVM instead of PHP-FPM? Although you definitely have a fast site already, HHVM could potentially double the numbers.

Yes, one slow DB query can ruin all the fun. That's why using Redis for in-memory DB caching can help. In some cases it might be a better fit than full page caching, because DB caching works for logged-in users too, while page caching has to be bypassed for them.
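A minimal sketch of that query-level caching idea (the cache-aside pattern), in Python. To stay runnable without a server it uses a dict-backed `InMemoryStore` as a stand-in for Redis; that class, `cached_query` and `slow_popular_posts_query` are all hypothetical names, not a real Redis client or WordPress API.

```python
import time
from typing import Any, Callable, Dict, Tuple


class InMemoryStore:
    """Dict-backed stand-in for a Redis-like store (assumption: not a real client)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave as a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (time.monotonic() + ttl_seconds, value)


def cached_query(store: InMemoryStore, key: str,
                 run_query: Callable[[], Any], ttl_seconds: float = 60.0) -> Any:
    """Cache-aside: return the cached result, or run the query and store it."""
    value = store.get(key)
    if value is None:
        value = run_query()
        store.set(key, value, ttl_seconds)
    return value


query_count = 0


def slow_popular_posts_query() -> list:
    # Hypothetical stand-in for an expensive database query.
    global query_count
    query_count += 1
    return ["post-1", "post-2", "post-3"]


store = InMemoryStore()
for user in ["anonymous", "alice", "bob"]:  # logged in or not, same cached rows
    posts = cached_query(store, "popular_posts", slow_popular_posts_query)

print(query_count)  # 1: the query ran once for all three requests
```

Because the cache key depends on the query and not on the visitor, logged-in users hit the same cached rows, which is the case where a full page cache would be bypassed.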

Using JavaScript for better performance helps too; I like how you optimized that menu. I've noticed that profiling the code with, for example, xhprof can be really useful during development. I found myself writing some slow code/queries, ran the profiler and immediately discovered the bad parts. It's quite easy and straightforward to fix them once you know where they are.
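The same workflow, sketched with Python's built-in `cProfile` swapped in for xhprof (which is a PHP profiler). `slow_related_posts` and `render_page` are hypothetical functions standing in for a slow code path; the point is only that the report surfaces where the time goes.

```python
import cProfile
import io
import pstats


def slow_related_posts(n: int = 300) -> int:
    # Hypothetical hot spot: a quadratic loop standing in for a slow query.
    total = 0
    for i in range(n):
        for j in range(n):
            total += (i ^ j) & 1
    return total


def render_page() -> str:
    slow_related_posts()
    return "<html>...</html>"


profiler = cProfile.Profile()
profiler.enable()
render_page()
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the printed report, `slow_related_posts` shows up with nearly all of the cumulative time, which is the cue for where to optimize.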

----

They probably meant that APC, which was a separate PHP caching module back then, is dead. However, caching is now part of PHP and, as far as I can remember, it uses the same API as APC did, so basically it still lives on and W3TC supports it.

I think people are saying APC is dead because another opcode cache was included in PHP 5.5 and APC has not seen a release since 2012.

https://en.wikipedia.org/wiki/List_of_PHP_accelerators#Alter...
http://pecl.php.net/package/apc

There is still a need for a user cache (in addition to the built-in opcode cache)... although a lot of high-traffic WP sites are turning to Redis over memcache for this.

Ah gotcha. The ol' "it's old so it must be dead" way of thinking. That makes sense. Thanks.