by ck2 5613 days ago
Yes, but that's handled by hardware, not by people, and I am sure only a few critical coders handle the software performance.

WordPress.com is not as trivial as you might think: http://en.wordpress.com/stats/

Certainly, once you hit a certain level of volume it's just a matter of scaling: you have to be able to bring more servers online into the grid.

Scaling from 10,000 users to 1 million is probably very hard.

Scaling from 1 million to 100 million, well, you'd better have a pattern down that works with easy hardware replication (like Google does).

4 comments

Understand that Google is continuously rewriting their infrastructure to handle increased scale. That's what they have 25k employees for.

Jeff Dean's rule of thumb is that you should build a growth factor of 10 into the design, but any more than that and you will probably have to re-architect anyway. So going from 10,000 to 1 million and 1 million to 100 million are probably roughly equivalent in difficulty.
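To make the "roughly equivalent" claim concrete, here is a small sketch that counts how many 10x steps separate two user counts. The function name and the idea of measuring growth in units of the design headroom are illustrative assumptions, not something stated in the rule of thumb itself.

```python
import math

DESIGN_HEADROOM = 10  # the 10x growth factor from the rule of thumb


def headroom_steps(current_users, target_users, headroom=DESIGN_HEADROOM):
    """Return how many headroom-sized growth steps separate two user counts."""
    return math.log10(target_users / current_users) / math.log10(headroom)


# Both jumps are 100x, so both come out to about two 10x steps.
print(round(headroom_steps(10_000, 1_000_000), 2))
print(round(headroom_steps(1_000_000, 100_000_000), 2))
```

Under that reading, each jump means outgrowing a design about twice, which is why the two look similar in difficulty.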

How many of those 25k employees actually handle any of the infrastructure scaling?

A fairly large percentage of them, and many of the ones whose direct job responsibility isn't infrastructure (like me) frequently have to deal with the consequences of building for scale as they develop features.

The big difference between blog hosting and things like Twitter (or Reddit, or Digg, etc) is that blogs are independent, so adding servers scales you up linearly. When you are looking at X's blog, basically everything you see is coming from one server.

You will need something that maps URLs in the unified logical namespace of your site to the individual servers that the particular blogs are on. That's the part that you can't just throw machines at and get good results.
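A minimal sketch of that mapping layer, assuming a fixed pool of servers and a stable hash of the blog's hostname. The server names and the `server_for_blog` function are hypothetical, and nothing here is claimed to be how WordPress.com actually routes requests.

```python
import hashlib

# Hypothetical pool of blog servers; the names are made up for illustration.
SERVERS = ["blog-db-0", "blog-db-1", "blog-db-2", "blog-db-3"]


def server_for_blog(hostname, servers=SERVERS):
    """Map a blog hostname to one server using a stable hash."""
    # Hostnames are case-insensitive, so normalize before hashing.
    digest = hashlib.md5(hostname.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Every request for the same blog lands on the same server.
print(server_for_blog("example.wordpress.com"))
```

Plain modulo hashing like this moves most blogs to a different server whenever the pool size changes, so a real deployment would more likely use a lookup table or consistent hashing. That lookup is the shared piece every request has to go through.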

With the social sites, you can't isolate things easily, because what a given person sees at any time is drawn from an ever-changing set of content from other users, with each viewer drawing from a different set.
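The contrast can be sketched by counting how many servers a single page view has to touch. The placement table and user names below are invented for illustration; the point is only that a blog view reads from one place while a timeline reads from wherever the followed authors happen to live.

```python
# Hypothetical placement of each author's content on a numbered server.
PLACEMENT = {"alice": 0, "bob": 1, "carol": 2, "dave": 1}


def servers_for_blog_view(owner):
    """A blog page only needs the one server holding that blog."""
    return {PLACEMENT[owner]}


def servers_for_timeline(following):
    """A timeline needs every server holding any followed author."""
    return {PLACEMENT[author] for author in following}


print(servers_for_blog_view("alice"))                    # {0}
print(servers_for_timeline(["alice", "bob", "carol"]))   # {0, 1, 2}
```

Because every viewer follows a different set of authors, there is no single placement that keeps all timelines on one server.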

Based on the stats on the site, I calculate that Wordpress.com is averaging about 10 writes and around 8800 reads per second.

edit: Correction, ~880 reads per second.

Looking at the posts per day detail I think it's more like 30 per second with bursts that probably go to twice that, while nighttime is idle depending on timezones.

But I agree it's a fraction of 7k/sec peak for twitter.

However, Twitter does not have to parse HTML, has no plugins or templates to execute, and has a maximum string length of 140 characters, which is trivial.

Each post/comment published on wp.com takes many, many more cpu cycles than twitter.

I agree that Wordpress requires far more CPU time (I run a network of Wordpress blogs).

My figures: 900,000 transactions (500,000 posts + 400,000 comments) / 86,400 seconds per day:

~= 10 transactions per second on average, though like twitter, activity probably has a power law distribution corresponding with US daylight hours.

Reads: 2.3 billion / month. Assuming a 30-day month, that's 86,400 * 30, or 2,592,000 seconds per month.

2,300,000,000 / 2,592,000 ~= 887 pageviews per second.

I was incorrect by an order of magnitude, though the same uneven traffic patterns caveat applies.
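The arithmetic above can be checked with a few lines of Python. The input figures are the ones quoted in the comment (500,000 posts and 400,000 comments per day, 2.3 billion pageviews per month, a 30-day month); the variable names are just for illustration, and these are averages that say nothing about peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

posts_per_day = 500_000
comments_per_day = 400_000
pageviews_per_month = 2_300_000_000
days_per_month = 30  # assumption carried over from the comment

writes_per_second = (posts_per_day + comments_per_day) / SECONDS_PER_DAY
reads_per_second = pageviews_per_month / (SECONDS_PER_DAY * days_per_month)

print(f"writes/sec: {writes_per_second:.1f}")                            # 10.4
print(f"reads/sec: {reads_per_second:.0f}")                              # 887
print(f"reads per write: {reads_per_second / writes_per_second:.0f}")    # 85
```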