
It depends how you approach the problem. I'm guessing here, from my own experience implementing social graph features, but here goes:

There are a couple of ways you can go about this. The first is the database: join the network of people against the land of content and bring it back. This doesn't work; your database will cry. (As an aside, this is what people mean when they say "web scale": it has nothing to do with web traffic, it's social graphs.) Not at first, though: in development it works fine, and you feel fine, and for a while you're OK, but then you start growing....
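To make the join approach concrete, here is a minimal sketch using an in-memory SQLite database. The table names (`connections`, `posts`), the columns, and the helper functions are all made up for illustration; this is not the toolbox.com schema, just the general shape of "join people against content".

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE connections (user_id INTEGER, connected_to INTEGER);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            author_id INTEGER,
            title TEXT,
            created_at INTEGER
        );
        """
    )
    # User 1 is connected to users 2 and 3; user 2 is connected to user 3.
    conn.executemany(
        "INSERT INTO connections VALUES (?, ?)",
        [(1, 2), (1, 3), (2, 3)],
    )
    conn.executemany(
        "INSERT INTO posts (author_id, title, created_at) VALUES (?, ?, ?)",
        [(2, "post by 2", 100), (3, "post by 3", 200), (4, "post by 4", 300)],
    )
    return conn


def feed_by_join(conn, user_id, limit=10):
    """Build a user's feed at read time by joining connections against posts."""
    rows = conn.execute(
        """
        SELECT p.title
        FROM connections AS c
        JOIN posts AS p ON p.author_id = c.connected_to
        WHERE c.user_id = ?
        ORDER BY p.created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()
    return [title for (title,) in rows]


conn = build_demo_db()
print(feed_by_join(conn, 1))  # newest posts from the people user 1 is connected to
```

With three users this is instant. The trouble is that the work done per page view grows with the number of connections and the volume of their posts, and every read pays that cost again.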

Another way you can go about it is by denormalizing. In this world you store a pointer to each content item for each user, so any time I do something, all the people [following|watching|connected|friended] to me get a record indicating I did it. This works, but now you have lots of data (lots and lots of data!) spread all over the place, and you need some kind of system to push that data out to everybody. It's those last two that drive up your hardware usage: not necessarily web boxes, but boxes in the background broadcasting the events out to the world, and the datastores to hold it all. Depending on how your web code works, you could also have a lot of overhead on the web servers putting all that stuff together.
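A rough sketch of the denormalized approach, again with made-up names (`FanoutFeeds`, `follow`, `publish`, `feed` are all hypothetical). Everything lives in one process here for illustration; in a real system the loop inside `publish` is the part that would presumably run on background boxes, and `feeds` would be a datastore of its own.

```python
from collections import defaultdict, deque


class FanoutFeeds:
    """Toy fan-out-on-write feed store (illustrative only, all in memory)."""

    def __init__(self, max_feed_len=100):
        self.followers = defaultdict(set)  # author -> users who follow them
        # One bounded list of post ids per user; the oldest pointers fall off the end.
        self.feeds = defaultdict(lambda: deque(maxlen=max_feed_len))
        self.posts = {}  # post id -> (author, title)
        self._next_id = 1

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def publish(self, author, title):
        """Store the post once, then write a pointer into every follower's feed."""
        post_id = self._next_id
        self._next_id += 1
        self.posts[post_id] = (author, title)
        # This loop is the broadcast: one write per follower, per post.
        for follower in self.followers[author]:
            self.feeds[follower].appendleft(post_id)
        return post_id

    def feed(self, user):
        """Reading is now cheap: follow the user's stored pointers, newest first."""
        return [self.posts[post_id][1] for post_id in self.feeds[user]]


store = FanoutFeeds(max_feed_len=2)
store.follow("bob", "alice")
store.follow("carol", "alice")
for title in ("a1", "a2", "a3"):
    store.publish("alice", title)
print(store.feed("bob"))  # only the two newest pointers are kept
```

The trade is visible in `publish`: a single post by someone with 750k connections turns into 750k writes, which is where the extra data and the background hardware come from.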

My experience here comes from building the social features into toolbox.com. A good example is this page http://it.toolbox.com/people/george_krautzel/posts-connectio... That's all the posts from users connected to our CEO (all 750k of them). Getting that to return in near real time is super fun (and you can probably tell that I went down the DB join path before it all fell apart).