Hacker News new | ask | show | jobs
by ra 5613 days ago
Twitter.com is probably one of the great engineering feats of todays social Internet.

You tweet from your mobile phone app (maintained by twitter) or twitter.com, and that tweet gets posted on your public timeline. The tweet is parsed for mentions, and it is also copied on to the mentionee's incoming timeline and their mentions timeline.

The tweet is parsed for hashtags, and active searches (i.e. ones open on twitter.com, or via the APIs) and is also copied onto those timelines.

Also, you have followers. So Twitter also copies the tweet onto the inbox timeline of each of your followers.

Not only is all this information published in real time on Twitter.com, but it is also made available via JSON API, Streaming API and Firehose API.

All tweets (on all timelines) are stored forever.

Twitter's scale is hard to fathom - all of this processing is way beyond what your Rails / Django app could process with a MySQL backend. MySQL replication wouldn'e even come close to keeping up with the sheer volume of events to be processed.

To make matters more complicated - Twitter is expected to scale to meet the demand of emerging world events (eg: Egypt, Iran, snowstorms, hurricanes, earthquakes, bushfires). These events don't evenly spread traffic across Twitter's network, but instead provide "storm surges" of localised intense traffic.

Oh, and Twitter haven't just launched an analytics product?

I think for sheer engineering at scale, there are maybe only about half a dozen other companies in the same league.

(Edit: grammer changes for readability)

3 comments

A lot of companies deal with traffic of this magnitude. UPS perform 12 billion transactions per day through their systems - DB2 on Sysplex'd mainframes. Finance companies are currently dealing with tens of thousands of messages per second with much more complex handling and forwarding rules than Twitter. And, of course, in terms of page reads, Google and Facebook leave them well behind.

Twitter has become impressive, but let's not overstate its achievements.

Sure but UPS, Google and Facebook have a lot more than 350 employees.
Not an apples-to-apples comparison. UPS, Google and Facebook have much less concentrated product/service offerings (ie, they each have >1).
> Not an apples-to-apples comparison.

So why did you make the comparision?

I am just pointing out that ck2's summation of Twitter understates the complexity of the problem they solve.

> So why did you make the comparision?

My point was that twitter's scale and problem complexity are not unprecedented and that small, motivated teams have solved them before. Parallel queries and transactions for DB2 on z/OS were developed by small teams in Poughkeepsie, NY and Perth, Australia. Google's infrastructural software was designed by a small group of very bright people, and so on.

If we compare the groups actually working on the problems then I expect that twitter will have comparably small groups of engineers directly facing the scaling problem. And I also repeat the point that bigger problems have already been solved. Twitter's issues are not unprecedented if you are prepared to look outside the Journal of Stuff I Remember Seeing on Highscalability.com.

I don't disagree with any of that, except that no one has claimed that Twitter's issues are, "unprecedented".

Once again I will reiterate I am just pointing out that ck2's summation of Twitter understates the complexity of the problem they solve.

Great "social internet" Engineering feat, eh maybe. Compared with other real-time & data intensive platforms, not even close. Electronic finance assets exchange platforms are a order of magnitudes more complex.
That's a good breakdown of the processes but I am positive all that is done in stages and queues.

So with a good design you have groups of servers doing the different stages in the queue.

One you've got the pattern down for 100 tweets per second, the pattern should be reproducible by scaling servers in each queue to 1000 tweets per second, and eventually 10,000 tweets per second.

The database requirements may explain why it's all done in one datacenter instead of trying to do replication across the country/world.

In the research literature this is called SEDA: http://en.wikipedia.org/wiki/Staged_event-driven_architectur...