Hacker News new | ask | show | jobs
by madrox 1285 days ago
A long time ago we discovered Twitter used a round robin of three servers for assigning IDs to tweets. We inferred the round robin was done by doing mod 3 of a signed int32, and because that space doesn't divide neatly by two it meant one of the three servers saw less load than the others and we could map ID assignment volume according to how often it overflowed and hence estimate total tweet volume for a given period.

Some of the details escape me (this was a decade ago) but it was a fun combination of statistical inference and CS knowledge that I don't get to use often. Whenever integer overflow comes up in a systems engineering context I get a little tickled.

7 comments

I am pretty sure that snowflake didn't use a mod of a signed int32. It used a service discovery pool as part of finagle (and prior to that dns iirc). The server used a very simple method internally to convert time into a integer (that was 52 bits because of javascript). In fact it was completely open source: https://blog.twitter.com/engineering/en_us/a/2010/announcing...

The integer generation was pretty simple, there was a fixed id of each server, and unless I a mistaken we have 5 servers per datacenter. Each id was basically <time><offset><id> where time was a millisecond timer, offset was the number of ids generated in that same millisecond by the same server, and id was the machines unique identifier. When we first talked about this process I thought that offset was going to roll, every id would increment it by one. This was changed to resetting it every millisecond specifically so that it would obscure tweet volumes.

At the time I remember reading a LOT of articles estimating tweet volume and most of them were way, way off. I don't know that we ever really put effort into correcting them though. =)

* - Does not account for changes in the system post 2012.

Awesome to hear from someone on the other side of the API with knowledge of this. This is ringing bells for me. Yeah, the id parameter was how we knew how many servers there were, and we saw more assignment in some servers than others that neatly mapped to a int32 max failing to divide by the number of IDs we saw. I thought I recall Twitter confirming that was how round robin happened but I could be totally misremembering. We never got a contact to talk to Twitter about it. FWIW, we did eventually see this fixed. I imagine it was pretty easy to spot one server seeing less load than others.

The offset was actually how we calculated volume, because millisecond collisions become a variant of the german tank problem[1]. A few times when y'all made tweet volumes public it mapped pretty closely with our estimates.

This was around 2011, so your knowledge should be relevant.

1: https://en.wikipedia.org/wiki/German_tank_problem

I think what your describing was actually a problem that I introduced in our bind configuration back when we used DNS for service discovery. Its not exactly what your describing but I can totally see how it would appear that way externally.

So initially we used 2 bind servers with authoritative zones serving using round robin. This worked fairly well as the load on each server was high enough to keep the round robin flowing and the connections from ruby would "stick" for 30 seconds anyway. We had a very large pool of front ends so its not like it mattered. Eventually we had to put a local bind cache on each server, but this introduced another super annoying bug. For some reason, the interaction between the authoritative server, and the caching server would end up causing the served record set to be ordered the same. Normally the authoritative server would serve A, B, C for the first query, then B, C, A for the second, etc. When the cache in the middle for some reason it would reset al queries to A, B, C after the refresh happened. So effectively every 30 seconds it would reset to A, B, C and start round robin again. Since the front ends would connect and stay sticky via connection reuse for a while this meant that A got like 50% of the load, B got 33%, and C got 17%. I am guessing you latched on to this by seeing that one of the servers was horribly underutilized in comparison and reproduced the math accidentally. =)

The fix for this was the massively disgusting, but super effective "make every single machine a DNS zone secondary". Rather than having a simple record cache we just fetched the whole zone if it changed. This actually made DNS quite a bit faster as every machine had a local copy of the zone.. Once that happened distribution returned to normal (20/20/20/20/20) fix for all of our internal services which used DNS for discovery, including snowflake.

This is awesome and completely retconning a 10 year old project for me. I was working on social media analytics at Disney and we were exploring ways to measure twitter conversation of our brands, which led to us attempting to estimate total conversation volume, which is why this technical nuance was relevant to us. It was a wildly experimental time. Thanks for the story!
This thread is amazing.. thank you both for sharing!
Agree, was super excited when I saw reference to the German tank problem as that is immediately what I thought of in the first post.
I dream of the day in the future, possibly in Eden, when I'll have opportunity to discuss the things I've reverse engineered with the guys and gals who actually engineered them. This conversation in enlightening to witness.

I will have an especially hard time phrasing questions in a respectful manner for the Adobe devs, though ))

Also, totally unrelated Twitter story (because I am nostalgic lately and you reminded me of it)

Early in my time there we had a major, unexpected load spike in the middle of the day. People where tweeting like crazy, breaking records for volume, etc. The system was groaning under strain but it mostly held. Turns out Michael Jackson had died. We sustained 452 (or maybe 457, it was roughly 450-something) tweets a second. this quickly became 1 "MJ" worth of tweets. We informally used this to define load the entire time I was there. Within a few months we got to a point where we were never BELOW 1 MJ, within a year I think we had peaked at double digit MJs and sustained several even in the lowest periods. Before I left we had hit 1 MJ in photo tweets, etc.

Around the time I left we did something like 300 MJ's per second, and I was only there 3 years.

I remember those days before snowflake and Blaine was desperately trying to keep the lights on. It's why no one had time to talk to us (marketers and related disciplines) back then. Even Facebook said "all our focus is on acquiring users. You have no choice but to meet us on our terms because we own the eyeballs." Everything was growing so fast. Hard to believe that was only a decade ago.
Musk claimed a record of 20,000 of tweets per second recently [1]. How does that square with what you’re saying? 300 times 450 is closer to 150,000. Am I missing something?

[1]https://twitter.com/elonmusk/status/1595505413113323520?s=20...

That tweet does not claim 20,000 is a record.

The event also did not reach 20,000, it "almost" hit it.

You’re right. I misinterpreted ‘record usage’. I found a 2013 blog post [1] claiming a 143 thousand tweets per second peak.

[1] https://blog.twitter.com/engineering/en_us/a/2013/new-tweets...

That's interesting; I worked on a different site that also got record traffic when MJ died. I wonder if that happened to every site that had a chat or news feature.

Kind of obvious now, but I bet we could detect major world news just by sampling traffic size of chat sites.

Someone previously created a tweet linking to itself by predicting the likely ID range: https://oisinmoran.com/quinetweet
A simpler scheme I've used for adtech (billions of requests per day) is to simply reserve a chunk of numbers for each server from a central source. Easy to implement, very fast since each node can just increment in process, and using a 64-bit integer is effectively infinite.
That's how Active Directory Domain Controllers handle the distributed creation of SIDs too.
Twitter didn't always use Snowflake -- that was introduced in November 2010. There was another, much simpler algorithm used before that which generated much smaller IDs (e.g. under 3e10).
Yea.. "auto increment" in mysql. it SUCKED in so many ways, primary of which was that it required a single mysql database to be the primary for all tweet writes. Every tweet just incremented a counter by 1. As we scaled up that server got overloaded rapidly and became a huge single point of failure. Snowflake was introduced to solve several problems at once. =)
> 52 bits because of javascript

But IEEE 754 doubles have a significand that supports a 53-bit range. What am I missing?

You're not missing anything, people just forget the implied bit sometimes.
negative numbers not used?
That doesn't make any sense.
Wouldn't that unevenness only affect 2^31 - 2 and 2^31 - 1, so a negligible fraction of the integers? Was that tiny discrepancy enough to make your calculations?

In other words, what do you mean that it was done by doing mod 3 of a signed int32? If it was a monotonically increasing or random int32, I don't see how that unevenness would manifest in a meaningful way.

In another subthread, we realized my memory was wrong and we were measuring millisecond collisions. The serving ID imbalance was a side-effect. Also, it might've been an int16 I was thinking of but turns out the whole thing was shadows on cave walls.
Maybe a dumb question, but I don’t follow what you mean by “the space doesn’t divide neatly by two” and also how that connects with overflowing ints. Asking because I’m genuinely curious and would like to know more about this. Sounds really neat!
If they were incrementing and modding wouldn’t that server see an extra 1/2 billionth more traffic?

I don’t get how mod 3 affects anything if you’re just incrementing…

If it’s round robin then it should be an even load, how does the modulo change that exactly?

Also what number are they using to modulo and where is that happening? Because at that point don’t they already have an incrementing ID before generating another one?

Take a 3-bit counter:

    0->A
    1->B
    2->C
    3->A
    4->B
    5->C
    6->A
    7->B
A and B get hit three times while C only twice, so it will see 66% utilization compared to A and B

EDITED s/once/twice/ thanks CyberDildonics

while C only once

You listed C twice

That's a typo. You can still see they're correct about the ratio (two "C"s for every three "A"s and "B"s).
There's no ratio. It's even across all of them, as long as the integer keeps incrementing. One more number (9) instead of stopping at 8 and there would be an even spread.
Not when the counter overflows back to 0. If it's a 3 bit counter, 0 is A again, not C.
Edited, thanks
That doesn't change anything... it's still round robin. You just stopped at an arbitrary number of 8 integers instead of 9.
It's not arbitrary, GP stated it was a 3 bit counter. In GGP (or something, not sure how far down in the thread we are), they were referring to a 32 bit int counter until overflow. If you bin each number from 0 to 2^32 - 1 by mod 3, you don't get 3 bins of equal sizes, 1 bin always comes out smaller.
Smaller by a single number at most, it's effectively insignificant. Not sure where the 3-bit counter assumption came from as the original post said 32bit signed int.
That's really neat, I'd love to hear more about it. Was this something you were actively trying to find out, or was it poking around until something caught your eye?
A little of both. I was working in social media analytics, and we were collecting everything we could to understand how to communicate the value of this new medium to businesses who could use twitter for marketing. This was still in an era where privacy wasn't at the front of anyone's minds, so there were zero retention policies. Hard to believe that was only 10 years ago.

Eventually, we learned to treat Twitter as a lead generation tool for off-platform activity and apply old school funnel mechanics to it. The next problem became how to build a follower count. Sadly, that problem is what I think led to extremism on the platform. Hence: https://madrox.substack.com/p/yet-another-quitting-twitter

Not op, but at an ecommerce company I worked for we did similar things to track how well our competitors were doing relative to us so could be something like that.

Also collecting data like this can be useful if you want to beat markets.

Reminds me of the self-quoting tweet: https://news.ycombinator.com/item?id=25244872