by throwaway290
1298 days ago
My point 1 was specifically phrased to clarify how it could have been more reliable before the migration to FB: it did not have to deal with the same load back then. You said nothing to show it was not a correlation. Your point 2 sounds like there were additional factors that could have influenced the reliability besides the load (they didn't simply migrate to FB infra but also switched to Signal).

> The broadcasting is important, yes, but it is very heavily biased towards reads over writes, so something like Cloudflare would solve 99% of the load.

There's push-notifying millions of devices within seconds after a celeb or a major news source tweets. There's tracking view and engagement stats on that in realtime. There's making sure a tweet is not available to any of those within seconds after it's been deleted or moderated. There are separate back-office apps for moderating that firehose of content. And that's just what I can see from the outside. An e2ee instant messenger with size-limited chat groups doesn't even come close.

Please don't say "just stick a CDN on top of it and you are 99% there", it's embarrassing (and not to Twitter). This will maybe get you 80% there if your goal is "a microblogging platform", but not even 20% if your goal is being both the go-to news source and shitpost forum for people worldwide, reliably working even in sensitive times and emergencies. Twitter used to be a microblogging platform back when it had far fewer employees, and you'd see a fail whale regularly even though it had far fewer active users. In recent years it's a completely different beast, and saying increased headcount is unrelated is amusing.
The push notification argument is also overstated. Sharding and fan-out solve the burstiness. And people overall receive a similar number of messages (and thus push notifications) from WhatsApp as from Twitter. Besides, these days the push notifications go through Google/Apple servers anyways to reduce the number of open connections needed on the phone side.
Then there are DMs. They are per person so CDNs don't help much (just static assets), but also they shard basically perfectly. So, shard them.
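To make "shard them" concrete, here is a minimal sketch of what I mean, assuming one-to-one DMs keyed by the pair of participants. Everything in it (`DMStore`, `conversation_key`, `shard_for`, the shard count) is a made-up name for illustration, and the in-memory dicts stand in for what would be separate databases or machines. This is not how Twitter or anyone else actually does it.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count, chosen only for illustration


def conversation_key(user_a: str, user_b: str) -> str:
    """Order-independent key, so both participants resolve to the same conversation.

    Assumes user ids never contain the "|" separator.
    """
    return "|".join(sorted((user_a, user_b)))


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Stable shard index from a SHA-256 digest (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class DMStore:
    """In-memory stand-in for a sharded DM store: one dict per shard."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        self.shards = [dict() for _ in range(num_shards)]

    def send(self, sender: str, recipient: str, text: str) -> int:
        """Append a message to the conversation and return the shard it landed on."""
        key = conversation_key(sender, recipient)
        idx = shard_for(key, self.num_shards)
        self.shards[idx].setdefault(key, []).append((sender, text))
        return idx

    def history(self, user_a: str, user_b: str) -> list:
        """Read a whole conversation from exactly one shard."""
        key = conversation_key(user_a, user_b)
        return list(self.shards[shard_for(key, self.num_shards)].get(key, []))
```

The point is that every read and write for a conversation touches exactly one shard, so no cross-shard coordination is needed. A plain modulo like this makes resharding painful; consistent hashing would be the usual answer, but that is beyond this sketch.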
Which in the end leaves the user feeds. Designed correctly, sharding would work extremely well, and what doesn't work could be handled by caching closer to users for those 1k most popular accounts.
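Roughly the feed design I have in mind, as a toy sketch: ordinary accounts fan out on write into each follower's inbox, while the most popular accounts are written once to a shared cache and merged in at read time. `FeedService`, the follower threshold and the feed limit are all hypothetical names and numbers I picked for the example. A real system would put the inboxes on shards and the hot cache close to users.

```python
from collections import defaultdict, deque

POPULAR_THRESHOLD = 3  # hypothetical cutoff; a real one would be far higher
FEED_LIMIT = 100       # hypothetical cap on stored and returned items


class FeedService:
    """Toy hybrid feed: push for ordinary authors, pull from a cache for popular ones."""

    def __init__(self, popular_threshold: int = POPULAR_THRESHOLD):
        self.popular_threshold = popular_threshold
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        # Per-user materialized feed, filled at write time.
        self.inbox = defaultdict(lambda: deque(maxlen=FEED_LIMIT))
        # Per-author recent posts for popular accounts, read at feed time.
        self.hot_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))
        self.clock = 0  # logical timestamp, used for ordering

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_popular(self, author: str) -> bool:
        return len(self.followers[author]) >= self.popular_threshold

    def post(self, author: str, text: str) -> tuple:
        self.clock += 1
        item = (self.clock, author, text)
        if self.is_popular(author):
            # One write no matter how many followers there are.
            self.hot_cache[author].append(item)
        else:
            # Fan-out on write: cost is proportional to a small follower count.
            for follower in self.followers[author]:
                self.inbox[follower].append(item)
        return item

    def feed(self, user: str) -> list:
        """Merge the user's inbox with cached posts from popular authors, newest first."""
        items = list(self.inbox[user])
        for author in self.following[user]:
            if author in self.hot_cache:
                items.extend(self.hot_cache[author])
        return sorted(items, reverse=True)[:FEED_LIMIT]
```

With this split, a celebrity tweet costs one cache write instead of millions of inbox writes, and the cache is exactly the read-heavy part that caching near users handles well.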
Honestly, with the correct architecture, languages and tooling, it could be handled by an experienced 50 person dev team plus another hundred in ops. Obviously Twitter doesn't have the perfect setup, so maybe an order of magnitude more? And if you throw a bunch of subpar engineers and tooling at the problem, nothing can dig you out of inefficiencies at this scale anyways.
And no, I'm not wildly optimistic here. Stack Overflow still runs off of 9 on-prem servers [0]. I've seen message queues that can deliver 200M notifications per second on a single machine (written in C++, for HFT). This stuff is hard, yes, but throwing more bodies at it doesn't help past the point where your fundamentals are solved.
0. https://www.datacenterdynamics.com/en/news/stack-overflow-st...