i.e. my laptop could handle the write load. A retired nerd with a real server they put in a data center for bandwidth could easily run that level of traffic on 2022 hardware.
For reference my work laptop (8 logical cores, so 4+HT? 32GB RAM) can handle 100k rows/second sustained inserts into postgres 14 with some batch jobs I'm working on. You can buffer http requests into batches and easily handle way more than 10k/s on a server while still providing synchronous semantics and reasonable latency to the client (e.g. flush batches every 10-100 ms).
I doubt Mastodon is designed for that kind of scalability, but most techies could probably afford to run Twitter as a hobby if they knew what they're doing and they weren't trying to do all the analytics and advertising stuff to monetize it/just wanted to provide the service.
I think you don't understand that 10,000 tweets isn't 10,000 inserts. It's not enough to just write tweet, you need to fan-out writes to every user's feed. It's also not batched into neat transactions like your batch jobs.
You don't need to fan out writes to feeds. Users subscribe to other users, not tweets. You can attempt to send out the notification to subscribed users, and if it fails, that's fine. You don't need to record notification status. Have a worker that records (in the db) the last tweet id it's processed, and just regularly joins N new tweets to authors to subscribers and attempts to send. You can play with how that query works to limit total work done in a chunk rather than N new tweets (in case of someone with many subscribers), but the idea is straightforward.
You can easily batch writes into neat transactions: make a queue, have your POST handler write (row, callback) onto the queue and await the callback. Have the queue reader grab a chunk, push it to the db in batch, and execute all of the callbacks on commit. The callbacks return a 200 to the client, or 500 if the commit failed. This can all happen fast enough to be done in "real time" (however fast you want your queue worker to flush batches).
You can do all of this in a couple dozen lines of code with something like Scala/ZIO.
Computationally, it is totally doable. The biggest constraint is the cost of storage.
I've worked in a social network with feeds. Feeds are hard. Both fan-out on write vs fan-in on read both have upsides and downsides. Either way, it's "cheap" problem to solve in a couple of dozen lines of code.
When I said that you can't batch, I meant that each tweet will be at least one transaction. Async write with batching like you suggested will have a horrible user-experience, also your client often is a mobile device or browser - how do you deliver a callback from server there?
Nah, how many inserts your laptop can make and how many tweets created in a unit of time are two irrelevant metrics.
> how do you deliver a callback from server there?
You don't. The callback is on the server before it ever responds to the request. The client sees a synchronous response with a delay of a few extra milliseconds.
Here, I slapped this together to demonstrate the technique[0]
Which processes up to 8k requests at a time, waiting up to 10 ms for a batch.
The worker does a bulk db insert, and completes the promises with the generated ids.
Similar techniques should work on read batching, but I haven't tried that. You can also speed that up some more with the COPY protocol, but IIRC you need to be more careful about escaping/SQL injection. The example I wrote uses prepared statements/parameter binding.
On my 6 year old mid-range desktop (this CPU[1] and this disk[2]) this program can process ~30k `create`s per second. For about $1500, I could buy a new computer with a Ryzen 9 7950 with 4x the core count/8x the thread count and 2x the single-threaded performance, so around ~10x more processing power, 128 GB of RAM, and a Samsung 980 Pro SSD, which can do 1M Write IOPS (25x more than my SSD) or 5GB/s sequential writes (10x more). So a $1500 computer with a single disk should be able to do around 300k/s. PCIe gen 5 is now coming out, which will allow for another doubling of disk performance.
128GB of RAM means you can keep at least 100M rows worth of index in memory. It's not that expensive (under $10k) to build a server with 1TB of RAM.
Totally feasible for a hobbyist to do without tons of tricky optimization (the code I posted is purely functional Scala!); people spend $20k on a jetski or $80k on a truck. Like I said, the most expensive part is going to be the storage, but you could do something like only store the most recent 1000 tweets per person, and charge $10 to bump that up to the most recent 10 million tweets or something. You'd come out at a substantial profit with that model if you got a few thousand takers. Similarly you could charge to let someone follow more than a few thousand people so you could pay for a read replica or two.
I'd assume it's not so much the peaks that are a challenge - most of these 10k tweets/second aren't critical to serve to anyone fast, and that scales horizontally- it's the hot spots, that one tweet thread in the spotlight right now that everybody wants to read and jump on - that doesn't scale by just adding more servers
An actual takeover will almost certainly be a virtual DDoS on Mastodon.