Hacker News new | ask | show | jobs
by justnoise 629 days ago
I'd guess that Discord's storage systems lean towards processing a lot more writes than reads. Postgres and other databases that use B-tree indexing are ideally suited for read heavy workloads. LSM based databases like Cassandra/Scylla are designed for write intensive workloads and have very good horizontal scaling properties built into the system.
2 comments

Would you actually have more writes than reads? Are messages read by fewer people than post them?
When you send a message, afaik it sends to all people looking at it at the time. So there is no read when in a conversation, and maybe the reads are batched when reading multiple.
Read traffic is much higher than write traffic due to mobile clients needing to sync chat history more often as their sessions are much shorter lived. Also search queries execute 1 query per result. And don't forget people doing GDPR data dump requests. It adds up.