Small Slack teams are easily horizontally scalable; for a small team, the web server, the app server, and the database could probably all run on a single EC2 instance, and AWS offers some rather large instance sizes.

Let's start there, though. 70k stand-alone (paid!) Slack teams means 70k stand-alone systems. How do you operate, well, all of them, simultaneously? With one mammoth central database, there's one database to upgrade; if it goes down, there's one database to fix. With 70k small databases there are 70,000 problems! With 70,000 systems, how do your engineers deploy code, and how many times per day can they do it (it had better be well into the double digits)? How do you roll them back? What do you do if an upgrade goes wrong? With 70k different apps, one small problem quickly becomes 70k small problems, which is harder to manage than one. Some things can be (and I'm sure are) scaled horizontally, but the isolation that grants you does not come for free.

And then, what about past that? Looking at the customers listed on Slack.com, they serve some larger enterprises, who are going to need the "expensive" level of scaling. No database is going to scale to that level without a team to manage it (no matter the technology), so then you need a queue as well as a db, plus a team to manage each of those, and then how do you do searching/indexing? You also can't ever take a single database node offline, so then it's a database cluster with hot spares. Large enterprises also operate globally, so their Slack team's system needs to run multi-region hot as well, and then and then and then? I've got Slack open all the time on both my (work) phone and my (work) laptop, as do the majority of my coworkers, which means Slack's web servers have heavier requirements than those of Lyft, which I use for a few minutes whenever I take a ride.

Slack usage will hit a lull outside of business hours, so you'd want the resources that serve it to scale up and down with that. I'll bet a not-insignificant portion of the $4M/month goes to resources that are only used during the business day, so in some sense Slack is paying AWS a premium to not pay them for unneeded resources at 3:30 AM.
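
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. Every rate in it is made up for illustration (these are not real AWS prices), and the function names are my own. It assumes a reserved instance is billed for every hour of the month whether it is used or not, while an on-demand instance is billed only for the hours it runs.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def on_demand_monthly(hourly_rate, hours_used):
    """Monthly cost when paying only for the hours the instance runs."""
    return hourly_rate * hours_used


def reserved_monthly(effective_hourly_rate):
    """Monthly cost when the instance is paid for around the clock."""
    return effective_hourly_rate * HOURS_PER_MONTH


def break_even_hours(on_demand_rate, reserved_rate):
    """Hours per month above which the reserved instance becomes cheaper."""
    return reserved_monthly(reserved_rate) / on_demand_rate


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
ON_DEMAND_RATE = 0.10     # $/hour, the "premium" price
RESERVED_RATE = 0.06      # $/hour effective, the discounted price
BUSINESS_HOURS = 10 * 22  # 10 hours a day, 22 working days a month

print(f"on-demand, business hours only: ${on_demand_monthly(ON_DEMAND_RATE, BUSINESS_HOURS):.2f}")
print(f"reserved, running all month:    ${reserved_monthly(RESERVED_RATE):.2f}")
print(f"break-even usage:               {break_even_hours(ON_DEMAND_RATE, RESERVED_RATE):.0f} hours/month")
```

With these made-up rates, capacity that only runs during the business day stays well below the break-even point, so the higher hourly price still works out cheaper than reserving it around the clock.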

Slack has optimized their app for development cost (much to my laptop's sadness), so it doesn't seem that far-fetched that Slack has also done some optimization of server-side costs. Future money isn't worth as much as money today, and this fact is reflected in AWS's Reserved Instance (RI) offerings.
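
The "future money is worth less" point is just standard discounting. As a minimal sketch, assuming a hypothetical 1% monthly discount rate and a hypothetical $100/month payment (neither is a real AWS figure, and the function names are mine), you can compare a stream of monthly payments with a single payment made today:

```python
def present_value(amount, monthly_rate, months_from_now):
    """Value today of a payment made some number of months in the future."""
    return amount / (1 + monthly_rate) ** months_from_now


def present_value_of_payments(payment, monthly_rate, n_months):
    """Value today of equal payments made at the end of each of n_months months."""
    return sum(
        present_value(payment, monthly_rate, month)
        for month in range(1, n_months + 1)
    )


# Hypothetical inputs for illustration only.
MONTHLY_PAYMENT = 100.0  # $ paid at the end of each month
MONTHLY_RATE = 0.01      # assumed discount rate per month
TERM_MONTHS = 12

nominal_total = MONTHLY_PAYMENT * TERM_MONTHS
discounted_total = present_value_of_payments(MONTHLY_PAYMENT, MONTHLY_RATE, TERM_MONTHS)

print(f"nominal total of the monthly payments: ${nominal_total:.2f}")
print(f"same payments valued today:            ${discounted_total:.2f}")
```

With any positive discount rate the discounted total comes out below the nominal total, which is why a seller can offer a lower total price for money up front, and why a buyer who values cash today might still prefer to pay as they go.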