| HN Mirror

There is scaling and then there is scaling.

Sometimes you want to scale up to the box you are on and the most efficient way to do that is to use threads in a single process.

When you hit the limits of a single machine then it's time to start scaling out to multitple machines. You are right that then you have to start paying the communication/serialization costs and in return you get much greater scale.

However you don't have to start paying that cost until scale in that direction. You can get quite a lot out of multicore machine these days without having to pay serialization costs if you use threads.

Many times the serialization costs aren't worth the benefit unless you are getting a whole other box with another 32+ cores out of the deal. Paying it before you get that benefit isn't efficient engineering. And for some people choosing a language or framework that forces them to pay that cost before it's necessary is a bad idea.