by wpietri 4080 days ago
> In Node.js, it's really easy to identify process boundaries since the child_process module forces you to put code into different files and communicate via loosely coupled IPC channels.

I find that approach troubling. Process boundaries are expensive, because you have to serialize and deserialize everything each time you cross one.
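
To make that tax concrete, here is a minimal Python sketch (not Node.js). It assumes `pickle` as a stand-in for whatever wire format an IPC channel would use, and `call_in_process` and `cross_boundary` are hypothetical names made up for illustration. Inside one process the callee gets a reference to the caller's object; across a boundary the whole payload has to be encoded and decoded into a fresh copy.

```python
import pickle


def call_in_process(payload):
    # Same process: the callee receives a reference to the caller's object.
    return payload


def cross_boundary(payload):
    # Stand-in for an IPC hop: encode on one side, decode on the other.
    wire_bytes = pickle.dumps(payload)
    return pickle.loads(wire_bytes), len(wire_bytes)


payload = {"ids": list(range(1000)), "name": "example"}

same = call_in_process(payload)
received, wire_size = cross_boundary(payload)

print(same is payload)      # does the in-process call share the object?
print(received is payload)  # does the boundary crossing share it?
print(received == payload)  # are the contents still equal?
print(wire_size)            # bytes that had to be written and parsed
```

The byte count grows with the payload, and that work is repeated on every crossing, which is the cost I'm worried about.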

I also haven't used Go, but I think you can get great clarity with something like Akka's Actor model. And to do it you don't have to pay a large serialization tax until you move particular actors to other machines.

1 comments

Yes, socket/pipe-based IPC is more expensive than shared memory up to a point, but it's more scalable since you don't have to deal with locking (mutexes, semaphores) and the limits this imposes.
There is scaling and then there is scaling.

Sometimes you want to scale up to the box you are on and the most efficient way to do that is to use threads in a single process.

When you hit the limits of a single machine, then it's time to start scaling out to multiple machines. You are right that at that point you have to start paying the communication/serialization costs, and in return you get much greater scale.

However, you don't have to start paying that cost until you scale in that direction. You can get quite a lot out of a multicore machine these days without paying serialization costs if you use threads.

Many times the serialization costs aren't worth the benefit unless you are getting a whole other box with another 32+ cores out of the deal. Paying it before you get that benefit isn't efficient engineering. And for some people choosing a language or framework that forces them to pay that cost before it's necessary is a bad idea.

If you don't want to deal with locking but are willing to pay the extra memory cost, then you can just duplicate the data. A serialize/deserialize round trip is a much more expensive way to do the same thing.
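
A rough Python sketch of the two routes, assuming `copy.deepcopy` for the in-memory duplicate and `pickle` as a stand-in for a wire format; `duplicate` and `duplicate_via_wire` are hypothetical names. It only shows that both routes give a worker a private copy it can mutate without a lock. It does not measure the relative cost, which depends on the runtime and the data.

```python
import copy
import pickle

shared = {"counts": [0] * 8}


def duplicate(data):
    # Plain in-memory duplication: the worker gets its own private copy.
    return copy.deepcopy(data)


def duplicate_via_wire(data):
    # What a process boundary does implicitly: encode, then decode.
    return pickle.loads(pickle.dumps(data))


private_a = duplicate(shared)
private_b = duplicate_via_wire(shared)

# Mutating a private copy needs no lock and leaves the original untouched.
private_a["counts"][0] += 1

print(shared["counts"][0], private_a["counts"][0], private_b["counts"][0])
```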