| > cpu cycle overhead The original poster's point was CPU and network overhead. A local procedure/function call or message-send takes on the order of one to a few nanoseconds. Depending on how you organize things, an IPC is going to be in the microsecond or even millisecond range. That's a lot of orders of magnitude. It's also latency that you just aren't going to get back, no matter what extra resources you throw at it. [1][2] In the early noughties, I re-architected a very SOA/microservice-y BBC backend system as a monolith, and the rewrite was around 1000x faster. [3] In addition, in-process calls are essentially 100% reliable. Network calls, and the various processes attached to them, not so much (see [1], again). The BBC system not only became a lot faster, it also became roughly 100 times more reliable, and that's probably low-balling it a bit. It essentially didn't fail for internal reasons after we learned about Java VM parameters. And it was less code, did more, and was easier to develop for. [1] https://en.wikipedia.org/wiki/Fallacies_of_distributed_compu... [2] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.115... [3] https://link.springer.com/chapter/10.1007%2F978-1-4614-9299-... |
Like any other design decision, there's a trade-off here (see my other comments in this tree about how many 9's of reliability/latency you're targeting).
If you're working in an environment where sub-5ms latency to the 4th or 5th 9 is critical, inter-machine communication is not for your application, period.
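To make the "how many 9's" framing concrete, here's a rough sketch of the arithmetic. The function names are made up for illustration, and the fan-out calculation assumes each hop meets its latency budget independently of the others, which real networks only approximate.

```python
def nines_to_quantile(nines: int) -> float:
    """Map a count of nines to a quantile, e.g. 4 -> 0.9999 (the '4th 9')."""
    return 1.0 - 10.0 ** (-nines)


def allowed_slow_requests(nines: int, total_requests: int) -> float:
    """How many requests out of total_requests may exceed the latency budget."""
    return total_requests * 10.0 ** (-nines)


def fanout_within_budget(per_hop_quantile: float, hops: int) -> float:
    """Probability that every hop meets its budget.

    Assumption: hops are independent, so the probabilities multiply.
    """
    return per_hop_quantile ** hops


if __name__ == "__main__":
    q = nines_to_quantile(4)
    print("quantile for four nines:", q)
    print("slow requests allowed per million:", allowed_slow_requests(4, 1_000_000))
    print("ten hops, each at four nines:", fanout_within_budget(q, 10))
```

At four nines, only about 100 requests per million may miss the budget, and a request that crosses ten such hops is already down to roughly 99.9% under the independence assumption. That's why a tight target at the 4th or 5th 9 leaves so little room for network hops.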
Reliability is an orthogonal concern, and one that has improved incredibly since the early aughts. The "transport" and error-handling layers of open-source RPC frameworks have improved by orders of magnitude. I'd recommend taking a long look at the experiences of companies built on gRPC.
It's much easier to build a reliable SOA-esque system today than it was even 5 years ago. It's been an area of rapid progress.