Hacker News new | ask | show | jobs
by danielheath 1946 days ago
That’s because those things are only simple when they’re working.

Each of them introduces a new, rare failure mode; because there are now many rare failure modes, you have frequent-yet-inexplicable errors.

Dead letter queues back up and run out of space; network splits still happen;

1 comments

Sounds like you're saying "Don't do distributed work" if possible (considering tradeoffs of course, but I guess people just don't even consider this option is your contention).

And secondly, if you do end up with q distributed systems, remember how many independently failing components there are because thag directly translates to complexity.

On both these counts I agree. Microservices is no silver bullet. Network partitions and failure happen almost every day where I work. But most people are not dealing with that level of problems, partly because of cloud providers.

Same kind of problems will be found on a single machine also. Like you'd need some sort of write ahead log, checkpointing, maybe optimize your kernel for faster boot up, heap size and gc rate.

All of these problems do happen, but most people don't need to think about it.

I'm not reading this as "Don't do distributed work". It's "distributed systems have nontrivial hidden costs". Sure, monoliths are often synonymous with single points of failure. In theory, distributed systems are built to mitigate this. But unfortunately, in reality, distributed systems often introduce many additional single points of failure, because building resilient systems takes extra effort, effort that oftentimes is a secondary priority to "just ship it".
If you are using a database, a server and clients, you already have a distributed system.

You'll also likely use multiple databases (caching in e. g. Redis) and a job queue for longer tasks.

You'll also probably already have multiple instances talking to the databases, as well as multiple workers processing jobs.

Pretending that the monolith is a single thing is sneakily misleading. It's already a distributed system

Indeed. So with monolith usually we already have 3-4 (or more) somewhat reliable systems, and one non-reliable system which is your monolithic app. Why add other non reliable systems if you don't really need it?

Making a system to be reliable is really really hard and take many resources, which seldom companies pursuit.

just because all the code is in one place doesn't mean its one system.