by chralieboy 3947 days ago
That's really been the most enlightening part of the series for me.

Reinforces that the hardest part in engineering is rarely the technical problem. Distributed databases are really f*ing hard, but infinitely harder if the people can't work together or don't open up to faults.

1 comment

From experience, I think some of the problem is that not everyone appreciates the importance of correctness in this sort of system. At the very least you should clearly document your expected failure modes, so that it's possible to build correct things on top of it.

In part, it's hard to convince everyone of the importance of spending time on what looks like an unlikely corner case until after the outage, when it also becomes much harder to fix, as you usually need to build this into the core design of your system.