by hhas01 2426 days ago
“roll back first, ask questions later”

Better yet: build code and processes appropriate for long-lived, massively distributed systems that will be incrementally upgraded over time. If the system is architected right, it will never get into a state where a rollback recovery becomes necessary. This is why we have Content Negotiation. This is why we have Erlang. This challenge is decades old, and there is a huge body of expert knowledge and tools to draw on when implementing such systems, so any such complete, catastrophic, basic failures now are entirely down to PEBKAC, and are remedied by a swift clue-by-four with a pink slip nailed to the end.

There is a very simple principle underpinning distributed communication: servers should never make assumptions about who their clients are and what they need. Talk to the client, find out what format(s) it’s willing and able to accept, and serve it the best match. A client should never need to know, nor care, whether it’s talking to a version A server or a version B server: if the client says “I only understand version A data” then it’s the server’s job to serve up data in that exact format, not to sulk and whine about how old and out of date the client is, push it version B data instead, and then blame the client for choking on it.
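To make the "find out what it accepts, serve the best match" step concrete, here is a minimal sketch of server-side negotiation over an HTTP-style `Accept` header. The media type names (`application/vnd.example.v1+json` and so on) and the `negotiate` helper are made up for illustration, and the matching is deliberately simplified: exact types plus the `*/*` wildcard only, no partial wildcards such as `application/*`.

```python
def parse_accept(header):
    """Parse an Accept-style header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, supported):
    """Return the best format the server can produce, or None if there is none.

    `supported` is the server's list of formats, in its own order of preference.
    """
    prefs = parse_accept(accept_header)
    # Types the client explicitly refused with q=0.
    refused = {t for t, q in prefs if q <= 0}
    for media_type, q in prefs:
        if q <= 0:
            continue
        if media_type == "*/*":
            # Client takes anything: pick the server's favourite that is not refused.
            for candidate in supported:
                if candidate not in refused:
                    return candidate
        elif media_type in supported:
            return media_type
    return None  # caller should answer "406 Not Acceptable", not guess


# Hypothetical formats: the server speaks both versions and prefers v2.
SUPPORTED = [
    "application/vnd.example.v2+json",
    "application/vnd.example.v1+json",
]

old_client = negotiate("application/vnd.example.v1+json", SUPPORTED)
any_client = negotiate("*/*", SUPPORTED)
unknown_client = negotiate("text/html", SUPPORTED)
```

Under these assumptions the old client is still served version A data after the server learns version B, and a client asking for something the server cannot produce gets an explicit refusal rather than a payload it will choke on.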

Indolent developers who approach IPC the same as local messaging and then blame everything but themselves when it barfs all over the place are the absolute bane of this industry, and this shit is entirely on them. And shame on the equally inept management culture that continues to let such incompetent amateurs get away with it.


You’re asking too much.

There will be bad rollouts. I know of no set of practices that prevents bad rollouts. You talk about “indolent web developers”. Well, that’s not productive, and pointing fingers doesn’t make your software work. Your software will, in spite of your best practices, in spite of hiring the best people, in spite of experience, sometimes fall over.

Yes, it will sometimes segfault.

My software shits itself all the time. What matters is that it does so safely. And when it doesn’t, I can tell you why, because I know what corners have been cut and why, and I’m not afraid to accept and acknowledge my responsibilities in such fuckups.

And yeah, I count on the fingers of one hand the number of web developers I’ve dealt with over the last decade who I’d be willing to cross the road to piss on were they on fire, and still have fingers to spare. They’re just the worst of the worst.

There was NO excuse for the failure described in the article. There was NO excuse for the described response to that failure. Yet such base incompetence and gross irresponsibility is not only systemic but entrenched, rationalized, and embraced in this industry. With responses like yours, it’s not hard to tell why. Buncha Children.

> There was NO excuse for the failure described in the article.

In this case, right. In general, stuff happens. There’s a tradeoff between reliability and effort. The correct reliability target is not 100%, because you can’t get 100% anyway, and as you approach 100% reliability the cost increases without bound.
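As a rough illustration of why the target is never 100%, the arithmetic below shows how the downtime allowed per year shrinks tenfold with each extra "nine" of availability. The targets are illustrative, not taken from the article, and the sketch says nothing about cost itself, only about how little room for error each step leaves.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years


def allowed_downtime_minutes(availability):
    """Minutes per year a service may be down while still meeting the target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Illustrative targets: two, three, four, and five nines.
for target in (0.99, 0.999, 0.9999, 0.99999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target:.3%} available -> {minutes:.1f} minutes of downtime per year")
```

At a 100% target the budget is zero minutes, which is why every rollout, dependency, and hardware fault would have to be perfect.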

I’m not sure what the rest of your comment is about besides taking a big shit on web developers and talking about how awful they are.

Only a precious few developers are really good at making reliable systems, and they carry the burden and responsibility of spreading their knowledge. They work with the other actual developers you hire, those beautiful imperfect developers who cut corners, test in production, and don’t write tests.

You make changes to your culture and your practices. You build monitoring and rollout automation. You increase test coverage.

If you just call people children you’re going to be there, on the sidelines, watching other people build real products. You don’t teach people by making fun of them.

> I count on the fingers of one hand the number of [X], and still have fingers to spare

So many words to say, imprecisely, one to three (assuming a five-fingered hand).