by jacurtis 1100 days ago
As I have moved through my career and been in charge of larger and larger companies, I continue to be surprised by how much of what you would expect to be critical infrastructure, affecting many people, is "cobbled" together.

I've worked at some companies that manage huge systems that affect tens of millions of people and cause extremely severe problems if they break, and those systems are "cobbled" together with a surprising amount of metaphorical bubble gum and duct tape.

Every time I go somewhere new, I assume they have their shit together and I am surprised every time.

So no, it doesn't surprise me that an operation like this was cobbled together in the same fashion.

3 comments

The difference is that the cobbling together at a large org has generally been battle-tested over time, at scale. The bits that were going to break have already broken and been replaced with something better. While it may seem horrible aesthetically from a developer's perspective, it's nevertheless functional and reliable for its purpose.

But when you use that approach on a new, one-off device with life-threatening consequences for failure, sooner or later you're going to find out that "testing in production" may not always be appropriate.

Can anyone downvoting antonvs please explain why? I think it’s a valid opinion to hold and point to make, irrespective of whether you agree.

I didn’t downvote him, but I think it’s a bit dismissive of the original point and kind of falls prey to the no true Scotsman fallacy. antonvs’ point can be summarized as essentially “even though you have firsthand experience running companies with cobbled-together critical infrastructure, that’s not actually cobbled-together infrastructure.”

I don't agree with your summary. I specifically talked about "the cobbling together at a large org"; I didn't say it wasn't actually cobbled together.

The point is just that any system that's being used successfully in production at scale has already been tested, has already broken in myriad ways both in testing and in production, and those issues have been addressed somehow - quite possibly by more cobbling together.

This doesn't somehow make the system "not actually cobbled together" - as I said, the implementation may still seem horrible to engineers. It just means that the points of failure that have actually arisen, whether in testing or production, have been addressed somehow, so that the system is able to function at scale.

A key point in all this is that factors like survivor bias are at play: you're not looking at an org that failed because of their cobbled-together system, you're looking at one that succeeded. Large orgs are also more likely to have more testing to help catch the issues with their cobbled-together systems.

This all means that if you try to use this approach on your experimental submarine that you're selling tickets for, you're trading off the almost certain loss of life of some of your passengers against the short-term time and cost savings achieved by poor engineering practices.

To quote Lord Farquaad, "Some of you may die, but it's a sacrifice I am willing to make."

What antonvs is trying to say is that larger, older companies tend to have a collection of modules/components that have been relatively well tested and hardened over the years. While a system may be cobbled together with the metaphorical duct tape, at least the components are sane-ish themselves.

Look at e.g. Kubernetes. You can cobble something together on Kubernetes and, once you get it limping, the system will run surprisingly well for being cobbled together, compared to e.g. a system cobbled together from one-off bash scripts.

I think that the term 'cobbled together' does an injustice to something that's been running in production and is quite stable. It isn't a demo that's been cobbled together at short notice. The former isn't elegant because it doesn't have the clean lines of the original design, and age takes its toll.

Have to share the sentiment. I remember the expectation of great systems engineering I had when I was at the beginning of my career. You know, “These things must’ve been done well, as thousands/millions of people use these daily”. I was enlightened pretty fast during my first couple of years as a consultant.

Nowadays I expect something clumsy and mediocre, regardless of the organization, with very apparent problems and _maybe_ some bright spots somewhere in the system architecture.

Some of the cobbling is illegible shell scripts written by “my friend”.