Hacker News
by 5e92cb50239222b 1555 days ago
It's fine. Maybe it will force them to finally start paying attention to the quality of their work. If the crap I write for a living misbehaved that frequently, I'd be sweeping the streets by now (or doing some other work that's actually useful to society).
2 comments

It's OK to be frustrated, since we rely on GitHub so much, but this is unkind. Software is complex. GitHub operates at a scale few of us work at. There are people at the other end doing their best to navigate complex internal systems (both organizational and technical).

I would argue GitHub has done more for societal good than most tech ventures, by the way.

I was pretty pissed off, alright, so my comment probably gave off the wrong vibes. I'm not arguing I could do any better (I probably wouldn't get past their interview process), and they certainly do have the talent (which is obvious from their technical blog posts).

It doesn't change the fact that the company has an absolutely crap dev culture, which seems to put features first and foremost at the expense of everything else. There are products with even more complexity that don't fall over and die almost every single day. It's just not funny anymore. Facebook is pretty complex, and it has had a major issue like this one, what, once in its entire life?

I don't remember Google Search (or other Google products) ever not answering my queries, and I've been using it for about 18 years.

And so on. I reckon it's because those companies have a strong engineering culture (Google certainly does, at least), and this one doesn't.

GitHub Actions has been like this for years now. Years. Years!!!!

And the crazy thing is you see people on HN demanding that some one-person side project/SaaS has to be at 100% uptime with multiple failovers, automatic scaling, etc. etc. There is such an emphasis on scalability on HN and yet... you just brush that all away because "software is tough." Yeah, no shit. Poor Github. They are also Microsoft now. One of the wealthiest corporations in the entire world. And people are paying Github. This isn't Twitter fail whale we're talking about.

> And the crazy thing is you see people on HN demanding that some one-person side project/SaaS has to be at 100% uptime with multiple failovers, automatic scaling, etc. etc. There is such an emphasis on scalability on HN and yet... you just brush that all away because "software is tough."

I'm not one of those people. I may have been when I was much more inexperienced.

Software is hard. Full stop. Organizational politics, engineering culture, business / tech alignment are all hard. Distributed systems are hard.

> Yeah, no shit. Poor Github. They are also Microsoft now. One of the wealthiest corporations in the entire world. And people are paying Github. This isn't Twitter fail whale we're talking about.

I may have also thought this when I was much more inexperienced. This isn't a resource problem. Even for a small startup that starts having failures because growth has pushed it past its scale, it's not a money problem. Throwing money at this doesn't make it go away.

By the way, the Twitter fail whale impacted paying customers (advertisers).

That’s because GitHub Actions is Azure DevOps, or if you want to go back further, Team Foundation Server Pipelines.

People tend not to be very kind when any product they pay for goes down.

At the end of the day - our companies also have people that rely on our software working in order to do a lot of societal good.

Sure, but it’s incredibly naive to see gh having problems and go “they must not know what they are doing.”

It is probably caused by postmortem culture not being shared in the community.

"Having problems" in this world (of any kind, not only due to GitHub's scale!) is something that happens: we are not perfect, and we work on an incredible number of layers of complexity.

You only need to touch production code on a daily basis to see that it can happen to the best, even with the best observability systems and processes. The key is to avoid blame and to work out iteratively how to fix the underlying problems (faster recovery, shorter detection time, and so on).

Everybody should be refunded $0.05 for the unavailability of the service they paid for.

You should probably look for a new job then, because it's pretty difficult to get fired for underperformance as a software engineer these days. There are plenty of places you can write shit code, or if you prefer Rust, places where you can blog about other people writing shit code.

Anyway, you shouldn't fire someone for causing bugs in production, since a bug in production indicates a systemic failure of all the checks that should come before it is deployed. Even if you can trace the root cause to one person, it would be counterproductive to fire them, because now that they've made the mistake, they probably won't make it again, whereas their replacement doesn't have the same wisdom.