Hacker News
by burnerburnson 846 days ago
> errors and unsafe acts will not be punished if the error was unintentional.

No sane organization would ever implement this. If someone repeatedly makes mistakes, they're going to get fired even if the mistakes are unintentional. Anything else is going to cause more safety issues in the long term as inadequate employees are allowed to proliferate.

12 comments

This is just blameless post-mortems, and many, many, many places implement this.

There is always going to be some level of "inadequate" employees, and also perfectly adequate employees who sometimes make mistakes, in any organization. If your organization requires that no employee ever makes a mistake in order to operate safely, then you have serious problems.

The purpose of a statement like that is that you don't just have a post-mortem that is like: "Our company went off the internet because an employee had a typo in a host name. We fired the employee and the problem is solved." When in reality the problem is that you had a system that allowed a typo to go all the way into production.
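
To make that concrete, here is a minimal sketch of the kind of pre-deploy check that would stop a mistyped host name before it reaches production. Everything named here is an assumption for illustration: `KNOWN_HOSTS` is a made-up inventory and `validate_hosts` is a hypothetical helper. A real pipeline would more likely check against DNS or an inventory service.

```python
import difflib
import re

# Hypothetical inventory of hosts that are allowed to appear in a config.
KNOWN_HOSTS = {"api.example.com", "db.example.com", "cache.example.com"}

# Loose syntactic check: dot-separated labels of letters, digits and hyphens.
HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)"
    r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?"
    r"(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*$"
)


def validate_hosts(hosts, known=KNOWN_HOSTS):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for host in hosts:
        name = host.strip().lower()
        if not HOSTNAME_RE.match(name):
            problems.append(f"{host!r}: not a syntactically valid hostname")
        elif name not in known:
            # Suggest the closest known host so the typo is easy to spot.
            guess = difflib.get_close_matches(name, sorted(known), n=1)
            hint = f" (did you mean {guess[0]!r}?)" if guess else ""
            problems.append(f"{host!r}: unknown host{hint}")
    return problems
```

Run as a required step before deploy, a non-empty result blocks the change, so the post-mortem action item becomes "add the check" rather than "fire the person".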

It's like that story of the pilot who, after his refueling technician almost caused a crash by using the wrong fuel, insisted that he always have that technician because they'd never make that mistake again.
That was the late, and definitely great, R.A. "Bob" Hoover. I am proud to have shared a beer with him at Oshkosh. His Shrike Commander was misfueled with jet fuel instead of avgas because it was mistaken for the larger turboprop model. Rather than blaming the individual refueler, he recognized that there was a systemic problem and developed an engineering solution. He proposed, and the industry adopted, a mutually incompatible standard of fuel nozzles/receptacles for jet fuel and avgas as a result. You can find some great YouTube material on him, or the film "Flying the Feathered Edge".

https://sierrahotel.net/blogs/news/a-life-lesson

https://en.wikipedia.org/wiki/Bob_Hoover#Hoover_nozzle_and_H...

https://www.imdb.com/title/tt2334694/

Here's an old timey video of Bob in his prime. At 8:55 he flies a barrel roll with one hand while pouring himself a glass of iced tea with the other. The hardest part was pouring the tea backhanded so the camera had a good view. Then he finishes with his trademark no-engine loop, roll, and landing.

https://www.youtube.com/watch?v=PT1kVmqmvHU&t=510s

The question is what do you do with the technician after the 2nd mistake. That is to say, when does this logic break down?
That's not really the question:

Punishment culture assumes people naturally do bad, lazy things unless they are deterred by punishment and fear. Therefore we must punish mistakes.

That perspective has long been debunked. You don't see competent, skilled leaders using it. It turns out that generally people want to do well (just like you do), and they don't when they are scared / activated (in fight/flight/freeze mode), poorly trained, poorly supported, or poorly led. They excel when they feel safe and supported.

If you are the manager and the technician makes the same mistake the 2nd or 3rd time, you will find the problem the next morning in your bathroom mirror. :) At best, you have put them in a position to fail without the proper training or support. Leadership might also be an issue.

I would say that every skilled leader must use punishments and consequences to some degree.

If your tech gets drunk every day and doesn't do their job, you need to cut them loose. This isn't a management problem.

Sometimes people end up in positions where they are not suited and will continue to fail. If you hired a plumber and you need a doctor, that isn't an on-the-job training, support, or leadership issue.

> you need to cut them loose. This isn't a management problem.

That is 100% a management problem.

> Sometimes people end up in positions

I wonder how they got in those positions? That sounds like a management problem too.

If you implemented some changes so the mistake is caught before disastrous consequences, you're already doing better. Well enough to let the 2nd one slide. Even the 3rd. After that, action seems reasonable. It's no longer a mistake, it's a pattern of faulty behavior.
That is a big IF. At some point it comes down to the error type, and whether it is a reasonable/honest mistake.

The situation is very different if the fuel cans are hard to distinguish vs if the tech is lazy and falsifying their checklist.

Underlying any safety culture is one of integrity. No safety culture can tolerate a culture of apathy and indifference.

I expect there's precisely 1 safety culture that can tolerate a culture of apathy and indifference -- one in which no work is ever completed (without infinite headcount).

You apply risk mitigation and work verification to resolve safety issues.

Then you recursively repeat that to account for ineffective performance of the previous level of verification.

Ergo, end productivity per employee is directly proportional to integrity, as it allows you to relax that inefficient infinite (re-)verification.
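
A toy model of that argument, as a sketch only. The assumptions are mine, not the comment's: each task carries an error with probability `p_error`, each verification layer independently misses an existing error with probability `p_miss`, and every layer costs one unit of work on top of the task itself. Under those assumptions the relationship is monotonic rather than strictly proportional.

```python
def layers_needed(p_error, p_miss, target):
    """Number of verification layers until the chance of an undetected
    error, p_error * p_miss**n, is at most target."""
    if not (0 <= p_miss < 1):
        raise ValueError("p_miss must be in [0, 1) or layers never converge")
    if not (0 <= p_error <= 1) or target <= 0:
        raise ValueError("p_error must be in [0, 1] and target must be > 0")
    layers = 0
    residual = p_error
    while residual > target:
        residual *= p_miss  # each layer catches a share of what is left
        layers += 1
    return layers


def output_per_unit_work(p_error, p_miss, target):
    """Tasks completed per unit of work: 1 task costs 1 + layers units."""
    return 1 / (1 + layers_needed(p_error, p_miss, target))
```

In this model, a team whose checkers miss half of all errors needs more layers, and so gets less done per person, than a team whose checkers miss one in ten. That is the "integrity lets you relax re-verification" point in numbers.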

Redesign the system again if it's unintentional. It is almost impossible to control humans to the degree that they never make mistakes. It's far better to design a system in which mistakes are categorically impossible.
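
In software terms, the Hoover-nozzle idea looks roughly like the sketch below: the interface itself refuses the wrong combination, so nobody has to remember to check. All names here (`Fuel`, `Nozzle`, `Tank`, `NozzleMismatch`) are hypothetical, and in Python the refusal happens at run time, not at build time, so this is an illustration of the principle and not a guarantee.

```python
from dataclasses import dataclass
from enum import Enum


class Fuel(Enum):
    AVGAS = "avgas"
    JET_A = "jet-a"


class NozzleMismatch(Exception):
    """Raised when a nozzle does not fit the tank's receptacle."""


@dataclass(frozen=True)
class Nozzle:
    fuel: Fuel


@dataclass
class Tank:
    accepts: Fuel
    litres: float = 0.0

    def fill(self, nozzle: Nozzle, litres: float) -> None:
        # The "shape" check: a mismatched nozzle simply does not connect.
        if nozzle.fuel is not self.accepts:
            raise NozzleMismatch(
                f"{nozzle.fuel.value} nozzle does not fit "
                f"{self.accepts.value} tank"
            )
        self.litres += litres
```

The only way to put fuel in a `Tank` is through `fill`, so the mistake is blocked at the point of action instead of being caught (or not) by a later inspection.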
I'm trying to push back on the knee jerk sentiment that there are no bad employees, only bad systems.

There are no systems that are human proof, and what kind of human behavior is tolerated is a characteristic of the system.

In fact, there are humans that lie, cheat, are apathetic, and incompetent. Part of a good system is to not only mitigate, but actively weed these people out.

For example, if someone falsifies the inspection checklist for your plane, you don't just give them a PIP.

> I'm trying to push back on the knee jerk sentiment that there are no bad employees, only bad systems.

Why is it important to you?

Falsifying the inspection checklist is not an honest mistake.
Yes, there are obviously bad employees, but the line for an actually incompetent/malicious employee is a lot further away than most people understand.

A lot of bad management is hand-waved as crappy employees (by management - shocking!)

I think that this anecdote [0] is appropriate for showing the glaring disconnects that can exist in the human<-->system symbiosis.

[0]: https://www.controlinmotion.com/news/news-archive/a-little-h...

It's seemingly simple "oh the technician keeps messing up"

Did the technician mess up (sometimes true), or were they doing their job in good faith - was it the system/protocol/organization that made the task mistake prone? Did someone else actually mess up but the situation made it look like it's the technician's fault? Does this technician do a task/service that is failure prone? Are there other technicians on other tasks that are far less failure prone? Here the former technician would seem poor, the latter, excellent, but it's a function of the task/role and not the person.

I've been "the technician" - I catch a lot of blame because people know I'm anti-blame culture, so I'd rather take the blame on myself than point my finger at the next guy in line. I'm also willing to take on high risk tasks for the greater good even if they suck and are blame prone / risky. I believe in team culture in this way. If the organization doesn't respect that belief and throws me under the bus, I leave - which is quite punishing for them since they remain completely unaware of a major internal problem. If an organization "sees me" and my philosophy, then together we get very very good at optimizing the system to minimize the likelihood of failure / mistakes.

Well certainly not after the first time at least

Imo it's a function of time, company and team culture, severity, and role guidelines.

If an employee makes a mistake but followed process, and no process change occurred, that's just acknowledging the cost of doing business imo, and would be tolerated an unbounded number of times so long as it's good faith from the employee.

My point is that good faith and sufficient competence are crucial. If the employee didn't care if the plane crashed, they are a bad fit.

If they can't read the refueling checklist, they are a bad fit.

Ideally you have system controls to screen and weed these people out too.

> a function of ... severity

Not severity; that sort of thinking is actually part of low-safety cultures. A highly safe culture requires the insight that people don't behave differently based on outcome. In fact, most people can't assess the severity of their work (this is by design; for example someone with access to the full picture makes the decisions so that technicians don't have to). So they couldn't behave differently even if they did somehow make better decisions when it matters.

But, and I'll reiterate the point for emphasis, people make all their decisions using the same brain. It is like bugs; any code can be buggy. Code doesn't get less buggy because it is important code. It gets less buggy because it is tested, formally verified, battle scarred, well specified and doesn't change often.

Would s/severity/impact/g also be counterproductive of safety culture? Genuinely trying to learn here, gotta be responsible/accountable and all.

Maybe impact relative to carelessness/aloof-ity?

I agree that an engineer/person will not behave differently based on outcomes, but if they know in advance something can have a wide, destructive blast radius if some procedure is not followed, I feel there's a bit more culpability on the part of the engineer. Regardless, I don't think I have a sufficient grasp on this concept I'm trying to define, so definitely agreed I shouldn't have included 'severity' in the function definition, nor any alternative candidate.

You take him into a boolean tree within a and with another employee for quality and put him on an improvement plan?
Maybe. Or maybe you turn them over to the authorities because the 2nd time their lazy and reckless disregard killed several people.
Exactly. https://asteriskmag.com/issues/05/why-you-ve-never-been-in-a... is a great article illustrating this in the airline industry itself.
> When in reality the problem is that you had a system that allowed a typo to go all the way into production.

That's a typical root cause, and is exactly what should come out of good post-mortems.

But human nature is human nature...

Just culture doesn't prevent you from firing someone who makes repeated mistakes.

In fact, Just Culture in itself provides the justification for this. As the next line says, "However, those who act recklessly or take deliberate and unjustifiable risks will still be subject to disciplinary action". A person who repeatedly makes mistakes is an unjustifiable risk.

When a punishment is applied with more deliberation, it can also be more severe.
Why is severity desirable? Or if it's not desirable, so what?
Severity is desirable iff it's justified. I wouldn't ever sign off on a policy that says "you'll be fired for a single mistake" (that would be a severity of punishment out of proportion to the risk/underperformance).

But a policy that never provided for the possibility of termination (insufficient maximum severity) is also not desirable.

> Severity is desirable iff it's justified.

It's necessary if it's (necessary & efficient & justified); it's never desirable IMHO.

Doing severe things because they are justified is just acting out on a desire or drive - internal anger - but now we can 'justify' the target and feel ok about it. Lynch mobs think they are justified.

Designing severe things to be included as part of a process is a desirable property of that system if the severe thing is sometimes required.

No one is designing a formal system that includes lynch mobs. But a formal system of repercussions for employee behavior that does not include firing is an incomplete system.

It’s not that firing itself is ever desirable, but rather that its inclusion in a disciplinary progression is desirable.

You can really dumb it down to: why didn’t you follow the checklist? If someone makes the same mistake after being corrected three times, and the proper procedures exist for the worker to follow, then the safety culture provides the structure and justification for their dismissal.
No, you really need to smarten it up, and start off by making sure that your checklist is correct. Is it the correct checklist for the airplane model that you are building? Are all the right items on the checklist? Are they being done in the correct order? Do you have the correct validation/verification steps in your checklist? Does your checklist include all the parts that will need to be replaced? If the mechanic finds a quality issue while working the checklist and a job needs to be re-done, which checklists then need to be re-done? What other jobs are impacted by the rework?

All indications here (from the NTSB prelim and the widely reported whistleblower account) are that during rework for a minor manufacturing discrepancy, the mechanics on the shop floor followed bad manufacturing planning / engineering instructions to the letter. Then the ball was dropped in error handling when the engineering instructions did not match the airplane configuration, because Boeing was using two different systems of record for error handling that did not communicate with each other except through manual coordination.

That's not the fault of the front-line assembly worker not following a checklist.

I agree with you. If the systems/procedures/checklists are bad it is not the fault of a front line worker.

I thought I was replying more to a parent comment addressing the inability to let go of people who repeatedly make mistakes, which is acceptable unless they are not following procedures.

That's quite a leap from "unintentional" to "repeatedly."
Not at all: Systemic problems will result in repeated errors until the system is changed.
Ideally, as a result of the post-mortem, the same mistake shouldn't even be repeatable, because mechanisms should be introduced to prevent it.

And if someone keeps making new original mistakes, revealing vulnerabilities in your processes, I would say that is a very valuable employee, a lucky pen-tester of sorts.

I once destroyed $10k worth of aerospace equipment. I admitted it immediately and my only reprimand was that my boss asked me if I learned my lesson. (I did)
Once destroyed an industrial manufacturing site with an unfinished robot program that ran because I allowed myself to be distracted mid alterations.
And what happened?
Who do you think came up with this rule, bleeding heart liberals? Stop and think for a second: why does that rule exist?

You described a fantasy world. In the real world everyone makes mistakes, and if the mistakes are punished, then there are no mistakes because no one reports them. That is, until the mistake is so catastrophic it cannot be covered up. That’s how you get Chernobyl or Boeing Max.

Boeing max (if you mean the crashes caused by MCAS) wasn't due to a "mistake" not being reported, it was deliberate and intentional on the part of company management. The system was designed badly and without redundancy, and without any information available to the pilots about its very existence, specifically because management wanted it that way. It wasn't caused by some kind of accident.
Every sane organization implements this. Failure to do so leads to fear of reporting mistakes, and you get Boeing. This isn't news.
If it's possible for an employee to unintentionally make the same mistake twice, that's purely management's failure. It's impossible to make systems completely fool proof, but once you know of a specific deficiency in your process you fix it. If you've corrected the issue, it should take deliberate effort for someone to do it again. An organization that knows its processes are deficient but makes no changes and expects a different result is insane.
I think the wording is clumsy, but this is analogous to no-blame processes. The wording is just accounting for the possibility of wantonly malicious or recklessly negligent work quality. Think someone either sabotaging the product, or showing up to work very high or drunk.
This.

A mistake like "accidentally turning the machine off when it shouldn't be" is a fixable problem.

If someone has an attitude like "fuck the checklist, I know better", it is not really a mistake, and that person should be rightfully fired or at least moved to a position where they cannot do any harm.

Wowwww never become a manager please.
Furthering the insinuation that everyone has the right to work every job. Sometimes people suck at their job.
As your sibling comments mentioned, there's a difference between giving a chance for someone to learn from a single mistake without punishment, and allowing them to make the same mistake twice without taking matters out of their hands after.

If it's a really critical role, the training will have realistic enough simulation for them to make countless mistakes before they leave the training environment. Then you can assess their level of risk safely.

This whole thread is missing the fact that the NTSB had a theory that transparency leads to safer airplanes, they tried it, and it works. People hesitate to self-report when it comes with punishment (fines, demotions, or just loss of face among peers). You need a formal “safe space” where early reporting is rewarded and late reporting is discouraged.

Safety is a lot about trust, and there is more than one kind of trust. At a minimum: are you capable of doing this thing I need you to do? Will you do this thing I need you to do?

It's not just the NTSB, it's part of things like the Toyota Production System. There's ample evidence to show both that punishment discourages safety and that lack of punishment encourages safety, across multiple industries.
Yes, this is cross-industry best practice.

Goodhart's law also applies. In the case of the E door bolts, Spirit intentionally bypassed safety controls to meet performance metrics.

The Mars Climate Orbiter is another example. While unit conversion was the scapegoat, the real cause of the crash is that when people noticed that there was a problem they were dismissed.

The Andon cord from the Toyota Production System wasn't present due to culture problems.

Same thing with impact scores in software reducing quality and customer value.

If you intentionally or through metrics incentivize cutting corners, it will be at the cost of quality and safety.

I am glad they called out the culture problem here. This is not something that is fixable under more controls, it requires cultural changes.

> The Mars Climate Orbiter is another example. While unit conversion was the scapegoat, the real cause of the crash is that when people noticed that there was a problem they were dismissed.

Challenger too. Multiple engineers warned them about the O-rings. They weren't just ignored, but were openly mocked by the NASA leadership. (https://allthatsinteresting.com/space-shuttle-challenger-dis...)

A decade later a senior engineer at NASA warned about a piece of foam striking Space Shuttle Columbia and requested they use existing military satellites to check for damage. She was ignored by NASA leadership, and following (coincidentally) a report by Boeing concluding nothing was wrong, another 7 people were killed by a piss-poor safety culture. (https://abcnews.go.com/Technology/story?id=97600&page=1)

But but but what about my intuition and gotcha questions about how this could never work in practice?
The dirty secret of why traffic circles reduce accidents? Stoplights feel safer than they actually are, while circles feel more dangerous than they actually are. That nervousness becomes vigilance, which reduces accidents. It’s also why people intuitively hate them. They’re actually right, but also wrong.

Feeling safe is an illusion that governments try to maintain for their people. It’s one of their biggest jobs. But the illusion has dimensions and it’s hard to keep several going at once.

I think there is more nuance to it than that. Not everything is a mistake, not every mistake is recoverable, and not all skills are trainable.

The fundamental goal is to distinguish between recoverable errors and those that are indicative of poor employee-role fit.

Mistakes aren't the problem, as they will always happen.

The point is to build a culture where you value teamwork and adjust and learn from failures.

This isn't an individual team problem, this is an organization problem.

It is impossible to hire infallible, all knowing employees.

But it is quite possible to enable communication and to learn from past mistakes.

When you silence employees due to a fear of retribution bad things happen.

People need to feel safe with calling out the systemic problems that led to a failure. If that ends up being the wrong mixture of skills on a team or bad communication within a team that is different.

Everything in this report was a mistake, and not due to gross incompetence from a single person.

The E door bolts, as an example, were directly attributed to metrics that punished people if they didn't bypass review. The delivery timelines and defect rates were what management placed value on over quality and safety.

Consider the prisoner's dilemma, which is resolved by communication, not choosing a better partner.

I don't disagree with what you said about this instance, but I'm trying to push back on the knee-jerk sentiment that there are no bad employees, only bad systems. There are both. Cultures that are too permissive of bad actors degrade the system.

Part of maintaining quality culture is maintaining red lines around integrity.

Like I said above, not all errors are recoverable or honest mistakes.

I work in medicine and a classic example would be falsifying data. That should always be a red line, not a learning opportunity. You can add QA and systemic controls, but without integrity, they are meaningless. I have seen places with a culture of indifference, where QA is checked out and doesn't do their job either.

> I work in medicine and a classic example would be falsifying data

Certainly nobody has ever thought about that before. In fact, there definitely isn't a second sentence in the definition of aviation's just culture that is being completely ignored in favour of weird devil's advocacy.

> 4) Just Culture- errors and unsafe acts will not be punished if the error was unintentional. However, those who act recklessly or take deliberate and unjustifiable risks will still be subject to disciplinary action.

Oh wait.