by scarier 1650 days ago
I mean, that's all well and good, but assuming that your pilots will correctly diagnose and correct a failure of a new system that may or may not have similar symptoms to existing emergencies is still a really poor practice. If MCAS had been designed to command a continuous forward pitch moment until the AOA excursion had resolved, it would have very accurately resembled a pitch trim runaway following an AOA probe failure. As it was, it clearly didn't have a strong enough resemblance.

Safely operating a poorly designed aircraft can be done, but it starts with explaining the deficiency in exhaustive detail in a bold typeface in a prominent place in the operating manual, with clear warnings to avoid certain flight regions, and a well-documented emergency procedure. For example:

MCAS FAILURE: If you experience repeated momentary uncommanded nose-down pitch excursions. 1. Conduct RUNAWAY PITCH TRIM procedure. 2. Do not reset pitch trim circuit.

Unfortunately that would have required new training.


Runaway stab trim is very easy to diagnose. The airplane pitches, the two big wheels on either side of the console start spinning, and there's a loud clacking sound.

The MCAS failure exhibited as runaway trim.

> it clearly didn't have a strong enough resemblance.

I disagree. The trim randomly and repeatedly coming on and driving the pitch down is runaway trim.

> a well-documented emergency procedure

Like this one distributed to ALL 737MAX crews:

Boeing Emergency Airworthiness Directive

"Initially, higher control forces may be needed to overcome any stabilizer nose down trim already applied. Electric stabilizer trim can be used to neutralize control column pitch forces before moving the STAB TRIM CUTOUT switches to CUTOUT. Manual stabilizer trim can be used before and after the STAB TRIM CUTOUT switches are moved to CUTOUT."

https://theaircurrent.com/wp-content/uploads/2018/11/B737-MA...

> Like this one distributed to ALL 737MAX crews:

...after their poorly designed system caused a plane crash.

If you have a 737 type rating and can speak to type-specific training standards I would love to be educated, but the defining characteristic of runaway trim failures that I've experienced is that the trim keeps going in one continuous motion until it can't go any further or you manually shut off the system. A momentary, uncommanded attitude change would initially make me want to troubleshoot the autopilot, rather than the trim system. This is exactly why you have to describe new systems and their failure modes in detail, even if they have elements in common with and use the same emergency procedures as existing failures.

I do not have a 737 type rating. But I did work on the 757 stabilizer trim system and gearbox design for 3 years. I know what runaway stabilizer trim is, and have been through the failure mode analysis on the 757 trim system.

> the defining characteristic of runaway trim failures that I've experienced is that the trim keeps going in one continuous motion until it can't go any further or you manually shut off the system

Runaway trim is the trim coming on when it isn't supposed to. It could be continuous, it could go in fits and starts, it could come on randomly. The corrective action is the same - turn it off.

This is why the trim cutoff switch is prominently there on the console within easy reach.

Waiting until it can't go further, i.e. it runs into the stops, is just not a good idea as by then the airplane may be in an extreme pitch position which may not be recoverable.

> ...after their poorly designed system caused a plane crash

The LA crew never turned off the trim system, despite restoring normal trim with the electric thumb switches 25 times.

The previous LA flight experienced the same MCAS malfunction, and after restoring trim a couple times, turned off the stab trim system. They then proceeded to their destination and landed normally. They did not know about MCAS, but they did know that runaway trim is dealt with by turning it off, which is a memory item.

The MCAS system was poorly designed. But a contributing factor to the crash was the pilots not following proper procedure in response to runaway stab trim.

I am not a pilot, so take the following as you will:

1. if I suspected an autopilot malfunction, I would turn it off and fly manually and let the maintenance people figure it out.

2. if I experienced runaway trim, I would turn off the trim and fly without it as much as possible, again letting the maintenance people debug it.

In general, I am not going to debug a flight critical system at 30,000 feet that is malfunctioning if I can fly safely without it.

I agree that Boeing made a big mistake in not disclosing the existence of MCAS and how it operated.

In all seriousness, I would enjoy talking about commercial aircraft trim system failure modes over a beer sometime.

For what it's worth, while I agree with your technical definition of a trim runaway, every time I've seen it in the sim or real life it's been a single, continuous event moving from steady-state flight trim to an extreme. I'd be willing to bet a few beers that this is what most pilots are trained to expect from a trim runaway, and what B737 crew see in the sim while getting type rated. I'm not disagreeing that if the LA crew diagnosed it as a trim failure and performed the EP correctly they would likely still be alive, and I'm also not arguing that they were an exceptionally good or even average crew.

I'm arguing that you can't really fault a below-average-but-still-acceptably-competent crew for not diagnosing the failure of a system they couldn't have reasonably been aware of as a trim problem on an otherwise perfectly functioning aircraft. There are plenty of atypical emergencies that require the crew to "do some pilot shit" to get the plane back safely on deck, but an easily foreseeable single-sensor failure shouldn't be one.

We'll probably just have to agree to disagree about how likely an average crew would be to treat this as a trim failure, but I like to think we can still agree that the likelihood was unacceptably low for commercial aviation safety standards.

I don't mind at all having a friendly disagreement. No problem!

I can't really imagine erratic operation not being considered a failure. After all, if you're coming in to land you wouldn't want the stab trim coming on uncommanded even for a second. As far as the 757 Flight Controls group was concerned, an intermittent failure in the trim system was unquestioned cause for immediately disabling it.

Two independent computers controlled the automatic stab trim. They were custom computers, designed by two groups that weren't allowed to talk to each other. They used different CPUs, different algorithms, and different programming languages. The computed commands were run through a comparator. If they differed, both computers were instantly electrically isolated from the trim system.
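To make the comparator idea concrete, here is a rough Python sketch. It is not the 757 design: the class name, the tolerance value, and the two channel functions are all hypothetical, and the real system did this in hardware with dissimilar computers. It only shows the logic described above: two independently computed commands are compared, and on disagreement both are cut off and stay cut off.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed allowable disagreement between channels (arbitrary units).
# This number is made up for illustration.
TOLERANCE = 0.05


@dataclass
class TrimComparator:
    """Hypothetical two-channel comparator for automatic trim commands."""

    channel_a: Callable[[float], float]  # first independently built computer
    channel_b: Callable[[float], float]  # second independently built computer
    isolated: bool = False  # latched once the channels disagree

    def command(self, sensor_input: float) -> Optional[float]:
        """Return a trim command, or None if automatic trim is isolated."""
        if self.isolated:
            return None
        a = self.channel_a(sensor_input)
        b = self.channel_b(sensor_input)
        if abs(a - b) > TOLERANCE:
            # The comparator cannot tell which channel is wrong,
            # so both are disconnected from the trim system.
            self.isolated = True
            return None
        return a
```

Note that the isolation is latched: once the channels have disagreed, the sketch never reconnects them, even if later outputs happen to agree.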

How Boeing evolved from that ethos to relying on a single sensor, I cannot understand.

BTW, these ideas have trickled into my approach to writing software, often engendering spirited debate with me against the world :-)

I claim credit for the term "defensive programming". It was the title of a talk I gave long ago. I'd never seen the term applied to programming before, and have seen it often since. Unfortunately, I have since lost the contents of my talk. I don't even remember which conference it was at, there have been so many.

I work on the avionics side of the industry and really enjoy when I run into your posts in discussions. You explain things to people not in the industry much more eloquently than I could.

While I'm not in my company's fly-by-wire group currently, I have been in the past.

> Two independent computers controlled the automatic stab trim. They were custom computers, designed by two groups that weren't allowed to talk to each other. They used different CPUs, different algorithms, and different programming languages. The computed commands were run through a comparator. If they differed, both computers were instantly electrically isolated from the trim system.

Current thinking in fly-by-wire software is a little different. Studies have shown that nearly all software issues at this level are due to a misinterpretation of requirements. These misinterpretations were shared between the different software teams, leading to the two different units outputting identical (though incorrect) commands, which would pass through the comparators. So in essence you're doubling your development cost for no actual safety benefit. I can see if I can dig up those studies if you'd like. It will take a while, though, since almost everyone at my company is already on vacation for the year.

I'm simplifying what follows a little as I'm not sure how in depth I can get on our hardware design. What we do now is essentially run the same fly-by-wire software over multiple computers. These computers must have a mix of CPUs, including differing endianness. If a single computer miscompares, the comparator turns that computer off. If more end up failing, the system falls back to a much simpler failsafe mode without a CPU in the loop, where the flight controls in the cockpit are interpreted directly by the electronics that drive the actuators.
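As a very rough illustration of that scheme (not our actual design), here is a Python sketch. Everything in it is an assumption made for the example: the class name, the tolerance, the use of a median as the reference value, and the rule that fewer than two healthy lanes triggers the fallback. The direct mode is modeled as passing the pilot input straight through, standing in for the CPU-free electronics path.

```python
from statistics import median

TOLERANCE = 0.05  # assumed miscompare threshold, arbitrary units
MIN_HEALTHY = 2   # assumed minimum number of agreeing lanes


class LaneMonitor:
    """Hypothetical monitor over several computers running the same software."""

    def __init__(self, lanes):
        self.lanes = dict(lanes)       # lane name -> function(stick) -> command
        self.active = set(self.lanes)  # lanes still trusted
        self.direct_mode = False       # latched fallback with no CPU in the loop

    def step(self, stick):
        if not self.direct_mode:
            outputs = {name: self.lanes[name](stick) for name in self.active}
            reference = median(outputs.values())
            # Turn off any lane that miscompares against the reference.
            for name, value in outputs.items():
                if abs(value - reference) > TOLERANCE:
                    self.active.discard(name)
            if len(self.active) >= MIN_HEALTHY:
                return median(outputs[name] for name in self.active)
            # Too few healthy lanes remain: fall back permanently.
            self.direct_mode = True
        # Direct mode: the pilot input drives the actuators unmodified.
        return stick
```

With three lanes, one bad lane is voted out and the other two carry on. If a second lane then miscompares, the remaining pair can no longer agree and the sketch drops to direct mode.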

Heh, I think I did a poor job of expressing myself. You're absolutely right that an intermittent/erratic trim operation is a failure, and potentially a serious one. I'm thankful that you, as a flight control engineer, are as concerned about it as you are (for hopefully obvious reasons). Out of curiosity, how probabilistic is your failure mode analysis? I'm wondering what the relative likelihood of an intermittent uncommanded actuation is compared to the neutral-to-extreme runaway failure that most pilots expect. I've never been in a sim where the operator console had trim failure options other than "stuck at current position" or "runaway to extreme limits," but I wouldn't be surprised if they should've included other failure modes.

Yeah, that seems like an eminently reasonable way to design a trim system. I don't think the MCAS concept is fundamentally unsound, but it blows my mind that they didn't design it with that kind of mindset.

Oh man, I wish more software was built to the standards of aerospace control system best practices...