by ceejayoz 2324 days ago
> According to the source, Boeing patched a software code error just two hours before the vehicle reentered Earth's atmosphere. Had the error not been caught, the source said, proper thrusters would not open during the reentry process, and the vehicle would have been lost.

Uh, that's extremely concerning for a CREWED capsule.

5 comments

It depends on whether the crew would have been able to control those thrusters themselves. Obviously you want the entire system to be autonomous enough that no crew interaction is required, but things do happen, and the crew needs to be able to act and fix the problem if neither the capsule nor the ground can. When the Starliner started a burn at the wrong time, a crew would have been able to stop it and prevent the loss of fuel. I wonder if this re-entry thruster issue was a result of the earlier thruster issue (or a result of the troubleshooting of it).

There are uncrewed test flights for a reason. You can't always simulate every possible failure mode. Things fail on the ground that wouldn't be possible during normal operation and vice versa.

The "crew would fix it" argument is a very, very bad one. Many spacecraft maneuvers need to be very precise in both pointing and direction, something computers are very good at and humans less so.

Also, the crew would first have to know that something is going wrong, based either on unplanned activity or on unexpected data on the flight instruments. But guess what is driving those instruments in a modern crewed space vehicle: also computers and software. That software might be faulty as well, or might even display the same wrong data the automated control software is acting upon.

In such a case the crew might not notice anything is wrong until ground radar does, by which point the craft may be on a wrong and potentially even unrecoverable trajectory.

As for the crew taking over thruster control during a reentry - sorry, if your space capsule is trying to kill you that hard, something is wrong.

At that point in time, the capsule is hurtling through the atmosphere protected only by its front ablative shield. The thrusters are used to shift the center of gravity a bit, to give the capsule some lift, offsetting some of the g forces due to the rapid deceleration. This is called "lifting reentry".

This all needs to be very, very precise and based on up-to-date sensor data, as not all of the capsule is covered by the heat shield, and if you change the center of gravity too much, you might expose unprotected parts of it to the hot plasma.

This is not really a good environment for a crew member to take over: not only are you under a couple of g's of deceleration, but any mistake will kill you all. But hey, no pressure!

BTW, the Soyuz capsule has a backup mode available in case its reentry control thrusters fail, where the capsule just follows an unguided ballistic reentry. This is much harder on the crew (due to no lift compensating for some of the deceleration), but survivable, and it has been used a couple of times during various emergencies.

I'm not disagreeing with you, I suppose, but we did do this before with a lot less sophisticated systems and a lot more manual control. It stands to reason that, given proper training, a pilot of one of these spacecraft could identify a problem and switch to manual.
A good spacecraft would allow transportation of injured or incapacitated crew, so fully automatic landing is definitely desirable.
Sure, but that's different from a simpler spacecraft that simply does not have such automated systems, affecting crew training accordingly.

Having one that has sophisticated automation which you need to constantly monitor in case it tries to kill you due to a simple programming error is not really comparable I'm afraid.

> You can't always simulate every possible failure mode. Things fail on the ground that wouldn't be possible during normal operation and vice versa.

This should be concerning, then:

https://spaceflightnow.com/2019/11/04/boeing-starliner-pad-a...

> “Boeing is not going to do an in-flight abort test,” said Jon Cowart, deputy manager of the mission management office for NASA’s commercial crew program, before the pad abort test. “They’re just going to do the ground one. They think that they can get enough data and then extrapolate that out, with good analytical techniques that we’ve endorsed. They will go and do it in that particular way, versus SpaceX, which is going to do both.”

You'd think that after two airplane disasters, they'd tread carefully.
> Finally, before the meeting ended, the chair of the safety panel, Patricia Sanders, noted yet another ongoing evaluation of Boeing. "Given the potential for systemic issues at Boeing, I would also note that NASA has decided to proceed with an organizational safety assessment with Boeing as they previously conducted with SpaceX," she said.

This is a welcome development.

In many spacecraft failure modes, the crew would be unconscious.
Wouldn’t like to be the guy pushing an update to a crewed capsule just before re-entry. I stress out enough about pushing code as it is. But then I wouldn’t like to be the guy who left an error in the code to begin with.
Doesn't that depend on the testing schedule?

If the schedule called for simulations to be run in parallel with the live test, then it's an expected outcome. It should be _expected_ that every test will find a problem. Since this was uncrewed, there was no risk (other than to the uncompleted tests possibly requiring a second flight) to running them in parallel and porting across fixes for any problems that were found.

It was a schedule compression attempt with a cost of second test flight risk if it failed.

Boeing took the "we'll do very rigorous engineering up front and prove everything on paper" approach, where SpaceX took the "we'll prove it works by actually launching it" approach (which isn't to say SpaceX isn't operating with engineering rigor).

In theory Boeing ran all their simulations over the past couple years, and this flight should have just been a formality. As it turns out, Boeing is running into a lot of issues when they actually test their hardware.

The problem with simulations and paperwork with high tech engineering is you need enough competent, independent reviewers that understand how the whole system works together.

That sort of thing is rarely organized. So we test it instead.

This is likely an oversimplification. There are multiple organizations, even within Boeing Defense & Space, that write their own flight software. All stovepiped and largely working in parallel. This doesn't even include the commercial folks, infamous for the 737 Max. My understanding is that the St. Louis teams are better regarded, and the folks that worked on DARPA HACMS deserve some credit, but they seem to be outliers. Boeing's culture doesn't seem to prioritize modern software development methods or software rigor on the whole. Functional testing should be the last layer of bug-hunting techniques, not the first or primary. The issue seen on their capsule didn't surprise me at all. Other BDS software groups use utterly outdated software development methods and we should all be a little bit worried.
Obviously it's a simplification; comparing and contrasting the approaches of SpaceX and Boeing would require several walls of text...

At the core of the issue though, the process that SpaceX pitched to NASA (and NASA approved) involved quite a bit of actual hardware testing. Boeing's plan (also approved by NASA) relied much more heavily on simulation, modelling, and other sorts of process validation. For instance, Boeing did not perform an in-flight abort.

It kind of depends on what sort of issue you're finding in your test. Complex systems like spacecraft often have unexpected interaction effects which only testing can reveal. I would call these the 'good' kind of test learnings: ones that an army of great engineers wouldn't be able to predict.

This is editorializing but it looks like Boeing didn't uncover very complicated interactions - they failed at a more basic level of competency - timer synchronization for the launch issue and then a major software bug for an important orbital maneuver. Those sorts of issues really should be sorted out on the ground using hardware simulators. Furthermore, the timer synchronization failure prevented testing of the docking hardware, further delaying the overall program.

For a human rated vehicle, personally I think you should have at least one full-up, fully nominal test before you send anyone along for the ride.

Man, I hope the QA person who found that bug was rewarded.
Makes me wonder why put all this effort to make space-travel safe for crews. Why not focus on remotely controlled or AI autonomous vehicles instead?
I think we already do — there have been roughly 15 satellites etc. launched for every human who has ever reached earth orbit over the entire history of human spaceflight.
Good. I was just wondering why this project in particular needed to have a live crew.
It’s hard enough to get an AI to work well in a known situation like driving, never mind an unknown planet.

But more importantly - where is the fun in that?

You are right, some people find it fun to go to dangerous places, I assume. Think about the crew of Star Trek and of course Buzz Lightyear. Personally I would rather send someone else to space than go there myself :-)
> Makes me wonder why put all this effort to make space-travel safe for crews

Would you want to board that capsule yourself, otherwise?

> Why not focus on remotely controlled or AI autonomous vehicles instead?

Buzzwords don't make something more reliable or less error-prone. Do you really think that throwing in something like AI, which can fail in unexpected ways, would be a good idea?

I think it's worth studying and developing further, for instance by studying how to make AI which will not "fail in unexpected ways".