by cbanek 2637 days ago
Overall, I think designing a system that prevents people from being harmed is very hard. It has to be a design-level concept that is on everyone's mind from the start, and every day through development.

Even your example, "the worst can happen when our software has problem is someone won't get a meal on time or a meal is wasted", isn't really true. What if you ordered fish or oysters, and they were left out too long and caused some kind of food poisoning (just as an example)?

There are many levels of thinking about this problem. Maybe you can have a sticker on the package that reacts to temperature to let someone know the meal isn't safe, etc. You still have to train the user to know what it is, and when it is safe.

So in this simple example, you have software, hardware, redundancy, and user training that all have to work together. Same for things like cars or planes. You're really trying to build a safety-critical system, and many times (such as the Boeing example) it isn't just software or hardware that causes the problems; the issues arise at the intersection of the two.

For Boeing, it would be lack of user training, lack of good UX, possibly hardware design issues that made the plane prone to stall, hardware issues with the angle of attack sensors, not enough redundancy in the angle of attack sensors to operate properly, etc.
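
To make the redundancy point concrete, here is a rough sketch of a median voter over redundant sensor readings. This is only an illustration of the general idea, not how any real aircraft does it: the function name `vote`, the `max_spread` tolerance, and the sample readings are all made up for the example.

```python
from statistics import median

def vote(readings, max_spread):
    """Pick a value from redundant sensor readings and say whether to trust it.

    Hypothetical helper: `max_spread` is an assumed tolerance for how far a
    reading may sit from the median and still count as agreeing with it.
    """
    if len(readings) < 3:
        # With one or two sensors there is no way to outvote a bad reading.
        return None, False
    mid = median(readings)
    agreeing = [r for r in readings if abs(r - mid) <= max_spread]
    # Trust the value only if a strict majority of sensors agree with it.
    trusted = len(agreeing) > len(readings) / 2
    return mid, trusted

# Made-up angle readings in degrees; the third sensor has failed high.
value, trusted = vote([5.0, 5.2, 74.5], max_spread=1.0)
```

With three sensors the single bad reading is outvoted. With only two, the voter refuses to pick a side, which is roughly why "not enough redundancy" is its own failure mode and not just a hardware detail.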

You can never get to a 0% chance of failure. Most of the time you are just attacking the most likely causes of failure, since when you get down to the level of faulty parts or mechanical fatigue, things always break.
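
A back-of-the-envelope way to see this, assuming failures are independent (which real systems rarely are) and using made-up numbers: the chance that a system needing every part fails is one minus the product of each part's chance of working, and doubling up a part only squares its failure chance, it never zeroes it.

```python
from math import prod

def series_failure(probs):
    """Chance that at least one required part fails, assuming independence."""
    return 1.0 - prod(1.0 - p for p in probs)

def redundant_failure(p, n):
    """Chance that all n independent redundant copies of a part fail."""
    return p ** n

# Hypothetical failure chances per use; none of these are real figures.
parts = {"sensor": 0.01, "software": 0.001, "training": 0.0005}
baseline = series_failure(parts.values())

# Add a second copy of the most failure-prone part...
fix_worst = dict(parts, sensor=redundant_failure(parts["sensor"], 2))
# ...versus a second copy of the least failure-prone part.
fix_least = dict(parts, training=redundant_failure(parts["training"], 2))

after_worst = series_failure(fix_worst.values())
after_least = series_failure(fix_least.values())
```

Under these assumptions, doubling up the worst part cuts the overall failure chance by several times, while doubling up the best part barely moves it, and neither result ever reaches zero.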

Of course, each subsystem and integration should have good testing to find all these things, but it's sadly less of a science and more of an art IMHO. And I used to work on rocket software.

Many times, the answers are simpler than you think. Simplicity usually means better operation than trying to overcomplicate error handling. Sometimes you just need to change the whole way you are thinking about the problem.

1 comment

Thank you for the very interesting thoughts. I learned something new.

You were right with the counter-example that we have redundancy in place that is 'hardware': mealbox stickers, production QA.