Hacker News
by michaelmrose 3033 days ago
From the article

"However, the timestamps of the two radar pulses being compared were converted to floating point differently: one correctly, the other introducing an error proportionate to the operation time so far"

The code had a defect that affected its aim from the moment it was turned on, but because it took 100 hours to drift by 1/3 of a second, the problem wasn't apparent when the system was rebooted regularly. If software can't keep doing basic math without manual intervention, it's defective.
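For anyone who wants to check the 1/3 of a second figure, here's a rough sketch of the arithmetic. The thread doesn't give the clock details, so two things here are assumptions taken from commonly cited analyses of the incident: the clock counts tenths of a second, and 0.1 is stored truncated to 23 bits after the binary point. The function names are made up for illustration.

```python
from fractions import Fraction

# Assumptions (not stated in this thread): the system clock counts tenths of
# a second, and the constant 0.1 is truncated to 23 fractional bits.
FRACTION_BITS = 23
TICKS_PER_SECOND = 10


def truncated_tenth(bits=FRACTION_BITS):
    """Return 0.1 chopped (not rounded) to `bits` binary fractional digits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def drift_seconds(hours, bits=FRACTION_BITS):
    """Error accumulated when every tick is converted with the chopped 0.1."""
    ticks = hours * 3600 * TICKS_PER_SECOND
    per_tick_error = Fraction(1, 10) - truncated_tenth(bits)
    return ticks * per_tick_error


if __name__ == "__main__":
    for hours in (8, 20, 100):
        # At 100 hours this comes to roughly 0.343 seconds.
        print(hours, "hours:", float(drift_seconds(hours)), "seconds of drift")
```

The error per tick is tiny (about one part in ten million), which is why a system that only runs a few hours at a time would never show it.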

In fact everyone including the company that made it admits it's defective.

It's possible your teacher picked a great example to illustrate a communication failure.

3 comments

The Patriot system was originally designed to operate in Europe against Soviet medium- to high-altitude aircraft and cruise missiles traveling at speeds up to about MACH 2 (1500 mph). To avoid detection it was designed to be mobile and operate for only a few hours at one location.

http://archive.gao.gov/t2pbat6/145960.pdf

Page 2

I dug into reference 48 on Wikipedia, which cites this report; I found the report itself with a Google search.

The fact that the bug manifests after a longer than normal period of operation doesn't ex post facto make it not a bug. If you add 2 and 2 and get 42 you failed.

It is however a good explanation why it remained undetected.

Conversations like this are surprisingly common in our industry ;-) To help ease communication there are two terms in common usage: software error and bug. A software error is code that is incorrect. A bug is a software error that manifests as a user-visible problem. In this case the incorrect code is a software error, but it does not manifest as a user-visible problem unless it is used outside some assumed parameters. The bug doesn't exist when the product is used as intended. One can argue that the behaviour is undefined when used outside of the intended use and therefore there is no bug. There is no arguing about the software error, though. It exists.

Arguing about whether or not something is a bug is pointless precisely because someone will just pull the "behaviour outside of expected use is undefined" thing out of the bag. Regardless of whether or not you should have expected something to work, if your product unintentionally kills people due to a software error, you have a gigantic problem. It's really that lesson we have to keep in mind.

I get this all the time from project managers: it doesn't matter if X fails because we aren't designing the software for X. But you can't just dismiss X -- you need to understand the consequences of X just in case somebody tries to do it. For example: It corrupts the DB if 2 people edit the same record at the same time. The project manager says, "Not a problem. I got sign off from the groups using the app and they promise never to have 2 people working on the same thing. Problem solved, and no need to modify the code!" Of course a week later the DB is corrupted and it's not a bug (it's a feature ;-) ).
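To make the "2 people edit the same record" case concrete, here's a minimal sketch of one common guard, optimistic locking with a version column. It's only an illustration using Python's built-in sqlite3 module; the table layout and function names are hypothetical and have nothing to do with the app in the story.

```python
import sqlite3


def open_db():
    """Create a throwaway in-memory database with one versioned record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        "id INTEGER PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO records VALUES (1, 'original', 0)")
    conn.commit()
    return conn


def read_record(conn, record_id):
    """Return (body, version) as the editor saw it when opening the record."""
    return conn.execute(
        "SELECT body, version FROM records WHERE id = ?", (record_id,)
    ).fetchone()


def save_record(conn, record_id, new_body, seen_version):
    """Write only if nobody else saved since `seen_version` was read."""
    cur = conn.execute(
        "UPDATE records SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, record_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means a concurrent edit won the race


if __name__ == "__main__":
    conn = open_db()
    _, version = read_record(conn, 1)  # both editors open the same version
    print(save_record(conn, 1, "edit from A", version))  # first save goes through
    print(save_record(conn, 1, "edit from B", version))  # stale save is rejected
```

The second save fails cleanly instead of silently clobbering the first, so the failure mode is a visible "someone else changed this" rather than a corrupted DB.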

It does make software development more costly, and you need to draw the line somewhere. This requires balancing risk. But I will argue that if you are writing software for a missile, there is no hiding behind the "we didn't design it for that" argument.

If by "defective" you mean it has rounding errors, then sure. Everything that rounds numbers is defective. To be fair, rounding errors can sometimes be mitigated by carefully changing the order of operations, but never fully eliminated in those cases.
You can avoid rounding errors 100% of the time for as long as you like. For example you can use integers.

It's entirely possible to have whatever degree of precision is reasonably required, up to the limits of what our tools can measure.

This isn't about an inherent limit of computation; it's just programmer error.
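A quick sketch of the integer point, assuming a clock that ticks in tenths of a second (the tick size and function names are just for illustration). Accumulating 0.1 as a float picks up error, while counting ticks as an integer and converting only at the end stays exact for as long as you like.

```python
TICKS_PER_SECOND = 10  # assumption: the clock counts tenths of a second


def elapsed_float(ticks):
    """Accumulate 0.1 per tick in binary floating point (error builds up)."""
    total = 0.0
    for _ in range(ticks):
        total += 0.1
    return total


def elapsed_exact(ticks):
    """Keep the count as an integer; return (whole seconds, leftover tenths)."""
    return divmod(ticks, TICKS_PER_SECOND)


if __name__ == "__main__":
    print(elapsed_float(10) == 1.0)  # False: ten float additions of 0.1 miss 1.0
    print(elapsed_exact(10))         # (1, 0): exactly one second
    print(elapsed_exact(100 * 3600 * TICKS_PER_SECOND))  # 100 hours, still exact
```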