| >As a computer scientist, I'm not accustomed to thinking about functional equivalence in the presence of hardware failure... How are you not? I mean, I get it 90% of the time we screw up the programming somehow, but as a computer scientist, I never ignore the possibility of hardware failure. Memory goes bad. Devices fail. Networks die. Semiconductors transiently in strange ways if you don't take the right precautions... It's the entire impetus behind GIGO. If you shove garbage into a perfectly working software system; (corrupt data from a malfunctioning input source), you still get out garbage. It's why life and safety critical automation is so fundamentally different from lower stakes programming tasks where "reboot the damn thing" is a viable option. If your sensor goes bad, and you're in the air, you can't do squat to fix it. You have to detect the error, and fail the system gracefully by taking it out of the loop, informing the operator of the system failure, and most importantly, never allow that system to do anything that could jeopardize the ability of the operator to continue operating. This is or at least I thought it was basic Control Systems 101... |
I research compilers and type systems. If the RAM dies while the compiler is running, you rerun the compiler on a new machine. A lot of computer science abstracts away the notion of hardware failure, because otherwise it becomes enormously cumbersome to talk about anything. This is fine as long as you don't actually build real high-reliability systems with the same approach.