by NovemberWhiskey 1641 days ago
I think it's predicated on a misunderstanding of what "fail-safe" actually means.

For example, in railway signaling, drivers are trained to interpret a signal with no light as the most restrictive aspect (e.g. "danger"). That way, the failure of a bulb in a colored light signal, or of the signal as a whole, results in a safe outcome (although the train might be delayed while the driver calls up the signaler).
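
A minimal Python sketch of that dark-signal rule, in case it helps. The names (`Aspect`, `LAMP_TO_ASPECT`, `interpret_signal`) and the three-aspect set are invented for illustration and aren't taken from any real signaling rulebook.

```python
from enum import Enum


class Aspect(Enum):
    DANGER = "danger"    # most restrictive
    CAUTION = "caution"
    CLEAR = "clear"


# Hypothetical mapping from the lamp that is lit to the aspect it conveys.
LAMP_TO_ASPECT = {
    "red": Aspect.DANGER,
    "yellow": Aspect.CAUTION,
    "green": Aspect.CLEAR,
}


def interpret_signal(lit_lamp):
    """Return the aspect the driver should act on.

    lit_lamp is the colour currently showing, or None if the signal is dark.
    Anything unrecognised, including no light at all, is treated as the
    most restrictive aspect.
    """
    return LAMP_TO_ASPECT.get(lit_lamp, Aspect.DANGER)
```

The point is that the default branch is the restrictive one: a blown bulb (`interpret_signal(None)`) can only ever make the outcome more cautious, never less.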

Or, in another example from the railways, the air brake system on a train is configured such that a loss of air pressure causes emergency brake activation.

Fail-safe doesn't mean "able to continue operation in the presence of failures"; it means "systematically safe in the presence of failure".

Systems which require "liveness" (e.g. fly-by-wire for a relaxed stability aircraft) need different safety mechanisms because failure of the control law is never safe.

2 comments

> "systematically safe in the presence of failure".

And even then, you still need to define "safe". Imagine a lock powered by an electromagnet. What happens if you lose power?

The safety-first approach is almost always for the unpowered lock to default to the open state — allow people to escape in case of emergency.

Conversely, the security-first approach is to keep the door locked — nothing goes in or out until the situation is under control.

A more complex solution is to design the lock to be bistable. During operating hours when the door is unlocked, failure keeps it unlocked. Outside operating hours, when the door is set to locked, it stays locked.

The common factor with all these scenarios is that you have a failure mode (power outage), and a design for how the system ensures a reasonable outcome in the face of said failure.
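
Roughly, the three lock designs differ only in what state they end up in once power is gone. Here's a sketch in Python, with made-up names (`Policy`, `locked_after_power_loss`) and a deliberately simplified model where the only input is whether the door was commanded locked before the outage.

```python
from enum import Enum


class Policy(Enum):
    FAIL_OPEN = "fail_open"      # safety-first: unpowered lock releases
    FAIL_LOCKED = "fail_locked"  # security-first: unpowered lock holds
    BISTABLE = "bistable"        # keeps whatever state it was last set to


def locked_after_power_loss(policy, was_locked):
    """Return True if the door is locked once power has been lost.

    was_locked is the state the door was commanded to before the outage.
    """
    if policy is Policy.FAIL_OPEN:
        return False
    if policy is Policy.FAIL_LOCKED:
        return True
    # Bistable: no power is needed to hold either state, so nothing changes.
    return was_locked
```

Same failure mode in every case; what changes is which outcome the designer decided counts as "safe".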

Or nuclear reactors that fail safe by dropping all the control rods into the core to stop all activity. The reactor may be permanently ruined after that (with a cost of hundreds of millions or billions to revert) but there will be no risk of meltdown.

Sort of. A fail-safe reactor design can include things like:

* Negative temperature coefficient of reactivity: as temperature increases, the neutron flux is reduced, which both makes the reactor more controllable and tends to prevent runaway reactions.

* Negative void coefficient of reactivity: as voids (steam pockets) increase, the neutron flux is reduced.

* Control rods constructed solely of neutron absorber. The RBMK reactor (Chernobyl) in particular used graphite followers (tips), which _increased_ reactivity initially when being lowered.

It's also worth noting that nuclear reactors are designed to be operated within certain limits. The RBMK reactor would have been fine had it been operated as designed.

Source: was a nuclear reactor operator on a submarine.

I don't know enough about reactor control systems to be sure on that one. The idea of a fail-safe system is not that there's an easy way to shut it down, but rather that the ways we expect its component parts to fail result in the safe state.

For example, consider a railway track circuit: this is how a signaling system knows whether a particular block of track is occupied by a train. The wheels and axles are conductive, so you can measure this electrically by determining whether there's a circuit between the rails.

The naive way to do this would be to say something like "OK, we'll apply a voltage to one rail, and if we see a current flowing between the rails we'll say the block is occupied." This is not fail-safe. Say the rail has a small break, or power is interrupted: no current will flow, so the track always looks unoccupied, even if there's a train on it.

The better way is to say "We'll apply a voltage to one rail, but we'll have the rails connected together in a circuit during normal operation. That will energize a relay which will cause the track to indicate clear. If a train is on the track, then we'll get a short circuit, which will cause the relay to de-energize, indicating the track is occupied."

If the power fails, it shows the track occupied because the relay opens. If the rail develops a crack, the circuit opens, again causing the relay to open and indicate the track is occupied. If the relay fails, then as long as it fails open (which is the predominant failure mode of relays) the track is also indicated as occupied.
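
The difference between the two designs can be written out as a small truth-table sketch in Python. This is a toy boolean model, not real signaling logic; the function and parameter names (`naive_block_occupied`, `failsafe_block_occupied`, `power_ok`, `rail_intact`, `relay_ok`) are invented for the example, and `relay_ok=False` stands in for a relay that has failed open.

```python
def naive_block_occupied(power_ok, rail_intact, train_present):
    """Naive design: report 'occupied' only when current is seen flowing.

    Current flows only if there is power, the rails are unbroken, and a
    train's axles bridge the rails.
    """
    current_flows = power_ok and rail_intact and train_present
    return current_flows


def failsafe_block_occupied(power_ok, rail_intact, train_present, relay_ok=True):
    """Fail-safe design: report 'clear' only while the relay is energized.

    The relay stays energized only if there is power, the rails are
    unbroken, no train is shunting the circuit, and the relay itself works.
    Any other condition reads as 'occupied'.
    """
    relay_energized = power_ok and rail_intact and not train_present and relay_ok
    return not relay_energized
```

With a train in the block and the power out, `naive_block_occupied(False, True, True)` reports the block as unoccupied, which is the dangerous answer. The fail-safe version reports occupied for every fault in the model (power loss, broken rail, relay failed open), whether or not a train is actually there.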