by georgehm
65 days ago
>Effectively, eight CPUs run the flight software in parallel. The engineering philosophy hinges on a
>“fail-silent” design. The self-checking pairs ensure that if a CPU performs an erroneous calculation
>due to a radiation event, the error is detected immediately and the system responds.
>“A faulty computer will fail silent, rather than transmit the ‘wrong answer,’” Uitenbroek explained.
>This approach simplifies the complex task of the triplex “voting” mechanism that compares results.
>
>Instead of comparing three answers to find a majority, the system uses a priority-ordered source
>selection algorithm among healthy channels that haven’t failed-silent. It picks the output from the
>first available FCM in the priority list; if that module has gone silent due to a fault, it moves to
>the second, third, or fourth.

One part that seems omitted from the explanation is what happens if both CPUs in a pair, for whatever reason, perform the same erroneous calculation and their results match. How would that source be silenced without comparing its results with those of other sources?
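
To make the question concrete, here is a minimal sketch of the selection scheme as the quote describes it. It is my own toy model, not the actual flight software: the `SelfCheckingPair` class, the `select_output` function, and the `good`/`upset` stand-in computations are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class SelfCheckingPair:
    """Hypothetical model of one FCM: two CPUs whose outputs are compared."""
    name: str
    cpu_a: Callable[[float], float]
    cpu_b: Callable[[float], float]

    def output(self, x: float) -> Optional[float]:
        a, b = self.cpu_a(x), self.cpu_b(x)
        # On a mismatch the pair goes silent (returns None) instead of
        # transmitting a possibly wrong answer.
        return a if a == b else None


def select_output(
    fcms: Sequence[SelfCheckingPair], x: float
) -> Optional[Tuple[str, float]]:
    """Return (name, value) from the first non-silent FCM in priority order."""
    for fcm in fcms:
        value = fcm.output(x)
        if value is not None:
            return fcm.name, value
    return None  # every channel has failed silent


def good(x: float) -> float:
    """Stand-in for the correct calculation."""
    return 2.0 * x


def upset(x: float) -> float:
    """Stand-in for a calculation corrupted by a radiation event."""
    return 2.0 * x + 1.0


# One CPU in FCM1 is upset: the pair disagrees, goes silent, FCM2 is used.
single_upset = [
    SelfCheckingPair("FCM1", good, upset),
    SelfCheckingPair("FCM2", good, good),
]

# Both CPUs in FCM1 are upset identically: the pair agrees on a wrong value,
# so nothing in this scheme silences it and its output is selected.
common_mode = [
    SelfCheckingPair("FCM1", upset, upset),
    SelfCheckingPair("FCM2", good, good),
]

print(select_output(single_upset, 3.0))
print(select_output(common_mode, 3.0))
```

In this toy model the second case is exactly the gap I am asking about: a matching wrong answer from the highest-priority pair is passed through, because no cross-channel comparison is modeled.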
|
Put another way, the FIT (Failures In Time) rate for the case in which both CPUs in a lockstep pair make the same erroneous calculation and still produce matching results is extremely small. That is why we selected and accepted this lockstep CPU design.
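
For a sense of scale, a FIT rate can be turned into a probability over a mission. This is only a rough sketch: it assumes a constant failure rate, and the FIT value and mission length below are made-up placeholders, not figures from any real design. One FIT is one failure per 10^9 device-hours.

```python
import math

HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def failure_probability(fit: float, hours: float) -> float:
    """Probability of at least one failure in `hours`, assuming a constant rate."""
    rate_per_hour = fit / HOURS_PER_FIT_UNIT
    # 1 - exp(-rate * t), written with expm1 to stay accurate for tiny rates.
    return -math.expm1(-rate_per_hour * hours)


# Hypothetical inputs for illustration only.
example_fit = 1.0            # placeholder FIT for the matching-wrong-answer case
example_mission_hours = 240  # placeholder mission length (10 days)

p = failure_probability(example_fit, example_mission_hours)
print(f"P(at least one such event) = {p:.3e}")
```

With rates this small the result is almost exactly `fit * hours / 1e9`, which is why a low FIT for the common-mode case can be an acceptable residual risk.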