9. Command and Data Handling
9.7 Avionics Reliability and Fault Tolerance
Ideally, every hardware component would remain reliable until the end of the spacecraft mission. Unfortunately, each component is susceptible to radiation effects, whether from single events or from an accumulated total ionizing dose. Radiation produces two types of faults: “permanent faults—that is, faults that break computer components—and soft errors, which cause an error but do not cause permanent damage. Techniques have been developed to deal with both types of faults. Unfortunately, these techniques, especially those for fixing soft errors, rob the computer of much of its efficiency” [NASA]. Solutions include:
- Use redundant processors that replicate computations and then combine the results or vote on the final result.
- Distribute processing across multiple processors to spread the risk of failure.
- Use watchdogs liberally, in multiple places, to monitor critical components.
- Use reconfigurable hardware elements such as FPGAs, that is, sets of digital hardware elements whose wiring can be “programmed” as needed, to implement the required algorithms on an as-needed basis.
- Keep a replica of startup software in non-volatile memory for times when the system needs to reset.
- Encode intelligence in the software to detect and correct errors. Such techniques are limited in their application, however: they cannot cover all machine operations.
- Enable spacecraft to accept software updates during missions to mitigate, prevent, or correct soft errors.
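
Two of the techniques listed above, voting among replicated processors and watchdog monitoring, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `majority_vote`, `Watchdog`, and `flip_bit` are hypothetical and do not come from any flight software, the replicas are plain values rather than real processors, and the watchdog counts abstract ticks instead of using a hardware timer.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas.

    Returns None when no value has a strict majority, so the caller
    can fall back to a retry or a reset (hypothetical policy).
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) // 2 else None


def flip_bit(word: int, bit: int) -> int:
    """Simulate a soft error by inverting one bit of an integer word."""
    return word ^ (1 << bit)


class Watchdog:
    """Tick-based software watchdog (illustrative only).

    The monitored task must call kick() before timeout_ticks ticks
    elapse; otherwise tick() reports that the watchdog has expired.
    """

    def __init__(self, timeout_ticks: int) -> None:
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks

    def kick(self) -> None:
        """Restart the countdown; called by a healthy task."""
        self.remaining = self.timeout_ticks

    def tick(self) -> bool:
        """Advance one tick and return True once the countdown hits zero."""
        if self.remaining > 0:
            self.remaining -= 1
        return self.remaining == 0


# Three replicas compute the same word; one suffers a single bit flip.
replicas = [0x2A, flip_bit(0x2A, 3), 0x2A]
voted = majority_vote(replicas)  # the two agreeing replicas outvote the faulty one

# A task that stops kicking the watchdog is detected after three ticks.
wd = Watchdog(timeout_ticks=3)
expired = [wd.tick() for _ in range(3)]
```

With three replicas the voter masks any single faulty result, which is also where the efficiency cost noted in the quotation shows up: the same computation runs three times. When no majority exists the sketch returns `None` rather than guessing, leaving the recovery action (for example, a reset from the startup replica in non-volatile memory) to the caller.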