J. Christmansson and Ram Chillarege Chalmers University of Technology, IBM Research, 1996
Abstract: A significant issue in fault injection experiments is ensuring that the injected faults are representative of software faults observed in the field. Another important issue is experiment time: experiments should be conducted without excessive time spent waiting for the consequences of a fault. One approach to accelerating the failure process is to inject errors instead of faults, but this requires a mapping between representative software faults and injectable errors. Furthermore, it must be ensured that the injected errors emulate software faults and not hardware faults. These issues were addressed in a study of software faults encountered in one release of a large IBM operating system product. The key results are:
- A general procedure that uses field data to generate a set of injectable errors, in which each error is defined by an error type, an error location and an injection condition. The procedure ensures that the injected errors emulate software faults and not hardware faults.
- The faults are uniformly distributed (1.37 faults per module) over the affected modules.
- The distribution of error categories in the IBM operating system and the distribution of errors in the Tandem Guardian90 operating system reported in [14] were compared and found to be similar. This result adds a flavor of generality to the field data presented in the current paper.
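
The first result defines each injectable error by three attributes: error type, error location and injection condition. A minimal sketch of how such a definition could be held in code is given below. It is an illustration only, not the procedure described in the paper: the class name `InjectableError`, the field names, the `should_inject` helper and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class InjectableError:
    """Hypothetical container for one injectable error definition."""

    error_type: str   # what kind of error is injected (placeholder label)
    location: str     # where it is injected (placeholder module/routine name)
    # Predicate over an observed system state; True means "inject now".
    condition: Callable[[Dict[str, int]], bool]

    def should_inject(self, state: Dict[str, int]) -> bool:
        """Evaluate the injection condition against the given state."""
        return self.condition(state)


# Sample values are invented for illustration and are not field data.
example = InjectableError(
    error_type="hypothetical-wrong-value",
    location="hypothetical_module.some_routine",
    condition=lambda state: state.get("call_count", 0) >= 3,
)
```

Modelling the injection condition as a predicate keeps the decision of when to inject separate from what is injected and where, which mirrors the three-part definition above.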