Ming L. Yu, Lisa A. DeLouise
Surface Science Reports
Backward error recovery, based on checkpointing and rollback, is often used to implement fault tolerance in multicomputer systems. During failure-free operation the process states are saved at regular intervals, and after a fault is detected the system is rolled back to a previously saved state. Four classes of techniques can be distinguished: semiautomatic techniques, message logging, coordinated checkpointing, and hybrid techniques. The authors survey these alternatives and discuss the overhead each may involve, so that the user can choose an optimal checkpointing and rollback technique for given facilities and applications.
Stephen M. Gates
Surface Science
Arvind Kumar, Jeffrey J. Welser, et al.
MRS Spring 2000
Victor Y. Lee, Karen Havenstrite, et al.
Advanced Materials