Abstract: Cloud applications need efficient, continuously available fault tolerance services to execute reliably. This study adopts the fault-tolerance-as-a-service scheme and proposes an optimized method for provisioning fault tolerance services. The fault tolerance requirements of cloud applications are specified in terms of attributes of cloud service components, such as reliability and response time. Building on the major fault tolerance techniques, namely replication, checkpointing, and N-version programming (NVP), and taking into account the overhead of dynamically switching among fault tolerance services, the method computes an optimal solution among the feasible fault tolerance service provisioning options. Two scenarios are analyzed: one in which the cloud infrastructure resources supporting the fault tolerance services are sufficient, and one in which they are not. Experimental results show that the proposed method reduces the fault tolerance service expenses of cloud application systems, reduces the cost of the cloud infrastructure resources that support the fault tolerance services, and improves the service capacity of fault tolerance service providers, enabling them to deliver efficient and reliable fault tolerance as a service to cloud application systems.