Multi-User Server Program Self-Recovery System
Authors: Shi Yi, Feng Yusheng, Qi Yong, Sun Wei

    Abstract:

    Long-running multi-user server systems may encounter frequent errors that disrupt service, owing to the complexity of their programs, operating environments, and user operations. This creates a need for system self-recovery. Checkpoint-and-rollback is a popular self-recovery strategy in current research and practice, but it is not very effective in multi-user systems. This paper presents a VMM-based self-recovery system named VMSRS (virtual machine monitor-self recovery of service program), designed around the characteristics of multi-user server programs. The main idea of VMSRS is to treat the VMM as the major component of recovery, to take advantage of the VM as an independent underlying system and hardware resource monitor, and to strictly maintain the consistency and security of user data and the atomicity of data operations. As an improved SRS (self recovery of service program), VMSRS contains errors so that a system crash does not affect normal users, rather than performing a rollback, allowing users and servers to proceed as if no crash had happened. Rollback is avoided by exploiting the self-cleansing mechanisms of the system and of VMSRS. The VMSRS design comprises a crash suppression module, a demand-driven restoration module, a monitor module, and a storage management module. Experimental results on basic function, basic performance, and integral function validate that VMSRS provides good security and consistency of user data while maintaining performance and performing no rollback. It recovers multi-threaded programs well, with no limit on the number of threads. This exploratory study also contributes to current research on self-recovery systems that use virtualization technology.
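
    The idea of containing a fault to one user instead of rolling back the whole server can be illustrated with a toy, application-level sketch. This is not the VMSRS implementation, which the abstract describes as working at the VMM level; the class `ToyServer` and all of its methods and fields are hypothetical names introduced only for illustration. The sketch assumes per-user volatile sessions, a committed store that is updated only after a request completes, and lazy restoration of a discarded session on the user's next request.

```python
class ToyServer:
    """Toy multi-user server (hypothetical): faults are contained per user."""

    def __init__(self):
        self.committed = {}   # last completed state per user (stand-in for protected storage)
        self.sessions = {}    # volatile per-user working state
        self.suppressed = []  # users whose failing request was contained

    def _session(self, user):
        # Demand-driven restoration: rebuild a session from committed data
        # only when the user next contacts the server.
        if user not in self.sessions:
            self.sessions[user] = dict(self.committed.get(user, {}))
        return self.sessions[user]

    def handle(self, user, update):
        """Apply `update` (a callable that mutates the session) for one user."""
        session = self._session(user)
        try:
            update(session)
            # Publish the result only after the whole update succeeded,
            # so a partial write never reaches the committed store.
            self.committed[user] = dict(session)
            return True
        except Exception:
            # Crash suppression: discard only this user's volatile state.
            # Other users' sessions and committed data are left untouched.
            self.sessions.pop(user, None)
            self.suppressed.append(user)
            return False


def faulty(session):
    session["balance"] = -1              # partial write before the fault
    raise RuntimeError("simulated crash")


server = ToyServer()
server.handle("alice", lambda s: s.update(balance=10))
server.handle("bob", lambda s: s.update(balance=5))
server.handle("bob", faulty)             # contained; bob's session is dropped
server.handle("alice", lambda s: s.update(balance=s["balance"] + 1))
```

    In this sketch the faulty request leaves Bob's committed data unchanged and Alice's later request proceeds normally, which mirrors, in a very simplified way, the goal of letting unaffected users continue as if no crash had happened.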

Get Citation

Shi Y, Feng YS, Qi Y, Sun W. Multi-user server program self-recovery system. Journal of Software, 2015,26(8):1907-1924

History
  • Received: March 3, 2014
  • Revised: July 31, 2014
  • Accepted: July 31, 2014
  • Online: November 14, 2014