Fault-Tolerant Mechanism in Supercomputing Environment
DOI:
Author:
Affiliation:

Clc Number:

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    The three layers supercomputing environment of Chinese Academy of Sciences is built to integrate the computing resources of the head center in Beijing, eight regional centers and several campus-level centers. To enhance the reliability of the supercomputing environment and provide stable and reliable computing services, the fault-tolerant mechanism research has become a research priority of the supercomputing environment. In this paper, the fault-tolerant basic concepts and computer fault-tolerant technologies are introduced at first. Next, a fault-tolerant framework of the supercomputing environment is proposed. Then the fault-tolerant solutions of different levels based on the framework and the performance test results in Deepcomp 7000 are presented. Finally, the fault-tolerant overheads of different levels are compared and analyzed to verify the impact on the application.

    Reference
    Related
    Cited by
Get Citation

赵毅,曹宗雁,朱鹏,迟学斌.超级计算环境容错机制.软件学报,2013,24(S2):89-98

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:August 05,2012
  • Revised:July 22,2013
  • Adopted:
  • Online: January 02,2014
  • Published:
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063