Efficient Fault Tolerant Compilation: Compress Error Flow to Reduce Power and Enhance Performance

Volume 17, Issue 12, 2006, pp. 2425-2437

Authors: GAO Long, YANG Xue-Jun

Abstract:

Many reliability-critical applications require computers to deliver high performance, low power dissipation, and fault tolerance at the same time. Traditional software fault tolerance relies on a large number of branch instructions to detect errors, which incurs substantial overhead in both performance and power dissipation. This paper proposes an error flow model and uses it to explain the error flow compressing (EFC) algorithm. EFC greatly reduces the number of branch instructions while keeping the total instruction count the same. Simulations on Wattch of the FFT benchmark from the StreamIT project show that, compared with the traditional EDDI error detection algorithm, EFC reduces total branch instructions by over 24%, improves IPC by over 12%, and reduces power dissipation by nearly 5%, at loop parameter n=2²⁵. Further analysis shows that the reduction in branch instructions can exceed 43% when the innermost iteration contains 8 store instructions.
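
The abstract does not give the algorithm itself, so the following is only a rough sketch of the general idea it describes, not the authors' implementation. It assumes that duplicated-instruction checking compares a master value against a shadow copy before each store, and that "compressing" means folding those per-store comparisons into one accumulated flag that is tested once. The function names, the sample data, and the branch counter are all hypothetical and exist only to make the contrast visible.

```python
def check_per_store(master, shadow):
    """Duplicated-instruction style: one compare-and-branch per store."""
    branches = 0
    for m, s in zip(master, shadow):
        branches += 1              # models one conditional branch per store
        if m != s:                 # mismatch between master and shadow copy
            return True, branches  # error detected immediately
    return False, branches


def check_compressed(master, shadow):
    """Compressed style: accumulate mismatches, branch once at the end."""
    err = 0
    for m, s in zip(master, shadow):
        err |= m ^ s               # branch-free: any differing bit sets err
    branches = 1                   # a single check covers the whole group
    return err != 0, branches


# Hypothetical data: 8 stored values and two shadow copies.
master = [3, 1, 4, 1, 5, 9, 2, 6]
shadow_ok = list(master)
shadow_bad = list(master)
shadow_bad[5] ^= 0x10              # simulate a single bit flip in one value

print(check_per_store(master, shadow_ok), check_compressed(master, shadow_ok))
print(check_per_store(master, shadow_bad), check_compressed(master, shadow_bad))
```

In this sketch each compare-and-branch is replaced by an XOR and an OR, which is one way the branch count could drop while the amount of work stays roughly constant. The cost is that an error is reported only at the end of the group rather than at the faulty store.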

Key words: SIHFT (software implemented hardware fault tolerance); COTS; error flow model; error flow compressing algorithm; branch instruction; high performance; low power dissipation

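Citation:

GAO Long, YANG Xue-Jun. Efficient fault tolerant compilation: Compress error flow to reduce power and enhance performance. Journal of Software, 2006, 17(12): 2425-2437.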

History

Received:October 08,2005
Revised:February 23,2006