A Low-Cost Transparent Checkpointing and Recovery Protocol for Mobile Computing

Author: LI Qing-Hua, JIANG Ting-Yao, ZHANG Hong-Jun

Affiliation: School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China (all three authors)

Fund Project: Supported by the National Natural Science Foundation of China under Grant No. 60273075 and the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 863-306-11-01-06

A Transparent Low-Cost Recovery Protocol for Mobile-to-Mobile Communication



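Abstract (translated from the Chinese):

Checkpointing and recovery protocols in mobile computing systems face many problems that do not arise in traditional distributed systems. Among the checkpoint-recovery mechanisms proposed so far for mobile computing, methods based on establishing globally consistent checkpoints cannot guarantee independent recovery from failures, while message logging methods based on m-MSS-m communication require every message exchanged between mobile hosts to be forwarded by a mobile support station (MSS). This paper proposes a message-logging-based fault-tolerant protocol that supports direct communication between mobile hosts (m-m), and gives the corresponding algorithms together with a proof of correctness. Compared with m-MSS-m communication, m-m communication helps reduce channel contention and message transmission delay. Simulation results show that the proposed protocol incurs lower failure-free overhead and shorter recovery time than traditional protocols.

Key words: mobile computing; checkpoint; message logging; rollback recovery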

Abstract:

Mobile computing brings new challenges and requirements for checkpointing and recovery protocols. Existing checkpointing-only schemes, which rely on creating globally consistent checkpoints, cannot guarantee independent recovery. Message logging schemes based on mobile-MSS-mobile (m-MSS-m) communication, in which messages exchanged among mobile hosts are relayed through the MSS, may incur heavier contention on the wireless network and higher message latency than direct mobile-host-to-mobile-host (m-m) communication. This paper presents a novel recovery protocol for m-m communication that solves two key problems: message ordering and duplicate messages. A proof of the protocol's correctness is also given. Finally, simulation results indicate that the proposed approach outperforms traditional approaches in terms of both failure-free overhead and recovery overhead.

Key words: mobile computing; checkpoint; message logging; rollback recovery
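
The abstract names two problems that a message-logging recovery protocol must handle: message order and duplicate messages. The following Python sketch illustrates the general idea only and is not the protocol of this paper. It assumes per-sender sequence numbers, a receiver-side log that survives a crash, and a toy order-sensitive state; the class `MobileHost` and all of its fields and methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MobileHost:
    """Hypothetical mobile host; the log is assumed to live on stable storage."""
    name: str
    state: int = 0                                      # toy application state
    next_expected: dict = field(default_factory=dict)   # sender -> next sequence number
    log: list = field(default_factory=list)             # messages delivered since checkpoint
    checkpoint: tuple = field(default_factory=lambda: (0, {}))

    def take_checkpoint(self):
        # Save the state and delivery counters, then truncate the log.
        self.checkpoint = (self.state, dict(self.next_expected))
        self.log.clear()

    def receive(self, sender, seq, payload, replay=False):
        expected = self.next_expected.get(sender, 0)
        if seq != expected:
            # seq < expected: duplicate; seq > expected: out of order.
            # In both cases the message is not delivered now.
            return False
        # Order-sensitive update, so replay must preserve delivery order.
        self.state = (self.state * 31 + payload) % 1_000_003
        self.next_expected[sender] = expected + 1
        if not replay:
            self.log.append((sender, seq, payload))
        return True

    def recover(self):
        # Roll back to the checkpoint, then replay the log in logged order.
        state, expected = self.checkpoint
        self.state, self.next_expected = state, dict(expected)
        for sender, seq, payload in self.log:
            self.receive(sender, seq, payload, replay=True)


host = MobileHost("mh1")
host.receive("mh2", 0, 5)
host.take_checkpoint()
host.receive("mh2", 1, 7)
host.receive("mh3", 0, 9)
state_before_crash = host.state

host.state, host.next_expected = -1, {}   # simulate loss of volatile state
host.recover()
recovered_ok = host.state == state_before_crash
duplicate_accepted = host.receive("mh2", 1, 7)   # retransmission after recovery
```

In this sketch, replaying the log in its recorded order restores the pre-failure state, and the sequence-number check rejects a message that a peer retransmits after the recovery. A message that arrives ahead of its predecessor is simply refused here; buffering it would be an equally plausible design.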


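Cite this article

LI Qing-Hua, JIANG Ting-Yao, ZHANG Hong-Jun. A low-cost transparent checkpointing and recovery protocol for mobile computing. Journal of Software, 2005,16(1):135-144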


History

Received: 2002-12-16
Last revised: 2003-11-10
