Complete Debugging of Nondeterministic MPI/PVM Programs
Authors: Wang Feng, An Hong, Chen Zhihui, Chen Guoliang
Funding:

This project is supported by the National High Technology Development Program of China under Grant No. 863-306-ZD01-02-3 (国家863高科技发展计划基金) and the Youth Science Foundation of the University of Science and Technology of China under Grant No. 98-1101 (中国科学技术大学青年科

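    Abstract (translated from the Chinese):

    This paper discusses how to completely debug nondeterministic MPI/PVM parallel programs. During cyclic debugging, nondeterminism means that an error encountered in one run may well fail to reappear in later executions. Based on the FIFO communication model of MPI/PVM, an implementation of the record-replay technique is given. Through controllable replay, users can cover all possible execution paths of the program and thereby debug it completely. Compared with other methods, the proposed method requires much less time and space overhead. The technique has been implemented on two message-passing architectures: one is the Dawning-2000 (曙光-2000) super server (developed by the National Research Center for Intelligent Computing Systems), which consists of single-processor (PowerPC) nodes interconnected by a mesh network; the other is the National High Performance Computing Center (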

    Abstract:

    This paper discusses how to completely debug nondeterministic MPI/PVM parallel programs. Because of nondeterminism, a bug seen in one run may not recur in later executions during a cyclic debugging session. Based on the FIFO communication model of MPI/PVM, an implementation of the record-and-replay technique is presented. Moreover, controllable replay gives users an easy way to debug their programs completely by covering all possible execution paths. Compared with other solutions, the proposed method incurs much less time and space overhead. The implementation has been completed on two kinds of message-passing architectures: one is the Dawning-2000 super server (developed by the National Research Center for Intelligent Computing Systems of China), with single-processor (PowerPC) nodes interconnected by a custom-built wormhole mesh network; the other is a cluster of workstations (PowerPC/AIX) built at the National High Performance Computing Center at Hefei.
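
    The abstract does not describe the implementation itself, so the following is only a minimal sketch of the general record-and-replay idea under a pairwise FIFO assumption, not the authors' method. It simulates a single receiver in plain Python rather than calling MPI or PVM; `Receiver`, `recv_any`, `recv_from`, `run_record`, `run_replay`, and `all_orders` are hypothetical names introduced here for illustration. The assumption is that wildcard receives are the only source of nondeterminism, so recording the sender of each receive is enough to reproduce a run, and enumerating the FIFO-consistent sender orders is one way to picture "covering all possible execution paths".

```python
import random
from collections import deque


class Receiver:
    """Toy receiving process (hypothetical, not an MPI/PVM API).

    Each sender has its own FIFO queue, mirroring the assumption that
    messages between one sender/receiver pair arrive in order.
    """

    def __init__(self, channels):
        self.channels = {s: deque(msgs) for s, msgs in channels.items()}

    def recv_any(self, rng):
        # Wildcard receive: any sender with a pending message may win.
        ready = sorted(s for s, q in self.channels.items() if q)
        sender = rng.choice(ready)
        return sender, self.channels[sender].popleft()

    def recv_from(self, sender):
        # Directed receive used during replay: the sender is forced.
        return sender, self.channels[sender].popleft()


def run_record(channels, seed):
    """Run once with nondeterministic receives; record only sender ids."""
    rng = random.Random(seed)
    proc = Receiver(channels)
    total = sum(len(msgs) for msgs in channels.values())
    trace, delivered = [], []
    for _ in range(total):
        sender, msg = proc.recv_any(rng)
        trace.append(sender)      # the whole record: one sender id per receive
        delivered.append(msg)
    return trace, delivered


def run_replay(channels, trace):
    """Re-run, forcing each receive to take from the recorded sender."""
    proc = Receiver(channels)
    return [proc.recv_from(sender)[1] for sender in trace]


def all_orders(channels):
    """Yield every sender order that respects per-channel FIFO delivery."""
    remaining = {s: len(msgs) for s, msgs in channels.items()}

    def go(prefix):
        if not any(remaining.values()):
            yield list(prefix)
            return
        for s in sorted(remaining):
            if remaining[s]:
                remaining[s] -= 1
                prefix.append(s)
                yield from go(prefix)
                prefix.pop()
                remaining[s] += 1

    yield from go([])
```

    In this sketch the trace stores only a sender id per receive, never the message contents, because FIFO order within each channel fixes which message is delivered once the sender is known. Feeding each order produced by `all_orders` to `run_replay` drives the toy receiver through every delivery order it could observe.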

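Cite this article

Wang Feng, An Hong, Chen Zhihui, Chen Guoliang. Complete debugging of nondeterministic MPI/PVM programs. Journal of Software (软件学报), 2001, 12(3): 334-339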

History
  • Received: 1999-10-15
  • Last revised: 2000-01-25