非平衡进程到达模式下MPI广播的性能优化方法
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

基金项目:

国家自然科学创新群体基金(60621003)


Optimizing Method for Improving the Performance of MPI Broadcast under Unbalanced Process Arrival Patterns
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    为了提高非平衡进程到达(unbalanced process arrival,简称UPA)模式下MPI广播的性能,对UPA模式下的广播问题进行了理论分析,证明了在多核集群环境中通过节点内多个MPI 进程的竞争可以有效减少UPA 对MPI广播性能的影响,并在此基础上提出了一种新的优化方法,即竞争式流水化方法(competitive and pipelined method,简称CP).CP方法通过一种节点内进程竞争机制在广过程中尽早启动节点间通信,经该方法优化的广播算法利用共享内存在节点内通信,利用由竞争机制产生的引导进程执行原算法在节点间通信.并且,该方法使节点间通信和节点内通信以流水方式重叠执行,能够有效利用集群系统各节点的多核优势,减少了MPI广播受UPA的影响,提高了性能.为了验证CP方法的有效性,基于此方法优化了3种典型的MPI广播算法,分别适用于不同消息长度的广播.在真实系统中,通过微基准测试和两个实际的应用程序对CP广播进行了性能评价,结果表明,该方法能够有效地提高传统广播算法在UPA模式下的性能.在应用程序的负载测试实验结果中,CP广播的性能较流水化广播的性能提高约16%,较MVAPICH2 1.2中广播的性能提高18%~24%.

    Abstract:

    This paper aims at improving the performance of MPI broadcasts under unbalanced process arrival (UPA) patterns. This paper analyzes this problem with a performance model and proves that the negative impact of UPA on MPI broadcast can be effectively reduced by the competition of intra-node MPI processes on a multicore cluster. Based on this theory, a new optimizing method, called competitive and pipelined method (CP), is proposed. The CP method can start inter-node communications during the broadcast process through an intra-node competitive mechanism. In a CP method based broadcast algorithm, intra-node communications overlap inter-node communications through a pipelined method, and intra-node communications are implemented through shared memory while inter-node communications are executed by a set of leader MPI processes, which is selected by the competitive mechanism. In order to verify the CP method, this paper improves three typical broadcast algorithms by using this method and evaluates these algorithms in a real platform by using a micro-benchmark case and two practical applications. The results show that the performance of the CP method can effectively improve the performance of broadcast algorithms in the condition of UPA patterns. In the experimental results of the performance of the practical applications, the performance of CP broadcasts is about 16% higher than the performance of P broadcasts and is 18% to 24% higher than the performance of broadcast operation in MVAPICH2 1.2.

    参考文献
    相似文献
    引证文献
引用本文

刘志强,宋君强,卢风顺,徐芬.非平衡进程到达模式下MPI广播的性能优化方法.软件学报,2011,22(10):2509-2522

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2009-12-11
  • 最后修改日期:2010-03-05
  • 录用日期:
  • 在线发布日期:
  • 出版日期:
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号