Optimal Processor Grid Selection for Parallel Program Independent of Load Balance

    Abstract:

    Physical processors are often viewed as a logical processor grid, or process grid, to ease the implementation of parallel algorithms and to provide useful coordination information among parallel processes. However, the shape of the processor grid has a great impact on the final performance of a user's parallel program. How to select a suitable, or even optimal, processor grid for a parallel algorithm on a given parallel machine has therefore become an urgent problem. In this paper, a novel method named MDCPS (minimum degree of communication point set) is proposed, which tries to find the optimal processor grid for a parallel program, independent of the impact of load balance, by analyzing its communication pattern. The analysis results for the ScaLAPACK parallel Cholesky factorization program match the experimental results well and show that the proposed method can successfully select the optimal processor grid for a parallel program.
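
To make the notion of grid shape concrete, the following minimal sketch lists the two-dimensional grid shapes available for a fixed number of processes. It is an illustration only and is not the MDCPS method, whose details are not given in the abstract; the function name `candidate_grids` and the process count of 16 are assumptions chosen for the example.

```python
def candidate_grids(nprocs):
    """Return every (rows, cols) pair with rows * cols == nprocs.

    Hypothetical helper for illustration: it only enumerates shapes and
    does not rank them in any way.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be a positive integer")
    # Each divisor p of nprocs yields one grid shape p x (nprocs // p).
    return [(p, nprocs // p) for p in range(1, nprocs + 1) if nprocs % p == 0]


if __name__ == "__main__":
    # Assumed example: 16 processes can be arranged in five shapes.
    for rows, cols in candidate_grids(16):
        print(f"{rows} x {cols}")
```

A selection method such as the one proposed in the paper would then have to choose among these shapes; the enumeration itself says nothing about which one performs best.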

Get Citation

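Zhang, Yun-quan, Shi, Wei-song. Optimal processor grid selection for parallel program independent of load balance. Journal of Software, 2000,11(12):1674-1680 (in Chinese).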

History
  • Received: June 18, 1999
  • Revised: September 28, 1999