    Abstract:

    Computation partitioning is one of the most important problems in parallel compilation and optimization. For parallel loops whose data distribution has already been determined, a computation partition algorithm based on the subset of uniform schemes is proposed. The paper gives a method for obtaining the subset of uniform schemes, together with an algorithm that selects the best scheme while taking both communication and load balance into account. Experimental results show that the algorithm is simpler and more effective than several previous algorithms for parallel loops, and that the p_HPF compiler, which adopts this algorithm, achieves good speedups and efficiencies. The compiler has been applied in the petroleum field.
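
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of choosing among candidate computation partitions by communication and load balance. It is not the paper's method. It assumes a one-dimensional loop over a BLOCK-distributed array, treats each array reference offset as one candidate scheme (iteration `i` runs on the processor that owns element `i + anchor`), and scores each candidate with a hypothetical weighted sum. All names (`block_owner`, `evaluate`, `choose_scheme`) and both weights are illustrative assumptions.

```python
from collections import Counter


def block_owner(index, size, procs):
    """Processor owning element `index` of a BLOCK-distributed array."""
    block = -(-size // procs)  # ceiling division: elements per processor
    return index // block


def evaluate(anchor, offsets, size, procs, lo, hi):
    """Score one candidate scheme (hypothetical cost model).

    Iteration i executes on the owner of element i + anchor.
    Returns (remote accesses, load imbalance).
    """
    load = Counter()
    remote = 0
    for i in range(lo, hi):
        p = block_owner(i + anchor, size, procs)
        load[p] += 1
        # Every reference owned by another processor needs communication.
        remote += sum(block_owner(i + off, size, procs) != p for off in offsets)
    # Imbalance: busiest processor's load minus the ideal even share.
    imbalance = max(load.values()) - (hi - lo) / procs
    return remote, imbalance


def choose_scheme(offsets, size, procs, lo, hi, comm_weight=1.0, balance_weight=1.0):
    """Return (cost, anchor, remote, imbalance) for the cheapest candidate."""
    best = None
    for anchor in sorted(set(offsets)):
        remote, imbalance = evaluate(anchor, offsets, size, procs, lo, hi)
        cost = comm_weight * remote + balance_weight * imbalance
        if best is None or cost < best[0]:
            best = (cost, anchor, remote, imbalance)
    return best


if __name__ == "__main__":
    # Loop body reads elements i-1, i and i+1 for i in [1, 15) on 4 processors.
    print(choose_scheme([0, -1, 1], size=16, procs=4, lo=1, hi=15))
```

In this toy setting the candidate anchored on element `i` wins because it crosses block boundaries least often. A real compiler would work symbolically on the loop bounds and distribution directives rather than enumerating iterations.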

Get Citation

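Huang QJ, Yang JW, Yu HS, Xu ZQ. Computation partition for parallel loops based on the subset of uniform schemes. Journal of Software, 2003,14(3):362-368 (in Chinese with English abstract).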

History
  • Received: November 19, 2001
  • Revised: May 13, 2002