A Parallel Finite Difference Stencil Algorithm Based on Iterative Space Alternate Tiling

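Efficient parallel finite difference stencil algorithms are very important for solving large linear systems. This paper addresses the poor data locality and the high synchronization and communication overhead of parallel finite difference stencil algorithms. First, the traditional finite difference stencil algorithm is improved, and a multi-layer symmetric traversal finite difference stencil algorithm is proposed. Then a serial algorithm is given that uses the order of the iteration space tiles as its execution order; by partitioning the iteration space along the time axis with time skewing, it performs multiple iterations inside each tile without changing the properties of the iterative algorithm, which improves data locality. Finally, a parallel algorithm based on iteration space tiles is proposed; it uses an improved polyhedral model to partition the iteration space grid and reorders the grid tiles to reduce the cache miss rate and the number of communication startups and synchronizations. Theoretical analysis and experimental results show that this parallel model has better data locality, parallel efficiency, and scalability than the traditional domain decomposition method and the red-black ordering parallel algorithm.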

Difference stencils are fundamental computations in a broad range of scientific and engineering programs. To improve data locality and reduce communication overhead, this paper proposes a novel alternate tiling stencil algorithm for distributed memory machines that exploits the properties of the iterative algorithm. The serial execution process of this iterative method is given first: it uses the order of the iteration space tiles as the execution order and applies the time skewing technique to partition the iteration space. In this process, the nodes of a tile can be traversed many times before the computation moves on, which improves data locality. A parallel algorithm based on iteration space tiling is then presented; it uses an improved polyhedral model to implement the tiling and reorders the tiles of the iteration space to reduce cache misses and the cost of communication and synchronization. Alternate tiling is compared theoretically with other parallelization techniques. Finally, numerical results confirm the effectiveness of the serial and parallel execution models of the alternate tiling finite difference stencil algorithm, in particular in comparison with domain decomposition and red-black iterative methods, and show that the new parallel iterative method achieves good data locality, parallel efficiency, and scalability.
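The time skewing idea mentioned above can be illustrated with a minimal serial sketch. It is not the paper's alternate tiling algorithm. It assumes a generic 3-point Jacobi averaging stencil on a 1D grid with fixed boundary values, and the function names and tile sizes (`jacobi_naive`, `jacobi_time_skewed`, `tile_t`, `tile_j`) are hypothetical. The skewed coordinate j = i + t makes all dependencies point forward in both t and j, so rectangular tiles in (t, j) can be executed one after another while each tile advances its nodes through several time steps.

```python
def jacobi_naive(u0, steps):
    """Reference: sweep the whole grid once per time step."""
    cur = list(u0)
    for _ in range(steps):
        nxt = list(cur)  # boundary values are carried over unchanged
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur


def jacobi_time_skewed(u0, steps, tile_t=4, tile_j=8):
    """Same computation, executed tile by tile in skewed (t, j) space.

    Assumption for illustration: j = i + t is the skewed space index,
    and two buffers alternate by the parity of t.
    """
    n = len(u0)
    buf = [list(u0), list(u0)]  # buf[t % 2] holds the values of time step t
    for t0 in range(1, steps + 1, tile_t):          # bands of time steps
        t1 = min(t0 + tile_t, steps + 1)
        j_lo = 1 + t0                               # smallest j in this band
        j_hi = (n - 2) + (t1 - 1)                   # largest j in this band
        for j0 in range(j_lo, j_hi + 1, tile_j):    # tiles along skewed axis
            j1 = min(j0 + tile_j, j_hi + 1)
            for t in range(t0, t1):                 # several time steps per tile
                src = buf[(t - 1) % 2]
                dst = buf[t % 2]
                for j in range(j0, j1):
                    i = j - t                       # back to the grid index
                    if 1 <= i <= n - 2:             # interior points only
                        dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return buf[steps % 2]
```

For example, `jacobi_time_skewed(u0, 10, tile_t=3, tile_j=5)` performs the same point updates as `jacobi_naive(u0, 10)`, only in a different order; each tile reuses its grid nodes over up to `tile_t` time steps while they are likely still in cache.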
