[Keywords]
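[Abstract]
Because hardware-managed cache hierarchies cost too much in power and area, many-core processors tend to adopt software-managed memory hierarchies, which require software to plan the placement and transfer of program data across the levels of the hierarchy. This paper tries a simple automatic data tiling method that relies on the program's original loop structure and problem size: data are tiled according to the range of memory accesses within each loop level, which avoids complex data dependence analysis and makes the method easy to implement in a compiler. Where needed, the method can be combined with program transformations such as loop interchange, loop fusion, and loop fission to obtain better tiling parameters. Experimental results show that for most problem sizes its optimized performance is comparable to that of general tiling methods, while for certain specific problem sizes it achieves higher performance.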
[Keywords]
[Abstract]
Many-core processors prefer software-managed memory hierarchies to cache-based memory hierarchies owing to the area and power consumption of caches. Software-managed memory hierarchies require software to manage data placement and data transfer explicitly. This paper proposes a simple data tiling method, a compilation technique for large memory objects such as large arrays, based on the loop characteristics of the program and the scale of its data. The method is easy to implement in a compiler and is as efficient as the loop and data tiling method. Experimental results on several benchmarks show that it can outperform the loop and data tiling method when it acquires additional data locality in on-chip memory.
[CLC number]
[Funding]
Supported by the National Basic Research Program of China (973 Program) under Grant No. 2007CB310900
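The idea summarized in the abstract, choosing a tile from the data access range of each loop level and the problem size, might be sketched roughly as follows. This is only an illustrative sketch and not the paper's actual algorithm: the `LoopLevel` and `TileChoice` types, the `choose_tile` function, and the assumption that the compiler already knows the bytes touched by one iteration of each loop are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical descriptor of one loop in a nest (not from the paper). */
typedef struct {
    size_t trip_count;      /* number of iterations of this loop */
    size_t bytes_per_iter;  /* data touched by ONE iteration of this loop */
} LoopLevel;

/* Hypothetical result: which loop to tile and how many iterations per tile. */
typedef struct {
    int    level;       /* index of the chosen loop, -1 if nothing fits */
    size_t tile_iters;  /* iterations of that loop grouped into one tile */
} TileChoice;

/*
 * Walk the nest from the outermost loop (index 0) inward and pick the first
 * loop whose single-iteration footprint fits in the on-chip memory
 * (local_bytes). The tile then groups as many iterations of that loop as fit,
 * clamped to the trip count, so a small problem becomes a single tile.
 */
TileChoice choose_tile(const LoopLevel *nest, int depth, size_t local_bytes)
{
    TileChoice choice = { -1, 0 };
    for (int l = 0; l < depth; ++l) {
        size_t fp = nest[l].bytes_per_iter;
        if (fp == 0 || fp > local_bytes)
            continue;                       /* one iteration does not fit */
        size_t t = local_bytes / fp;        /* iterations that fit on chip */
        if (t > nest[l].trip_count)
            t = nest[l].trip_count;         /* whole loop fits: one tile */
        choice.level = l;
        choice.tile_iters = t;
        return choice;
    }
    return choice;
}
```

For a 1024 x 1024 array of 8-byte elements traversed row by row, one iteration of the outer loop touches 8192 bytes, so under these assumptions a 64 KiB local store would hold 8 rows per tile, while a 4 KiB local store would force tiling of the inner loop instead. No dependence analysis is involved; the choice depends only on footprints and problem size, which is the property the abstract emphasizes.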