主页期刊介绍编委会编辑部服务介绍道德声明在线审稿编委办公编辑办公English
2018-2019年专刊出版计划 微信服务介绍 最新一期:2018年第12期
     
在线出版
各期目录
纸质出版
分辑系列
论文检索
论文排行
综述文章
专刊文章
美文分享
各期封面
E-mail Alerts
RSS
旧版入口
中国科学院软件研究所
  
投稿指南 问题解答 下载区 收费标准 在线投稿
何王全,刘勇,方燕飞,魏迪,漆锋滨.面向国产异构众核系统的Parallel C语言设计与实现.软件学报,2017,28(4):764-785
面向国产异构众核系统的Parallel C语言设计与实现
Design and Implementation of Parallel C Programming Language for Domestic Heterogeneous Many-Core Systems
投稿时间:2016-06-20  修订日期:2016-09-08
DOI:10.13328/j.cnki.jos.005197
中文关键词:  异构众核  编程模型  并行语言  Parallel C  编译器  消息传递
英文关键词:heterogeneous many-core  programming model  parallel language  Parallel C  compiler  message passing
基金项目:国家重点基础研究发展计划(973)(2016YFB0200502);国家高技术研究发展计划(863)(2012AA010903,2015AA01A301);计算机体系结构国家重点实验室基金(CARCH201403)
作者单位E-mail
何王全 江南计算技术研究所, 江苏 无锡 214083 wangquan_he@163.com 
刘勇 江南计算技术研究所, 江苏 无锡 214083  
方燕飞 江南计算技术研究所, 江苏 无锡 214083  
魏迪 江南计算技术研究所, 江苏 无锡 214083  
漆锋滨 江南计算技术研究所, 江苏 无锡 214083  
摘要点击次数: 1471
全文下载次数: 1000
中文摘要:
      异构众核架构具有超高的性能功耗比,已成为超级计算机体系结构的重要发展方向.但众核系统更为复杂的并行层次和存储层次,给编程和优化带来了极大的挑战.因此,研究面向众核系统的并行编程技术,对于降低国产众核系统并行应用的编程难度、提升并行程序的性能都具有重要的意义.提出统一架构的多模式并行编程模型,包括异构融合的加速运算模型和按同构方式编程的自主运算模型,根据编程模型设计了Parallel C语言,能够有效地描述国产众核系统的异构并行性.与其他众核系统上MPI+X的使用模式相比,编程和系统优化都具有全局视角,在多级局部性描述、单边消息、兼容已有多核应用等方面具有特色;基于Open64构建了Parallel C编译系统,全面支持加速运算模型和自主运算模型,提出并实现了数据布局与自动DMA、编译指导的线程代理和拓扑位置感知的集合通信等优化.Micro Benchmark和实际应用在神威太湖之光计算机系统上的测试数据结果表明:Parallel C语言和编译系统具有良好的性能和可扩展性,能够有效支撑大型应用.
英文摘要:
      Heterogeneous many-core architecture, with ultra-high performance to power consumption ratio, has become an important trend of supercomputer architecture development. However, many-core systems always have more complex parallel hierarchy and memory hierarchy, hence posing a great challenge to programming and optimization. Therefore, the study of many-core-oriented parallel programming techniques is of great significance, since it can reduce the difficulty of parallel programming on domestic many-core systems and improve the performance of parallel programs. This work proposes a multi-model parallel programming model upon unified architecture, including heterogeneous-fused speedup programming model and isomorphic independent programming model. Based on this model, Parallel C programming language is designed to effectively describe heterogeneous parallelism of the domestic many-core system. Compared to MPI+X programming pattern, programming with Parallel C has a global perspective, as well as advantages in the hierarchy locality description, one-side message passing and multi-core applications compatibility. The Parallel C compiler system constructed with Open64 fully supports the heterogeneous-fused speedup programming model and isomorphic independent programming model. In addition, the design and implementation of data layout and automatic DMA optimization, compiler-directed thread proxy optimization and topology-aware collective communications optimization are presented. The performance of the proposed method is evaluated with the Miro Benchmark and practical applications on Sunway Taihu Light computer system. Experimental results show that Parallel C language and the compile system have good performance and scalability to effectively support large-scale applications.
HTML  下载PDF全文  查看/发表评论  下载PDF阅读器
 

京公网安备 11040202500064号

主办单位:中国科学院软件研究所 中国计算机学会
编辑部电话:+86-10-62562563 E-mail: jos@iscas.ac.cn
Copyright 中国科学院软件研究所《软件学报》版权所有 All Rights Reserved
本刊全文数据库版权所有,未经许可,不得转载,本刊保留追究法律责任的权利