Optimization and Analysis of HPL on Domestic Heterogeneous System

doi:10.13328/j.cnki.jos.006004


Volume 32, Issue 8, 2021, pp. 2319-2328


Author: SHUI Chao-Yang, YU Xian-Zhi, WANG Yin-Shan, TAN Guang-Ming
Affiliation: Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
CLC Number: TP303
Fund Project: National Key Research and Development Program of China (2018YFB0204400, 2016YFB0201305, 2016YFB0200803, 2016YFB0200300); Strategic Priority Research Program of the Chinese Academy of Sciences (Category C) (XDC01030000); National Natural Science Foundation of China (61972377, 61432018, 61702483); Key Research Program of Frontier Sciences of the Chinese Academy of Sciences (QYZDJ-SSW-JSC035)


Abstract:

As heterogeneous systems become one of the most important choices for building supercomputers, orchestrating CPUs and accelerators to exploit the computing power of such systems is of great significance. HPL is the most important benchmark in the HPC field, but the traditional HPL algorithm, which targets CPU-only systems, cannot achieve high performance by merely offloading the matrix multiplication workload to accelerators. To solve this problem, this work proposes an HPL performance model and a multithreaded fine-grained pipelining algorithm for a heterogeneous system built from domestic processors and domestic accelerators. In addition, a lightweight cross-platform heterogeneous framework is implemented to support a cross-platform HPL algorithm. The proposed performance model predicts HPL performance accurately on similar heterogeneous systems. On the NVIDIA platform, the proposed HPL algorithm outperforms NVIDIA's proprietary counterpart by 9%. On the domestic-processor, domestic-accelerator platform, the final optimized Linpack program achieves 2.3 PFLOPS on 512 nodes, with a floating-point efficiency of 71.1%.
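
As a rough reading of the headline figures, floating-point efficiency is conventionally the ratio of achieved to theoretical peak performance. The sketch below assumes that definition and derives the implied peak and per-node rates from the rounded numbers in the abstract. The function and variable names are illustrative and are not taken from the paper, and the derived values are estimates rather than reported results.

```python
def implied_peak_pflops(achieved_pflops: float, efficiency: float) -> float:
    """Return the peak implied by efficiency = achieved / peak (assumed definition)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return achieved_pflops / efficiency


# Figures quoted in the abstract (rounded as published).
ACHIEVED_PFLOPS = 2.3
EFFICIENCY = 0.711
NODES = 512

peak_pflops = implied_peak_pflops(ACHIEVED_PFLOPS, EFFICIENCY)

# Per-node rates in TFLOPS (1 PFLOPS = 1000 TFLOPS).
per_node_achieved_tflops = ACHIEVED_PFLOPS * 1000 / NODES
per_node_peak_tflops = peak_pflops * 1000 / NODES

print(f"implied system peak: {peak_pflops:.2f} PFLOPS")
print(f"achieved per node:   {per_node_achieved_tflops:.2f} TFLOPS")
print(f"implied peak/node:   {per_node_peak_tflops:.2f} TFLOPS")
```

Under these assumptions the system peak comes out at roughly 3.2 PFLOPS, or about 6.3 TFLOPS per node, of which about 4.5 TFLOPS per node is sustained. Because the inputs are rounded, the results are only accurate to about two significant digits.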

Key words: HPL; heterogeneous system; cross-platform; performance modeling; exascale computing

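Citation:

SHUI Chao-Yang, YU Xian-Zhi, WANG Yin-Shan, TAN Guang-Ming. Optimization and Analysis of HPL on Domestic Heterogeneous System. Journal of Software, 2021, 32(8): 2319-2328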


History

Received: August 16, 2019
Revised: December 05, 2019
Online: August 05, 2021
Published: August 06, 2021
