Reduction Algorithm Optimization Based on the OpenCL

Volume 22, Issue zk2, 2011, pp. 163-171

Author:
YAN Shen-Gen
Laboratory of Parallel Software and Computational Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computing Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, Ch
ZHANG Yun-Quan
Laboratory of Parallel Software and Computational Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computing Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, Ch
LONG Guo-Ping
Laboratory of Parallel Software and Computational Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China
LI Yan
Laboratory of Parallel Software and Computational Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computing Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, Ch

Abstract:

Reduction is widely used in areas such as scientific computing and image processing. This paper systematically studies cross-platform performance optimization of the reduction algorithm on GPUs using the OpenCL framework. Previous research has generally focused on a single hardware architecture; this paper, based on OpenCL, studies a range of optimization methods, including vectorization, reducing on-chip memory bank conflicts, thread organization, and instruction selection. Taking the minMax function as an example, it shows the performance gain from each optimization method and explains the reasons in detail. The algorithm is tested on both AMD and NVIDIA GPU platforms, and the results show that the optimized algorithm performs well on both. On the AMD ATI Radeon HD 5850, bandwidth utilization reaches up to 89% for int and float data. On the NVIDIA Tesla C2050, performance reaches 1.3 to 1.9 times that of the corresponding CUDA version of the function.

Key words: GPU; parallel reduction; OpenCL; CUDA
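
The abstract uses minMax as its running example of a reduction. As a rough illustration only, and not the authors' kernel, the Python sketch below simulates on the CPU how one work-group might reduce a buffer to its minimum and maximum with a tree reduction using sequential addressing, an access pattern commonly chosen to limit local-memory bank conflicts. The names `reduce_group` and `minmax_tree_reduce`, the `group_size` parameter, and the two-stage structure are all assumptions introduced for this sketch.

```python
from typing import List, Sequence, Tuple


def reduce_group(chunk: Sequence[float]) -> Tuple[float, float]:
    """Simulate one work-group reducing its chunk to (min, max).

    Hypothetical helper: each loop index i stands in for one work-item.
    """
    # Round the working size up to a power of two.
    n = 1
    while n < len(chunk):
        n *= 2
    # Pad with identity elements so padding never changes the result.
    lo: List[float] = list(chunk) + [float("inf")] * (n - len(chunk))
    hi: List[float] = list(chunk) + [float("-inf")] * (n - len(chunk))

    # Sequential addressing: item i combines slot i with slot i + stride,
    # so active items always touch a contiguous range of slots.
    stride = n // 2
    while stride > 0:
        for i in range(stride):
            lo[i] = min(lo[i], lo[i + stride])
            hi[i] = max(hi[i], hi[i + stride])
        # On a GPU, a work-group barrier would separate the steps here.
        stride //= 2
    return lo[0], hi[0]


def minmax_tree_reduce(data: Sequence[float], group_size: int = 8) -> Tuple[float, float]:
    """Two-stage reduction: per-group partial results, then a final pass."""
    if len(data) == 0:
        raise ValueError("data must not be empty")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # Stage 1: every simulated work-group reduces its own slice.
    partials = [
        reduce_group(data[start:start + group_size])
        for start in range(0, len(data), group_size)
    ]
    # Stage 2: reduce the per-group minima and maxima.
    mins = [p[0] for p in partials]
    maxs = [p[1] for p in partials]
    return reduce_group(mins)[0], reduce_group(maxs)[1]
```

For example, `minmax_tree_reduce([3, -1, 7, 4, 0, 9, 2], group_size=4)` returns `(-1, 9)`. Vectorized loads and instruction selection, which the abstract also lists, are not modeled here.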

Citation:

YAN Shen-Gen, ZHANG Yun-Quan, LONG Guo-Ping, LI Yan. Reduction algorithm optimization based on the OpenCL. Journal of Software, 2011, 22(zk2): 163-171.

History

Received: July 15, 2011
Revised: December 2, 2011
Online: March 30, 2012