Power Consumption Prediction Model of General-Purpose Computing GPU with Static Program Slicing

doi:10.3724/SP.J.1001.2013.04361


Ruan Jian Xue Bao/Journal of Software, 2013, Vol. 24, No. 8, pp. 1746-1760

Authors:

WANG Hai-Feng
School of Management, University of Shanghai for Science and Technology, Shanghai 200093, China; Information School, Linyi University, Linyi 276000, China

CHEN Qing-Kui
School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; School of Management, University of Shanghai for Science and Technology, Shanghai 200093, China

Fund Project: National Natural Science Foundation of China (60970012); Specialized Research Fund for the Doctoral Program of the Ministry of Education (20113120110008); Shanghai Key Science and Technology Project (09511501000, 09220502800); Shanghai First-Class Discipline Construction Project (XTKX2012)


Abstract:

With the growth of general-purpose computing on graphics processing units (GPUs), measuring and optimizing the power consumption of GPU programs has become a fundamental problem in green computing. At present, GPU energy consumption is evaluated mainly with hardware instrumentation, so developers cannot learn an application's energy profile before compilation and find it difficult to optimize or refactor code under energy constraints. To address this problem, static power prediction models that work on application source code are proposed. Based on the density of branch structures and on static program slicing, separate prediction models are built for branch-sparse and branch-dense programs. A program slice is a measurement granularity between the instruction level and the function level, and it offers sound theoretical support and good feasibility for analyzing GPU applications. Two slice-level power models are built, one using nonlinear regression and one using a wavelet neural network. For a specific GPU, the nonlinear regression model is the more accurate of the two, whereas the wavelet neural network model generalizes better across GPU architectures. After the branch structure of an application is analyzed, a weighted power statistical model is used for branch-sparse programs to keep the estimation algorithm efficient, and a prediction method based on execution-path probabilities is used for branch-dense programs to improve accuracy. Experimental results show that both models and their algorithms can effectively estimate the power consumption of general-purpose GPU programs, with an average relative error of less than 6% between predicted and measured values.

Key words: power consumption model; GPU computing; nonlinear regression; program slicing; wavelet neural network
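
The abstract names a weighted power model and an execution-path probability model but does not give their formulas. The Python sketch below is therefore only one plausible reading, not the authors' method: it assumes each program slice already has a predicted power and an estimated run time, combines slices on a path by a time-weighted average, and combines paths by a probability-weighted average. The names `Slice`, `weighted_power`, `path_probability_power`, and `relative_error`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Slice:
    """A program slice with assumed (hypothetical) per-slice estimates."""
    power_w: float  # predicted average power of the slice, in watts
    time_s: float   # estimated execution time of the slice, in seconds


def weighted_power(slices: Sequence[Slice]) -> float:
    """Time-weighted average power over the slices of one execution path."""
    total_time = sum(s.time_s for s in slices)
    if total_time <= 0:
        raise ValueError("total execution time must be positive")
    return sum(s.power_w * s.time_s for s in slices) / total_time


def path_probability_power(
    paths: Sequence[Tuple[float, Sequence[Slice]]]
) -> float:
    """Probability-weighted average of the per-path power estimates.

    Each entry is (probability of taking the path, slices on the path).
    """
    total_p = sum(p for p, _ in paths)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("path probabilities must sum to 1")
    return sum(p * weighted_power(slices) for p, slices in paths)


def relative_error(predicted: float, measured: float) -> float:
    """Relative error of a prediction against a measured value."""
    return abs(predicted - measured) / abs(measured)


# Hypothetical example: one branch with two possible execution paths.
then_path = [Slice(100.0, 2.0), Slice(140.0, 2.0)]
else_path = [Slice(100.0, 2.0), Slice(80.0, 2.0)]

predicted = path_probability_power([(0.75, then_path), (0.25, else_path)])
print(predicted)
print(relative_error(predicted, measured=110.0))
```

For a program with few branches, calling `weighted_power` on the single dominant path avoids enumerating paths at all. That matches the efficiency motivation stated in the abstract, although the paper's actual weighting scheme may differ.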


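Cite this article

WANG Hai-Feng, CHEN Qing-Kui. Power consumption prediction model of general-purpose computing GPU with static program slicing. Ruan Jian Xue Bao/Journal of Software, 2013,24(8):1746-1760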


History

Received: 2012-08-03
Last revised: 2012-10-19
Published online: 2013-07-26
