Two-Stage Workload Scheduling Problem on GPU Architectures: Formulation and Approximation Algorithm

doi:10.13328/j.cnki.jos.004527

2014, Vol. 25, No. 2, pp. 298-313

Authors: SUN Jing-Hao, DENG Qing-Xu, MENG Ya-Kun
Affiliation: School of Information Science and Engineering, Northeastern University, Shenyang 110004, Liaoning, China
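Funding: National Natural Science Foundation of China (61300194); Doctoral Program Foundation of the Ministry of Education of China (20110042110021); National Key Technology R&D Program of China (2012BAK24B01); Natural Science Foundation of Hebei Province (F2013501048)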

Abstract:

As hardware capabilities grow richer and software development environments mature, GPUs (graphics processing units) are increasingly applied to general-purpose computing and play a crucial role in improving the performance of many computing systems, especially embedded systems. In GPU-based systems, massively parallel workloads (kernels) often transfer and load data at the same time, so data transfer latency can no longer be neglected when optimizing overall system performance. This paper considers the transfer time and the execution time of each workload together, takes minimization of the makespan of all workloads as the global optimization objective, and studies the joint "transfer-execution" scheduling problem of workloads on GPUs. First, the timing information and the number of parallel tasks of each workload are mapped to the two dimensions of a rectangular region, which yields a 2D two-layer rectangle model of the workloads. Then, the workload scheduling problem on GPUs is formulated as a strip-packing problem. Finally, a polynomial-time 3-approximation algorithm based on a greedy strategy is proposed, with time complexity O(n log n). The core of the approximation algorithm is workload sequencing during the data transfer stage. The result shows, at the theoretical level, that two-stage "transfer-execution" scheduling is effective for GPU systems: workload sequencing in the data transfer stage, combined with first-come-first-serve (FCFS) scheduling in the execution stage, achieves optimal or near-optimal system performance.

Key words: GPU (graphics processing unit); data transfer; workload sequencing; strip-packing; approximation algorithm
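
The two-stage idea summarized in the abstract can be illustrated with a small simulation. The sketch below is only a toy model under stated assumptions, not the paper's algorithm: it assumes a single transfer channel that copies one workload at a time, a GPU with a fixed number of parallel units, and FCFS execution in transfer order. The `Workload` fields, the `makespan` function, and the shortest-transfer-first sorting key are all hypothetical choices made for illustration; the abstract does not state the actual sequencing rule.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    transfer: float  # time to copy the workload's data to the GPU
    execute: float   # execution time once the data is on the GPU
    width: int       # parallel units occupied while executing


def makespan(order, capacity):
    """Simulate serial transfers followed by FCFS execution.

    Assumption: workloads start executing in the same order in which
    they were transferred, and never before their transfer finishes.
    """
    clock = 0.0        # time at which the transfer channel becomes free
    prev_start = 0.0   # FCFS: a workload cannot start before its predecessor
    running = []       # min-heap of (finish_time, width)
    used = 0           # parallel units currently occupied
    finish = 0.0
    for w in order:
        if w.width > capacity:
            raise ValueError(f"{w.name} needs more units than the GPU has")
        clock += w.transfer
        start = max(clock, prev_start)
        # Release units of workloads that finished by the tentative start.
        while running and running[0][0] <= start:
            used -= heapq.heappop(running)[1]
        # Wait for further completions until the workload fits.
        while used + w.width > capacity:
            done, width = heapq.heappop(running)
            start = max(start, done)
            used -= width
        heapq.heappush(running, (start + w.execute, w.width))
        used += w.width
        prev_start = start
        finish = max(finish, start + w.execute)
    return finish


demo = [
    Workload("A", transfer=2, execute=3, width=2),
    Workload("B", transfer=1, execute=4, width=2),
    Workload("C", transfer=3, execute=1, width=4),
]

# Hypothetical sequencing rule: shortest transfer first.
sequenced = sorted(demo, key=lambda w: w.transfer)

print(makespan(demo, capacity=4))       # arrival order A, B, C
print(makespan(sequenced, capacity=4))  # sequenced order B, A, C
```

On this toy instance the sequenced order finishes at time 7 while the arrival order finishes at time 8, which shows how the order chosen in the transfer stage alone can change the makespan even though execution stays FCFS.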

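Cite this article

SUN Jing-Hao, DENG Qing-Xu, MENG Ya-Kun. Two-stage workload scheduling problem on GPU architectures: Formulation and approximation algorithm. Journal of Software (软件学报), 2014, 25(2): 298-313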

History

Received: 2013-05-06
Last revised: 2003-09-29
Published online: 2014-01-26
