众核处理器系统核资源动态分组的自适应调度算法
作者:
基金项目:

国家自然科学基金(61073011, 61133004, 61173039); 国家高技术研究发展计划(863)(2008AA01A202, 2009AA01A131); 中意国际合作项目(2009DFA12110)


Adaptive Scheduling Algorithm Based on Dynamic Core-Resource Partitions for Many-Core Processor Systems
Author:
  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [17]
  • |
  • 相似文献
  • |
  • 引证文献
  • | |
  • 文章评论
    摘要:

    针对众核处理器系统的核资源优化使用问题,提出了一种支持核资源动态分组的自适应调度算法CASM (core-partitioned adaptive scheduling for many-core systems).该算法通过对任务簇的拆分与合并,动态构建可弹性分区的核逻辑组,实现核资源的隔离优化访问.为了平衡核资源利用率及任务调度效率,CASM 算法针对任务簇间和簇内的不同特点,分别采用公平性较好的均衡调度算法和资源利用率较高的自适应调度算法.在线竞争理论分析表明,CASM 算法的任务执行时间在线竞争比为常数2,其性能可扩展性较好.实验结果表明,与WS(work-stealing),AGDEQ(adaptive greedy dynamic equi-partitioning)和EQUI?EQUI 算法相比,CASM算法使任务集运行时间分别减少了近46%,32%和15%.在相同能耗情况下,CASM 算法大幅度地提升了系统吞吐量.

    Abstract:

    With the aim to address the increasing difficulty of efficiently using large number of cores in many-core processors, a core-partitioned adaptive scheduling algorithm, named CASM (core-partitioned adaptive scheduling for many-core systems), is proposed. CASM dynamically aggregates cores into different partitions by splitting or merging task-clusters, which ensures the efficiency of isolated accessing in these core partitions. To improve the scheduling efficiency of CASM, equi-partitioning scheduling algorithm is adopted to reallocate the cores among task-clusters, and the feedback-driven adaptive scheduling algorithm is implemented within the task-clusters. Online competitive analysis shows that CASM achieves 2-competitiveness ratio with respect to the execution time of parallel jobs, which indicates that CASM has better performance and scalability. The experimental results demonstrate that compared with WS (work-stealing), AGDEQ (adaptive greedy dynamic equi-partitioning) and EQUI?EQUI, CASM reduces the execution time of the same workload by nearly 46%, 32% and 15% respectively. Under the same power consumption, CASM greatly enhances the system throughput.

    参考文献
    [1] Geer D. Chip makers turn to multicore processors. Computer, 2005,38(5):11-13. [doi: 10.1109/MC.2005.160]
    [2] Borkar S, Chien AA. The future of microprocessors. Communications of the ACM, 2011,54(5):67-77. [doi: 10.1145/1941487.1941507]
    [3] Bridges MJ, Vachharajani N, Zhang Y, Jablin T, August DI. Revisiting the sequential programming model for the multicore Era. IEEE Micro, 2008,28(1):12-20. [doi: 10.1109/MM.2008.13]
    [4] Abadi M, Birrell A, Harris T, Isard M. Semantics of transactional memory and automatic mutual exclusion. ACM Trans. on Programming Languages and Systems, 2011,33(1):1-50. [doi: 10.1145/1889997.1889999]
    [5] Loh E. The ideal HPC programming language. Communications of the ACM, 2010,53(7):42-47. [doi: 10.1145/1785414.1785433]
    [6] Bhattacharjee A, Contreras G, Martonosi M. Parallelization libraries: Characterizing and reducing overheads. ACM Trans. on Architecture and Code Optimization, 2011,8(1):5-29. [doi: 10.1145/1952998.1953003]
    [7] Broquedis F, Diakhat? F, Thibault S, Aumage O, Namyst R, Wacremier PA. Scheduling dynamic OpenMP applications over multicore architectures. In: Eigenmann R, Supinski BR, eds. OpenMP in a New Era of Parallelism. Berlin: Springer-Verlag, 2008. 170-180. [doi: 10.1007/978-3-540-79561-2_15]
    [8] Frigo M, Leiserson CE, Randall KH. The implementation of the Cilk-5 multithreaded language. ACM SIGPLAN Notices, 1998, 33(5):212-223. [doi: 10.1145/277652.277725]
    [9] Long GP, Zhang JC, Fan DR. Architectural support and evaluation of Cilk language on many-core architectures. Chinese Journal of Computers, 2008,31(11):1975-1985 (in Chinese with English abstract).
    [10] Ebrahimi E, Lee CJ, Mutlu O, Patt YN. Fairness via source throttling: A configurable and high-performance fairness substrate for multi-core memory systems. ACM SIGPLAN Notices, 2010,45(3):335-346. [doi: 10.1145/1735971.1736058]
    [11] Das R, Mutlu O, Moscibroda T, Das C. A?rgia: A network-on-chip exploiting packet latency slack. IEEE Micro, 2011,31(1): 29-41. [doi :10.1109/MM.2010.98]
    [12] Edmonds J. Scheduling in the dark (improved result). Theoretical Computer Science, 2007,235(1):109-141. [doi: 10.1145/301250.301299]
    [13] Robert J, Schabanel N. Non-Clairvoyant batch sets scheduling: Fairness is fair enough. In: Arge L, Hoffmann M, Welzl E, eds. Lecture Notes in Computer Science, Berlin: Springer-Verlag, 2007. 741-753. [doi: 10.1007/978-3-540-75520-3_65]
    [14] Kumar V, Fedorova A. Towards better performance per watt in virtual environments on asymmetric single-isa multi-core systems. ACM SIGOPS Operating Systems Review, 2009,43(3):105-109. [doi: 10.1145/1618525.1618538]
    [15] Marr, S. Haupt, M, Timbermont, S, Adams B, D’Hondt T, Costanza P, de Meuter W. Virtual machine support for many-core architectures: Decoupling abstract from concrete concurrency models. In: Beresford RA, Gay S, eds. Proc. of the Programming Language Approaches to Concurrency and Communication-cEntric Software, EPTCS. 2010. 63-77. [doi: 10.4204/EPTCS.17.6]
    [16] He YX, Hsu WJ, Leiserson CE. Provably efficient online non-clairvoyant adaptive scheduling. IEEE Trans. on Parallel and Distributed Systems, 2008,19(9):1263-1279. [doi: 10.1109/IPDPS.2007.370303]
    [17] Pruhs K. Competitive online scheduling for server systems. ACM SIGMETRICS Performance Evaluation Review, 2007,34(4): 52-58. [doi: 10.1145/1243401.1243411]
    相似文献
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

曹仰杰,钱德沛,伍卫国,董小社.众核处理器系统核资源动态分组的自适应调度算法.软件学报,2012,23(2):240-252

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2011-07-12
  • 最后修改日期:2011-09-06
  • 在线发布日期: 2012-02-07
文章二维码
您是第19877004位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号