Temperature-aware Task Scheduling on Multicores Based on Reinforcement Learning

CLC Number: TP316

Fund Project: National Natural Science Foundation of China (61902341)

    Abstract:

    With the increasing number of cores in computer systems, temperature-aware multi-core task scheduling algorithms have become a research hotspot. In recent years, machine learning has shown great potential in various fields, and many studies have applied machine learning techniques to manage system temperature. Among them, reinforcement learning is widely used in temperature-aware task scheduling because of its strong adaptability. However, state-of-the-art temperature-aware task scheduling algorithms based on reinforcement learning do not model the system effectively, and they struggle to achieve a good trade-off among temperature, performance, and complexity. Therefore, this study proposes ReLeTA, a new temperature-aware multi-core scheduling algorithm based on reinforcement learning. The new algorithm introduces a more comprehensive state modeling method and a more effective reward function to help the system further reduce its temperature. Experiments on three different real computer platforms demonstrate the effectiveness and scalability of the proposed method. Compared with existing methods, ReLeTA controls the system temperature better.
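    The general framing the abstract describes, i.e. casting temperature-aware task-to-core mapping as a reinforcement learning problem with a temperature-based state and reward, can be illustrated with a minimal tabular Q-learning sketch. This is not the paper's ReLeTA algorithm; the state discretization, reward (negative peak temperature), and the toy thermal model below are all illustrative assumptions.

```python
import random
from collections import defaultdict

NUM_CORES = 4
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

# Q-table mapping (state, chosen core) -> estimated long-term value.
q_table = defaultdict(float)

def discretize(temps, bucket=5.0):
    """Bucket per-core temperatures so the tabular state space stays small."""
    return tuple(int(t // bucket) for t in temps)

def choose_core(state):
    """Epsilon-greedy selection of the core for the next task."""
    if random.random() < EPSILON:
        return random.randrange(NUM_CORES)
    return max(range(NUM_CORES), key=lambda c: q_table[(state, c)])

def update(state, core, reward, next_state):
    """One Q-learning backup after observing the thermal effect of the mapping."""
    best_next = max(q_table[(next_state, c)] for c in range(NUM_CORES))
    q_table[(state, core)] += ALPHA * (
        reward + GAMMA * best_next - q_table[(state, core)]
    )

# Toy episode with a simulated thermal response: assigning a task heats the
# chosen core; idle cores cool slightly toward a 45-degree ambient floor.
temps = [50.0] * NUM_CORES
for _ in range(200):
    state = discretize(temps)
    core = choose_core(state)
    temps = [t + (8.0 if c == core else -1.0) for c, t in enumerate(temps)]
    temps = [max(t, 45.0) for t in temps]
    reward = -max(temps)  # penalize the hottest core after the assignment
    update(state, core, reward, discretize(temps))
```

    Under this reward, the agent gradually learns to spread tasks away from hot cores, since concentrating work on one core drives up the peak temperature and thus the penalty. A richer state (e.g. core utilization and frequency alongside temperature) is what a "more comprehensive state modeling method" refers to in practice.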

Get Citation

杨世贵, 王媛媛, 刘韦辰, 姜徐, 赵明雄, 方卉, 杨宇, 刘迪. Temperature-aware task scheduling on multicores based on reinforcement learning. Journal of Software (软件学报), 2021, 32(8): 2408-2424.

History
  • Received: July 24, 2020
  • Revised: September 7, 2020
  • Online: February 7, 2021
  • Published: August 6, 2021