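Energy Consumption Optimization Data Placement Algorithm for MapReduce System
(Original Chinese title: 一种优化MapReduce系统能耗的数据布局算法)

Fund Project:

National Natural Science Foundation of China (61202088, 61433008); China Postdoctoral Science Foundation (2013M540232); Specialized Research Fund for the Doctoral Program of Higher Education of China (20120042110028); Fundamental Research Funds for the Central Universities, Seed Fund (N130417001)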
    Abstract:

    Driven by cloud computing and big data technologies, the scale of IT resources keeps growing, and their energy consumption has become an increasingly pressing problem. Studies show that low resource utilization and idle nodes waste energy, and that this waste is one of the main problems of today's large-scale distributed systems. This paper studies energy consumption optimization for MapReduce systems. Traditional software-based optimization approaches mostly rely on workload concentration, live task migration, or dynamically powering nodes on and off. In a MapReduce cluster, however, a node not only executes tasks but also stores data, so it cannot simply be shut down to save energy once its tasks have been migrated, and the traditional approaches are hard to apply. This paper proposes that a good data placement can optimize the energy consumption of a MapReduce cluster. Based on this idea, the energy consumption optimization target of data placement is first defined and a corresponding data placement algorithm is proposed. The algorithm is then proved in theory to achieve that target. Finally, three MapReduce systems with different data placements are deployed on a heterogeneous cluster, and their cluster energy consumption is compared under three typical kinds of jobs: CPU-intensive, I/O-intensive, and interactive. Both the theoretical and the experimental results show that the proposed data placement algorithm effectively reduces the energy consumption of a MapReduce cluster. This work is expected to facilitate applications of energy-intensive computing and big data analysis.
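
The abstract does not spell out the placement algorithm or its energy model, so the following is only a toy illustration of the underlying intuition, not the authors' method. It assumes a hypothetical model in which each node processes only its locally stored blocks, the job ends when the slowest node finishes, and nodes that finish early keep drawing idle power until then. The `Node` fields, the two placement functions, and all numbers are made up for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Node:
    rate: float        # blocks processed per second (assumed value)
    busy_watts: float  # power draw while processing (assumed value)
    idle_watts: float  # power draw while waiting (assumed value)


def cluster_energy(nodes, blocks):
    """Energy in joules when node i processes blocks[i] local blocks.

    Toy model: the job ends when the slowest node finishes, and every
    faster node idles (still consuming idle power) until that moment.
    """
    finish = [b / n.rate for n, b in zip(nodes, blocks)]
    makespan = max(finish)
    return sum(
        n.busy_watts * t + n.idle_watts * (makespan - t)
        for n, t in zip(nodes, finish)
    )


def uniform_placement(nodes, total_blocks):
    """Give every node the same number of blocks, ignoring heterogeneity."""
    return [total_blocks / len(nodes)] * len(nodes)


def proportional_placement(nodes, total_blocks):
    """Give each node blocks in proportion to its processing rate."""
    total_rate = sum(n.rate for n in nodes)
    return [total_blocks * n.rate / total_rate for n in nodes]


# A small heterogeneous cluster with hypothetical figures.
nodes = [
    Node(rate=4.0, busy_watts=200.0, idle_watts=120.0),
    Node(rate=2.0, busy_watts=150.0, idle_watts=90.0),
    Node(rate=1.0, busy_watts=100.0, idle_watts=60.0),
]
total = 840

e_uniform = cluster_energy(nodes, uniform_placement(nodes, total))
e_prop = cluster_energy(nodes, proportional_placement(nodes, total))
print(f"uniform placement:      {e_uniform:.0f} J")
print(f"proportional placement: {e_prop:.0f} J")
```

Under these assumptions the rate-proportional layout lets all nodes finish together, so no node spends time idling, which is the kind of waste the abstract attributes to poorly utilized nodes. The real algorithm and its optimization target are defined in the full paper and may differ substantially from this sketch.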

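Cite this article

Song Jie (宋杰), Wang Zhi (王智), Li Tiantian (李甜甜), Yu Ge (于戈). Energy Consumption Optimization Data Placement Algorithm for MapReduce System. Journal of Software (软件学报), 2015, 26(8): 2091-2110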

History
  • Received: 2014-06-11
  • Last revised: 2014-12-09
  • Published online: 2015-08-05