集群上一种面向空间连接聚集的并行计算模型
DOI:
CSTR:
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

基金项目:

国家自然科学基金(61070035, 41271403);国家高技术研究发展计划(863)(2011AA120306, 2007AA120402);教育部高等学校博士学科点专项科研基金(20104307110017)


Parallel Computing Model for Spatial Join Aggregate on Cluster
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    单机运行环境难以满足海量空间数据的连接聚集操作对时空开销的需求,集群上的并行计算是高效处理海量空间数据的连接聚集操作的关键. Map-Reduce是云计算中一种应用于大规模集群进行大规模数据处理的分布式并行编程模型,分析发现,Map-Reduce并不直接支持以既高效又自然的方式来处理具有二次归约特征的并行空间连接聚集操作.因此,提出了一种并行计算模型——Map-Reduce-Combine(MRC)来有效地处理大规模空间数据的连接聚集操作.MRC在Map-Reduce 模型上增加一个Combine阶段,有效地合并分散在各个Reducer的部分聚集结果.针对并行任务划分中空间对象的单分配问题,提出了过滤优化算法,提高了MRC下处理空间连接聚集查询的效率.实验验证所提出的并行计算模型在处理空间连接聚集查询时具有良好的效率、有效性、可扩展性和简单性.

    Abstract:

    Since processing large-scale spatial join aggregate (SJA) is usually difficult to be implemented on a single machine, parallel computing on cluster has been the key to process large-scale SJA operation efficiently. Map-Reduce has been the mainstream parallel computing technique for massive data on cluster. However, Map-Reduce does not directly support processing parallel SJA with both high efficiency and straightforward way, for it needs to perform a second reduce operation. This paper proposes a novel parallel computing model, Map-Reduce-Combine (MRC), which is able to process large-scale SJA efficiently with a simple way on cluster. MRC adds to Map-Reduce a Combine phase that can efficiently combine partial aggregate results distributed among different Reducers, which is caused by the multiple assignment of spatial object. For the spatial object assigned only once, a filter optimization method has been proposed to pick up the result of single assignment object obtained in Reduce phase and further enhance the performance of processing SJA. Extensive experiments in large real spatial data have demonstrated the efficiency, effectiveness, scalability and simplicity of the proposed parallel computing model for processing SJA on massive spatial data.

    参考文献
    相似文献
    引证文献
引用本文

刘义,景宁,陈荦,熊伟.集群上一种面向空间连接聚集的并行计算模型.软件学报,2013,24(S2):99-109

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2012-08-05
  • 最后修改日期:2013-07-22
  • 录用日期:
  • 在线发布日期: 2014-01-02
  • 出版日期:
文章二维码
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号