Algorithm for Processing k-Nearest Join Based on R-Tree in MapReduce

doi:10.3724/SP.J.1001.2013.04377

微信服务号

微信订阅号

2025-5-13- 4

Home > Archive>Volume 24, Issue 8, 2013 >1836-1851. DOI:10.3724/SP.J.1001.2013.04377

PDF HTML XML Export Cite reminder

Algorithm for Processing k-Nearest Join Based on R-Tree in MapReduce
DOI:
                        10.3724/SP.J.1001.2013.04377
                    
Author:
                        LIU YiLIU Yi
College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China
Find this author on CNKI
Find this author on BaiDu
Search for this author on this site
JING NingJING Ning
College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China
Find this author on CNKI
Find this author on BaiDu
Search for this author on this site
CHEN LuoCHEN Luo
College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China
Find this author on CNKI
Find this author on BaiDu
Search for this author on this site
XIONG WeiXIONG Wei
College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China
Find this author on CNKI
Find this author on BaiDu
Search for this author on this site

                    
Affiliation:
Clc Number:
Fund Project:

Article

Figures

Metrics

Reference

Cited by

Materials

Comments

Abstract:

To accelerate the k-nearest neighbor join (knnJ) query for large scale spatial data, the study presents a knnJ based on R-tree in MapReduce. First, the research uses the formalization of independent parallelism and sequential synchronization (IPSS) computation to abstract MapReduce parallel program model. Next, based on this parallel model abstraction, this paper proposes efficient algorithms for bulk building R-tree and performing knnJ query based on the constructed R-tree respectively. In the process of bulk building R-tree, a sampling algorithm is provided to determine the spatial partition function rapidly, which make the process of building R-tree conform to IPSS model and can be expressed easily in MapReduce. In the process of knnJ query, the knn expanded bounding box is introduced to limit the knn query range and partition data, and then the generated R-tree is used to execute knnJ query in parallel fashion, achieving high performance. This paper analyzes the communication and computation cost in theory. Experimental results and analysis in large real spatial data demonstrate that the algorithm can efficiently resolve the large scale knnJ spatial query in MapReduce environment, and has a good practical application.

Key words:cloud computing;MapReduce;k-nearest neighbor join;spatial query;R-tree

Get Citation

刘义,景宁,陈荦,熊伟. MapReduce框架下基于R-树的k-近邻连接算法.软件学报,2013,24(8):1836-1851

Copy

Article Metrics

Abstract:
PDF:
HTML:
Cited by:

History

Received:November 12,2012
Revised:January 25,2013
Adopted:
Online: July 26,2013
Published:

You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address：4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code：100190
Phone：010-62562563 Fax：010-62562533 Email：jos@iscas.ac.cn
Technical Support：Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063

微信服务号

微信订阅号

Get Citation

Share

微信扫一扫：分享

Article Metrics

History