Efficient Distributed Query Processing on Large Scale RDF Graph Data

doi:10.13328/j.cnki.jos.005696


Journal of Software, Volume 30, Issue 3, 2019, pp. 498-514


Author:
WANG Xin
College of Intelligence and Computing, Tianjin University, Tianjin 300354, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300354, China
XU Qiang
College of Intelligence and Computing, Tianjin University, Tianjin 300354, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300354, China
CHAI Le-Le
College of Intelligence and Computing, Tianjin University, Tianjin 300354, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300354, China
YANG Ya-Jun
College of Intelligence and Computing, Tianjin University, Tianjin 300354, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300354, China; State Key Laboratory of Digital Publishing Technology, Beijing 100871, China
CHAI Yun-Peng
School of Information, Renmin University of China, Beijing 100872, China
Fund Project: National Natural Science Foundation of China (61572353, 61402323, 61472427); Natural Science Foundation of Tianjin (17JCYBJC15400); Opening Project of State Key Laboratory of Digital Publishing Technology; Natural Science Foundation of Beijing (4172031)


Abstract:

Knowledge graphs are the main representation form of intelligent data. With the development of knowledge graphs, a growing amount of intelligent data has been released in the resource description framework (RDF) format. The semantics of SPARQL correspond to graph homomorphism, which is an NP-complete problem. Efficiently answering SPARQL queries in parallel over big RDF graphs is therefore widely recognized as a challenging problem. Some existing works use the MapReduce computational model to process big RDF graphs. However, these works decompose SPARQL queries into sets of query clauses without considering the semantics and graph structure embedded in RDF graphs, which leads to an excessive number of MapReduce iterations. This study first decomposes the SPARQL query graph into a set of stars, using the semantic and structural information embedded in RDF graphs as heuristics, so that the query can be matched in fewer MapReduce iterations. A matching order for these stars is also given to reduce the intermediate results produced across MapReduce iterations. During the matching phase, each round of MapReduce adds one star through a join operation. Extensive experiments are carried out on both the synthetic dataset WatDiv and the real-world dataset DBpedia. The experimental results demonstrate that the proposed star decomposition-based method answers SPARQL BGP queries efficiently and outperforms SHARD and S2X by one order of magnitude. Finally, the experiments show that the optimization algorithms improve on the basic algorithm by 49.63% and 78.71% over both the synthetic and real datasets.

Key words: star decomposition; distributed; basic graph pattern matching; large scale RDF graphs; MapReduce
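
The star-based matching idea in the abstract can be illustrated with a small single-machine sketch. It is not the authors' algorithm: the greedy highest-degree rule in `star_decompose`, the in-memory `match_star` and `join` helpers, and the toy data are all assumptions made for illustration, and no MapReduce framework is involved. Each loop iteration in `answer` stands in for one MapReduce round that joins one more star into the partial results.

```python
from collections import defaultdict


def star_decompose(patterns):
    """Greedily split triple patterns (s, p, o) into stars.

    Returns a list of (center, patterns touching the center). The
    highest-degree rule is an assumption, not the paper's heuristic.
    """
    remaining = list(patterns)
    stars = []
    while remaining:
        degree = defaultdict(int)
        for s, _, o in remaining:
            degree[s] += 1
            if o != s:
                degree[o] += 1
        # Vertex covering the most uncovered patterns; ties break by name.
        center = max(sorted(degree), key=lambda v: degree[v])
        stars.append((center, [t for t in remaining if center in (t[0], t[2])]))
        remaining = [t for t in remaining if center not in (t[0], t[2])]
    return stars


def unify(pattern, triple, binding):
    """Extend a binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):  # variable: bind it or check consistency
            if new.setdefault(term, value) != value:
                return None
        elif term != value:  # constant: must match exactly
            return None
    return new


def match_star(star, triples):
    """All variable bindings that satisfy every pattern of one star."""
    bindings = [{}]
    for pattern in star:
        candidates = (unify(pattern, t, b) for b in bindings for t in triples)
        bindings = [b for b in candidates if b is not None]
    return bindings


def join(left, right):
    """Natural join of two binding lists on their shared variables."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[k] == r[k] for k in l.keys() & r.keys())
    ]


def answer(patterns, triples):
    """Match stars one at a time, joining each into the partial results."""
    results = [{}]
    for _, star in star_decompose(patterns):  # one star per round
        results = join(results, match_star(star, triples))
    return results


# Hypothetical toy query and data, not taken from WatDiv or DBpedia.
query = [("?x", "knows", "?y"), ("?x", "worksAt", "?c"), ("?y", "livesIn", "?city")]
data = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "livesIn", "paris"),
    ("carol", "knows", "dave"),
    ("carol", "worksAt", "initech"),
]
print(star_decompose(query))
print(answer(query, data))
```

In this toy query the three triple patterns collapse into two stars, so two join rounds suffice where a clause-by-clause decomposition would need three. Bindings follow homomorphism semantics, so distinct variables may map to the same value.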

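Citation:

WANG Xin, XU Qiang, CHAI Le-Le, YANG Ya-Jun, CHAI Yun-Peng. Efficient distributed query processing on large scale RDF graph data. Journal of Software, 2019, 30(3): 498-514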


History

Received: July 20, 2018
Revised: September 20, 2018
Online: March 06, 2019

