KGDB: Knowledge Graph Database System with Unified Model and Query Language

doi:10.13328/j.cnki.jos.006181


Journal of Software, Volume 32, Issue 3, 2021, pp. 781-804. DOI: 10.13328/j.cnki.jos.006181

Author: LIU Bao-Zhu, WANG Xin, LIU Peng-Kai, LI Si-Zhuo, ZHANG Xiao-Wang, YANG Ya-Jun

Affiliation: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China

Fund Project: National Key Research and Development Program (2019YFE0198600); National Natural Science Foundation of China (61972275); CCF-Huawei Database Innovation Research Plan (CCF-Huawei DBIR2019004B)


Abstract:

Knowledge graphs are an important cornerstone of artificial intelligence. They currently have two main data models, the RDF graph and the property graph, each with its own query languages: SPARQL for RDF graphs and, mainly, Cypher for property graphs. Over the last decade, different communities have developed separate data management methods for RDF graphs and property graphs, and the resulting inconsistency in data models and query languages hinders the wider application of knowledge graphs. KGDB is a knowledge graph database system with a unified data model and query language. (1) Based on the relational model, it proposes a unified storage scheme that supports efficient storage of both RDF graphs and property graphs and meets the requirements of knowledge graph data storage and query workloads. (2) Using a clustering method based on characteristic sets, KGDB handles the storage of untyped triples. (3) It makes SPARQL and Cypher, two different knowledge graph query languages, interoperable, so that both can operate on the same knowledge graph. Extensive experiments were carried out on real-world and synthetic datasets. The results show that, compared with existing knowledge graph database management systems, KGDB provides both more efficient storage management and higher query efficiency. KGDB saves 30% of storage space on average compared with gStore and Neo4j. On basic graph pattern matching queries over the real-world dataset, the query efficiency of KGDB is generally higher than that of gStore and Neo4j, by up to two orders of magnitude.
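
As a rough illustration of point (2), the sketch below groups subjects by their characteristic set, meaning the set of predicates that appear on a subject's outgoing triples. The abstract does not describe how KGDB itself performs this clustering or maps clusters to storage, so the function names, the toy triples, and the exact-match grouping rule are all assumptions made for illustration only.

```python
from collections import defaultdict


def characteristic_sets(triples):
    """Map each subject to the frozenset of predicates on its outgoing triples."""
    predicates = defaultdict(set)
    for subject, predicate, _obj in triples:
        predicates[subject].add(predicate)
    return {s: frozenset(ps) for s, ps in predicates.items()}


def cluster_by_characteristic_set(triples):
    """Group subjects that share exactly the same characteristic set.

    Assumption: exact set equality is used as the grouping rule; a real
    system may merge similar sets instead.
    """
    clusters = defaultdict(list)
    for subject, cs in characteristic_sets(triples).items():
        clusters[cs].append(subject)
    return {cs: sorted(members) for cs, members in clusters.items()}


# Hypothetical untyped triples (no rdf:type statements are present).
triples = [
    ("alice", "name", "Alice"),
    ("alice", "worksAt", "tju"),
    ("bob", "name", "Bob"),
    ("bob", "worksAt", "tju"),
    ("tju", "name", "Tianjin University"),
]

clusters = cluster_by_characteristic_set(triples)
for cs, members in clusters.items():
    print(sorted(cs), members)
```

In this toy input, subjects with the same predicates fall into the same group, which could then stand in for a missing type when deciding where untyped entities are stored.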

Key words: knowledge graph; SPARQL; Cypher; RDF graph; property graph


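Get Citation

LIU Bao-Zhu, WANG Xin, LIU Peng-Kai, LI Si-Zhuo, ZHANG Xiao-Wang, YANG Ya-Jun. KGDB: Knowledge Graph Database System with Unified Model and Query Language. Ruan Jian Xue Bao/Journal of Software, 2021,32(3):781-804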


History

Received: July 20, 2020
Revised: September 03, 2020
Online: January 21, 2021
Published: March 06, 2021
