[关键词]
[摘要]
知识图谱是一种基于图的结构化知识表示方式.如何构造大规模高质量的知识图谱,是研究和实践面临的一个重要问题.提出了一种基于互联网群体智能的协同式知识图谱构造方法.该方法的核心是一个持续运行的回路,其中包含自由探索、自动融合、主动反馈3个活动.在自由探索活动中,每一参与者独立进行知识图谱的构造活动.在自动融合活动中,所有参与者的个体知识图谱被实时融合在一起,形成群体知识图谱.在主动反馈活动中,支撑环境根据每一参与者的个体知识图谱和当前时刻的群体知识图谱,向该参与者推荐特定的知识图谱片段信息,以提高其构造知识图谱的效率.针对这3个活动,建立了一种层次式的个体知识图谱表示机制,提出了一种以最小化广义熵为目标的个体知识图谱融合算法,设计了情境无关和情境相关两种类型的信息反馈方式.为了验证所提方法及关键技术的可行性,设计并实施了3种类型的实验:仅包含结构信息的仿真图融合实验、大规模真实知识图谱的融合实验,以及真实知识图谱的协同式构造实验.实验结果表明,该知识图谱融合算法能够有效利用知识图谱的结构信息以及节点的语义信息,形成高质量的知识图谱融合方案;基于“探索-融合-反馈”回路的协同方法能够提升群体构造知识图谱的规模和个体构造知识图谱的效率,并展现出较好的群体规模可扩展性.
[Key word]
[Abstract]
Knowledge graph is a graph-based structural representation of knowledge. One of the key problems about knowledge graph in both research and practice is how to construct large-scale high-quality knowledge graphs. This paper presents an approach to construct knowledge graphs based on Internet-based human collective intelligence. The core of this approach is a continuously executing loop, called the EIF loop or EIFL, consisting of three activities: free exploration, automatic integration, and proactive feedback. In free exploration activity, each participant tries to construct an individual knowledge graph alone. In automatic integration activity, all participants’ current individual knowledge graphs are integrated in real-time into a collective knowledge graph. In proactive feedback activity, each participant is provided with personalized feedback information from the current collective knowledge graph, in order to improve the participant’s efficiency of constructing an individual knowledge graph. In particular, a hierarchical knowledge graph representation mechanism is proposed, a knowledge graph merging algorithm is designed driven by the goal of minimizing the collective knowledge graph’s general entropy, and two ways for context-dependent and context-independent information feedback are introduced, repectively. In order to investigate the feasibility of the proposed approach, three kinds of experiment are designed and carried out: (1) the merging experiment on simulated graphs with structural information only; (2) the merging experiment on real large-scaled knowledge graphs; (3) the construction experiment of knowledge graphs with different number of participants. The experimental results show that: (1) the proposed knowledge graph merging algorithm can find high-quality merging solutions of knowledge graphs by utilizing both structural information of knowledge graphs and semantic information of elements in knowledge graphs; (2) EIFL-based collective collaboration improves both the efficiency of participants in constructing individual knowledge graphs and the scale of the collective knowledge graph merged from individual knowledge graphs, and shows sound scalability with respect to the number of participants in knowledge graph construction.
[中图分类号]
[基金项目]
科技创新2030——“新一代人工智能”重大项目(2020AAA0109402);国家自然科学基金(61690200)