In recent years, researchers have proposed a variety of methods for answering complex questions in knowledge base question answering (KBQA), with some success. However, because the semantic composition of such questions is complex and the required inference paths may be missing from the knowledge graph (KG), performance on complex questions remains poor. To address this, this study proposes CGL-KBQA, a question answering method based on the global and local features of knowledge graphs. The method uses knowledge embedding to extract the overall topological structure and semantic features of the KG as the global features of candidate entity nodes, and models complex question answering as a composite triple classification task over the entity and question representations. In addition, the core inference paths generated during graph search serve as local features, which are combined with their semantic similarity to the question to construct features of the candidate entities along different dimensions; together these form a hybrid feature scorer. Because the final inference path may be missing, the study also designs a clustering module based on unsupervised multi-clustering methods, which generates the final answer cluster directly from the two types of feature representations of the candidate entities and thereby makes question answering over incomplete KGs possible. Experimental results show that the proposed method performs well on two common KBQA datasets, particularly when the KG is incomplete.
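To make the pipeline described above more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a sigmoid over a question-entity dot product can stand in for the embedding-based triple classifier (global feature), that cosine similarity between a question vector and a path vector can stand in for the local path feature, and that a plain two-cluster k-means can stand in for the unsupervised multi-clustering module. All function names, vectors, and dimensions are hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def hybrid_features(question_vec, entity_vecs, path_vecs):
    """Build one [global_score, local_score] row per candidate entity.

    global_score: sigmoid of the question-entity dot product
                  (assumed stand-in for a triple classification score).
    local_score:  cosine similarity between the question and the candidate's
                  inference path, or 0.0 when the path is missing (None).
    """
    rows = []
    for entity, path in zip(entity_vecs, path_vecs):
        global_score = 1.0 / (1.0 + np.exp(-(question_vec @ entity)))
        local_score = cosine(question_vec, path) if path is not None else 0.0
        rows.append([global_score, local_score])
    return np.array(rows)

def answer_cluster(feats, iters=20):
    """Split candidates into two clusters and return the indices of the
    cluster whose center has the larger feature sum (assumed answer cluster)."""
    sums = feats.sum(axis=1)
    # Deterministic initialisation: weakest and strongest candidates.
    centers = np.stack([feats[sums.argmin()], feats[sums.argmax()]])
    labels = np.zeros(len(feats), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.stack([
            feats[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    best = centers.sum(axis=1).argmax()
    return np.flatnonzero(labels == best)

# Toy data (hypothetical): four candidates, the second has no inference path.
question_vec = np.array([1.0, 0.0])
entity_vecs = np.array([[2.0, 0.0], [2.1, 0.0], [-2.0, 0.0], [-1.5, 0.0]])
path_vecs = [np.array([1.0, 0.0]), None, np.array([0.0, 1.0]), np.array([-1.0, 0.0])]

feats = hybrid_features(question_vec, entity_vecs, path_vecs)
answers = answer_cluster(feats)
print(answers)
```

In this toy setup the second candidate has no inference path, so its local score is zero, yet its global score is high enough that it lands in the same cluster as the first candidate. This mirrors, in a very simplified way, the idea that clustering over both feature types can still recover answers when the KG is incomplete.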