Abstract: Knowledge graph based question answering (KGQA) analyzes natural language questions, reasons over knowledge graphs, and returns accurate answers. It is widely used in intelligent information services such as modern search engines and personalized recommendation. Because relation-supervised learning methods require costly manual labeling of reasoning steps as supervision, researchers have begun to explore weakly supervised learning methods, such as reinforcement learning, for designing KGQA models. However, for complex questions with constraints, existing reinforcement learning based KGQA methods face two major challenges: (1) multi-hop reasoning over long paths leads to sparse and delayed rewards; (2) existing methods cannot handle reasoning path branches that involve constraint information. To address these challenges in constrained question answering tasks, a reward shaping strategy that incorporates constraint information is designed to address the sparse and delayed reward problem. In addition, a reinforcement learning based constrained path reasoning model named COPAR is proposed. COPAR consists of an action determination strategy based on an attention mechanism and an entity determination strategy based on constraint information. It selects the correct relations and entities according to the constraint information in the question, which reduces the reasoning search space and resolves the reasoning path branching problem. Moreover, an ambiguity constraint processing strategy is proposed to resolve ambiguity in reasoning paths. The performance of COPAR is evaluated and compared with existing methods on benchmark KGQA datasets.
The experimental results indicate that, compared with existing methods, COPAR achieves a relative improvement of 2%-7% on multi-hop question datasets; on constrained question datasets it outperforms the competing models, improving accuracy by at least 7.8%.