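National Key Research and Development Program of China (2020AAA0106400); National Natural Science Foundation of China (61922085, 61976211, U1936209, 62002353); China Postdoctoral Science Foundation (2021M701726); Key Research Program of the Chinese Academy of Sciences (ZDBS-SSW-JSC006)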
Multiple-choice reading comprehension typically adopts a two-stage pipeline of evidence extraction followed by answer prediction, so the quality of answer prediction depends heavily on the quality of evidence sentence extraction. Traditional evidence extraction methods mostly rely on phrase matching or on supervision with noisy labels. Their unsatisfactory accuracy significantly degrades the performance of answer prediction. To address this problem, this study proposes a multiple-choice reading comprehension method based on multi-view graph encoding within a joint learning framework. The method explores, from multiple views, the correlations among the sentences of a document and between those sentences and the question, so that evidence sentences and their relationships are modeled effectively. In addition, evidence extraction and answer prediction are trained jointly, which exploits the strong correlation between evidence and answers to improve the performance of both tasks. Specifically, a multi-view graph encoding module first jointly encodes the document, the question, and the candidate answers, capturing the relationships among them from three views (statistical characteristics, relative distance, and deep semantics) to obtain document encoding features that are aware of the question-answer pair. A joint learning module for evidence extraction and answer prediction then strengthens the relationship between evidence and answers through collaborative training: the evidence extraction submodule selects evidence sentences, the selection results are selectively fused with the document encoding features, and the answer prediction submodule uses the fused features to predict the answer.
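The three views named above can be illustrated with a minimal sketch that builds one sentence-level adjacency matrix per view. The abstract does not specify how each view is computed, so every scoring function here is an assumption made for illustration: Jaccard token overlap stands in for the statistical view, an inverse sentence gap for the relative-distance view, and cosine similarity over externally supplied sentence vectors for the deep-semantic view. All function names are hypothetical.

```python
import math


def statistical_view(sentences):
    """Assumed statistical view: Jaccard overlap between token sets."""
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(tokens)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = tokens[i] | tokens[j]
            if union:
                adj[i][j] = len(tokens[i] & tokens[j]) / len(union)
    return adj


def distance_view(n):
    """Assumed relative-distance view: weight decays with the sentence gap."""
    return [[1.0 / (1 + abs(i - j)) for j in range(n)] for i in range(n)]


def semantic_view(vectors):
    """Assumed deep-semantic view: cosine similarity of sentence vectors."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
    return [[cosine(a, b) for b in vectors] for a in vectors]


def build_views(sentences, vectors):
    """Return one adjacency matrix per view, keyed by view name."""
    return {
        "statistical": statistical_view(sentences),
        "distance": distance_view(len(sentences)),
        "semantic": semantic_view(vectors),
    }
```

In a full model, the question and candidate answers would also appear as graph nodes, and each adjacency matrix would drive a graph encoder; this sketch covers only how per-view edge weights might be derived.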
Experimental results on the multiple-choice reading comprehension datasets ReCO and RACE show that the proposed method improves the selection of evidence sentences from documents, which in turn raises the accuracy of answer prediction. In addition, jointly learning evidence extraction and answer prediction substantially alleviates the error accumulation caused by the traditional pipeline framework.
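To make the joint learning idea concrete, the following minimal sketch shows one way selective fusion and a joint objective could be written. It is not the loss used in the paper: the sigmoid gating, the weighted sum of the two losses, the weight `lam`, and all function names are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gate_features(sentence_features, evidence_logits):
    """Assumed selective fusion: scale each sentence's feature vector
    by the probability that the sentence is evidence."""
    return [
        [sigmoid(z) * x for x in feats]
        for z, feats in zip(evidence_logits, sentence_features)
    ]


def evidence_loss(evidence_logits, evidence_labels):
    """Mean binary cross-entropy on logits, in a numerically stable form."""
    total = 0.0
    for z, y in zip(evidence_logits, evidence_labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(evidence_logits)


def answer_loss(answer_logits, answer_index):
    """Cross-entropy over candidate answers via a stable log-sum-exp."""
    m = max(answer_logits)
    log_norm = m + math.log(sum(math.exp(a - m) for a in answer_logits))
    return log_norm - answer_logits[answer_index]


def joint_loss(evidence_logits, evidence_labels,
               answer_logits, answer_index, lam=0.5):
    """Assumed joint objective: answer loss plus weighted evidence loss."""
    return (answer_loss(answer_logits, answer_index)
            + lam * evidence_loss(evidence_logits, evidence_labels))
```

Because both terms are optimized together, errors in the answer prediction can shape the evidence scores instead of being inherited from a frozen upstream extractor, which is the intuition behind reduced error accumulation.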