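Survey of Machine Reading Comprehension Based on Neural Network

About the authors:

Gu Yingjie (顾迎捷, 1992-), male, bachelor's degree; research interests: machine reading comprehension and natural language processing. Shen Yi (沈毅, 1994-), male, bachelor's degree; research interests: machine reading comprehension and natural language processing. Gui Xiaolin (桂小林, 1966-), male, Ph.D., professor, doctoral supervisor, CCF senior member; research interests: Internet of Things, cloud computing, big data analysis and privacy protection, and information security. Liao Dong (廖东, 1995-), male, bachelor's degree; research interests: big data analysis and edge computing. Li Defu (李德福, 1996-), male, bachelor's degree; research interests: machine reading comprehension and natural language processing.

Corresponding author:

Gui Xiaolin, E-mail: xlgui@mail.xjtu.edu.cn

Fund Project:

National Key Research and Development Program of China (2018YFB1800304); Key Development Program in Shaanxi Province of China (2019GY-005, 2017ZDXM-GY-011, 2020GY-033)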



Abstract:

The goal of machine reading comprehension is to enable a machine to understand natural language text and correctly answer questions about that text. Because datasets were small, most early machine reading comprehension methods relied on hand-crafted features and traditional machine learning. In recent years, with the development of knowledge bases and crowdsourcing, researchers have released high-quality large-scale datasets, which have created new opportunities for neural network models and for machine reading comprehension. This survey gives a detailed review of recent research on machine reading comprehension based on neural networks. First, it gives an overview of machine reading comprehension, covering its development, problem formulation, and evaluation metrics. Then, it reviews the techniques used in the currently most popular neural reading comprehension architecture, which consists of an embedding layer, an encoder layer, an interaction layer, and an output layer, and it discusses the recent BERT pre-trained model and its advantages. After that, it summarizes recent progress on machine reading comprehension datasets and neural reading comprehension models, and compares and analyzes the most representative datasets and neural network models in detail. Finally, it presents the challenges facing machine reading comprehension research and directions for future work.

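Cite this article

Gu Yingjie, Gui Xiaolin, Li Defu, Shen Yi, Liao Dong. Survey of Machine Reading Comprehension Based on Neural Network. Journal of Software (软件学报), 2020, 31(7): 2095-2126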

History
  • Received: 2019-04-26
  • Last revised: 2019-06-29
  • Published online: 2020-04-21