Macro Discourse Structure Representation Schema and Corpus Construction
Author:
Affiliation:

CLC Number:

TP18

Fund Project:

National Natural Science Foundation of China (61773276, 61673290, 61836007)

    Abstract:

    Discourse structure analysis is an important research topic in natural language processing. It not only helps in understanding the structure and semantics of a discourse, but also provides strong support for deep applications of natural language processing, such as automatic summarization, information extraction, and question answering. At present, discourse structure analysis is concentrated mainly on the micro level, that is, on the relations and structures between sentences or sentence groups, while the macro level has received relatively little attention. This study therefore takes discourse structure as its research object and focuses on constructing a representation schema and corpus resources at the macro level. It discusses the importance of discourse structure analysis, reviews the state of research from three aspects (theoretical systems, corpus resources, and computational models), and proposes a unified macro-micro discourse structure representation framework with the primary-secondary relation as the carrier. It then constructs a logical semantic structure and a functional pragmatic structure for the macro discourse level. On this basis, the study annotates a macro Chinese discourse structure corpus consisting of 720 newswire articles and analyzes the annotation results in terms of consistency and statistical data.

Get Citation

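Chu Xiaomin, Xi Xuefeng, Jiang Feng, Xu Sheng, Zhu Qiaoming, Zhou Guodong. Macro Discourse Structure Representation Schema and Corpus Construction. Journal of Software (软件学报), 2020, 31(2): 321-343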

History
  • Received: January 9, 2018
  • Revised: April 19, 2019
  • Adopted:
  • Online: August 12, 2019
  • Published: February 6, 2020