Web-Based Dynamic Multi-Document Summarization System Framework
Funding:

National Natural Science Foundation of China (60736014, 60773069, 61073130); National Forestry Industry Special Project (201204715)
    Abstract:

    Building on techniques from natural language processing and computational linguistics, this paper studies a framework for a Web-based dynamic multi-document summarization system. It describes the framework in detail and presents a summary generation system with three components: dynamic evolution modeling with a matrix subspace method, information filtering with a similarity and centroid-based overall optimal selection method, and sentence weighting with a dynamic manifold-ranking method. The three novel models are fused according to the stages of multi-document summary generation; each addresses a different aspect of the task, so they complement one another and improve system performance. In a network environment, the framework ensures that the dynamically evolving multi-document summary offers high information novelty while tracking the evolution of historical information.
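    As a rough illustration of two of the stages named in the abstract, the sketch below filters new sentences against a centroid of previously seen text and then weights the survivors with the standard manifold-ranking iteration f ← αSf + (1 − α)y. It is not the authors' implementation: the matrix subspace evolution model is omitted, the ranking is the static form rather than the paper's dynamic variant, and the whitespace tokenization, the `threshold` and `alpha` values, the function names, and the sample sentences are all assumptions made for the example.

```python
import math
from collections import Counter

def tf_vector(sentence):
    """Bag-of-words term-frequency vector (whitespace tokenization is an assumption)."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def filter_novel(sentences, history, threshold=0.5):
    """Keep sentences whose similarity to the history centroid is below threshold."""
    centroid = Counter()
    for h in history:
        centroid.update(tf_vector(h))
    return [s for s in sentences if cosine(tf_vector(s), centroid) < threshold]

def manifold_rank(sentences, topic, alpha=0.6, iterations=100):
    """Score sentences by manifold ranking seeded from a topic description."""
    nodes = [topic] + list(sentences)
    vecs = [tf_vector(node) for node in nodes]
    n = len(nodes)
    # Affinity matrix W with a zero diagonal.
    w = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    deg = [sum(row) for row in w]
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
          for j in range(n)] for i in range(n)]
    y = [1.0] + [0.0] * (n - 1)  # only the topic node carries prior weight
    f = y[:]
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return dict(zip(sentences, f[1:]))

# Hypothetical data: one sentence already summarized, three newly arrived.
history = ["the flood hit the city on monday"]
incoming = [
    "the flood hit the city on monday morning",
    "rescue teams evacuated two thousand residents",
    "officials said rescue teams need more boats",
]
novel = filter_novel(incoming, history)
scores = manifold_rank(novel, topic="flood rescue teams")
ranked = sorted(novel, key=scores.get, reverse=True)
```

    Here the first incoming sentence nearly repeats the history and is filtered out, while the remaining two are ranked by how strongly the topic's score propagates to them through the similarity graph.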

Cite this article

Liu ML, Ren HE, Yu Y, Zheng DQ, Zhao TJ. Web-based dynamic multi-document summarization system framework. Journal of Software, 2013,24(5):1006-1021

History
  • Received: 2011-12-12
  • Last revised: 2012-04-17
  • Published online: 2013-05-07