Research on Paraphrasing Technology

Journal of Software, 2009, Vol. 20, No. 8, pp. 2124-2137

Authors: ZHAO Shi-Qi, LIU Ting, LI Sheng
Affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China

                    
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60803093 and 60675034, and by the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2008AA01Z144.

Abstract:

This paper surveys the state of the art in paraphrasing research within natural language processing. It covers the applications of paraphrasing, the acquisition of paraphrase resources, the generation of paraphrase sentences, the evaluation of paraphrases, and closely related research topics. The emphasis is on summarizing, comparing, and analyzing the mainstream methods and the latest advances, in the hope of informing future research.

Key words: paraphrasing; paraphrase acquisition; paraphrase generation; evaluation


Cite this article

ZHAO Shi-Qi, LIU Ting, LI Sheng. Research on Paraphrasing Technology. Journal of Software, 2009, 20(8): 2124-2137.


History

Received: 2008-11-13
Last revised: 2009-01-15
