Utterance Grouping in Spoken Dialogues
Authors: Xu WQ, Xu B, Huang TY (徐为群, 徐波, 黄泰翼)
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60375018.

    Abstract:

    This paper studies interaction patterns in spontaneous spoken information-seeking dialogues and their automatic analysis. First, building on discourse-analysis work from the Birmingham School (which treats the exchange as the basic interaction unit) and on Systemic Functional Grammar (Halliday's speech functions), a principled classification scheme is proposed that models interaction patterns as utterance groups. A dialogue corpus is then annotated with this scheme and analyzed, and the main factors affecting utterance-group structure are identified. Based on these factors, an algorithm for analyzing utterance groups is built and evaluated on the corpus. The overall accuracy of the utterance-group analysis ranges from 55.4% to 84.2%, depending on which source supplies the extended sentence type and the utterance topic, and hence on how accurately each is recognized.
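    To make the notion of an utterance group concrete, the following is a minimal sketch, not the paper's algorithm. It assumes each utterance already carries a move label in the spirit of the Birmingham School exchange structure (Initiation, Response, Follow-up) and opens a new group at every initiating move. The `Utterance` class, the `I`/`R`/`F` labels, the `group_utterances` function, and the sample dialogue are all hypothetical names and data chosen for illustration; the paper's own scheme also draws on extended sentence type and utterance topic, which this sketch ignores.

```python
from dataclasses import dataclass

# Hypothetical move labels loosely following the Birmingham School exchange
# structure (Initiation, Response, Follow-up); not the paper's tag set.
INITIATE, RESPOND, FOLLOW_UP = "I", "R", "F"


@dataclass
class Utterance:
    speaker: str
    text: str
    move: str  # one of INITIATE, RESPOND, FOLLOW_UP


def group_utterances(utterances):
    """Split a dialogue into groups; each initiating move opens a new group."""
    groups = []
    for utt in utterances:
        if utt.move == INITIATE or not groups:
            groups.append([utt])    # start a new utterance group
        else:
            groups[-1].append(utt)  # attach to the currently open group
    return groups


# Invented example dialogue (assumption, not taken from the paper's corpus).
dialogue = [
    Utterance("user", "Is there a train to Shanghai tomorrow?", INITIATE),
    Utterance("agent", "Yes, at 9 a.m.", RESPOND),
    Utterance("user", "OK, thanks.", FOLLOW_UP),
    Utterance("user", "How much is a ticket?", INITIATE),
    Utterance("agent", "About 300 yuan.", RESPOND),
]

groups = group_utterances(dialogue)
print([[u.move for u in g] for g in groups])
```

    In this toy dialogue the five utterances fall into two groups, one per question. A realistic analyzer would have to infer the move labels rather than receive them, which is where the recognition quality of the inputs would start to matter.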

    References
    [1] Xu WQ, Xu B, Huang TY, Xia HR. Bridging the gap between dialogue management and dialogue models. In: Jokinen K, McRoy S, eds. Proc. of the 3rd SIGdial Workshop on Discourse and Dialogue. Philadelphia: Association for Computational Linguistics, 2002. 201-210.
    [2] Sinclair JM, Coulthard M. Towards an Analysis of Discourse: The English Used by Teachers and Pupils. Oxford University Press, 1975.
    [3] Sinclair JM, Coulthard M. Towards an analysis of discourse. In: Coulthard M, ed. Advances in Spoken Discourse Analysis. London, New York: Routledge, 1992. 1-34.
    [4] Coulthard M. Forensic discourse analysis. In: Coulthard M, ed. Advances in Spoken Discourse Analysis. London, New York: Routledge, 1992. 242-258.
    [5] Francis G, Hunston S. Analyzing everyday conversation. In: Coulthard M, ed. Advances in Spoken Discourse Analysis. London, New York: Routledge, 1992. 183-196.
    [6] Halliday MAK. An Introduction to Functional Grammar. 2nd ed. London: Edward Arnold, 1994.
    [7] Xu B, Huang TY, Zhang X, Huang C. A Chinese spoken dialogue database and its application. In: Proc. of the 2nd Int'l Workshop on East-Asia Language Resources and Evaluation. Taipei, 1999.
    [8] Xu WQ, Xu B, Huang TY. Utterance topic identification in spoken dialogues. Chinese Information Processing, 2005, 19(4):89-96 (in Chinese).
    [9] Carletta J, Isard A, Isard S, Kowtko JC, Doherty-Sneddon G, Anderson AH. The reliability of a dialogue structure coding scheme. Computational Linguistics, 1997, 23(1):13-31.
    [10] Hastie WH, Poesio M, Isard S. Automatically predicting dialogue structure using prosodic features. Speech Communication, 2002, 36(1):63-79.
    [11] Lochbaum KE. A collaborative planning model of intentional structure. Computational Linguistics, 1998, 24(4):525-572.
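    Chinese-language reference
    [8] 徐为群, 徐波, 黄泰翼. 口语对话中的语句主题分析. 中文信息学报, 2005, 19(4):89-96.

    Footnotes
    [1] This division has no direct bearing on the automatic utterance-group analysis below. However, the analysis uses XST and utterance topic (the latter also takes XST as input), and XST recognition relies on statistical methods (NBC and HMM; see Ref. [9]), for which the training, test, and development portions of the corpus must be kept separate.
    [2] XST comes from four sources: manual annotation and three automatic analyses (heuristic rules, a naive Bayes classifier or NBC, and a hidden Markov model or HMM). The utterance topic likewise comes from manual annotation and from three algorithmic analyses, one for each automatic XST recognizer, because automatic utterance-topic identification depends in part on XST. For details of the XST and utterance-topic analyses and their results, see Ref. [9]. Qualitatively, XST recognition accuracy ranks NBC > heuristic > HMM, and utterance-topic results rank annotated XST > {NBC, heuristic} > HMM, with NBC and heuristic roughly equal.
    [3] DRI (Discourse Resource Initiative), an international initiative on discourse resources, held its third workshop in Japan in 1998. One of the workshop themes was discourse structure annotation. For the workshop report, see Core, Mark, Masato Ishizaki, Johanna Moore, Christine Nakatani, Norbert Reithinger, David Traum, and Syun Tutiya. 1999. The report of the third workshop of the Discourse Resource Initiative. Technical Report No.3 CC-TR-99-1, Chiba University and Kazusa Academia Hall. Chiba Corpus Project (http://www.stanford.edu/~jurafsky/Coreeta199.pdf). For the coding manual, see Nakatani, Christine H., and David R. Traum. 1999. Coding discourse structure in dialogue (Version 1.0). Technical Report UMIACS-TR-99-03, University of Maryland (http://hdl.handle.net/1903/991).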
Cite this article

Xu WQ, Xu B, Huang TY. Utterance grouping in spoken dialogues. Journal of Software (软件学报), 2006, 17(2):250-258

History
  • Received: 2004-12-10
  • Last revised: 2005-02-03