Automatic Labeling of Semantic Roles on Chinese FrameNet (汉语框架语义角色的自动标注)

Fund Project:

Supported by the National Natural Science Foundation of China under Grant No. 60873128; the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2006AA01Z142

    Abstract:

    This paper studies automatic semantic role labeling for Chinese FrameNet (CFN), the frame-semantic knowledge base developed at Shanxi University. Using an IOB (inside/outside/begin) strategy, semantic role labeling is recast as a word-level sequence labeling problem and solved with a Conditional Random Fields (CRF) model. The basic labeling unit is the word. The features are the word, its part of speech, its position relative to the target word, the target word itself, and combinations of these. Several candidate window sizes are set for each feature, and their combinations form the model's feature templates; a method based on orthogonal arrays from statistics is given for selecting a better template. All experiments use a corpus of 6,692 example sentences from 25 frames selected from CFN. For each frame, a separate model is trained on that frame's example sentences, with boundary identification and classification of semantic roles performed simultaneously, and is evaluated by 2-fold cross-validation. Given the target word in a sentence and the frame to which it belongs, the cross-validated precision, recall, and F1-measure over the 25 frames reach 74.16%, 52.70%, and 61.62%, respectively.
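
The IOB reformulation and the windowed features described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sentence, the part-of-speech tags, the frame, the role names (`Buyer`, `Time`, `Goods`), the span format, the feature names, and the coarse `before`/`target`/`after` encoding of relative position are all assumptions made for illustration, and no CRF is trained here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical span format: (start, end_exclusive, role_name) over word indices.
Span = Tuple[int, int, str]


def spans_to_iob(n_words: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping role spans as one IOB tag per word."""
    tags = ["O"] * n_words
    for start, end, role in spans:
        tags[start] = f"B-{role}"
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"
    return tags


def iob_to_spans(tags: List[str]) -> List[Span]:
    """Decode IOB tags into role spans, recovering boundaries and labels together."""
    spans: List[Span] = []
    start: Optional[int] = None
    role: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # the sentinel closes a trailing span
        continues = tag.startswith("I-") and role == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, role))
            start, role = None, None
        if tag.startswith("B-"):
            start, role = i, tag[2:]
    return spans


def word_features(words: List[str], pos_tags: List[str],
                  target_idx: int, i: int, window: int = 1) -> Dict[str, str]:
    """Features for word i: target word, relative position, and a symmetric
    window of words and POS tags (window size is the tunable template choice)."""
    if i < target_idx:
        rel = "before"
    elif i > target_idx:
        rel = "after"
    else:
        rel = "target"
    feats = {"target": words[target_idx], "rel_pos": rel}
    for off in range(-window, window + 1):
        j = i + off
        if 0 <= j < len(words):
            feats[f"w[{off}]"] = words[j]
            feats[f"pos[{off}]"] = pos_tags[j]
    return feats


# Hypothetical example sentence (not taken from the CFN corpus); target word: 买.
WORDS = ["他", "昨天", "买", "了", "一", "本", "书"]
POS = ["r", "t", "v", "u", "m", "q", "n"]
TARGET_IDX = 2
ROLE_SPANS = [(0, 1, "Buyer"), (1, 2, "Time"), (4, 7, "Goods")]

if __name__ == "__main__":
    tags = spans_to_iob(len(WORDS), ROLE_SPANS)
    for idx, (word, tag) in enumerate(zip(WORDS, tags)):
        print(word, tag, word_features(WORDS, POS, TARGET_IDX, idx))
```

Because each tag carries both a boundary marker (`B`/`I`/`O`) and a role name, a single sequence labeler decides boundaries and role classes at once, which matches the simultaneous identification and classification described above. Varying `window` per feature type is one way to generate the candidate templates among which a selection method would choose.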

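Cite this article

Li Jihong (李济洪), Wang Ruibo (王瑞波), Wang Weilin (王蔚林), Li Guochen (李国臣). Automatic Labeling of Semantic Roles on Chinese FrameNet. Journal of Software (软件学报), 2010, 21(4): 597-611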

History
  • Received: 2008-11-22
  • Last revised: 2009-10-14