Graph-Based Semi-Supervised Relation Extraction (基于图的半监督关系抽取)
Authors: Chen Jinxiu (陈锦秀), Ji Donghong (姬东鸿)
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60803078 and 60773011

    Abstract:

    This paper applies a graph-based semi-supervised learning algorithm, label propagation, to automatically identify relations between entities in unstructured text. The method first models relation extraction as a graph: each labeled and unlabeled example is a node, and the distances between examples serve as the weights of the edges. Relation extraction then becomes the task of estimating a labeling function on this graph that satisfies the global consistency assumption. Experiments on the ACE (Automatic Content Extraction) corpus show that, when only a few labeled examples are available, label propagation outperforms supervised relation extraction based on SVM (support vector machine) and clearly outperforms bootstrapping-based semi-supervised relation extraction.
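
    The graph formulation in the abstract can be illustrated with a minimal sketch of label propagation. This is not the authors' implementation: the Gaussian edge weighting, the `sigma` and `iterations` parameters, the function name `label_propagation`, and the 2-D toy points (standing in for relation-instance feature vectors) are all assumptions chosen for illustration.

```python
import math


def label_propagation(points, labels, num_classes, sigma=1.0, iterations=200):
    """Propagate labels over a fully connected similarity graph.

    points: list of numeric feature vectors, one per example (graph node).
    labels: class index for labeled examples, None for unlabeled ones.
    Returns one predicted class index per example.
    """
    n = len(points)

    # Edge weights: closer examples get larger weights (assumed Gaussian form).
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist_sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = math.exp(-dist_sq / sigma ** 2)

    # Row-normalize so each node takes a weighted average over its neighbors.
    transition = []
    for row in weights:
        total = sum(row) or 1.0  # guard against an isolated node
        transition.append([w / total for w in row])

    def one_hot(cls):
        return [1.0 if c == cls else 0.0 for c in range(num_classes)]

    # Labeled nodes start at their known class; unlabeled nodes start uniform.
    scores = [
        one_hot(lab) if lab is not None else [1.0 / num_classes] * num_classes
        for lab in labels
    ]

    for _ in range(iterations):
        updated = [
            [sum(transition[i][j] * scores[j][c] for j in range(n))
             for c in range(num_classes)]
            for i in range(n)
        ]
        # Clamp: labeled nodes keep their original labels after every step.
        for i, lab in enumerate(labels):
            if lab is not None:
                updated[i] = one_hot(lab)
        scores = updated

    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]


# Toy data (hypothetical): two well-separated clusters, one labeled example each.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = [0, None, None, 1, None, None]
predicted = label_propagation(points, labels, num_classes=2)
print(predicted)  # [0, 0, 0, 1, 1, 1]
```

    Each unlabeled node ends up with the label of the cluster it sits in, because its score is repeatedly averaged over strongly connected neighbors while the labeled nodes stay clamped. A real relation-extraction setup would replace the toy points with feature vectors for entity pairs and would need a distance measure suited to those features.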

Cite this article

Chen Jinxiu, Ji Donghong. Graph-based semi-supervised relation extraction. Journal of Software (软件学报), 2008, 19(11): 2843-2852.

History
  • Received: 2008-02-29
  • Last revised: 2008-08-26