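Few-shot Named Entity Recognition Based on Fine-grained Prototypical Network

Author biographies:

Qi Rongzhi (戚荣志, born 1980), male, Ph.D., associate professor, CCF professional member; research interests: entity recognition, relation extraction, and intelligent software engineering. Zhou Junyu (周俊宇, born 1999), male, master's degree; research interest: few-shot named entity recognition. Li Shuiyan (李水艳, born 1980), female, lecturer, CCF professional member; research interests: entity recognition and relation extraction. Mao Yingchi (毛莺池, born 1976), female, Ph.D., professor, doctoral supervisor, CCF senior member; research interests: distributed data processing and edge intelligent computing.

Corresponding author:

Li Shuiyan, E-mail: lsy@hhu.edu.cn

CLC number:

TP18

Funding:

National Key Research and Development Program of China (2022YFC3005401)
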
    Abstract:

    Applying prototypical networks directly to few-shot named entity recognition (FEW-NER) raises two problems. First, non-entity tokens share no strong semantic relationship with one another, so building the non-entity prototype in the same way as entity prototypes yields a prototype that cannot accurately represent the semantic features of non-entities. Second, using only the mean of the entity vector representations as the prototype makes it difficult to capture entities of the same class whose semantic features differ greatly. To address these problems, this study proposes FEW-NER based on fine-grained prototypical networks (FNFP), which aims to improve the labeling performance of FEW-NER. Firstly, a different non-entity prototype is constructed for each query-set sample, capturing the key non-entity semantic features in the sentence and producing finer-grained prototypes that improve the recognition of non-entities. Then, an inconsistency metric module is designed to measure the inconsistency among entities of the same class, and different metric functions are applied to entities and non-entities, so as to reduce the discrepancy between the feature representations of same-class samples and strengthen the representational ability of the prototypes. Finally, a Viterbi decoder is introduced to capture label transition relationships and optimize the final label sequence. Experimental results show that the proposed method improves over baseline methods on the large-scale FEW-NER dataset FEW-NERD; experiments on cross-domain datasets further verify its generalization ability across different domain scenarios.
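To make the starting point of the abstract concrete, the following is a minimal sketch of the baseline it criticizes (one mean-vector prototype per class, including the non-entity class) followed by Viterbi decoding over label transition scores. It is not the FNFP method: the query-specific non-entity prototypes and the inconsistency metric module are not specified in the abstract and are not reproduced here. The 2-D "embeddings", the labels `O`, `PER`, and `LOC`, the squared Euclidean distance, and the transition bonus are all illustrative assumptions.

```python
from collections import defaultdict

Vector = list[float]


def mean_prototypes(support: list[tuple[Vector, str]]) -> dict[str, Vector]:
    """Baseline prototype: the mean of all support vectors of a class."""
    grouped: dict[str, list[Vector]] = defaultdict(list)
    for vector, label in support:
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def emission_scores(token: Vector, prototypes: dict[str, Vector]) -> dict[str, float]:
    """Score each label by negative squared Euclidean distance to its prototype."""
    return {
        label: -sum((a - b) ** 2 for a, b in zip(token, proto))
        for label, proto in prototypes.items()
    }


def viterbi(
    emissions: list[dict[str, float]],
    transitions: dict[tuple[str, str], float],
) -> list[str]:
    """Return the highest-scoring label sequence; unlisted transitions score 0."""
    labels = list(emissions[0])
    best = dict(emissions[0])
    backpointers: list[dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: dict[str, float] = {}
        pointers: dict[str, str] = {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    path = [max(labels, key=lambda label: best[label])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy support set (hypothetical 2-D embeddings, two examples per class).
support = [
    ([0.0, 0.0], "O"), ([0.0, 1.0], "O"),
    ([4.0, 4.0], "PER"), ([4.0, 6.0], "PER"),
    ([-4.0, 4.0], "LOC"), ([-4.0, 6.0], "LOC"),
]
prototypes = mean_prototypes(support)

# Toy query sentence of three tokens; the last token is ambiguous.
query = [[0.2, 0.3], [3.5, 5.0], [0.3, 4.0]]
emissions = [emission_scores(token, prototypes) for token in query]

greedy = [max(scores, key=scores.get) for scores in emissions]
decoded = viterbi(emissions, {("PER", "PER"): 3.0})  # assumed bonus for staying in PER

print(greedy)   # ['O', 'PER', 'O']
print(decoded)  # ['O', 'PER', 'PER']
```

In this toy setup the third token is slightly closer to the `O` prototype than to `PER`, so nearest-prototype labeling tags it as a non-entity, while the assumed `PER`→`PER` transition bonus lets the Viterbi decoder keep it inside the preceding entity. This is only meant to illustrate why a decoder over label transitions can change the final sequence.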

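Cite this article

Qi RZ, Zhou JY, Li SY, Mao YC. Few-shot named entity recognition based on fine-grained prototypical network. Ruan Jian Xue Bao/Journal of Software, 2024, 35(10): 4751-4765 (in Chinese with English abstract).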

History
  • Received: 2023-01-16
  • Last revised: 2023-04-02
  • Published online: 2023-09-06
  • Publication date: 2024-10-06