Few-shot Named Entity Recognition Based on Fine-grained Prototypical Network
Author: Qi Rongzhi, Zhou Junyu, Li Shuiyan, Mao Yingchi
Affiliation:

CLC Number: TP18

    Abstract:

When prototypical networks are directly applied to few-shot named entity recognition (FEW-NER), two problems arise. First, non-entity tokens do not share strong semantic relationships with one another, so constructing the non-entity prototype in the same way as entity prototypes prevents it from accurately representing the semantic characteristics of non-entities. Second, computing a prototype only as the average of entity vectors makes it difficult to capture similar entities with different semantic features. To address these problems, this study proposes a FEW-NER method based on fine-grained prototypical networks (FNFP) to improve the annotation performance of FEW-NER. First, different non-entity prototypes are constructed for different query sets to capture the key semantic features of non-entities in sentences, yielding finer-grained prototypes and improving the recognition of non-entities. Then, an inconsistency metric module is designed to measure the inconsistency between similar entities, and different metric functions are applied to entities and non-entities, so as to reduce the gap in feature representations between similar samples and improve the feature representation of the prototypes. Finally, a Viterbi decoder is introduced to capture label transition relationships and optimize the final annotation sequence. Experimental results show that the proposed method achieves improved performance on the large-scale few-shot NER dataset FEW-NERD, and its generalization ability in different domain scenarios is verified on a cross-domain dataset.
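    The following is a minimal sketch, not the authors' implementation, of the prototypical-network pipeline the abstract builds on: class prototypes are computed as the mean of support-token embeddings, query tokens are scored by their distance to each prototype, and a Viterbi decoder produces the final label sequence. The fine-grained non-entity prototypes and the inconsistency metric module proposed in the paper are not reproduced here; the function names, the random toy data, and the uniform transition matrix are illustrative assumptions, and token embeddings are assumed to come from an encoder such as BERT.

    import numpy as np

    def class_prototypes(support_emb, support_labels, num_classes):
        # Average the support-token embeddings of each class to obtain its prototype.
        protos = np.zeros((num_classes, support_emb.shape[1]))
        for c in range(num_classes):
            mask = support_labels == c
            if mask.any():
                protos[c] = support_emb[mask].mean(axis=0)
        return protos

    def emission_scores(query_emb, protos):
        # Negative squared Euclidean distance to each prototype serves as the emission score.
        diff = query_emb[:, None, :] - protos[None, :, :]   # shape (T, C, D)
        return -np.sum(diff ** 2, axis=-1)                  # shape (T, C)

    def viterbi_decode(emissions, transitions):
        # Standard Viterbi search over per-token emission scores and label-transition scores.
        T, C = emissions.shape
        score = emissions[0].copy()
        backptr = np.zeros((T, C), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + transitions + emissions[t][None, :]  # (prev, curr)
            backptr[t] = cand.argmax(axis=0)
            score = cand.max(axis=0)
        path = [int(score.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(backptr[t, path[-1]]))
        return path[::-1]

    # Toy usage: label 0 stands for non-entities, labels 1 and 2 for two entity classes.
    rng = np.random.default_rng(0)
    support_emb = rng.normal(size=(12, 4))          # 12 support tokens, 4-dim embeddings
    support_labels = rng.integers(0, 3, size=12)
    query_emb = rng.normal(size=(6, 4))             # one query sentence of 6 tokens
    protos = class_prototypes(support_emb, support_labels, num_classes=3)
    transitions = np.zeros((3, 3))                  # placeholder: uniform transition scores
    print(viterbi_decode(emission_scores(query_emb, protos), transitions))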

Get Citation

Qi RZ, Zhou JY, Li SY, Mao YC. Few-shot named entity recognition based on fine-grained prototypical network. Ruan Jian Xue Bao/Journal of Software, 2024, 35(10): 4751-4765 (in Chinese with English abstract).

History
  • Received: January 16, 2023
  • Revised: April 02, 2023
  • Online: September 06, 2023
  • Published: October 06, 2024