Few-shot Named Entity Recognition Based on Fine-grained Prototypical Network
Author: Qi Rongzhi, Zhou Junyu, Li Shuiyan, Mao Yingchi
Affiliation:

CLC Number: TP18

    Abstract:

When prototypical networks are directly applied to few-shot named entity recognition (FEW-NER), two problems arise. First, non-entity tokens do not share strong semantic relationships with one another, so constructing the non-entity prototype in the same way as entity prototypes prevents it from accurately representing the semantic characteristics of non-entities. Second, computing a prototype only as the average of entity vectors makes it difficult to capture similar entities with different semantic features. To address these problems, this study proposes a FEW-NER method based on fine-grained prototypical networks (FNFP) to improve the annotation performance of FEW-NER. First, different non-entity prototypes are constructed for different query sets to capture the key semantic features of non-entities in sentences, yielding finer-grained prototypes and improving the recognition of non-entities. Then, an inconsistency metric module is designed to measure the inconsistency between similar entities, and different metric functions are applied to entities and non-entities, so as to reduce the gap in feature representations between similar samples and improve the feature representation of the prototypes. Finally, a Viterbi decoder is introduced to capture label transition relationships and optimize the final annotation sequence. Experimental results show that the proposed method achieves improved performance on the large-scale few-shot NER dataset FEW-NERD, and its generalization ability in different domain scenarios is verified on a cross-domain dataset.
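    The following is a minimal sketch, not the authors' implementation, of the prototypical-network pipeline the abstract builds on: class prototypes are computed as the mean of support-token embeddings, query tokens are scored by their distance to each prototype, and a Viterbi decoder produces the final label sequence. The fine-grained non-entity prototypes and the inconsistency metric module proposed in the paper are not reproduced here; the function names, the random toy data, and the uniform transition matrix are illustrative assumptions, and token embeddings are assumed to come from an encoder such as BERT.

    import numpy as np

    def class_prototypes(support_emb, support_labels, num_classes):
        # Average the support-token embeddings of each class to obtain its prototype.
        protos = np.zeros((num_classes, support_emb.shape[1]))
        for c in range(num_classes):
            mask = support_labels == c
            if mask.any():
                protos[c] = support_emb[mask].mean(axis=0)
        return protos

    def emission_scores(query_emb, protos):
        # Negative squared Euclidean distance to each prototype serves as the emission score.
        diff = query_emb[:, None, :] - protos[None, :, :]   # shape (T, C, D)
        return -np.sum(diff ** 2, axis=-1)                  # shape (T, C)

    def viterbi_decode(emissions, transitions):
        # Standard Viterbi search over per-token emission scores and label-transition scores.
        T, C = emissions.shape
        score = emissions[0].copy()
        backptr = np.zeros((T, C), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + transitions + emissions[t][None, :]  # (prev, curr)
            backptr[t] = cand.argmax(axis=0)
            score = cand.max(axis=0)
        path = [int(score.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(backptr[t, path[-1]]))
        return path[::-1]

    # Toy usage: label 0 stands for non-entities, labels 1 and 2 for two entity classes.
    rng = np.random.default_rng(0)
    support_emb = rng.normal(size=(12, 4))          # 12 support tokens, 4-dim embeddings
    support_labels = rng.integers(0, 3, size=12)
    query_emb = rng.normal(size=(6, 4))             # one query sentence of 6 tokens
    protos = class_prototypes(support_emb, support_labels, num_classes=3)
    transitions = np.zeros((3, 3))                  # placeholder: uniform transition scores
    print(viterbi_decode(emission_scores(query_emb, protos), transitions))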

Get Citation

Qi RZ, Zhou JY, Li SY, Mao YC. Few-shot named entity recognition based on fine-grained prototypical network. Ruan Jian Xue Bao/Journal of Software, 2024, 35(10): 4751-4765 (in Chinese with English abstract).

History
  • Received: January 16, 2023
  • Revised: April 02, 2023
  • Online: September 06, 2023
  • Published: October 06, 2024