Short Text Classification Model Combining Knowledge Aware and Dual Attention
Author: Li Bohan, Xiang Yuxuan, Feng Ding, He Zhichao, Wu Jiajun, Dai Tianlun, Li Jing
Affiliation:

CLC Number: TP18

Abstract:

As the core problem of text mining, text classification has become an essential task in natural language processing. Short text classification is a hot and urgent sub-problem of text classification owing to the sparseness, real-time nature, and non-standard wording of short texts. In certain scenarios, short texts carry rich implicit semantics, which makes mining implicit semantic features from such limited text challenging. Existing methods mainly apply traditional machine learning or deep learning algorithms to short text classification; however, these algorithms are complex, require enormous cost to build an effective model, and remain inefficient. In addition, short texts contain little effective information and abundant colloquial language, which demands a stronger feature learning ability from the model. To address these problems, this study proposes the KAeRCNN model, which extends the TextRCNN model with knowledge awareness and a dual attention mechanism. The knowledge-aware component consists of two stages, knowledge graph entity linking and knowledge graph embedding, through which external knowledge is introduced to enrich semantic features. Meanwhile, the dual attention mechanism improves the model's efficiency in extracting effective information from short texts. Extensive experimental results show that the proposed KAeRCNN model significantly outperforms traditional machine learning algorithms in classification accuracy, F1 score, and practical application effect, and its performance and adaptability are further verified on different datasets. The proposed approach reaches an accuracy of 95.54% and an F1 score of 0.901. Compared with four traditional machine learning algorithms, it improves accuracy by about 14% on average and the F1 score by about 13%; compared with TextRCNN, it improves accuracy by about 3%. Comparisons with deep learning algorithms further show that the proposed model performs well on short texts from other domains. Both theoretical analysis and experimental results indicate that the KAeRCNN model is effective for short text classification.
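The abstract specifies the architecture only at a high level: a TextRCNN backbone, entity vectors obtained through knowledge graph entity linking and knowledge graph embedding, and two attention layers. The PyTorch sketch below shows one way these pieces could fit together; the module names, dimensions, fusion by concatenation, and the placement of the two attention layers are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class KAeRCNNSketch(nn.Module):
    """Minimal sketch: TextRCNN backbone + knowledge-aware entity
    embeddings + dual attention. All names/dimensions are assumptions."""

    def __init__(self, vocab_size, entity_vocab_size, embed_dim=128,
                 entity_dim=64, hidden_dim=128, num_classes=10):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        # In the described pipeline, entity vectors would come from a
        # pretrained knowledge graph embedding after entity linking;
        # a trainable lookup table stands in for them here.
        self.entity_embed = nn.Embedding(entity_vocab_size, entity_dim)
        fused_dim = embed_dim + entity_dim
        self.bilstm = nn.LSTM(fused_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Dual attention: one layer over the fused word/entity inputs,
        # one over the recurrent context representations.
        self.input_attn = nn.Linear(fused_dim, 1)
        self.context_attn = nn.Linear(2 * hidden_dim + fused_dim, 1)
        self.proj = nn.Linear(2 * hidden_dim + fused_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, word_ids, entity_ids):
        # word_ids, entity_ids: (batch, seq_len); entity_ids hold the
        # KG entity linked to each token (a padding id when unlinked).
        x = torch.cat([self.word_embed(word_ids),
                       self.entity_embed(entity_ids)], dim=-1)
        a1 = torch.softmax(self.input_attn(x), dim=1)    # first attention
        x = x * a1
        ctx, _ = self.bilstm(x)
        # TextRCNN-style: concatenate bidirectional context with input.
        h = torch.cat([ctx, x], dim=-1)
        a2 = torch.softmax(self.context_attn(h), dim=1)  # second attention
        h = torch.tanh(self.proj(h * a2))
        pooled = h.max(dim=1).values   # max-pooling over time, as in TextRCNN
        return self.classifier(pooled)
```

Called as model(word_ids, entity_ids) with two (batch, seq_len) index tensors, the sketch returns class logits; in the described pipeline the entity ids would be produced by an upstream entity-linking step against the knowledge graph.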

Get Citation

Li BH, Xiang YX, Feng D, He ZC, Wu JJ, Dai TL, Li J. Short text classification model combining knowledge aware and dual attention. Ruan Jian Xue Bao/Journal of Software, 2022, 33(10): 3565-3581 (in Chinese with English abstract).

History
  • Received: July 20, 2021
  • Revised: August 30, 2021
  • Online: February 22, 2022
  • Published: October 06, 2022