Multi-label Text Classification Method Based on Label Semantic Information
Author:
Affiliation:

CLC Number:

TP311

Fund Project:

National Natural Science Foundation of China (61822601, 61773050, 61632004); Beijing Natural Science Foundation of China (Z180006); Beijing Municipal Science & Technology Commission (Z181100008918012)

    Abstract:

    Multi-label classification has become a practical and important problem with the rise of big data. Its applications include text classification, image recognition, video annotation, and multimedia information retrieval. Traditional multi-label text classification algorithms treat labels as symbols without inherent semantics. In many scenarios, however, labels carry specific semantics, and the semantic information of the labels corresponds to the content of the documents. To establish and exploit this connection, a label semantic attention multi-label classification (LASA) method is proposed. The method relies on both the text and the labels of the documents, and shares word representations between them. For document embedding, a bi-directional long short-term memory (Bi-LSTM) network is used to obtain the hidden representation of each word. The weight of each word in the document is then obtained from the semantic representation of the label, so that the importance of each word to the current label is taken into account. In addition, labels are often related to each other in the semantic space; by using the semantic information of the labels, the model also takes label correlation into account to improve classification performance. Experimental results on standard multi-label classification datasets show that the proposed method effectively captures important words and outperforms existing state-of-the-art multi-label classification algorithms.
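The label-driven word weighting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each word's hidden state (for example, a Bi-LSTM output) and each label embedding live in the same vector space, and it assumes dot-product scoring followed by a softmax. The function names `label_attention`, `softmax`, and `dot` are hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def label_attention(word_states, label_embeddings):
    """Weight each word by its similarity to each label (illustrative only).

    word_states: list of n vectors of dimension d (assumed hidden states).
    label_embeddings: list of k vectors of dimension d (assumed label vectors).
    Returns (weights, label_docs), where weights[j][i] is the weight of
    word i for label j, and label_docs[j] is the weighted sum of the
    word states under label j.
    """
    weights, label_docs = [], []
    for label in label_embeddings:
        # Assumption: similarity is a plain dot product.
        alpha = softmax([dot(label, h) for h in word_states])
        doc = [
            sum(a * h[t] for a, h in zip(alpha, word_states))
            for t in range(len(label))
        ]
        weights.append(alpha)
        label_docs.append(doc)
    return weights, label_docs


# Toy input: three "words" and two "labels" in a 2-dimensional space.
toy_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
toy_labels = [[2.0, 0.0], [0.0, 2.0]]
toy_weights, toy_docs = label_attention(toy_states, toy_labels)
```

In this toy input the first label is aligned with the first word and the second label with the second word, so each label puts its largest weight on a different word. That is the behaviour the abstract describes as taking into account the importance of each word to the current label.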

Get Citation

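Xiao Lin, Chen Boli, Huang Xin, Liu Huafeng, Jing Liping, Yu Jian. Multi-label text classification based on label semantic attention. Journal of Software, 2020, 31(4): 1079-1089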

History
  • Received: May 29, 2019
  • Revised: July 29, 2019
  • Adopted:
  • Online: January 14, 2020
  • Published: April 06, 2020
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4