A Sequence-Based Automatic Text Classification Algorithm

    Abstract:

    This paper presents a sequence-based algorithm for automatic text classification. It exploits semantic relevance at two levels: between sentences (subpatterns) and between the keywords that carry specific meaning within a sentence (concept nodes). This allows each keyword to be assigned a dynamic weight. For subpatterns that contain no keywords, a Markov model is used to estimate the amplitude of their signals, and from these values the feature sequence of the text to be classified is constructed. In experiments on classifying Chinese documents, the algorithm's BEP value is about 83%. It is also easy to implement in a practical system.
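
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each sentence (subpattern) yields one amplitude equal to the sum of its keyword weights, and that a sentence without keywords receives the expected amplitude under a first-order Markov chain over discretized amplitude levels. The names `sentence_amplitudes`, `fill_with_markov`, `levels`, and `transition`, as well as all numeric values, are hypothetical. The keyword weights are fixed placeholders here, whereas the paper derives dynamic weights from semantic relevance.

```python
def sentence_amplitudes(sentences, keyword_weights):
    """Return one amplitude per sentence, or None if it has no keywords.

    Assumption: the amplitude is the plain sum of the weights of the
    keywords found in the sentence (the paper's weighting is not given).
    """
    amplitudes = []
    for tokens in sentences:
        hits = [keyword_weights[t] for t in tokens if t in keyword_weights]
        amplitudes.append(sum(hits) if hits else None)
    return amplitudes


def fill_with_markov(amplitudes, levels, transition, start_state=0):
    """Fill None entries with the expected level given the previous state.

    Assumption: amplitudes are discretized to the nearest entry of
    `levels`, and `transition[i][j]` is the probability of moving from
    level i to level j between consecutive sentences.
    """
    filled = []
    state = start_state
    for value in amplitudes:
        if value is None:
            # Expected amplitude of this sentence, given the previous state.
            value = sum(p * lv for p, lv in zip(transition[state], levels))
        filled.append(value)
        # Snap to the nearest level to obtain the state for the next step.
        state = min(range(len(levels)), key=lambda i: abs(levels[i] - value))
    return filled


# Hypothetical toy input: three tokenized sentences, two keywords.
weights = {"network": 1.0, "protocol": 0.5}
doc = [["the", "network", "protocol"], ["it", "works"], ["network"]]
levels = [0.0, 1.0]                      # low / high amplitude
transition = [[0.8, 0.2], [0.4, 0.6]]    # placeholder probabilities

feature_sequence = fill_with_markov(
    sentence_amplitudes(doc, weights), levels, transition
)
print(feature_sequence)
```

In this toy run the second sentence has no keywords, so its amplitude is estimated from the state reached after the first sentence. A classifier would then compare such feature sequences across documents; that step is not specified in the abstract and is left out here.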

Get Citation

Xie Chongfeng (解冲锋), Li Xing (李星). A sequence-based automatic text classification algorithm. Journal of Software (软件学报), 2002, 13(4): 783-789

History
  • Received: August 1, 2000
  • Revised: October 30, 2000