A Text Filtering System Based on Vector Space Model


Volume 14, Issue 3, 2003, pp. 435-442

Authors: HUANG Xuan-Jing, XIA Ying-Ju, WU Li-De

Abstract:

Text filtering is the process of retrieving documents relevant to the requirements of specific users from a large-scale text data stream. First, the Text Retrieval Conference (TREC), the most authoritative international evaluation conference on text retrieval, and its text filtering track are introduced in terms of tasks, topics, corpus, and evaluation metrics. Then a text filtering system based on the vector space model is presented. The system operates in two phases: training and adaptive filtering. During the training phase, feature selection and pseudo feedback are used to select the initial filtering profiles and thresholds. During the filtering phase, user feedback is used to modify the profiles and thresholds adaptively. The system participated in the 9th Text Retrieval Conference in 2000 and ranked high among the 15 participating systems from many countries. It achieved good performance, with average precisions of 26.5% for adaptive filtering and 31.7% for batch filtering.
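
The two-phase design described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the raw term-frequency weighting, the additive profile update, the fixed threshold step, and the names `AdaptiveFilter`, `alpha`, and `step` are all hypothetical, and the feature selection and pseudo feedback used in the paper's training phase are omitted.

```python
import math
from collections import Counter


def vectorize(text):
    """Map text to a sparse term-frequency vector (assumed weighting scheme)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class AdaptiveFilter:
    """Hypothetical profile-plus-threshold filter with feedback updates."""

    def __init__(self, training_docs, threshold=0.2, alpha=0.5, step=0.05):
        # Training phase (simplified): the initial profile is the sum of
        # the term vectors of the known relevant training documents.
        self.profile = Counter()
        for doc in training_docs:
            self.profile.update(vectorize(doc))
        self.threshold = threshold  # assumed initial threshold
        self.alpha = alpha          # assumed feedback weight
        self.step = step            # assumed threshold adjustment

    def score(self, doc):
        return cosine(self.profile, vectorize(doc))

    def process(self, doc, is_relevant):
        """Filter one document; return True if it is delivered.

        Feedback is consumed only for delivered documents, since the user
        never sees (and so never judges) a rejected one.
        """
        if self.score(doc) < self.threshold:
            return False
        sign = 1.0 if is_relevant else -1.0
        for term, weight in vectorize(doc).items():
            self.profile[term] += sign * self.alpha * weight
            if self.profile[term] <= 0:
                del self.profile[term]  # drop terms with no positive weight
        # Relax the threshold after a hit, tighten it after a miss.
        delta = -self.step if is_relevant else self.step
        self.threshold = max(0.0, min(1.0, self.threshold + delta))
        return True


if __name__ == "__main__":
    f = AdaptiveFilter(["vector space model text filtering"])
    print(f.process("text filtering with vector space model", True))
    print(f.process("stock market prices rise", False))
```

In this sketch a delivered relevant document pulls the profile toward its terms and lowers the threshold, while a delivered irrelevant document does the opposite. A real system would likely tune the threshold against the evaluation metric rather than use a fixed step.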

Key words: text retrieval; text filtering; text categorization; machine learning; vector space model

Citation:

HUANG Xuan-Jing, XIA Ying-Ju, WU Li-De. A text filtering system based on vector space model. Journal of Software, 2003, 14(3): 435-442.


History

Received: September 14, 2001
Revised: April 10, 2002
