A Web Document Clustering Algorithm Based on Association Rule
    Abstract:

    By grouping similar Web documents into clusters, the search space can be reduced, the search accelerated, and its precision improved. This paper introduces a new clustering algorithm. In this technique, topics are represented according to the VSM (vector space model), documents are represented in terms of topics, and the relation between documents and topics is viewed in transactional form: each document corresponds to a transaction and each topic corresponds to an item. Frequent itemsets are found using an association rule discovery algorithm, and the documents corresponding to each frequent itemset are taken as an initial cluster. These clusters are then merged according to the distance between clusters, or divided according to the strength of connection among the documents within a cluster. Experimental results on real Web documents show the algorithm's effectiveness and its suitability for handling the overlapping clusters inherent in documents.
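
    The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes topics have already been extracted for each document (the VSM step is skipped), uses a simple level-wise, Apriori-style search for frequent itemsets, and uses Jaccard distance between document sets as a stand-in for the cluster distance, which the abstract does not specify. The division step based on connection strength is omitted. All names (`frequent_itemsets`, `merge_clusters`, `min_support`, `max_distance`) and the sample data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Map each frequent topic set to the documents that contain it.

    transactions: dict of document id -> set of topics (document = transaction,
    topic = item). min_support is an absolute document count (>= 1).
    """
    items = sorted({t for topics in transactions.values() for t in topics})
    result = {}
    current = [frozenset([i]) for i in items]
    size = 1
    while current and size <= max_size:
        survivors = []
        for itemset in current:
            docs = frozenset(
                d for d, topics in transactions.items() if itemset <= topics
            )
            if len(docs) >= min_support:
                result[itemset] = docs
                survivors.append(itemset)
        # Build next-level candidates from pairs of surviving itemsets.
        size += 1
        current = list(
            {a | b for a, b in combinations(survivors, 2) if len(a | b) == size}
        )
    return result


def jaccard_distance(a, b):
    """Assumed cluster distance: 1 - |a & b| / |a | b| over document sets."""
    return 1.0 - len(a & b) / len(a | b)


def merge_clusters(clusters, max_distance):
    """Repeatedly merge any two clusters whose distance is <= max_distance."""
    clusters = list(set(clusters))
    merged = True
    while merged:
        merged = False
        for a, b in combinations(clusters, 2):
            if jaccard_distance(a, b) <= max_distance:
                clusters.remove(a)
                clusters.remove(b)
                clusters.append(a | b)
                merged = True
                break
        clusters = list(set(clusters))  # drop duplicates created by a merge
    return clusters


# Hypothetical sample data: each document is a transaction of topics.
docs = {
    "d1": {"sports", "football"},
    "d2": {"sports", "football"},
    "d3": {"sports", "tennis"},
    "d4": {"finance", "stocks"},
    "d5": {"finance", "stocks", "sports"},
}

itemsets = frequent_itemsets(docs, min_support=2)
initial_clusters = set(itemsets.values())  # one cluster per frequent itemset
final_clusters = merge_clusters(initial_clusters, max_distance=0.6)

for cluster in sorted(final_clusters, key=sorted):
    print(sorted(cluster))
```

    In this toy data, `d5` ends up in both final clusters, which illustrates how deriving clusters from frequent itemsets allows a document to belong to more than one cluster.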

Citation:

Song Qinbao, Shen Junyi. A Web document clustering algorithm based on association rules. Journal of Software, 2002, 13(3): 417-423

History
  • Received: April 4, 2000
  • Revised: August 28, 2000