图像-文本相关性挖掘的Web图像聚类方法
DOI:
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

基金项目:

Supported by the National Natural Science Foundation of China under Grant Nos.60603096, 60533090 (国家自然科学基金); the National High-Tech Research and Development Plan of China under Grant No.2006AA010107 (国家高技术研究发展计划(863)); the Program for Changjiang Scholars and Innovative Research Team in University of China under Grant Nos.IRT0652, PCSIRT (长江学者和创新团队发展计划)


Clustering Web Images by Correlation Mining of Image-Text
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    为了实现Web图像检索结果的聚类,提出了一种Web图像的图聚类方法.首先定义了两种类型关联:单词与图像结点之间的异构链接以及单词结点之间的同构链接.为了克服传统的TF-IDF方法不能直接反映单词与图像之间的语义关联局限性,提出并定义了单词可见度(visibility)这一属性,并将其集成到传统的tf-idf模型中以挖掘单词-图像之间关联的权重.根据LDA(latent Dirichlet allocation)模型,单词-单词之间关联权重通过一个定义的主题相关度函数来计算.最后,应用复杂图聚类和二部图协同谱聚类等算法验证了在图模型上引入两种相关性关联的有效性,达到了改进了Web图像聚类性能的目的.

    Abstract:

    To cluster the retrieval results of Web image, a framework for the clustering is proposed in this paper. It explores the surrounding text to mine the correlations between words and images and therefore the correlations are used to improve clustering results. Two kinds of correlations, namely word to image and word to word correlations, are mainly considered. As a standard text process technique, tf-idf method cannot measure the correlation of word to image directly. Therefore, this paper proposes to combine tf-idf method with a feature of word, namely visibility, to infer the correlation of word to image. Through LDA model, it defines a topic relevance function to compute the weights of word to word correlations. Finally, complex graph clustering and spectral co-clustering algorithms are used to testify the effect of introducing visibility and topic relevance into image clustering. Encouraging experimental results are reported in this paper.

    参考文献
    相似文献
    引证文献
引用本文

吴 飞,韩亚洪,庄越挺,邵 健.图像-文本相关性挖掘的Web图像聚类方法.软件学报,2010,21(7):1561-1575

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2009-02-27
  • 最后修改日期:2009-07-23
  • 录用日期:
  • 在线发布日期:
  • 出版日期:
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号