基于矢量量化的快速图像检索
作者:
基金项目:

Supported by the National Natural Science Foundation of China under Grant No.60273005(国家自然科学基金)


Fast Image Search Using Vector Quantization
  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [19]
  • |
  • 相似文献 [20]
  • |
  • 引证文献
  • | |
  • 文章评论
    摘要:

    传统索引方法对高维数据存在"维数灾难"的困难.而对数据分布的精确描述及对数据空间的有效划分是高维索引机制中的关键问题.提出一种基于矢量量化的索引方法.该方法使用高斯混合模型描述数据的整体分布,并训练优化的矢量量化器划分数据空间.高斯混合模型能更好地描述真实图像库的数据分布;而矢量量化的划分方法可以充分利用维之间的统计相关性,能够对数据向量构造出更加精确的近似表示,从而提高索引结构的过滤效率并减少需要访问的数据向量.在大容量真实图像库上的实验表明,该方法显著减少了支配检索时间的I/O开销,提高了索引性能.

    Abstract:

    Traditional indexing methods face the difficulty of 慶urse of dimensionality?at high dimensionality. Accurate estimate of data distribution and efficient partition of data space are the key problems in high-dimensional indexing schemes. In this paper, a novel indexing method using vector quantization is proposed. It assumes a Gaussian mixture distribution which fits real-world image data reasonably well. After estimating this distribution through EM (expectation-maximization) method, this approach trains the optimized vector quantizers to partition the data space, which will gain from the dependency of dimensions and achieve more accurate vector approximation and less quantization distortion. Experiments on a large real-world dataset show a remarkable reduction of I/O overhead of the vector accesses which dominate the query time in the exact NN (nearest neighbor) searches. They also show an improvement on the indexing performance compared with the existing indexing schemes.

    参考文献
    [1]Rui Y, Huang TS, Chang SF. Image retrieval: Current techniques, promising directions and open issues. Journal of Visual Communication and Image Representation, 1999,10(4):39~62.
    [2]Rui Y, Huang TS, Ortega M, Mehrotra S. Relevance feedback: A power tool for interactive content-based image retrieval. IEEE Trans. on Circuits and Systems for Video Technology, 1998,8(5):644~655.
    [3]Nievergelt J, Hinterberger H, Sevcik K. The gridfile: An adaptable symmetric multikey file structure. ACM Trans. on Database Systems, 1984,9(1):38~71.
    [4]Robinson J. The k-d-b-tree: A search structure for large multidimensional dynamic indexes. In: Edmund YL, ed. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 1981.10~18.
    [5]Beckmann N, Kriegel HP, Schneider R, Seeger B. The R*-tree: An efficient and robust access method for points and rectangles. In:Garcia-Molina H, Jagadish HV, eds. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 1990.322~331.
    [6]Katayama N, Satoh S. The SR-tree: An index structure for high-dimensional nearest neighbor queries. In: Peckham J, ed. Proc. of the ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 1997. 369~380.
    [7]Weber R, Schek H, Blott S. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Gupta A, Shmueli O, Widom J, eds. Proc. of the 24th ACM Int'l Conf. on Very Large Data Bases (VLDB'98). New York: Morgan Kaufmann Publishers, 1998. 194~205.
    [8]Beyer K, Goldstein J, Ramakrishnan R. When is ‘nearest neighbor' meaningful? In: Beeri C, Buneman P, eds. Proc. of the 7th ACM Int'l Conf. on Database Theory (ICDT'99). Lecture Notes in Computer Science 1540, Berlin: Springer-Verlag, 1999.217~235.
    [9]Scott DW. Multivariate Density Estimation. New York: John Wiley and Sons, 1992.
    [10]Gersho A, Gray RM. Vector Quantization and Signal Compression. Boston: Kluwer Academic Press, 1992.
    [11]Sun SH, Lu ZM. Vector Quantization Technology and Applications. Beijing: Science Press, 2002 (in Chinese).
    [12]Lookabaugh TD, Gray RM. High-Resolution theory and the vector quantizer advantage. IEEE Trans. on Information Theory,1989,35(5):1020~1033.
    [13]Wu P, Manjunath B, Chandrasekaran S. An adaptive index structure for high-dimensional similarity search. In: Shnm HY, Liao M,Chang SF, eds. Proc. of Advances in Multimedia Information Processing-PCM 2001, the 2nd IEEE Pacific Rim Conf. on Multimedia. Lecture Notes in Computer Science 2195, Berlin: Springer-Verlag, 2001.71~77.
    [14]Ferhatosmanoglu H, Tuncel E, Agrawal D. Vector approximation based indexing for non-uniform high dimensional data sets. In:Proc. of the ACM Int'l Conf. on Information and Knowledge Management (CIKM 2000). New York: ACM Press, 2000. 202~209.
    [15]Forgy E. Cluster analysis of multivariate data: Efficiency vs. interpretability of classifications. Biometrics, 1965,21(3):768.
    [16]Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 1977,39(1):1~38.
    [17]Manjunath BS. Aerial photo image database. 2000. http:∥vision.ece.ucsb.edu/datasets/
    [18]Manjunath BS, Ma WY. Texture features for browsing and retrieval of image data. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1996,18(8):837~842.
    [19]孙圣和,陆哲明.矢量量化技术及应用.北京:科学出版社,2002.
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

叶航军,徐光祐.基于矢量量化的快速图像检索.软件学报,2004,15(5):712-719

复制
分享
文章指标
  • 点击次数:4695
  • 下载次数: 5777
  • HTML阅读次数: 0
  • 引用次数: 0
历史
  • 收稿日期:2003-03-04
  • 最后修改日期:2003-09-05
文章二维码
您是第19784061位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号