Weight Residual Vector Quantization: Vector Compression and Hierarchical Indexing Structure

Fund Project: National Key Research and Development Program of China (2024YFC2607402); National Natural Science Foundation of China (61972151)

    Abstract:

    With the widespread use of multi-source, heterogeneous, and multimodal data in large-model and data-lake scenarios, vector-based data retrieval and storage management have grown significantly. By mapping heterogeneous data to high-dimensional vector representations and building on vector indexes, vector databases manage diverse data types in a unified way and support high-quality similarity search, making them an important foundation for generative retrieval and AI-native databases. However, existing vector databases face significant bottlenecks in storage and indexing efficiency, index construction complexity, and retrieval accuracy. First, massive high-dimensional vectors increase index storage overhead and maintenance costs. Second, vector index structures are bloated and consume large amounts of memory. Third, the loss of retrieval accuracy caused by compression distortion remains unresolved. This study proposes a framework based on weight residual vector quantization (WRVQ). The method handles the quantization direction separately from the residual magnitude: it stores the residual direction as a unit vector and attaches a weight marker, achieving efficient compression and storage at a low distortion rate. For index construction, a three-layer inverted index tailored to the quantization characteristics of WRVQ is designed, consisting of an exact match layer, a fuzzy match layer, and a search layer. It combines asymmetric distance computation (ADC) with nearest neighbor search to provide approximate nearest neighbor (ANN) search that is both accurate and efficient. Experimental results on large-scale datasets show that, compared with traditional low-dimensional embedding models and existing quantization methods, WRVQ achieves significant improvements in quantization loss, storage compression ratio, and retrieval recall, and offers notable advantages in index construction and query performance.
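
The idea of storing a residual as a unit direction plus a weight can be illustrated with a minimal sketch. The abstract does not give the codebook training procedure, the weight encoding, or the layout of the three index layers, so everything below is an assumption made for illustration: the names `encode`, `decode`, and `adc_distance`, the tiny hand-written codebooks, and the choice to snap the weight to a small set of scalar levels are all hypothetical and are not the authors' implementation.

```python
import numpy as np

def encode(x, coarse, directions, weight_levels):
    """Encode x as (coarse centroid id, unit-direction id, weight-level id).

    Assumption: `directions` holds unit vectors and `weight_levels` holds
    non-negative scalars that include 0.
    """
    # Nearest coarse centroid.
    c = int(np.argmin(np.linalg.norm(coarse - x, axis=1)))
    residual = x - coarse[c]
    # For a unit direction d, the best weight is the projection residual . d,
    # so the best direction is the one with the largest projection.
    proj = directions @ residual
    d = int(np.argmax(proj))
    # Snap the ideal weight to the nearest stored level.
    w = int(np.argmin(np.abs(weight_levels - proj[d])))
    return c, d, w

def decode(code, coarse, directions, weight_levels):
    """Reconstruct an approximation: centroid + weight * unit direction."""
    c, d, w = code
    return coarse[c] + weight_levels[w] * directions[d]

def adc_distance(query, code, coarse, directions, weight_levels):
    """Asymmetric distance: the query stays uncompressed, the database
    vector is represented only by its code. Real ADC implementations
    usually precompute lookup tables; this version reconstructs directly."""
    diff = query - decode(code, coarse, directions, weight_levels)
    return float(diff @ diff)

# Hypothetical toy codebooks in 2D.
coarse = np.array([[0.0, 0.0], [10.0, 10.0]])
directions = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
weight_levels = np.array([0.0, 0.5, 1.0, 2.0])

x = np.array([10.9, 10.1])
code = encode(x, coarse, directions, weight_levels)
x_hat = decode(code, coarse, directions, weight_levels)
err_coarse_only = float(np.sum((x - coarse[code[0]]) ** 2))
err_with_residual = adc_distance(x, code, coarse, directions, weight_levels)
```

In this sketch, because the weight levels include 0 and the weight is snapped to the level nearest the projection, adding the weighted direction never increases the reconstruction error relative to using the coarse centroid alone.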

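Cite this article

Jiang Yuxuan (江宇轩), Yao Junjie (姚俊杰), Hou Yuxuan (侯宇轩). Weight Residual Vector Quantization: Vector Compression and Hierarchical Indexing Structure. Journal of Software (软件学报), 2026, 37(3): 1104-1120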

History
  • Received: 2025-05-06
  • Last revised: 2025-06-30
  • Published online: 2025-09-02