Anchor-based Unsupervised Cross-modal Hashing

doi:10.13328/j.cnki.jos.006960

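Authors:

HU Peng
College of Computer Science, Sichuan University, Chengdu 610065, China

PENG Xi
College of Computer Science, Sichuan University, Chengdu 610065, China

PENG De-Zhong
College of Computer Science, Sichuan University, Chengdu 610065, China; Chengdu Ruibei Yingte Information Technology Co. Ltd., Chengdu 610094, China
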
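About the authors: HU Peng (1990-), male, Ph.D., associate research fellow, doctoral supervisor, CCF professional member; his research interests include machine learning and multimedia analysis. PENG Xi (1983-), male, Ph.D., professor, doctoral supervisor, CCF professional member; his research interests include machine learning and multimedia analysis. PENG De-Zhong (1975-), male, Ph.D., professor, doctoral supervisor, CCF professional member; his research interests include blind signal processing and neural networks.
Corresponding author: PENG Xi, E-mail: pengx.gm@gmail.com
CLC number: TP301
Funding: National Natural Science Foundation of China (62102274, 62176171, U21B2040, U19A2078); Sichuan Science and Technology Program (2021YFS0389, 2022YFQ0014, 2022YFSY0047, 2022YFH0021); Fundamental Research Funds for the Central Universities (YJ202140); China Postdoctoral Science Foundation (2021M692270)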

Abstract:

Owing to its low storage cost and high retrieval speed, graph-based unsupervised cross-modal hash learning has attracted wide attention from academia and industry and has become an indispensable tool for cross-modal retrieval. However, the high computational complexity of graph construction hinders its use in large-scale multimodal applications. This study addresses two major challenges facing graph-based unsupervised cross-modal hash learning: 1) how to construct graphs efficiently in unsupervised cross-modal hash learning, and 2) how to handle discrete optimization in cross-modal hash learning. To address these two problems, the study proposes anchor-graph-based cross-modal learning and a differentiable hash layer. Specifically, it first randomly selects a number of image-text pairs from the training set as an anchor set and uses this anchor set as an intermediary to compute the graph matrix of each batch of data. This graph matrix then guides cross-modal hash learning, which greatly reduces the space and time cost. Second, the proposed differentiable hash layer computes directly with binary codes during forward propagation and still produces gradients for updating the network during backpropagation, without continuous-value relaxation, and thus yields better hash codes. Finally, a cross-modal ranking loss is introduced so that ranking results are taken into account during training, which improves cross-modal retrieval accuracy. Comparisons with 10 cross-modal hashing algorithms on three widely used datasets verify the effectiveness of the proposed algorithm.
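The abstract describes the anchor idea only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes a softmax over cosine similarities as the sample-to-anchor affinity, the common anchor-graph form S = Z diag(Zᵀ1)⁻¹ Zᵀ for the batch graph, and a simple average to fuse the image-side and text-side graphs; all function names, the `temperature` parameter, and the random features are hypothetical. The point it illustrates is that each batch only needs n × m affinities to m anchors instead of a graph over the whole training set.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean length."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def anchor_affinity(feats, anchors, temperature=0.1):
    """Row-stochastic affinities Z (n x m) between n samples and m anchors.

    Assumption: softmax over cosine similarities; the abstract does not
    specify the affinity function.
    """
    sims = l2_normalize(feats) @ l2_normalize(anchors).T   # (n, m)
    logits = sims / temperature
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    z = np.exp(logits)
    return z / z.sum(axis=1, keepdims=True)


def anchor_graph(feats, anchors, temperature=0.1):
    """Batch graph S = Z diag(Z^T 1)^{-1} Z^T, built from n x m affinities only."""
    z = anchor_affinity(feats, anchors, temperature)
    anchor_mass = z.sum(axis=0)                            # (m,)
    return (z / np.maximum(anchor_mass, 1e-12)) @ z.T      # (n, n)


def cross_modal_batch_graph(img_batch, txt_batch, img_anchors, txt_anchors):
    """Average the image-side and text-side graphs (an assumed fusion rule)."""
    s_img = anchor_graph(img_batch, img_anchors)
    s_txt = anchor_graph(txt_batch, txt_anchors)
    return 0.5 * (s_img + s_txt)


# Toy usage with random stand-ins for extracted features.
rng = np.random.default_rng(0)
train_img = rng.normal(size=(1000, 64))   # hypothetical image features
train_txt = rng.normal(size=(1000, 32))   # hypothetical text features

# Randomly chosen image-text pairs act as the anchor set.
anchor_idx = rng.choice(1000, size=50, replace=False)
batch_idx = rng.choice(1000, size=8, replace=False)

S = cross_modal_batch_graph(
    train_img[batch_idx], train_txt[batch_idx],
    train_img[anchor_idx], train_txt[anchor_idx],
)
print(S.shape)
```

With this construction the batch graph is symmetric and each row sums to 1, and its cost grows with the batch size and the number of anchors rather than with the size of the training set.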

Key words: unsupervised hash learning; cross-modal retrieval; anchor graph; differentiable hashing; common Hamming space
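The abstract states that the differentiable hash layer uses binary codes in the forward pass and still yields gradients in the backward pass, but it does not say how. As a rough illustration of that behavior only, the sketch below uses the generic straight-through estimator, which is a stand-in and may differ from the layer actually proposed in the paper. The class name, the clipping range, and the toy loss are all assumptions.

```python
import numpy as np


class SignHashLayer:
    """Binarize in the forward pass; pass gradients through in the backward pass.

    Assumption: a generic straight-through estimator, used here only as a
    stand-in for the paper's differentiable hash layer.
    """

    def forward(self, h):
        self.h = h                            # cache pre-activations for backward
        return np.where(h >= 0, 1.0, -1.0)    # codes in {-1, +1}

    def backward(self, grad_codes):
        # Treat d(code)/d(h) as 1 inside [-1, 1] and 0 outside (clipped identity).
        return grad_codes * (np.abs(self.h) <= 1.0)


layer = SignHashLayer()
h = np.array([[-2.0, -0.3, 0.0, 0.7]])        # hypothetical network outputs
codes = layer.forward(h)

target = np.array([[1.0, 1.0, 1.0, 1.0]])     # hypothetical target codes
grad_codes = codes - target                   # gradient of 0.5 * ||codes - target||^2
grad_h = layer.backward(grad_codes)
print(codes, grad_h)
```

The loss is computed on the binary codes themselves, so no continuous surrogate such as tanh appears in the forward pass; only the backward rule is approximated.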

Citation:

HU Peng, PENG Xi, PENG De-Zhong. Anchor-based unsupervised cross-modal hashing. Journal of Software (软件学报), 2024, 35(8): 3739-3751

History

Received: 2021-08-30
Last revised: 2022-10-13
Published online: 2023-09-06
Published: 2024-08-06
