National Natural Science Foundation of China (61966004, 61866004); Guangxi Natural Science Foundation (2019GXNSFDA245018)
Most cross-modal hashing retrieval methods match features using cosine similarity alone. This single similarity measure ignores how the relations among instances affect retrieval performance. To address this, the study proposes a method based on reasoning over multiple instance relation graphs. Similarity matrices are constructed to build global and local instance relation graphs, which expose the fine-grained relations among instances. Similarity reasoning is then carried out on these graphs in three stages: first within the image-modality and text-modality relation graphs separately, then by mapping the intra-modality relations into the instance graph, and finally within the instance graph itself. In addition, the neural network is trained with a step-by-step strategy to accommodate the differing characteristics of the image and text modalities. Experiments on the MIRFlickr and NUS-WIDE datasets show that the proposed method has a clear advantage in mean average precision (mAP) and also performs well on Top-k-Precision curves. These results indicate that mining instance relations in depth significantly improves retrieval performance.
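As a rough illustration of the graph construction described above, the following sketch builds a global relation graph from a cosine similarity matrix, derives a local graph from it, and runs one reasoning step. The abstract does not say how the local graph is formed or how reasoning is performed, so the top-k neighbour rule, the row-normalised propagation step, and all function names here are assumptions made for illustration only, not the paper's implementation.

```python
import math


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity of a list of feature vectors (the global graph)."""
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    n = len(features)
    return [
        [
            sum(a * b for a, b in zip(features[i], features[j])) / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def local_graph(sim, k):
    """Assumed local graph: keep only each instance's k most similar other instances."""
    n = len(sim)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        neighbours = sorted(others, key=lambda j: sim[i][j], reverse=True)[:k]
        for j in neighbours:
            adj[i][j] = sim[i][j]
    return adj


def propagate(adj, features):
    """Assumed reasoning step: replace each instance by the weighted mean of its neighbours."""
    dim = len(features[0])
    refined = []
    for row in adj:
        total = sum(abs(w) for w in row) or 1.0  # avoid division by zero for isolated nodes
        refined.append(
            [sum(w * f[d] for w, f in zip(row, features)) / total for d in range(dim)]
        )
    return refined


# Toy image-modality features; a text modality would be handled the same way.
image_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
global_graph = cosine_similarity_matrix(image_features)
local = local_graph(global_graph, k=1)
refined = propagate(local, image_features)
```

In this toy setting the first instance keeps only the second as its neighbour, so one propagation step moves its representation to that neighbour's features. A full system would presumably learn the propagation weights and combine the image, text, and instance graphs, which this sketch does not attempt.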