Abstract: With the rapid development of the Internet and multimedia technology, data on the Internet has expanded from text alone to image, video, audio, 3D model, and other media types, making cross-media retrieval a new trend in information retrieval. However, the "heterogeneity gap" leads to inconsistent representations across media types, and it is hard to measure the similarity between data of any two media types, which makes cross-media retrieval across multiple media types quite challenging. With recent advances in deep learning, it becomes possible to bridge the boundaries between different media types through the strong learning ability of deep neural networks. However, most existing deep learning based methods focus mainly on the pairwise correlation between two media types, such as image and text, and are difficult to extend to scenarios with more media types. To address this problem, a Deep Fine-grained Correlation Learning (DFCL) approach is proposed, which supports cross-media retrieval across up to five media types (image, video, text, audio, and 3D model). First, a cross-media recurrent neural network is proposed to jointly model the fine-grained information of up to five media types, fully exploiting the internal details and context information of each media type. Second, a cross-media joint correlation loss is proposed, which combines distribution alignment and semantic alignment to exploit both intra-media and inter-media fine-grained correlations; it further enhances semantic discrimination through semantic category information, effectively promoting the accuracy of cross-media retrieval. Extensive experiments are conducted on two cross-media datasets, namely the PKU XMedia and PKU XMediaNet datasets, which contain up to five media types. The experimental results verify the effectiveness of the proposed approach.
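The abstract describes the joint correlation loss only at a high level. As a rough illustration of how a loss combining distribution alignment and semantic alignment could be composed, the sketch below uses PyTorch; it is a minimal sketch under stated assumptions, not the paper's actual implementation. All names (MEDIA_TYPES, distribution_alignment, lambda_dist, etc.) are hypothetical, distribution alignment is approximated here by simple mean matching between per-media embedding distributions, and mini-batches are assumed to carry a shared semantic category label across the media types.

```python
# A minimal sketch (not the authors' released code) of a joint loss that
# combines distribution alignment and semantic alignment. All identifiers
# here are illustrative assumptions, not names from the paper.
import torch
import torch.nn.functional as F

MEDIA_TYPES = ["image", "video", "text", "audio", "3d"]  # up to five media types

def distribution_alignment(embeddings):
    """Penalize mismatch between per-media embedding distributions,
    approximated by distances between their batch mean vectors
    (a simple moment-matching proxy for inter-media alignment)."""
    means = [e.mean(dim=0) for e in embeddings.values()]
    loss = torch.zeros(())
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            loss = loss + F.mse_loss(means[i], means[j])
    return loss

def semantic_alignment(logits, labels):
    """Enforce semantic discrimination: representations of every media
    type must predict the shared semantic category of their sample."""
    return sum(F.cross_entropy(logits[m], labels) for m in logits)

def joint_correlation_loss(embeddings, logits, labels,
                           lambda_dist=1.0, lambda_sem=1.0):
    """Weighted combination of the two alignment terms.

    embeddings: dict media -> (batch, dim) tensor of representations
    logits:     dict media -> (batch, num_classes) category predictions
    labels:     (batch,) shared semantic category labels
    """
    return (lambda_dist * distribution_alignment(embeddings)
            + lambda_sem * semantic_alignment(logits, labels))
```

In this toy composition, the two weights trade off pulling the media-specific embedding distributions together against keeping each representation discriminative for its semantic category; the paper's actual formulation of intra-media and inter-media fine-grained correlation is richer than this mean-matching proxy.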