Abstract: This paper proposes a higher-order tensor framework for video analysis and understanding. In this framework, the three modalities of a video shot, namely image frames, audio, and text, are represented as data points by a 3rd-order tensor. A subspace embedding and dimension reduction method, called the TensorShot approach, is then proposed; it explicitly considers the manifold structure of the tensor space formed by the temporally sequenced, co-occurring multimodal media data in video. Transductive learning uses a large amount of unlabeled data together with the labeled data to build better classifiers. Accordingly, a transductive support tensor machines algorithm is proposed to train an effective classifier. This algorithm preserves the intrinsic structure of the submanifold from which the tensorshots are sampled and can map out-of-sample data points directly; moreover, the use of unlabeled data improves classification ability. Experimental results show that the proposed method improves the performance of video semantic concept detection.
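
As an illustrative sketch only (not part of the paper), the 3rd-order tensor representation of a video shot might be assembled as below, assuming each modality has already been converted into a fixed-size feature matrix. The function names, dimensions, and the SVD-based mode reduction are hypothetical placeholders, not the paper's actual TensorShot embedding.

```python
# Hypothetical sketch: building a 3rd-order "tensorshot" from three modalities.
# All dimensions and the generic multilinear reduction below are assumptions,
# standing in for (not reproducing) the TensorShot approach.
import numpy as np

def build_tensorshot(image_feat, audio_feat, text_feat):
    """Stack per-modality feature matrices (each d1 x d2) into a d1 x d2 x 3 tensor."""
    return np.stack([image_feat, audio_feat, text_feat], axis=2)

def unfold(tensor, mode):
    """Mode-n unfolding: flatten the tensor into a matrix along the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def reduce_mode(tensor, mode, rank):
    """Project one mode onto its leading left singular vectors (a generic
    multilinear dimension reduction, used here only as a placeholder)."""
    U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    reduced = np.tensordot(U[:, :rank].T, tensor, axes=([1], [mode]))
    return reduced, U[:, :rank]

# Toy usage with random "features" for one video shot (made-up sizes).
rng = np.random.default_rng(0)
shot = build_tensorshot(rng.normal(size=(64, 32)),
                        rng.normal(size=(64, 32)),
                        rng.normal(size=(64, 32)))
reduced, basis = reduce_mode(shot, mode=0, rank=8)
print(shot.shape, "->", reduced.shape)  # (64, 32, 3) -> (8, 32, 3)
```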