View-Invariant Action Recognition Based on Action Graphs
Authors: Yang YD, Hao AM, Chu QJ, Zhao QP, Wang LL

Funding: Supported by the National High-Tech Research and Development Plan of China (863 Program) under Grant Nos. 2006AA01Z333 and 2007AA01Z337, and the China High-Tech Olympics Project under Grant No. Z0005191041211

    Abstract (translated from the Chinese):

    For view-invariant action recognition, this paper proposes a weighted codebook vector representation and an action graph recognition model. The local interest-point features and the global shape description of a video are combined into the weighted codebook vector, which keeps the strong noise robustness of interest points while overcoming their inability to recognize static actions. Energy curves are built from 3D motion data such as motion capture and point clouds, key poses are extracted from them, and primitive motion segments are generated; the segments are connected by three link types (self-links, forward-links and back-links) into a directed graph called the essential graph. The essential graph is projected in all directions, and the directed graph built over the projected nodes by a nearest-neighbor rule is called the action graph. The action graph model is trained with Naïve Bayes, the Viterbi algorithm computes the match score between a video and the action graph, and the video sequence is labeled by the maximum match score. Because the action graph contains projections from many viewpoints with smooth transitions between adjacent projections, it can recognize video sequences shot from any viewpoint and with any motion direction. Experimental results show that the algorithm achieves good recognition performance on monocular video, multi-view video and multi-action video.
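    As a rough illustration of the weighted codebook vector, the following is a minimal sketch assuming the representation is a weighted concatenation of an interest-point word histogram and a normalized global shape descriptor; the function name, the weight alpha, and the nearest-neighbor quantization are our assumptions, not the paper's implementation:

        import numpy as np

        def weighted_codebook_vector(points, shape, codebook, alpha=0.5):
            """points   : (n, d) interest-point descriptors from one video
               shape    : (m,) global shape descriptor of the actor
               codebook : (k, d) visual words, e.g. from k-means
               alpha    : assumed weight balancing dynamic vs. static evidence"""
            # Quantize each interest point to its nearest visual word.
            dists = np.linalg.norm(points[:, None, :] - codebook[None, :, :], axis=2)
            hist = np.bincount(dists.argmin(axis=1), minlength=len(codebook)).astype(float)
            hist /= max(hist.sum(), 1.0)                 # normalized word histogram
            shape = shape / (np.linalg.norm(shape) + 1e-8)
            # Interest points capture dynamics; the shape term lets purely
            # static actions still be separated.
            return np.concatenate([alpha * hist, (1.0 - alpha) * shape])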

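    The essential graph construction can be sketched in the same spirit; this is a toy version, and encoding key poses as integer ids and reading the back-link as a reversed forward-link are our assumptions:

        def build_essential_graph(segments):
            """segments: list of primitive motion segments, each a list of
            key-pose ids in temporal order."""
            edges = set()
            for seg in segments:
                for i, pose in enumerate(seg):
                    edges.add((pose, pose))            # self-link: a pose may persist
                    if i + 1 < len(seg):
                        edges.add((pose, seg[i + 1]))  # forward-link: normal progression
                        edges.add((seg[i + 1], pose))  # back-link: reversed traversal
            return edges

    Projecting the 3D pose at each node onto many view directions and re-linking nearby projected nodes would then yield the action graph described above.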
    Abstract:

    This paper proposes a weighted codebook vector representation and an action graph model for view-invariant human action recognition. A video is represented as a weighted codebook vector combining dynamic interest points and static shapes; this combined representation is robust to noise and classifies static actions well. Several 3D key poses are extracted from motion capture data or point cloud data, and a set of primitive motion segments is generated. A directed graph called the essential graph is built from these segments using self-links, forward-links and back-links. The action graph is generated by projecting the essential graph from a wide range of viewpoints. Naïve Bayes is used to train a statistical model for each node. Given an unlabeled video, the Viterbi algorithm computes the match score between the video and the action graph, and the video is then labeled by the maximum score. Finally, the algorithm is tested on the IXMAS dataset and the CMU motion capture library. The experimental results demonstrate that the algorithm recognizes actions independently of viewpoint and achieves high recognition rates.
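    The matching step can be sketched as follows; this is a minimal illustration, assuming the trained per-node Naïve Bayes models are reduced to per-frame log-likelihoods and the graph's links to a log-transition matrix, and none of these names come from the paper:

        import numpy as np

        def viterbi_match(log_lik, log_trans, log_prior):
            """log_lik   : (T, S) log P(frame_t | node_s) from the node models
               log_trans : (S, S) log transition weights along the graph's
                           self-, forward- and back-links (-inf where no edge)
               log_prior : (S,) log prior over starting nodes"""
            delta = log_prior + log_lik[0]     # best path score ending at each node
            for t in range(1, len(log_lik)):
                # Extend the best-scoring predecessor path into each node.
                delta = (delta[:, None] + log_trans).max(axis=0) + log_lik[t]
            return delta.max()                 # match score used to label the video

    An unlabeled video would then receive the label of the action graph with the largest returned score.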

    References
    [1] Gavrila DM. The visual analysis of human movement: A survey. Computer Vision and Image Understanding, 1999,73(1):82-98.
    [2] Wang L, Hu WM, Tan TN. Recent developments in human motion analysis. Pattern Recognition, 2003,36(3):585-601.
    [3] Aggarwal JK, Park S. Human motion: Modeling and recognition of actions and interactions. In: Proc. of the 2nd Int’l Symp. on 3D Data Processing, Visualization, and Transmission. Washington: IEEE Computer Society, 2004. 640-647. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1335299
    [4] Moeslund TB, Hilton A, Kruger V. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 2006,104(2):90-126.
    [5] Ahmad M, Lee S. Human action recognition using shape and CLG-motion flow from multi-view image sequences. Pattern Recognition, 2008,41(7):2237-2252.
    [6] Ahmad M, Lee S. HMM-Based human action recognition using multiview image sequences. In: Proc. of the 18th Int’l Conf. on Pattern Recognition, Vol.01. Washington: IEEE Computer Society, 2006. 263-266. http://dx.doi.org/10.1109/ICPR.2006.630
    [7] Efros AA, Berg AC, Mori G, Malik J. Recognizing action at a distance. In: Proc. of the 9th IEEE Int’l Conf. on Computer Vision, Vol.2. Washington: IEEE Computer Society, 2003. 726. http://portal.acm.org/citation.cfm?id=946720
    [8] Bobick AF, Davis JW. The recognition of human movement using temporal templates. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2001,23(3):257-267.
    [9] Yilmaz A, Shah M. Actions sketch: A novel action representation. In: Proc. of the 2005 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR 2005), Vol.1. Washington: IEEE Computer Society, 2005. 984-989. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=1467373&isnumber=31472
    [10] Gorelick L, Blank M, Shechtman E, Irani M, Basri R. Actions as space-time shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2007,29(12):2247-2253.
    [11] Laptev I. On space-time interest points. Int’l Journal of Computer Vision, 2005,64(2-3):107-123.
    [12] Oikonomopoulos A, Patras I, Pantic M. Spatiotemporal salient points for visual recognition of human actions. IEEE Trans. on Systems, Man, and Cybernetics, 2006,36(3):710-719.
    [13] Dollár P, Rabaud V, Cottrell G, Belongie S. Behavior recognition via sparse spatio-temporal features. In: Proc. of the 14th Int’l Conf. on Computer Communications and Networks. Washington: IEEE Computer Society, 2005. 65-72. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=1570899&isnumber=33252
    [14] Schuldt C, Laptev I, Caputo B. Recognizing human actions: A local SVM approach. In: Proc. of the 17th Int’l Conf. on Pattern Recognition. Washington: IEEE Computer Society, 2004. 32-36. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1334462
    [15] Wong S, Cipolla R. Extracting spatiotemporal interest points using global information. In: Proc. of the 11th IEEE Int’l Conf. on Computer Vision. Los Alamitos: IEEE Computer Society, 2007. 1-8. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4408923
    [16] Niebles JC, Wang H, Li FF. Unsupervised learning of human action categories using spatial-temporal words. In: Proc. of the British Machine Vision Conf. (BMVC). The British Machine Vision Association, 2006. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.83.8353
    [17] Wong S, Kim T, Cipolla R. Learning motion categories using both semantic and structural information. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2007. 1-6. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4270330
    [18] Niebles JC, Wang H, Li FF. Unsupervised learning of human action categories using spatial-temporal words. Int’l Journal of Computer Vision, 2008,79(3):299-318.
    [19] Ramanan D, Forsyth DA. Automatic annotation of everyday movements. Technical Report, CSD-03-1262, UC Berkeley, 2003.
    [20] Ikizler N, Forsyth D. Searching video for complex activities with finite state models. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2007. 1-8. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4270193
    [21] Weinland D, Ronfard R, Boyer E. Automatic discovery of action taxonomies from multiple views. In: Proc. of the 2006 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2006. 1639-1645. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1640952
    [22] Weinland D, Ronfard R, Boyer E. Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding, 2006,104(2):249-257.
    [23] Parameswaran V, Chellappa R. View invariance for human action recognition. Int’l Journal of Computer Vision, 2006,66(1):83-101.
    [24] Huang FY, Xu GY. Viewpoint independent action recognition. Journal of Software, 2008,19(7):1623-1634 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/19/1623.htm
    [25] Ogale AS, Karapurkar A, Aloimonos Y. View invariant modeling and recognition of human actions using grammars. In: Proc. of the Int’l Conf. on Computer Vision, Workshop on Dynamical Vision (ICCV-WDM). Berlin, Heidelberg: Springer-Verlag, 2005. http://dx.doi.org/10.1007/978-3-540-70932-9_9
    [26] Weinland D, Boyer E, Ronfard R. Action recognition from arbitrary views using 3D exemplars. In: Proc. of the 11th IEEE Int’l Conf. on Computer Vision. Los Alamitos: IEEE Computer Society, 2007. 1-7. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4408849
    [27] Lv F, Nevatia R. Single view human action recognition using key pose matching and Viterbi path searching. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2007. 1-8. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=4270156&isnumber=4269956
    [28] Zivkovic Z, Heijden F. Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern Recognition Letters, 2006,27(7):773-780.
    [29] Zivkovic Z. Improved adaptive Gaussian mixture model for background subtraction. In: Proc. of the 17th Int’l Conf. on Pattern Recognition. Washington: IEEE Computer Society, 2004. 28-31. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1333992
    [30] Horprasert T, Harwood D, Davis LS. A statistical approach for real-time robust background subtraction and shadow detection. In: Proc. of the IEEE ICCV Frame-Rate Workshop. 1999. 1-19. http://www.citeulike.org/user/nob/article/1402206
Cite this article

Yang YD, Hao AM, Chu QJ, Zhao QP, Wang LL. View-Invariant action recognition based on action graphs. Journal of Software, 2009,20(10):2679-2691 (in Chinese with English abstract).

History
  • Received: 2008-07-31
  • Last revised: 2009-06-09