View-Invariant Action Recognition Based on Action Graphs
Authors: Yang YD, Hao AM, Chu QJ, Zhao QP, Wang LL

Funding: Supported by the National High-Tech Research and Development Plan of China (863 Program) under Grant Nos. 2006AA01Z333 and 2007AA01Z337, and the China High-Tech Olympics Project under Grant No. Z0005191041211

    Abstract (translated from the Chinese):

    For view-invariant action recognition, this paper proposes a weighted codebook vector representation and an action graph recognition model. The local interest-point features and the global shape description of a video are combined into the weighted codebook vector, which keeps the strong noise robustness of interest points while overcoming their inability to recognize static actions. Energy curves are built from 3D motion data such as motion capture and point clouds, key poses are extracted from them, and primitive motion segments are generated; the segments are connected by three link types (self-links, forward-links and back-links) into a directed graph called the essential graph. The essential graph is projected in all directions, and the directed graph built over the projected nodes by a nearest-neighbor rule is called the action graph. The action graph model is trained with Naïve Bayes, the Viterbi algorithm computes the match score between a video and the action graph, and the video sequence is labeled by the maximum match score. Because the action graph contains projections from many viewpoints with smooth transitions between adjacent projections, it can recognize video sequences shot from any viewpoint and with any motion direction. Experimental results show that the algorithm achieves good recognition performance on monocular video, multi-view video and multi-action video.
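    As a rough illustration of the weighted codebook vector, the following is a minimal sketch assuming the representation is a weighted concatenation of an interest-point word histogram and a normalized global shape descriptor; the function name, the weight alpha, and the nearest-neighbor quantization are our assumptions, not the paper's implementation:

        import numpy as np

        def weighted_codebook_vector(points, shape, codebook, alpha=0.5):
            """points   : (n, d) interest-point descriptors from one video
               shape    : (m,) global shape descriptor of the actor
               codebook : (k, d) visual words, e.g. from k-means
               alpha    : assumed weight balancing dynamic vs. static evidence"""
            # Quantize each interest point to its nearest visual word.
            dists = np.linalg.norm(points[:, None, :] - codebook[None, :, :], axis=2)
            hist = np.bincount(dists.argmin(axis=1), minlength=len(codebook)).astype(float)
            hist /= max(hist.sum(), 1.0)                 # normalized word histogram
            shape = shape / (np.linalg.norm(shape) + 1e-8)
            # Interest points capture dynamics; the shape term lets purely
            # static actions still be separated.
            return np.concatenate([alpha * hist, (1.0 - alpha) * shape])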

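    The essential graph construction can be sketched in the same spirit; this is a toy version, and encoding key poses as integer ids and reading the back-link as a reversed forward-link are our assumptions:

        def build_essential_graph(segments):
            """segments: list of primitive motion segments, each a list of
            key-pose ids in temporal order."""
            edges = set()
            for seg in segments:
                for i, pose in enumerate(seg):
                    edges.add((pose, pose))            # self-link: a pose may persist
                    if i + 1 < len(seg):
                        edges.add((pose, seg[i + 1]))  # forward-link: normal progression
                        edges.add((seg[i + 1], pose))  # back-link: reversed traversal
            return edges

    Projecting the 3D pose at each node onto many view directions and re-linking nearby projected nodes would then yield the action graph described above.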
    Abstract:

    This paper proposes a weighted codebook vector representation and an action graph model for view-invariant human action recognition. A video is represented as a weighted codebook vector combining dynamic interest points and static shapes; this combined representation is robust to noise and classifies static actions well. Several 3D key poses are extracted from motion capture data or point cloud data, and a set of primitive motion segments is generated. A directed graph called the essential graph is built from these segments using self-links, forward-links and back-links. The action graph is generated by projecting the essential graph from a wide range of viewpoints. Naïve Bayes is used to train a statistical model for each node. Given an unlabeled video, the Viterbi algorithm computes the match score between the video and the action graph, and the video is then labeled by the maximum score. Finally, the algorithm is tested on the IXMAS dataset and the CMU motion capture library. The experimental results demonstrate that the algorithm recognizes actions independently of viewpoint and achieves high recognition rates.
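    The matching step can be sketched as follows; this is a minimal illustration, assuming the trained per-node Naïve Bayes models are reduced to per-frame log-likelihoods and the graph's links to a log-transition matrix, and none of these names come from the paper:

        import numpy as np

        def viterbi_match(log_lik, log_trans, log_prior):
            """log_lik   : (T, S) log P(frame_t | node_s) from the node models
               log_trans : (S, S) log transition weights along the graph's
                           self-, forward- and back-links (-inf where no edge)
               log_prior : (S,) log prior over starting nodes"""
            delta = log_prior + log_lik[0]     # best path score ending at each node
            for t in range(1, len(log_lik)):
                # Extend the best-scoring predecessor path into each node.
                delta = (delta[:, None] + log_trans).max(axis=0) + log_lik[t]
            return delta.max()                 # match score used to label the video

    An unlabeled video would then receive the label of the action graph with the largest returned score.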

    References
    [1] Gavrila DM. The visual analysis of human movement: A survey. Computer Vision and Image Understanding, 1999,73(1):82-98.
    [2] Wang L, Hu WM, Tan TN. Recent developments in human motion analysis. Pattern Recognition, 2003,36(3):585-601.
    [3] Aggarwal JK, Park S. Human motion: Modeling and recognition of actions and interactions. In: Proc. of the 2nd Int’l Symp. on 3D Data Processing, Visualization, and Transmission. Washington: IEEE Computer Society, 2004. 640-647. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1335299
    [4] Moeslund TB, Hilton A, Kruger V. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 2006,104(2):90-126.
    [5] Ahmad M, Lee S. Human action recognition using shape and CLG-motion flow from multi-view image sequences. Pattern Recognition, 2008,41(7):2237-2252.
    [6] Ahmad M, Lee S. HMM-Based human action recognition using multiview image sequences. In: Proc. of the 18th Int’l Conf. on Pattern Recognition, Vol.01. Washington: IEEE Computer Society, 2006. 263-266. http://dx.doi.org/10.1109/ICPR.2006.630
    [7] Efros AA, Berg AC, Mori G, Malik J. Recognizing action at a distance. In: Proc. of the 9th IEEE Int’l Conf. on Computer Vision, Vol.2. Washington: IEEE Computer Society, 2003. 726. http://portal.acm.org/citation.cfm?id=946720
    [8] Bobick AF, Davis JW. The recognition of human movement using temporal templates. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2001,23(3):257-267.
    [9] Yilmaz A, Shah M. Actions sketch: A novel action representation. In: Proc. of the 2005 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR 2005), Vol.1. Washington: IEEE Computer Society, 2005. 984-989. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=1467373&isnumber=31472
    [10] Gorelick L, Blank M, Shechtman E, Irani M, Basri R. Actions as space-time shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2007,29(12):2247-2253.
    [11] Laptev I. On space-time interest points. Int’l Journal of Computer Vision, 2005,64(2-3):107-123.
    [12] Oikonomopoulos A, Patras I, Pantic M. Spatiotemporal salient points for visual recognition of human actions. IEEE Trans. on Systems, Man, and Cybernetics, 2006,36(3):710-719.
    [13] Dollár P, Rabaud V, Cottrell G, Belongie S. Behavior recognition via sparse spatio-temporal features. In: Proc. of the 14th Int’l Conf. on Computer Communications and Networks. Washington: IEEE Computer Society, 2005. 65-72. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=1570899&isnumber=33252
    [14] Schuldt C, Laptev I, Caputo B. Recognizing human actions: A local SVM approach. In: Proc. of the 17th Int’l Conf. on Pattern Recognition. Washington: IEEE Computer Society, 2004. 32-36. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1334462
    [15] Wong S, Cipolla R. Extracting spatiotemporal interest points using global information. In: Proc. of the 11th IEEE Int’l Conf. on Computer Vision. Los Alamitos: IEEE Computer Society, 2007. 1-8. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4408923
    [16] Niebles JC, Wang H, Li FF. Unsupervised learning of human action categories using spatial-temporal words. In: Proc. of the British Machine Vision Conf. (BMVC). The British Machine Vision Association, 2006. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.83.8353
    [17] Wong S, Kim T, Cipolla R. Learning motion categories using both semantic and structural information. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2007. 1-6. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4270330
    [18] Niebles JC, Wang H, Li FF. Unsupervised learning of human action categories using spatial-temporal words. Int’l Journal of Computer Vision, 2008,79(3):299-318.
    [19] Ramanan D, Forsyth DA. Automatic annotation of everyday movements. Technical Report, CSD-03-1262, UC Berkeley, 2003.
    [20] Ikizler N, Forsyth D. Searching video for complex activities with finite state models. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2007. 1-8. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4270193
    [21] Weinland D, Ronfard R, Boyer E. Automatic discovery of action taxonomies from multiple views. In: Proc. of the 2006 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2006. 1639-1645. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1640952
    [22] Weinland D, Ronfard R, Boyer E. Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding, 2006,104(2):249-257.
    [23] Parameswaran V, Chellappa R. View invariance for human action recognition. Int’l Journal of Computer Vision, 2006,66(1):83-101.
    [24] Huang FY, Xu GY. Viewpoint independent action recognition. Journal of Software, 2008,19(7):1623-1634 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/19/1623.htm
    [25] Ogale AS, Karapurkar A, Aloimonos Y. View invariant modeling and recognition of human actions using grammars. In: Proc. of the Int’l Conf. on Computer Vision, Workshop on Dynamical Vision (ICCV-WDM). Berlin, Heidelberg: Springer-Verlag, 2005. http://dx.doi.org/10.1007/978-3-540-70932-9_9
    [26] Weinland D, Boyer E, Ronfard R. Action recognition from arbitrary views using 3D exemplars. In: Proc. of the 11th IEEE Int’l Conf. on Computer Vision. Los Alamitos: IEEE Computer Society, 2007. 1-7. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4408849
    [27] Lv F, Nevatia R. Single view human action recognition using key pose matching and Viterbi path searching. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2007. 1-8. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=4270156&isnumber=4269956
    [28] Zivkovic Z, Heijden F. Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern Recognition Letters, 2006,27(7):773-780.
    [29] Zivkovic Z. Improved adaptive Gaussian mixture model for background subtraction. In: Proc. of the 17th Int’l Conf. on Pattern Recognition. Washington: IEEE Computer Society, 2004. 28-31. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1333992
    [30] Horprasert T, Harwood D, Davis LS. A statistical approach for real-time robust background subtraction and shadow detection. In: Proc. of the IEEE ICCV Frame-Rate Workshop. 1999. 1-19. http://www.citeulike.org/user/nob/article/1402206
Cite this article

Yang YD, Hao AM, Chu QJ, Zhao QP, Wang LL. View-Invariant action recognition based on action graphs. Journal of Software, 2009,20(10):2679-2691 (in Chinese with English abstract).

History
  • Received: 2008-07-31
  • Last revised: 2009-06-09