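Survey of Dynamic Hand Gesture Understanding and Interaction
Authors: Zhang Wei, Lin Zeyi, Cheng Jian, Ke Mingyu, Deng Xiaoming, Wang Hongan
Author biographies:

Zhang Wei (1982-), male, master's degree; main research areas: human-computer interaction, gesture recognition.
Lin Zeyi (1995-), male, master's degree; main research areas: human-computer interaction, dynamic gesture recognition.
Cheng Jian (1996-), male, master's degree; main research area: computer vision.
Ke Mingyu (1997-), male, master's degree; main research areas: gesture interaction, classification, reconstruction.
Deng Xiaoming (1980-), male, Ph.D., research professor, CCF senior member; main research areas: computer vision, human-computer interaction.
Wang Hongan (1963-), male, Ph.D., research professor, doctoral supervisor; main research areas: natural human-computer interaction, real-time intelligent computing.

Corresponding authors:

Deng Xiaoming, E-mail: xiaoming@iscas.ac.cn; Wang Hongan, E-mail: hongan@iscas.ac.cn

Fund project:

National Key Research and Development Project of China (2018YFC0809300)
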
Abstract:

In recent years, hand gestures have been widely used as an input channel in human-computer interaction, virtual reality, and other fields, and have attracted growing attention from researchers. In particular, with the emergence of advanced human-computer interaction technologies and the rapid development of computing technology (especially deep learning and GPU parallel computing), gesture understanding and interaction methods have achieved breakthroughs and triggered a surge of research. This paper reviews the research progress and typical applications of dynamic gesture understanding and interaction. First, the core concepts of gesture interaction are introduced and progress in dynamic gesture recognition and detection is analyzed. Then, representative applications of dynamic gesture interaction in human-computer interaction are presented. Finally, the current state of gesture interaction is summarized and future development trends are discussed.

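Cite this article

Zhang Wei, Lin Zeyi, Cheng Jian, Ke Mingyu, Deng Xiaoming, Wang Hongan. Survey of dynamic hand gesture understanding and interaction. Journal of Software, 2021,32(10):3051-3067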

History
  • Received: 2020-02-21
  • Last revised: 2020-05-10
  • Published online: 2021-01-15
  • Publication date: 2021-10-06