Markerless Motion Capture Method Combining Body Capture and Face Capture
Authors:
Author biographies:

Wang Zhiyong (1989-), male, born in Tianjin, bachelor's degree; research interests: computer graphics and virtual human animation. Yuan Mingze (1988-), male, master's degree; research interests: computer graphics, virtual human animation, and artificial intelligence. Wang Congyi (1990-), male, Ph.D.; research interests: computer graphics, virtual human animation, and artificial intelligence. Xia Shihong (1974-), male, Ph.D., doctoral supervisor, CCF senior member; research interests: computer graphics, virtual reality, and artificial intelligence. Zhang Zihao (1993-), male, bachelor's degree; research interests: computer graphics, virtual human animation, and artificial intelligence.

Corresponding author:

Xia Shihong, E-mail: xsh@ict.ac.cn

Fund project:

National Natural Science Foundation of China (61772499)



Abstract:

This paper presents a markerless method for synchronized capture of body and facial motion. Capture is performed with a long-focus color camera and a Kinect camera that are synchronized in time and calibrated in space: a flash introduced into the environment provides the temporal synchronization, and Zhang's method provides the spatial calibration, so that the two devices form a time-synchronized, spatially aligned hybrid camera. KinectFusion is then used to scan the user's body model, into which a skeleton is embedded. Finally, the two aligned cameras capture simultaneously. A translation reference for the face is first obtained from the depth image; with its help, the face is reconstructed from the 2D feature points of the color image. The head pose obtained from the color image is then transferred to the body capture result. Both the comparison experiments and the user study show that the proposed combined capture outperforms either capture result alone.
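The flash-based temporal synchronization described above reduces to locating the same brightness spike in each camera's frame stream and differencing the frame indices. The following is a minimal sketch of that idea; the function names and the synthetic brightness traces are illustrative, not taken from the paper:

```python
import numpy as np

def find_flash_frame(brightness):
    """Index of the frame where the flash appears, taken as the
    largest frame-to-frame increase in mean brightness."""
    diffs = np.diff(brightness)
    return int(np.argmax(diffs)) + 1

def sync_offset(brightness_a, brightness_b):
    """Frame offset that aligns stream B to stream A, assuming
    both streams observed the same flash event."""
    return find_flash_frame(brightness_a) - find_flash_frame(brightness_b)

# Synthetic example: two streams of mean frame brightness with one flash.
rng = np.random.default_rng(0)
a = 0.3 + 0.01 * rng.standard_normal(100)
b = 0.3 + 0.01 * rng.standard_normal(100)
a[40:43] += 0.5   # flash visible in stream A starting at frame 40
b[46:49] += 0.5   # same flash visible in stream B starting at frame 46
print(sync_offset(a, b))   # -6: stream B lags stream A by 6 frames
```

With the offset known, frames from the two devices can be paired before the spatial calibration is applied.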

    Abstract:

This paper presents a markerless synchronized motion capture method for body and face. A long-focus color camera and a standard Kinect camera, which are time-synchronized and spatially calibrated, are used. A flash is added to the capture space to synchronize the cameras. The cameras are then calibrated with Zhang's method to obtain their relative transformation; in this way, a hybrid camera is configured. The user's body is scanned by KinectFusion, and a skeleton is embedded into the body model. During capture, a translation reference is first obtained from the depth camera. After that, the facial pose, expression, and identity are reconstructed from 2D feature points. Non-rigid ICP is applied to reconstruct the body pose. Finally, the head pose from the face capture camera is transferred to the body capture camera to obtain the merged result. The comparison and user study show that the proposed synchronized capture with two cameras outperforms capture with a single camera.
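Transferring the head pose from the face-capture camera to the body-capture camera amounts to composing the estimated head pose with the calibrated extrinsics between the two cameras. Below is a minimal NumPy sketch of that composition; the extrinsic rotation and all numeric values are illustrative assumptions (in the paper, the extrinsics come from Zhang's calibration), not the authors' implementation:

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a 3x3 rotation R and a translation t into a 4x4 rigid transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical extrinsics: map points from the face (color) camera frame
# into the body (Kinect) camera frame. Here: 90-degree yaw plus a 10 cm shift.
theta = np.pi / 2
R_ext = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
T_ext = to_homogeneous(R_ext, np.array([0.1, 0.0, 0.0]))

# Head pose estimated in the face-camera frame (identity rotation,
# head 1 m in front of the color camera).
T_head_face = to_homogeneous(np.eye(3), np.array([0.0, 0.0, 1.0]))

# Transfer: express the head pose in the body-camera frame.
T_head_body = T_ext @ T_head_face
print(np.round(T_head_body[:3, 3], 3))   # translation [0.1, 0.0, 1.0]
```

The merged result then drives the head of the body-capture skeleton with `T_head_body` while the limbs come from the depth-based body tracker.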

Cite this article:

Wang ZY, Wang CY, Zhang ZH, Yuan MZ, Xia SH. Markerless motion capture method combining body capture and face capture. Ruan Jian Xue Bao/Journal of Software, 2019,30(10):3026-3036 (in Chinese with English abstract).
History
  • Received: 2018-08-18
  • Revised: 2018-11-01
  • Published online: 2019-05-16