A Real-Time Lip Localization Method for Mouth-Shape Recognition
Authors: Yao Hongxun (姚鸿勋), Gao Wen (高文), Li Jingmei (李静梅), Lü Yajuan (吕雅娟), Wang Rui (王瑞)
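Funding:

This research was supported by the National Natural Science Foundation of China (No. 69789301) and the National 863 High-Tech Program of China (No. 863-306-ZT03-01-2).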

Abstract:

In speech recognition systems that operate in noisy environments, lip-reading can effectively reduce the influence of noise: the visual channel supplements the information otherwise carried by the acoustic channel alone, which raises the recognition rate. This paper presents an effective and robust method for locating and tracking the lips, aimed at applications that must extract this information without special markers or controlled lighting. The method first finds the face with a skin-color model, then searches the face region for the eyes with an iterative algorithm, and uses the eye positions to refine the size and position of the face. It then applies a color coordinate transformation to the lower half of the face so that the lips stand out clearly from the skin. Finally, it describes the inner and outer contours of the upper and lower lips with a deformable template.
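The color-separation step can be illustrated with a minimal sketch. The abstract does not state which color coordinate transformation the authors use, so the code below substitutes a common stand-in, the pseudo-hue R / (R + G), and a normalized r-g chromaticity test for skin. The function names (`is_skin`, `pseudo_hue`, `lip_bounding_box`) and all thresholds are hypothetical and chosen only for this synthetic example. They are not taken from the paper.

```python
# Illustrative sketch only: pseudo-hue is a stand-in for the paper's
# unspecified color coordinate transformation; thresholds are assumptions.

def is_skin(pixel, r_range=(0.36, 0.47), g_range=(0.28, 0.36)):
    """Return True if an (R, G, B) pixel falls in an assumed skin chromaticity box."""
    total = sum(pixel)
    if total == 0:
        return False
    r, g = pixel[0] / total, pixel[1] / total
    return r_range[0] <= r <= r_range[1] and g_range[0] <= g <= g_range[1]

def pseudo_hue(pixel):
    """R / (R + G): higher for reddish lip pixels than for typical skin pixels."""
    r, g, _ = pixel
    return r / (r + g) if (r + g) else 0.0

def lip_bounding_box(face, hue_threshold=0.62):
    """Scan the lower half of a face crop (a list of rows of RGB tuples).

    Returns (top, bottom, left, right) of the pixels whose pseudo-hue
    exceeds the threshold, or None if no pixel qualifies.
    """
    start = len(face) // 2  # only the lower half of the face is searched
    hits = [(y, x)
            for y in range(start, len(face))
            for x, p in enumerate(face[y])
            if pseudo_hue(p) > hue_threshold]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return (min(ys), max(ys), min(xs), max(xs))

# Synthetic 40x40 "face": uniform skin tone with a reddish lip patch.
SKIN, LIP = (200, 150, 120), (180, 70, 80)
face = [[SKIN] * 40 for _ in range(40)]
for y in range(28, 32):
    for x in range(12, 28):
        face[y][x] = LIP

print(lip_bounding_box(face))  # (28, 31, 12, 27)
```

In a full pipeline of the kind the abstract describes, a bounding box like this would only seed the deformable template, which then fits the inner and outer lip contours. That fitting step is not sketched here.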

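Cite this article

Yao Hongxun, Gao Wen, Li Jingmei, Lü Yajuan, Wang Rui. A real-time lip localization method for mouth-shape recognition. Journal of Software (软件学报), 2000, 11(8): 1126-1132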

History
  • Received: 1999-05-17
  • Last revised: 1999-09-09