    Abstract:

    In this paper, the authors present a novel method for incorporating temporal correlation into a speech recognition system based on the conventional hidden Markov model (HMM). Temporal correlation is useful for recognition because the speech features of the current frame are highly informative about the features of neighboring frames. An obvious way to incorporate it is to condition the probability of the current observation on the current state as well as on the previous observation and the previous state. Applied directly, however, this leads to unreliable parameter estimation, because the number of parameters to be estimated grows too large for the limited training data. The authors therefore approximate the joint conditional probability density (PD) with a non-linear estimation method. As a result, they can still represent the joint conditional PD with a Gaussian mixture density, on the principle that any PD can be approximated by a Gaussian mixture. The resulting model, which they call the FC (frame correlation) HMM, needs no additional parameters and adds only a small amount of computation. The experimental results show that the FC HMM raises the top-1 recognition rate by 6 percent compared with the conventional HMM.
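    The parameter growth that the abstract warns about can be illustrated with a rough count. The sketch below is not the authors' model: it assumes, purely for illustration, that the conventional HMM uses diagonal-covariance Gaussian mixtures per state, and that direct conditioning is implemented as a full-covariance mixture over the stacked pair of current and previous frames for every pair of current and previous states. The function names and the example sizes are hypothetical.

```python
def conventional_param_count(n_states, n_mix, dim):
    """Emission parameters for one diagonal-covariance mixture per state (assumed setup)."""
    # Per component: mean (dim) + diagonal variances (dim) + one mixture weight.
    # The sum-to-one constraint on the weights is ignored for simplicity.
    per_component = dim + dim + 1
    return n_states * n_mix * per_component


def direct_conditioning_param_count(n_states, n_mix, dim):
    """Emission parameters when each (previous state, current state) pair gets a
    full-covariance mixture over the stacked vector [o_t, o_{t-1}] (assumed setup)."""
    joint_dim = 2 * dim
    # A symmetric full covariance has joint_dim * (joint_dim + 1) / 2 free entries.
    full_cov = joint_dim * (joint_dim + 1) // 2
    per_component = joint_dim + full_cov + 1
    return n_states * n_states * n_mix * per_component


if __name__ == "__main__":
    # Hypothetical sizes: 5 states, 4 mixture components, 12-dimensional features.
    base = conventional_param_count(5, 4, 12)
    direct = direct_conditioning_param_count(5, 4, 12)
    print(base, direct, direct / base)
```

    Under these assumptions the count grows with the square of the number of states and the square of the feature dimension, which is the kind of growth that makes estimation from limited training data unreliable and motivates an approximation that adds no parameters.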

Get Citation

Guo Qing, Wu Wenhu, Fang Ditang. A new frame-correlation modeling method in hidden Markov models. Journal of Software (软件学报), 1999, 10(6): 631-635

History
  • Received: April 27, 1998
  • Revised: June 23, 1998