Domain Dependent Language Model Based on Fuzzy Training Subset

Journal of Software, Volume 11, Issue 7, 2000, pp. 971-978

Author:
CHEN Lang-zhou, HUANG Tai-yi

Abstract:

Statistical language models are very important for speech recognition. For a system devoted to a specific topic, a domain-dependent language model performs much better than a general model. Traditional methods for training such models have two problems: (1) the corpus available for a specific topic is not as large as a general corpus; (2) an article is usually related to more than one topic, but this has not been taken into account during model training. In this paper, the authors try to solve these two problems. They present a new method of organizing the corpus, based on fuzzy training subsets, and train the domain-dependent models on these fuzzy subsets. In addition, self-organized learning is introduced into the training process to improve the models' predictive ability. According to the authors, this markedly improves the performance of the models.
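The abstract does not give the training procedure, so the following is only a minimal sketch of one way the fuzzy-subset idea could be realized, not the authors' method. It assumes that each article already carries a degree of membership in each topic (how those degrees are obtained is not covered here), and that a topic's bigram counts are accumulated with each article weighted by its membership. The function names, the toy corpus, and the unsmoothed maximum-likelihood estimate are all illustrative assumptions.

```python
from collections import defaultdict


def train_fuzzy_bigram(docs, memberships, topic):
    """Accumulate bigram counts for one topic.

    Each document contributes with a weight equal to its (assumed)
    degree of membership in `topic`, so an article related to several
    topics adds fractional counts to each of them.
    """
    bigram = defaultdict(float)
    unigram = defaultdict(float)
    for tokens, degrees in zip(docs, memberships):
        weight = degrees.get(topic, 0.0)
        if weight == 0.0:
            continue  # document is outside this topic's fuzzy subset
        for prev, cur in zip(tokens, tokens[1:]):
            bigram[(prev, cur)] += weight
            unigram[prev] += weight
    return bigram, unigram


def bigram_prob(bigram, unigram, prev, cur):
    """Maximum-likelihood estimate from the fuzzy counts (no smoothing)."""
    total = unigram.get(prev, 0.0)
    return bigram.get((prev, cur), 0.0) / total if total > 0 else 0.0


# Hypothetical toy corpus: the second article belongs half to each topic.
docs = [
    ["stock", "prices", "rise"],
    ["stock", "prices", "fall"],
    ["team", "wins", "match"],
]
memberships = [
    {"finance": 1.0},
    {"finance": 0.5, "sports": 0.5},
    {"sports": 1.0},
]

bigram, unigram = train_fuzzy_bigram(docs, memberships, "finance")
p_rise = bigram_prob(bigram, unigram, "prices", "rise")
```

With hard (non-fuzzy) subsets, the second article would have to be assigned wholly to one topic or dropped; with fractional weights it enlarges the effective training data for both topics, which is the motivation the abstract gives for the approach.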

Key words: speech recognition, statistical language model, fuzzy, self-organized learning


Citation:

CHEN Lang-zhou, HUANG Tai-yi. Domain Dependent Language Model Based on Fuzzy Training Subset. Journal of Software, 2000, 11(7): 971-978


History

Received: February 8, 1999
Revised: June 17, 1999
