Speech Mask Estimation Using the Time-Frequency Correlation of Speech Presence (以语音出现时频相关性为基础的语音掩模估计)

Authors: 战鸽, 黄兆琼, 应冬文, 潘接林, 颜永红
Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China

                    
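Funding: National Natural Science Foundation of China (11461141004, 91120001, 61271426); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); National High-Tech R&D Program of China (863) (2012AA012503); CAS Priority Deployment Project (KGZD-EW-103-2)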

Speech Mask Estimation Using the Time-Frequency Correlation of Speech Presence

Author:

ZHAN Ge, HUANG Zhao-Qiong, YING Dong-Wen, PAN Jie-Lin, YAN Yong-Hong
Institute of Acoustics, The Chinese Academy of Sciences, Beijing 100190, China

Fund Project:

National Natural Science Foundation of China (11461141004, 91120001, 61271426); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); National High-Tech R&D Program of China (863) (2012AA012503); CAS Priority Deployment Project (KGZD-EW-103-2)

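Abstract (translated from the Chinese):

In the two-dimensional time-frequency grid, the presence or absence of speech at neighboring points is correlated, and a conventional Markov chain cannot adaptively model this two-dimensional time-frequency correlation. Based on the correlation of speech signals in the time-frequency domain, this paper proposes a method that estimates the speech mask with a two-dimensional correlation model. The method divides the log-power spectrum of the noisy speech signal in the time-frequency domain into speech and non-speech classes. It describes the time correlation of the speech signal with the time-domain state-transition probability and the forward factor, and the frequency correlation with the frequency-domain state-transition probability and the neighbor factor. Through global statistical optimization, the model combines the time correlation with the frequency correlation. A sequential update scheme is given that updates the model frame by frame and estimates the speech presence probability. Given the current log-power spectrum and the model parameters, the speech state matrix obtained by maximizing the posterior probability serves as the optimal estimate of the speech mask. The method is compared with several existing online speech mask estimation methods, and the experimental results show its superiority.

Keywords: speech mask; time-frequency correlation; speech presence probability; neighbor factor; online estimation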
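
The maximum a posteriori criterion stated in the abstract can be written compactly as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: X denotes the observed log-power spectrum over T frames and K frequency bins, S the binary state matrix (1 for speech presence, 0 for absence), and \lambda the model parameter set.

```latex
\hat{S} \;=\; \arg\max_{S \in \{0,1\}^{T \times K}} P\left(S \mid X, \lambda\right)
```

Under this reading, the estimated mask is the state matrix \hat{S} itself, and the speech presence probability of a single time-frequency point (t, k) would be the posterior P(S_{t,k} = 1 \mid X, \lambda).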

Abstract:

This paper proposes a method to estimate the spectrographic speech mask based on a two-dimensional (2-D) correlation model. The method is motivated by the fact that the time and frequency correlations of speech presence are interwoven in the time-frequency (TF) domain. A conventional Markov chain cannot model the time and frequency correlations simultaneously in an adaptive way. The 2-D correlation model describes the correlation of speech presence in the TF domain, with speech presence and speech absence taken as the two states of the model. The time correlation is modeled by the time state-transition probability and the forward factor, while the frequency state-transition probability and the corresponding neighbor factor describe the frequency correlation. The time and frequency correlations are incorporated into the model by maximizing the Q-function. A sequential scheme is presented to estimate the parameter set online. Given the observed spectrum and the parameter set, the state matrix that maximizes the posterior probability is regarded as the optimal estimate of the speech mask. The proposed method was compared with several well-established online mask estimation methods, and the experimental results confirmed its superiority.

Key words: speech mask; time-frequency correlation; speech presence probability; neighbor factor; online estimation
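
The abstract describes a two-state model whose speech presence decisions depend on both the previous frame and the neighboring frequency bin. The sketch below is a heavily simplified illustration of that idea, not the authors' algorithm: it assumes Gaussian emissions on the log-power spectrum with fixed, known means and variance, replaces the paper's forward and neighbor factors with plain transition-matrix predictions, sweeps the frequency axis upward only, and performs no sequential parameter update. The function names `gaussian_loglik` and `estimate_mask` and all parameter names and default values are hypothetical.

```python
import numpy as np


def gaussian_loglik(x, mean, var):
    """Log-density of a Gaussian, evaluated elementwise."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def estimate_mask(log_power, noise_mean, speech_mean, var=4.0,
                  p_stay_time=0.9, p_stay_freq=0.8):
    """Frame-by-frame speech presence estimate on a (frames, bins) array.

    Returns (mask, spp): a 0/1 mask and the per-bin speech presence
    probability. State index 0 is speech absence, 1 is speech presence.
    """
    n_frames, n_bins = log_power.shape
    # Symmetric two-state transition matrices (assumed values).
    a_time = np.array([[p_stay_time, 1.0 - p_stay_time],
                       [1.0 - p_stay_time, p_stay_time]])
    a_freq = np.array([[p_stay_freq, 1.0 - p_stay_freq],
                       [1.0 - p_stay_freq, p_stay_freq]])

    spp = np.zeros((n_frames, n_bins))
    prev = np.full((n_bins, 2), 0.5)  # uninformative posterior before frame 0

    for t in range(n_frames):
        x = log_power[t]
        # Emission likelihoods per bin, scaled so the larger one equals 1.
        ll = np.stack([gaussian_loglik(x, noise_mean, var),
                       gaussian_loglik(x, speech_mean, var)], axis=1)
        lik = np.exp(ll - ll.max(axis=1, keepdims=True))

        # Time correlation: predict from the previous frame, then correct.
        post = (prev @ a_time) * lik
        post /= post.sum(axis=1, keepdims=True)

        # Frequency correlation: weight each bin by a prediction from the
        # already-updated lower neighbor (bin 0 has no lower neighbor).
        for k in range(1, n_bins):
            post[k] *= post[k - 1] @ a_freq
            post[k] /= post[k].sum()

        spp[t] = post[:, 1]
        prev = post

    # Per-bin decision: pick the state with the larger posterior.
    mask = (spp > 0.5).astype(int)
    return mask, spp
```

Calling `estimate_mask(log_power, noise_mean=0.0, speech_mean=10.0)` on a `(frames, bins)` array returns the binary mask together with the per-bin probabilities. Because each frame uses only the current observation and the previous frame's posterior, the loop runs online, which matches the frame-by-frame setting the abstract describes; a faithful implementation would also re-estimate the model parameters at every frame.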

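Cite this article

ZHAN Ge, HUANG Zhao-Qiong, YING Dong-Wen, PAN Jie-Lin, YAN Yong-Hong. Speech mask estimation using the time-frequency correlation of speech presence. Journal of Software (软件学报), 2016, 27(S2): 64-68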

History

Received: 2015-06-01
Last revised: 2016-01-05
Published online: 2017-01-10
