Speech Mask Estimation Using the Time-Frequency Correlation of Speech Presence

Volume 27, Issue S2, 2016, pp. 64-68

Author: ZHAN Ge, HUANG Zhao-Qiong, YING Dong-Wen, PAN Jie-Lin, YAN Yong-Hong
Affiliation: Institute of Acoustics, The Chinese Academy of Sciences, Beijing 100190, China
Fund Project: National Natural Science Foundation of China (11461141004, 91120001, 61271426); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); National High-Tech R&D Program of China (863) (2012AA012503); CAS Priority Deployment Project (KGZD-EW-103-2)

Abstract:

This paper proposes a method for estimating the spectrographic speech mask based on a two-dimensional (2-D) correlation model. The method is motivated by the fact that the time and frequency correlations of speech presence are interwoven in the time-frequency (TF) domain. A conventional Markov chain cannot model the time and frequency correlations simultaneously and adaptively. The 2-D correlation model describes the correlation of speech presence in the TF domain, with speech presence and speech absence as its two states. The time correlation is modeled by the time state-transition probability and the forward factor, while the frequency correlation is described by the frequency state-transition probability and the corresponding neighbor factor. Both correlations are incorporated into the model by maximizing the Q-function, and a sequential scheme estimates the parameter set online. Given the observed spectrum and the parameter set, the state matrix that maximizes the posterior probability is taken as the optimal estimate of the speech mask. The proposed method was compared with several well-established methods, and the experimental results confirmed its superiority.
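
The idea of letting both the previous frame and the neighboring frequency bin inform the speech-presence decision can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the model equations, so everything below is an assumption made for illustration. The transition probabilities are fixed rather than estimated online, the normalized-product rule in `fuse` is a hypothetical stand-in for the forward and neighbor factors, the likelihood ratio uses the common complex-Gaussian model with a fixed a priori SNR `xi`, and the mask is obtained by a greedy causal pass instead of a joint maximum a posteriori search over the state matrix.

```python
import math


def likelihood_ratio(gamma, xi=4.0):
    """Speech/noise likelihood ratio under a complex-Gaussian model.

    gamma: a posteriori SNR (noisy power / noise power).
    xi: assumed fixed a priori SNR (illustrative value, not from the paper).
    """
    # The exponent is capped to avoid overflow for very large gamma.
    return math.exp(min(gamma * xi / (1.0 + xi), 50.0)) / (1.0 + xi)


def predict(prev_post, stay, enter):
    """One-step two-state Markov prediction of speech presence."""
    return prev_post * stay + (1.0 - prev_post) * enter


def fuse(p_a, p_b):
    """Normalized product of two presence probabilities.

    Hypothetical fusion rule standing in for the paper's factors.
    """
    num = p_a * p_b
    return num / (num + (1.0 - p_a) * (1.0 - p_b))


def estimate_mask(gamma, stay_t=0.9, enter_t=0.1, stay_f=0.8, enter_f=0.2):
    """Greedy causal speech-mask estimate on a frames-by-bins SNR matrix.

    stay_* / enter_* are assumed transition probabilities along time (t)
    and frequency (f); they are fixed here, not adapted.
    """
    n_frames, n_bins = len(gamma), len(gamma[0])
    post = [[0.5] * n_bins for _ in range(n_frames)]
    for t in range(n_frames):
        for k in range(n_bins):
            # Prior from the same bin in the previous frame (time correlation).
            p_time = predict(post[t - 1][k], stay_t, enter_t) if t > 0 else 0.5
            # Prior from the lower neighboring bin (frequency correlation).
            p_freq = predict(post[t][k - 1], stay_f, enter_f) if k > 0 else 0.5
            prior = fuse(p_time, p_freq)
            lr = likelihood_ratio(gamma[t][k])
            # Bayes update of the speech-presence probability.
            post[t][k] = prior * lr / (prior * lr + 1.0 - prior)
    mask = [[1 if p > 0.5 else 0 for p in row] for row in post]
    return mask, post


if __name__ == "__main__":
    snr = [[1.0, 1.0, 1.0], [1.0, 10.0, 1.0]]
    mask, post = estimate_mask(snr)
    for row in mask:
        print(row)
```

In this toy input only the bin with a high a posteriori SNR is marked as speech, while its neighbor's prior is raised by the frequency prediction without crossing the decision threshold.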

Key words: speech mask; time-frequency correlation; speech presence probability; neighbor factor; online estimation

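Citation:

ZHAN Ge, HUANG Zhao-Qiong, YING Dong-Wen, PAN Jie-Lin, YAN Yong-Hong. Speech Mask Estimation Using the Time-Frequency Correlation of Speech Presence. Journal of Software, 2016, 27(S2): 64-68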


History

Received: June 01, 2015
Revised: January 05, 2016
Online: January 10, 2017

