Deep Learning Based Binaural Speech Separation in Reverberant Environments.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IF 4.1) Pub Date: 2017-10-24, DOI: 10.1109/taslp.2017.2687104
Xueliang Zhang, DeLiang Wang

Speech signals are usually degraded by room reverberation and additive noise in real environments. This paper focuses on separating the target speech signal from binaural inputs under reverberant conditions. Binaural separation is formulated as a supervised learning problem, and we employ deep learning to map both spatial and spectral features to a training target. With binaural inputs, we first apply a fixed beamformer and then extract several spectral features. A new spatial feature is proposed and extracted to complement the spectral features. The training target is the recently suggested ideal ratio mask. Systematic evaluations and comparisons show that the proposed system achieves very good separation performance and substantially outperforms related algorithms in challenging multi-source and reverberant environments.
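The abstract does not spell out how the ideal ratio mask (IRM) training target is computed. Below is a minimal sketch assuming the commonly used definition, in which each time-frequency unit of the mask is the ratio of target speech energy to the total energy of speech plus interference, raised to a compression exponent of 0.5; the function name, the NumPy-based layout, the exponent, and the small constant added for numerical stability are illustrative assumptions rather than details taken from the paper.

    import numpy as np

    def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5):
        # speech_mag and noise_mag: magnitude spectrograms (frequency x time)
        # of the clean target speech and of the interference, respectively.
        # beta = 0.5 is the compression exponent commonly used with the IRM;
        # the exponent and the epsilon below are assumptions for illustration.
        speech_energy = speech_mag ** 2
        noise_energy = noise_mag ** 2
        return (speech_energy / (speech_energy + noise_energy + 1e-10)) ** beta

    # Toy usage with random magnitudes standing in for real STFTs:
    rng = np.random.default_rng(0)
    speech = rng.random((257, 100))
    noise = rng.random((257, 100))
    irm = ideal_ratio_mask(speech, noise)
    print(irm.shape)             # (257, 100)
    print(irm.min(), irm.max())  # values lie in [0, 1]

In a supervised setup like the one described, a deep neural network would be trained to predict this mask from the extracted spatial and spectral features, and the estimated mask would then be applied to the mixture spectrogram to recover the target speech.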

Updated: 2019-11-01