Generalized RNN beamformer for target speech separation
arXiv - CS - Sound, Pub Date: 2021-01-04, DOI: arxiv-2101.01280
Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu
Recently we proposed an all-deep-learning minimum variance distortionless
response (ADL-MVDR) method where the unstable matrix inverse and principal
component analysis (PCA) operations in the MVDR were replaced by recurrent
neural networks (RNNs). However, it is not clear whether the success of the
ADL-MVDR is owed to the calculated covariance matrices or to following the MVDR
formula. In this work, we demonstrate the importance of the calculated
covariance matrices and propose three types of generalized RNN beamformers
(GRNN-BFs) where the beamforming solution goes beyond the MVDR and is optimal. The
GRNN-BFs can predict the frame-wise beamforming weights by leveraging the
temporal modeling capability of RNNs. The proposed GRNN-BF method obtains
better performance than the state-of-the-art ADL-MVDR and the traditional
mask-based MVDR methods in terms of speech quality (PESQ), speech-to-noise
ratio (SNR), and word error rate (WER).
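
For readers unfamiliar with the terms above, the following is a minimal NumPy sketch of the conventional baseline the abstract refers to: closed-form MVDR weights computed from speech and noise covariance matrices, then applied frame by frame to a multi-channel spectrogram. It is not the authors' GRNN-BF, where an RNN predicts the weights instead of the closed-form solution. The function names `mvdr_weights` and `apply_beamformer`, the array shapes, and the reference-channel convention are assumptions made for illustration.

```python
import numpy as np


def mvdr_weights(phi_ss, phi_nn, ref_channel=0):
    """Closed-form MVDR weights (reference-channel formulation).

    phi_ss, phi_nn: speech / noise covariance matrices, shape (..., M, M).
    Returns weights of shape (..., M).
    """
    # numerator = phi_nn^{-1} @ phi_ss, computed with a linear solve
    # rather than an explicit matrix inverse.
    numerator = np.linalg.solve(phi_nn, phi_ss)
    trace = np.trace(numerator, axis1=-2, axis2=-1)[..., None]
    # Multiplying by a one-hot reference vector selects one column.
    return numerator[..., ref_channel] / trace


def apply_beamformer(weights, mixture):
    """Apply frame-wise weights: output(t, f) = w(t, f)^H y(t, f).

    weights, mixture: complex arrays of shape (T, F, M).
    Returns the single-channel estimate of shape (T, F).
    """
    return np.einsum("tfm,tfm->tf", weights.conj(), mixture)
```

With spatially white noise and a rank-1 speech covariance built from a steering vector, this beamformer returns the speech component at the reference microphone unchanged, which is the "distortionless" property in the MVDR name. The explicit inverse hidden in this formula is the step the abstract describes as unstable.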
Updated: 2021-01-06