Robust Multichannel Linear Prediction for Online Speech Dereverberation Using Weighted Householder Least Squares Lattice Adaptive Filter
IEEE Transactions on Signal Processing (IF 4.6), Pub Date: 2020-05-25, DOI: 10.1109/tsp.2020.2997201
Jason Wung, Ante Jukic, Sarmad Malik, Mehrez Souden, Ramin Pichevar, Joshua Atkins, Devang Naik, Alex Acero

Speech dereverberation has been an important component of effective far-field voice interfaces in many applications. Algorithms based on multichannel linear prediction (MCLP) have been shown to be especially effective for blind speech dereverberation and numerous variants have been introduced in the literature. Most of these approaches can be derived from a common framework, where the MCLP problem for speech dereverberation is formulated as a weighted least squares problem that can be solved analytically. Since conventional batch MCLP-based dereverberation algorithms are not suitable for low-latency applications, a number of online variants based on the recursive least squares (RLS) algorithm have been proposed. However, RLS-based approaches often suffer from numerical instability and their use in online systems can further be limited due to high computational complexity with a large number of channels or filter taps. In this paper, we aim to address the issues of numerical robustness and computational complexity. More specifically, we derive alternative online weighted least squares algorithms through Householder RLS and Householder least squares lattice (HLSL), which are numerically stable and retain the fast convergence capability of the RLS algorithm. Furthermore, we derive an angle-normalized variant of the HLSL algorithm and show that it is robust to speech cancellation for a wide range of forgetting factors and filter taps. Finally, we support our findings through experimental results and demonstrate numerical and algorithmic robustness, long-term stability, linear complexity in filter taps, low memory footprint, and effectiveness in speech recognition applications.
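
As a rough illustration of the baseline the abstract starts from, the sketch below runs a conventional variance-weighted RLS update for delayed linear prediction on a single channel and a single frequency bin. It is not the Householder RLS or HLSL algorithm derived in the paper. The function names, the toy signal model, and the choice to pass the true source power as the weight are all assumptions made for this example; a real system would have to estimate that power from the microphone signals.

```python
import random


def make_state(num_taps, delta=1e-2):
    """Return (P, g): P = I / delta is the initial inverse covariance, g = 0."""
    P = [[(1.0 / delta if i == j else 0.0) + 0j for j in range(num_taps)]
         for i in range(num_taps)]
    g = [0j] * num_taps
    return P, g


def weighted_rls_step(P, g, x_buf, y, variance, forgetting=0.99):
    """One step of exponentially weighted RLS with weight 1 / variance.

    x_buf    : delayed past observations used as the prediction input
    y        : current observation
    variance : assumed power of the desired signal at this frame
    Returns (P, g, d), where d is the a priori prediction error,
    i.e. the dereverberated sample in the linear prediction model.
    """
    K = len(g)
    # A priori error d = y - g^H x
    d = y - sum(g[i].conjugate() * x_buf[i] for i in range(K))
    # P x and x^H P
    Px = [sum(P[i][j] * x_buf[j] for j in range(K)) for i in range(K)]
    xHP = [sum(x_buf[i].conjugate() * P[i][j] for i in range(K))
           for j in range(K)]
    # Gain k = P x / (lambda * variance + x^H P x)
    quad = sum(x_buf[i].conjugate() * Px[i] for i in range(K)).real
    denom = forgetting * variance + quad
    k = [Px[i] / denom for i in range(K)]
    # Rank-one update of the inverse covariance. In finite precision this
    # step can lose the Hermitian, positive-definite structure of P, which
    # is the kind of numerical weakness that motivates QR/Householder forms.
    P = [[(P[i][j] - k[i] * xHP[j]) / forgetting for j in range(K)]
         for i in range(K)]
    # Filter update g <- g + k * conj(d)
    g = [g[i] + k[i] * d.conjugate() for i in range(K)]
    return P, g, d


def demo(num_frames=4000, delay=2, num_taps=2, a=0.5, seed=0):
    """Toy model x[n] = s[n] + a * x[n - delay] with white complex s[n]."""
    rng = random.Random(seed)
    P, g = make_state(num_taps)
    x = []
    for n in range(num_frames):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1))  # source power is 2
        tail = a * x[n - delay] if n >= delay else 0j
        x.append(s + tail)
        # Prediction input: observations delayed by at least `delay` frames
        x_buf = [x[n - delay - t] if n - delay - t >= 0 else 0j
                 for t in range(num_taps)]
        P, g, _ = weighted_rls_step(P, g, x_buf, x[n],
                                    variance=2.0, forgetting=0.999)
    return g
```

Calling `demo()` returns the two filter taps; under the toy model the first should settle near `a` and the second near zero. Note that each update costs on the order of K² operations for K taps, whereas the abstract reports linear complexity in filter taps for the lattice-based approach.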

Updated: 2020-05-25