Spectro-Temporal Attention-Based Voice Activity Detection
IEEE Signal Processing Letters (IF 3.2), Pub Date: 2020-01-01, DOI: 10.1109/lsp.2019.2959917
Younglo Lee, Jeongki Min, David K. Han, Hanseok Ko

Voice activity detection (VAD) systems degrade under unexpected, non-stationary background noise that is loud enough to mask the speech signal. Although several methods of improving VAD performance have been proposed, they have yet to mitigate the influence of the background noise itself. This letter proposes a noise-robust VAD approach that applies a deep learning-based attention mechanism along both the spectral and the temporal axes. The proposed method is compared with several other deep learning-based methods in terms of the area under the curve (AUC), in experiments on noise-added data with known and unknown noise types and on real-world noisy data. The results show that the proposed method outperforms the other methods in all the scenarios considered and, moreover, generalizes well to environments with unknown or unexpected noise.
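The abstract does not describe the network itself, so the following is only a minimal sketch of what attention along the spectral and temporal axes can look like, plus the AUC metric used for comparison. It is not the authors' architecture. The function names, the scoring vectors `w` and `v` (stand-ins for learned parameters), and the reading of "area under the curve" as the area under the ROC curve are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def spectral_attention(frame, w):
    """Reweight the frequency bins of one frame (assumed form, not from the paper).

    frame: F spectral features for a single time frame.
    w:     F hypothetical scoring weights (would be learned in practice).
    """
    alpha = softmax([wi * xi for wi, xi in zip(w, frame)])  # one weight per bin
    return [a * x for a, x in zip(alpha, frame)]


def temporal_attention(frames, v):
    """Pool T frames into one context vector (assumed form, not from the paper).

    frames: T x F features.
    v:      F hypothetical scoring weights (would be learned in practice).
    """
    scores = [sum(vi * xi for vi, xi in zip(v, f)) for f in frames]
    beta = softmax(scores)  # one weight per frame
    n_bins = len(frames[0])
    return [sum(b * f[k] for b, f in zip(beta, frames)) for k in range(n_bins)]


def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch the attention weights are normalized with a softmax, so bins or frames with higher scores contribute more to the output. A detector that ranks every speech frame above every non-speech frame scores an AUC of 1.0, while random scores give about 0.5.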

Updated: 2020-01-01