Controlling the Perceived Sound Quality for Dialogue Enhancement with Deep Learning
arXiv - CS - Sound, Pub Date: 2021-07-22, DOI: arxiv-2107.10562
Christian Uhle, Matteo Torcoli, Jouni Paulus

Speech enhancement attenuates interfering sounds in speech signals but may introduce artifacts that perceivably deteriorate the output signal. We propose a method for controlling the trade-off between the attenuation of the interfering background signal and the loss of sound quality. A deep neural network estimates the attenuation of the separated background signal such that the sound quality, quantified using the Artifact-related Perceptual Score, meets an adjustable target. Subjective evaluations indicate that consistent sound quality is obtained across various input signals. Our experiments show that the proposed method is able to control the trade-off with an accuracy that is adequate for real-world dialogue enhancement applications.
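
The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes the enhanced output is formed by adding the separated background back to the separated dialogue at a reduced level, which the abstract does not spell out. The names `remix`, `pick_attenuation`, and `quality_fn` are hypothetical. In particular, `quality_fn` stands in for any predictor of sound quality as a function of attenuation; in the paper a deep neural network estimates the attenuation itself, whereas this sketch uses a simple search over candidates to show the idea of meeting an adjustable quality target.

```python
def remix(dialogue_est, background_est, attenuation_db):
    """Mix the separated background back in, attenuated by attenuation_db (>= 0 dB).

    Assumption: output = dialogue + gain * background, with a linear gain
    derived from the attenuation in decibels.
    """
    gain = 10.0 ** (-attenuation_db / 20.0)
    return [d + gain * b for d, b in zip(dialogue_est, background_est)]


def pick_attenuation(quality_fn, target, candidates_db):
    """Return the largest candidate attenuation whose predicted quality meets the target.

    quality_fn is a hypothetical callable mapping attenuation (dB) to a quality score.
    If no candidate meets the target, fall back to the smallest attenuation.
    """
    acceptable = [a for a in candidates_db if quality_fn(a) >= target]
    return max(acceptable) if acceptable else min(candidates_db)


# Toy quality model (made up for illustration): quality drops as attenuation grows.
toy_quality = lambda a_db: 100.0 - 2.0 * a_db

chosen_db = pick_attenuation(toy_quality, target=80.0, candidates_db=[0, 5, 10, 15, 20])
output = remix([1.0, -1.0], [0.5, 0.5], chosen_db)
```

Raising the target trades background attenuation for fewer artifacts, and lowering it does the opposite, which is the control knob the abstract refers to.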

Updated: 2021-07-23