Improving Reverberant Speech Separation with Multi-stage Training and Curriculum Learning
arXiv - CS - Sound. Pub Date: 2021-07-19, DOI: arxiv-2107.09177
Rohith Aralikatti, Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha

We present a novel approach that improves the performance of reverberant speech separation. Our approach is based on an accurate geometric acoustic simulator (GAS), which generates realistic room impulse responses (RIRs) by modeling both specular and diffuse reflections. We also propose three training methods (pre-training, multi-stage training, and curriculum learning) that significantly improve separation quality in the presence of reverberation, and we demonstrate that mixing the synthetic RIRs with a small number of real RIRs during training further enhances separation performance. We evaluate our approach on reverberant mixtures generated from real, recorded data, captured in several different room configurations, from the VOiCES dataset. The combined approach (curriculum learning + pre-training + multi-stage training) yields a significant relative improvement over prior techniques based on the image source method (ISM).
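The data-generation idea in the abstract (convolve each clean source with an RIR, sum the results, and draw RIRs from a pool that is mostly synthetic with a few real ones) can be illustrated with a minimal sketch. This is not the authors' pipeline: the GAS simulator is not reproduced, the function names and the `real_fraction` parameter are hypothetical, and the abstract does not state what ratio of real to synthetic RIRs was used.

```python
import random
from typing import List, Sequence


def convolve(signal: Sequence[float], rir: Sequence[float]) -> List[float]:
    """Direct-form linear convolution; output has len(signal) + len(rir) - 1 samples."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


def sample_rir(synthetic: Sequence[Sequence[float]],
               real: Sequence[Sequence[float]],
               real_fraction: float,
               rng: random.Random) -> Sequence[float]:
    """Pick a real RIR with probability real_fraction (an assumed knob), else a synthetic one."""
    use_real = bool(real) and rng.random() < real_fraction
    return rng.choice(real if use_real else synthetic)


def reverberant_mixture(sources: Sequence[Sequence[float]],
                        rirs: Sequence[Sequence[float]]) -> List[float]:
    """Convolve each source with its own RIR, then sum the reverberant signals sample-wise."""
    wet = [convolve(s, h) for s, h in zip(sources, rirs)]
    mix = [0.0] * max(len(w) for w in wet)
    for w in wet:
        for n, v in enumerate(w):
            mix[n] += v
    return mix


if __name__ == "__main__":
    rng = random.Random(0)
    synthetic_rirs = [[1.0, 0.5, 0.25], [1.0, 0.3]]   # stand-ins for simulated RIRs
    real_rirs = [[1.0, 0.6, 0.2, 0.1]]                # stand-in for a measured RIR
    speakers = [[0.1, -0.2, 0.3], [0.05, 0.05, -0.1]]  # toy "speech" samples
    chosen = [sample_rir(synthetic_rirs, real_rirs, 0.1, rng) for _ in speakers]
    print(reverberant_mixture(speakers, chosen))
```

A real system would use an FFT-based convolution from a numerical library and actual audio; the pure-Python loops here only keep the sketch dependency-free.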

Updated: 2021-07-21