Improved Robust ASR for Social Robots in Public Spaces
arXiv - CS - Sound Pub Date : 2020-01-14 , DOI: arxiv-2001.04619
Charles Jankowski, Vishwas Mruthyunjaya, Ruixi Lin

Social robots deployed in public spaces present a challenging task for automatic speech recognition (ASR) because of a variety of factors, including noise at signal-to-noise ratios (SNRs) of 20 to 5 dB. Existing ASR models perform well at the higher SNRs in this range but degrade considerably as the noise level increases. This work explores methods for improving ASR performance in such conditions. We use the AiShell-1 Chinese speech corpus and the Kaldi ASR toolkit for evaluations. We were able to exceed state-of-the-art ASR performance at SNRs lower than 20 dB, demonstrating the feasibility of achieving relatively high-performing ASR with open-source toolkits and hundreds of hours of training data, an amount that is commonly available.
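
The abstract does not say how noisy speech at a given SNR was obtained. As a rough illustration of what "SNR of 20 to 5 dB" means in practice, the following sketch scales a noise signal so that the speech-to-noise power ratio matches a target value in dB. The function names `mix_at_snr` and `measured_snr_db` and the synthetic signals are assumptions made for this example; they are not taken from the paper or from Kaldi.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared samples) of a signal."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR in dB.

    Hypothetical helper for illustration only.
    """
    # Loop the noise so it covers the whole speech signal.
    noise = [noise[i % len(noise)] for i in range(len(speech))]
    # SNR_dB = 10 * log10(P_speech / P_noise), solved for the noise power.
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """SNR in dB of a mixture, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixture)]
    return 10.0 * math.log10(mean_power(speech) / mean_power(residual))


# Synthetic stand-ins: a 220 Hz tone as "speech", Gaussian noise as "noise".
random.seed(0)
sample_rate = 16000
speech = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate // 2)]

for snr in (20, 10, 5):
    noisy = mix_at_snr(speech, noise, snr)
    print(snr, round(measured_snr_db(speech, noisy), 3))
```

Each 10 dB drop in SNR corresponds to a tenfold increase in noise power relative to the speech, so the 5 dB end of the range has roughly 32 times more relative noise power than the 20 dB end.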

Updated: 2020-01-15