Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques
IEEE Signal Processing Magazine (IF 9.4), Pub Date: 2019-11-01, DOI: 10.1109/msp.2019.2918706
Reinhold Haeb-Umbach, Shinji Watanabe, Tomohiro Nakatani, Michiel Bacchiani, Bjorn Hoffmeister, Michael L. Seltzer, Heiga Zen, Mehrez Souden

Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital home assistants with a spoken language interface have become a ubiquitous commodity today. This success has been made possible by major advancements in signal processing and machine learning for so-called far-field speech recognition, where the commands are spoken at a distance from the sound-capturing device. The challenges encountered are quite unique and different from many other use cases of automatic speech recognition (ASR). The purpose of this article is to describe, in a way that is amenable to the nonspecialist, the key speech processing algorithms that enable reliable, fully hands-free speech interaction with digital home assistants. These technologies include multichannel acoustic echo cancellation (MAEC), microphone array processing and dereverberation techniques for signal enhancement, reliable wake-up word and end-of-interaction detection, and high-quality speech synthesis as well as sophisticated statistical models for speech and language, learned from large amounts of heterogeneous training data. In all of these fields, deep learning (DL) has played a critical role.

Updated: 2019-11-01