Personalized Keyphrase Detection using Speaker and Environment Information

arXiv - CS - Sound, Pub Date: 2021-04-28, DOI: arxiv-2104.13970
Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ding Zhao, Yiteng (Arden) Huang, Arun Narayanan, Ian McGraw

In this paper, we introduce a streaming keyphrase detection system that can be easily customized to accurately detect any phrase composed of words from a large vocabulary. The system is implemented with an end-to-end trained automatic speech recognition (ASR) model and a text-independent speaker verification model. To address the challenge of detecting these keyphrases under various noisy conditions, a speaker separation model is added to the feature frontend of the speaker verification model, and an adaptive noise cancellation (ANC) algorithm is included to exploit cross-microphone noise coherence. Our experiments show that the text-independent speaker verification model largely reduces the false triggering rate of the keyphrase detection, while the speaker separation model and adaptive noise cancellation largely reduce false rejections.
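
The abstract does not describe how the ASR output and the speaker verification output are combined, so the following is only a minimal sketch of one plausible gating scheme, not the authors' implementation. It assumes the ASR model yields a text transcript and the speaker verification model yields a fixed-size embedding that is compared with an enrolled embedding by cosine similarity. The function names, the 0.7 threshold, and the two-dimensional embeddings are all hypothetical and chosen for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple normalizer)."""
    return text.lower().split()


def contains_keyphrase(transcript: str, keyphrase: str) -> bool:
    """True if the keyphrase words appear contiguously in the transcript."""
    words = tokenize(transcript)
    target = tokenize(keyphrase)
    n = len(target)
    if n == 0:
        return False
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def should_trigger(
    transcript: str,
    keyphrase: str,
    utterance_embedding: Sequence[float],
    enrolled_embedding: Sequence[float],
    threshold: float = 0.7,  # hypothetical value, not taken from the paper
) -> bool:
    """Trigger only if the keyphrase was recognized AND the speaker matches."""
    if not contains_keyphrase(transcript, keyphrase):
        return False
    score = cosine_similarity(utterance_embedding, enrolled_embedding)
    return score >= threshold
```

In this sketch the speaker check can only veto a detection, which mirrors the abstract's claim that speaker verification reduces false triggering. Reducing false rejections in noise is attributed to the speaker separation and ANC stages, which are not modeled here.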

Updated: 2021-04-30
