Query-by-example on-device keyword spotting


arXiv - CS - Computation and Language. Pub Date: 2019-10-11, DOI: arxiv-1910.05171
Byeonggeun Kim, Mingu Lee, Jinkyu Lee, Yeonseok Kim, and Kyuwoong Hwang

A keyword spotting (KWS) system determines whether a keyword, usually a predefined one, is present in a continuous speech stream. This paper presents a user-specific, query-by-example, on-device KWS system. The proposed system consists of two main steps: query enrollment and testing. In the query enrollment step, a small-footprint automatic speech recognition model based on connectionist temporal classification outputs phonetic posteriors. From this phonetic-level posteriorgram, a hypothesis graph is built as a finite-state transducer (FST), so any keyword can be enrolled and the out-of-vocabulary problem is avoided. In testing, the FST is used to score a log-likelihood for the input audio. We propose a threshold prediction method that uses only the user-specific keyword hypothesis. The system generates query-specific negatives by rearranging each query utterance in the waveform domain. The threshold is then decided from the enrollment queries and the generated negatives. We tested the system on two English keywords, and it shows promising performance while remaining simple.
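The threshold prediction idea can be illustrated with a minimal Python sketch. The abstract does not specify how the waveform is rearranged or how the threshold is computed from the scores, so everything below is an assumption made for illustration: `rearrange_waveform` rotates a fixed number of equal-length segments, `predict_threshold` takes the midpoint between the mean score of the enrollment queries and the mean score of the generated negatives, and `score_fn` is a hypothetical stand-in for the FST log-likelihood scorer.

```python
from typing import Callable, List, Sequence


def rearrange_waveform(samples: Sequence[float], num_segments: int = 3) -> List[float]:
    """Build a query-specific negative by reordering segments of the query.

    Assumption: the waveform is cut into `num_segments` contiguous chunks
    and the chunks are rotated by one position. The abstract does not say
    which rearrangement the authors use.
    """
    if num_segments < 2 or len(samples) < num_segments:
        raise ValueError("need at least two non-empty segments")
    # Segment boundaries that cover the whole waveform.
    bounds = [len(samples) * i // num_segments for i in range(num_segments + 1)]
    segments = [list(samples[bounds[i]:bounds[i + 1]]) for i in range(num_segments)]
    # Rotate: the first segment moves to the end.
    rotated = segments[1:] + segments[:1]
    return [s for segment in rotated for s in segment]


def predict_threshold(
    queries: Sequence[Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
    num_segments: int = 3,
) -> float:
    """Pick a threshold from enrollment queries and their generated negatives.

    Assumption: the threshold is the midpoint between the mean positive
    score and the mean negative score. `score_fn` is hypothetical and
    stands in for the FST-based log-likelihood scorer.
    """
    if not queries:
        raise ValueError("at least one enrollment query is required")
    positives = [score_fn(q) for q in queries]
    negatives = [score_fn(rearrange_waveform(q, num_segments)) for q in queries]
    pos_mean = sum(positives) / len(positives)
    neg_mean = sum(negatives) / len(negatives)
    return (pos_mean + neg_mean) / 2.0


# Toy usage with a made-up scorer: negative L1 distance to a fixed template.
TEMPLATE = [1, 2, 3, 4, 5, 6]


def toy_score(samples: Sequence[float]) -> float:
    return -sum(abs(a - b) for a, b in zip(samples, TEMPLATE))


threshold = predict_threshold([TEMPLATE], toy_score)
```

At test time, under the same assumptions, an input whose score exceeds `threshold` would be accepted as the keyword. A real system would score audio with the enrolled FST rather than with `toy_score`.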


Updated: 2020-01-15
