Multi-task Learning with Cross Attention for Keyword Spotting
arXiv - CS - Sound Pub Date : 2021-07-15 , DOI: arxiv-2107.07634
Takuya Higuchi, Anmol Gupta, Chandra Dhir

Keyword spotting (KWS) is an important technique for speech applications, which enables users to activate devices by speaking a keyword phrase. Although a phoneme classifier can be used for KWS, exploiting a large amount of transcribed data for automatic speech recognition (ASR), there is a mismatch between the training criterion (phoneme recognition) and the target task (KWS). Recently, multi-task learning has been applied to KWS to exploit both ASR and KWS training data. In this approach, an output of an acoustic model is split into two branches for the two tasks, one for phoneme transcription trained with the ASR data and one for keyword classification trained with the KWS data. In this paper, we introduce a cross attention decoder in the multi-task learning framework. Unlike the conventional multi-task learning approach with the simple split of the output layer, the cross attention decoder summarizes information from a phonetic encoder by performing cross attention between the encoder outputs and a trainable query sequence to predict a confidence score for the KWS task. Experimental results on KWS tasks show that the proposed approach outperformed the conventional multi-task learning with split branches and a bi-directional long short-term memory decoder by 12% on average.
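The mechanism the abstract describes, cross attention between frame-level phonetic encoder outputs and a trainable query sequence, pooled into a single keyword confidence score, can be sketched as below. This is a minimal single-head illustration under assumed dimensions, not the authors' implementation; all weight matrices, the mean pooling over queries, and the sigmoid output are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_decoder(encoder_out, queries, W_q, W_k, W_v, w_out):
    """Summarize phonetic encoder outputs with a trainable query sequence.

    encoder_out: (T, d)   frame-level outputs of the phonetic encoder
    queries:     (L, d)   trainable query sequence (learned, fixed length L)
    Returns a scalar keyword confidence score in (0, 1).
    """
    Q = queries @ W_q                          # (L, d_a) project queries
    K = encoder_out @ W_k                      # (T, d_a) project keys
    V = encoder_out @ W_v                      # (T, d_a) project values
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # (L, T) cross attention scores
    attn = softmax(scores, axis=-1)            # attend over encoder frames
    summary = attn @ V                         # (L, d_a) summarized information
    logit = summary.mean(axis=0) @ w_out       # pool queries, project to scalar
    return 1.0 / (1.0 + np.exp(-logit))        # sigmoid -> confidence score

# Toy example with assumed sizes: 50 frames, 4 queries, d=16, attention dim 8.
rng = np.random.default_rng(0)
T, L, d, d_a = 50, 4, 16, 8
enc = rng.normal(size=(T, d))
q = rng.normal(size=(L, d))
score = cross_attention_decoder(
    enc, q,
    rng.normal(size=(d, d_a)), rng.normal(size=(d, d_a)),
    rng.normal(size=(d, d_a)), rng.normal(size=(d_a,)))
print(0.0 < score < 1.0)
```

In training, the queries and projection weights would be learned jointly with the encoder, with the phoneme-transcription branch supervised by ASR data and this decoder supervised by KWS labels.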

Updated: 2021-07-19