Evaluation of the effectiveness and efficiency of state-of-the-art features and models for automatic speech recognition error detection, Journal of Big Data


Journal of Big Data (IF 8.6), Pub Date: 2021-01-06, DOI: 10.1186/s40537-020-00391-w
Asmaa El Hannani, Rahhal Errattahi, Fatima Zahra Salmam, Thomas Hain, Hassan Ouahmane

Speech-based human-machine interaction and natural language understanding applications have seen rapid development and wide adoption over the last few decades. This has led to a proliferation of studies that investigate error detection and classification in Automatic Speech Recognition (ASR) systems. However, these studies use different data sets and evaluation protocols, which makes direct comparison of the proposed approaches (e.g. features and models) difficult. In this paper we perform an extensive evaluation of the effectiveness and efficiency of state-of-the-art approaches in a unified framework for both error detection and error type classification. We make three primary contributions: (1) we compare our Variant Recurrent Neural Network (V-RNN) model with three other state-of-the-art neural-based models and show that the V-RNN model is the most effective classifier for ASR error detection in terms of accuracy and speed; (2) we compare four feature settings, corresponding to different categories of predictor features, and show that the generic features are particularly suitable for real-time ASR error detection applications; and (3) we examine the generalization ability of our error detection framework and perform a detailed post-detection analysis to identify the recognition errors that are difficult to detect.


Updated: 2021-01-07
