Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting, Engineering Applications of Artificial Intelligence

Engineering Applications of Artificial Intelligence (IF 8), Pub Date: 2021-05-07, DOI: 10.1016/j.engappai.2021.104267
Abdulkader Ghandoura, Farouk Hjabo, Oumayma Al Dakkak

The introduction of the Google Speech Commands dataset accelerated research and led to a variety of new deep learning approaches to keyword spotting. The main contribution of this work is an Arabic Speech Commands dataset, a counterpart to Google’s dataset. Our dataset consists of 12,000 instances, collected from 30 contributors and grouped into 40 keywords. We also report experiments that benchmark this dataset using classical machine learning and deep learning approaches, the best of which is a Convolutional Neural Network with Mel-Frequency Cepstral Coefficients that achieved an accuracy of approximately 98%. Additionally, we point out some key ideas to be considered in such tasks.
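
To make the feature side of the best-performing setup concrete, here is a minimal sketch of Mel-Frequency Cepstral Coefficient extraction in NumPy. It is not the authors' pipeline: the abstract does not give the sample rate, clip length, frame sizes, filter count, or number of coefficients, so every parameter value below (16 kHz, 1-second clip, 25 ms frames, 10 ms hop, 40 mel filters, 13 coefficients) and every function name is an assumption chosen for illustration. The resulting frames-by-coefficients matrix is the kind of input a small CNN classifier could consume.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):       # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=40, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix. All defaults are illustrative assumptions."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    # Mel filterbank energies, then log compression.
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # Unnormalized DCT-II across the filter axis, keeping the first n_coeffs coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T

# Synthetic stand-in for a one-second keyword clip (not real dataset audio).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clip = np.sin(2 * np.pi * 440.0 * t) + 0.01 * rng.standard_normal(16000)
features = mfcc(clip)
print(features.shape)  # (98, 13)
```

With these assumed settings, each clip becomes a 98 x 13 array, which can be treated as a single-channel image by a convolutional classifier with 40 output classes, one per keyword.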

Updated: 2021-05-07
