EMG-based speech recognition using dimensionality reduction methods
Journal of Ambient Intelligence and Humanized Computing Pub Date : 2021-05-23 , DOI: 10.1007/s12652-021-03315-5
Anat Ratnovsky , Sarit Malayev , Shahar Ratnovsky , Sara Naftali , Neta Rabin

Automatic speech recognition is the main form of man–machine communication. Recently, several studies have shown that speech can be recognized automatically from electromyography (EMG) signals of the facial muscles using machine learning methods. The objective of this study was to use machine learning methods to identify speech automatically from EMG signals. EMG signals from three facial muscles were recorded from four healthy female subjects while each pronounced seven different words 50 times. Short-time Fourier transform features were extracted from the EMG data. Principal component analysis (PCA) and locally linear embedding (LLE) were applied and compared for reducing the dimensionality of the EMG data. K-nearest-neighbors was used to examine two tasks: identifying different word sets of a subject based on that subject's own dataset, and identifying the words of one subject based on another subject's dataset, using an affine transformation to align the reduced feature spaces of the two subjects. In the single-subject approach, PCA and LLE achieved an average recognition rate of 81% for five-word sets. In the multi-subject classification approach, the best average recognition rates for three-word and five-word sets were 88.8% and 74.6%, respectively. Both PCA and LLE achieved satisfactory classification rates in the single-subject and multi-subject approaches. The multi-subject classification approach enables robust classification of words recorded from a new subject based on another subject's dataset, and thus may be applicable to people who have lost their ability to speak.
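
The pipeline described in the abstract (short-time Fourier transform features, dimensionality reduction, k-nearest-neighbors, and an affine alignment between two subjects' reduced spaces) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the synthetic single-channel "EMG" signals, the window and hop sizes, the two PCA components, `k=3`, the paired calibration samples used to fit the affine map, and all function names are assumptions made for illustration only. LLE is omitted; only the PCA branch is shown.

```python
import numpy as np

def stft_features(signal, win=64, hop=32):
    """Magnitude STFT with a Hann window, flattened into one feature vector."""
    window = np.hanning(win)
    frames = [signal[i:i + win] * window
              for i in range(0, len(signal) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).ravel()

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def fit_affine(src, dst):
    """Least-squares affine map (A, b) such that src @ A + b approximates dst."""
    src_h = np.hstack([src, np.ones((len(src), 1))])
    W, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return W[:-1], W[-1]

def knn_predict(train_X, train_y, test_X, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dist = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(v).argmax() for v in train_y[nearest]])

def synth_subject(rng, n_words=3, reps=20, length=512, gain=1.0):
    """Hypothetical stand-in data: each 'word' is a noisy sinusoid."""
    t = np.arange(length)
    X, y = [], []
    for w in range(n_words):
        for _ in range(reps):
            sig = gain * np.sin(2 * np.pi * (0.05 + 0.05 * w) * t)
            sig = sig + 0.3 * rng.standard_normal(length)
            X.append(stft_features(sig))
            y.append(w)
    return np.array(X), np.array(y)

N_WORDS, REPS = 3, 20
rng = np.random.default_rng(0)
Xa, ya = synth_subject(rng, N_WORDS, REPS, gain=1.0)   # subject A
Xb, yb = synth_subject(rng, N_WORDS, REPS, gain=1.7)   # subject B

# Single-subject: train and test on subject A (even/odd split).
train, test = np.arange(0, len(ya), 2), np.arange(1, len(ya), 2)
mean_a, comps_a = pca_fit(Xa[train], 2)
single_acc = np.mean(
    knn_predict(pca_transform(Xa[train], mean_a, comps_a), ya[train],
                pca_transform(Xa[test], mean_a, comps_a)) == ya[test])

# Multi-subject: reduce each subject in its own PCA space, then fit an
# affine map from B's space to A's using a few paired calibration samples.
mean_b, comps_b = pca_fit(Xb, 2)
Za = pca_transform(Xa, mean_a, comps_a)
Zb = pca_transform(Xb, mean_b, comps_b)
calib = np.concatenate([np.arange(w * REPS, w * REPS + 5) for w in range(N_WORDS)])
A, b = fit_affine(Zb[calib], Za[calib])
rest = np.setdiff1d(np.arange(len(yb)), calib)
multi_acc = np.mean(knn_predict(Za, ya, Zb[rest] @ A + b) == yb[rest])

print("single-subject accuracy:", single_acc)
print("multi-subject accuracy:", multi_acc)
```

The affine step matters because each subject's PCA basis is fitted independently, so the two reduced spaces differ by sign, rotation, scale, and offset; classifying subject B's points against subject A's training set only makes sense after mapping them into A's coordinates. How the original study paired samples to estimate that transformation is not stated in the abstract, so the calibration scheme here is only one plausible choice.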

Updated: 2021-05-24