Non-negative matrix factorization-based time-frequency feature extraction of voice signal for Parkinson's disease prediction, Computer Speech & Language


Computer Speech & Language (IF 4.3), Pub Date: 2021-03-06, DOI: 10.1016/j.csl.2021.101216
Biswajit Karan, Sitanshu Sekhar Sahu, Juan Rafael Orozco-Arroyave, Kartik Mahto

Parkinson's disease (PD) is a neuron-related disorder that affects people in old age. Most people with PD develop voice impairments, chiefly those associated with dysarthric speech. Voice analysis can support both PD detection and the evaluation of a patient's dysarthria level. This study introduces time-frequency features to model the discontinuities and abrupt changes that PD produces in the voice signal. The proposed method consists of four stages: time-frequency matrix (TFM) representation, TFM decomposition using non-negative matrix factorization (NMF), feature extraction, and classification. Statistical analyses show that the proposed time-frequency features differ significantly between PD patients and healthy speakers. Experiments are conducted with sustained vowel phonations and isolated words from the PC–GITA corpus. The proposed method achieves average classification accuracies of up to 92% on vowels and 97% on words, an accuracy improvement of 10% to 40% over existing methods. The developed models are also evaluated on an independent dataset, where accuracies range from 63% to 75% on vowels and from 53% to 75% on isolated words. For dysarthria level evaluation, Spearman's correlations between original and predicted labels are around 0.81 for both sustained vowels and isolated words. The results indicate that the proposed approach is suitable and robust for the automatic detection of PD.
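The four stages named in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the TFM is a plain magnitude spectrogram, the NMF uses standard Lee-Seung multiplicative updates, the features (`tf_features`) are two simple statistics of the factors, the classifier is a nearest-centroid rule, and the signals are synthetic tones rather than PC–GITA recordings. All function names are hypothetical.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Stage 1 (assumed TFM): magnitude STFT, rows = frequency bins, columns = frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank=3, n_iter=200, eps=1e-9, seed=0):
    """Stage 2: factor V ~= W @ H with multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps   # spectral basis vectors
    H = rng.random((rank, V.shape[1])) + eps   # temporal activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def tf_features(W, H):
    """Stage 3 (illustrative features, not the paper's): per-component spectral
    centroid of W and normalized frame-to-frame change of H. Each group is sorted
    because NMF components come out in arbitrary order."""
    bins = np.arange(W.shape[0])[:, None]
    centroid = (bins * W).sum(axis=0) / (W.sum(axis=0) + 1e-12)
    change = np.abs(np.diff(H, axis=1)).mean(axis=1) / (H.mean(axis=1) + 1e-12)
    return np.concatenate([np.sort(centroid), np.sort(change)])

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stage 4 (stand-in classifier): assign each test row to the closest class mean."""
    X_train, y_train, X_test = map(np.asarray, (X_train, y_train, X_test))
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - means[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def synth_voice(rng, interrupted, sr=8000, dur=0.5):
    """Synthetic stand-in for a sustained vowel; 'interrupted' adds abrupt dropouts."""
    t = np.arange(int(sr * dur)) / sr
    f0 = rng.uniform(120, 180)
    x = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
    if interrupted:
        for start in rng.integers(0, len(t) - 400, size=4):
            x[start : start + 400] = 0.0
    return x + 0.01 * rng.standard_normal(len(t))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    labels = np.array([0, 1] * 20)  # 0 = steady tone, 1 = tone with dropouts
    feats = np.stack(
        [tf_features(*nmf(spectrogram(synth_voice(rng, bool(l))))) for l in labels]
    )
    pred = nearest_centroid_predict(feats[:30], labels[:30], feats[30:])
    print("held-out accuracy on synthetic data:", (pred == labels[30:]).mean())
```

The accuracy printed by this toy run says nothing about the paper's reported results; it only shows how a non-negative factorization of a time-frequency matrix yields spectral (`W`) and temporal (`H`) factors from which discontinuity-sensitive features can be computed.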


Updated: 2021-03-19
