Automatic intelligibility assessment of dysarthric speech using glottal parameters, Speech Communication


Speech Communication (IF 2.4), Pub Date: 2020-07-01, DOI: 10.1016/j.specom.2020.06.003
N. P. Narendra, Paavo Alku

Objective intelligibility assessment of dysarthric speech can assist clinicians in the diagnosis of speech disorders as well as in medical treatment. This study investigates the use of glottal parameters (i.e., parameters that describe the glottal flow, the acoustical excitation of voiced speech) in the automatic intelligibility assessment of dysarthric speech. Instead of directly predicting the intelligibility of dysarthric speech with a single-stage system, the proposed method uses a two-stage framework. In the first stage, two-class severity classification of dysarthria is performed using support vector machines (SVMs). In the second stage, the intelligibility of dysarthric speech is estimated using a linear regression model. Two sets of glottal parameters are explored: (1) time-domain and frequency-domain parameters and (2) parameters based on principal component analysis (PCA). Acoustic parameters proposed in a similar intelligibility prediction study by Falk et al. (2012) are used as baseline features. Evaluation results show that the two-stage framework improves the intelligibility assessment measures (correlation and root mean square error) compared to the single-stage framework. Combining the two glottal parameter sets yields better performance than the baseline features in both the severity classification and intelligibility estimation tasks.
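The two-stage idea and the two evaluation measures named in the abstract can be illustrated with a small standard-library sketch. It is not the authors' implementation, and several things in it are assumptions: the stage-one classifier is a hand-written threshold rule standing in for a trained SVM, the second stage fits one regression line per severity class (the abstract does not say how the regression is conditioned on the class), a single hypothetical feature replaces the glottal parameter sets, and the speaker data are invented for illustration only.

```python
import math
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Line = Tuple[float, float]  # (slope, intercept)


def fit_line(x: Sequence[float], y: Sequence[float]) -> Line:
    """Closed-form least squares for one feature: y ~ slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def two_stage_predict(
    x: Sequence[float],
    classify: Callable[[float], str],
    lines: Dict[str, Line],
) -> List[float]:
    """Stage 1 picks a severity class; stage 2 applies that class's line."""
    preds = []
    for xi in x:
        slope, intercept = lines[classify(xi)]
        preds.append(slope * xi + intercept)
    return preds


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between reference and predicted scores."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def rmse(a: Sequence[float], b: Sequence[float]) -> float:
    """Root mean square error between reference and predicted scores."""
    return math.sqrt(mean((ai - bi) ** 2 for ai, bi in zip(a, b)))


# Invented toy data (NOT from the paper): one hypothetical feature value per
# speaker, a made-up intelligibility score in percent, and a severity label.
feature = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
scores = [12.0, 18.0, 33.0, 38.0, 81.0, 85.0, 85.0, 89.0]
severity = ["severe"] * 4 + ["mild"] * 4


def classify(xi: float) -> str:
    """Stand-in for the stage-one SVM: a fixed threshold on the toy feature."""
    return "severe" if xi < 5.0 else "mild"


# Stage 2 training: one regression line per severity class (an assumption).
lines: Dict[str, Line] = {}
for label in ("severe", "mild"):
    xs = [f for f, s in zip(feature, severity) if s == label]
    ys = [y for y, s in zip(scores, severity) if s == label]
    lines[label] = fit_line(xs, ys)

# Single-stage comparison: one line fitted to all speakers at once.
slope, intercept = fit_line(feature, scores)
single_stage = [slope * xi + intercept for xi in feature]
two_stage = two_stage_predict(feature, classify, lines)

for name, pred in (("single-stage", single_stage), ("two-stage", two_stage)):
    print(f"{name}: r = {pearson(scores, pred):.3f}, RMSE = {rmse(scores, pred):.2f}")
```

On this toy data the two-stage predictions fit better simply because the scores were made up to follow a different trend in each severity class, and the fit is measured on the training speakers themselves. The sketch says nothing about the size of the improvement reported in the study.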


Updated: 2020-07-01
