Automatic hypernasality grade assessment in cleft palate speech based on the spectral envelope method
Biomedical Engineering / Biomedizinische Technik (IF 1.3), Pub Date: 2019-09-16, DOI: 10.1515/bmt-2018-0181
Jing Zhang 1 , Sen Yang 1 , Xiyue Wang 1 , Ming Tang 1 , Heng Yin 2 , Ling He 1

Velopharyngeal incompetence lets airflow escape from the oral cavity into the nasal cavity, which results in hypernasality. Hypernasality greatly reduces speech intelligibility and hampers the daily communication of patients with cleft palate. Accurate assessment of hypernasality grades can assist speech-language pathologists (SLPs) with diagnosis in clinical settings. Using a support vector machine (SVM), this paper classifies speech recordings into four grades (normal, mild, moderate and severe hypernasality) based on vocal tract characteristics. Linear prediction (LP) analysis is widely used to model the vocal tract, but the LP-based spectrum may also contain glottal source information. The stabilized weighted linear prediction (SWLP) method, which imposes temporal weights on the closed-phase interval of the glottal cycle, is a more robust approach to modeling the vocal tract. The extended weighted linear prediction (XLP) method weights each lagged speech signal separately, which achieves a finer time scale on the spectral envelope than the SWLP method. The test recordings were collected from 60 subjects with cleft palate and 20 control subjects and comprise a total of 4640 Mandarin syllables. The experimental results show that, in the high-frequency range, the spectral envelope of normal speech decreases faster than that of hypernasal speech. They also indicate that the SWLP- and XLP-based methods yield smaller correlation coefficients between normal and hypernasal speech than the LP method; the SWLP and XLP methods therefore distinguish hypernasal from normal speech better than the LP method. The classification accuracies for the four hypernasality grades using the SWLP and XLP methods range from 83.86% to 97.47%. The selection of the model order and the size of the weight function are also discussed.
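
To make the LP baseline mentioned in the abstract concrete, the following is a minimal sketch of conventional autocorrelation-method linear prediction, the resulting all-pole spectral envelope, and a correlation coefficient between two envelopes. It is not the authors' implementation and it does not implement SWLP or XLP, which additionally apply temporal weights. All function names, the model order, the number of frequency bins and the synthetic test signal are assumptions chosen for illustration only.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Conventional LP (autocorrelation method); illustrative, not SWLP/XLP.

    Returns a[0..order-1] such that x[n] is approximated by
    sum_k a[k] * x[n - k - 1].
    """
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    # Autocorrelation of the windowed frame at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])


def lp_envelope_db(a, n_freq=256):
    """Envelope of the all-pole model 1 / A(z) in dB on n_freq bins over [0, pi)."""
    inverse_filter = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    A = np.fft.rfft(inverse_filter, n=2 * n_freq)[:n_freq]
    return -20.0 * np.log10(np.abs(A) + 1e-12)


def envelope_correlation(env_a, env_b):
    """Pearson correlation coefficient between two spectral envelopes."""
    return float(np.corrcoef(env_a, env_b)[0, 1])


def synth_resonance(theta, radius=0.95, n=4096, seed=0):
    """Hypothetical test signal: white noise through a single two-pole resonator."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    a1, a2 = 2.0 * radius * np.cos(theta), -radius ** 2
    x = np.zeros(n)
    for i in range(n):
        prev1 = x[i - 1] if i >= 1 else 0.0
        prev2 = x[i - 2] if i >= 2 else 0.0
        x[i] = e[i] + a1 * prev1 + a2 * prev2
    return x
```

A typical use would be to compute `lp_envelope_db(lp_coefficients(frame, order))` for frames from two recordings and compare them with `envelope_correlation`; a smaller coefficient means the two envelopes are less alike. Real speech would need a higher model order than the two-pole toy signal here.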

Updated: 2019-09-16