Sinusoidal model-based hypernasality detection in cleft palate speech using CVCV sequence
Speech Communication (IF 2.4), Pub Date: 2020-08-08, DOI: 10.1016/j.specom.2020.08.001
Akhilesh Kumar Dubey, S.R. Mahadeva Prasanna, S. Dandapat

Hypernasality in the speech of children with cleft palate is a consequence of velopharyngeal insufficiency. Spectral analysis of hypernasal speech shows nasal formants and anti-formants in the spectrum, which affect harmonic intensity. Nasal formants increase, whereas anti-formants decrease, the magnitude of the harmonics around the frequencies at which they are introduced. Hence, the spectra of hypernasal and normal speech differ. To capture this spectral difference, three features are proposed in this work: normalized harmonic amplitude (NHA), harmonic amplitude ratio (HAR), and prominent harmonics frequency (PHF). The NHA feature is the magnitude of the harmonics after normalization with respect to the maximum magnitude, the HAR feature is the magnitude of each harmonic relative to the preceding harmonic, and the PHF feature is the set of frequencies of the prominent harmonics in the spectrum. Using a support vector machine classifier, the combination of the three features detects hypernasality with an accuracy of 82.46%, 87.89%, and 84.25% for the vowels /a/, /i/, and /u/, respectively.
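The three feature definitions can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the harmonic frequencies and linear magnitudes of one vowel frame have already been estimated (for example by a sinusoidal model), and it treats "prominent" harmonics as the `top_k` largest ones, which is a hypothetical choice the abstract does not specify. All function names and the example values are illustrative.

```python
from typing import List, Sequence


def normalized_harmonic_amplitude(mags: Sequence[float]) -> List[float]:
    """NHA: each harmonic magnitude divided by the maximum magnitude."""
    peak = max(mags)
    if peak <= 0:
        raise ValueError("at least one harmonic magnitude must be positive")
    return [m / peak for m in mags]


def harmonic_amplitude_ratio(mags: Sequence[float]) -> List[float]:
    """HAR: magnitude of each harmonic relative to the preceding harmonic.

    Returns len(mags) - 1 values, since the first harmonic has no predecessor.
    """
    if any(m <= 0 for m in mags[:-1]):
        raise ValueError("preceding harmonic magnitudes must be positive")
    return [mags[k] / mags[k - 1] for k in range(1, len(mags))]


def prominent_harmonic_frequencies(
    freqs: Sequence[float], mags: Sequence[float], top_k: int = 3
) -> List[float]:
    """PHF: frequencies of the top_k strongest harmonics (assumed definition).

    The result is sorted by ascending frequency.
    """
    if len(freqs) != len(mags):
        raise ValueError("freqs and mags must have the same length")
    strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:top_k]
    return sorted(freqs[k] for k in strongest)


# Made-up harmonics of a frame with a 200 Hz fundamental (illustrative only).
freqs = [200.0, 400.0, 600.0, 800.0, 1000.0]
mags = [4.0, 8.0, 2.0, 1.0, 0.5]

nha = normalized_harmonic_amplitude(mags)
har = harmonic_amplitude_ratio(mags)
phf = prominent_harmonic_frequencies(freqs, mags, top_k=2)
```

In a detection pipeline along the lines the abstract describes, the three vectors would be concatenated per frame and passed to a classifier such as a support vector machine. The number of harmonics, the amplitude scale, and the prominence criterion would need to follow the paper itself.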




Updated: 2020-08-08