Acoustic analysis and detection of pharyngeal fricative in cleft palate speech using correlation of signals in independent frequency bands and octave spectrum prominent peak.
BioMedical Engineering OnLine ( IF 2.9 ) Pub Date : 2020-05-27 , DOI: 10.1186/s12938-020-00782-3
Fei He 1 , Xiyue Wang 1 , Heng Yin 2 , Han Zhang 1 , Gang Yang 1 , Ling He 1

The pharyngeal fricative is a typical compensatory articulation error in cleft palate speech. It negatively affects the daily communication of people who produce it. Automatic detection of pharyngeal fricatives in cleft palate speech can give clinicians and speech-language pathologists information that aids diagnosis. This paper proposes two features for detecting pharyngeal fricative speech: CSIFs (correlation of signals in independent frequency bands) and OSPP (octave spectrum prominent peak). The CSIFs feature captures the distribution of frequency components in pharyngeal fricative speech, which is altered by the changed place of articulation and the movement of the articulators. The OSPP feature reflects the degree of concentration of the prominent spectral peak, which is closely related to the place of articulation of the pharyngeal fricative. Both features are examined in relation to the altered production process of the pharyngeal fricative. To evaluate how well these two features detect pharyngeal fricatives, we collected a speech database covering all the types of initial consonants in which pharyngeal fricatives occur. In this detection task, the classifier used to discriminate pharyngeal fricative speech from normal speech is based on ensemble learning. The detection accuracy obtained with the CSIFs feature ranges from 83.5 to 84.5%, and with the OSPP feature from 85 to 87%. When the two features are combined, the detection accuracy for pharyngeal fricative speech ranges from 88 to 89%, with an AUC (area under the receiver operating characteristic curve) value of 93%.
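
The abstract reports two metrics, detection accuracy and AUC. The following is a minimal sketch of how these two quantities are commonly computed for a binary detector; it is not the authors' evaluation code. The function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration (label 1 stands for pharyngeal fricative speech, label 0 for normal speech).

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    This pairwise form is equivalent to the area under the ROC curve;
    ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples classified correctly at a fixed threshold (assumed 0.5)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    toy_labels = [1, 1, 1, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("Accuracy:", accuracy(toy_labels, toy_scores))
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two figures in an evaluation can differ.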

Updated: 2020-05-27