Event-Based Transformation of Misarticulated Stops in Cleft Lip and Palate Speech
Circuits, Systems, and Signal Processing (IF 2.3), Pub Date: 2021-02-20, DOI: 10.1007/s00034-021-01663-3
Protima Nomo Sudro, C. M. Vikram, S. R. Mahadeva Prasanna

Cleft lip and palate (CLP) is a congenital disability of the craniofacial region that affects the speech production system. This work focuses on modifying the misarticulations of unvoiced stop consonants in CLP speech. Three types of misarticulation are studied: glottal, palatal, and velar stop substitutions. Stop consonants are misarticulated because velopharyngeal dysfunction and oro-nasal fistula prevent adequate buildup of intra-oral pressure. Misarticulated stops degrade speech intelligibility and quality, which in turn hinders the use of speech-based applications. The misarticulated stops are analyzed and modified using speech data collected from 60 Kannada-speaking children (normal and CLP). An event-based modification approach is used to correct them. First, burst onset and vowel onset events are detected automatically. Then, the region from the vowel onset to 20 ms into the vowel is extracted. Next, the region from the burst onset point to 20 ms into the vowel is defined as the region for modification and is transformed using nonnegative matrix factorization (NMF). Objective and subjective evaluations show that the proposed event-based transformation provides a relative improvement over entire-word modification (the signal processed without knowledge of the burst and vowel onset events). The event-based transformed stops show close similarity to normal stops in perceptual quality. The improved accuracy obtained for the modified stops suggests that speech distortion is minimized.
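
The event-based steps in the abstract can be illustrated with a minimal sketch. It assumes the burst onset and vowel onset times are already known and only shows how the modification region could be delimited and how a nonnegative spectrogram might be factorized. The function names `modification_region` and `nmf`, the rounding rule, the multiplicative-update rule, and all parameter values are assumptions made for illustration. The abstract does not specify the onset detector, the NMF variant, or how the factors are mapped toward normal speech, so none of this should be read as the authors' implementation.

```python
import numpy as np


def modification_region(burst_onset_s, vowel_onset_s, sample_rate,
                        vowel_span_s=0.020):
    """Return (start, end) sample indices of the region to modify.

    The region runs from the burst onset to 20 ms into the vowel,
    as described in the abstract. Onset times are in seconds.
    """
    if vowel_onset_s < burst_onset_s:
        raise ValueError("vowel onset must not precede burst onset")
    start = int(round(burst_onset_s * sample_rate))
    end = int(round((vowel_onset_s + vowel_span_s) * sample_rate))
    return start, end


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumed
    choice; the abstract does not name the update rule).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


if __name__ == "__main__":
    # Hypothetical onset times for one word sampled at 16 kHz.
    start, end = modification_region(0.100, 0.130, 16000)
    print("region to modify (samples):", start, end)

    # Stand-in for a magnitude spectrogram of that region.
    rng = np.random.default_rng(1)
    V = rng.random((8, 2)) @ rng.random((2, 10))
    W, H = nmf(V, rank=2)
    print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

Restricting the factorization to the samples between `start` and `end`, rather than to the whole word, is what distinguishes the event-based approach from the entire-word modification that the abstract uses as its baseline.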




Updated: 2021-02-21