Epoch Detection Using Hilbert Envelope for Glottal Excitation Enhancement and Maximum-Sum Subarray for Epoch Marking
IEEE Journal of Selected Topics in Signal Processing (IF 7.5), Pub Date: 2020-02-01, DOI: 10.1109/jstsp.2019.2951458
Hirak Dasgupta, Prem C. Pandey, K. S. Nataraj

A technique is presented for detecting glottal excitation epochs in the speech signal, using Hilbert envelope for enhancing the glottal excitation and maximum-sum subarray for marking the epochs. The processing comprises the steps of dynamic range compression, Hilbert envelope calculation, saliency enhancing, and epoch marking. The dynamic range compression reduces the amplitude variation of the signal. The Hilbert envelope enhances the glottal excitation. The saliency enhancing further enhances the instants of significant excitation by reducing the residual ripples related to the vocal tract filter, by using a dynamic peak detector, a nonlinear smoother, and a differentiator. The epoch marking locates the peak of the maximum-sum subarray as the instant of significant glottal excitation. Evaluation of the proposed technique showed its performance measures to be similar to those of the state-of-the-art techniques for normal speech, telephone-quality speech, and highpass filtered speech and better than or similar to them for pathological speech. The averaged accuracy-weighted identification rates with the proposed technique for normal speech, telephone-quality speech, and pathological speech were 79.46%, 77.04%, and 71.52%, respectively. The proposed technique employs single-pass processing and may find applications in speech training aids, diagnosis of speech disorders, and voice conversion of speech with voice disorders.
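The epoch-marking step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which signal or window the maximum-sum subarray is computed over, so the sketch assumes a short frame of saliency-enhanced samples containing both positive and negative values. The function names `max_sum_subarray` and `mark_epoch` are hypothetical, and Kadane's algorithm is used here as one standard way to find a maximum-sum subarray in a single pass.

```python
from typing import Sequence, Tuple


def max_sum_subarray(x: Sequence[float]) -> Tuple[float, int, int]:
    """Kadane's algorithm: return (best_sum, start, end), end inclusive."""
    if len(x) == 0:
        raise ValueError("input must be non-empty")
    best_sum = cur_sum = x[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(x)):
        if cur_sum < 0:
            # A negative running sum cannot help; restart the subarray here.
            cur_sum = x[i]
            cur_start = i
        else:
            cur_sum += x[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end


def mark_epoch(frame: Sequence[float]) -> int:
    """Return the index of the largest sample inside the maximum-sum subarray.

    Assumption: `frame` is one analysis frame of the saliency-enhanced signal;
    the framing itself is not described in the abstract.
    """
    _, start, end = max_sum_subarray(frame)
    return max(range(start, end + 1), key=lambda i: frame[i])


if __name__ == "__main__":
    frame = [-2, 1, 6, 10, 3, -5, 2]  # toy values, not real speech data
    print(max_sum_subarray(frame))
    print(mark_epoch(frame))
```

Because the search is a single left-to-right scan, it fits the single-pass processing the abstract mentions. The frame length and the handling of unvoiced segments would still have to be taken from the full paper.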

Updated: 2020-02-01