Robust vowel region detection method for multimode speech
Multimedia Tools and Applications ( IF 3.6 ) Pub Date : 2021-01-16 , DOI: 10.1007/s11042-020-10394-7
Kumud Tripathi , K. Sreenivasa Rao

This paper explores a robust method for detecting vowel regions in multimode speech. In realistic scenarios, speech can be classified into three modes: conversation, extempore, and read. The existing method detects vowels from speech recorded in a clean environment, which may not be appropriate for multimode speech tasks. To address this issue, we propose an approach based on continuous wavelet transform coefficients and phone boundaries for detecting vowel regions in different modes of the speech signal. The proposed vowel region (VR) detection technique is evaluated on the TIMIT (read speech) and Bengali (read, extempore, and conversation speech) corpora and compared with state-of-the-art methods. The experiments show a significant performance gain for the proposed technique over the state-of-the-art methods. The efficiency of the proposed technique is shown by extracting vocal tract and excitation source features from the automatically detected VRs to develop a multilingual speech mode classification (MSMC) model. The evaluation results show that the performance of the MSMC model improves significantly when features are extracted from the vowel regions rather than from the entire speech utterance.




Updated: 2021-01-18