A comparison of Laryngeal effect in the dialects of Punjabi language
Journal of Ambient Intelligence and Humanized Computing ( IF 3.662 ) Pub Date : 2021-04-27 , DOI: 10.1007/s12652-021-03235-4
Kanika Goyal , Amitoj Singh , Virender Kadyan

Every speaker has an individual speaking style that reflects their native language. A major source of variability within a language is the varying dialects of its speakers. In Automatic Speech Recognition (ASR), a key challenge is to recognize and generate an acoustic model that represents differences in redundant acoustic features. In this paper, dialect classification is performed on the basis of tonal aspects of the laryngeal phoneme [h]. It is an empirical study of words containing the [h] sound in four major dialects of Indian Punjabi, using two key parameters: F0 variation and acoustic space, with the acoustic space calculated from the two formant frequencies F1 and F2. The results, based on the four dialects, suggest some interesting hypotheses, which are explored with a self-created dataset. Features were extracted with the speech analysis tool PRAAT, and correlations were studied using the Statistical Package for the Social Sciences (SPSS). Each variable was compared with the same variable in all other dialects. The analysis showed that the fundamental frequency of these vowels is influenced distinctly under different dialectal conditions. In addition, F1 and F2 showed a significant correlation with each spoken dialect. The work is further extended by processing acoustic information at the feature level or by comparing performance with basic and hybrid Linear Predictive Cepstral Coefficients (LPCC) feature extraction methods. The results show that the hybrid LPCC + F0 system achieved a Relative Improvement (R.I.) of 6.94% on the Subspace Gaussian Mixture Model compared with the basic LPCC approach.
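The abstract does not say how the hybrid LPCC + F0 features are built or which formula defines the Relative Improvement. As a minimal sketch only, the Python below assumes frame-level concatenation (one F0 value appended to each frame's LPCC vector) and the common higher-is-better definition of relative improvement. The function names `append_f0` and `relative_improvement` and all numbers are hypothetical illustrations, not the authors' implementation or data.

```python
from typing import List, Sequence


def append_f0(lpcc_frames: Sequence[Sequence[float]],
              f0_per_frame: Sequence[float]) -> List[List[float]]:
    """Append one F0 value to each frame's LPCC vector (assumed fusion scheme)."""
    if len(lpcc_frames) != len(f0_per_frame):
        raise ValueError("LPCC and F0 tracks must have the same number of frames")
    return [list(frame) + [f0] for frame, f0 in zip(lpcc_frames, f0_per_frame)]


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative improvement in percent for a higher-is-better score (assumed definition)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Made-up values for illustration; none of these come from the paper.
    lpcc = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
    f0 = [120.0, 0.0]  # 0.0 marks an unvoiced frame (a convention assumed here)
    print(append_f0(lpcc, f0))
    print(relative_improvement(80.0, 84.0))
```

If the reported score is an error rate rather than an accuracy, the sign of the numerator would be flipped; the abstract does not specify which applies.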




Updated: 2021-04-27