Variance Normalised Features for Language and Dialect Discrimination
Circuits, Systems, and Signal Processing (IF 1.8) Pub Date: 2021-01-11, DOI: 10.1007/s00034-020-01641-1
Xiaoxiao Miao, Ian McLoughlin, Yan Song

This paper proposes novel features for automated language and dialect identification that aim to improve discriminative power by ensuring that each element of the feature vector makes a normalised contribution to inter-class variance. The method first computes inter- and intra-class frequency variance statistics, then distributes the overall spectral variance across spectral regions, each sized to contain a near-equal variance difference. Spectral features are average-pooled within each region to obtain variance normalised features (VNFs). The proposed VNFs are low-complexity drop-in replacements for MFCC, SDC, PLP or other input features used for speech-related tasks. In this paper, they are evaluated against MFCCs in three types of system, on two data-constrained language and dialect identification tasks. VNFs demonstrate good results, comfortably outperforming MFCCs at most dimension sizes and performing particularly well on the most challenging data-constrained condition, the 3 s utterance length in the LID task.
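The pipeline described in the abstract (per-bin variance statistics, regions of near-equal variance, average pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact statistic, so the sketch assumes plain inter-class variance as the quantity to equalise across regions, and the function names `class_variance_profile`, `equal_variance_edges` and `pool_regions` are hypothetical.

```python
import numpy as np

def class_variance_profile(spectra, labels):
    """Per-frequency-bin inter- and intra-class variance.

    spectra: (n_frames, n_bins) array of spectral magnitudes.
    labels:  (n_frames,) array of class ids (language or dialect).
    """
    global_mean = spectra.mean(axis=0)
    inter = np.zeros(spectra.shape[1])
    intra = np.zeros(spectra.shape[1])
    for c in np.unique(labels):
        x = spectra[labels == c]
        weight = len(x) / len(spectra)          # class prior
        inter += weight * (x.mean(axis=0) - global_mean) ** 2
        intra += weight * x.var(axis=0)
    return inter, intra

def equal_variance_edges(profile, n_regions):
    """Bin edges such that each region holds a roughly equal share of `profile`.

    Assumption: `profile` is the statistic to equalise (here inter-class
    variance). Duplicate edges are merged, so a very peaked profile can
    yield fewer than `n_regions` regions.
    """
    profile = np.clip(profile, 1e-12, None)     # avoid an all-zero profile
    cum = np.cumsum(profile) / profile.sum()
    targets = np.arange(1, n_regions) / n_regions
    inner = np.searchsorted(cum, targets) + 1   # edge just after the bin reaching each target
    return np.unique(np.concatenate(([0], inner, [len(profile)])))

def pool_regions(spectra, edges):
    """Average-pool the spectral bins inside each region."""
    return np.stack(
        [spectra[:, a:b].mean(axis=1) for a, b in zip(edges[:-1], edges[1:])],
        axis=1,
    )

# Synthetic demo: random "spectra" for three classes (illustrative data only).
rng = np.random.default_rng(0)
demo_spectra = rng.random((300, 64))
demo_labels = rng.integers(0, 3, size=300)
inter_var, _ = class_variance_profile(demo_spectra, demo_labels)
demo_edges = equal_variance_edges(inter_var, n_regions=13)
demo_features = pool_regions(demo_spectra, demo_edges)  # shape (300, number of regions)
```

Regions come out narrow where the chosen statistic is concentrated and wide where it is sparse, which is the sense in which each pooled feature carries a comparable share of it. Substituting a different statistic, such as a difference or ratio of inter- to intra-class variance, only changes the `profile` argument.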




Updated: 2021-01-11