An information set-based robust text-independent speaker authentication
Soft Computing (IF 3.1), Pub Date: 2019-08-14, DOI: 10.1007/s00500-019-04277-9
Jeevan Medikonda, Saurabh Bhardwaj, Hanmandlu Madasu

Abstract

This paper presents a method for extracting twofold information set (TFIS) features for text-independent speaker recognition. The method takes the Mel frequency cepstral coefficients from the frames of a sample speech signal and arranges them in a matrix. From this matrix, spatial and temporal information components are derived using the information set concept within an entropy framework. The TFIS features, which combine these two components, are fewer in number, which reduces computation time and complexity and improves performance in noisy environments. The proposed approach is tested on three datasets, namely NIST-2003, the VoxForge 2014 speech corpus and the VCTK speech corpus, in terms of speed, computational complexity, memory requirement and accuracy. Its performance is validated in different noisy environments at different signal-to-noise ratios.
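The abstract does not give the formulas behind the two components, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that the MFCC matrix has already been computed (frames as rows, coefficients as columns), that an "information value" is a coefficient multiplied by a Gaussian membership grade, and that summing along frames or along coefficients yields the temporal and spatial components. The function names `gaussian_membership`, `information_sum` and `twofold_features` are hypothetical, and the Gaussian membership stands in for whatever entropy gain function the paper actually uses.

```python
import math


def gaussian_membership(values):
    """Gaussian membership grade of each value, relative to the
    mean and variance of the group it belongs to (an assumption)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        # All values identical: give every value full membership.
        return [1.0] * n
    return [math.exp(-((v - mean) ** 2) / (2.0 * var)) for v in values]


def information_sum(values):
    """Sum of value * membership grade over one group of values."""
    grades = gaussian_membership(values)
    return sum(v * g for v, g in zip(values, grades))


def twofold_features(mfcc):
    """Return (spatial, temporal) components for an MFCC matrix.

    mfcc: list of frames, each frame a list of coefficients.
    spatial:  one value per frame, aggregated across coefficients.
    temporal: one value per coefficient, aggregated across frames.
    """
    if not mfcc or not mfcc[0]:
        raise ValueError("mfcc must be a non-empty frames x coefficients matrix")
    n_coeffs = len(mfcc[0])
    if any(len(frame) != n_coeffs for frame in mfcc):
        raise ValueError("all frames must have the same number of coefficients")

    spatial = [information_sum(frame) for frame in mfcc]
    temporal = [
        information_sum([frame[j] for frame in mfcc]) for j in range(n_coeffs)
    ]
    return spatial, temporal


if __name__ == "__main__":
    # Toy 3-frame, 2-coefficient matrix; real MFCC matrices are much larger.
    toy = [[1.0, 0.5], [1.2, 0.4], [0.9, 0.6]]
    spatial, temporal = twofold_features(toy)
    print(len(spatial), len(temporal))
```

In this sketch the temporal component has one entry per coefficient regardless of utterance length, which is one way a feature set could stay small; how the paper combines the two components into the final TFIS vector is not stated in the abstract.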




Updated: 2020-03-20