CASA-based speaker identification using cascaded GMM-CNN classifier in noisy and emotional talking conditions
Applied Soft Computing (IF 8.7), Pub Date: 2021-02-08, DOI: 10.1016/j.asoc.2021.107141
Ali Bou Nassif, Ismail Shahin, Shibani Hamsa, Nawel Nemmour, Keikichi Hirose

This work aims to improve text-independent speaker identification performance in real application conditions such as noisy and emotional talking environments. It does so by combining two modules: a Computational Auditory Scene Analysis (CASA) based pre-processing module for noise reduction, and a cascaded Gaussian Mixture Model-Convolutional Neural Network (GMM-CNN) classifier for speaker identification followed by emotion recognition. The research proposes and evaluates a novel algorithm to improve the accuracy of speaker identification in emotional and highly noise-susceptible conditions. Experiments demonstrate that the proposed model yields promising results in comparison with other classifiers when the Speech Under Simulated and Actual Stress (SUSAS) database, the Emirati Speech Database (ESD), the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), and the Fluent Speech Commands database are used in a noisy environment.



Updated: 2021-02-12