An improved Gaussian mixture hidden conditional random fields model for audio-based emotions classification
Egyptian Informatics Journal ( IF 5.0 ) Pub Date : 2020-04-15 , DOI: 10.1016/j.eij.2020.03.001
Muhammad Hameed Siddiqi

The analysis of human emotions plays a significant role in patient monitoring: information about how patients feel can support better management of their diseases. Audio-based emotion recognition has become an active research area for such domains over the last decade. The performance of audio-based emotion recognition systems depends mostly on the recognition stage. Existing models share a common issue, called the objectivity suppositions problem, which might decrease the recognition rate. This study therefore investigates an improved version of a classifier based on the hidden conditional random fields (HCRF) model for classifying emotional speech. In this model, we introduce a novel methodology that incorporates multifaceted dissemination by employing a combination of complete covariance Gaussian concreteness functions. Owing to this incorporation, the proposed model tackles most of the limitations of existing classifiers. Well-known features such as Mel-frequency cepstral coefficients (MFCC) are extracted in our experiments. The proposed model was validated and evaluated on two publicly available datasets: the Berlin Database of Emotional Speech (Emo-DB) and the eNTERFACE’05 Audio-Visual Emotion dataset. For validation and comparison against existing techniques, we used a 10-fold cross-validation scheme. The proposed method achieved a statistically significant improvement in classification (p-value < 0.03). Moreover, we show that the proposed technique is computationally less expensive than state-of-the-art works.
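
The 10-fold cross-validation scheme mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the folds were stratified by emotion or kept speaker-independent, so the plain shuffled split below is an assumption, and the names `k_fold_indices`, `cross_validate`, and the `evaluate` callback are hypothetical rather than taken from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled, disjoint, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average a score over k train/test splits.

    `evaluate(train_idx, test_idx)` is a hypothetical callback: it should
    train a classifier on train_idx and return its score on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k
```

Each utterance is used for testing exactly once and for training in the other nine rounds; the reported recognition rate would then be the mean of the ten per-fold scores.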




Updated: 2020-04-15