Accounting for Variations in Speech Emotion Recognition with Nonparametric Hierarchical Neural Network
arXiv - CS - Human-Computer Interaction Pub Date : 2021-09-09 , DOI: arxiv-2109.04316
Lance Ying, Amrit Romana, Emily Mower Provost

In recent years, deep-learning-based speech emotion recognition models have outperformed classical machine learning models. Previously, neural network designs such as Multitask Learning have accounted for variations in emotional expression due to demographic and contextual factors. However, existing models face a few constraints: 1) they rely on a clear definition of domains (e.g., gender, noise condition) and the availability of domain labels; 2) they often attempt to learn domain-invariant features even though emotional expression can be domain-specific. In the present study, we propose the Nonparametric Hierarchical Neural Network (NHNN), a lightweight hierarchical neural network model based on Bayesian nonparametric clustering. In comparison to Multitask Learning approaches, the proposed model does not require domain/task labels. In our experiments, the NHNN models generally outperform models of similar complexity as well as state-of-the-art models in within-corpus and cross-corpus tests. Through clustering analysis, we show that the NHNN models are able to learn group-specific features and bridge the performance gap between groups.
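The sketch below is not the authors' NHNN implementation; it is a minimal illustration, under assumed details, of the general idea the abstract describes: a Bayesian nonparametric (Dirichlet-process) clustering step discovers groups from features without domain labels, and group-specific heads are then trained on each discovered cluster. `BayesianGaussianMixture` and `MLPClassifier` from scikit-learn serve as stand-ins for the paper's components, and the features and emotion labels are randomly generated placeholders.

```python
# Illustrative sketch only (assumed stand-ins, not the NHNN architecture from the paper).
import numpy as np
from sklearn.mixture import BayesianGaussianMixture
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Stand-in utterance-level acoustic features and 4 hypothetical emotion classes.
X = rng.normal(size=(600, 40))
y = rng.integers(0, 4, size=600)

# 1) Shared feature transform (here just a fixed random projection for simplicity).
W = rng.normal(size=(40, 16))
H = np.tanh(X @ W)

# 2) Bayesian nonparametric clustering: a truncated Dirichlet-process mixture
#    infers which of the candidate groups are actually used, with no domain labels.
dpgmm = BayesianGaussianMixture(
    n_components=10,                     # truncation level, not the final group count
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="diag",
    max_iter=200,
    random_state=0,
)
groups = dpgmm.fit_predict(H)

# 3) Group-specific heads: one small classifier per discovered cluster.
heads = {}
for g in np.unique(groups):
    idx = groups == g
    if idx.sum() < 10:                   # skip tiny clusters in this toy setup
        continue
    heads[g] = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                             random_state=0).fit(H[idx], y[idx])

# 4) Inference: route each utterance to the head of its inferred group.
def predict(h_row):
    g = int(dpgmm.predict(h_row.reshape(1, -1))[0])
    head = heads.get(g)
    if head is None:                     # fall back to the most populated head
        head = heads[max(heads, key=lambda k: np.sum(groups == k))]
    return int(head.predict(h_row.reshape(1, -1))[0])

print("clusters used:", sorted(heads))
print("example prediction:", predict(H[0]))
```

In this toy setup, the Dirichlet-process prior lets the number of active groups be learned from the data rather than fixed in advance, which is the property that removes the need for predefined domain labels such as gender or noise condition.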

Updated: 2021-09-10