Towards Contrastive Context-Aware Conversational Emotion Recognition
IEEE Transactions on Affective Computing (IF 11.2), Pub Date: 2022-10-10, DOI: 10.1109/taffc.2022.3212994
Hanqing Zhang, Dawei Song
Conversational Emotion Recognition (CER) aims to classify the emotion of each utterance in a conversation. The emotion of a target utterance is jointly determined by multiple factors within its conversational context, such as conversation topics, emotion labels and intra/inter-speaker influences. An important research question then arises: can current CER models sufficiently capture the effects of these contextual factors? To answer this question, we carry out an empirical study on four representative CER models using a context-replacement methodology. The results suggest that these models either exhibit a label-copying effect or rely heavily on the intra/inter-speaker dependency structure within the conversation, but do not make good use of the semantics carried by the conversational context. Thus, there is a high risk that they overfit to certain single factors while lacking a holistic understanding of the semantic context. To tackle this problem, we propose a semantic-guided contrastive context-aware CER method, namely C3ER, to augment/regularize a backbone CER model, which can be any neural CER framework. Specifically, C3ER takes the hidden states of utterances from the CER model as input, extracts contrast pairs consisting of utterances that are relevant and irrelevant to the conversational context of a target utterance, and uses contrastive learning to establish a soft semantic constraint between the target utterance and its context. It is then jointly trained with the main CER model, forcing the model to gain a semantic understanding of the context. Extensive experimental results show that C3ER can significantly boost the accuracy and improve the robustness of the representative CER models.
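The abstract does not give the exact loss used by C3ER, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes an InfoNCE-style contrastive objective, a mean-pooled context summary as the anchor, cosine similarity as the scoring function, and a simple weighted sum for joint training. The names `contrastive_loss`, `joint_loss`, `temperature` and `weight` are hypothetical and do not come from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity, with a small epsilon to avoid division by zero."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + 1e-12)


def mean_pool(vectors):
    """Average a list of hidden-state vectors into one context summary."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def contrastive_loss(context_states, relevant, irrelevant, temperature=0.1):
    """InfoNCE-style loss (an assumption, not the paper's stated formula).

    context_states: hidden states of the context utterances
    relevant:       hidden state of an utterance relevant to the context
    irrelevant:     hidden states of utterances irrelevant to the context
    """
    anchor = mean_pool(context_states)
    logits = [cosine(anchor, relevant) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in irrelevant]
    # Numerically stable log-sum-exp over the positive and all negatives.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


def joint_loss(classification_loss, contrast_loss, weight=0.5):
    """Weighted sum used for joint training (the weighting is hypothetical)."""
    return classification_loss + weight * contrast_loss


if __name__ == "__main__":
    # Toy 2-dimensional hidden states standing in for backbone outputs.
    context = [[1.0, 0.0], [0.9, 0.1]]
    aligned = contrastive_loss(context, [1.0, 0.0], [[0.0, 1.0]])
    misaligned = contrastive_loss(context, [0.0, 1.0], [[1.0, 0.0]])
    print("relevant utterance close to context:", aligned)
    print("relevant utterance far from context:", misaligned)
    print("joint loss:", joint_loss(1.2, aligned))
```

In this sketch the loss is small when the relevant utterance lies close to the context summary and large when an irrelevant utterance lies closer, which is one way to read the "soft semantic constraint" described above. In practice the hidden states would come from the backbone CER model and the loss would be computed with an autodiff framework so that gradients flow back into the backbone.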
Updated: 2022-10-10