End-to-End Listening Agent for Audiovisual Emotional and Naturalistic Interactions
Journal of Science and Technology of the Arts (IF 0.2) Pub Date: 2018-11-08, DOI: 10.7559/citarj.v10i2.424
Kevin El Haddad, Yara Rizk, Louise Heron, Nadine Hajj, Yong Zhao, Jaebok Kim, Trung Ngô Trọng, Minha Lee, Marwan Doumit, Payton Lin, Yelin Kim, Hüseyin Çakmak

In this work, we lay the foundations of a framework for building an end-to-end, naturalistic, expressive listening agent. The project was split into four modules: recognition of the user’s paralinguistic and nonverbal expressions, prediction of the agent’s reactions, synthesis of the agent’s expressions, and recording of nonverbal conversational expressions. First, a multimodal, multitask, deep learning-based emotion classification system was built, along with a rule-based visual expression detection system. Then, several sequence prediction systems for nonverbal expressions were implemented and compared. An audiovisual, concatenation-based synthesis system was also implemented. Finally, a naturalistic, dyadic emotional conversation database was collected. We report here the work carried out for each of these modules and our planned future improvements.

Updated: 2018-11-08