A Multimodal Interlocutor-Modulated Attentional BLSTM for Classifying Autism Subgroups during Clinical Interviews
IEEE Journal of Selected Topics in Signal Processing ( IF 8.7 ) Pub Date : 2020-02-01 , DOI: 10.1109/jstsp.2020.2970578
Yun-Shao Lin , Susan Shur-Fen Gau , Chi-Chun Lee

The heterogeneity in Autism Spectrum Disorder (ASD) remains a challenging and unsolved issue in current clinical practice. The behavioral differences between ASD subgroups are subtle and can be hard for experts to discern manually. Here, we propose a computational framework that models both the vocal behaviors and the body gestural movements of the interlocutors, with their intricate dependency captured through a learnable interlocutor-modulated (IM) attention mechanism, during dyadic clinical interviews of the Autism Diagnostic Observation Schedule (ADOS). Specifically, our multimodal network architecture includes two modality-specific networks, a speech-IM-aBLSTM and a motion-IM-aBLSTM, that are combined in a fusion network to perform the final differentiation among three ASD subgroups: Autistic Disorder (AD) vs. High-Functioning Autism (HFA) vs. Asperger Syndrome (AS). Our model uniquely introduces the IM attention mechanism to capture the non-linear behavioral dependency between interlocutors, which is essential for improved discriminability in classifying the three subgroups. We evaluate our framework on a large ADOS collection and obtain a 66.8% unweighted average recall (UAR), 14.3% better than previous work on the same dataset. Furthermore, based on the learned attention weights, we analyze the behavior descriptors essential for differentiating subgroup pairs. We further identify the most critical self-disclosure emotion topics within the ADOS interview sessions, finding that anger and fear are the most informative interaction segments for observing the subtle interactive behavioral differences between these three sub-types of ASD.
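The interlocutor-modulated attention described above can be pictured as scoring each time step of one speaker's BLSTM hidden states jointly with the aligned hidden states of the other speaker, then pooling with the resulting weights before late fusion across modalities. Below is a minimal NumPy sketch of that data flow only; all names (`im_attention`, `fuse_and_classify`), the shared time grid between interlocutors, the concatenation-based scoring, and the random stand-in weights are illustrative assumptions, not the paper's actual learned, end-to-end-trained architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def im_attention(h_self, h_other, w):
    """Interlocutor-modulated attention (sketch).

    h_self:  (T, d) BLSTM hidden states for the target speaker
    h_other: (T, d) hidden states for the interlocutor, assumed time-aligned
    w:       (2*d,) scoring vector (learned in the real model; random here)
    """
    joint = np.concatenate([h_self, h_other], axis=1)  # (T, 2d): self + interlocutor context
    alpha = softmax(joint @ w)                         # (T,): attention over time steps
    return alpha @ h_self, alpha                       # (d,): attended summary of the target speaker

def fuse_and_classify(z_speech, z_motion, W, b):
    """Late fusion of the two modality summaries into 3 subgroup probabilities (AD/HFA/AS)."""
    z = np.concatenate([z_speech, z_motion])           # (2d,)
    return softmax(z @ W + b)                          # (3,)

# Toy hidden states for the two interlocutors in each modality.
T, d = 20, 8
h_child, h_examiner = rng.normal(size=(T, d)), rng.normal(size=(T, d))
m_child, m_examiner = rng.normal(size=(T, d)), rng.normal(size=(T, d))

z_speech, a_speech = im_attention(h_child, h_examiner, rng.normal(size=2 * d))
z_motion, a_motion = im_attention(m_child, m_examiner, rng.normal(size=2 * d))
probs = fuse_and_classify(z_speech, z_motion, rng.normal(size=(2 * d, 3)), np.zeros(3))
```

In the paper the attention weights are learned jointly with the BLSTMs and are later inspected to rank behavior descriptors; the sketch only shows why the interlocutor's states enter the scoring function.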

Updated: 2020-02-01