Discriminative Few Shot Learning of Facial Dynamics in Interview Videos for Autism Trait Classification
IEEE Transactions on Affective Computing (IF 9.6), Pub Date: 2022-05-30, DOI: 10.1109/taffc.2022.3178946
Na Zhang, Mindi Ruan, Shuo Wang, Lynn Paul, Xin Li
Autism is a prevalent neurodevelopmental disorder characterized by impairments in social and communicative behaviors. Possible connections between autism and facial expression recognition have recently been studied in the literature. However, most existing works are based on facial images or short videos; few target Autism Diagnostic Observation Schedule (ADOS) videos, due to their complexity (e.g., interaction between interviewer and interviewee) and length (they usually last for hours). In this paper, we attempt to fill this gap by developing a novel discriminative few-shot learning method to analyze hour-long video data and by exploring the fusion of facial dynamics for trait classification of autism spectrum disorder (ASD). Leveraging well-established computer vision tools, from spatio-temporal feature extraction and marginal Fisher analysis to few-shot learning and scene-level fusion, we construct a three-category system that classifies an individual as Autism, Autism Spectrum, or Non-Spectrum. For the first time, we show that certain interview scenes carry more discriminative information for ASD trait classification than others. Experimental results demonstrate the potential of the proposed automatic ASD trait classification system, which achieves 91.72% accuracy on the Caltech ADOS video dataset, and extensive ablation studies confirm the benefits of the few-shot learning and scene-level fusion strategies.
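
To make the pipeline concrete, below is a minimal Python sketch of the classification stage as we read it from the abstract: per-scene facial-dynamics features are projected into a discriminative subspace, compared against class prototypes in a few-shot fashion, and the per-scene scores are fused into one interview-level decision. This is not the authors' implementation: scikit-learn's linear discriminant analysis stands in for marginal Fisher analysis, a nearest-prototype rule stands in for the paper's few-shot learner, and all names, feature dimensions, and scene weights are hypothetical.

# Illustrative sketch only; assumes per-scene facial-dynamics feature
# vectors have already been extracted from the interview video.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

CLASSES = ["Autism", "Autism Spectrum", "Non-Spectrum"]

def fit_prototypes(support_feats, support_labels, reducer):
    """Project support examples, then average them into one prototype per class."""
    z = reducer.transform(support_feats)
    return np.stack([z[support_labels == c].mean(axis=0)
                     for c in range(len(CLASSES))])

def scene_scores(scene_feats, prototypes, reducer):
    """Negative distance to each class prototype; one score row per scene."""
    z = reducer.transform(scene_feats)
    d = np.linalg.norm(z[:, None, :] - prototypes[None, :, :], axis=-1)
    return -d  # higher score = closer to that class

def classify_interview(scene_feats, scene_weights, prototypes, reducer):
    """Scene-level fusion: weight per-scene scores, then pick the best class."""
    s = scene_scores(scene_feats, prototypes, reducer)
    fused = (scene_weights[:, None] * s).sum(axis=0)
    return CLASSES[int(fused.argmax())]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in data: 30 support clips with 64-d features, 3 classes.
    X = (rng.normal(size=(30, 64))
         + np.repeat(np.eye(3), 10, axis=0) @ rng.normal(size=(3, 64)))
    y = np.repeat(np.arange(3), 10)
    # LDA stands in for marginal Fisher analysis as the discriminative projection.
    reducer = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
    protos = fit_prototypes(X, y, reducer)
    # One "interview" with 5 scenes; more discriminative scenes get larger
    # weights (weights here are made up for illustration).
    scenes = rng.normal(size=(5, 64))
    w = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
    print(classify_interview(scenes, w, protos, reducer))

The weighted sum mirrors the abstract's observation that some interview scenes are more informative than others: in this sketch, scene weights simply scale each scene's contribution before the fused decision is taken.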

Updated: 2024-08-26