Exploiting Multi-Modal Features From Pre-trained Networks for Alzheimer's Dementia Recognition,arXiv - CS - Sound

arXiv - CS - Sound, Pub Date: 2020-09-09, DOI: arxiv-2009.04070
Junghyun Koo, Jie Hwan Lee, Jaewoo Pyo, Yujin Jo, Kyogu Lee

Collecting and accessing a large amount of medical data is very time-consuming and laborious, not only because it is difficult to find specific patients but also because the confidentiality issues surrounding a patient's medical records must be resolved. On the other hand, there are deep learning models, trained on easily collectible, large-scale datasets such as YouTube or Wikipedia, that offer useful representations. It could therefore be very advantageous to utilize the features from these pre-trained networks when only a small amount of data is at hand. In this work, we exploit various multi-modal features extracted from pre-trained networks to recognize Alzheimer's Dementia using a neural network, with a small dataset provided by the ADReSS Challenge at INTERSPEECH 2020. The challenge is to identify patients suspected of having Alzheimer's Dementia from the acoustic and textual data provided. With the multi-modal features, we modify a Convolutional Recurrent Neural Network based structure so that it performs classification and regression tasks simultaneously and can process conversations of variable length. Our test results surpass the baseline's accuracy by 18.75%, and our validation result for the regression task shows the possibility of classifying 4 classes of cognitive impairment with an accuracy of 78.70%.
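To make the idea of one network handling classification and regression together on variable-length conversations more concrete, here is a minimal sketch in plain Python. It is not the authors' model: simple mean pooling stands in for the convolutional recurrent encoder, and every name (`mean_pool`, `forward`, `joint_loss`, `reg_weight`) and every weight value is a hypothetical choice made for illustration only.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_pool(frames: Sequence[Vector]) -> Vector:
    """Collapse a variable-length sequence of feature vectors into one
    fixed-size vector (assumed stand-in for a CRNN encoder)."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def linear(x: Vector, w: Vector, b: float) -> float:
    """Dot product plus bias: a single-output linear head."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forward(frames: Sequence[Vector],
            cls_w: Vector, cls_b: float,
            reg_w: Vector, reg_b: float) -> Tuple[float, float]:
    """Shared representation feeding two heads:
    a class probability and a continuous score."""
    pooled = mean_pool(frames)
    prob = sigmoid(linear(pooled, cls_w, cls_b))   # classification head
    score = linear(pooled, reg_w, reg_b)           # regression head
    return prob, score


def joint_loss(prob: float, score: float,
               label: int, target: float,
               reg_weight: float = 1.0) -> float:
    """Binary cross-entropy plus weighted squared error.
    The weighting scheme is an assumption, not taken from the paper."""
    eps = 1e-12
    bce = -(label * math.log(prob + eps)
            + (1 - label) * math.log(1.0 - prob + eps))
    mse = (score - target) ** 2
    return bce + reg_weight * mse


if __name__ == "__main__":
    # Two conversations of different lengths, each frame a 2-d feature.
    short_talk = [[0.2, 0.4]]
    long_talk = [[0.1, 0.3], [0.5, 0.7], [0.9, 1.1]]
    for talk in (short_talk, long_talk):
        p, s = forward(talk, cls_w=[0.5, -0.5], cls_b=0.0,
                       reg_w=[1.0, 1.0], reg_b=0.0)
        print(len(talk), round(p, 4), round(s, 4))
```

Because pooling reduces every conversation to a vector of the same size, both heads can share it regardless of how many frames the input had, and a single summed loss lets one backward pass update the shared part for both tasks.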

Updated: 2020-09-10
