Linguistic features and automatic classifiers for identifying mild cognitive impairment and dementia, Computer Speech & Language

Linguistic features and automatic classifiers for identifying mild cognitive impairment and dementia
Computer Speech & Language (IF 3.1), Pub Date: 2020-06-29, DOI: 10.1016/j.csl.2020.101113
Laura Calzà, Gloria Gagliardi, Rema Rossini Favretti, Fabio Tamburini

In 2018, almost 50 million people worldwide were living with dementia, and the number will double every 20 years. Existing pharmacological treatments are limited to symptom control: none of them can prevent, reverse or halt the neurodegenerative process that leads to dementia. Prompt detection of the “disease signature” is therefore a key problem, both for developing and testing new drugs and for supporting the management of the clinical and domestic context. Recent studies have shown that linguistic alterations may be one of the earliest signs of the pathology, appearing years before other neurocognitive deficits become evident. Traditional tests fail to identify these slight but noticeable changes, whereas the analysis of spoken language productions with Natural Language Processing (NLP) techniques can identify minor language modifications in potential patients ecologically and inexpensively.

This interdisciplinary study aims to quantify and describe the alterations of linguistic features caused by cognitive decline and to build an automatic system for early diagnosis and screening. To this end, we enrolled 96 participants: 48 healthy controls and 48 impaired subjects. Of the latter, 32 were diagnosed with Mild Cognitive Impairment (MCI) and 16 with early Dementia (eD). Each subject underwent a brief neuropsychological screening, and samples of semi-spontaneous speech were collected by means of three elicitation tasks. The recorded sessions were orthographically transcribed, PoS-tagged and parsed, yielding two different corpora: in the first we kept the automatic annotations, while in the second the transcripts were manually corrected to remove all mistakes. A multidimensional parameter computation was then performed on the data, taking into consideration a set of 87 acoustic, rhythmic, morpho-syntactic and lexical features as well as some readability indexes and demographic information. After these preparatory steps, automatic classifiers were trained to distinguish healthy controls from MCI subjects using two different algorithms, Support Vector Classifiers (SVC) and Random Forest Classifiers (RFC). Our system distinguished controls from MCI subjects with high F1 scores, around 75%, and thus seems to be a promising approach for identifying the preclinical stages of dementia.
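To make the reported metric concrete, the following minimal sketch shows how an F1 score is computed for a binary controls-versus-MCI decision. The function name `f1_score_binary` and the label vectors are hypothetical and purely illustrative: they are not the study's data, and the sketch does not reproduce the authors' evaluation protocol.

```python
from typing import Sequence


def f1_score_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (assumption, not study data): 1 = MCI, 0 = healthy control.
y_true = [1] * 8 + [0] * 12
# The hypothetical classifier finds 6 of the 8 MCI subjects and
# wrongly flags 2 of the 12 controls.
y_pred = [1] * 6 + [0] * 2 + [1] * 2 + [0] * 10

score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.2f}")  # F1 = 0.75
```

In this toy case precision and recall are both 6/8, so the F1 score is 0.75. Because F1 ignores true negatives, it is a common choice when the groups are unbalanced, as with the 48 controls and 32 MCI subjects described above.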


Updated: 2020-06-29
