A machine learning perspective on the emotional content of Parkinsonian speech
Artificial Intelligence in Medicine (IF 6.1), Pub Date: 2021-04-01, DOI: 10.1016/j.artmed.2021.102061
Konstantinos Sechidis 1 , Riccardo Fusaroli 2 , Juan Rafael Orozco-Arroyave 3 , Detlef Wolf 1 , Yan-Ping Zhang 1

Patients with Parkinson's disease (PD) have distinctive voice patterns that are often perceived as expressing sadness. This characteristic of Parkinsonian speech has been supported by listener studies in which PD and healthy control (HC) subjects repeat the same speaking tasks, but it has never been explored with a machine learning modelling approach. Our work provides an objective evaluation of this characteristic of PD speech by building a transfer learning system to assess how the PD pathology affects the perception of sadness. To do so, we introduce a Mixture-of-Experts (MoE) architecture for speech emotion recognition that is designed to be transferable across datasets. First, we train the MoE model on publicly available emotional speech corpora; we then use it to quantify perceived sadness in previously unseen speech recordings from PD patients and matched HC subjects. To build our models (experts), we extracted spectral features from the voiced parts of speech and trained a gradient boosting decision trees model on each corpus to predict happiness vs. sadness. MoE predictions are created by weighting each expert's prediction according to the distance between the new sample and the expert-specific training samples. The MoE approach systematically infers more negative emotional characteristics in PD speech than in HC speech. Crucially, these judgments are related to disease severity and to the severity of speech impairment in the PD patients: the greater the impairment, the more likely the speech is to be judged as sad. Our findings pave the way towards a better understanding of the characteristics of PD speech and show how publicly available datasets can be used to train models that provide insights into clinical data.
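
The distance-based weighting of expert predictions described in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance or weighting function the authors use, so everything specific here is an assumption made for illustration: Euclidean distance to the centroid of each expert's training samples, a softmax over negative distances, and the hypothetical names `centroid`, `euclidean`, `moe_predict`, and `temperature`. The experts are constant stand-in functions, not trained gradient boosting models.

```python
import math


def centroid(rows):
    """Mean feature vector of one expert's training samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def moe_predict(sample, experts, temperature=1.0):
    """Combine expert outputs, giving closer experts larger weights.

    experts: list of (predict_fn, training_rows) pairs, where predict_fn
    maps a feature vector to a probability of "sad". Both the pairing and
    the weighting rule are assumptions, not the paper's specification.
    """
    distances = [euclidean(sample, centroid(rows)) for _, rows in experts]
    # Softmax over negative distances: smaller distance -> larger weight.
    # Shifting by the minimum distance keeps the exponentials stable.
    d_min = min(distances)
    raw = [math.exp(-(d - d_min) / temperature) for d in distances]
    total = sum(raw)
    weights = [r / total for r in raw]
    probs = [predict(sample) for predict, _ in experts]
    return sum(w * p for w, p in zip(weights, probs)), weights


# Toy usage: two stand-in experts trained on well-separated feature regions.
expert_a = (lambda x: 0.9, [[0.0, 0.0], [0.2, 0.0]])
expert_b = (lambda x: 0.1, [[5.0, 5.0], [5.2, 5.0]])
p_sad, weights = moe_predict([0.1, 0.0], [expert_a, expert_b])
```

In this toy setup the new sample lies at the centroid of the first expert's training data, so that expert dominates the combined prediction. A sample equidistant from both centroids would receive equal weights and a combined prediction halfway between the two expert outputs.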




Updated: 2021-04-19