Social Reminiscence in Older Adults' Everyday Conversations: Automated Detection Using Natural Language Processing and Machine Learning.
Journal of Medical Internet Research (IF 5.8). Pub Date: 2020-09-15. DOI: 10.2196/19133
Andrea Ferrario 1 , Burcu Demiray 2, 3, 4 , Kristina Yordanova 5, 6, 7 , Minxia Luo 2, 3 , Mike Martin 2, 3, 4, 8

Background: Reminiscence is the act of thinking or talking about personal experiences that occurred in the past. It is a central task of old age that is essential for healthy aging, and it serves multiple functions, such as decision-making and introspection, transmitting life lessons, and bonding with others. The study of social reminiscence behavior in everyday life can be used to generate data and detect reminiscence from general conversations.

Objective: The aims of this original paper are to (1) preprocess coded transcripts of older adults' conversations in German with natural language processing (NLP), and (2) implement and evaluate learning strategies using different NLP features and machine learning algorithms to detect reminiscence in a corpus of transcripts.

Methods: The methods in this study comprise (1) collecting and coding transcripts of older adults' conversations in German, (2) preprocessing transcripts to generate NLP features (bag-of-words models, part-of-speech tags, pretrained German word embeddings), and (3) training machine learning models to detect reminiscence using random forests, support vector machines, and adaptive and extreme gradient boosting algorithms. The data set comprises 2214 transcripts, including 109 transcripts (4.9%) with reminiscence. Due to class imbalance in the data, we introduced three learning strategies: (1) class-weighted learning, (2) a meta-classifier consisting of a voting ensemble, and (3) data augmentation with the Synthetic Minority Oversampling Technique (SMOTE) algorithm. For each learning strategy, we performed cross-validation on a random sample of the training data set of transcripts. On the test data, we computed the area under the curve (AUC), average precision (AP), precision, recall, F1 score, and specificity for all combinations of NLP features, algorithms, and learning strategies.
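
To make the class imbalance concrete, the following minimal Python sketch illustrates two of the strategies named in the Methods: class weighting and SMOTE-style oversampling. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not say which weighting scheme was used; the sketch assumes the common "balanced" heuristic, n_samples / (n_classes * class_count). The helper names (`balanced_class_weights`, `smote_sample`) and the toy 2-D feature vectors are hypothetical; only the counts 2214 and 109 come from the abstract.

```python
import random


def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count) (assumed scheme)."""
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point on the segment between a minority sample
    and one of its k nearest minority-class neighbours (the core SMOTE step)."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to the base point.
    others = sorted(
        (p for p in minority if p is not base),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
    )
    neighbour = rng.choice(others[:k])
    u = rng.random()  # interpolation factor in [0, 1)
    return [a + u * (b - a) for a, b in zip(base, neighbour)]


# Counts from the abstract: 2214 transcripts, 109 of them with reminiscence.
counts = {"reminiscence": 109, "other": 2214 - 109}
weights = balanced_class_weights(counts)

# Hypothetical toy feature vectors standing in for minority-class transcripts.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sample(minority, k=2, rng=random.Random(42))
```

Under this assumed scheme, an error on a reminiscence transcript is weighted roughly 19 times as heavily as an error on any other transcript, which is what lets a classifier avoid the trivial "never reminiscence" solution. The SMOTE step instead adds interpolated minority examples to the training data and leaves the loss unweighted.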
Results: Class-weighted support vector machines on bag-of-words features outperformed all other classifiers (AUC=0.91, AP=0.56, precision=0.50, recall=0.45, F1=0.48, specificity=0.98), followed by support vector machines on SMOTE-augmented data and word embedding features (AUC=0.89, AP=0.54, precision=0.35, recall=0.59, F1=0.44, specificity=0.94). For the meta-classifier strategy, adaptive and extreme gradient boosting algorithms trained on word embeddings and bag-of-words outperformed all other classifiers and NLP features; however, the performance of the meta-classifier learning strategy was lower than that of the other strategies, with highly imbalanced precision-recall trade-offs.

Conclusions: This study provides evidence of the applicability of NLP and machine learning pipelines for the automated detection of reminiscence in older adults' everyday conversations in German. The methods and findings of this study could be relevant for designing unobtrusive computer systems for the real-time detection of social reminiscence in the everyday life of older adults and classifying their functions. With further improvements, these systems could be deployed in health interventions aimed at improving older adults' well-being by promoting self-reflection and suggesting coping strategies for cases of dysfunctional reminiscence, which can undermine physical and mental health.
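
As a quick sanity check on the reported figures, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two values given for each classifier. The sketch below does only that; the function name `f1_score` and the dictionary layout are illustrative choices, and the numbers are the rounded values quoted in the Results.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values as quoted in the Results paragraph.
reported = {
    "class-weighted SVM, bag-of-words": {"precision": 0.50, "recall": 0.45, "f1": 0.48},
    "SVM, SMOTE + word embeddings": {"precision": 0.35, "recall": 0.59, "f1": 0.44},
}

recomputed = {
    name: f1_score(m["precision"], m["recall"]) for name, m in reported.items()
}

for name, m in reported.items():
    print(f"{name}: reported F1={m['f1']:.2f}, recomputed F1={recomputed[name]:.3f}")
```

The recomputed values are about 0.47 and 0.44. The small gap to the reported 0.48 for the first classifier is consistent with precision and recall having been rounded to two decimals before publication, though the abstract alone cannot confirm this.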



Updated: 2020-09-15