Fine-grained depression analysis based on Chinese micro-blog reviews, Information Processing & Management


Fine-grained depression analysis based on Chinese micro-blog reviews
Information Processing & Management ( IF 7.4 ) Pub Date : 2021-07-21 , DOI: 10.1016/j.ipm.2021.102681
Tingting Yang ₁ , Fei Li ₁ , Donghong Ji ₁ , Xiaohui Liang ₂ , Tian Xie ₃ , Shuwan Tian ₃ , Bobo Li ₁ , Peitong Liang ₄


Depression is a widespread and intractable problem in modern society, and it may lead to suicidal ideation and behavior. Analyzing depression or suicide based on social media posts, such as those on Twitter or Reddit, has made great progress in recent years. However, most work focuses on English social media, and depression prediction is typically formalized as a binary decision: present or absent. In this paper, we construct a dataset for depression analysis from Chinese microblog reviews, comprising 6,100 manually annotated posts. Our dataset supports two fine-grained tasks, namely depression degree prediction and depression cause prediction. The objective of the former task is to classify a microblog post into one of 5 categories based on the depression degree, while the objective of the latter is to select one or more reasons that cause the depression from 7 predefined categories. To set up a benchmark, we design a neural model for joint depression degree and cause prediction, and compare it with several widely used neural models such as TextCNN, BiLSTM and BERT. Our model outperforms the baselines and achieves up to 65+% F1 for depression degree prediction, and 70+% F1 and 90+% AUC for depression cause prediction, which shows that neural models achieve promising results but that there is still room for improvement. Our work can extend the area of social-media-based depression analysis, and our annotated data and code can also facilitate related research.
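
The two task formulations in the abstract can be illustrated with a minimal Python sketch: degree prediction as single-label classification over 5 classes, and cause prediction as multi-label classification over 7 classes. Only the class counts come from the abstract. The integer label numbering, the function names, and the choice of macro-averaged F1 for degree and micro-averaged F1 for cause are assumptions made for illustration; the abstract does not state which F1 averaging the authors report, and this is not their code.

```python
from typing import List, Sequence, Set

NUM_DEGREES = 5  # degree task: exactly one of 5 classes per post (from the abstract)
NUM_CAUSES = 7   # cause task: one or more of 7 classes per post (from the abstract)


def encode_causes(causes: Set[int]) -> List[int]:
    """Encode a set of cause indices as a 7-dimensional multi-hot vector.

    The 0..6 index numbering is hypothetical, not the dataset's scheme.
    """
    if any(c < 0 or c >= NUM_CAUSES for c in causes):
        raise ValueError("cause index out of range")
    return [1 if i in causes else 0 for i in range(NUM_CAUSES)]


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold: Sequence[int], pred: Sequence[int], num_classes: int) -> float:
    """Unweighted mean of per-class F1 for single-label predictions."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / num_classes


def micro_f1(gold: Sequence[Sequence[int]], pred: Sequence[Sequence[int]]) -> float:
    """F1 over all (post, cause) decisions pooled together, for multi-hot labels."""
    tp = fp = fn = 0
    for g_vec, p_vec in zip(gold, pred):
        for g, p in zip(g_vec, p_vec):
            if g == 1 and p == 1:
                tp += 1
            elif g == 0 and p == 1:
                fp += 1
            elif g == 1 and p == 0:
                fn += 1
    return f1_from_counts(tp, fp, fn)


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the functions.
    degree_gold = [0, 1, 2, 3, 4]
    degree_pred = [0, 1, 2, 3, 3]
    cause_gold = [encode_causes({0, 3}), encode_causes({5})]
    cause_pred = [encode_causes({0}), encode_causes({5})]
    print("degree macro-F1:", macro_f1(degree_gold, degree_pred, NUM_DEGREES))
    print("cause micro-F1:", micro_f1(cause_gold, cause_pred))
```

The multi-hot encoding is what lets a single post carry several causes at once, whereas the degree label stays a single integer per post.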


Updated: 2021-07-22
