A novel imbalanced data classification approach for suicidal ideation detection on social media
Computing (IF 3.7), Pub Date: 2021-08-17, DOI: 10.1007/s00607-021-00984-0
Mohamed Ali Ben Hassine 1 , Safa Abdellatif 2 , Sadok Ben Yahia 2

Suicide has become a serious social health issue in modern society. Suicidal ideation refers to a person's thoughts about committing or planning suicide. Many factors, such as long-term exposure to negative feelings or life events, can lead to suicidal ideation and suicide attempts. Among approaches to suicide prevention, early detection of suicidal ideation is one of the most effective. Social networking services give people a platform to express the suffering and feelings they experience in the real world, which in turn provides a data source for investigating models and approaches that detect suicidal intent and so enable prevention. This paper addresses the early detection of suicidal ideation by applying associative classification to Twitter data. Because tweets expressing suicidal intent are a tiny fraction of all tweets, the task is an imbalanced classification problem in which the minority class (suicide intention) matters more than the majority class (no suicide intention). In this setting, classical classifiers usually perform poorly on the minority class: they readily discover rules that predict the majority class and overlook rules for the minority class. This paper contributes to this line of research by introducing a new interestingness measure that enhances the classification process and highlights both classes regardless of their imbalanced distribution. Experiments show that the adapted CBA (Classification Based on Associations) outperforms the original CBA, as well as other baseline classification approaches, in prediction accuracy.
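
To make the imbalance problem concrete, the following minimal Python sketch computes the standard support and confidence of two class association rules on a made-up toy dataset. The data, the function names, and the per-class "class_support" score are illustrative assumptions only; the abstract does not define the paper's interestingness measure, and this sketch does not reproduce it.

```python
# Toy, invented data: each record is (set of tokens, class label).
# Only 2 of 20 records are in the minority class, mimicking the imbalance.
def make_toy_data():
    data = []
    data += [({"hopeless", "alone"}, "ideation")] * 2
    data += [({"happy", "weekend"}, "none")] * 15
    data += [({"alone", "weekend"}, "none")] * 3
    return data


def rule_stats(data, antecedent, label):
    """Score the rule `antecedent -> label` on the dataset."""
    n = len(data)
    # Labels of all records that contain every token of the antecedent.
    covered = [lbl for items, lbl in data if antecedent <= items]
    hits = sum(1 for lbl in covered if lbl == label)
    class_size = sum(1 for _, lbl in data if lbl == label)
    return {
        # Standard support: fraction of ALL records matching the rule.
        "support": hits / n,
        # Standard confidence: fraction of covered records with the label.
        "confidence": hits / len(covered) if covered else 0.0,
        # Hypothetical class-relative score (NOT the paper's measure):
        # fraction of the rule's own class that the rule covers.
        "class_support": hits / class_size if class_size else 0.0,
    }


data = make_toy_data()
minority_rule = rule_stats(data, {"hopeless"}, "ideation")
majority_rule = rule_stats(data, {"happy"}, "none")
print("minority rule:", minority_rule)
print("majority rule:", majority_rule)
```

In this toy data both rules have confidence 1.0, yet the minority rule has a support of only 2/20, so a global minimum-support threshold of, say, 0.2 would prune it while keeping the majority rule. A score normalized by class size puts the two rules on a comparable scale, which is the kind of behavior the abstract attributes to its new measure.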




Updated: 2021-08-19