Multi-label Arabic text classification in Online Social Networks
Information Systems ( IF 3.0 ) Pub Date : 2021-04-10 , DOI: 10.1016/j.is.2021.101785
Ahmed Omar , Tarek M. Mahmoud , Tarek Abd-El-Hafeez , Ahmed Mahfouz

Online Social Networks (OSNs) are the most popular interactive media for communicating, posting, and sharing unlimited amounts of personal information. However, alongside interesting and attractive content, some users dislike having their personal pages filled with topics outside their interests, and they do not wish to see discouraging negative posts that may appear repeatedly. People also sometimes post inappropriate or abusive content on these networks, such as insults or pornography. Most efforts in the field of text classification have focused on the English language, while research on the Arabic language, which poses numerous challenges, is scarce. In this paper, we constructed a standard multi-label Arabic dataset using manual annotation and a semi-supervised annotation technique; the dataset can be used for short text classification, sentiment analysis, and multi-label classification. We then evaluated topic classification, sentiment analysis, and multi-label classification. Based on that evaluation, we found a relationship between topics published in OSNs and hate speech. The experimental results validate the effectiveness of the proposed technique.




Updated: 2021-04-19