IMPROVE THE AUTOMATIC CLASSIFICATION ACCURACY FOR ARABIC TWEETS USING ENSEMBLE METHODS
Journal of Electrical Systems and Information Technology Pub Date : 2018-12-01 , DOI: 10.1016/j.jesit.2018.03.001
Hammam M. Abdelaal , Ahmed N. Elmahdy , Ali A. Halawa , Hassan A. Youness

Abstract Tweet classification has become a topic of interest in recent years, especially for the Arabic language. In this paper, Arabic tweets are classified automatically into one of several predetermined categories, mainly sport, culture, politics, technology, and general, based on their linguistic characteristics and content. The classification accuracy for Arabic tweets is then improved using ensemble methods, mainly bagging, boosting, and stacking, applied to the same dataset the authors used in their earlier classification work, in order to verify those results and identify the classifier that gives the highest accuracy. The experimental results show that ensemble methods achieve better classification accuracy than an individual classifier. Compared with using NB, SMO, or J48 as a single classifier, accuracy increased by 1.6% for Naive Bayes (NB), by 2.2% for Sequential Minimal Optimization (SMO), and by up to 3.2% for the Decision Tree (J48) classifier.
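To make the bagging idea from the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation or toolchain. The toy English "tweets", the three labels, and all function names (`train_nb`, `predict_nb`, `train_bagging`, `predict_bagging`) are hypothetical stand-ins chosen for illustration. Several Naive Bayes models are trained on bootstrap resamples of the training set and combined by majority vote.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy data: English stand-ins for labelled Arabic tweets.
TOY_TWEETS = [
    ("team wins match".split(), "sport"),
    ("team match final".split(), "sport"),
    ("match team goal".split(), "sport"),
    ("election vote result".split(), "politics"),
    ("minister vote law".split(), "politics"),
    ("election law debate".split(), "politics"),
    ("phone software update".split(), "technology"),
    ("chip phone launch".split(), "technology"),
    ("software chip release".split(), "technology"),
]

def train_nb(docs):
    """Fit a multinomial Naive Bayes model on (tokens, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, len(docs)

def predict_nb(model, tokens):
    """Return the label with the highest log-posterior for the tokens."""
    class_counts, word_counts, vocab, n_docs = model
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / n_docs)
        for tok in tokens:
            # Laplace (add-one) smoothing handles unseen words.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def train_bagging(docs, n_models=15, seed=0):
    """Train one NB model per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [train_nb([rng.choice(docs) for _ in docs]) for _ in range(n_models)]

def predict_bagging(models, tokens):
    """Combine the base models by majority vote."""
    votes = Counter(predict_nb(model, tokens) for model in models)
    return votes.most_common(1)[0][0]

models = train_bagging(TOY_TWEETS)
print(predict_bagging(models, "team match".split()))
```

Boosting and stacking follow the same pattern of combining base classifiers, but boosting reweights the training examples between rounds and stacking trains a second-level model on the base models' predictions. Neither is shown here.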

Updated: 2018-12-01