A novel methodology for Arabic news classification
WIREs Data Mining and Knowledge Discovery (IF 7.8), Pub Date: 2021-12-06, DOI: 10.1002/widm.1440
Marco Alfonse, Mariam Gawich

Automated news classification assigns news items to one or more predefined categories. Automatically classified news helps search engines mine and categorize the type of news a user requests. Most researchers have focused on classifying English news and have neglected Arabic news because of the complexity of Arabic morphology. This article presents a novel methodology for classifying Arabic news. It relies on feature extraction followed by the application of machine learning classifiers: Naive Bayes (NB), Logistic Regression (LR), Random Forest (RF), eXtreme Gradient Boosting (XGB), K-Nearest Neighbors (KNN), Stochastic Gradient Descent (SGD), Decision Tree (DT), and Multi-Layer Perceptron (MLP). The methodology is applied to the Arabic news dataset provided by Mendeley. The classification accuracy exceeds 95%.
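The abstract names the two stages (feature extraction, then a classifier) without giving details, so the following is only a minimal sketch of that general shape, not the authors' pipeline. It assumes bag-of-words token counts as the features and a multinomial Naive Bayes classifier with Laplace smoothing. The function and class names, the category labels, and the four toy headlines are all hypothetical and are not taken from the Mendeley dataset.

```python
import math
from collections import Counter, defaultdict


def extract_features(text):
    """Assumed feature extraction: whitespace tokens mapped to their counts."""
    return Counter(text.split())


class TinyMultinomialNB:
    """Hypothetical minimal multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)        # documents per category
        self.token_counts = defaultdict(Counter)   # token counts per category
        self.vocab = set()
        for doc, label in zip(docs, labels):
            feats = extract_features(doc)
            self.token_counts[label].update(feats)
            self.vocab.update(feats)
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        feats = extract_features(doc)
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior of the category.
            score = math.log(n_docs / self.total_docs)
            total = sum(self.token_counts[label].values())
            for token, count in feats.items():
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                # Smoothed log likelihood of the token given the category.
                p = (self.token_counts[label][token] + self.alpha) / (
                    total + self.alpha * vocab_size
                )
                score += count * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up headlines used only to exercise the code.
train_docs = [
    "فاز الفريق في المباراة",
    "سجل اللاعب هدفا في المباراة",
    "ارتفعت أسعار النفط في السوق",
    "انخفضت أسعار الأسهم في السوق",
]
train_labels = ["sports", "sports", "economy", "economy"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict("المباراة الفريق"))  # sports
print(model.predict("أسعار السوق"))      # economy
```

Whitespace tokenization is a deliberate simplification here. Because Arabic morphology is complex, a realistic pipeline would likely need normalization or stemming before counting tokens, and any of the other classifiers listed in the abstract could take the place of the Naive Bayes step.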

Updated: 2021-12-06