A classification-based fuzzy-rules proxy model to assist in the full model selection problem in high volume datasets, Journal of Experimental & Theoretical Artificial Intelligence

Journal of Experimental & Theoretical Artificial Intelligence (IF 1.7), Pub Date: 2021-06-18, DOI: 10.1080/0952813x.2021.1925972
Angel Díaz-Pacheco, Carlos Alberto Reyes-Garcia

ABSTRACT

Improving classifier accuracy is a crucial topic in machine learning. The problem has been addressed both by designing new algorithms and by selecting the fittest classifier for a given dataset. The latter approach, combined with feature selection and pre-processing, forms a new paradigm known as Full Model Selection. This paradigm works like a black box whose input is a dataset and whose output is an accurate classification model. Despite this, full model selection is not the first choice for today's larger datasets. We propose using MapReduce to deal with huge datasets, together with a bio-inspired optimisation algorithm and a novel algorithm based on fuzzy classification rules that serves as a proxy model to guide the optimisation process. To the best of our knowledge, this work is the first to propose a classification algorithm based on fuzzy rules as a proxy model. The results showed an accuracy improvement and a considerable reduction in computing time on datasets covering a wide range of sizes.
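
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of a classification-based proxy guiding an optimiser. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the two-dimensional search space, the synthetic `expensive_fitness` function (a stand-in for training and validating a full model), the one-rule-per-class `FuzzyRuleProxy` with triangular memberships, and the simple mutation-based search that stands in for the bio-inspired optimiser. The MapReduce part is not covered.

```python
import random


def expensive_fitness(candidate):
    """Hypothetical stand-in for training and validating a full model."""
    x, y = candidate
    return 1.0 - ((x - 0.7) ** 2 + (y - 0.3) ** 2)


def triangular(value, center, width):
    """Triangular membership: 1 at the center, 0 at distance >= width."""
    return max(0.0, 1.0 - abs(value - center) / width)


class FuzzyRuleProxy:
    """Toy proxy: one fuzzy rule per class, centred on that class's centroid."""

    def __init__(self, width=0.5):
        self.width = width
        self.centers = {}

    def fit(self, candidates, labels):
        self.centers = {}
        # Sorted so that tie-breaking in predict() is deterministic.
        for label in sorted(set(labels)):
            members = [c for c, l in zip(candidates, labels) if l == label]
            self.centers[label] = tuple(
                sum(dim) / len(members) for dim in zip(*members)
            )

    def predict(self, candidate):
        # Rule firing strength = minimum membership over features (fuzzy AND).
        def strength(label):
            return min(
                triangular(v, c, self.width)
                for v, c in zip(candidate, self.centers[label])
            )

        return max(self.centers, key=strength)


def proxy_guided_search(generations=20, pop_size=10, seed=0):
    """Mutation-based search that only pays for candidates the proxy likes."""
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    scored = [(expensive_fitness(c), c) for c in population]
    evaluations = len(scored)
    proxy = FuzzyRuleProxy()

    for _ in range(generations):
        scored.sort(reverse=True)
        # Label every evaluated candidate relative to the median fitness.
        threshold = scored[len(scored) // 2][0]
        labels = ["promising" if f >= threshold else "poor" for f, _ in scored]
        proxy.fit([c for _, c in scored], labels)

        for _, parent in scored[:pop_size]:
            child = tuple(
                min(1.0, max(0.0, v + rng.gauss(0.0, 0.1))) for v in parent
            )
            # The cheap proxy filters candidates before the costly evaluation.
            if proxy.predict(child) == "promising":
                scored.append((expensive_fitness(child), child))
                evaluations += 1

    best_fitness, best = max(scored)
    return best, best_fitness, evaluations
```

Calling `proxy_guided_search()` returns the best candidate found, its fitness, and the number of expensive evaluations actually performed. In this sketch the saving comes from skipping every child the proxy labels "poor"; how the paper's proxy is trained and applied may differ.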

Updated: 2021-06-18
