Improving Naive Bayes for Regression with Optimized Artificial Surrogate Data



Applied Artificial Intelligence (IF 2.9), Pub Date: 2020-02-12, DOI: 10.1080/08839514.2020.1726615
Michael Mayo, Eibe Frank


ABSTRACT: Can we evolve better training data for machine learning algorithms? To investigate this question, we use population-based optimization algorithms to generate artificial surrogate training data for naive Bayes for regression. We demonstrate that the generalization performance of naive Bayes for regression models is enhanced by training them on the artificial data as opposed to the real data. These results are important for two reasons. Firstly, naive Bayes models are simple and interpretable but frequently underperform compared to more complex “black box” models, and therefore new methods of enhancing accuracy are called for. Secondly, the idea of using the real training data indirectly in the construction of the artificial training data, as opposed to directly for model training, is a novel twist on the usual machine learning paradigm.
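The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions that the abstract does not state: a simplified Gaussian naive Bayes regressor stands in for naive Bayes for regression, a basic (mu + lambda) evolution strategy stands in for the unnamed population-based optimizers, and fitness is taken to be the RMSE on the real training data of a model trained only on the artificial points. All function names (`fit_gnb_regressor`, `predict`, `rmse`, `evolve_surrogate`) and parameter values are hypothetical.

```python
import numpy as np

def fit_gnb_regressor(data):
    """Fit a simplified Gaussian naive Bayes regressor (illustrative assumption).

    data has shape (n, d + 1); the last column is the target y. Each pair
    (x_i, y) is treated as bivariate Gaussian, so p(x_i | y) is Gaussian.
    """
    X, y = data[:, :-1], data[:, -1]
    mu_x, mu_y = X.mean(axis=0), y.mean()
    var_x = X.var(axis=0) + 1e-6   # small floor keeps variances positive
    var_y = y.var() + 1e-6
    cov_xy = ((X - mu_x) * (y - mu_y)[:, None]).mean(axis=0)
    return mu_x, mu_y, var_x, var_y, cov_xy

def predict(model, X, grid_size=200):
    """Posterior-mean prediction over a grid of candidate target values."""
    mu_x, mu_y, var_x, var_y, cov_xy = model
    sd_y = np.sqrt(var_y)
    grid = np.linspace(mu_y - 4 * sd_y, mu_y + 4 * sd_y, grid_size)
    cond_mean = mu_x + np.outer(grid - mu_y, cov_xy / var_y)  # (G, d)
    cond_var = var_x - cov_xy ** 2 / var_y                    # (d,)
    log_prior = -0.5 * (grid - mu_y) ** 2 / var_y
    diff = X[:, None, :] - cond_mean[None, :, :]              # (n, G, d)
    # log p(y) + sum_i log p(x_i | y), dropping terms constant in y
    log_post = log_prior - 0.5 * (diff ** 2 / cond_var).sum(axis=2)
    log_post -= log_post.max(axis=1, keepdims=True)           # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    return post @ grid

def rmse(model, data):
    """Root mean squared error of the model on a (n, d + 1) data array."""
    return float(np.sqrt(np.mean((predict(model, data[:, :-1]) - data[:, -1]) ** 2)))

def evolve_surrogate(real, n_points=10, pop_size=20, generations=30,
                     sigma=0.1, seed=0):
    """Evolve a small artificial data set with a (mu + lambda) strategy.

    Each candidate is an (n_points, d + 1) array of artificial examples.
    Fitness (assumed here) is the RMSE on the real training data of a
    model fitted only to the candidate.
    """
    rng = np.random.default_rng(seed)
    fitness = lambda cand: rmse(fit_gnb_regressor(cand), real)
    # Start from random subsamples of the real data.
    pop = [real[rng.choice(len(real), n_points, replace=False)]
           for _ in range(pop_size)]
    scores = [fitness(p) for p in pop]
    history = [min(scores)]
    for _ in range(generations):
        # Mutate every parent with Gaussian noise, then keep the best of both.
        children = [p + rng.normal(0.0, sigma, p.shape) for p in pop]
        pool = pop + children
        pool_scores = scores + [fitness(c) for c in children]
        order = np.argsort(pool_scores)[:pop_size]
        pop = [pool[i] for i in order]
        scores = [pool_scores[i] for i in order]
        history.append(scores[0])
    return pop[0], history

# Usage on synthetic data (the data-generating function is made up).
rng = np.random.default_rng(42)
X_real = rng.uniform(-2, 2, size=(200, 2))
y_real = np.sin(X_real[:, 0]) + 0.5 * X_real[:, 1] + rng.normal(0, 0.1, 200)
real = np.column_stack([X_real, y_real])
surrogate, history = evolve_surrogate(real)
print("training RMSE, model fit on real data:", rmse(fit_gnb_regressor(real), real))
print("training RMSE, model fit on surrogate:", history[-1])
```

The sketch only reports training error. The abstract's claim concerns generalization, so a faithful comparison would evaluate both models on held-out data that played no part in evolving the surrogate set.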


Updated: 2020-02-12
