A study of model and hyper-parameter selection strategies for classifier ensembles: a robust analysis on different optimization algorithms and extended results
Natural Computing (IF 2.1) Pub Date: 2020-10-30, DOI: 10.1007/s11047-020-09816-0
Antonino A. Feitosa-Neto, João C. Xavier-Júnior, Anne M. P. Canuto, Alexandre C. M. Oliveira

It is well known that machine learning (ML) techniques play an important role in many real-world applications. However, one of the main challenges is selecting the most accurate technique for a specific application. In the classification context, for instance, two main approaches can be applied: model selection and hyper-parameter selection. In the first approach, the best classification algorithm is selected for a given input dataset through a heuristic search over a large space of candidate classification algorithms and their corresponding hyper-parameter settings. As the main focus of this approach is the selection of the classification algorithm, it is referred to as model selection; such methods are also known as automated machine learning (Auto-ML). The second approach fixes one classification system and performs an extensive search to select the best hyper-parameters for that model. In this paper, we perform a wide and robust comparative analysis of both approaches for classifier ensembles. In this analysis, two methods of the first approach (Auto-WEKA and H\(_{2}\)O) are compared to four methods of the second approach (Genetic Algorithm, Particle Swarm Optimization, Tabu Search and GRASP). The main aim is to determine which of these techniques generates the more accurate classifier ensembles under a given time constraint. Additionally, an empirical analysis is conducted on 21 classification datasets to evaluate the performance of the aforementioned techniques. Our findings indicate that the hyper-parameter selection methods produce the most accurate classifier ensembles, although the improvement was not confirmed as significant by the statistical test.
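To make the second approach concrete, the following minimal sketch tunes the single hyper-parameter of a toy threshold classifier by plain random search. All names here (`make_toy_data`, `random_search`, the threshold classifier itself) are illustrative assumptions, not the paper's actual systems; the random search merely stands in for the metaheuristics the paper compares (GA, PSO, Tabu Search, GRASP), with a fixed iteration count playing the role of the paper's time budget.

```python
import random

def make_toy_data(n=21):
    # Hypothetical 1-D dataset: the true label is 1 iff x >= 0.5.
    return [(i / (n - 1), 1 if i / (n - 1) >= 0.5 else 0) for i in range(n)]

def accuracy(threshold, data):
    # A one-hyper-parameter "classifier": predict class 1 when x >= threshold.
    correct = sum((x >= threshold) == bool(y) for x, y in data)
    return correct / len(data)

def random_search(data, n_iter=200, lo=0.0, hi=1.0, seed=42):
    # Plain random search over the hyper-parameter space; a simple stand-in
    # for GA/PSO/Tabu/GRASP. n_iter acts like the paper's time constraint.
    rng = random.Random(seed)
    best_t, best_acc = lo, -1.0
    for _ in range(n_iter):
        t = rng.uniform(lo, hi)
        acc = accuracy(t, data)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

if __name__ == "__main__":
    data = make_toy_data()
    t, acc = random_search(data)
    print(f"best threshold={t:.3f}, accuracy={acc:.2%}")
```

The first approach (model selection, as in Auto-WEKA or H\(_{2}\)O) would instead search jointly over a set of candidate classifier families and their hyper-parameters, rather than tuning one fixed model.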




Updated: 2020-11-02