Hyperparameter self-tuning for data streams
Information Fusion (IF 14.7), Pub Date: 2021-04-28, DOI: 10.1016/j.inffus.2021.04.011
Bruno Veloso, João Gama, Benedita Malheiro, João Vinagre

The number of Internet of Things devices generating data streams is expected to grow exponentially with the support of emergent technologies such as 5G networks. Therefore, the online processing of these data streams requires the design and development of suitable machine learning algorithms, able to learn online, as data is generated. Like their batch-learning counterparts, stream-based learning algorithms require careful hyperparameter settings. However, this problem is exacerbated in online learning settings, especially with the occurrence of concept drifts, which frequently require the reconfiguration of hyperparameters. In this article, we present SSPT, an extension of the Self Parameter Tuning (SPT) optimisation algorithm for data streams. We apply the Nelder–Mead algorithm to dynamically sized samples, converging to optimal settings in a single pass over data while using a relatively small number of hyperparameter configurations. In addition, our proposal automatically readjusts hyperparameters when concept drift occurs. To assess the effectiveness of SSPT, the algorithm is evaluated on three different machine learning problems: recommendation, regression, and classification. Experiments with well-known data sets show that the proposed algorithm can outperform previous hyperparameter tuning efforts by human experts. Results also show that SSPT converges significantly faster and achieves at least comparable accuracy when compared with the previous double-pass version of the SPT algorithm.
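
For readers unfamiliar with the Nelder–Mead search mentioned in the abstract, the following is a minimal sketch of the simplex moves (reflection, expansion, contraction, shrink) applied to two hyperparameters. It is not the authors' SSPT implementation: the function names, the simplified inside-only contraction, and the `toy_loss` objective are assumptions made for illustration. In a streaming setting the loss would presumably be estimated on a sample of the stream rather than computed from a fixed formula.

```python
from typing import Callable, List, Sequence

Point = List[float]


def nelder_mead(loss: Callable[[Sequence[float]], float],
                simplex: List[Point],
                iterations: int = 200) -> Point:
    """Minimise `loss` with a simplified Nelder-Mead simplex search."""
    simplex = [list(p) for p in simplex]  # work on a copy of the caller's list
    n = len(simplex[0])                   # n hyperparameters, n + 1 vertices
    for _ in range(iterations):
        simplex.sort(key=loss)            # best vertex first, worst last
        best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t: float) -> Point:
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if loss(reflected) < loss(best):
            # Reflection is the new best: try going further (expansion).
            expanded = along(2.0)
            simplex[-1] = expanded if loss(expanded) < loss(reflected) else reflected
        elif loss(reflected) < loss(second_worst):
            simplex[-1] = reflected
        else:
            # Inside contraction: move the worst vertex halfway to the centroid.
            contracted = along(-0.5)
            if loss(contracted) < loss(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=loss)


def toy_loss(params: Sequence[float]) -> float:
    # Hypothetical stand-in for a model's error as a function of
    # (learning_rate, regularisation); the minimum is at (0.1, 0.01).
    learning_rate, regularisation = params
    return (learning_rate - 0.1) ** 2 + (regularisation - 0.01) ** 2


best = nelder_mead(toy_loss, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(best)
```

Each vertex of the simplex is one hyperparameter configuration, so a search over two hyperparameters keeps only three configurations alive at a time, which is consistent with the abstract's remark about using a relatively small number of configurations.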




Updated: 2021-05-26