MultiETSC: automated machine learning for early time series classification, Data Mining and Knowledge Discovery



MultiETSC: automated machine learning for early time series classification
Data Mining and Knowledge Discovery (IF 2.8), Pub Date: 2021-08-16, DOI: 10.1007/s10618-021-00781-5
Gilles Ottervanger, Mitra Baratchi, Holger H. Hoos

Early time series classification (EarlyTSC) involves the prediction of a class label based on partial observation of a given time series. Most EarlyTSC algorithms consider the trade-off between accuracy and earliness as two competing objectives, using a single dedicated hyperparameter. Obtaining insight into this trade-off requires finding a set of non-dominated (Pareto efficient) classifiers. So far, this has been approached through manual hyperparameter tuning. Since the trade-off hyperparameters only provide indirect control over the earliness-accuracy trade-off, manual tuning is tedious and tends to result in many sub-optimal hyperparameter settings. This complicates the search for optimal hyperparameter settings and forms a hurdle for the application of EarlyTSC to real-world problems. To address these issues, we propose an automated approach to hyperparameter tuning and algorithm selection for EarlyTSC, building on developments in the fast-moving research area known as automated machine learning (AutoML). To deal with the challenging task of optimising two conflicting objectives in early time series classification, we propose MultiETSC, a system for multi-objective algorithm selection and hyperparameter optimisation (MO-CASH) for EarlyTSC. MultiETSC can potentially leverage any existing or future EarlyTSC algorithm and produces a set of Pareto optimal algorithm configurations from which a user can choose a posteriori. As an additional benefit, our proposed framework can incorporate and leverage time-series classification algorithms not originally designed for EarlyTSC for improving performance on EarlyTSC; we demonstrate this property using a newly defined, “naïve” fixed-time algorithm. In an extensive empirical evaluation of our new approach on a benchmark of 115 data sets, we show that MultiETSC performs substantially better than baseline methods, ranking highest (avg. rank 1.98) compared to conceptually simpler single-algorithm (2.98) and single-objective alternatives (4.36).
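The notion of a non-dominated (Pareto efficient) set mentioned in the abstract can be illustrated with a minimal sketch. This is not MultiETSC's implementation. The `Config` type, the function names, and all numbers below are hypothetical, and the sketch assumes both objectives are expressed so that lower is better (earliness as the fraction of the series observed before predicting, and error rate as 1 minus accuracy).

```python
from typing import NamedTuple


class Config(NamedTuple):
    """One evaluated algorithm configuration (hypothetical, for illustration)."""
    name: str
    earliness: float   # assumed: fraction of the series observed; lower is better
    error_rate: float  # assumed: 1 - accuracy; lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.earliness <= b.earliness and a.error_rate <= b.error_rate
    strictly_better = a.earliness < b.earliness or a.error_rate < b.error_rate
    return no_worse and strictly_better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Return the configurations not dominated by any other, sorted by earliness."""
    front = [c for c in configs if not any(dominates(o, c) for o in configs)]
    return sorted(front, key=lambda c: c.earliness)


# Made-up evaluation results, not taken from the paper.
candidates = [
    Config("A", 0.20, 0.30),
    Config("B", 0.50, 0.10),
    Config("C", 0.60, 0.15),  # dominated by B (later and less accurate)
    Config("D", 0.90, 0.05),
    Config("E", 0.25, 0.35),  # dominated by A (later and less accurate)
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Under these assumptions the front keeps A, B, and D: each trades earliness for accuracy in a way no other candidate improves on in both respects, which is the kind of set a user would choose from a posteriori.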


Updated: 2021-08-19
