FastEE: Fast Ensembles of Elastic Distances for time series classification
Data Mining and Knowledge Discovery (IF 2.8). Pub Date: 2019-11-18. DOI: 10.1007/s10618-019-00663-x
Chang Wei Tan, François Petitjean, Geoffrey I. Webb

In recent years, many new ensemble-based time series classification (TSC) algorithms have been proposed, each significantly more accurate than its predecessors. The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is currently the most accurate TSC algorithm when assessed on the UCR repository. It is a meta-ensemble of 5 state-of-the-art ensemble-based classifiers. The time complexity of HIVE-COTE, particularly for training, is prohibitive for most datasets. There is thus a critical need to speed up the classifiers that compose HIVE-COTE. This paper focuses on speeding up one of its components: the Ensembles of Elastic Distances (EE), the classifier that leverages decades of research into time-dedicated similarity measures. Training EE can be prohibitive for many datasets; for example, it takes a month on the ElectricDevices dataset with 9000 instances. This is because EE needs to cross-validate the hyper-parameters used for the 11 similarity measures it encompasses. In this work, Fast Ensembles of Elastic Distances (FastEE) is proposed to train EE faster. It comes in two versions: the exact version makes it possible to train EE 10 times faster, while the approximate version is 40 times faster than EE without significantly impacting classification accuracy. This translates to being able to train EE on ElectricDevices in 13 h.
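The cost the paper targets comes from this per-measure hyper-parameter search. As a rough illustration only (plain Python/NumPy, not the authors' implementation; names such as `tune_dtw_window` and the candidate list are hypothetical), the sketch below shows a leave-one-out 1-NN search over DTW warping-window candidates, the kind of tuning EE performs for each of its 11 measures.

```python
# Minimal sketch of the hyper-parameter tuning that makes training EE expensive:
# leave-one-out cross-validation of a 1-NN classifier over candidate parameter
# values of one elastic measure (here DTW with a Sakoe-Chiba warping window).
import numpy as np

def dtw(a, b, window):
    """Plain dynamic time warping constrained to a Sakoe-Chiba band of the given width."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        lo, hi = max(1, i - window), min(m, i + window)
        for j in range(lo, hi + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def loocv_accuracy(X, y, window):
    """Leave-one-out 1-NN accuracy for one candidate warping window."""
    correct = 0
    for i in range(len(X)):
        dists = [dtw(X[i], X[j], window) if j != i else np.inf for j in range(len(X))]
        correct += y[int(np.argmin(dists))] == y[i]
    return correct / len(X)

def tune_dtw_window(X, y, candidates):
    """Pick the window with the best LOOCV accuracy -- the step FastEE accelerates."""
    return max(candidates, key=lambda w: loocv_accuracy(X, y, w))
```

With n training series and p candidate values, this loop performs on the order of n² * p elastic-distance computations for a single measure, which is why EE training on a 9000-instance dataset such as ElectricDevices stretches into weeks without the speed-ups the paper proposes.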

Updated: 2019-11-18