Shape-based outlier detection in multivariate functional data
Knowledge-Based Systems (IF 7.2), Pub Date: 2020-04-23, DOI: 10.1016/j.knosys.2020.105960
Clément Lejeune, Josiane Mothe, Adil Soubki, Olivier Teste

Multivariate functional data refer to a population of multivariate functions generated by a system involving dynamic parameters depending on continuous variables (e.g., multivariate time series). Outlier detection in such a context is a challenging problem because both the individual behavior of the parameters and the dynamic correlation between them are important. To address this problem, recent work has focused on multivariate functional depth to identify the outliers in a given dataset. However, most previous approaches fail when the outlyingness manifests itself in curve shape rather than curve magnitude. In this paper, we propose identifying outliers in multivariate functional data by a method whereby different outlying features are captured based on mapping functions from differential geometry. In this regard, we extract shape features reflecting the outlyingness of a curve with a high degree of interpretability. We conduct an experimental study on real and synthetic data sets and compare the proposed method with functional-depth-based methods. The results demonstrate that the proposed method, combined with state-of-the-art outlier detection algorithms, can outperform the functional-depth-based methods. Moreover, in contrast with the baseline methods, it is efficient regardless of the proportion of outliers.
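The idea of mapping each multivariate curve to a shape feature and then scoring that feature can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which differential-geometry mappings are used, so curvature is chosen here as one plausible example, and a robust z-score (median and MAD) stands in for the state-of-the-art outlier detectors mentioned above. The function names `derivative`, `curvature`, and `shape_scores` and the toy data are hypothetical.

```python
import math
import statistics


def derivative(values, dt):
    """Finite-difference derivative: central inside, one-sided at the ends."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return out


def curvature(curve, dt):
    """Curvature of a p-variate curve sampled on a regular grid.

    `curve` is a list of p component series of equal length. Uses
    kappa = sqrt(|x'|^2 |x''|^2 - (x'.x'')^2) / |x'|^3, valid for any p >= 2.
    """
    d1 = [derivative(c, dt) for c in curve]
    d2 = [derivative(c, dt) for c in d1]
    kappa = []
    for i in range(len(curve[0])):
        v = [c[i] for c in d1]
        a = [c[i] for c in d2]
        vv = sum(x * x for x in v)
        aa = sum(x * x for x in a)
        va = sum(x * y for x, y in zip(v, a))
        num = math.sqrt(max(vv * aa - va * va, 0.0))
        kappa.append(num / vv ** 1.5 if vv > 0 else 0.0)
    return kappa


def shape_scores(curves, dt):
    """Robust z-score of the mean curvature of each curve (assumed scoring rule)."""
    # Skip two grid points at each end, where one-sided differences are less accurate.
    feats = [statistics.fmean(curvature(c, dt)[2:-2]) for c in curves]
    med = statistics.median(feats)
    mad = statistics.median([abs(f - med) for f in feats])
    return [abs(f - med) / (mad or 1e-12) for f in feats]


# Toy data: five circles of similar radius plus one figure-eight curve whose
# components have the same amplitude (a shape outlier, not a magnitude outlier).
n = 200
dt = 2 * math.pi / (n - 1)
grid = [i * dt for i in range(n)]
curves = [
    [[r * math.cos(t) for t in grid], [r * math.sin(t) for t in grid]]
    for r in (0.9, 0.95, 1.0, 1.05, 1.1)
]
curves.append([[math.cos(t) for t in grid], [math.sin(2 * t) for t in grid]])

scores = shape_scores(curves, dt)
print([round(s, 2) for s in scores])
```

In this toy setting the figure-eight curve stays within the same value range as the circles, so a purely magnitude-based criterion would have little to separate it on, while its curvature profile differs markedly. In practice the per-curve features would more likely be passed to a dedicated outlier detection algorithm than to a single robust z-score.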




Updated: 2020-04-23