Abstract
Detecting outliers is one of the most challenging tasks in analyzing multi-dimensional geo-spatial data such as geophysical data-sets. The difficulty stems mainly from describing “normality” or “abnormality”, given the complexity of the relationships between the data elements. A considerable number of methods have been proposed and applied for detecting outliers, whether these are assumed to be noise, anomalies within the data-set, or simply isolated events. This research proposes a new outlier detection method based on automatic training of a Local Linear Model Tree (LOLIMOT) network, using data selected by K-Nearest Neighbor (KNN) search. The data pairs are selected through decile analysis of the distances calculated during KNN data grouping. An experiment on a synthetic 3D data-set with 12 clusters indicates the method’s robust performance: the calculated Cumulative Error Percentage (CEP) is 13% for the proposed method, whereas the closest value for KNN is 19%. Applying the method to a micro-gravimetric data-set and to an earthquake catalogue for the north Zagros-west Alborz region, and analyzing the outputs, further confirmed its superiority in outlier detection.
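As a rough illustration of the selection step only, the sketch below scores each point by its mean distance to its k nearest neighbours and splits the data at the ninth decile of those scores. It is not the authors' procedure: the abstract does not state how the KNN distances are aggregated, which deciles are used, or the value of k, so the mean aggregation, the 90% cut-off, `k=5`, and the function names `knn_mean_distances` and `decile_split` are all assumptions made here. LOLIMOT training is omitted entirely.

```python
import math
import random
import statistics


def knn_mean_distances(points, k):
    """Mean Euclidean distance from each point to its k nearest neighbours (brute force)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def decile_split(scores):
    """Split indices at the 9th decile of the scores (assumed cut-off, not from the paper)."""
    cut = statistics.quantiles(scores, n=10)[-1]  # last of the 9 decile cut points
    regular = [i for i, s in enumerate(scores) if s <= cut]
    candidates = [i for i, s in enumerate(scores) if s > cut]
    return regular, candidates


# Synthetic 3D data: one Gaussian cluster plus a single planted isolated point.
random.seed(0)
cluster = [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in range(99)]
points = cluster + [(25.0, 25.0, 25.0)]

scores = knn_mean_distances(points, k=5)
regular, candidates = decile_split(scores)
```

Here `regular` would be the indices passed on as presumed-normal data, and `candidates` the points in the top decile of KNN distance; the planted point (index 99) falls in the latter group. The brute-force neighbour search is quadratic in the number of points, so a spatial index would be preferable for real catalogues.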
Additional information
Communicated by: H. Babaie
About this article
Cite this article
Tabatabaei, M., Kimiaefar, R., Hajian, A. et al. Robust outlier detection in geo-spatial data based on LOLIMOT and KNN search. Earth Sci Inform 14, 1065–1072 (2021). https://doi.org/10.1007/s12145-021-00610-9
DOI: https://doi.org/10.1007/s12145-021-00610-9