Knowledge-Based Systems ( IF 5.921 ) Pub Date : 2019-10-09 , DOI: 10.1016/j.knosys.2019.105088 Yiwei Pan; Zhibin Pan; Yikun Wang; Wei Wang
The k-nearest neighbor (KNN) algorithm has been widely used in pattern recognition, regression, outlier detection and other data mining areas. However, it suffers from the large distance computation cost, especially when dealing with big data applications. In this paper, we propose a new fast search (FS) algorithm for exact k-nearest neighbors based on optimal triangle-inequality-based (OTI) check strategy. During the procedure of searching exact k-nearest neighbors for any query, the OTI check strategy can eliminate more redundant distance computations for the instances located in the marginal area of neighboring clusters compared with the original TI check strategy. Considering the large space complexity and extra time complexity of OTI, we also propose an efficient optimal triangle-inequality-based (EOTI) check strategy. The experimental results demonstrate that our proposed two algorithms (OTI and EOTI) achieve the best performance compared with other related KNN fast search algorithms, especially in the case of dealing with high-dimensional datasets.