An efficient algorithm for approximated self-similarity joins in metric spaces
Information Systems (IF 3.7), Pub Date: 2020-02-24, DOI: 10.1016/j.is.2020.101510
Sebastián Ferrada, Benjamin Bustos, Nora Reyes

Similarity join is a key operation in metric databases: it retrieves all pairs of elements that are similar. Solving this problem usually requires comparing every pair of objects in the datasets, even when indexing and ad hoc algorithms are used. We propose a simple and efficient algorithm for computing the approximated k-nearest-neighbor self-similarity join. The algorithm computes $\Theta(n^{3/2})$ distances and reaches an empirical precision of 46% on real-world datasets. We compare it with other common techniques, such as Quickjoin and Locality-Sensitive Hashing, and argue that our proposal has a better execution time and average precision.
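To make the operation concrete, the following is a minimal sketch of the exact k-nearest-neighbor self-similarity join computed by brute force, i.e. the all-pairs baseline that an approximate algorithm tries to avoid. It is not the algorithm proposed in the paper. The names `knn_self_join` and `precision`, and the overlap-based precision measure, are illustrative assumptions rather than definitions taken from the abstract.

```python
from itertools import combinations
import heapq


def knn_self_join(points, k, dist):
    """Exact k-NN self-join by brute force (hypothetical baseline).

    Returns (neighbors, computed), where neighbors[i] lists the indices of
    the k nearest neighbors of points[i], and computed is the number of
    distance evaluations, which is n * (n - 1) / 2 for a symmetric metric.
    """
    n = len(points)
    candidates = [[] for _ in range(n)]
    computed = 0
    for i, j in combinations(range(n), 2):
        d = dist(points[i], points[j])  # one evaluation serves both directions
        computed += 1
        candidates[i].append((d, j))
        candidates[j].append((d, i))
    # Keep the k closest candidates per element (ties broken by index).
    neighbors = [[j for _, j in heapq.nsmallest(k, c)] for c in candidates]
    return neighbors, computed


def precision(approx, exact):
    """Fraction of exact neighbors that an approximate result recovers.

    This is one plausible way to score an approximate join (an assumption);
    the abstract does not spell out the measure it uses.
    """
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx, exact))
    total = sum(len(e) for e in exact)
    return hits / total


# Usage: four points on a line, absolute difference as the metric.
points = [0.0, 1.0, 3.0, 7.0]
exact, computed = knn_self_join(points, k=1, dist=lambda a, b: abs(a - b))
```

The brute-force join evaluates a quadratic number of distances, which is the cost that a $\Theta(n^{3/2})$ algorithm reduces in exchange for returning only part of the exact neighbor set.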




Last updated: 2020-02-24