Skyline-based dissimilarity of images
Journal of Intelligent Information Systems (IF 3.4), Pub Date: 2019-08-14, DOI: 10.1007/s10844-019-00571-y
Nikolaos Georgiadis, Eleftherios Tiakas, Yannis Manolopoulos, Apostolos N. Papadopoulos

Large image collections are being used in many modern applications. In this paper, we aim at capturing the intrinsic dissimilarities of image descriptors in large image collections, i.e., to detect dissimilar (or else diverse) images without defining an explicit similarity or distance measure. Towards this goal, we adopt skyline processing techniques for large image databases, based on their high-dimensional descriptor vectors. The novelty of the proposed methodology lies in the use of skyline techniques empowered by state-of-the-art hashing schemes to enable effective data partitioning and indexing in secondary memory, towards supporting large image databases. The proposed approach is evaluated experimentally by using three real-world image datasets. Performance evaluation results demonstrate that images lying on the skyline have significantly different characteristics, which depend on the type of the descriptor. Thus, these skyline items may be used as seeds to apply clustering in large image databases. In addition, we observe that skyline processing using hash-based indexing structures is significantly faster than index-free skyline computation and also more efficient than skyline computation with hierarchical indexing structures. Based on our results, the proposed approach is both efficient (regarding runtime) and effective (with respect to image diversity) and therefore can be used as a base for more complex data mining tasks such as clustering.
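
To make the notion of a skyline concrete, the following is a minimal sketch of an index-free skyline computation over descriptor vectors. It is not the authors' method: the paper relies on hashing schemes and secondary-memory indexing, whereas this sketch uses a naive pairwise comparison. The function names `dominates` and `skyline`, the toy three-dimensional descriptors, and the convention that smaller values are preferred in every dimension are all assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is better in every dimension. a dominates b when
    a is no worse than b everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates.

    Naive O(n^2) pairwise scan, with no index and no partitioning.
    """
    result = []
    for i, p in enumerate(points):
        dominated = any(
            dominates(q, p) for j, q in enumerate(points) if j != i
        )
        if not dominated:
            result.append(i)
    return result


# Hypothetical 3-dimensional descriptors for five images.
descriptors = [
    (0.1, 0.9, 0.5),
    (0.2, 0.95, 0.6),  # dominated by the first descriptor
    (0.9, 0.1, 0.5),
    (0.5, 0.5, 0.1),
    (0.9, 0.9, 0.9),   # dominated by the first descriptor
]

print(skyline(descriptors))
```

In this toy example, the descriptors that remain on the skyline each excel in a different dimension, which is the intuition behind using skyline items as diverse seeds. The quadratic scan is only practical for small inputs; that cost is what motivates the indexed approaches compared in the abstract.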

Updated: 2019-08-14