On spatial keyword covering, Knowledge and Information Systems


Knowledge and Information Systems (IF 2.5), Pub Date: 2020-02-18, DOI: 10.1007/s10115-020-01446-3
Dong-Wan Choi, Jian Pei, Xuemin Lin

This article introduces and solves the spatial keyword cover problem (SK-Cover for short), which aims to identify a group of spatio-textual objects that covers all the keywords in a query while minimizing a distance cost function that favors fewer objects in the answer set. In a broad sense, SK-Cover has been actively studied in the spatial keyword search literature, for example as the \(m\)-closest keywords query and the collective spatial keyword query. However, these existing works focus on minimizing only the largest pairwise distance, even though the actual spatial cost is highly influenced by the number of objects in the answer group. Motivated by this, the present article generalizes the problem definition so that the total cost takes into account the cardinality of the group as well as the spatial distance. We prove that SK-Cover is not only NP-hard but also does not admit an approximation better than \(O(\log |T|)\) in polynomial time, where \(T\) is the set of query keywords. We first establish an \(O(\log |T|)\)-approximation algorithm, which is asymptotically optimal with respect to the approximability of SK-Cover, together with effective accessing strategies and pruning rules that improve overall efficiency and scalability. Despite the NP-hardness of SK-Cover, this article also develops exact solutions that find the optimal group of objects reasonably fast in practice, especially when only a relatively small number of query keywords must be covered. In addition to our algorithmic results, we empirically show that our approximation algorithm always achieves the best accuracy, with efficiency comparable to that of a state-of-the-art algorithm intended for \(m\)CK, a problem similar to yet theoretically easier than SK-Cover. We also demonstrate that our exact algorithm using the proposed approximation scheme runs much faster than the baseline algorithm adapted from the existing solution for \(m\)CK.
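
To make the covering requirement concrete, the following is a minimal sketch of the classic greedy set cover heuristic applied to keyword covering. It is not the authors' algorithm: the abstract does not specify their cost function or selection rule, so this sketch selects objects by keyword gain only and then reports the largest pairwise distance of the chosen group. The names `greedy_keyword_cover`, `diameter`, `objects`, and `query` are hypothetical and used for illustration only.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def greedy_keyword_cover(objects, query_keywords):
    """Classic greedy set cover over keywords (illustrative assumption).

    objects: dict mapping object id -> ((x, y), set of keywords).
    Returns the list of chosen object ids in selection order.
    """
    uncovered = set(query_keywords)
    chosen = []
    while uncovered:
        # Pick the object covering the most still-uncovered keywords;
        # iterating in sorted id order makes tie-breaking deterministic.
        best = max(sorted(objects),
                   key=lambda oid: len(objects[oid][1] & uncovered))
        gain = objects[best][1] & uncovered
        if not gain:
            raise ValueError("query keywords cannot be covered")
        chosen.append(best)
        uncovered -= gain
    return chosen


def diameter(objects, group):
    """Largest pairwise distance within the group (0.0 for a single object)."""
    return max((dist(objects[a][0], objects[b][0])
                for a, b in combinations(group, 2)), default=0.0)


# Toy data, invented for this sketch.
objects = {
    "a": ((0.0, 0.0), {"cafe", "wifi"}),
    "b": ((3.0, 4.0), {"parking"}),
    "c": ((1.0, 0.0), {"cafe"}),
    "d": ((0.0, 1.0), {"wifi", "parking"}),
}
query = {"cafe", "wifi", "parking"}

group = greedy_keyword_cover(objects, query)
print(group, diameter(objects, group))
```

On this toy data the keyword-only greedy rule picks `a` and then `b`, although `d` would cover the remaining keyword from a much closer location. A selection rule that ignores distance can therefore return a spatially poor group, which is the kind of trade-off between group cardinality and spatial distance that the SK-Cover cost is described as capturing.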


Updated: 2020-02-18
