Efficient keyword search over graph-structured data based on minimal covered r-cliques
Frontiers of Information Technology & Electronic Engineering (IF 2.7), Pub Date: 2020-04-01, DOI: 10.1631/fitee.1800133
Asieh Ghanbarpour, Khashayar Niknafs, Hassan Naderi

Keyword search is an alternative to structured query languages for querying graph-structured data. A result of a keyword query is a connected structure covering all or part of the queried keywords. Textual coverage and structural compactness are known to be the two main properties of a relevant result. Many previous works examine these properties only after retrieving all candidate results, comparing the candidates with a ranking function. This requires a time-consuming search process, which is not appropriate for an interactive system in which the user expects results in the least possible time. Recent works address this problem by confining the shape of results so that coverage and compactness can be examined during the search. However, these methods still suffer from redundant nodes in the retrieved results. In this paper, we introduce the semantics of the minimal covered r-clique (MCCr) for the results of a keyword query, as an extension of existing definitions. We propose efficient algorithms to detect the MCCrs of a given query. These algorithms can retrieve a comprehensive set of non-duplicate MCCrs in response to a keyword query. In addition, they can be executed in a distributed manner, which sets them apart in the field of keyword search. We also propose approximate versions of these algorithms that retrieve the top-k approximate MCCrs with polynomial delay. It is proved that the approximate algorithms retrieve results with an approximation ratio of two. Extensive experiments on two real-world datasets confirm the efficiency and effectiveness of the proposed algorithms.
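The abstract does not spell out the formal definition of an MCCr, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that an r-clique is a set of nodes whose pairwise shortest-path distance is at most r, and that "minimal covered" means no node can be dropped without losing a covered query keyword. The graph, the keyword map, and all function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Unweighted shortest-path distances from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def is_r_clique(graph, nodes, r):
    """Assumed definition: every pair of nodes is within distance r."""
    nodes = list(nodes)
    for u in nodes:
        dist = bfs_distances(graph, u)
        if any(dist.get(v, float("inf")) > r for v in nodes):
            return False
    return True


def covered(node_keywords, nodes, query):
    """Query keywords that appear on at least one of the given nodes."""
    found = set()
    for n in nodes:
        found |= node_keywords.get(n, set()) & query
    return found


def is_minimal_cover(node_keywords, nodes, query):
    """Assumed definition: removing any single node loses a covered keyword."""
    nodes = set(nodes)
    full = covered(node_keywords, nodes, query)
    return all(covered(node_keywords, nodes - {n}, query) != full for n in nodes)


# Hypothetical toy data: an undirected path a - b - c - d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
node_keywords = {
    "a": {"keyword"},
    "b": {"keyword"},
    "c": {"search"},
    "d": {"graph"},
}
query = {"keyword", "search", "graph"}

candidate = {"b", "c", "d"}
print(is_r_clique(graph, candidate, 2), is_minimal_cover(node_keywords, candidate, query))
```

In this toy graph, the set {b, c, d} satisfies both checks for r = 2, while {a, b, c, d} fails both: a and d are three hops apart, and a and b carry the same keyword, so either one is redundant. The brute-force checks only illustrate the two properties named in the abstract, coverage and compactness. They do not attempt the enumeration, duplicate elimination, or polynomial-delay guarantees that the paper claims.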



Updated: 2020-04-18