

OntoBestFit: A Best-Fit Occurrence Estimation strategy for RDF driven faceted semantic search
Computer Communications (IF 4.5), Pub Date: 2020-06-15, DOI: 10.1016/j.comcom.2020.06.013
Gerard Deepak, A. Santhanavijayan

As Web 2.0 evolves into a more systematized Semantic Web, web search needs semantics-compliant techniques. This paper proposes OntoBestFit, an RDF-driven approach that reduces ambiguity in search results and increases their diversity, thereby addressing both the context irrelevance problem and the serendipity problem. OntoBestFit harvests RDF from standardized, industry-scale Semantic Wikis and transforms it into an intermediate dyadic structure, which makes the approach suitable for heterogeneous real-world domains. An RDF prioritization vector is derived from a Term-Frequency Matrix and a Term Co-occurrence Matrix, both formulated from the reduced dyadic RDF entities over a corpus of web pages, to yield Query Indicator terms. The novelty of the approach lies in its dynamic query expansion: query facets are generated from the Domain Ontologies and the Query Indicator terms using the best-fit occurrence estimation algorithm. This algorithm adapts Simpson's Diversity Index to select a reduced set of highly appropriate, semantically similar domain ontologies based on a domain richness computation, which increases the diversity of search results. An Enriched Adaptive Pointwise Mutual Information measure is also proposed to compute semantic similarity. The authors report an average accuracy of 94.91%, a low False Discovery Rate of 0.07, and a response time of 0.28 ms, and on that basis describe OntoBestFit as a best-in-class semantic search strategy for the Semantic Web era.
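
The abstract names two standard measures, Simpson's Diversity Index and Pointwise Mutual Information, without giving the adapted forms the paper uses. As a rough illustration only, the sketch below computes the textbook versions: the Gini-Simpson form 1 - sum(p_i^2) and plain PMI, log2(p(a,b) / (p(a) p(b))), estimated from document-level co-occurrence. The function names, the document-level co-occurrence window, and the toy corpus are all assumptions made for this example; none of it reproduces the paper's best-fit occurrence estimation algorithm or its Enriched Adaptive PMI.

```python
import math
from collections import Counter
from itertools import combinations


def simpson_diversity(counts):
    """Textbook Gini-Simpson index: 1 - sum(p_i^2) over category counts.

    Returns 0.0 when every item falls in one category and approaches 1.0
    as items spread evenly over many categories. This is NOT the adapted
    variant used in the paper, which the abstract does not specify.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def pmi_table(documents):
    """Plain PMI for every term pair that co-occurs in at least one document.

    Probabilities are estimated as document frequencies divided by the
    number of documents (an assumption made for this sketch).
    """
    n_docs = len(documents)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing both terms
    for doc in documents:
        terms = set(doc)
        term_df.update(terms)
        pair_df.update(frozenset(p) for p in combinations(sorted(terms), 2))

    pmi = {}
    for pair, joint in pair_df.items():
        a, b = tuple(pair)
        p_joint = joint / n_docs
        p_a = term_df[a] / n_docs
        p_b = term_df[b] / n_docs
        pmi[pair] = math.log2(p_joint / (p_a * p_b))
    return pmi


# Toy corpus (hypothetical), each document reduced to a list of terms.
corpus = [
    ["rdf", "ontology"],
    ["rdf", "ontology"],
    ["rdf", "search"],
    ["web", "search"],
]
scores = pmi_table(corpus)
evenness = simpson_diversity([5, 5])
```

In this toy corpus a positive PMI marks a pair that co-occurs more often than independence would predict, and a negative PMI marks the opposite; a diversity value near 0 signals results concentrated in one category.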


Updated: 2020-06-19
