QAOC: Novel query analysis and ontology-based clustering for data management in Hadoop
Future Generation Computer Systems (IF 6.2), Pub Date: 2020-03-04, DOI: 10.1016/j.future.2020.03.010
D. Pradeep, C. Sundar

Two bottlenecks in information retrieval are query analysis and data storage management. Hadoop is a large-scale environment that offers large storage capacity and fast processing, yet it still suffers from both problems when the number of information requesters is high. This paper addresses these two bottlenecks in Hadoop with the Query Analysis and Ontology-based Clustering (QAOC) architecture, which comprises three components: a query manager, a scheduler, and data management. First, the query manager consolidates similar queries, thereby minimizing search time. Next, each user query is scheduled by a neuro-fuzzy scheduler that computes query arrival time, query length, and query expiry time. At the back end, data management uses a weighted ontology-based clustering method to cluster the data by relevance. The scheduled user query is then searched in an ontology-based balanced binary tree, and finally the relevant results are ranked using Okapi BM25 and delivered to the user. The QAOC architecture is evaluated on Hadoop 2.7, and the results are compared in terms of execution time, processing speed, and memory consumption.
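
The final ranking step names Okapi BM25 but the abstract gives no parameters or implementation details. As an illustration only, the following is a minimal sketch of standard BM25 scoring in plain Python. The function names `bm25_scores` and `rank`, the pre-tokenized document format, the smoothed IDF variant, and the defaults `k1=1.5` and `b=0.75` are all assumptions chosen for this example, not values taken from the paper.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    docs is a list of token lists. k1 and b are commonly used defaults,
    assumed here; the paper's abstract does not state its settings.
    """
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rank(query_terms, docs, **kwargs):
    """Return document indices ordered from most to least relevant."""
    scores = bm25_scores(query_terms, docs, **kwargs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    corpus = [
        ["hadoop", "query", "scheduling"],
        ["ontology", "clustering", "data"],
        ["hadoop", "hadoop", "storage"],
    ]
    print(rank(["hadoop"], corpus))
```

In this sketch a document that mentions a query term more often scores higher, with diminishing returns controlled by `k1`, while `b` controls how strongly longer documents are penalized. In the architecture described above, the candidate documents would presumably be those returned by the ontology-based tree search rather than the whole corpus.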




Updated: 2020-03-04