Diversification on big data in query processing, Frontiers of Computer Science



Diversification on big data in query processing
Frontiers of Computer Science (IF 4.2), Pub Date: 2020-01-03, DOI: 10.1007/s11704-019-8324-9
Meifan Zhang, Hongzhi Wang, Jianzhong Li, Hong Gao

Popular big-data applications such as web search engines and recommendation systems need to diversify their results during query processing, so methods that increase the diversity of a result set over big data are needed. In this paper, we first define the diversity of a set and the ability of an element to improve the overall diversity. Based on these definitions, we propose a diversification framework that performs well in terms of both effectiveness and efficiency and that comes with a theoretical guarantee on its probability of success. Second, we design implementation algorithms based on this framework for both numerical and string data. Third, we carry out extensive experiments on real data, for numerical and string data respectively, to verify the performance of the proposed framework, and we also perform scalability experiments on synthetic data.
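
The abstract does not state the paper's actual definitions, so the following is only an illustrative sketch of the general idea, not the authors' framework. It assumes that the diversity of a set is the sum of its pairwise distances and that an element's ability to improve diversity is the total distance from that element to the items already selected. The names `set_diversity`, `marginal_gain`, `greedy_diversify`, and `numeric_distance` are hypothetical, and the greedy selection is a common baseline swapped in for illustration.

```python
from itertools import combinations
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
Distance = Callable[[T, T], float]


def set_diversity(items: Sequence[T], dist: Distance) -> float:
    """Assumed definition: diversity is the sum of all pairwise distances."""
    return sum(dist(a, b) for a, b in combinations(items, 2))


def marginal_gain(candidate: T, selected: Sequence[T], dist: Distance) -> float:
    """Assumed definition: how much adding `candidate` raises set_diversity."""
    return sum(dist(candidate, s) for s in selected)


def greedy_diversify(candidates: Sequence[T], k: int, dist: Distance) -> List[T]:
    """Pick up to k items, each time taking the one with the largest gain.

    This is a generic greedy baseline, not the algorithm from the paper.
    """
    if k <= 0 or not candidates:
        return []
    remaining = list(candidates)
    selected = [remaining.pop(0)]  # seed with the first candidate
    while remaining and len(selected) < k:
        best_index = max(
            range(len(remaining)),
            key=lambda i: marginal_gain(remaining[i], selected, dist),
        )
        selected.append(remaining.pop(best_index))
    return selected


def numeric_distance(a: float, b: float) -> float:
    """Distance for numerical data; string data would need e.g. an edit distance."""
    return abs(a - b)
```

For example, `greedy_diversify([1, 2, 3, 10, 20], 2, numeric_distance)` seeds the result with `1` and then adds `20`, the value farthest from it. Passing a string distance instead of `numeric_distance` would apply the same sketch to string data.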


Updated: 2020-01-03
