Subscribing to Big Data at Scale
arXiv - CS - Databases Pub Date: 2020-09-10, DOI: arxiv-2009.04611
Xikui Wang, Michael J. Carey, Vassilis J. Tsotras

Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries from users, rather than actively collecting data, processing it, and serving it to users. To satisfy both passive and active requests at scale, users need either to heavily customize an existing passive Big Data system or to glue multiple systems together. Either choice would require significant effort from users and incur additional overhead. In this paper, we present the BAD (Big Active Data) system, which is designed to preserve the merits of passive Big Data systems and introduce new features for actively serving Big Data to users at scale. We show the design and implementation of the BAD system, demonstrate how BAD facilitates providing both passive and active data services, investigate the BAD system's performance at scale, and illustrate the complexities that would result from instead providing BAD-like services with a "glued" system.

Updated: 2020-09-11
