Online Analytical Processing for Business Intelligence in Big Data
Big Data (IF 2.6), Pub Date: 2020-12-15, DOI: 10.1089/big.2020.0045
Jigna Ashish Patel, Priyanka Sharma

Online analytical processing (OLAP) has been widely used in business intelligence for decades to serve multidimensional queries. In this era of cutting-edge technology and the internet, data generation rates have been rising exponentially. Internet of Things sensors and social media platforms are among the major contributors to this data boom. Storage and speed are the crucial parameters, and the most pressing issues, in efficient data handling. The key idea here is to address these two challenges of big data computing in OLAP. In this article, the authors propose and implement OLAP on Hadoop by Indexing (OOHI). OOHI offers a simplified multidimensional model that stores dimensions in the schema server and measures on the Hadoop cluster. The overall setup is divided into five modules: the data storage module (DSM), dimension encoding module (DEM), cube segmentation module, segment selection module (SSM), and block selection and process (BSAP) module. The DSM applies serialization and deserialization to store and retrieve data with efficient space utilization. The DEM adopts integer encoding of the dimension hierarchy to avoid the sparsity problem in multidimensional big data. The SSM plays an important role in reducing the search space by selecting, from the chunks of the cube, only those relevant to the query. The BSAP module integrates a MapReduce-based indexing approach with a series of seek operations to achieve parallelism and fault tolerance. Real-time oceanography data and supermarket data sets are used to demonstrate that the OOHI model is data independent. Various test cases are designed to cover the scope of each dimension and the volume of each data set. Comparative results and performance analysis show that OOHI outperforms Hadoop-based OLAP in data storage and in dice, slice, and roll-up operations.
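
The abstract does not give implementation details, so the following is only a minimal in-memory sketch of two ideas it names: integer encoding of dimension members and segment selection to shrink the search space of a slice query. It is not the authors' OOHI code and does not use Hadoop. All function names, the toy supermarket rows, and the fixed segment size are assumptions made for illustration.

```python
from collections import defaultdict

def encode_dimension(members):
    """Assign each distinct member a small integer code in first-seen order."""
    codes = {}
    for member in members:
        codes.setdefault(member, len(codes))
    return codes

def build_segments(rows, codes, segment_size):
    """Group cells into segments keyed by ranges of the encoded coordinates."""
    segments = defaultdict(list)
    for region, month, sales in rows:
        coord = (codes["region"][region], codes["month"][month])
        seg_id = tuple(c // segment_size for c in coord)
        segments[seg_id].append((coord, sales))
    return segments

def slice_sum(segments, codes, segment_size, region):
    """Slice on one region, scanning only segments that can contain its code."""
    r = codes["region"][region]
    target = r // segment_size
    total, scanned = 0, 0
    for seg_id, cells in segments.items():
        if seg_id[0] != target:
            continue  # segment cannot hold this region, so skip it entirely
        scanned += 1
        total += sum(value for (cr, _), value in cells if cr == r)
    return total, scanned

# Hypothetical toy data: (region, month, sales)
rows = [
    ("north", "jan", 10),
    ("north", "feb", 20),
    ("south", "jan", 5),
    ("east", "mar", 7),
    ("west", "feb", 3),
]
codes = {
    "region": encode_dimension(r for r, _, _ in rows),
    "month": encode_dimension(m for _, m, _ in rows),
}
segments = build_segments(rows, codes, segment_size=2)
print(slice_sum(segments, codes, 2, "north"))
```

In this toy setup the slice on "north" reads one of the three segments. In a distributed setting the same pruning step would decide which stored blocks need to be read at all, which is the role the abstract attributes to the SSM.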

Updated: 2020-12-21