Extracting OLAP Cubes From Document-Oriented NoSQL Database Based on Parallel Similarity Algorithms
IEEE Canadian Journal of Electrical and Computer Engineering (IF 2.1), Pub Date: 2020-01-01, DOI: 10.1109/cjece.2019.2953049
Farnaz Davardoost, Amin Babazadeh Sangar, Kambiz Majidzadeh

Relational databases are poorly suited to managing today's data, which is large in variety and volume and mostly untrusted. NoSQL has therefore attracted the attention of companies. Although NoSQL is a proper choice for managing varied, high-volume data, performing online analytical processing (OLAP) on it is a major challenge because it is schema-less. This article introduces a model that overcomes the null-value problem in converting document-oriented NoSQL databases into relational databases using parallel similarity techniques. The proposed model has four phases: shingling, chunk, minhashing, and locality-sensitive hashing MapReduce (LSHMR). Each phase performs its own processing step on the input NoSQL database. LSHMR builds on the nature of both locality-sensitive hashing (LSH) and MapReduce (MR): the LSH similarity search technique runs on the MR framework to extract OLAP cubes. LSH decreases the number of comparisons, and MR enables efficient distributed and parallel computing. The proposed model is an efficient and suitable approach for extracting OLAP cubes from a NoSQL database.
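To make the shingling, minhashing, and LSH phases concrete, the following is a minimal single-process sketch of the general techniques, not the authors' LSHMR implementation. All function names, the MD5-based hash family, the choice of 32 hash functions in 16 bands, and the idea of representing each document by a string of its field names are assumptions made for illustration; the abstract does not specify them. The bucket-grouping step only mimics the shape of a map (emit a band key) and a reduce (group by key) and does not use a MapReduce framework.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-character shingles of a whitespace-normalized string."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def _hash(seed, value):
    """Hypothetical hash family: one MD5-derived 64-bit hash per integer seed."""
    digest = hashlib.md5(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(shingle_set, num_hashes=32):
    """For each hash function, keep the minimum hash value over the shingle set."""
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def lsh_candidate_pairs(signatures, bands=16):
    """Split each signature into bands; documents that agree on any band
    land in the same bucket and become a candidate pair."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    # "Map"-like step: emit a (band index, band contents) key per document.
    for name, sig in signatures.items():
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(name)
    # "Reduce"-like step: every pair inside one bucket is a candidate.
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Assumed toy input: each document is represented by its field names.
docs = {
    "d1": "name age city",
    "d2": "name age city",
    "d3": "sku price stock warehouse",
}
signatures = {name: minhash_signature(shingles(text)) for name, text in docs.items()}
candidates = lsh_candidate_pairs(signatures)
```

Only the candidate pairs need a full similarity comparison afterwards, which is how LSH decreases the number of comparisons. Documents with identical shingle sets, such as `d1` and `d2` here, always share every band and are therefore always returned as candidates.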

Updated: 2020-01-01