On K-means clustering-based approach for DDBSs design, Journal of Big Data


On K-means clustering-based approach for DDBSs design
Journal of Big Data ( IF 8.1 ) Pub Date : 2020-05-11 , DOI: 10.1186/s40537-020-00306-9
Ali A. Amer

In Distributed Database Systems (DDBSs), communication cost and response time have long been open challenges. A carefully designed DDBS, however, can achieve the desired reduction in communication cost. Data fragmentation (data clustering) and data allocation are the most widely used strategies for DDBS design. Several design techniques built on these strategies have been presented in the literature to improve DDBS performance, but they rely on either empirical results or data statistics, which makes most of them imperfect or inapplicable at the initial stage of DDBS design, when such information is not yet available. This paper therefore introduces a heuristic k-means approach for vertical fragmentation and allocation that focuses primarily on the initial design stage. The approach combines several techniques in a single step. A brief experimental study on both artificially created and real datasets was conducted to demonstrate the optimality of the proposed approach relative to its counterparts, and the results obtained are encouraging.
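As a rough illustration of what k-means-based vertical fragmentation can look like, the sketch below clusters the attributes of one relation by which queries use them. It is not the paper's algorithm: the abstract does not specify the features, distance measure, or seeding. The query-attribute usage matrix, the seed attributes, the attribute names, and the function name `kmeans_vertical_fragments` are all assumptions made for this example, and allocation of fragments to sites is not covered.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans_vertical_fragments(
    usage: Dict[str, Sequence[float]],
    seeds: Sequence[str],
    max_iter: int = 100,
) -> List[List[str]]:
    """Group attributes into len(seeds) fragments with plain k-means.

    usage maps each attribute to a 0/1 vector: entry j is 1 if query j
    reads the attribute (hypothetical input, assumed for this sketch).
    seeds names the attributes whose vectors serve as initial centroids,
    which keeps the example deterministic.
    """
    names = list(usage)
    centroids = [[float(v) for v in usage[s]] for s in seeds]
    assignment: Dict[str, int] = {}

    for _ in range(max_iter):
        # Assignment step: each attribute joins its nearest centroid.
        new_assignment = {
            name: min(
                range(len(centroids)),
                key=lambda c: squared_distance(usage[name], centroids[c]),
            )
            for name in names
        }
        if new_assignment == assignment:
            break  # converged: no attribute changed fragment
        assignment = new_assignment

        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [usage[n] for n in names if assignment[n] == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)

    return [
        [n for n in names if assignment[n] == c]
        for c in range(len(centroids))
    ]


# Hypothetical relation with six attributes and four queries.
example_usage = {
    "name":     [1, 1, 0, 0],
    "email":    [1, 1, 0, 0],
    "phone":    [1, 0, 0, 0],
    "salary":   [0, 0, 1, 1],
    "tax_code": [0, 0, 1, 1],
    "bonus":    [0, 0, 1, 0],
}
fragments = kmeans_vertical_fragments(example_usage, seeds=["name", "salary"])
```

In this toy input, attributes read by the same queries end up in the same fragment, so each query touches a single fragment. In practice the primary key would be replicated into every fragment so that the original relation can be rebuilt by joins.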


Updated: 2020-05-11
