Interval Type-2 Fuzzy C-Means Data Placement Optimization in Scientific Cloud Workflow applications
Simulation Modelling Practice and Theory (IF 3.5), Pub Date: 2020-11-17, DOI: 10.1016/j.simpat.2020.102217
Hamdi Kchaou, Zied Kechaou, Adel M. Alimi

Scientific workflows are a practical way to represent and execute data-intensive applications, which require not only powerful computing resources but also massive storage. With the emergence of cloud environments, which have improved the execution of such applications, the study of workflow data placement strategies that reduce data movement across data centers has become a challenging research objective. Because a workflow executes according to a task-execution order, and each task may process one or several datasets within a single data center, various data partitioning and clustering methods have been devised to find a distribution of workflow datasets among data centers that reduces dataset movements. In this work, a partitioning layer based on fuzzy data dependencies is implemented. More specifically, a dynamic placement strategy for massive data is proposed that applies an Interval Type-2 Fuzzy C-Means technique. This technique is chosen to make the clusters assigned to data centers more consistent, so that the datasets within each cluster are closely related by dependency, which helps reduce the amount of transferred data. The proposed strategy is evaluated by simulation, using both random and real-world scientific workflows. The experiments indicate that the proposed strategy outperforms the relevant state-of-the-art methods by reducing the number of data movements across data centers.
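As a rough illustration of the idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that the dependency between two datasets is the number of tasks that use both, that each dataset is described by its row in that dependency matrix, and that the interval type-2 memberships come from two fuzzifiers (`m1`, `m2`). The centroid update uses the midpoint of the lower and upper memberships as a simplification in place of full Karnik-Mendel type reduction. All names, parameters, and the toy task/dataset matrix are hypothetical.

```python
import numpy as np

def fcm_memberships(X, centers, m):
    """Type-1 fuzzy c-means memberships for fuzzifier m (each row sums to 1)."""
    # Squared Euclidean distance from every point to every center: shape (n, c).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def interval_memberships(X, centers, m1, m2):
    """Lower and upper memberships obtained from the two fuzzifiers."""
    u1 = fcm_memberships(X, centers, m1)
    u2 = fcm_memberships(X, centers, m2)
    return np.minimum(u1, u2), np.maximum(u1, u2)

def it2_fcm(X, c, m1=1.5, m2=2.5, iters=50, seed=0):
    """Simplified interval type-2 FCM (assumption: midpoint type reduction)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)].copy()
    for _ in range(iters):
        lower, upper = interval_memberships(X, centers, m1, m2)
        u = (lower + upper) / 2.0            # simplification, not Karnik-Mendel
        w = u ** ((m1 + m2) / 2.0)           # weights for the centroid update
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    lower, upper = interval_memberships(X, centers, m1, m2)
    return centers, lower, upper

# Hypothetical usage matrix: rows are tasks, columns are datasets (1 = task uses dataset).
usage = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
])
# Assumed dependency measure: number of tasks shared by each pair of datasets.
dependency = usage.T @ usage

centers, lower, upper = it2_fcm(dependency, c=2)
# Assign each dataset to the data center (cluster) with the highest midpoint membership.
placement = ((lower + upper) / 2.0).argmax(axis=1)
print(placement)
```

In this sketch, each cluster stands for one data center, so datasets with strong mutual dependencies tend to be placed together. A full treatment would also account for data center storage capacity and dataset sizes, which this example leaves out.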




Updated: 2020-11-27