Detecting straggler MapReduce tasks in big data processing infrastructure by neural network
The Journal of Supercomputing (IF 2.5), Pub Date: 2020-01-09, DOI: 10.1007/s11227-019-03136-6
Amir Javadpour, Guojun Wang, Samira Rezaei, Kuan-Ching Li

Straggler task detection is one of the main challenges in applying MapReduce to parallelize and distribute large-scale data processing. It is defined as detecting tasks that are running on weak nodes. The Map phase has two stages (copy and combine) and the Reduce phase has three (shuffle, sort and reduce), so the total execution time is the sum of the execution times of these five stages. The primary purpose of this paper is to estimate the execution time of each stage correctly, which in turn yields a correct total execution time. The proposed method applies a backpropagation neural network on Hadoop to estimate the remaining execution time of tasks, which is essential for detecting stragglers. The results were compared with popular algorithms in this domain, such as LATE and ESAMR, and with the real remaining time on the WordCount and Sort benchmarks; they show that the method can detect straggler tasks and estimate execution time accurately. It also helps to shorten task execution time.
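
The relationship described in the abstract (total time as the sum of five stage times, and remaining time as the basis for flagging stragglers) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the per-stage times are taken as given inputs standing in for the output of a trained predictor, and the names `TaskEstimate`, `find_stragglers` and the `factor` threshold rule are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from statistics import mean

# Stage names as listed in the abstract: two Map stages, three Reduce stages.
STAGES = ("copy", "combine", "shuffle", "sort", "reduce")


@dataclass
class TaskEstimate:
    """Hypothetical container for one task's predicted stage times."""
    task_id: str
    stage_times: dict   # predicted seconds per stage (assumed to come from a model)
    elapsed: float      # seconds the task has already been running

    def total_time(self) -> float:
        # Total execution time is the sum over the five stages.
        return sum(self.stage_times[s] for s in STAGES)

    def remaining_time(self) -> float:
        # Remaining time cannot be negative if a task overruns its estimate.
        return max(self.total_time() - self.elapsed, 0.0)


def find_stragglers(tasks, factor=1.5):
    """Flag tasks whose remaining time exceeds `factor` times the mean.

    The thresholding rule is an assumption for this sketch, not the
    criterion used in the paper.
    """
    avg = mean(t.remaining_time() for t in tasks)
    return [t.task_id for t in tasks if t.remaining_time() > factor * avg]


if __name__ == "__main__":
    even = {s: 10.0 for s in STAGES}   # 50 s predicted in total
    slow = {s: 30.0 for s in STAGES}   # 150 s predicted in total
    tasks = [
        TaskEstimate("t1", even, elapsed=40.0),
        TaskEstimate("t2", even, elapsed=45.0),
        TaskEstimate("t3", slow, elapsed=30.0),
    ]
    print(find_stragglers(tasks))
```

In this sketch the accuracy of straggler detection depends entirely on the per-stage predictions, which is why the abstract treats stage-level time estimation as the central problem.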

Last updated: 2020-01-09