Optimizing job completion time with fairness in large-scale data centers,Future Generation Computer Systems

Optimizing job completion time with fairness in large-scale data centers
Future Generation Computer Systems (IF 7.5), Pub Date: 2020-08-22, DOI: 10.1016/j.future.2020.08.013
Zhaoxi Wu, Liqun Fu

In this paper, we aim to design new algorithms to schedule jobs efficiently and fairly. In particular, we consider that different jobs in a shared cluster have different degrees of sensitivity to their completion times, and that parallel frameworks should treat jobs differently according to that sensitivity. This problem, called CORA (Completion-time Optimal Resource Allocation), can be formulated as a lexicographical maximization problem, but it is challenging to solve in practice because of its inherent multi-objective, integer, large-scale, and non-convex nature. To address these challenges, we propose efficient algorithms for different problem scales. When the problem is of small-to-medium scale, we propose a method called SILM (Single Iteration Lexicographical Maximization) that finds the optimal schedule efficiently. When the problem is of large scale, we propose a method called MILM (Multiple Iterations Lexicographical Maximization), which can be further accelerated by the UCDM (Uniform Coordinate Descent Method), to find the optimal schedule iteratively. Finally, we propose a heuristic algorithm, called Multiple Resource Water Filling (MRWF), that reaches a close-to-optimal solution with a fast run time. Simulations using CVXPY (Python version of CVX) show that, compared with the state-of-the-art method called Linearized-CORA, our SILM method can reduce run time by up to 87.7%. Furthermore, the performance gap of the heuristic method is within 15% of the optimal solutions.
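
The abstract does not spell out how MRWF works, so the following is not the authors' algorithm. As a rough illustration of the water-filling idea behind the name, here is a minimal sketch of the classical single-resource water-filling procedure, which yields a max-min fair allocation (the allocation whose sorted vector is lexicographically largest). The function name `water_fill`, the single shared resource, and the fixed per-job demands are all assumptions made for this example.

```python
def water_fill(demands, capacity):
    """Classical single-resource water filling (max-min fair allocation).

    Illustrative only: NOT the MRWF algorithm from the paper.
    demands  -- non-negative resource demand of each job (hypothetical input)
    capacity -- total amount of the single shared resource
    Returns the allocation for each job, in the same order as `demands`.
    """
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    # Serve jobs from the smallest demand to the largest.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for pos, i in enumerate(order):
        # Equal share of the leftover capacity among jobs not yet served.
        share = remaining / (len(order) - pos)
        # A job never receives more than it asks for; any surplus
        # raises the "water level" for the jobs that follow.
        alloc[i] = float(min(demands[i], share))
        remaining -= alloc[i]
    return alloc


if __name__ == "__main__":
    # Four jobs share 20 units; the two largest are capped at level 7.
    print(water_fill([2, 8, 4, 10], capacity=20))  # [2.0, 7.0, 4.0, 7.0]
```

Small demands are satisfied in full, and the remaining capacity is split evenly among the larger ones. A multi-resource, completion-time-aware scheduler such as the one described above would need considerably more machinery than this sketch.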

Updated: 2020-08-22
