LraSched: Admitting More Long-Running Applications via Auto-Estimating Container Size and Affinity
The Computer Journal (IF 1.5), Pub Date: 2021-05-07, DOI: 10.1093/comjnl/bxab072
Binlei Cai, Qin Guo, Junfeng Yu
Long-running applications (LRAs) increasingly run as containers in shared production clusters. To achieve high resource efficiency and LRA performance, one of the key decisions made by existing cluster schedulers is the placement of LRA containers within a cluster. However, these schedulers do not estimate the size and affinity of LRA containers before placing them. We present LraSched, a cluster scheduler that places LRA containers onto machines based on their sizes and affinities while providing consistently high performance. LraSched introduces an automated method that leverages historical data and newly collected information to estimate container size and affinity for an LRA. Specifically, it uses an online machine learning method to map a newly arriving LRA to previous workloads from which experience can be transferred, and recommends the amount of resources (size) and the degree of collocation (affinity) for the new LRA's containers. Guided by these recommendations, LraSched adapts a vector bin-packing heuristic to LRA scheduling and places LRA containers so as to maximize the number of LRAs deployed and minimize resource fragmentation, without degrading LRA performance. Testbed and simulation experiments show that LraSched improves resource utilization by up to 6.2% while meeting LRA performance constraints.
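The abstract describes mapping a new LRA to similar historical workloads and reusing their size/affinity recommendations. The paper's actual online machine learning method is not detailed in the abstract; the following is a minimal illustrative sketch assuming a simple nearest-neighbor lookup over workload feature vectors, with `HISTORY`, `recommend`, and the feature encoding all being hypothetical names invented for illustration:

```python
import math

# Hypothetical history: workload feature vector (e.g. CPU-share, memory-share)
# mapped to a previously learned recommendation of (container size, affinity).
HISTORY = {
    (0.8, 0.2): ({"cpu": 2, "mem": 4}, 2),  # CPU-heavy workload
    (0.1, 0.9): ({"cpu": 1, "mem": 8}, 1),  # memory-heavy workload
}

def recommend(features):
    """Map a new LRA's features to the nearest historical workload and
    transfer that workload's recommended container size and affinity."""
    nearest = min(HISTORY, key=lambda hist: math.dist(hist, features))
    return HISTORY[nearest]
```

In this sketch a new CPU-heavy LRA, e.g. `recommend((0.7, 0.3))`, inherits the recommendation of the closest historical workload; an online learner would additionally update `HISTORY` as new performance data is collected.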
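The placement step adapts a vector bin-packing heuristic under the recommended affinity (collocation) limits. As a rough sketch of how such a heuristic can combine multi-dimensional fit with a per-machine collocation cap, the following uses a dot-product alignment score; the `Machine`, `fits`, and `place` names and the exact scoring rule are assumptions for illustration, not the paper's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Machine:
    cpu: float
    mem: float
    placed: dict = field(default_factory=dict)  # lra_id -> container count

def fits(m, c):
    """Capacity check plus affinity cap: at most c['affinity'] containers
    of the same LRA may be collocated on one machine."""
    return (m.cpu >= c["cpu"] and m.mem >= c["mem"]
            and m.placed.get(c["lra"], 0) < c["affinity"])

def place(containers, machines):
    """Greedy vector bin packing: put each container on the feasible machine
    whose remaining capacity best aligns (dot product) with its demand."""
    placements = []
    for c in containers:
        candidates = [m for m in machines if fits(m, c)]
        if not candidates:
            placements.append(None)  # container cannot be admitted
            continue
        best = max(candidates, key=lambda m: m.cpu * c["cpu"] + m.mem * c["mem"])
        best.cpu -= c["cpu"]
        best.mem -= c["mem"]
        best.placed[c["lra"]] = best.placed.get(c["lra"], 0) + 1
        placements.append(machines.index(best))
    return placements
```

The affinity cap is what spreads an LRA's containers across machines even when one machine could hold them all, trading some packing density for performance isolation.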
