Analysis and Optimization of Task Granularity on the Java Virtual Machine
ACM Transactions on Programming Languages and Systems (IF 1.5), Pub Date: 2019-07-16, DOI: 10.1145/3338497
Andrea Rosà, Eduardo Rosales, Walter Binder

Task granularity, i.e., the amount of work performed by parallel tasks, is a key performance attribute of parallel applications. On the one hand, fine-grained tasks (i.e., small tasks carrying out few computations) may introduce considerable parallelization overheads. On the other hand, coarse-grained tasks (i.e., large tasks performing substantial computations) may not fully utilize the available CPU cores, leading to missed parallelization opportunities. In this article, we provide a better understanding of task granularity for task-parallel applications running on a single Java Virtual Machine in a shared-memory multicore. We present a new methodology to accurately and efficiently collect the granularity of each executed task, implemented in a novel profiler (available open-source) that collects carefully selected metrics from the whole system stack with low overhead, and helps developers locate performance and scalability problems. We analyze task granularity in the DaCapo, ScalaBench, and Spark Perf benchmark suites, revealing inefficiencies related to fine-grained and coarse-grained tasks in several applications. We demonstrate that the collected task-granularity profiles are actionable by optimizing task granularity in several applications, achieving speedups up to a factor of 5.90×. Our results highlight the importance of analyzing and optimizing task granularity on the Java Virtual Machine.
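To make the trade-off between fine-grained and coarse-grained tasks concrete, the following is a minimal sketch using the standard Java fork/join framework. It is not the profiler or any optimization from the article. The class `GranularitySketch`, the task `SumTask`, and the `threshold` parameter are hypothetical names introduced only for illustration. The threshold controls how much work each leaf task performs: a small value yields many fine-grained tasks, while a large value yields a few coarse-grained tasks.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch only; all names here are assumptions, not taken from the article.
final class GranularitySketch {

    /** Sums data[lo, hi) by splitting the range until it holds at most `threshold` elements. */
    static final class SumTask extends RecursiveTask<Long> {
        private final long[] data;
        private final int lo;
        private final int hi;
        private final int threshold;

        SumTask(long[] data, int lo, int hi, int threshold) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
            this.threshold = threshold;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= threshold) {
                // Leaf task: its granularity is bounded by `threshold` elements.
                long sum = 0;
                for (int i = lo; i < hi; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = lo + (hi - lo) / 2;
            SumTask left = new SumTask(data, lo, mid, threshold);
            SumTask right = new SumTask(data, mid, hi, threshold);
            left.fork();                       // schedule the left half asynchronously
            long rightSum = right.compute();   // process the right half in this thread
            return left.join() + rightSum;
        }
    }

    /** Sums the array in parallel; `threshold` must be at least 1 so that splitting terminates. */
    static long sum(long[] data, int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length, threshold));
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        Arrays.fill(data, 1L);
        System.out.println(sum(data, 16));       // many fine-grained leaf tasks
        System.out.println(sum(data, 500_000));  // two coarse-grained leaf tasks
    }
}
```

Both calls compute the same sum; only the number and size of the tasks differ. Which threshold performs best depends on the workload and the machine, which is why measuring the granularity of the executed tasks matters.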

Updated: 2019-07-16