Specifying and Testing GPU Workgroup Progress Models

arXiv - CS - Programming Languages, Pub Date: 2021-09-13, DOI: arxiv-2109.06132
Tyler Sorensen, Lucas F. Salvador, Harmit Raval, Hugues Evrard, John Wickerson, Margaret Martonosi, Alastair F. Donaldson

As GPU availability has increased and programming support has matured, a wider variety of applications are being ported to these platforms. Many parallel applications contain fine-grained synchronization idioms; as such, their correct execution depends on a degree of relative forward progress between threads (or thread groups). Unfortunately, many GPU programming specifications say almost nothing about relative forward progress guarantees between workgroups. Although prior work has proposed a spectrum of plausible progress models for GPUs, cross-vendor specifications have yet to commit to any model. This work is a collection of tools and experimental data to aid specification designers when considering forward progress guarantees in programming frameworks. As a foundation, we formalize a small parallel programming language that captures the essence of fine-grained synchronization. We then provide a means of formally specifying a progress model, and develop a termination oracle that decides whether a given program is guaranteed to eventually terminate with respect to a given progress model. Next, we formalize a constraint for concurrent programs that require relative forward progress to terminate. Using this constraint, we synthesize a large set of 483 progress litmus tests. Combined with the termination oracle, this allows us to determine the expected status of each litmus test (i.e., whether it is guaranteed to terminate eventually) under various progress models. We present a large experimental campaign running the litmus tests across 8 GPUs from 5 different vendors. Our results highlight that GPUs have significantly different termination behaviors under our test suite. Most notably, we find that Apple and ARM GPUs do not support the linear occupancy-bound model, an intuitive progress model defined by prior work and hypothesized to describe the workgroup schedulers of existing GPUs.
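To make the idea of a termination oracle concrete, here is a minimal Python sketch. It is not the paper's language, formalization, or tool: the two-instruction mini-language (`store`, `spin`), the single shared `flag`, and every function name are assumptions made for illustration. The progress models encode only an informal reading of the models from prior work: occupancy-bound (`obe`: a thread that has taken a step is scheduled fairly), HSA-style (`hsa`: the lowest-id live thread is scheduled fairly), and linear occupancy-bound (`lobe`: a thread is scheduled fairly once it, or a thread with a higher id, has taken a step). In this mini-language program counters only advance, so the only cycles are failed spins that leave the state unchanged. The oracle therefore looks for a reachable state in which every fairly scheduled thread is stuck spinning.

```python
# Hypothetical mini-language: each thread is a list of instructions.
#   ("store", v): write v to the single shared flag (initially 0)
#   ("spin", v):  wait at this instruction until the flag equals v

def step(program, state, t):
    """Successor state after thread t takes one step, or None if t has terminated."""
    pcs, flag, started = state
    if pcs[t] == len(program[t]):
        return None
    op, val = program[t][pcs[t]]
    new_pcs = list(pcs)
    if op == "store":
        flag = val
        new_pcs[t] += 1
    elif op == "spin" and flag == val:
        new_pcs[t] += 1  # a failed spin leaves the program counter unchanged
    return (tuple(new_pcs), flag, started | frozenset([t]))

# Progress models: given the threads that have started and the threads that
# are still live, return the threads guaranteed to be scheduled eventually.
def fair(started, live):
    return set(live)

def obe(started, live):
    return started & live

def hsa(started, live):
    return {min(live)} if live else set()

def lobe(started, live):
    return {t for t in live if started and t <= max(started)}

def guaranteed_to_terminate(program, model):
    """True if no infinite execution of program is allowed under model."""
    n = len(program)
    init = (tuple(0 for _ in program), 0, frozenset())
    edges, todo = {}, [init]
    while todo:  # explore every reachable state
        s = todo.pop()
        if s in edges:
            continue
        edges[s] = []
        for t in range(n):
            nxt = step(program, s, t)
            if nxt is not None:
                edges[s].append((t, nxt))
                todo.append(nxt)
    for s, outs in edges.items():
        pcs, _, started = s
        live = {t for t in range(n) if pcs[t] < len(program[t])}
        spinners = {t for t, nxt in outs if nxt == s}  # self-loops
        # An infinite execution can stay in s forever if some thread spins
        # here and the model obliges the scheduler to run only spinners.
        if spinners and model(started, live) <= spinners:
            return False
    return True

# Two litmus-style tests: a spinning thread waits for a store by the other thread.
WAITER_LOW = [[("spin", 1)], [("store", 1)]]   # thread 0 waits on thread 1
WAITER_HIGH = [[("store", 1)], [("spin", 1)]]  # thread 1 waits on thread 0
```

Under these assumptions, `guaranteed_to_terminate(WAITER_HIGH, lobe)` returns `True` while `guaranteed_to_terminate(WAITER_HIGH, obe)` returns `False`, and `WAITER_LOW` is guaranteed to terminate only under `fair`. This mirrors, in miniature, how a test's expected status can differ from one progress model to another.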

Updated: 2021-09-14
