A general purpose contention manager for software transactions on the GPU
Journal of Parallel and Distributed Computing (IF 3.8), Pub Date: 2020-01-28, DOI: 10.1016/j.jpdc.2019.12.018
Qi Shen, Craig Sharp, Richard Davison, Gary Ushaw, Rajiv Ranjan, Albert Y. Zomaya, Graham Morgan

The Graphics Processing Unit (GPU) is now used extensively for general purpose GPU programming (GPGPU), allowing for greater exploitation of the multi-core model across many application domains. This is particularly true in cloud/edge/fog computing, where multiple GPU-enabled servers support many different end-user services. This move away from the naturally parallel domain of graphics can incur significant performance issues. Unlike the CPU, code that is hindered from execution due to blocking/waiting on the GPU can affect thousands of threads, rendering the advantages of a GPU irrelevant and, in the worst case, reducing a highly parallel environment to a serial one. In this paper we present a solution that minimises blocking/waiting in GPGPU computing using a contention manager that offsets memory conflicts across threads through thread re-ordering. We consider memory conflicts not only to avoid corruption (standard for transactional memory) but also at the semantic layer of application logic (e.g., enforcing an ordering so that money drawn from a bank account is withdrawn only after all deposits have been applied). We demonstrate that our approach is successful across a number of industry benchmarks and compare it to the only other related solution. We also demonstrate that our approach scales with thread count (a key requirement on the GPU). We believe this is the first work of its kind to demonstrate a generalised conflict and semantic contention manager suitable for the scale of parallel execution found on a GPU.
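The abstract describes a contention manager that avoids blocking by re-ordering conflicting work and by enforcing semantic ordering constraints (such as deposits before withdrawals). The CUDA sketch below is not the paper's algorithm; it only illustrates the general idea under simplifying assumptions: transactions are grouped by the account they touch and ordered so deposits precede withdrawals, and one GPU thread then applies each account's transactions sequentially while distinct accounts proceed in parallel, so no two threads ever contend for the same balance. The Txn struct, the kApply kernel and the segment layout are illustrative names introduced here, not taken from the paper.

// A minimal sketch (not the paper's contention manager) of conflict
// avoidance by re-ordering: group transactions by account, order
// deposits before withdrawals (a semantic constraint), then apply
// each account's transactions with a single thread.
#include <cstdio>
#include <vector>
#include <algorithm>
#include <cuda_runtime.h>

struct Txn {
    int account;   // which balance this transaction touches
    int kind;      // 0 = deposit, 1 = withdrawal (deposits ordered first)
    int amount;
};

// One thread per account: applies that account's transactions in order,
// so no two threads ever write the same balance (no conflicts to manage).
__global__ void kApply(const Txn* txns, const int* segStart,
                       const int* segEnd, float* balances, int nAccounts) {
    int a = blockIdx.x * blockDim.x + threadIdx.x;
    if (a >= nAccounts) return;
    float b = balances[a];
    for (int i = segStart[a]; i < segEnd[a]; ++i) {
        const Txn& t = txns[i];
        b += (t.kind == 0) ? t.amount : -t.amount;
    }
    balances[a] = b;
}

int main() {
    const int nAccounts = 4;
    std::vector<Txn> txns = {
        {0, 1, 30}, {1, 0, 50}, {0, 0, 100}, {1, 1, 20}, {2, 0, 70}, {0, 1, 10},
    };

    // Re-order on the host: group by account, deposits before withdrawals.
    std::sort(txns.begin(), txns.end(), [](const Txn& x, const Txn& y) {
        return x.account != y.account ? x.account < y.account : x.kind < y.kind;
    });

    // Build per-account segments over the sorted transaction list.
    std::vector<int> segStart(nAccounts, 0), segEnd(nAccounts, 0);
    for (int a = 0, i = 0; a < nAccounts; ++a) {
        segStart[a] = i;
        while (i < (int)txns.size() && txns[i].account == a) ++i;
        segEnd[a] = i;
    }

    Txn* dTxns; int *dStart, *dEnd; float* dBal;
    cudaMalloc(&dTxns, txns.size() * sizeof(Txn));
    cudaMalloc(&dStart, nAccounts * sizeof(int));
    cudaMalloc(&dEnd, nAccounts * sizeof(int));
    cudaMalloc(&dBal, nAccounts * sizeof(float));
    cudaMemcpy(dTxns, txns.data(), txns.size() * sizeof(Txn), cudaMemcpyHostToDevice);
    cudaMemcpy(dStart, segStart.data(), nAccounts * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dEnd, segEnd.data(), nAccounts * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dBal, 0, nAccounts * sizeof(float));

    kApply<<<1, 32>>>(dTxns, dStart, dEnd, dBal, nAccounts);

    std::vector<float> bal(nAccounts);
    cudaMemcpy(bal.data(), dBal, nAccounts * sizeof(float), cudaMemcpyDeviceToHost);
    for (int a = 0; a < nAccounts; ++a) printf("account %d: %.0f\n", a, bal[a]);

    cudaFree(dTxns); cudaFree(dStart); cudaFree(dEnd); cudaFree(dBal);
    return 0;
}

In this toy arrangement the re-ordering removes write-write conflicts entirely; the paper's contention manager instead re-orders contending transactional threads while they execute, which is what allows it to remain general purpose.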



Updated: 2020-01-29