Assisting Static Compiler Vectorization with a Speculative Dynamic Vectorizer in an HW/SW Codesigned Environment, ACM Transactions on Computer Systems


ACM Transactions on Computer Systems (IF 1.5), Pub Date: 2020-04-04, DOI: 10.1145/2807694
Rakesh Kumar, Alejandro Martínez, Antonio González

Compiler-based static vectorization is widely used to extract data-level parallelism from computation-intensive applications. It is very effective for traditional array-based applications. However, compilers’ inability to perform accurate interprocedural pointer disambiguation and interprocedural array dependence analysis severely limits vectorization opportunities. HW/SW codesigned processors provide an excellent opportunity to optimize applications at runtime: the dynamic application behavior available at runtime helps capture vectorization opportunities that compilers generally miss. This article proposes complementing static vectorization with a speculative dynamic vectorizer in an HW/SW codesigned processor. We present a speculative dynamic vectorization algorithm that speculatively reorders ambiguous memory references to uncover vectorization opportunities. This speculative reordering of memory instructions avoids the need for accurate interprocedural pointer disambiguation and interprocedural array dependence analysis. The hardware checks for memory dependence violations caused by speculative vectorization and takes corrective action when one occurs. Our experiments show that the combined (static + dynamic) vectorization approach provides a 2× performance benefit over static GCC vectorization alone for SPECFP2006. In addition, the speculative dynamic vectorizer is able to vectorize 48% of the loops in the TSVC benchmark suite that ICC failed to vectorize due to conservative dependence analysis. The dynamic vectorization scheme is also as effective for pointer-based applications as for array-based ones, whereas compilers lose significant vectorization opportunities in pointer-based applications. Finally, we show that speculation is not merely a luxury but a necessity for runtime vectorization.
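
To make the idea of speculatively reordering ambiguous memory references concrete, here is a minimal software sketch in Python. It is an illustration only and is not the algorithm from the article: the function names, the vector length `VL`, and the flat `mem` list are all assumptions made for this example, and the dependence check is done in software before each group runs, whereas the abstract describes hardware detecting violations and taking corrective action. The loop being modeled is `mem[dst + i] = mem[src + i] + 1`, where a static compiler may not know whether the `dst` and `src` regions overlap.

```python
VL = 4  # assumed vector length (elements per vector operation)


def scalar_loop(mem, dst, src, n):
    """Reference semantics: one iteration at a time, in program order."""
    for i in range(n):
        mem[dst + i] = mem[src + i] + 1


def speculative_vector_loop(mem, dst, src, n):
    """Run the same loop in groups of up to VL iterations, hoisting all
    loads of a group above its stores. Returns the number of groups that
    fell back to scalar execution."""
    rollbacks = 0
    for base in range(0, n, VL):
        width = min(VL, n - base)
        load_addrs = [src + base + k for k in range(width)]
        store_addrs = [dst + base + k for k in range(width)]
        # Software stand-in for the hardware check: a store from an earlier
        # iteration writes an address that a later iteration in the same
        # group loads (a read-after-write dependence broken by reordering).
        violation = any(
            store_addrs[j] == load_addrs[k]
            for j in range(width)
            for k in range(j + 1, width)
        )
        if violation:
            # Corrective action in this sketch: redo the group in scalar order.
            rollbacks += 1
            scalar_loop(mem, dst + base, src + base, width)
        else:
            values = [mem[a] + 1 for a in load_addrs]  # "vector" load and add
            for addr, value in zip(store_addrs, values):  # "vector" store
                mem[addr] = value
    return rollbacks
```

When the two regions are disjoint, every group takes the vector path and the function returns 0. When `dst == src + 1`, each iteration reads what the previous one wrote, so every group is flagged and re-executed in scalar order, which keeps the result identical to `scalar_loop`.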


Updated: 2020-04-04
