NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning
arXiv - CS - Programming Languages. Pub Date: 2019-09-20, DOI: arXiv:1909.13639
Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Sophia Shao, Krste Asanovic, Ion Stoica

One of the key challenges that arises when compilers vectorize loops for today's SIMD-enabled architectures is deciding whether vectorization or interleaving is beneficial. The compiler then has to determine how many instructions to pack together and how many loop iterations to interleave. Today's compilers use fixed-cost models based on heuristics to make vectorization decisions for loops. However, these models cannot capture data dependencies, the computation graph, or the organization of instructions. Alternatively, software engineers often hand-write the vectorization factors of every loop, which places a huge burden on them: it requires prior experience and significantly increases development time. In this work, we explore a novel approach to loop vectorization and propose an end-to-end solution using deep reinforcement learning (RL). We conjecture that deep RL can capture different instructions, dependencies, and data structures, enabling it to learn a sophisticated model that better predicts the actual performance cost and determines the optimal vectorization factors. We develop an end-to-end framework, from code to vectorization, that integrates deep RL into the LLVM compiler. Our proposed framework takes benchmark codes as input and extracts their loops. These loops are then fed to a loop embedding generator that learns an embedding for them. Finally, the learned embeddings are used as input to a deep RL agent, which determines the vectorization factors for all the loops. We further extend our framework to support multiple supervised learning methods. We evaluate our approaches against the current LLVM vectorizer and loop polyhedral optimization techniques. Our experiments show a 1.29X-4.73X performance speedup over the baseline, and only 3% worse than brute-force search, on a wide range of benchmarks.

Updated: 2020-01-07