Differentiating a Tensor Language
arXiv - CS - Programming Languages | Pub Date: 2020-08-25 | DOI: arxiv-2008.11256
Gilbert Bernstein, Michael Mara, Tzu-Mao Li, Dougal Maclaurin, and Jonathan Ragan-Kelley

How does one compile derivatives of tensor programs such that the resulting code is purely functional (hence easier to optimize and parallelize) and provably efficient relative to the original program? We show that naively differentiating tensor code, as done in popular systems like TensorFlow and PyTorch, can cause asymptotic slowdowns in pathological cases, violating the Cheap Gradients Principle. However, all existing automatic differentiation methods that guarantee this principle (for variable-size data) do so by relying on += mutation through aliases/pointers, which complicates downstream optimization. We provide the first purely functional, provably efficient, adjoint/reverse-mode derivatives of array/tensor code by explicitly accounting for sparsity. We do this by focusing on the indicator function from Iverson's APL. We also introduce a new "Tensor SSA" normal form and a new derivation of reverse-mode automatic differentiation based on the universal property of inner products.
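
To make the sparsity argument concrete, here is a minimal sketch (not taken from the paper; the toy program, its NumPy encoding, and all function names are illustrative assumptions). The forward program is a gather, y[i] = x[idx[i]]. Its reverse-mode adjoint is a scatter, which imperative AD systems express with += mutation through an alias; using an Iverson-style indicator [idx[i] == j], the same adjoint can instead be written as a pure summation, dx[j] = Σ_i [idx[i] == j] · dy[i].

import numpy as np

def gather(x, idx):
    # Forward program: y[i] = x[idx[i]]; O(m) work for m = len(idx).
    return x[idx]

def adjoint_mutating(dy, idx, n):
    # Conventional reverse-mode adjoint: a scatter expressed with
    # += mutation, as in imperative AD runtimes.
    dx = np.zeros(n)
    for i, j in enumerate(idx):
        dx[j] += dy[i]
    return dx

def adjoint_functional(dy, idx, n):
    # Purely functional adjoint via an Iverson-style indicator:
    #     dx[j] = sum_i [idx[i] == j] * dy[i]
    # Materialized densely this costs O(m * n), but the indicator
    # exposes the sparsity, so a compiler can lower it back to the
    # O(m) scatter above without ever emitting a mutation.
    indicator = idx[:, None] == np.arange(n)[None, :]   # shape (m, n)
    return (indicator * dy[:, None]).sum(axis=0)

# Tiny check that the two adjoints agree.
rng = np.random.default_rng(0)
n, m = 5, 8
idx = rng.integers(0, n, size=m)
dy = rng.standard_normal(m)
assert np.allclose(adjoint_mutating(dy, idx, n),
                   adjoint_functional(dy, idx, n))

The abstract's appeal to the universal property of inner products corresponds to the standard characterization of the adjoint: the reverse-mode derivative Df(x)ᵀ is the unique linear map satisfying ⟨Df(x) u, v⟩ = ⟨u, Df(x)ᵀ v⟩ for all u, v. In the gather example, the Jacobian is the 0/1 matrix J[i, j] = [idx[i] == j], and summing dy against the indicator is exactly multiplication by Jᵀ.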

Last updated: 2020-10-01