Dynamic Tensor Rematerialization
arXiv - CS - Programming Languages, Pub Date: 2020-06-17, arXiv: 2006.09616
Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, and Zachary Tatlock

Checkpointing enables training deep learning models under restricted memory budgets by freeing intermediate activations from memory and recomputing them on demand. Previous checkpointing techniques statically plan these recomputations offline and assume static computation graphs. We demonstrate that a simple online algorithm can achieve comparable performance by introducing Dynamic Tensor Rematerialization (DTR), a greedy online algorithm for checkpointing that is extensible and general, is parameterized by eviction policy, and supports dynamic models. We prove that DTR can train an $N$-layer linear feedforward network on an $\Omega(\sqrt{N})$ memory budget with only $\mathcal{O}(N)$ tensor operations. DTR closely matches the performance of optimal static checkpointing in simulated experiments. We incorporate a DTR prototype into PyTorch just by interposing on tensor allocations and operator calls and collecting lightweight metadata on tensors.
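The abstract describes DTR as a greedy, heuristic-driven scheme that interposes on tensor operations, tracks lightweight metadata, and evicts tensors on demand so they can be recomputed later. The sketch below is a minimal, self-contained illustration of that idea, not the authors' PyTorch prototype; the `Budget` and `Tensor` classes, the `materialize`/`evict` methods, and the cost / (size × staleness) eviction score are illustrative assumptions based on the abstract's description.

```python
import time

class Budget:
    """Tracks resident tensors against a fixed memory limit (abstract units)."""
    def __init__(self, limit):
        self.limit = limit
        self.tensors = []              # every tensor created in this pool

    def used(self):
        return sum(t.size for t in self.tensors if t.resident)

class Tensor:
    """A value that can be evicted and later rematerialized from its parents."""
    def __init__(self, pool, size, cost, op, parents=()):
        self.pool = pool               # shared Budget
        self.size = size               # memory footprint of the value
        self.cost = cost               # time to recompute the value via `op`
        self.op = op                   # callable that recomputes the value
        self.parents = parents         # input tensors of `op`
        self.value = None
        self.last_access = time.monotonic()
        pool.tensors.append(self)

    @property
    def resident(self):
        return self.value is not None

    def evict(self):
        """Free this tensor's value; it can be rebuilt later on demand."""
        self.value = None

    def materialize(self):
        """Return the value, recursively rematerializing evicted inputs."""
        if not self.resident:
            # Note: a real implementation would also pin these inputs so they
            # cannot be evicted while the pending operation still needs them.
            args = [p.materialize() for p in self.parents]
            self._make_room(self.size)
            self.value = self.op(*args)
        self.last_access = time.monotonic()
        return self.value

    def _make_room(self, needed):
        """Greedy eviction: while over budget, drop the resident tensor with
        the smallest recompute-cost / (size * staleness) score."""
        pool = self.pool
        while pool.used() + needed > pool.limit:
            now = time.monotonic()
            victims = [t for t in pool.tensors if t.resident and t is not self]
            if not victims:
                raise MemoryError("memory budget too small for this allocation")
            victim = min(victims,
                         key=lambda t: t.cost / (t.size * (now - t.last_access + 1e-9)))
            victim.evict()

# Toy usage: a two-op chain under a budget that cannot hold both values at once.
pool = Budget(limit=6)
x = Tensor(pool, size=4, cost=0.1, op=lambda: [1.0] * 4)
y = Tensor(pool, size=4, cost=1.0, op=lambda v: [2 * e for e in v], parents=(x,))
print(y.materialize())                 # materializes x, then evicts it to fit y
```

The eviction score used here favors dropping tensors that are cheap to recompute, large, and stale; this roughly mirrors the kind of heuristic the paper says parameterizes DTR, though the exact policy and bookkeeping in the real system differ.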

Updated: 2020-10-14