Scaling *down* Deep Learning
arXiv - CS - Machine Learning Pub Date : 2020-11-29 , DOI: arxiv-2011.14439 Sam Greydanus
Though deep learning models have taken on commercial and political relevance,
many aspects of their training and operation remain poorly understood. This has
sparked interest in "science of deep learning" projects, many of which are run
at scale and require enormous amounts of time, money, and electricity. But how
much of this research really needs to occur at scale? In this paper, we
introduce MNIST-1D: a minimalist, low-memory, and low-compute alternative to
classic deep learning benchmarks. The training examples are 20 times smaller
than MNIST examples yet they differentiate more clearly between linear,
nonlinear, and convolutional models, which attain 32%, 68%, and 94% accuracy
respectively (on MNIST the same models obtain 94%, 99+%, and 99+%). Then we present
example use cases which include measuring the spatial inductive biases of
lottery tickets, observing deep double descent, and metalearning an activation
function.
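The comparison above contrasts three model families on inputs that are roughly 20 times smaller than MNIST's 28×28 images (MNIST-1D examples are 40-dimensional sequences). A minimal numpy sketch of forward passes for the three families is below; the batch size, hidden width, and filter counts are illustrative choices, not values from the paper, and the inputs are random stand-ins for real MNIST-1D data.

```python
import numpy as np

rng = np.random.default_rng(0)
L, C = 40, 10                      # MNIST-1D sequence length (per the paper), 10 digit classes
x = rng.standard_normal((32, L))   # random batch shaped like MNIST-1D inputs

# Linear classifier: a single affine map from the 40 inputs to 10 logits.
W = rng.standard_normal((L, C))
b = np.zeros(C)
linear_logits = x @ W + b

# Nonlinear model (MLP): one hidden ReLU layer.
H = 100                            # hidden width (illustrative)
W1 = rng.standard_normal((L, H))
W2 = rng.standard_normal((H, C))
mlp_logits = np.maximum(x @ W1, 0) @ W2

# Convolutional model: a 1-D filter bank, global average pooling, then a linear readout.
K, F = 5, 8                        # kernel size and filter count (illustrative)
kernels = rng.standard_normal((F, K))
conv = np.stack([[np.convolve(row, k, mode="valid") for k in kernels] for row in x])
pooled = conv.mean(axis=2)         # (32, F) after global average pooling over positions
Wc = rng.standard_normal((F, C))
conv_logits = pooled @ Wc

print(linear_logits.shape, mlp_logits.shape, conv_logits.shape)
```

All three produce (32, 10) logit batches; the point of the benchmark is that, once trained on real MNIST-1D data, these families separate cleanly in accuracy (32%, 68%, 94%) despite the tiny input size.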
Updated: 2020-12-01