Experiments on Properties of Hidden Structures of Sparse Neural Networks
arXiv - CS - Neural and Evolutionary Computing. Pub Date: 2021-07-27, DOI: arxiv-2107.12917
Julian Stier, Harshil Darji, Michael Granitzer

Sparsity in the structure of Neural Networks can lead to less energy consumption, less memory usage, faster computation times on convenient hardware, and automated machine learning. If sparsity gives rise to certain kinds of structure, it can explain features that are obtained automatically during learning. We provide insights into experiments in which we show how sparsity can be achieved through prior initialization, pruning, and during learning, and we answer questions on the relationship between the structure of Neural Networks and their performance. This includes the first work on inducing priors from network theory into Recurrent Neural Networks and architectural performance prediction during a Neural Architecture Search. Within our experiments, we show that magnitude class blinded pruning achieves 97.5% accuracy on MNIST with 80% compression and re-training, which is 0.5 points more than without compression, that magnitude class uniform pruning is significantly inferior to it, and that a genetic search enhanced with performance prediction achieves 82.4% on CIFAR10. Further, performance prediction for Recurrent Networks learning the Reber grammar shows an $R^2$ of up to 0.81 given only structural information.
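To make the pruning terminology in the abstract concrete, below is a minimal NumPy sketch (not the authors' implementation) of the difference between magnitude class blinded pruning, which thresholds weight magnitudes globally across all layers, and magnitude class uniform pruning, which removes the same fraction of weights within each layer. The layer names, shapes, and the 80% sparsity target are hypothetical illustration values.

```python
# Illustrative sketch only: contrasts class-blind vs. class-uniform
# magnitude pruning; it is not the code used in the paper.
import numpy as np

def magnitude_prune_class_blind(weights, sparsity):
    """Zero the globally smallest |w| across all layers (class blinded)."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    threshold = np.quantile(all_mags, sparsity)  # one global cutoff
    return {name: w * (np.abs(w) >= threshold) for name, w in weights.items()}

def magnitude_prune_class_uniform(weights, sparsity):
    """Zero the smallest |w| separately in each layer (class uniform)."""
    pruned = {}
    for name, w in weights.items():
        threshold = np.quantile(np.abs(w), sparsity)  # per-layer cutoff
        pruned[name] = w * (np.abs(w) >= threshold)
    return pruned

# Two toy layers with different weight scales (hypothetical example).
weights = {
    "fc1": np.random.randn(256, 784) * 0.1,
    "fc2": np.random.randn(10, 256) * 1.0,
}
blind = magnitude_prune_class_blind(weights, 0.8)
uniform = magnitude_prune_class_uniform(weights, 0.8)
# Class-blind pruning removes far more weights from the small-scale layer,
# while class-uniform pruning removes exactly 80% from every layer.
```

In practice, the pruned network would then be re-trained, as the abstract's MNIST result (97.5% at 80% compression with re-training) indicates.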

Updated: 2021-07-28