Understanding Neural Networks and Individual Neuron Importance via Information-Ordered Cumulative Ablation
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2). Pub Date: 2021-06-30. DOI: 10.1109/tnnls.2021.3088685
Rana Ali Amjad, Kairen Liu, Bernhard C. Geiger

In this work, we investigate the use of three information-theoretic quantities—entropy, mutual information with the class variable, and a class selectivity measure based on Kullback–Leibler (KL) divergence—to understand and study the behavior of already trained fully connected feedforward neural networks (NNs). We analyze the connection between these information-theoretic quantities and classification performance on the test set by cumulatively ablating neurons in networks trained on MNIST, FashionMNIST, and CIFAR-10. Our results parallel those recently published by Morcos et al., indicating that class selectivity is not a good indicator for classification performance. However, looking at individual layers separately, both mutual information and class selectivity are positively correlated with classification performance, at least for networks with ReLU activation functions. We provide explanations for this phenomenon and conclude that it is ill-advised to compare the proposed information-theoretic quantities across layers. Furthermore, we show that cumulative ablation of neurons with ascending or descending information-theoretic quantities can be used to formulate hypotheses regarding the joint behavior of multiple neurons, such as redundancy and synergy, with comparably low computational cost. We also draw connections to the information bottleneck theory for NNs.
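The procedure the abstract describes — estimate a per-neuron information-theoretic quantity, order the neurons by it, then ablate them cumulatively while tracking test accuracy — is straightforward to prototype. Below is a minimal NumPy sketch under assumptions not taken from the paper: activations are recorded at a single layer on the test set, probabilities are estimated by quantizing activations into equal-width bins, class selectivity is approximated as the largest KL divergence between a class-conditional activation distribution and the marginal (the paper's exact measure may differ), ablation fixes a neuron's output to zero, and a hypothetical `readout` callable stands in for the rest of the trained network.

```python
import numpy as np

def neuron_quantities(acts, labels, n_bins=10, n_classes=10):
    """Estimate per-neuron entropy H(Z), mutual information I(Z;Y), and a
    KL-based class selectivity from quantized activations.

    acts:   (n_samples, n_neurons) activations of one layer on the test set
    labels: (n_samples,) integer class labels
    """
    n, d = acts.shape
    H, MI, SEL = np.zeros(d), np.zeros(d), np.zeros(d)
    for j in range(d):
        # Quantize this neuron's output into equal-width bins (an assumption;
        # the paper's estimator may discretize differently).
        edges = np.histogram_bin_edges(acts[:, j], bins=n_bins)
        q = np.digitize(acts[:, j], edges[1:-1])  # bin index in 0..n_bins-1
        p_z = np.bincount(q, minlength=n_bins) / n
        nz = p_z > 0
        H[j] = -np.sum(p_z[nz] * np.log2(p_z[nz]))
        cond_H, kl_per_class = 0.0, []
        for y in range(n_classes):
            mask = labels == y
            if not mask.any():
                continue
            p_zy = np.bincount(q[mask], minlength=n_bins) / mask.sum()
            nzy = p_zy > 0
            cond_H += mask.mean() * -np.sum(p_zy[nzy] * np.log2(p_zy[nzy]))
            # KL(p(z|y) || p(z)): how atypical class y's activations are.
            kl_per_class.append(np.sum(p_zy[nzy] * np.log2(p_zy[nzy] / p_z[nzy])))
        MI[j] = H[j] - cond_H        # I(Z;Y) = H(Z) - H(Z|Y)
        SEL[j] = max(kl_per_class)   # illustrative selectivity: worst-case class
    return H, MI, SEL

def cumulative_ablation_curve(acts, labels, scores, readout, ascending=True):
    """Ablate neurons one at a time in score order, recording test accuracy.

    readout: hypothetical callable mapping (possibly ablated) layer activations
             to predicted labels, i.e. the downstream part of the network.
    Ablation here fixes a neuron's output to zero, one common choice.
    """
    order = np.argsort(scores)
    if not ascending:
        order = order[::-1]
    ablated = acts.copy()
    accuracies = []
    for j in order:
        ablated[:, j] = 0.0
        accuracies.append(np.mean(readout(ablated) == labels))
    return np.array(accuracies)
```

Comparing how quickly accuracy decays under ascending versus descending orderings of one of these scores is what allows hypotheses about joint behavior: slow decay despite removing high-scoring neurons suggests redundancy, while a sharp drop from removing low-scoring neurons hints at synergy.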
