
Excitation Dropout: Encouraging Plasticity in Deep Neural Networks


Abstract

We propose a guided dropout regularizer for deep networks based on the evidence of a network prediction, defined as the firing of neurons along specific paths. In this work, we utilize the evidence at each neuron to determine the probability of dropout, rather than dropping out neurons uniformly at random as in standard dropout. In essence, at training time we drop out with higher probability those neurons that contribute more to decision making. This approach penalizes high-saliency neurons that are most relevant for the model prediction, i.e., those having stronger evidence. By dropping such high-saliency neurons, the network is forced to learn alternative paths in order to maintain loss minimization, resulting in a plasticity-like behavior, a characteristic also observed in human brains. We demonstrate better generalization, increased utilization of network neurons, and higher resilience to network compression, as measured by several metrics over four image/video recognition benchmarks.
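To make the idea concrete, below is a minimal, hedged sketch of evidence-guided dropout in PyTorch. It is not the paper's exact formulation: the per-neuron evidence is approximated here by normalized activation magnitude (the paper derives evidence via Excitation Backprop), the mapping from evidence to dropout probability is a simple rescaling chosen so the average dropout rate roughly matches a base rate, and the function name and parameters are illustrative only.

```python
import torch


def excitation_style_dropout(activations: torch.Tensor,
                             base_rate: float = 0.5,
                             training: bool = True) -> torch.Tensor:
    """Drop neurons with probability proportional to a proxy for their evidence.

    activations: (batch, num_neurons) output of a hidden layer.
    base_rate:   target average dropout probability across neurons.
    """
    if not training or base_rate <= 0.0:
        return activations

    # Proxy evidence: normalized activation magnitude per sample.
    # (Assumption: the paper instead uses Excitation Backprop saliency.)
    evidence = activations.detach().abs()
    evidence = evidence / evidence.sum(dim=1, keepdim=True).clamp_min(1e-12)

    # Map evidence to a per-neuron dropout probability whose mean is roughly
    # base_rate, clamped so every neuron keeps a nonzero retention chance.
    num_neurons = activations.shape[1]
    p_drop = (base_rate * num_neurons * evidence).clamp(0.0, 0.95)

    # Sample a keep mask and rescale (inverted-dropout convention) so the
    # expected activation magnitude is preserved.
    keep_mask = torch.bernoulli(1.0 - p_drop)
    return activations * keep_mask / (1.0 - p_drop)
```

In a network, such a function would stand in for a standard dropout call after a hidden layer during training; at test time all units are kept, as in standard dropout.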



Acknowledgements

This work was supported in part by the Defense Advanced Research Projects Agency (DARPA) Explainable Artificial Intelligence (XAI) program, an IBM PhD Fellowship, a Hariri Graduate Fellowship, and gifts from Adobe and NVidia. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA, DOI/IBC, or the U.S. Government.

Author information

Corresponding author

Correspondence to Andrea Zunino.

Additional information

Communicated by Nikos Komodakis.


This work was done when A. Zunino was in PAVIS, at Istituto Italiano di Tecnologia.


About this article


Cite this article

Zunino, A., Bargal, S.A., Morerio, P. et al. Excitation Dropout: Encouraging Plasticity in Deep Neural Networks. Int J Comput Vis 129, 1139–1152 (2021). https://doi.org/10.1007/s11263-020-01422-y

