
Excitation Dropout: Encouraging Plasticity in Deep Neural Networks


Abstract

We propose a guided dropout regularizer for deep networks based on the evidence of a network prediction, defined as the firing of neurons along specific paths. In this work, we utilize the evidence at each neuron to determine the probability of dropout, rather than dropping out neurons uniformly at random as in standard dropout. In essence, at training time we drop out with higher probability those neurons that contribute more to decision making. This approach penalizes high-saliency neurons that are most relevant for the model prediction, i.e., those having stronger evidence. By dropping such high-saliency neurons, the network is forced to learn alternative paths in order to maintain loss minimization, resulting in a plasticity-like behavior, a characteristic also observed in human brains. We demonstrate better generalization, increased utilization of network neurons, and higher resilience to network compression, as measured by several metrics over four image/video recognition benchmarks.
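To make the idea concrete, below is a minimal, hedged sketch of evidence-guided dropout in PyTorch. It is not the paper's exact formulation: the per-neuron evidence is approximated here by normalized activation magnitude (the paper derives evidence via Excitation Backprop), the mapping from evidence to dropout probability is a simple rescaling chosen so the average dropout rate roughly matches a base rate, and the function name and parameters are illustrative only.

```python
import torch


def excitation_style_dropout(activations: torch.Tensor,
                             base_rate: float = 0.5,
                             training: bool = True) -> torch.Tensor:
    """Drop neurons with probability proportional to a proxy for their evidence.

    activations: (batch, num_neurons) output of a hidden layer.
    base_rate:   target average dropout probability across neurons.
    """
    if not training or base_rate <= 0.0:
        return activations

    # Proxy evidence: normalized activation magnitude per sample.
    # (Assumption: the paper instead uses Excitation Backprop saliency.)
    evidence = activations.detach().abs()
    evidence = evidence / evidence.sum(dim=1, keepdim=True).clamp_min(1e-12)

    # Map evidence to a per-neuron dropout probability whose mean is roughly
    # base_rate, clamped so every neuron keeps a nonzero retention chance.
    num_neurons = activations.shape[1]
    p_drop = (base_rate * num_neurons * evidence).clamp(0.0, 0.95)

    # Sample a keep mask and rescale (inverted-dropout convention) so the
    # expected activation magnitude is preserved.
    keep_mask = torch.bernoulli(1.0 - p_drop)
    return activations * keep_mask / (1.0 - p_drop)
```

In a network, such a function would stand in for a standard dropout call after a hidden layer during training; at test time all units are kept, as in standard dropout.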



Acknowledgements

This work was supported in part by the Defense Advanced Research Projects Agency (DARPA) Explainable Artificial Intelligence (XAI) program, an IBM PhD Fellowship, a Hariri Graduate Fellowship, and gifts from Adobe and NVidia. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA, DOI/IBC, or the U.S. Government.

Author information

Corresponding author

Correspondence to Andrea Zunino.

Additional information

Communicated by Nikos Komodakis.


This work was done when A. Zunino was in PAVIS, at Istituto Italiano di Tecnologia.


About this article


Cite this article

Zunino, A., Bargal, S.A., Morerio, P. et al. Excitation Dropout: Encouraging Plasticity in Deep Neural Networks. Int J Comput Vis 129, 1139–1152 (2021). https://doi.org/10.1007/s11263-020-01422-y

