Convolutional Networks with Adaptive Inference Graphs

Published in: International Journal of Computer Vision

Abstract

Do convolutional networks really need a fixed feed-forward structure? What if, after identifying the high-level concept of an image, a network could move directly to a layer that can distinguish fine-grained differences? Currently, a network would first need to execute what can be hundreds of intermediate layers that specialize in unrelated aspects. Ideally, the more a network already knows about an image, the better it should be at deciding which layer to compute next. In this work, we propose convolutional networks with adaptive inference graphs (ConvNet-AIG) that adaptively define their network topology conditioned on the input image. Following a high-level structure similar to residual networks (ResNets), ConvNet-AIG decides on the fly, for each input image, which layers are needed. In experiments on ImageNet we show that ConvNet-AIG learns distinct inference graphs for different categories. ConvNet-AIG with 50 and with 101 layers both outperform their ResNet counterparts, while using 20% and 38% less computation, respectively. By grouping parameters into layers for related classes and only executing the relevant layers, ConvNet-AIG improves both efficiency and overall classification quality. Lastly, we also study the effect of adaptive inference graphs on susceptibility to adversarial examples. We observe that ConvNet-AIG is more robust than ResNets, complementing other known defense mechanisms.
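To make the mechanism concrete, the following is a minimal sketch, not the authors' implementation, of a gated residual block in PyTorch. The class and parameter names (GatedResidualBlock, gate_hidden, tau) are illustrative assumptions, and the straight-through Gumbel-softmax gate shown here is one standard way to keep a discrete execute/skip decision differentiable during training; the residual branch is assumed to be a basic two-convolution block.

```python
# Minimal sketch (not the authors' code) of a ConvNet-AIG-style gated
# residual block. A small gating head, conditioned on the block's input,
# decides per image whether to execute the residual branch. During
# training, a straight-through Gumbel-softmax keeps the discrete
# execute/skip decision differentiable; at test time the gate is a
# hard comparison of logits. All names and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedResidualBlock(nn.Module):
    def __init__(self, channels: int, gate_hidden: int = 16):
        super().__init__()
        # Residual branch: a basic two-convolution ResNet block.
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Gating head: global average pooling followed by a tiny MLP
        # that emits two logits: (skip, execute).
        self.gate = nn.Sequential(
            nn.Linear(channels, gate_hidden),
            nn.ReLU(inplace=True),
            nn.Linear(gate_hidden, 2),
        )

    def forward(self, x: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        logits = self.gate(x.mean(dim=(2, 3)))  # (N, 2) from pooled input
        if self.training:
            # Hard 0/1 sample in the forward pass, soft gradient in the
            # backward pass (straight-through estimator).
            execute = F.gumbel_softmax(logits, tau=tau, hard=True)[:, 1]
        else:
            execute = (logits[:, 1] > logits[:, 0]).float()
        gate = execute.view(-1, 1, 1, 1)
        # The identity path is always kept; the residual branch only
        # contributes for images whose gate fires. A production version
        # would skip the branch computation entirely when gated off,
        # which is where the inference savings come from.
        return F.relu(x + gate * self.body(x))
```

In a full network, blocks like this would replace the residual blocks of a ResNet, typically trained with an additional regularizer that keeps each gate's average execution rate near a target, so that layers specialize to subsets of categories rather than collapsing to always-on or always-off.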

Acknowledgements

We would like to thank Ilya Kostrikov, Daniel D. Lee, Kimberly Wilber, Antonio Marcedone, Yiqing Hua and Charles Herrmann for insightful discussions and feedback.

Author information

Correspondence to Andreas Veit.

Additional information

Communicated by Yair Weiss.

About this article

Cite this article

Veit, A., Belongie, S. Convolutional Networks with Adaptive Inference Graphs. Int J Comput Vis 128, 730–741 (2020). https://doi.org/10.1007/s11263-019-01190-4
