
A survey of the recent architectures of deep convolutional neural networks

Published in Artificial Intelligence Review

Abstract

The deep Convolutional Neural Network (CNN) is a special type of neural network that has shown exemplary performance in several competitions related to computer vision and image processing. Notable application areas of CNNs include image classification and segmentation, object detection, video processing, natural language processing, and speech recognition. The powerful learning ability of deep CNNs is primarily due to their use of multiple feature extraction stages that automatically learn representations from the data. The availability of large amounts of data and improvements in hardware technology have accelerated research in CNNs, and interesting deep CNN architectures have recently been reported. Several inspiring ideas for advancing CNNs have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and architectural innovations. However, the significant improvement in the representational capacity of deep CNNs has been achieved through architectural innovations. Notably, the ideas of exploiting spatial and channel information, depth and width of architecture, and multi-path information processing have gained substantial attention. Similarly, the idea of using a block of layers as a structural unit is also gaining popularity. This survey therefore focuses on the intrinsic taxonomy present in recently reported deep CNN architectures and classifies the recent innovations in CNN architectures into seven categories: spatial exploitation, depth, multi-path, width, feature-map exploitation, channel boosting, and attention. Additionally, an elementary understanding of CNN components, current challenges, and applications of CNNs is provided.


References

  • Abbas Q, Ibrahim MEA, Jaffar MA (2019) A comprehensive review of recent advances on deep vision systems. Artif Intell Rev 52:39–76. https://doi.org/10.1007/s10462-018-9633-3

    Article  Google Scholar 

  • Abdel-Hamid O, Mohamed AR, Jiang H, Penn G (2012) Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition. In: ICASSP, IEEE international conference on acoustics speech and signal processing, pp 4277–4280. https://doi.org/10.1007/978-3-319-96145-3_2

  • Abdel-Hamid O, Deng L, Yu D (2013) Exploring convolutional neural network structures and optimization techniques for speech recognition. In: Interspeech, pp 1173–1175

  • Abdeljaber O, Avci O, Kiranyaz S et al (2017) Real-time vibration-based structural damage detection using one-dimensional convolutional neural networks. J Sound Vib. https://doi.org/10.1016/j.jsv.2016.10.043

    Article  Google Scholar 

  • Abdulkader A (2006) Two-tier approach for Arabic offline handwriting recognition. In: Tenth international workshop on frontiers in handwriting recognition

  • Ahmed U, Khan A, Khan SH et al (2019) Transfer learning and meta classification based deep churn prediction system for telecom industry, pp 1–10

  • Akar E, Marques O, Andrews WA, Furht B (2019) Cloud-based skin lesion diagnosis system using convolutional neural networks. In: Intelligent computing-proceedings of the computing conference, pp 982–1000

  • Amer M, Maul T (2019) A review of modularization techniques in artificial neural networks. Artif Intell Rev 52:527–561. https://doi.org/10.1007/s10462-019-09706-7

    Article  Google Scholar 

  • Aurisano A, Radovic A, Rocco D et al (2016) A convolutional neural network neutrino event classifier. J Instrum. https://doi.org/10.1088/1748-0221/11/09/P09001

    Article  Google Scholar 

  • Aziz A, Sohail A, Fahad L, et al (2020) Channel Boosted Convolutional Neural Network for Classification of Mitotic Nuclei using Histopathological Images. In: 2020 17th International Bhurban Conference on Applied Sciences and Technology (IBCAST). pp 277–284

  • Badrinarayanan V, Kendall A, Cipolla R (2017) SegNet: a Deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2016.2644615

    Article  Google Scholar 

  • Batmaz Z, Yurekli A, Bilge A, Kaleli C (2019) A review on deep learning for recommender systems: challenges and remedies. Artif Intell Rev 52:1–37. https://doi.org/10.1007/s10462-018-9654-y

    Article  Google Scholar 

  • Bay H, Ess A, Tuytelaars T, Van Gool L (2008) Speeded-up robust features (SURF). Comput Vis Image Underst 110:346–359. https://doi.org/10.1016/j.cviu.2007.09.014

    Article  Google Scholar 

  • Bengio Y (2009) Learning deep architectures for AI. Found Trends® Mach Learn 2:1–127. https://doi.org/10.1561/2200000006

    Article  MATH  Google Scholar 

  • Bengio Y (2013) Deep learning of representations: looking forward. In: International conference on statistical language and speech processing. Springer, pp 1–37

  • Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layer-wise training of deep networks. In: Advances in neural information processing systems. The MIT Press, pp 153–160

  • Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35:1798–1828. https://doi.org/10.1109/TPAMI.2013.50

    Article  Google Scholar 

  • Berg A, Deng J, Fei-Fei L (2010) Large scale visual recognition challenge 2010

  • Bettoni M, Urgese G, Kobayashi Y, et al (2017) A convolutional neural network fully implemented on FPGA for embedded platforms. IEEE, pp 49–52. https://doi.org/10.1109/ngcas.2017.16

  • Bhunia AK, Konwer A, Bhunia AK et al (2019) Script identification in natural scene image and video frames using an attention based Convolutional-LSTM network. Pattern Recognit 85:172–184

    Article  Google Scholar 

  • Boureau Y (2009) Icml2010B.Pdf. doi: citeulike-article-id:8496352

  • Bouvrie J (2006) 1 Introduction Notes on Convolutional Neural Networks. doi: http://dx.doi.org/10.1016/j.protcy.2014.09.007

  • Bulat A, Tzimiropoulos G (2016) Human pose estimation via convolutional part heatmap regression BT. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision—ECCV. Springer, Cham, pp 717–732

    Google Scholar 

  • Cai Z, Vasconcelos N (2019) Cascade R-CNN: high quality object detection and instance segmentation. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/tpami.2019.2956516

    Article  Google Scholar 

  • Chapelle O (1998) Support vector machines for image classification. Stage deuxième année magistère d’informatique l’École Norm Supérieur Lyon 10:1055–1064. https://doi.org/10.1109/72.788646

    Article  Google Scholar 

  • Chellapilla K, Puri S, Simard P (2006) High performance convolutional neural networks for document processing. In: Tenth international workshop on frontiers in handwriting recognition

  • Chen Y-N, Han C-C, Wang C-T et al (2006) The application of a convolution neural network on face and license plate detection. In: 18th international conference on pattern recognition, 2006. ICPR 2006, pp 552–555

  • Chen W, Wilson JT, Tyree S et al (2015) Compressing neural networks with the hashing trick. In: 32nd international conference on machine learning, ICML 2015

  • Chevalier M, Thome N, Cord M et al (2015) LR-CNN for fine-grained classification with varying resolution. In: 2015 IEEE international conference on image processing (ICIP). IEEE, pp 3101–3105

  • Chollet F (2017) Xception: deep learning with depthwise separable convolutions. arXiv:1610.02357

  • Chouhan N, Khan A (2019) Network anomaly detection using channel boosted and residual learning based deep convolutional neural network. Appl Soft Comput 83:105612

    Article  Google Scholar 

  • Cireşan DC, Meier U, Gambardella LM, Schmidhuber J (2010) Deep, big, simple neural nets for handwritten. Neural Comput 22:3207–3220

    Article  Google Scholar 

  • Cireşan DC, Meier U, Masci J et al (2011) High-performance neural networks for visual object classification. Preprint arXiv:1102.0183

  • Cireşan D, Meier U, Masci J, Schmidhuber J (2012a) Multi-column deep neural network for traffic sign classification. Neural Netw 32:333–338. https://doi.org/10.1016/j.neunet.2012.02.023

    Article  Google Scholar 

  • Cireşan D, Giusti A, Gambardella LM, Schmidhuber J (2012b) Deep neural networks segment neuronal membranes in electron microscopy images. In: Advances in neural information processing systems, pp 2843–2851

  • Cireşan DC, Giusti A, Gambardella LM, Schmidhuber J (2013) Mitosis detection in breast cancer histology images with deep neural networks BT. In: Proceedings of medical image computing and computer-assisted intervention, MICCAI 2013, pp 411–418

  • Cireşan DC, Cireşan DC, Meier U, Schmidhuber J (2018) Multi-column deep neural networks for image classification. In: IEEE conference on computer vision and pattern recognition

  • Collobert R, Weston J (2008) A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th international conference on Machine learning. ACM, pp 160–167

  • Csáji B (2001) Approximation with artificial neural networks. M.Sc. Thesis 45

  • Dahl G, Mohamed A, Hinton GE (2010) Phone recognition with the mean-covariance restricted Boltzmann machine. In: Advances in neural information processing systems, pp 469–477

  • Dahl GE, Sainath TN, Hinton GE (2013) Improving deep neural networks for LVCSR using rectified linear units and dropout. In: 2013 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 8609–8613

  • Dai J, Li Y, He K, Sun J (2016) R-FCN: object detection via region-based fully convolutional networks. J Power Sources. https://doi.org/10.1016/j.jpowsour.2007.02.075

    Article  Google Scholar 

  • Dalal N, Triggs W (2004) Histograms of oriented gradients for human detection. In: IEEE computer society conference on computer vision and pattern recognition CVPR05, vol. 1, pp 886–893. https://doi.org/10.1109/cvpr.2005.177

  • Dauphin YN, De Vries H, Bengio Y (2015) Equilibrated adaptive learning rates for non-convex optimization. In: Advances in neural information processing system 2015, January, pp 1504–1512

  • Dauphin YN, Fan A, Auli M, Grangier D (2017) Language modeling with gated convolutional networks. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 933–941

  • de Vries H, Memisevic R, Courville A (2016) Deep learning vector quantization. In: European symposium on artificial neural networks, computational intelligence and machine learning

  • Decoste D, Schölkopf B (2002) Training invariant support vector machines. Mach Learn 46:161–190

    Article  Google Scholar 

  • Delalleau O, Bengio Y (2011) Shallow versus deep sum-product networks. In: Advances in neural information processing systems, pp 666–674

  • Deng L (2012) The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process Mag 29:141–142

    Article  Google Scholar 

  • Deng L, Yu D, Delft B (2013) Deep learning: methods and applications foundations and trends R in signal processing. Sig Process 7:3–4. https://doi.org/10.1561/2000000039

    Article  Google Scholar 

  • Do MN, Vetterli M (2005) The contourlet transform: an efficient directional multiresolution image representation. IEEE Trans Image Process 14:2091–2106

    Article  Google Scholar 

  • Dollár P, Tu Z, Perona P, Belongie S (2009) Integral channel features

  • Donahue J, Anne Hendricks L, Guadarrama S et al (2015) Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2625–2634

  • Dong C, Loy CC, He K, Tang X (2016) Image super-resolution using deep convolutional networks. IEEE Trans Pattern Anal Mach Intell 38:295–307

    Article  Google Scholar 

  • Erhan D, Bengio Y, Courville A, Vincent P (2009) Visualizing higher-layer features of a deep network. Univ Montr 1341:1

    Google Scholar 

  • Farfade SS, Saberian MJ, Li L-J (2015) Multi-view face detection using deep convolutional neural networks. In: Proceedings of the 5th ACM on international conference on multimedia retrieval—ICMR’15. ACM Press, New York, USA, pp 643–650

  • Fasel B (2002) Facial expression analysis using shape and motion information extracted by convolutional neural networks. In: Proceedings of the 2002 12th IEEE workshop on neural networks for signal processing, 2002, pp 607–616

  • Frizzi S, Kaabi R, Bouchouicha M et al (2016) Convolutional neural network for video fire and smoke detection. In: IECON 2016-42nd annual conference of the IEEE industrial electronics society. IEEE, pp 877–882

  • Frome A, Cheung G, Abdulkader A, et al (2009) Large-scale privacy protection in Google Street View. In: Proceedings of the IEEE international conference on computer vision

  • Frosst N, Hinton G (2018) Distilling a neural network into a soft decision tree. In: CEUR workshop proceedings

  • Fukushima K (1988) Neocognitron: a hierarchical neural network capable of visual pattern recognition. Neural Netw 1:119–130

    Article  Google Scholar 

  • Fukushima K, Miyake S (1982) Neocognitron: a self-organizing neural network model for a mechanism of visual pattern recognition. In: Competition and cooperation in neural nets. Springer, pp 267–285

  • Garcia C, Delakis M (2004) Convolutional face finder: a neural architecture for fast and robust face detection. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2004.97

    Article  Google Scholar 

  • Gardner MW, Dorling SR (1998) Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. Atmos Environ 32:2627–2636

    Article  Google Scholar 

  • Geng X, Lin J, Zhao B et al (2019) Hardware-aware softmax approximation for deep neural networks. In: Lecture notes in computer science. Lecture notes in artificial intelligence, Lecture notes in bioinformatics. pp 107–122

  • Gidaris S, Komodakis N (2015) Object detection via a multi-region and semantic segmentation-aware U model. In: Proceedings of IEEE international conference on computer vision 2015, pp 1134–1142. https://doi.org/10.1109/iccv.2015.135

  • Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision

  • Giusti A, Cireşan DC, Masci J et al (2013) Fast image scanning with deep max-pooling convolutional neural networks. In: 2013 IEEE international conference on image processing. IEEE, pp 4034–4038

  • Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp 249–256

  • Goh H, Thome N, Cord M, Lim J-H (2013) Top-down regularization of deep belief networks. In: Advances in neural information processing systems (NIPS). pp 1878–1886

  • Goodfellow I, Bengio Y, Courville A (2017) Deep learning. Nat Methods 13:35. https://doi.org/10.1038/nmeth.3707

    Article  MATH  Google Scholar 

  • Grill-Spector K, Weiner KS, Gomez J et al (2018) The functional neuroanatomy of face perception: from brain measurements to deep neural networks. Interface Focus 8:20180013. https://doi.org/10.1098/rsfs.2018.0013

    Article  Google Scholar 

  • Grün F, Rupprecht C, Navab N, Tombari F (2016) A taxonomy and library for visualizing learned features in convolutional neural networks. https://doi.org/10.1080/10962247.2014.948229

  • Gu J, Wang Z, Kuen J et al (2018) Recent advances in convolutional neural networks. Pattern Recognit 77:354–377. https://doi.org/10.1016/j.patcog.2017.10.013

    Article  Google Scholar 

  • Guo Y, Liu Y, Oerlemans A et al (2016) Deep learning for visual understanding: a review. Neurocomputing 187:27–48. https://doi.org/10.1016/j.neucom.2015.09.116

    Article  Google Scholar 

  • Hamel P, Eck D (2010) Learning features from music audio with deep belief networks. In: ISMIR, Utrecht, The Netherlands, pp 339–344

  • Han S, Mao H, Dally WJ (2016) Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. In: 4th international conference on learning representations, ICLR 2016—conference track proceedings

  • Han D, Kim J, Kim J (2017) Deep pyramidal residual networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, pp 6307–6315

  • Han W, Feng R, Wang L, Gao L (2018) Adaptive spatial-scale-aware deep convolutional neural network for high-resolution remote sensing imagery scene classification. In: IGARSS 2018–2018 IEEE international geoscience and remote sensing symposium, pp 4736–4739. https://doi.org/10.1109/igarss.2018.8518290

  • Hanin B, Sellke M (2017) Approximating continuous functions by ReLU Nets of minimal width. Preprint. arXiv:1710.11278

  • He K, Zhang X, Ren S, Sun J (2015a) Deep residual learning for image recognition. Multimed Tools Appl 77:10437–10453. https://doi.org/10.1007/s11042-017-4440-4

    Article  Google Scholar 

  • He K, Zhang X, Ren S, Sun J (2015b) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37:1904–1916

    Article  Google Scholar 

  • He K, Gkioxari G, Dollar P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision

  • Heikkilä M, Pietikäinen M, Schmid C (2009) Description of interest regions with local binary patterns. Pattern Recognit 42:425–436. https://doi.org/10.1016/j.patcog.2008.08.014

    Article  MATH  Google Scholar 

  • Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18:1527–1554

    Article  MathSciNet  Google Scholar 

  • Hinton GE, Krizhevsky A, Wang SD (2011) Transforming auto-encoders. In: International conference on artificial neural networks. Springer, pp 44–51

  • Hinton G, Deng L, Yu D et al (2012a) Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process Mag 29:82–97

    Article  Google Scholar 

  • Hinton GE, Srivastava N, Krizhevsky A, et al (2012b) Improving neural networks by preventing co-adaptation of feature detectors. pp 1–18. arXiv:12070580

  • Hinton G, Sabour S, Frosst N (2018) Matrix capsules with EM routing. In: 6th international conference on learning representations, ICLR 2018 - conference track proceedings

  • Hochreiter S (1998) The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int J Uncertain Fuzziness Knowl-Based Syst 6:107–116

    Article  Google Scholar 

  • Howard AG, Zhu M, Chen B, et al (2017) MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv:170404861

  • Hu B, Lu Z, Li H, Chen Q (2011) Topic modeling for named entity queries. In: Proceedings of the 20th ACM international conference on Information and knowledge management—CIKM’11. ACM Press, New York, New York, USA, 2009

  • Hu J, Shen L, Sun G (2018a) Squeeze-and-excitation networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. IEEE, pp 7132–7141

  • Hu Y, Wen G, Luo M, et al (2018b) Competitive inner-imaging squeeze and excitation for residual network. arXiv:1807.08920v3

  • Huang G, Sun Y, Liu Z et al (2016a) Deep networks with stochastic depth. In: European conference on computer vision. Springer, pp 646–661

  • Huang G, Sun Y, Liu Z et al (2016b) Deep networks with stochastic depth BT. In: European conference on computer vision ECCV 2016. Springer, pp 646–661

  • Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of 30th IEEE conference on computer vision and pattern recognition, CVPR 2017, pp 2261–2269. https://doi.org/10.1109/cvpr.2017.243

  • Huang Y, Cheng Y, Chen D et al (2018) GPipe: efficient training of giant neural networks using pipeline parallelism. arXiv:1811.06965v3

  • Huang KY, Wu CH, Hong QB et al (2019) Speech emotion recognition using deep neural network considering verbal and nonverbal speech sounds. In: Proceedings of IEEE international conference on acoustics, speech and signal processing ICASSP

  • Hubel DH, Wiesel TN (1959) Receptive fields of single neurones in the cat’s striate cortex. J Physiol. https://doi.org/10.1113/jphysiol.1959.sp006308

    Article  Google Scholar 

  • Hubel DH, Wiesel TN (1962) Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol 160:106–154. https://doi.org/10.1113/jphysiol.1962.sp006837

    Article  Google Scholar 

  • Hubel DH, Wiesel TN (1968) Receptive fields and functional architecture of monkey striate cortex. J Physiol 195:215–243. https://doi.org/10.1113/jphysiol.1968.sp008455

    Article  Google Scholar 

  • Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. J Mol Struct. https://doi.org/10.1016/j.molstruc.2016.12.061

    Article  Google Scholar 

  • Jaderberg M, Simonyan K, Zisserman A, Kavukcuoglu K (2015) Spatial transformer networks. Nature. https://doi.org/10.1038/nbt.3343

    Article  Google Scholar 

  • Jarrett K, Kavukcuoglu K, Ranzato M, LeCun Y (2009) What is the best multi-stage architecture for object recognition? In: IEEE 12th international conference on comput vision, 2009, pp 2146–2153

  • Ji S, Yang M, Yu K, Xu W (2010) 3D convolutional neural networks for human action recognition. Int Conf Mach Learn 35:221–231. https://doi.org/10.1109/TPAMI.2012.59

    Article  Google Scholar 

  • Joachims T (1998) Text categorization with support vector machines: Learning with many relevant features. In: European conference on machine learning. pp 137–142

  • Justus D, Brennan J, Bonner S, McGough AS (2019) Predicting the computational cost of deep learning models. In: Proceedings of 2018 IEEE international conference on big data, Big Data 2018

  • Kafi M, Maleki M, Davoodian N (2015) Functional histology of the ovarian follicles as determined by follicular fluid concentrations of steroids and IGF-1 in Camelus dromedarius. Res Vet Sci 99:37–40. https://doi.org/10.1016/j.rvsc.2015.01.001

    Article  Google Scholar 

  • Kahng M, Thorat N, Chau DHP et al (2019) GAN Lab: understanding complex deep generative models using interactive visual experimentation. IEEE Trans Vis Comput Graph 25:310–320

    Article  Google Scholar 

  • Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. Preprint arXiv:1404.2188

  • Kawashima T, Kawanishi Y, Ide I et al (2017) Action recognition from extremely low-resolution thermal image sequence. In: 2017 14th IEEE international conference on advanced video and signal based surveillance, AVSS 2017. IEEE, pp 1–6

  • Kawaguchi K, Huang J, Kaelbling LP (2019) Effect of depth and width on local minima in deep learning. Neural Comput 31:1462–1498. https://doi.org/10.1162/neco_a_01195

    Article  MathSciNet  MATH  Google Scholar 

  • Khan A, Sohail A, Ali A (2018a) A New channel boosted convolutional neural network using transfer learning. Preprint arXiv:1804.08528

  • Khan A, Zameer A, Jamal T, Raza A (2018b) Deep belief networks based feature generation and regression for predicting wind power. Preprint arXiv:1807.11682

  • Khan A, Qureshi AS, Hussain M et al (2019) A recent survey on the applications of genetic programming in image processing. Preprint arXiv:1901.07387

  • Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst. https://doi.org/10.1061/(ASCE)GT.1943-5606.0001284

    Article  Google Scholar 

  • Kuen J, Kong X, Wang G et al (2017) DelugeNets: deep networks with efficient and flexible cross-layer information inflows. In: 2017 IEEE international conference on computer vision workshop (ICCVW), pp 958–966

  • Kuen J, Kong X, Wang G, Tan YP (2018) DelugeNets: deep networks with efficient and flexible cross-layer information inflows. In: Proceedings of IEEE international conference on computer vision work ICCVW 2017, pp 958–966. https://doi.org/10.1109/iccvw.2017.117

  • Lacey G, Taylor GW, Areibi S (2016) Deep learning on FPGAs: past, present, and future. arXiv:160204283

  • Larsson G, Maire M, Shakhnarovich G (2016) Fractalnet: ultra-deep neural networks without residuals. Preprint 1605.07648, pp 1–11

  • Laskar MNU, Giraldo LGS, Schwartz O (2018) Correspondence of deep neural networks and the brain for visual textures, pp 1–17

  • Le QV, Ranzato M, Monga R et al (2011) Building high-level features using large scale unsupervised learning. In: IEEE International conference on acoustics speech and signal processing ICASSP, pp 8595–8598. https://doi.org/10.1109/icassp.2013.6639343

  • LeCun Y (2007) Effcient BackPrp. J Exp Psychol Gen 136:23–42

    Article  Google Scholar 

  • LeCun Y, Boser B, Denker JS et al (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1:541–551

    Article  Google Scholar 

  • LeCun Y, Jackel LD, Bottou L et al (1995) Learning algorithms for classification: a comparison on handwritten digit recognition. Neural Netw Stat Mech Perspect 261:276

    Google Scholar 

  • LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324

    Article  Google Scholar 

  • LeCun Y, Kavukcuoglu K, Farabet CC et al (2010) Convolutional networks and applications in vision. In: ISCAS. IEEE, pp 253–256

  • LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444. https://doi.org/10.1038/nature14539

    Article  Google Scholar 

  • Lee C-Y, Gallagher PW, Tu Z (2016) Generalizing pooling functions in convolutional neural networks: mixed, gated, and tree. In: Artificial intelligence and statistics, pp 464–472

  • Lee S, Son K, Kim H, Park J (2017) Car plate recognition based on CNN using embedded system with GPU, pp 239–241

  • Levi G, Hassner T (2009) Sicherheit und Medien. Sicherheit und Medien. https://doi.org/10.1109/CVPRW.2015.7301352

    Article  Google Scholar 

  • Li S, Liu Z-Q, Chan AB (2014) Heterogeneous multi-task learning for human pose estimation with deep convolutional neural network. In: 2014 IEEE conference on computer vision and pattern recognition workshops. IEEE, pp 488–495

  • Li H, Lin Z, Shen X et al (2015) A convolutional neural network cascade for face detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5325–5334

  • Li X, Bing L, Lam W, Shi B (2018) Transformation networks for target-oriented sentiment classification, pp 946–956

  • Lin M, Chen Q, Yan S (2013) Network in network, pp 1–10. https://doi.org/10.1109/asru.2015.7404828

  • Lin T-Y, Maire M, Belongie S et al (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, pp 740–755

  • Lin TY, Dollár P, Girshick R et al (2017) Feature pyramid networks for object detection. In: Proceedings of 30th IEEE conference on computer vision and pattern recognition, CVPR 2017

  • Lindholm E, Nickolls J, Oberman S, Montrym J (2008) NVIDIA TESLA: a unified graphics and computing architecture. IEEE Micro 28:39–55. https://doi.org/10.1109/MM.2008.31

    Article  Google Scholar 

  • Linnainmaa S (1970) The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master’s Thesis (in Finnish), Univ Helsinki 6–7

  • Liu C-L, Nakashima K, Sako H, Fujisawa H (2003) Handwritten digit recognition: benchmarking of state-of-the-art techniques. Pattern Recognit 36:2271–2285

    Article  Google Scholar 

  • Liu W, Wang Z, Liu X et al (2017) A survey of deep neural network architectures and their applications. Neurocomputing 234:11–26. https://doi.org/10.1016/j.neucom.2016.12.038

    Article  Google Scholar 

  • Liu X, Deng Z, Yang Y (2019) Recent progress in semantic image segmentation. Artif Intell Rev 52:1089–1106. https://doi.org/10.1007/s10462-018-9641-3

    Article  Google Scholar 

  • Long ZM, Guo SQ, Chen GJ, Yin BL (2012) Modeling and simulation for the articulated robotic arm test system of the combination drive. In: 2011 international conference on mechatronics and materials engineering ICMME 2011, pp 151:480–483. https://doi.org/10.4028/www.scientific.net/AMM.151.480

  • Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, pp 3431–3440

  • Lowe DG (1999) Object recognition from local scale-invariant features. In: Proceedings of Seventh IEEE International Conference on Computer Vision, vol 2, pp 1150–1157. https://doi.org/10.1109/iccv.1999.790410

  • Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60:91–110

    Article  Google Scholar 

  • Lu H, Li B, Zhu J et al (2017a) Wound intensity correction and segmentation with convolutional neural networks. Concurr Comput Pract Exp 29:e3927

    Article  Google Scholar 

  • Lu Z, Pu H, Wang F et al (2017b) The expressive power of neural networks: a view from the width. In: Advances in neural information processing systems, pp 6231–6239

  • Lv E, Wang X, Cheng Y, Yu Q (2019) Deep ensemble network based on multi-path fusion. Artif Intell Rev 52:151–168. https://doi.org/10.1007/s10462-019-09708-5

    Article  Google Scholar 

  • Madrazo CF, Heredia I, Lloret L, Marco de Lucas J (2019) Application of a convolutional neural network for image classification for the analysis of collisions in high energy physics. EPJ Web Conf. https://doi.org/10.1051/epjconf/201921406017

    Article  Google Scholar 

  • Mao X, Shen C, Yang Y-B (2016) Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In: Advances in neural information processing systems, pp 2802–2810

  • Marmanis D, Wegner JD, Galliani S et al (2016) Semantic segmentation of aerial images with an ensemble of CNNs. ISPRS Ann Photogramm Remote Sens Spat Inf Sci 3:473

    Article  Google Scholar 

  • Matsugu M, Mori K, Ishii M, Mitarai Y (2002) Convolutional spiking neural network model for robust face detection. In: Proceedings of the 9th international conference on neural information processing, 2002. ICONIP’02, pp 660–664

  • Mikolov T, Karafiát M, Burget L et al (2010) Recurrent neural network based language model. In: Eleventh annual conference of the international speech communication association

  • Misra D (2019) Mish: a self regularized non-monotonic neural activation function. arXiv:190808681

  • Mohamed A, Dahl GE, Hinton G (2012) Acoustic modeling using deep belief networks. IEEE Trans Audio Speech Lang Process 20:14–22

    Article  Google Scholar 

  • Montufar GF, Pascanu R, Cho K, Bengio Y (2014) On the number of linear regions of deep neural networks. In: Advances in neural information processing systems, pp 2924–2932

  • Moons B, Verhelst M (2017) An energy-efficient precision-scalable ConvNet processor in 40-nm CMOS. IEEE J Solid-State Circuits 52:903–914

    Article  Google Scholar 

  • Morar A, Moldoveanu F, Gröller E (2012) Image segmentation based on active contours without edges. In: IEEE 8th international conference on intelligent computer communication processing ICCP 2012, pp 213–220. https://doi.org/10.1109/iccp.2012.6356188

  • Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: ICML 27th international conference on machine learning

  • Najafabadi MM, Villanustre F, Khoshgoftaar TM et al (2015) Deep learning applications and challenges in big data analytics. J Big Data 2:1–21. https://doi.org/10.1186/s40537-014-0007-7

  • Nguyen Q, Mukkamala M, Hein M (2018) Neural networks should be wide enough to learn disconnected decision regions. Preprint arXiv:1803.00094

  • Nguyen G, Dlugolinsky S, Bobák M et al (2019) Machine learning and deep learning frameworks and libraries for large-scale data mining: a survey. Artif Intell Rev 52:77–124. https://doi.org/10.1007/s10462-018-09679-z

  • Nickolls J, Buck I, Garland M, Skadron K (2008) Scalable parallel programming with CUDA. In: ACM SIGGRAPH 2008 classes on SIGGRAPH’08. ACM Press, New York, New York, USA, p 1

  • Nwankpa C, Ijomah W, Gachagan A, Marshall S (2018) Activation functions: comparison of trends in practice and research for deep learning. Preprint arXiv:1811.03378

  • Oh K-S, Jung K (2004) GPU implementation of neural networks. Pattern Recognit 37:1311–1314

  • Ojala T, Pietikäinen M, Harwood D (1996) A comparative study of texture measures with classification based on feature distributions. Pattern Recognit 29:51–59. https://doi.org/10.1016/0031-3203(95)00067-4

  • Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987

  • Oquab M, Bottou L, Laptev I, Sivic J (2014) Learning and transferring mid-level image representations using convolutional neural networks. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition. IEEE, pp 1717–1724

  • Pang J, Chen K, Shi J et al (2020) Libra R-CNN: towards balanced learning for object detection

  • Pascanu R, Mikolov T, Bengio Y (2012) Understanding the exploding gradient problem. arXiv:1211.5063

  • Peng X, Hoffman J, Yu SX, Saenko K (2016) Fine-to-coarse knowledge transfer for low-res image classification. In: 2016 IEEE international conference on image processing (ICIP). IEEE, pp 3683–3687

  • Potluri S, Fasih A, Vutukuru LK et al (2011) CNN based high performance computing for real time image processing on GPU. In: Proceedings of the joint INDS’11 & ISTET’11, pp 1–7

  • Qureshi AS, Khan A (2018) Adaptive transfer learning in deep neural networks: wind power prediction using knowledge transfer from region to region and between different task domains. Preprint arXiv:1810.12611

  • Qureshi AS, Khan A, Zameer A, Usman A (2017) Wind power prediction using deep neural network based meta regression and transfer learning. Appl Soft Comput J 58:742–755. https://doi.org/10.1016/j.asoc.2017.05.031

  • Ramachandran P, Zoph B, Le QV (2017) Swish: a self-gated activation function

  • Ranjan R, Patel VM, Chellappa R (2015) A deep pyramid deformable part model for face detection. Preprint arXiv:1508.04389

  • Ranzato M, Huang FJ, Boureau YL, LeCun Y (2007) Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition. IEEE, pp 1–8

  • Rawat W, Wang Z (2016) Deep convolutional neural networks for image classification: a comprehensive review. Neural Comput 61:1120–1132. https://doi.org/10.1162/NECO

  • Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst. https://doi.org/10.1109/tpami.2016.2577031

  • Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

  • Roy AG, Navab N, Wachinger C (2018) Concurrent spatial and channel ‘squeeze & excitation’ in fully convolutional networks. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 11070 LNCS:421–429. https://doi.org/10.1007/978-3-030-00928-1_48

  • Russakovsky O, Deng J, Su H et al (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis. https://doi.org/10.1007/s11263-015-0816-y

  • Salakhutdinov R, Larochelle H (2010) Efficient learning of deep Boltzmann machines. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp 693–700

  • Scherer D, Müller A, Behnke S (2010) Evaluation of pooling operations in convolutional architectures for object recognition. In: Artificial neural networks–ICANN 2010. Springer, pp 92–101

  • Schmidhuber J (2007) New millennium AI and the convergence of history. In: Challenges for computational intelligence. Springer, pp 15–35

  • Sermanet P, Chintala S, Lecun Y (2012) Convolutional neural networks applied to house numbers digit classification. In: Proceedings of the 21st international conference on pattern recognition (ICPR2012), Tsukuba. IEEE, pp 3288–3291

  • Shakeel MF, Bajwa NA, Anwaar AM et al (2019) Detecting driver drowsiness in real time through deep learning based object detection. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

  • Sharma A, Muttoo SK (2018) Spatial image steganalysis based on ResNeXt. In: 2018 IEEE 18th International conference on communication technology, pp 1213–1216. https://doi.org/10.1109/icct.2018.8600132

  • Shi Y, Tian Y, Wang Y, Huang T (2017) Sequential deep trajectory descriptor for action recognition with three-stream CNN. IEEE Trans Multimed 19:1510–1520

  • Shin H-C, Roth HR, Gao M et al (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35:1285–1298. https://doi.org/10.1109/TMI.2016.2528162

  • Simard PY, Steinkraus D, Platt JC (2003) Best practices for convolutional neural networks applied to visual document analysis, p 958

  • Simonyan K, Zisserman A (2014) Two-stream convolutional networks for action recognition in videos. In: Advances in neural information processing systems, pp 568–576

  • Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: International conference on learning representations (ICLR)

  • Simonyan K, Vedaldi A, Zisserman A (2013) Deep inside convolutional networks: visualising image classification models and saliency maps, pp 1–8. Preprint arXiv:1312.6034

  • Sinha T, Verma B, Haidar A (2018) Optimization of convolutional neural network parameters for image classification. In: 2017 IEEE symposium series on computational intelligence SSCI 2017, pp 1–7. https://doi.org/10.1109/ssci.2017.8285338

  • Spanhol FA, Oliveira LS, Petitjean C, Heutte L (2016a) A dataset for breast cancer histopathological image classification. IEEE Trans Biomed Eng 63:1455–1462

  • Spanhol FA, Oliveira LS, Petitjean C, Heutte L (2016b) Breast cancer histopathological image classification using convolutional neural networks. In: 2016 international joint conference on neural networks (IJCNN). IEEE, pp 2560–2567

  • Srinivas S, Sarvadevabhatla RK, Mopuri KR et al (2016) A taxonomy of deep convolutional neural nets for computer vision. Front Robot AI 2:1–13. https://doi.org/10.3389/frobt.2015.00036

  • Srivastava N, Hinton G, Krizhevsky A et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958

  • Srivastava RK, Greff K, Schmidhuber J (2015a) Highway networks. Preprint arXiv:1505.00387

  • Srivastava RK, Greff K, Schmidhuber J (2015b) Training very deep networks. In: Advances in neural information processing systems

  • Stefanini M, Lancellotti R, Baraldi L, Calderara S (2019) A deep-learning-based approach to vm behavior identification in cloud systems. In: Proceedings of the 9th international conference on cloud computing and services science. SCITEPRESS—Science and Technology Publications, pp 308–315

  • Strigl D, Kofler K, Podlipnig S (2010) Performance and scalability of GPU-based convolutional neural networks. In: 2010 18th Euromicro international conference on parallel, distributed and network-based processing (PDP), pp 317–324

  • Suganuma M, Shirakawa S, Nagao T (2017) A genetic programming approach to designing convolutional neural network architectures. In: Proceedings of the genetic and evolutionary computation conference. ACM, pp 497–504

  • Sun L, Jia K, Yeung D-Y, Shi BE (2015) Human action recognition using factorized spatio-temporal convolutional networks. In: Proceedings of the IEEE international conference on computer vision, pp 4597–4605

  • Sundermeyer M, Schlüter R, Ney H (2012) LSTM neural networks for language modeling. In: Thirteenth annual conference of the international speech communication association

  • Sze V, Chen YH, Yang TJ, Emer JS (2017) Efficient processing of deep neural networks: a tutorial and survey. In: Proceedings of IEEE

  • Szegedy C, Zaremba W, Sutskever I et al (2014) Intriguing properties of neural networks. In: 2nd international conference on learning representations, ICLR 2014 - conference track proceedings

  • Szegedy C, Liu W, Jia Y et al (2015) Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, pp 1–9

  • Szegedy C, Ioffe S, Vanhoucke V (2016a) Inception-v4, Inception-ResNet and the impact of residual connections on learning. Preprint arXiv:1602.07261v2

  • Szegedy C, Vanhoucke V, Ioffe S et al (2016b) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Computer Society conference on computer vision and pattern recognition. IEEE, pp 2818–2826

  • Targ S, Almeida D, Lyman K (2016) Resnet in Resnet: generalizing residual architectures. Preprint arXiv:1603.08029

  • Tong W, Song L, Yang X et al (2015) CNN-based shot boundary detection and video annotation. In: 2015 IEEE international symposium on broadband multimedia systems and broadcasting. IEEE, pp 1–5

  • Tong T, Li G, Liu X, Gao Q (2017) Image super-resolution using dense skip connections. In: 2017 IEEE international conference on computer vision (ICCV), pp 4809–4817

  • Tran D, Bourdev L, Fergus R et al (2015) Learning spatiotemporal features with 3D convolutional networks. In: Proceedings of the IEEE international conference on computer vision, pp 4489–4497

  • Ullah A, Ahmad J, Muhammad K et al (2017) Action recognition in video sequences using deep bi-directional LSTM with CNN features. IEEE Access 6:1155–1166

  • Vinayakumar R, Soman KP, Poornachandran P (2017) Applying convolutional neural network for network intrusion detection. In: 2017 International conference on advances in computing, communications and informatics, ICACCI 2017

  • Vincent P, Larochelle H, Bengio Y, Manzagol P-A (2008) Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on machine learning. ACM, pp 1096–1103

  • Vinyals O, Toshev A, Bengio S, Erhan D (2017) Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2016.2587640

  • Wahab N, Khan A, Lee YS (2017) Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med 85:86–97. https://doi.org/10.1016/j.compbiomed.2017.04.012

  • Wahab N, Khan A, Lee YS (2019) Transfer learning based deep CNN for segmentation and detection of mitoses in breast cancer histopathological images. Microscopy 68:216–233. https://doi.org/10.1093/jmicro/dfz002

  • Wang H, Raj B (2017) On the origin of deep learning, pp 1–72. Preprint arXiv:1702.07800

  • Wang H, Schmid C (2013) Action recognition with improved trajectories. In: Proceedings of the IEEE international conference on computer vision, pp 3551–3558

  • Wang T, Wu DJ, Coates A, Ng AY (2012) End-to-end text recognition with convolutional neural networks. In: International Conference on Pattern Recognition ICPR, pp 3304–3308

  • Wang F, Jiang M, Qian C et al (2017a) Residual attention network for image classification. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, pp 6450–6458

  • Wang X, Gao L, Song J, Shen H (2017b) Beyond frame-level CNN: saliency-aware 3-D CNN with LSTM for video action recognition. IEEE Signal Process Lett 24:510–514. https://doi.org/10.1109/LSP.2016.2611485

  • Wang Y, Wang L, Wang H, Li P (2019) End-to-end image super-resolution via deep and shallow convolutional networks. IEEE Access 7:31959–31970. https://doi.org/10.1109/ACCESS.2019.2903582

  • Woo S, Park J, Lee JY, Kweon IS (2018) CBAM: Convolutional block attention module. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 11211 LNCS:3–19. https://doi.org/10.1007/978-3-030-01234-2_1

  • Wu J, Leng C, Wang Y et al (2016) Quantized convolutional neural networks for mobile devices. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition

  • Xie S, Girshick R, Dollar P et al (2017) Aggregated residual transformations for deep neural networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, pp 5987–5995

  • Xie W, Zhang C, Zhang Y et al (2018) An energy-efficient FPGA-based embedded system for CNN application. In: 2018 IEEE international conference on electron devices and solid state circuits (EDSSC). IEEE, pp 1–2

  • Xiong Y, Kim HJ, Hedau V (2019) ANTNets: mobile convolutional neural networks for resource efficient image classification. arXiv:1904.03775

  • Xu B, Wang N, Chen T, Li M (2015a) Empirical evaluation of rectified activations in convolutional network. Preprint arXiv:1505.00853

  • Xu K, Ba J, Kiros R et al (2015b) Show, attend and tell: neural image caption generation with visual attention. In: International conference on machine learning, pp 2048–2057

  • Yamada Y, Iwamura M, Kise K (2016) Deep pyramidal residual networks with separated stochastic depth. Preprint arXiv:1612.01230

  • Yang Q, Pan SJ, Yang Q, Fellow QY (2008) A survey on transfer learning. IEEE Trans Knowl Data Eng 1:1–15. https://doi.org/10.1109/TKDE.2009.191

  • Yang S, Luo P, Loy C-C, Tang X (2015) From facial parts responses to face detection: a deep learning approach. In: Proceedings of the IEEE international conference on computer vision, pp 3676–3684

  • Yang J, Xiong W, Li S, Xu C (2019) Learning structured and non-redundant representations with deep neural networks. Pattern Recognit 86:224–235

  • Yıldırım Ö, Pławiak P, Tan RS, Acharya UR (2018) Arrhythmia detection using deep convolutional neural network with long duration ECG signals. Comput Biol Med. https://doi.org/10.1016/j.compbiomed.2018.09.009

  • Young SR, Rose DC, Karnowski TP et al (2015) Optimizing deep learning hyper-parameters through an evolutionary algorithm. In: Proceedings of the workshop on machine learning in high-performance computing environments. ACM, p 4

  • Zagoruyko S, Komodakis N (2016) Wide residual networks. Proc Br Mach Vis Conf, pp 87.1–87.12. https://doi.org/10.5244/C.30.87

  • Zeiler MD, Fergus R (2013) Visualizing and understanding convolutional networks. Preprint arXiv:1311.2901v3

  • Zhang X, LeCun Y (2015) Text understanding from scratch. Preprint arXiv:1502.01710

  • Zhang K, Zhang Z, Li Z et al (2016) Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett 23:1499–1503

  • Zhang X, Li Z, Loy CC, Lin D (2017) PolyNet: a pursuit of structural diversity in very deep networks. In: Proceedings of 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, pp 3900–3908. https://doi.org/10.1109/cvpr.2017.415

  • Zhang X, Zhou X, Lin M, Sun J (2018a) ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition

  • Zhang Y, Qiu Z, Yao T et al (2018b) Fully convolutional adaptation networks for semantic segmentation. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition

  • Zhang Q, Zhang M, Chen T et al (2019) Recent advances in convolutional neural network acceleration. Neurocomputing 323:37–51. https://doi.org/10.1016/j.neucom.2018.09.038

  • Zheng H, Fu J, Mei T, Luo J (2017) Learning multi-attention convolutional neural network for fine-grained image recognition. In: 2017 IEEE international conference on computer vision (ICCV), pp 5219–5227

  • Zhou B, Khosla A, Lapedriza A et al (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2921–2929

Acknowledgements

The authors would like to thank Pattern Recognition lab at DCIS, and PIEAS for providing them computational facilities. The authors express their gratitude to M. Waleed Khan of PIEAS for the detailed discussion related to the Mathematical description of the different CNN architectures.

Author information

Corresponding author

Correspondence to Asifullah Khan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Khan, A., Sohail, A., Zahoora, U. et al. A survey of the recent architectures of deep convolutional neural networks. Artif Intell Rev 53, 5455–5516 (2020). https://doi.org/10.1007/s10462-020-09825-6

  • DOI: https://doi.org/10.1007/s10462-020-09825-6
