
Fast multi-resolution occlusion: a method for explaining and understanding deep neural networks

Published in Applied Intelligence.

Abstract

Deep Convolutional Neural Networks (DCNNs) are highly complex and nonlinear, so it is unclear which features DCNN models base their decisions on and how they reach such promising results. Visualization techniques for interpreting and explaining deep models fall into two groups: backpropagation-based and perturbation-based algorithms. The most notable drawback of backpropagation-based visualizations is that they cannot be applied to all architectures, whereas perturbation-based visualizations are entirely architecture-independent. The latter, however, consume substantial computation and memory, which makes them slow and expensive, and thereby unsuitable for many real-world applications. To cope with these problems, this paper presents a perturbation-based visualization method called Fast Multi-resolution Occlusion (FMO), which is efficient in terms of time and resource consumption and can therefore be considered for real-world applications. To compare FMO with five well-known perturbation-based visualization methods, namely Occlusion Test, super-pixel perturbation (LIME), Randomized Input Sampling (RISE), Meaningful Perturbation and Extremal Perturbation, experiments are designed that measure time consumption, visualization quality and localization accuracy. All methods are applied to five well-known DCNNs (DenseNet121, InceptionV3, InceptionResNetV2, MobileNet and ResNet50) using the common benchmark datasets ImageNet, PASCAL VOC07 and COCO14.
According to the experimental results, FMO is on average 2.32 times faster than LIME on the five models DenseNet121, InceptionResNetV2, InceptionV3, MobileNet and ResNet50 with images from the ILSVRC2012 dataset, as well as 24.84 times faster than Occlusion Test, 11.87 times faster than RISE, 8.72 times faster than Meaningful Perturbation and 10.03 times faster than Extremal Perturbation on all five models with ImageNet images, without sacrificing visualization quality. Moreover, the methods are evaluated in terms of localization accuracy on two challenging benchmark datasets, PASCAL VOC07 and COCO14. The results show that FMO outperforms the compared methods in localization accuracy. In addition, FMO extends the superimposing process of the Occlusion Test method, yielding heatmaps of higher visual quality than the Occlusion Test on many colorful images.
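For context, the classical Occlusion Test that FMO accelerates can be sketched in a few lines: slide an occluding patch over the input and record how much the model's class score drops at each position; large drops mark regions the model relies on. The sketch below is illustrative only and is not the authors' FMO implementation; the `score_fn` callback, patch size, stride and gray fill value are assumptions chosen for the example.

```python
import numpy as np

def occlusion_heatmap(image, score_fn, patch=16, stride=8, fill=0.5):
    """Occlusion-sensitivity sketch: occlude each patch position and
    record the drop in the model's class score for that position."""
    H, W = image.shape[:2]
    base = score_fn(image)  # score on the unoccluded image
    rows = (H - patch) // stride + 1
    cols = (W - patch) // stride + 1
    heat = np.zeros((rows, cols))
    for i, y in enumerate(range(0, H - patch + 1, stride)):
        for j, x in enumerate(range(0, W - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill  # gray-out one patch
            heat[i, j] = base - score_fn(occluded)     # score drop
    return heat

# Toy demo: a "model" whose score depends only on the center square.
img = np.zeros((32, 32))
img[8:16, 8:16] = 1.0
heat = occlusion_heatmap(img, lambda im: float(im[8:16, 8:16].sum()),
                         patch=8, stride=8)
```

The low-resolution heatmap is typically upsampled to the input size and superimposed on the image; the cost of this baseline grows with the number of patch positions times one forward pass each, which is the expense FMO is designed to reduce.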


Figs. 1–13 (figures omitted from this preview).




Corresponding author

Correspondence to Habib Rostami.

Ethics declarations

Conflict of interest

The authors declare that they did not receive any financial or personal support from other individuals or organizations that could inappropriately influence their work. This study was approved by the ethical committee of the Persian Gulf University. All authors gave their informed consent before enrolment. In addition, the authors have no potential conflicts of interest with respect to this work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Behzadi-Khormouji, H., Rostami, H. Fast multi-resolution occlusion: a method for explaining and understanding deep neural networks. Appl Intell 51, 2431–2455 (2021). https://doi.org/10.1007/s10489-020-01946-3

