Abstract
Deep Convolutional Neural Networks (DCNNs) are highly complex and nonlinear, so it is unclear which features drive a DCNN's decisions and how these models achieve such strong results. Visualization techniques for interpreting and explaining deep models fall into two families: backpropagation-based and perturbation-based algorithms. The most notable drawback of backpropagation-based visualizations is that they cannot be applied to every architecture, whereas perturbation-based visualizations are entirely architecture-independent. The latter, however, demand substantial computation and memory, which makes them slow and expensive, and hence unsuitable for many real-world applications. To address these problems, this paper presents a perturbation-based visualization method called Fast Multi-resolution Occlusion (FMO), which is efficient in both time and resource consumption and is therefore practical for real-world applications. To compare FMO with five well-known perturbation-based visualization methods, namely Occlusion Test, super-pixel perturbation (LIME), Randomized Input Sampling (RISE), Meaningful Perturbation, and Extremal Perturbation, experiments are designed that measure time consumption, visualization quality, and localization accuracy. All methods are applied to five well-known DCNNs (DenseNet121, InceptionV3, InceptionResNetV2, MobileNet, and ResNet50) using the common benchmark datasets ImageNet, PASCAL VOC07, and COCO14.
According to the experimental results, FMO is on average 2.32 times faster than LIME on the five models DenseNet121, InceptionResNetV2, InceptionV3, MobileNet, and ResNet50 with images from the ILSVRC2012 dataset, as well as 24.84 times faster than Occlusion Test, 11.87 times faster than RISE, 8.72 times faster than Meaningful Perturbation, and 10.03 times faster than Extremal Perturbation on all five models with ImageNet images, without sacrificing visualization quality. Moreover, the methods are evaluated for localization accuracy on two challenging common datasets, PASCAL VOC07 and COCO14; the results show that FMO outperforms the compared methods in this respect as well. Finally, FMO extends the superimposing process of the Occlusion Test, yielding heatmaps of higher visual quality than the Occlusion Test on many colorful images.
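To make the perturbation-based idea underlying FMO concrete, the following is a minimal sketch of the classic occlusion test that FMO accelerates: a patch is slid over the image, the drop in the target-class score is recorded, and large drops mark regions the model relies on. This is an illustration of the general technique, not the paper's FMO algorithm; `model_fn`, the toy model, and the patch/stride values are placeholders chosen for the example.

```python
import numpy as np

def occlusion_map(model_fn, image, target_class, patch=8, stride=8, fill=0.0):
    """Slide an occluding patch over `image` and accumulate the drop in the
    target-class score; larger values mean the region matters more."""
    h, w = image.shape[:2]
    base = model_fn(image)[target_class]          # unoccluded reference score
    heat = np.zeros((h, w))
    counts = np.zeros((h, w))
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill
            score = model_fn(occluded)[target_class]
            heat[y:y + patch, x:x + patch] += base - score
            counts[y:y + patch, x:x + patch] += 1
    return heat / np.maximum(counts, 1)           # average overlapping patches

# Toy stand-in for a classifier: class 0's score is the mean brightness
# of the top-left 16x16 quadrant, so only that region should light up.
def toy_model(img):
    return np.array([img[:16, :16].mean(), img[16:, 16:].mean()])

heat = occlusion_map(toy_model, np.ones((32, 32)), target_class=0)
```

Each forward pass here is one perturbation; a real DCNN with small strides requires thousands of such passes, which is exactly the cost that motivates a faster multi-resolution scheme.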
Ethics declarations
Conflict of interest
The authors declare that they did not receive any financial or personal support from other individuals or organizations that could inappropriately influence their work. This study was approved by the ethical committee of the Persian Gulf University. All authors gave their informed consent before enrolment. In addition, the authors did not have any potential conflict of interests with respect to this work.
Cite this article
Behzadi-Khormouji, H., Rostami, H. Fast multi-resolution occlusion: a method for explaining and understanding deep neural networks. Appl Intell 51, 2431–2455 (2021). https://doi.org/10.1007/s10489-020-01946-3