SMINT: Toward Interpretable and Robust Model Sharing for Deep Neural Networks

Published: 3 May 2020

Abstract

Sharing a pre-trained machine learning model, particularly a deep neural network exposed via a prediction API, is becoming common practice on machine learning as a service (MLaaS) platforms. Although deep neural networks (DNNs) have achieved remarkable success in many tasks, they are also criticized for their lack of interpretability and transparency. Interpreting a shared DNN model faces two challenges beyond those of interpreting a general model: (1) only limited training data can be disclosed to users, and (2) the internal structure of the model may not be available. These two constraints rule out most existing interpretability approaches to DNN models, such as saliency maps and influence functions. Case-based reasoning methods have been used to interpret decisions, but how to select and organize data points under the constraints of shared DNN models has not been addressed, and simply providing cases as explanations may not be sufficient for instance-level interpretability. Meanwhile, existing interpretation methods for DNN models generally lack the means to evaluate the reliability of the interpretation. In this article, we propose a framework named Shared Model INTerpreter (SMINT) to address these limitations. We propose a new data structure, called a boundary graph, that organizes training points to mimic the predictions of a DNN model. We integrate local features, such as saliency maps and interpretable input masks, into the data structure to help users infer the model's decision boundaries. We show that the boundary graph addresses the reliability issues of many local interpretation methods. We further design an algorithm named the hidden-layer aware p-test to measure the reliability of the interpretations. Our experiments show that SMINT achieves above 99% fidelity to the corresponding DNN models on both MNIST and ImageNet while sharing only a tiny fraction of the training data to make these models interpretable. A human pilot study demonstrates that SMINT provides better interpretability than existing methods. Moreover, we demonstrate that SMINT can assist model tuning for better performance on different user data.
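The abstract describes the boundary graph only at a high level, so the sketch below is illustrative rather than the paper's actual construction. It assumes the general boundary-tree idea (greedy nearest-neighbor routing over a small set of retained training points, storing a point only when the structure currently mispredicts it); every class, method, and variable name here is hypothetical.

```python
import numpy as np


class BoundaryGraph:
    """Hypothetical sketch of a boundary-graph-style index: training points
    are kept only when the structure currently mispredicts them, so the
    stored cases concentrate near decision boundaries and the graph can
    mimic a model's predictions from a tiny subset of the training data."""

    def __init__(self):
        self.nodes = []   # list of (embedding, label, case) triples
        self.edges = {}   # node index -> list of neighboring node indices

    def _route(self, x):
        """Greedily walk edges toward the stored node closest to query x."""
        if not self.nodes:
            return None
        cur = 0
        while True:
            best = cur
            best_d = np.linalg.norm(x - self.nodes[cur][0])
            for nb in self.edges[cur]:
                d = np.linalg.norm(x - self.nodes[nb][0])
                if d < best_d:
                    best, best_d = nb, d
            if best == cur:   # no neighbor is closer: local optimum reached
                return cur
            cur = best

    def insert(self, x, label, case):
        """Store a training point only if the graph mispredicts it."""
        hit = self._route(x)
        if hit is not None and self.nodes[hit][1] == label:
            return            # already predicted correctly; do not store
        idx = len(self.nodes)
        self.nodes.append((x, label, case))
        self.edges[idx] = []
        if hit is not None:   # link the new point to its nearest rival
            self.edges[idx].append(hit)
            self.edges[hit].append(idx)

    def predict(self, x):
        """Return the predicted label plus the supporting training case,
        which doubles as a case-based explanation of the prediction."""
        hit = self._route(x)
        if hit is None:
            return None, None
        _, label, case = self.nodes[hit]
        return label, case
```

Under these assumptions, the embedding passed to `insert` and `predict` could come from the shared model's prediction API (for example, its output probabilities), so the graph can approximate the model's decision boundaries and return a supporting case as an explanation without exposing the model's internals or the full training set.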


Published in

ACM Transactions on the Web, Volume 14, Issue 3 (August 2020), 126 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/3398019

            Copyright © 2020 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 April 2019
• Revised: 1 October 2019
• Accepted: 1 February 2020
• Published: 3 May 2020


            Qualifiers

            • research-article
            • Research
            • Refereed
