Concept whitening for interpretable image recognition
Nature Machine Intelligence (IF 18.8), Pub Date: 2020-12-07, DOI: 10.1038/s42256-020-00265-z
Zhi Chen, Yijie Bei, Cynthia Rudin

What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hidden layers can be misleading, unusable or rely on the latent space to possess properties that it may not have. Here, rather than attempting to analyse a neural network post hoc, we introduce a mechanism, called concept whitening (CW), to alter a given layer of the network to allow us to better understand the computation leading up to that layer. When a concept whitening module is added to a convolutional neural network, the latent space is whitened (that is, decorrelated and normalized) and the axes of the latent space are aligned with known concepts of interest. Through experiments, we show that CW can provide us with a much clearer understanding of how the network gradually learns concepts over layers. CW is an alternative to a batch normalization layer in that it normalizes, and also decorrelates (whitens), the latent space. CW can be used in any layer of the network without hurting predictive performance.
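To make the mechanism concrete, here is a minimal PyTorch sketch of the whitening-plus-rotation idea described above. This is not the authors' released code: the helper name zca_whiten is illustrative, and the orthogonal matrix Q below is a random stand-in, whereas in CW the rotation is optimized on auxiliary concept datasets so that individual axes align with concepts of interest.

```python
# Minimal sketch of the idea behind a concept-whitening step, assuming PyTorch.
# Step 1: ZCA-whiten a batch of activations (zero mean, identity covariance).
# Step 2: apply an orthogonal rotation Q so chosen axes point along concepts.
import torch

def zca_whiten(x, eps=1e-5):
    """ZCA-whiten a (batch, features) activation matrix."""
    x = x - x.mean(dim=0, keepdim=True)           # zero-center each feature
    cov = x.T @ x / (x.shape[0] - 1)              # sample covariance
    eigvals, eigvecs = torch.linalg.eigh(cov)     # symmetric eigendecomposition
    w = eigvecs @ torch.diag((eigvals + eps).rsqrt()) @ eigvecs.T
    return x @ w                                  # decorrelated, unit variance

# Toy usage: whiten random activations, then rotate. Q here is just a random
# orthogonal matrix; in the paper it is learned from concept examples.
torch.manual_seed(0)
acts = torch.randn(128, 16)
white = zca_whiten(acts)
q, _ = torch.linalg.qr(torch.randn(16, 16))       # stand-in orthogonal Q
aligned = white @ q                               # axes now act as concept axes
print(torch.allclose(white.T @ white / 127, torch.eye(16), atol=1e-3))  # True
```

Because Q is orthogonal, the rotation preserves the whitened geometry, which is why CW can replace a batch-normalization layer without hurting predictive performance.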

A preprint version of the article is available on arXiv.


Updated: 2020-12-08