Performance Analysis of Out-of-Distribution Detection on Trained Neural Networks
Information and Software Technology (IF 3.9), Pub Date: 2020-09-11, DOI: 10.1016/j.infsof.2020.106409
Jens Henriksson, Christian Berger, Markus Borg, Lars Tornberg, Sankar Raman Sathyamoorthy, Cristofer Englund

Context: Deep Neural Networks (DNNs) have shown great promise in various domains, for example to support pattern recognition in medical imagery. However, DNNs need to be tested for robustness before being deployed in safety-critical applications. One common challenge occurs when the model is exposed to data samples outside of the training data domain, which can yield outputs with high confidence despite the model having no prior knowledge of the given input.

Objective: The aim of this paper is to investigate how the performance of detecting out-of-distribution (OOD) samples changes for outlier detection methods (e.g., supervisors) as DNNs become better trained on their training samples.

Method: Supervisors are components that aim to detect out-of-distribution samples for a DNN. The experimental setup in this work compares the performance of supervisors using metrics and datasets that reflect the most common setups in related works. Four different DNNs with three different supervisors are compared at different stages of training, to detect at what point during training the performance of the supervisors begins to deteriorate.
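For illustration only, the sketch below shows one plausible supervisor and evaluation, assuming the common softmax-confidence baseline (Hendrycks & Gimpel, 2017) scored with AUROC against an OOD set; the abstract does not prescribe this particular supervisor, and the logits used here are synthetic placeholders.

# A minimal illustration (not the paper's implementation) of a "supervisor":
# a component that flags inputs as out-of-distribution for a trained DNN.
# Assumed here: a softmax-confidence baseline supervisor evaluated with AUROC.
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def supervisor_score(logits: np.ndarray) -> np.ndarray:
    """Anomaly score per sample: 1 - max softmax probability.
    Higher scores suggest the input lies outside the training distribution."""
    return 1.0 - softmax(logits).max(axis=1)


def auroc(scores_in: np.ndarray, scores_out: np.ndarray) -> float:
    """AUROC of separating OOD (positive) from in-distribution (negative)
    samples, computed via the rank-sum (Mann-Whitney U) formulation."""
    scores = np.concatenate([scores_in, scores_out])
    labels = np.concatenate([np.zeros_like(scores_in), np.ones_like(scores_out)])
    order = np.argsort(scores)
    ranks = np.empty_like(scores)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos, n_neg = len(scores_out), len(scores_in)
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in logits for one training checkpoint: in-distribution inputs get a
    # strong class signal, OOD inputs are pure noise (placeholders, not real data).
    logits_in = rng.normal(0, 1, (1000, 10)) + np.eye(10)[rng.integers(0, 10, 1000)] * 4
    logits_out = rng.normal(0, 1, (1000, 10))
    print(f"AUROC: {auroc(supervisor_score(logits_in), supervisor_score(logits_out)):.3f}")

In such a setup, the same evaluation would be repeated at several training checkpoints to relate supervisor performance to the accuracy of the underlying DNN.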

Results: We found that the outlier detection performance of the supervisors increased as the accuracy of the underlying DNN improved. However, all supervisors showed a large variation in performance, even for variations of network parameters that only marginally changed the model accuracy. The results showed that understanding the relationship between training results and supervisor performance is crucial for improving a model's robustness.

Conclusion: Analyzing DNNs for robustness is a challenging task. The results showed that variations in model parameters that have only a small effect on model predictions can have a large impact on out-of-distribution detection performance. This kind of behavior needs to be addressed when DNNs are part of a safety-critical application, and hence the necessary safety argumentation for such systems needs to be structured accordingly.


