Efficient estimation of the ANOVA mean dimension, with an application to neural net classification
arXiv - CS - Numerical Analysis. Pub Date: 2020-07-02, DOI: arXiv:2007.01281
Christopher Hoyt and Art B. Owen

The mean dimension of a black box function of $d$ variables is a convenient way to summarize the extent to which it is dominated by high or low order interactions. It is expressed in terms of $2^d-1$ variance components but it can be written as the sum of $d$ Sobol' indices that can be estimated by leave one out methods. We compare the variance of these leave one out methods: a Gibbs sampler called winding stairs, a radial sampler that changes each variable one at a time from a baseline, and a naive sampler that never reuses function evaluations and so costs about double the other methods. For an additive function the radial and winding stairs are most efficient. For a multiplicative function the naive method can easily be most efficient if the factors have high kurtosis. As an illustration we consider the mean dimension of a neural network classifier of digits from the MNIST data set. The classifier is a function of $784$ pixels. For that problem, winding stairs is the best algorithm. We find that inputs to the final softmax layer have mean dimensions ranging from $1.35$ to $2.0$.
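The leave-one-out idea behind these estimators can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not the authors' code: it uses a radial-style estimator in which each input of a baseline point is replaced one at a time by an independent resample, estimates each unnormalized total Sobol' index via the identity $\underline{\tau}_j^2 = \tfrac12\,E[(f(\boldsymbol{x}) - f(\boldsymbol{x}_{-j}, z_j))^2]$, and then forms the mean dimension as the sum of these indices divided by the variance of $f$. The test function, sample sizes, and variable names are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Toy additive test function of d Gaussian inputs.
    # An additive function has mean dimension exactly 1.
    return x.sum(axis=1)

d, n = 5, 10_000
x = rng.standard_normal((n, d))  # baseline points
z = rng.standard_normal((n, d))  # independent resamples for each input

fx = f(x)

# Unnormalized total Sobol' index for each input j, estimated by
# changing only coordinate j of the baseline (radial / leave-one-out):
# tau2[j] ~ E[(f(x) - f(x with x_j replaced by z_j))^2] / 2
tau2 = np.empty(d)
for j in range(d):
    xj = x.copy()
    xj[:, j] = z[:, j]
    tau2[j] = np.mean((fx - f(xj)) ** 2) / 2.0

sigma2 = fx.var()
# Mean dimension: sum of the d total indices over the total variance.
mean_dimension = tau2.sum() / sigma2
```

For this additive example the estimate should be close to 1; a multiplicative or interaction-heavy `f` would push it toward higher values, and the relative efficiency of radial, winding stairs, and naive sampling is exactly what the paper compares.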

Updated: 2020-09-29