Nonparametric regression using deep neural networks with ReLU activation function
Annals of Statistics (IF 4.5), Pub Date: 2020-08-01, DOI: 10.1214/19-aos1875
Johannes Schmidt-Hieber

Consider the multivariate nonparametric regression model. It is shown that estimators based on sparsely connected deep neural networks with the ReLU activation function and a properly chosen network architecture achieve the minimax rates of convergence (up to $\log n$-factors) under a general composition assumption on the regression function. The framework includes many well-studied structural constraints such as (generalized) additive models. While there is a lot of flexibility in the network architecture, the tuning parameter is the sparsity of the network. Specifically, we consider large networks with the number of potential network parameters exceeding the sample size. The analysis gives some insight into why multilayer feedforward neural networks perform well in practice. Interestingly, for the ReLU activation function the depth (number of layers) of the neural network architecture plays an important role, and our theory suggests that for nonparametric regression, scaling the network depth with the sample size is natural. It is also shown that under the composition assumption, wavelet estimators can only achieve suboptimal rates.
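The sketch below is a minimal illustration, not the paper's estimator: it fits a deep ReLU feedforward network to synthetic data whose regression function has a composed structure, lets the depth grow like $\log n$ as the theory suggests, and imposes sparsity by simple magnitude pruning. The specific choices (width 32, 80% pruning, Adam, 2000 steps) are assumptions made only for this example.

```python
# Illustrative sketch only: deep ReLU network for nonparametric regression,
# depth ~ log(n), sparsity via crude magnitude pruning (assumed choices).
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: f is a composition of simple functions, mirroring the
# composition assumption on the regression function.
n = 1000
x = torch.rand(n, 2)
f = lambda x: torch.sin(3 * x[:, :1]) * torch.exp(-x[:, 1:])
y = f(x) + 0.1 * torch.randn(n, 1)

# Depth scales with the sample size (here roughly log n).
depth = max(2, int(math.log(n)))
width = 32
layers, d_in = [], 2
for _ in range(depth):
    layers += [nn.Linear(d_in, width), nn.ReLU()]
    d_in = width
layers += [nn.Linear(width, 1)]
net = nn.Sequential(*layers)

# Least-squares (empirical risk) fit.
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()

# Crude sparsification: keep only the largest 20% of weights in magnitude,
# a stand-in for the sparsely connected networks analyzed in the paper.
with torch.no_grad():
    for p in net.parameters():
        if p.dim() == 2:  # weight matrices only
            k = int(0.8 * p.numel())
            thresh = p.abs().flatten().kthvalue(k).values
            p.mul_((p.abs() > thresh).float())

print("final training MSE:", ((net(x) - y) ** 2).mean().item())
```

In this toy setup the number of potential parameters of the dense network can exceed the sample size, and the pruning step plays the role of the sparsity tuning parameter mentioned in the abstract.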

Updated: 2020-08-01