Early-stopped neural networks are consistent
arXiv - CS - Machine Learning. Pub Date: 2021-06-10. DOI: arXiv-2106.05932. Authors: Ziwei Ji, Justin D. Li, Matus Telgarsky
This work studies the behavior of neural networks trained with the logistic
loss via gradient descent on binary classification data where the underlying
data distribution is general, and the (optimal) Bayes risk is not necessarily
zero. In this setting, it is shown that gradient descent with early stopping
achieves population risk arbitrarily close to optimal in terms of not just
logistic and misclassification losses, but also in terms of calibration,
meaning the sigmoid mapping of its outputs approximates the true underlying
conditional distribution arbitrarily finely. Moreover, the necessary iteration,
sample, and architectural complexities of this analysis all scale naturally
with a certain complexity measure of the true conditional model. Lastly, while
it is not shown that early stopping is necessary, it is shown that any
univariate classifier satisfying a local interpolation property is necessarily
inconsistent.
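The setting described above can be sketched in code. The following is a minimal illustration, not the paper's construction: it trains a linear model (standing in for the network) by gradient descent on the logistic loss over synthetic data with nonzero Bayes risk, stops early when held-out logistic risk plateaus, and then checks calibration by comparing the sigmoid of the model's outputs against the true conditional probabilities. The data model, the patience-based stopping rule, and all numeric settings are assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary data with nonzero Bayes risk: labels are noisy
# draws from a true conditional model P(y=1|x) = sigmoid(x . w_true).
n, d = 400, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
p_cond = 1.0 / (1.0 + np.exp(-X @ w_true))
y = (rng.uniform(size=n) < p_cond).astype(float)  # labels in {0, 1}

# Train / validation split; validation risk drives early stopping.
X_tr, X_va = X[:300], X[300:]
y_tr, y_va = y[:300], y[300:]

def logistic_loss(w, X, y):
    # Mean of log(1 + exp(-s * z)) with signed labels s in {-1, +1},
    # computed stably via logaddexp.
    s = 2 * y - 1
    return np.mean(np.logaddexp(0.0, -s * (X @ w)))

def grad(w, X, y):
    # Gradient of the mean logistic loss: -s * sigmoid(-s * z) * x.
    s = 2 * y - 1
    sig = 1.0 / (1.0 + np.exp(s * (X @ w)))
    return -(X * (s * sig)[:, None]).mean(axis=0)

w = np.zeros(d)
lr = 0.5
best_w, best_va = w.copy(), np.inf
patience, bad = 50, 0

for t in range(5000):
    w -= lr * grad(w, X_tr, y_tr)
    va = logistic_loss(w, X_va, y_va)
    if va < best_va - 1e-6:
        best_va, best_w, bad = va, w.copy(), 0
    else:
        bad += 1
        if bad >= patience:  # stop once validation risk stops improving
            break

# Calibration check: sigmoid of the early-stopped model's outputs
# should approximate the true conditional distribution.
p_hat = 1.0 / (1.0 + np.exp(-(X_va @ best_w)))
p_true = 1.0 / (1.0 + np.exp(-(X_va @ w_true)))
cal_err = np.mean(np.abs(p_hat - p_true))
```

Because the labels are intrinsically noisy, the training loss cannot reach zero; the early-stopped iterate nonetheless tracks the true conditional probabilities on held-out data, which is the calibration property the abstract refers to.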
Updated: 2021-06-11