Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures
Journal of Statistical Mechanics: Theory and Experiment (IF 2.2), Pub Date: 2020-12-22, DOI: 10.1088/1742-5468/abcd31
Carlo Baldassi, Enrico M. Malatesta, Matteo Negri, Riccardo Zecchina

We analyze the connection between minimizers with good generalizing properties and high local entropy regions of a threshold-linear classifier in Gaussian mixtures with the mean squared error loss function. We show that there exist configurations that achieve the Bayes-optimal generalization error, even in the case of unbalanced clusters. We explore analytically the error-counting loss landscape in the vicinity of a Bayes-optimal solution, and show that the closer we get to such configurations, the higher the local entropy, implying that the Bayes-optimal solution lies inside a wide flat region. We also consider the algorithmically relevant case of targeting wide flat minima of the (differentiable) mean squared error loss. Our analytical and numerical results show not only that in the balanced case the dependence on the norm of the weights is mild, but also that in the unbalanced case the performance can be improved.
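The setting described above can be illustrated numerically. The following is a minimal sketch, not the paper's actual experiment: it draws a balanced, symmetric two-cluster Gaussian mixture with means ±μ and identity covariance, trains a linear classifier by minimizing the mean squared error loss (plain least squares), and compares its test error against the Bayes-optimal error, which for this symmetric mixture is Φ(−‖μ‖) and is attained by the direction w ∝ μ. All dimensions, sample sizes, and the signal strength below are illustrative choices.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

# Illustrative toy mixture: labels y = ±1, inputs x = y*mu + Gaussian noise,
# with d-dimensional mean mu chosen so that the signal strength ||mu|| = 0.7.
d, n = 200, 2000
mu = np.full(d, 0.7 / sqrt(d))

y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + rng.standard_normal((n, d))

# Threshold-linear classifier trained on the MSE loss:
# w = argmin ||X w - y||^2, i.e. ridgeless least squares.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Generalization error of sign(x . w) on fresh data.
n_test = 20_000
y_test = rng.choice([-1.0, 1.0], size=n_test)
X_test = y_test[:, None] * mu + rng.standard_normal((n_test, d))
err = np.mean(np.sign(X_test @ w) != y_test)

# Bayes-optimal error for this symmetric balanced mixture: Phi(-||mu||).
bayes = 0.5 * (1.0 - erf(np.linalg.norm(mu) / sqrt(2.0)))
print(f"MSE-trained test error: {err:.3f}  |  Bayes-optimal: {bayes:.3f}")
```

With n/d = 10 samples per dimension, the least-squares direction aligns well with μ and the test error lands close to (but above) the Bayes value, which is the regime the abstract's balanced-case results concern.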

Updated: 2020-12-22