Diametrical Risk Minimization: theory and computations
Machine Learning (IF 7.5), Pub Date: 2021-09-02, DOI: 10.1007/s10994-021-06036-0
Matthew D. Norton, Johannes O. Royset

The theoretical and empirical performance of Empirical Risk Minimization (ERM) often suffers when loss functions are poorly behaved, with large Lipschitz moduli and spurious sharp minimizers. We propose and analyze a counterpart to ERM called Diametrical Risk Minimization (DRM), which accounts for worst-case empirical risks within neighborhoods in parameter space. DRM has generalization bounds that are independent of Lipschitz moduli for convex as well as nonconvex problems, and it can be implemented using a practical algorithm based on stochastic gradient descent. Numerical results illustrate the ability of DRM to find quality solutions with low generalization error in sharp empirical risk landscapes arising from benchmark neural network classification problems with corrupted labels.
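The abstract describes DRM as minimizing the worst-case empirical risk over a neighborhood of the current parameters, implemented with a stochastic-gradient-style algorithm. The following is a minimal sketch of that idea, not the paper's actual algorithm: it approximates the inner maximization over a gamma-ball by random sampling (an assumption for illustration) and takes a gradient step at the worst sampled point, using a toy logistic-regression loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w, X, y):
    # logistic loss for a linear classifier with labels in {-1, +1}
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

def grad(w, X, y):
    # gradient of the logistic loss above
    s = -y / (1.0 + np.exp(y * (X @ w)))
    return (X * s[:, None]).mean(axis=0)

def drm_sgd_step(w, X, y, gamma=0.1, lr=0.5, n_samples=8):
    # crude inner maximizer: sample points on the gamma-sphere around w
    # and keep the one with the largest empirical risk
    candidates = [w]
    for _ in range(n_samples):
        d = rng.standard_normal(w.shape)
        d *= gamma / np.linalg.norm(d)
        candidates.append(w + d)
    worst = max(candidates, key=lambda v: loss(v, X, y))
    # descend using the gradient evaluated at the worst-case point
    return w - lr * grad(worst, X, y)

# toy separable data
X = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y = np.sign(X @ w_true)

w = np.zeros(5)
for _ in range(100):
    w = drm_sgd_step(w, X, y)
```

Because the step targets the worst risk in a neighborhood rather than the risk at the current point, the iterates are pushed away from sharp minimizers toward flatter regions, which is the intuition behind DRM's Lipschitz-independent generalization bounds.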




Updated: 2021-09-03