Halting Time is Predictable for Large Models: A Universality Property and Average-Case Analysis
Foundations of Computational Mathematics (IF 3), Pub Date: 2022-02-15, DOI: 10.1007/s10208-022-09554-y
Courtney Paquette 1, 2 , Elliot Paquette 1 , Bart van Merriënboer 2 , Fabian Pedregosa 2

Average-case analysis computes the complexity of an algorithm averaged over all possible inputs. Compared to worst-case analysis, it is more representative of the typical behavior of an algorithm, but it remains largely unexplored in optimization. One difficulty is that the analysis can depend on the probability distribution of the inputs to the model. However, we show that this is not the case for a class of large-scale problems trained with first-order methods, including random least squares and one-hidden-layer neural networks with random weights. In fact, the halting time exhibits a universality property: it is independent of the probability distribution. With this barrier to average-case analysis removed, we provide the first explicit average-case convergence rates, which show a tighter complexity not captured by traditional worst-case analysis. Finally, numerical simulations suggest that this universality property holds for a more general class of algorithms and problems.
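
The universality claim can be probed numerically with a small experiment. The following is a minimal sketch, not the authors' code: it runs plain gradient descent on a random least-squares problem and counts the iterations until the squared gradient norm falls below a relative tolerance. The function names (`sample_problem`, `halting_time`), the problem sizes, the noise scaling, the step size 1/L, and the relative stopping rule are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def sample_problem(rng, n, d, entries, noise=0.1):
    """Draw a random least-squares instance (A, b).

    The entries of A are i.i.d. with mean 0 and variance 1/n. The signal
    and noise scalings below are illustrative assumptions.
    """
    if entries == "gaussian":
        Z = rng.standard_normal((n, d))
    elif entries == "rademacher":
        Z = rng.choice([-1.0, 1.0], size=(n, d))
    else:
        raise ValueError(f"unknown entry distribution: {entries}")
    A = Z / np.sqrt(n)
    x_true = rng.standard_normal(d) / np.sqrt(d)
    b = A @ x_true + noise * rng.standard_normal(n) / np.sqrt(n)
    return A, b


def halting_time(A, b, eps=1e-6, max_iters=5000):
    """Iterations of gradient descent on 0.5 * ||Ax - b||^2 until
    ||grad||^2 <= eps * ||grad at x0||^2, starting from x0 = 0."""
    L = np.linalg.norm(A, 2) ** 2  # largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    grad = A.T @ (A @ x - b)
    g0 = grad @ grad
    for k in range(max_iters):
        if grad @ grad <= eps * g0:
            return k
        x -= grad / L  # gradient step with step size 1/L
        grad = A.T @ (A @ x - b)
    return max_iters


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for entries in ("gaussian", "rademacher"):
        A, b = sample_problem(rng, n=1000, d=500, entries=entries)
        print(entries, halting_time(A, b))
```

If the universality property holds in this setting, the two printed iteration counts should be close to each other, and the gap should shrink as `n` and `d` grow with their ratio held fixed.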




Updated: 2022-02-15