Streaming constrained binary logistic regression with online standardized data
Journal of Applied Statistics (IF 1.2), Pub Date: 2021-01-06, DOI: 10.1080/02664763.2020.1870672
Benoît Lalloué 1, 2 , Jean-Marie Monnez 1, 2 , Eliane Albuisson 3, 4, 5

ABSTRACT

Online learning is a method for analyzing very large datasets (‘big data’) as well as data streams. In this article, we consider constrained binary logistic regression and show the benefit of using processes that standardize the data online, in particular to avoid numerical explosions and to allow the use of shrinkage methods. We prove the almost sure convergence of such a process and propose a piecewise constant step-size, chosen so that the step-size does not decrease too quickly and does not reduce the speed of convergence. We compare twenty-four stochastic approximation processes, using raw or online standardized data, on five real or simulated data sets. The results show that, unlike processes with raw data, processes with online standardized data can prevent numerical explosions and yield the best results.
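The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it is a plain, unconstrained stochastic gradient process for logistic regression in which each observation is standardized with running estimates of the mean and variance (Welford's update) before the gradient step. The step-size is held constant within blocks of observations and lowered between blocks. The names `OnlineStandardizer`, `step_size`, `fit_stream` and `simulate`, the block length of 100, and the exponent 2/3 are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


class OnlineStandardizer:
    """Running per-feature mean and variance (Welford's update)."""

    def __init__(self, n_features):
        self.n = 0
        self.mean = [0.0] * n_features
        self.m2 = [0.0] * n_features  # running sum of squared deviations

    def update(self, x):
        self.n += 1
        for j, xj in enumerate(x):
            delta = xj - self.mean[j]
            self.mean[j] += delta / self.n
            self.m2[j] += delta * (xj - self.mean[j])

    def transform(self, x):
        out = []
        for j, xj in enumerate(x):
            var = self.m2[j] / self.n if self.n > 1 else 0.0
            std = math.sqrt(var) if var > 0.0 else 1.0  # guard against zero variance
            out.append((xj - self.mean[j]) / std)
        return out


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def step_size(n, a0=1.0, block=100, power=2.0 / 3.0):
    """Piecewise constant step: fixed inside each block of `block` observations.

    The values of a0, block and power are illustrative assumptions.
    """
    k = n // block  # index of the current block
    return a0 / (1 + k) ** power


def fit_stream(stream, n_features):
    """One pass over (x, y) pairs; returns coefficients on the standardized scale."""
    scaler = OnlineStandardizer(n_features)
    theta = [0.0] * (n_features + 1)  # last entry is the intercept
    for n, (x, y) in enumerate(stream):
        scaler.update(x)
        z = scaler.transform(x) + [1.0]
        p = sigmoid(sum(t * v for t, v in zip(theta, z)))
        a = step_size(n)
        for j in range(len(theta)):
            theta[j] -= a * (p - y) * z[j]  # gradient of the log-loss
    return theta, scaler


def simulate(n, seed=0):
    """Hypothetical data stream with features on very different scales."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.gauss(1000.0, 500.0), rng.gauss(0.0, 0.01)]
        logit = 2.0 * (x[0] - 1000.0) / 500.0 - 1.0 * x[1] / 0.01
        y = 1 if rng.random() < sigmoid(logit) else 0
        yield x, y


theta, scaler = fit_stream(simulate(5000), n_features=2)
```

Because the gradient is computed on standardized inputs, its magnitude stays moderate even though the first raw feature is of order 1000. The returned coefficients apply to the standardized features; mapping them back to the raw scale requires dividing by the running standard deviations and adjusting the intercept.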




Updated: 2021-01-06