Coresets for Regressions with Panel Data
arXiv - CS - Computational Geometry Pub Date : 2020-11-02 , DOI: arxiv-2011.00981
Lingxiao Huang, K. Sudhir, Nisheeth K. Vishnoi

This paper introduces the problem of constructing coresets for regression to the panel data setting. We first define coresets for several variants of regression problems with panel data, and then present efficient algorithms that construct coresets whose size depends polynomially on 1/$\varepsilon$ (where $\varepsilon$ is the error parameter) and on the number of regression parameters, independent of both the number of individuals in the panel and the number of time units over which each individual is observed. Our approach is based on the Feldman-Langberg framework, in which a key step is to upper bound the "total sensitivity", which is roughly the sum, over all individual-time pairs, of each pair's maximum influence taken over all possible choices of regression parameters. Empirically, we assess our approach on synthetic and real-world datasets; the coresets constructed using our approach are much smaller than the full dataset, and they indeed reduce the running time of computing the regression objective.
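The sampling step that sensitivity-based frameworks such as Feldman-Langberg rely on can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes that an upper bound on the sensitivity of each individual-time pair has already been computed (deriving those bounds for panel data is the paper's contribution and is not reproduced here). The names `sensitivity_sample` and `weighted_cost` are hypothetical, and items are referred to by a flat index standing in for an individual-time pair.

```python
import random


def sensitivity_sample(scores, m, seed=0):
    """Importance-sample m items with probability proportional to `scores`.

    `scores` are assumed to be precomputed, strictly positive upper bounds
    on per-item sensitivity (an assumption of this sketch). Each sampled
    item gets weight 1 / (m * p_i), where p_i = scores[i] / sum(scores),
    so that weighted sums over the sample are unbiased estimates of the
    corresponding sums over the full dataset.
    """
    if m <= 0 or not scores or any(s <= 0 for s in scores):
        raise ValueError("need m > 0 and strictly positive scores")
    rng = random.Random(seed)
    total = sum(scores)  # plays the role of the total sensitivity bound
    picks = rng.choices(range(len(scores)), weights=scores, k=m)
    return [(i, total / (m * scores[i])) for i in picks]


def weighted_cost(sample, costs):
    """Estimate sum(costs) from a weighted sample of (index, weight) pairs."""
    return sum(w * costs[i] for i, w in sample)
```

In this sketch, items with larger sensitivity bounds are sampled more often but receive smaller weights, which is what keeps the weighted estimate unbiased. In sensitivity-sampling analyses the sample size `m` needed for a given error $\varepsilon$ typically grows with the total of the bounds, which is why an upper bound on the total sensitivity is the key quantity.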

Updated: 2020-11-04