Optimal representative sample weighting
Statistics and Computing (IF 2.2), Pub Date: 2021-02-28, DOI: 10.1007/s11222-021-10001-1
Shane Barratt, Guillermo Angeris, Stephen Boyd

We consider the problem of assigning weights to a set of samples or data records, with the goal of achieving a representative weighting, which happens when certain sample averages of the data are close to prescribed values. We frame the problem of finding representative sample weights as an optimization problem, which in many cases is convex and can be efficiently solved. Our formulation includes as a special case the selection of a fixed number of the samples, with equal weights, i.e., the problem of selecting a smaller representative subset of the samples. While this problem is combinatorial and not convex, heuristic methods based on convex optimization seem to perform very well. We describe our open-source implementation rsw and apply it to a skewed sample of the CDC BRFSS dataset.
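As a rough illustration of the convex case described in the abstract, the sketch below picks nonnegative weights that sum to one so that weighted sample averages land close to prescribed targets. It is not the rsw implementation and does not use its API. The function name `representative_weights`, the squared-error objective, and the exponentiated-gradient solver are all assumptions chosen to keep the example dependency-free.

```python
import math

def representative_weights(F, targets, step=1.0, iters=2000):
    """Hypothetical helper: F[j][i] is the value of function j on sample i,
    targets[j] is the desired weighted average of function j.
    Minimizes 0.5 * ||F w - targets||^2 over the probability simplex
    using exponentiated-gradient (mirror descent) updates."""
    n = len(F[0])
    w = [1.0 / n] * n  # start from uniform weights
    for _ in range(iters):
        # Gap between each current weighted average and its target.
        resid = [sum(Fj[i] * w[i] for i in range(n)) - t
                 for Fj, t in zip(F, targets)]
        # Gradient of the squared-error objective with respect to each weight.
        grad = [sum(Fj[i] * r for Fj, r in zip(F, resid)) for i in range(n)]
        # Multiplicative update keeps weights positive; renormalize to sum to 1.
        w = [wi * math.exp(-step * g) for wi, g in zip(w, grad)]
        total = sum(w)
        w = [wi / total for wi in w]
    return w

# Skewed toy sample: 8 of 10 records belong to a group (indicator = 1),
# while the prescribed population share of that group is 0.5.
F = [[1.0] * 8 + [0.0] * 2]
targets = [0.5]
w = representative_weights(F, targets)
group_share = sum(w[:8])  # weighted share of the over-represented group
```

In this toy case the over-represented records are down-weighted to roughly 0.0625 each and the two under-represented records are raised to roughly 0.25 each, so the weighted group share matches the target of 0.5. Starting from uniform weights also tends to keep the solution close to uniform, which is one informal stand-in for a regularizer; a real application would likely state that preference explicitly in the objective.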




Updated: 2021-02-28