Star Discrepancy Subset Selection: Problem Formulation and Efficient Approaches for Low Dimensions,arXiv - CS - Numerical Analysis



Star Discrepancy Subset Selection: Problem Formulation and Efficient Approaches for Low Dimensions
arXiv - CS - Numerical Analysis, Pub Date: 2021-01-19, DOI: arxiv-2101.07881
Carola Doerr, Luís Paquete

Motivated by applications in instance selection, we introduce the \emph{star discrepancy subset selection problem}, which consists of finding a subset of $m$ out of $n$ points that minimizes the star discrepancy. We introduce two mixed integer linear programming (MILP) formulations and a combinatorial branch-and-bound (BB) algorithm for this problem, and we evaluate our approaches against random subset selection and a greedy construction on different use cases in dimensions two and three. Our results show that, in dimension two, one of the MILPs is efficient for large $m/n$ ratios and BB is efficient for small $m/n$ ratios, provided $n$ is not too large. However, the performance of both approaches degrades sharply for larger dimensions and set sizes. As a by-product of our empirical comparisons, we obtain point sets whose discrepancy values are much smaller than those of common low-discrepancy sequences, random point sets, and Latin Hypercube Sampling. This suggests that subset selection could be an interesting approach for generating point sets of small discrepancy value.
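To make the problem statement concrete, here is a minimal Python sketch of the objective, not the authors' MILP or BB method. It assumes points lie in $[0,1]^d$ and evaluates the star discrepancy exactly by checking every corner of the grid spanned by the point coordinates and 1, then picks the best $m$-subset by exhaustive enumeration. The function names `star_discrepancy` and `best_subset` are hypothetical, and the approach is only practical for very small $n$ and low dimension.

```python
from itertools import combinations, product


def star_discrepancy(points):
    """Exact star discrepancy of points in [0,1]^d by grid enumeration.

    The supremum over anchored boxes [0, q) is attained at corners q whose
    coordinates come from the point coordinates or 1, so checking that
    finite grid is enough. Cost grows like (n+1)^d * n.
    """
    n = len(points)
    d = len(points[0])
    # Candidate corner coordinates per axis: point coordinates plus 1.0.
    axes = [sorted({p[k] for p in points} | {1.0}) for k in range(d)]
    worst = 0.0
    for q in product(*axes):
        vol = 1.0
        for c in q:
            vol *= c
        # Points in the closed box [0, q] and in the open box [0, q).
        closed = sum(all(p[k] <= q[k] for k in range(d)) for p in points)
        opened = sum(all(p[k] < q[k] for k in range(d)) for p in points)
        worst = max(worst, closed / n - vol, vol - opened / n)
    return worst


def best_subset(points, m):
    """Brute-force subset selection: try all m-subsets, keep the best one."""
    return min(combinations(points, m), key=star_discrepancy)


if __name__ == "__main__":
    pts = [(0.5, 0.5), (0.9, 0.9), (0.1, 0.1)]
    subset = best_subset(pts, 1)
    print(subset, star_discrepancy(subset))
```

Exhaustive enumeration visits all $\binom{n}{m}$ subsets, which is why dedicated formulations such as those described in the abstract are needed beyond toy sizes.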


Updated: 2021-01-21
