Social Distancing is Good for Points too!
arXiv - CS - Computational Geometry Pub Date : 2020-06-28 , DOI: arxiv-2006.15650
Alejandro Flores-Velazco

The nearest-neighbor rule is a well-known classification technique that, given a training set P of labeled points, classifies any unlabeled query point with the label of its closest point in P. The nearest-neighbor condensation problem aims to reduce the training set without harming the accuracy of the nearest-neighbor rule. FCNN is the most popular algorithm for condensation. It is heuristic in nature, and theoretical results for it are scarce. In this paper, we settle the question of whether reasonable upper bounds can be proven for the size of the subset selected by FCNN. First, we show that the algorithm can behave poorly when points are too close to each other, forcing it to select many more points than necessary. We then modify the algorithm to avoid such cases by requiring that selected points "keep some distance" from each other. This modification is sufficient to prove useful upper bounds, along with approximation guarantees for the algorithm.
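
To make the setting concrete, here is a minimal Python sketch of the nearest-neighbor rule and of one common way to read "without harming the accuracy": a subset is acceptable if every training point is still classified correctly by its nearest neighbor in that subset. The function names (`nearest_label`, `is_consistent`), the Euclidean metric, and the toy data are assumptions made for illustration; this is not the FCNN algorithm or the modified algorithm from the paper.

```python
import math

def nearest_label(training, query):
    """Nearest-neighbor rule: return the label of the training point closest to `query`.

    `training` is a list of (point, label) pairs; Euclidean distance is assumed.
    """
    _, label = min(training, key=lambda pl: math.dist(pl[0], query))
    return label

def is_consistent(subset, training):
    """Check that every training point gets its own label back when it is
    classified against `subset` only (an illustrative condensation criterion)."""
    return all(nearest_label(subset, point) == label for point, label in training)

# Hypothetical toy training set P: two well-separated classes in the plane.
P = [
    ((0.0, 0.0), "red"),
    ((0.2, 0.1), "red"),
    ((1.0, 1.0), "blue"),
    ((0.9, 1.1), "blue"),
]

# A candidate condensed subset R with one point per class.
R = [P[0], P[2]]

print(nearest_label(P, (0.1, 0.3)))  # label of the point of P closest to the query
print(is_consistent(R, P))           # whether R classifies all of P correctly
```

In this toy case two of the four points suffice. The abstract's concern is how large the selected subset can become on harder inputs, in particular when points of different classes lie very close together.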

Updated: 2020-06-30