Social Distancing is Good for Points too!
arXiv - CS - Computational Geometry, Pub Date: 2020-06-28, DOI: arxiv-2006.15650, Author: Alejandro Flores-Velazco
The nearest-neighbor rule is a well-known classification technique that,
given a training set P of labeled points, classifies any unlabeled query point
with the label of its closest point in P. The nearest-neighbor condensation
problem aims to reduce the training set without harming the accuracy of the
nearest-neighbor rule. FCNN is the most popular algorithm for condensation. It is heuristic in
nature, and theoretical results for it are scarce. In this paper, we settle the
question of whether reasonable upper-bounds can be proven for the size of the
subset selected by FCNN. First, we show that the algorithm can behave poorly
when points are too close to each other, forcing it to select many more points
than necessary. We then successfully modify the algorithm to avoid such cases,
thus imposing that selected points should "keep some distance". This
modification is sufficient to prove useful upper-bounds, along with
approximation guarantees for the algorithm.
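
To make the abstract's terms concrete, here is a minimal Python sketch of the nearest-neighbor rule and of a condensation loop in the spirit of FCNN. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`nn_label`, `is_consistent`, `fcnn_style_condense`) are hypothetical, points are assumed to be distinct coordinate tuples under Euclidean distance, and the loop follows the commonly described FCNN idea (seed with one point per class near its centroid, then repeatedly add the closest misclassified point in each selected point's cell). The distance-keeping modification mentioned in the abstract is not implemented here.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nn_label(query, training):
    """Nearest-neighbor rule: return the label of the closest point in `training`."""
    _, label = min(training, key=lambda pl: dist(query, pl[0]))
    return label


def is_consistent(subset, training):
    """True if `subset` classifies every point of `training` correctly."""
    return all(nn_label(p, subset) == label for p, label in training)


def fcnn_style_condense(training):
    """Simplified FCNN-style condensation (illustrative; assumes distinct points)."""
    # Seed: for each class, the training point closest to that class's centroid.
    by_label = {}
    for p, label in training:
        by_label.setdefault(label, []).append(p)
    subset = []
    for label, pts in by_label.items():
        centroid = tuple(sum(c) / len(pts) for c in zip(*pts))
        subset.append((min(pts, key=lambda p: dist(p, centroid)), label))

    while True:
        # For each selected point, find the closest misclassified training
        # point among those whose nearest selected point it is.
        nearest_enemy = {}
        for p, label in training:
            rep, rep_label = min(subset, key=lambda pl: dist(p, pl[0]))
            if rep_label != label:
                best = nearest_enemy.get(rep)
                if best is None or dist(p, rep) < dist(best[0], rep):
                    nearest_enemy[rep] = (p, label)
        if not nearest_enemy:
            return subset  # every training point is now classified correctly
        subset.extend(nearest_enemy.values())


if __name__ == "__main__":
    training = [((0, 0), "a"), ((1, 0), "a"), ((2, 0), "a"), ((4, 0), "b"),
                ((10, 0), "b"), ((11, 0), "b"), ((12, 0), "b")]
    subset = fcnn_style_condense(training)
    print(len(subset), "of", len(training), "points kept;",
          "consistent:", is_consistent(subset, training))
```

The returned subset is consistent in the sense that the nearest-neighbor rule restricted to it still labels every training point correctly, which is one common way to formalize "without harming the accuracy".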
Updated: 2020-06-30