Linear-Time Approximation Scheme for k-Means Clustering of Affine Subspaces
arXiv - CS - Computational Geometry Pub Date : 2021-06-27 , DOI: arxiv-2106.14176
Kyungjin Cho, Eunjin Oh

In this paper, we present a linear-time approximation scheme for $k$-means clustering of \emph{incomplete} data points in $d$-dimensional Euclidean space. An \emph{incomplete} data point with $\Delta>0$ unspecified entries is represented as an axis-parallel affine subspace of dimension $\Delta$. The distance between two incomplete data points is defined as the Euclidean distance between the two closest points in the axis-parallel affine subspaces corresponding to the data points. We present an algorithm for $k$-means clustering of axis-parallel affine subspaces of dimension $\Delta$ that yields a $(1+\epsilon)$-approximate solution in $O(nd)$ time. The constants hidden behind $O(\cdot)$ depend only on $\Delta$, $\epsilon$, and $k$. This improves the $O(n^2 d)$-time algorithm by Eiben et al. [SODA'21] by a factor of $n$.
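
The distance defined in the abstract can be illustrated with a small sketch. It is not the paper's algorithm, only the distance function, and it assumes a representation that the abstract does not prescribe: each point is a sequence in which `None` marks an unspecified entry. The name `incomplete_distance` is hypothetical. Because the subspaces are axis-parallel, any coordinate that is unspecified in either point can be matched exactly, so only coordinates specified in both points contribute to the distance.

```python
import math
from typing import Optional, Sequence

# Assumed representation: None marks an unspecified (free) coordinate.
Point = Sequence[Optional[float]]


def incomplete_distance(p: Point, q: Point) -> float:
    """Euclidean distance between the closest points of two
    axis-parallel affine subspaces given as incomplete points."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    squared = 0.0
    for a, b in zip(p, q):
        # A coordinate that is free in either point can be chosen to
        # match the other, so it adds nothing to the distance.
        if a is not None and b is not None:
            squared += (a - b) ** 2
    return math.sqrt(squared)
```

For example, `incomplete_distance([0.0, None, 0.0], [3.0, 5.0, 4.0])` compares only the first and third coordinates and returns `5.0`, while two points with no commonly specified coordinate, such as `[1.0, None]` and `[None, 2.0]`, are at distance `0.0`.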

Updated: 2021-06-29