Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces
arXiv - CS - Discrete Mathematics Pub Date : 2021-07-24 , DOI: arxiv-2107.11530
Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, Sandip Sinha

In the standard trace reconstruction problem, the goal is to \emph{exactly} reconstruct an unknown source string $\mathsf{x} \in \{0,1\}^n$ from independent "traces", which are copies of $\mathsf{x}$ that have been corrupted by a $\delta$-deletion channel which independently deletes each bit of $\mathsf{x}$ with probability $\delta$ and concatenates the surviving bits. We study the \emph{approximate} trace reconstruction problem, in which the goal is only to obtain a high-accuracy approximation of $\mathsf{x}$ rather than an exact reconstruction. We give an efficient algorithm, and a near-matching lower bound, for approximate reconstruction of a random source string $\mathsf{x} \in \{0,1\}^n$ from few traces. Our main algorithmic result is a polynomial-time algorithm with the following property: for any deletion rate $0 < \delta < 1$ (which may depend on $n$), for almost every source string $\mathsf{x} \in \{0,1\}^n$, given any number $M \leq \Theta(1/\delta)$ of traces from $\mathrm{Del}_\delta(\mathsf{x})$, the algorithm constructs a hypothesis string $\widehat{\mathsf{x}}$ that has edit distance at most $n \cdot (\delta M)^{\Omega(M)}$ from $\mathsf{x}$. We also prove a near-matching information-theoretic lower bound showing that given $M \leq \Theta(1/\delta)$ traces from $\mathrm{Del}_\delta(\mathsf{x})$ for a random $n$-bit string $\mathsf{x}$, the smallest possible expected edit distance that any algorithm can achieve, regardless of its running time, is $n \cdot (\delta M)^{O(M)}$.
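
To make the setup concrete, the following is a minimal sketch of the $\delta$-deletion channel and of the edit distance used to measure reconstruction error. It is not the algorithm from the paper. The function names `deletion_channel`, `sample_traces`, and `edit_distance` are illustrative, and the sketch assumes that "edit distance" means the standard Levenshtein distance (insertions, deletions, substitutions).

```python
import random


def deletion_channel(x: str, delta: float, rng: random.Random) -> str:
    """Return one trace of x: each bit is deleted independently with
    probability delta, and the surviving bits are concatenated."""
    return "".join(bit for bit in x if rng.random() >= delta)


def sample_traces(x: str, delta: float, m: int, seed: int = 0) -> list[str]:
    """Draw m independent traces of x from the delta-deletion channel."""
    rng = random.Random(seed)
    return [deletion_channel(x, delta, rng) for _ in range(m)]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed metric), computed
    with a row-by-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    n, delta, m = 64, 0.1, 5  # illustrative parameters only
    source_rng = random.Random(1)
    x = "".join(source_rng.choice("01") for _ in range(n))
    for trace in sample_traces(x, delta, m):
        print(len(trace), edit_distance(x, trace))
```

Because a trace is a subsequence of $\mathsf{x}$, its edit distance to $\mathsf{x}$ is exactly the number of deleted bits, so a single trace used as the hypothesis is off by about $\delta n$ in expectation. The results above concern how much better than this one can do by combining $M$ traces.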

Updated: 2021-07-27