Hardness of Approximation of (Multi-)LCS over Small Alphabet,arXiv - CS - Computational Complexity


Hardness of Approximation of (Multi-)LCS over Small Alphabet
arXiv - CS - Computational Complexity. Pub Date: 2020-06-24, DOI: arxiv-2006.13449
Amey Bhangale, Diptarka Chakraborty, Rajendra Kumar

Finding a longest common subsequence (LCS) is a fundamental problem in computer science, with applications in computational biology, text processing, information retrieval, and data compression. It is well known that the decision version of the problem of finding the length of an LCS of an arbitrary number of input sequences (which we refer to as the Multi-LCS problem) is NP-complete. Jiang and Li [SICOMP'95] showed that if Max-Clique is hard to approximate within a factor of $s$, then Multi-LCS is also hard to approximate within a factor of $\Theta(s)$. By the NP-hardness of approximating Max-Clique due to Zuckerman [ToC'07], for any constant $\delta>0$, the length of an LCS of an arbitrary number of input sequences of length $n$ each cannot be approximated within an $n^{1-\delta}$-factor in polynomial time unless $\mathsf{P}=\mathsf{NP}$. However, the reduction of Jiang and Li assumes the alphabet size to be $\Omega(n)$. So far no hardness result is known for approximating Multi-LCS over a sublinear-sized alphabet. On the other hand, it is easy to get a $1/|\Sigma|$-factor approximation for strings over an alphabet $\Sigma$. In this paper, we make significant progress towards proving hardness of approximation over small alphabets by showing a polynomial-time reduction from the well-studied densest $k$-subgraph problem with perfect completeness to approximating Multi-LCS over an alphabet of size $\mathrm{poly}(n/k)$. As a consequence, from the known hardness result for the densest $k$-subgraph problem (e.g. [Manurangsi, STOC'17]), we get that no polynomial-time algorithm can give an $n^{-o(1)}$-factor approximation of Multi-LCS over an alphabet of size $n^{o(1)}$, unless the Exponential Time Hypothesis is false.
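
The abstract does not spell out the easy $1/|\Sigma|$-factor approximation. One standard way to obtain it, given here as a minimal sketch and not necessarily the argument the authors have in mind, is to return the best common subsequence that repeats a single symbol. Any common subsequence of length $L$ contains some symbol at least $L/|\Sigma|$ times, and that symbol must then occur at least $L/|\Sigma|$ times in every input string. The function name `unary_lcs_approx` is hypothetical.

```python
from collections import Counter


def unary_lcs_approx(strings):
    """Return a common subsequence consisting of one repeated symbol.

    Illustrative sketch (an assumption, not taken from the paper): the
    result has length at least LCS / |alphabet|.
    """
    if not strings:
        return ""
    # Per-string symbol counts; a missing symbol counts as 0.
    counts = [Counter(s) for s in strings]
    best_symbol, best_len = "", 0
    # A symbol common to all strings must appear in the first one.
    for symbol in counts[0]:
        # The symbol can be repeated as often as its rarest occurrence.
        common = min(c[symbol] for c in counts)
        if common > best_len:
            best_symbol, best_len = symbol, common
    return best_symbol * best_len


if __name__ == "__main__":
    print(unary_lcs_approx(["abab", "bbaa", "abba"]))
```

The sketch runs in time linear in the total input length, which is why the interesting question is whether anything substantially better than this trivial guarantee is achievable over small alphabets.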


Updated: 2020-06-25
