Connectivity problems on heterogeneous graphs.
Algorithms for Molecular Biology (IF 1), Pub Date: 2019-03-08, DOI: 10.1186/s13015-019-0141-z
Jimmy Wu 1, Alex Khodaverdian 2, Benjamin Weitz 2, Nir Yosef 2

BACKGROUND Network connectivity problems are abundant in computational biology research, where graphs are used to represent a range of phenomena, from physical interactions between molecules to more abstract relationships such as gene co-expression. One common challenge in studying biological networks is the need to extract small, meaningful subgraphs from large databases of potential interactions. A useful abstraction for this task has been the family of Steiner Network problems: given a reference "database" graph, find a parsimonious subgraph that satisfies a given set of connectivity demands. While this formulation has proved useful in a number of instances, the next challenge is to account for the fact that the reference graph may not be static. This can happen, for instance, when studying protein measurements in single cells or at different time points, where different subsets of conditions can have different protein milieus.

RESULTS AND DISCUSSION We introduce the condition Steiner Network problem, in which we simultaneously consider a set of distinct biological conditions. Each condition is associated with a set of connectivity demands, as well as a set of edges that are assumed to be present in that condition. The goal is to find a minimal subgraph that satisfies every demand through a path that is present in the respective condition. We show that introducing multiple conditions as an additional factor makes this problem much harder to approximate. Specifically, we prove that for C conditions, this new problem is NP-hard to approximate to a factor of C - ϵ, for every C ≥ 2 and ϵ > 0, and that this bound is tight. Moving beyond the worst case, we explore a special set of instances in which the reference graph grows monotonically between conditions, and show that this special case admits substantially improved approximation algorithms. We also develop an integer linear programming solver for the general problem and demonstrate its ability to reach optimality on instances from the human protein interaction network.

CONCLUSION Our results demonstrate that, in contrast to most connectivity problems studied in computational biology, accounting for a multiplicity of biological conditions adds considerable complexity, which we propose to address with a new solver. Importantly, our results extend to several network connectivity problems that are commonly used in computational biology, such as Prize-Collecting Steiner Tree, and provide insight into the theoretical guarantees for their applications in a multiple-condition setting.
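
To make the problem statement concrete, the following is a minimal Python sketch of the condition Steiner Network objective on a toy instance. It is not the authors' solver: it uses exhaustive search rather than integer linear programming, and it assumes undirected, unweighted edges, so "minimal" here means fewest edges. All names (`connected`, `satisfies`, `smallest_subgraph`) and the toy graph are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def connected(edges, source, target):
    """Return True if target is reachable from source over undirected edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def satisfies(subgraph, condition_edges, demands):
    """Check each demand (source, target, condition) using only the chosen
    edges that are also present in that demand's condition."""
    return all(
        connected(subgraph & condition_edges[c], s, t) for s, t, c in demands
    )


def smallest_subgraph(all_edges, condition_edges, demands):
    """Exhaustive search for a minimum-size edge set meeting all demands.

    Exponential in the number of edges, so only usable on toy inputs.
    """
    edges = sorted(all_edges)
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            chosen = frozenset(subset)
            if satisfies(chosen, condition_edges, demands):
                return chosen
    return None  # some demand cannot be met even with every edge


# Toy instance (assumed for illustration): two conditions, one demand each.
condition_edges = {
    1: frozenset({("a", "b"), ("b", "d"), ("a", "d")}),
    2: frozenset({("a", "b"), ("b", "d"), ("a", "c"), ("c", "d")}),
}
all_edges = condition_edges[1] | condition_edges[2]
demands = [("a", "d", 1), ("a", "d", 2)]

result = smallest_subgraph(all_edges, condition_edges, demands)
print(sorted(result))
```

In this toy instance the direct edge `("a", "d")` is the cheapest way to meet the demand in condition 1, but it is absent in condition 2, so the smallest subgraph serving both conditions is the two-edge path through `b`. This illustrates why solving each condition separately and taking the union of the solutions can be suboptimal.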

Updated: 2019-11-01