Implications of non-uniqueness in phylogenetic deconvolution of bulk DNA samples of tumors.
Algorithms for Molecular Biology (IF 1), Pub Date: 2019-09-03, DOI: 10.1186/s13015-019-0155-6
Yuanyuan Qi, Dikshant Pradhan, Mohammed El-Kebir

BACKGROUND: Tumors exhibit extensive intra-tumor heterogeneity, the presence of groups of cellular populations with distinct sets of somatic mutations. This heterogeneity is the result of an evolutionary process, described by a phylogenetic tree. In addition to enabling clinicians to devise patient-specific treatment plans, phylogenetic trees of tumors enable researchers to decipher the mechanisms of tumorigenesis and metastasis. However, the problem of reconstructing a phylogenetic tree T given bulk sequencing data from a tumor is more complicated than the classic phylogeny inference problem. Rather than observing the leaves of T directly, we are given mutation frequencies that are the result of mixtures of the leaves of T. The majority of current tumor phylogeny inference methods employ the perfect phylogeny evolutionary model. The underlying Perfect Phylogeny Mixture (PPM) combinatorial problem typically has multiple solutions.

RESULTS: We prove that determining the exact number of solutions to the PPM problem is #P-complete and hard to approximate within a constant factor. Moreover, we show that sampling solutions uniformly at random is hard as well. On the positive side, we provide a polynomial-time computable upper bound on the number of solutions and introduce a simple rejection-sampling based scheme that works well for small instances. Using simulated and real data, we identify factors that contribute to and counteract non-uniqueness of solutions. In addition, we study the sampling performance of current methods, identifying significant biases.

CONCLUSIONS: Awareness of non-uniqueness of solutions to the PPM problem is key to drawing accurate conclusions in downstream analyses based on tumor phylogenies. This work provides the theoretical foundations for non-uniqueness of solutions in tumor phylogeny inference from bulk DNA samples.
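
To make the non-uniqueness described in the abstract concrete, the following minimal Python sketch counts candidate trees for a toy frequency matrix by brute force. It is not the authors' implementation, and the abstract does not spell out the formulation. The sketch assumes the commonly used one: rows of `F` are bulk samples, columns are mutations, mutation `0` is the root, and a tree is a solution when, in every sample, each mutation's frequency is at least the sum of its children's frequencies (the "sum condition"). The function names and the toy data are hypothetical.

```python
from itertools import product

EPS = 1e-9  # tolerance for floating-point comparisons


def reaches_root(parent, root):
    """Return True if every vertex reaches the root by following parent links."""
    for start in parent:
        v, steps = start, 0
        while v != root:
            v = parent[v]
            steps += 1
            if steps > len(parent):  # more steps than vertices means a cycle
                return False
    return True


def satisfies_sum_condition(F, parent):
    """Assumed PPM constraint: in each sample, children sum to at most the parent."""
    n = len(F[0])
    for row in F:
        for u in range(n):
            child_total = sum(row[v] for v, p in parent.items() if p == u)
            if child_total > row[u] + EPS:
                return False
    return True


def enumerate_ppm_trees(F, root=0):
    """Brute-force all rooted trees over the mutations that satisfy the sum condition.

    Exponential in the number of mutations, so only suitable for tiny instances.
    """
    n = len(F[0])
    others = [v for v in range(n) if v != root]
    # A mutation u can be the parent of v only if u is at least as frequent
    # as v in every sample.
    candidates = {
        v: [u for u in range(n)
            if u != v and all(row[u] >= row[v] - EPS for row in F)]
        for v in others
    }
    solutions = []
    for choice in product(*(candidates[v] for v in others)):
        parent = dict(zip(others, choice))
        if reaches_root(parent, root) and satisfies_sum_condition(F, parent):
            solutions.append(parent)
    return solutions


# Toy data (made up for illustration): 4 mutations, mutation 0 is the root.
one_sample = [[1.0, 0.6, 0.3, 0.2]]
two_samples = [[1.0, 0.6, 0.3, 0.2],
               [1.0, 0.5, 0.2, 0.4]]

print(len(enumerate_ppm_trees(one_sample)))
print(len(enumerate_ppm_trees(two_samples)))
```

On this toy input, the single-sample matrix admits five trees, and adding the second sample leaves two. The effect is specific to these made-up frequencies and is not a result taken from the paper. A rejection sampler in the spirit of the one mentioned in the abstract could draw candidate trees at random and discard those failing `satisfies_sum_condition`, but that step is not shown here.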

Updated: 2019-11-01