Parameter, noise, and tree topology effects in tumor phylogeny inference.
BMC Medical Genomics (IF 2.7), Pub Date: 2019-12-23, DOI: 10.1186/s12920-019-0626-0
Kiran Tomlinson, Layla Oesper

BACKGROUND: Accurate inference of the evolutionary history of a tumor has important implications for understanding and potentially treating the disease. While a number of methods have been proposed to reconstruct the evolutionary history of a tumor from DNA sequencing data, it is not clear how aspects of the sequencing data and of the tumor itself affect these reconstructions.

METHODS: We investigate when and how well these histories can be reconstructed from multi-sample bulk sequencing data when considering only single nucleotide variants (SNVs). Specifically, we examine the space of all possible tumor phylogenies under the infinite sites assumption (ISA) using several approaches for enumerating phylogenies consistent with the sequencing data.

RESULTS: On noisy simulated data, we find that the ISA is often violated and that low coverage and high noise make it more difficult to identify phylogenies. Additionally, we find that evolutionary trees with branching topologies are easier to reconstruct accurately. We also apply our reconstruction methods to both chronic lymphocytic leukemia and clear cell renal cell carcinoma datasets and confirm that ISA violations are common in practice, especially in lower-coverage sequencing data. Nonetheless, we show that an ISA-based approach can be relaxed to produce high-quality phylogenies.

CONCLUSIONS: Consideration of practical aspects of sequencing data, such as coverage, and of the model of tumor evolution (branching, linear, etc.) is essential to using the output of tumor phylogeny inference methods effectively. Additionally, these factors should be considered in the development of new inference methods.
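To make the idea of "enumerating phylogenies consistent with the sequencing data" concrete, here is a minimal brute-force sketch in Python. It is not the authors' method. It assumes noise-free per-sample frequencies for each mutation cluster and uses the standard ISA consequence that, in every sample, a clone's frequency must be at least the sum of its children's frequencies. The function names (`satisfies_sum_condition`, `is_tree`, `enumerate_consistent_trees`), the `tol` parameter, and the toy frequencies are all hypothetical.

```python
from itertools import product


def satisfies_sum_condition(parent, freqs, tol=1e-9):
    """Check that, in every sample, each node's frequency is at least
    the summed frequency of its children (an ISA consequence)."""
    n_samples = len(next(iter(freqs.values())))
    children = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)
    for v, kids in children.items():
        for s in range(n_samples):
            if sum(freqs[k][s] for k in kids) > freqs[v][s] + tol:
                return False
    return True


def is_tree(parent, root):
    """Check that every node reaches the root without entering a cycle."""
    for start in parent:
        v, seen = start, set()
        while v != root:
            if v in seen:
                return False
            seen.add(v)
            v = parent[v]
    return True


def enumerate_consistent_trees(freqs, root, tol=1e-9):
    """Brute force: try every parent assignment and keep the rooted trees
    that satisfy the sum condition. Only feasible for a few clusters."""
    nodes = [v for v in freqs if v != root]
    consistent = []
    for choice in product(list(freqs), repeat=len(nodes)):
        parent = {root: None}
        parent.update(zip(nodes, choice))
        if is_tree(parent, root) and satisfies_sum_condition(parent, freqs, tol):
            consistent.append(parent)
    return consistent


# Toy input (made up): three mutation clusters observed in two samples,
# plus a germline root that is present in every cell.
freqs = {
    "germline": [1.0, 1.0],
    "A": [0.8, 0.6],
    "B": [0.5, 0.1],
    "C": [0.15, 0.35],
}

for tree in enumerate_consistent_trees(freqs, "germline"):
    print(tree)
```

On this toy input more than one tree survives the check, which illustrates why the space of consistent phylogenies, rather than a single tree, is the object of study. With noisy frequencies the strict check can reject every tree; loosening `tol` is one simple, assumed way to mimic relaxing the ISA-based constraint.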

Updated: 2019-12-23