Full-length de novo viral quasispecies assembly through variation graph construction.
Bioinformatics (IF 5.8), Pub Date: 2019-12-15, DOI: 10.1093/bioinformatics/btz443
Jasmijn A Baaijens 1 , Bastiaan Van der Roest 2 , Johannes Köster 3, 4 , Leen Stougie 1, 5, 6 , Alexander Schönhuth 1, 6, 7

MOTIVATION: Viruses populate their hosts as a viral quasispecies: a collection of genetically related mutant strains. Viral quasispecies assembly is the reconstruction of strain-specific haplotypes from read data, and predicting their relative abundances within the mix of strains is an important step for various treatment-related reasons. Reference-genome-independent ('de novo') approaches have yielded benefits over reference-guided approaches, because reference-induced biases can become overwhelming when dealing with divergent strains. Although very accurate, extant de novo methods yield only rather short contigs. The remaining challenge is to reconstruct full-length haplotypes, together with their abundances, from such contigs.

RESULTS: We present Virus-VG, a de novo approach to viral haplotype reconstruction from preassembled contigs. Our method constructs a variation graph from the short input contigs without making use of a reference genome. Then, to obtain paths through the variation graph that reflect the original haplotypes, we solve a minimization problem that yields a selection of maximal-length paths that is optimal in terms of being compatible with the read coverages computed for the nodes of the variation graph. We output the resulting selection of maximal-length paths as the haplotypes, together with their abundances. Benchmarking experiments on challenging simulated and real datasets show significant improvements in assembly contiguity compared to the input contigs, while preserving low error rates compared to the state-of-the-art viral quasispecies assemblers.

AVAILABILITY AND IMPLEMENTATION: Virus-VG is freely available at https://bitbucket.org/jbaaijens/virus-vg.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
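The path-selection step can be illustrated with a toy example. The sketch below is not the Virus-VG implementation. The abstract does not state the exact objective or solver, so this assumes a squared-error objective between each node's read coverage and the summed abundances of the candidate paths passing through it, minimized under a non-negativity constraint by cyclic coordinate descent. The function name `estimate_abundances`, the toy graph, and the coverage values are all hypothetical.

```python
def estimate_abundances(paths, coverage, sweeps=2000):
    """Toy abundance estimation (assumed objective, not the paper's method).

    Minimizes sum_v (coverage[v] - sum of x[j] over paths j visiting v)^2
    subject to x[j] >= 0, using cyclic coordinate descent.
    Assumes each path visits a node at most once.
    """
    node_sets = [set(p) for p in paths]
    x = [0.0] * len(paths)
    # residual[v] = coverage[v] minus the abundance already explained at node v
    residual = [float(c) for c in coverage]
    for _ in range(sweeps):
        for j, nodes in enumerate(node_sets):
            if not nodes:
                continue
            # Exact minimizer for x[j] with all other abundances held fixed,
            # clipped at zero to keep abundances non-negative.
            step = sum(residual[v] for v in nodes) / len(nodes)
            new_xj = max(0.0, x[j] + step)
            delta = new_xj - x[j]
            for v in nodes:
                residual[v] -= delta
            x[j] = new_xj
    return x


# Hypothetical variation graph: nodes 0, 3, 6 are shared; (1|2) and (4|5) are variant sites.
candidate_paths = [
    [0, 1, 3, 4, 6],  # strain A
    [0, 1, 3, 5, 6],  # spurious recombinant candidate
    [0, 2, 3, 5, 6],  # strain B
]
node_coverage = [100, 70, 30, 100, 70, 30, 100]

abundances = estimate_abundances(candidate_paths, node_coverage)
total = sum(abundances)
for path, a in zip(candidate_paths, abundances):
    print(path, round(a, 2), round(a / total, 3))
```

On this toy input the estimates approach 70, 0 and 30, so the spurious candidate receives essentially zero abundance and would be dropped from the reported haplotypes. Note that if all four combinations of the two variant sites were offered as candidates, node coverage alone would not determine a unique solution, which is one reason the choice of candidate paths matters.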

Updated: 2020-01-13