Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph
Genome Biology (IF 12.3), Pub Date: 2020-09-17, DOI: 10.1186/s13059-020-02160-7
Rui Martiniano 1, Erik Garrison 2,3, Eppie R Jones 4, Andrea Manica 4, Richard Durbin 1,2

Background: During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of aDNA means that aDNA molecules are short and frequently mutated by post-mortem chemical modifications. These features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less likely to be mapped than those containing reference alleles. Alternative approaches have been developed that replace the linear reference with a variation graph, which includes known alternative variants at each genetic locus. Here, we evaluate the use of the variation graph software vg to avoid reference bias for aDNA and compare it with existing methods.

Results: We use vg to align simulated and real aDNA samples to a variation graph containing 1000 Genomes Project variants and compare with the same data aligned with bwa to the human linear reference genome. Using vg leads to a balanced allelic representation at polymorphic sites, effectively removing reference bias, and to more sensitive variant detection in comparison with bwa, especially for insertions and deletions (indels). Alternative approaches that use relaxed bwa parameter settings or filter bwa alignments can also reduce bias but can have lower sensitivity than vg, particularly for indels.

Conclusions: Our findings demonstrate that aligning aDNA sequences to variation graphs effectively mitigates the impact of reference bias when analyzing aDNA, while retaining mapping sensitivity and allowing detection of variation, in particular indel variation, that was previously missed.
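The idea of "balanced allelic representation" can be made concrete with a small calculation. The sketch below is an illustration only, not the authors' pipeline: it assumes that per-site counts of reads supporting the reference and alternate alleles at heterozygous sites are already available, and it pools them into a single alternate-allele fraction. With no reference bias, that fraction should be close to 0.5. The names `SiteCounts` and `alt_allele_fraction` are hypothetical, and the counts are made-up toy values, not results from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SiteCounts:
    """Read support at one heterozygous site (hypothetical container)."""
    ref_reads: int  # reads carrying the reference allele
    alt_reads: int  # reads carrying the alternate allele


def alt_allele_fraction(sites: Iterable[SiteCounts]) -> float:
    """Pooled fraction of reads supporting the alternate allele.

    A value near 0.5 suggests balanced representation at heterozygous
    sites; a value below 0.5 suggests bias towards the reference allele.
    """
    sites = list(sites)
    ref_total = sum(s.ref_reads for s in sites)
    alt_total = sum(s.alt_reads for s in sites)
    total = ref_total + alt_total
    if total == 0:
        raise ValueError("no reads cover the supplied sites")
    return alt_total / total


# Toy counts, invented for illustration only.
linear_like = [SiteCounts(12, 8), SiteCounts(9, 7), SiteCounts(15, 11)]
graph_like = [SiteCounts(10, 10), SiteCounts(8, 8), SiteCounts(13, 13)]

print(f"linear-reference-like toy data: {alt_allele_fraction(linear_like):.3f}")
print(f"graph-like toy data:            {alt_allele_fraction(graph_like):.3f}")
```

Pooling counts across sites, rather than averaging per-site fractions, keeps low-coverage sites from dominating the estimate, which matters for aDNA where coverage is often low; this is a design choice of the sketch, not something the abstract specifies.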

Updated: 2020-09-17