An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses.
BMC Genomics ( IF 4.4 ) Pub Date : 2020-03-09 , DOI: 10.1186/s12864-020-6647-4
Daria W Wells 1 , Shuang Guo 1 , Wei Shao 2 , Michael J Bale 3 , John M Coffin 4 , Stephen H Hughes 3 , Xiaolin Wu 1

All retroviruses, including human immunodeficiency virus (HIV), must integrate a DNA copy of their genomes into the genome of the infected host cell to replicate. Although integrated retroviral DNA, known as a provirus, can be found at many sites in the host genome, integration is not random. The adaptation of linker-mediated PCR (LM-PCR) protocols for high-throughput integration site mapping, using randomly sheared genomic DNA and Illumina paired-end sequencing, has dramatically increased the number of mapped integration sites. Analysis of samples from human donors has shown that there is clonal expansion of HIV-infected cells and that clonal expansion makes an important contribution to HIV persistence. However, analysis of HIV integration sites in samples taken from patients requires extensive PCR amplification and high-throughput sequencing, which makes the methodology prone to specific artifacts. To address these artifacts, we use a comprehensive approach in which experimental procedures are linked to a bioinformatics analysis pipeline. Using this combined approach, we are able to reduce the number of PCR/sequencing artifacts that arise and identify the ones that remain. Our streamlined workflow combines random cleavage of the DNA in the samples, end repair, and linker ligation in a single step. We provide guidance on primer and linker design that reduces some of the common artifacts. We also discuss how to identify and remove some of the common artifacts, including the products of PCR mispriming and PCR recombination, that have appeared in some published studies. Our improved bioinformatics pipeline rapidly parses the sequencing data and identifies bona fide integration sites in clonally expanded cells, producing an Excel-formatted report that can be used for additional data processing.
We provide a detailed protocol that reduces the prevalence of artifacts that arise in the analysis of retroviral integration site data generated from in vivo samples and a bioinformatics pipeline that is able to remove the artifacts that remain.
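
The abstract does not spell out how clonally expanded cells are recognized in the data. One common way to use randomly sheared DNA for this purpose, sketched below as an illustration and not as the authors' pipeline, is to count the distinct shear breakpoints observed for each integration site: PCR duplicates of one fragment share a breakpoint, whereas fragments from different cells carrying the same provirus usually do not. The record layout (`MappedRead`), the function names, and the threshold of two breakpoints are all assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class MappedRead(NamedTuple):
    """Hypothetical record for one mapped read pair (not the paper's format)."""
    chrom: str       # host chromosome
    site: int        # host coordinate of the provirus/host junction
    strand: str      # orientation of the provirus, "+" or "-"
    breakpoint: int  # host coordinate of the sheared end joined to the linker


SiteKey = Tuple[str, int, str]


def count_breakpoints(reads: Iterable[MappedRead]) -> Dict[SiteKey, int]:
    """Count distinct shear breakpoints seen for each integration site."""
    breakpoints = defaultdict(set)
    for read in reads:
        # Reads with the same site AND the same breakpoint collapse into one
        # entry, so PCR duplicates do not inflate the count.
        breakpoints[(read.chrom, read.site, read.strand)].add(read.breakpoint)
    return {key: len(points) for key, points in breakpoints.items()}


def candidate_clones(
    reads: Iterable[MappedRead], min_breakpoints: int = 2
) -> Dict[SiteKey, int]:
    """Keep sites seen with at least `min_breakpoints` distinct breakpoints.

    The default threshold of 2 is an assumption chosen for illustration.
    """
    counts = count_breakpoints(reads)
    return {key: n for key, n in counts.items() if n >= min_breakpoints}


if __name__ == "__main__":
    example = [
        MappedRead("chr1", 100, "+", 250),
        MappedRead("chr1", 100, "+", 250),  # PCR duplicate of the read above
        MappedRead("chr1", 100, "+", 310),  # same site, different breakpoint
        MappedRead("chr2", 500, "-", 420),  # site seen with one breakpoint
    ]
    print(candidate_clones(example))
```

A real pipeline would also need to tolerate small mapping offsets at the junction and breakpoint and to filter mispriming and recombination products before counting, which this sketch does not attempt.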

Updated: 2020-03-09