prewas: data pre-processing for more informative bacterial GWAS.
Microbial Genomics ( IF 3.9 ) Pub Date : 2020-04-20 , DOI: 10.1099/mgen.0.000368
Katie Saund 1 , Zena Lapp 2 , Stephanie N Thiede 1 , Ali Pirani 1 , Evan S Snitkin 1, 3

While variant identification pipelines are becoming increasingly standardized, less attention has been paid to the pre-processing of variants prior to their use in bacterial genome-wide association studies (bGWAS). Three nuances of variant pre-processing that impact downstream identification of genetic associations include the separation of variants at multiallelic sites, separation of variants in overlapping genes, and referencing of variants relative to ancestral alleles. Here we demonstrate the importance of these variant pre-processing steps on diverse bacterial genomic datasets and present prewas, an R package, that standardizes the pre-processing of multiallelic sites, overlapping genes, and reference alleles before bGWAS. This package facilitates improved reproducibility and interpretability of bGWAS results. prewas enables users to extract maximal information from bGWAS by implementing multi-line representation for multiallelic sites and variants in overlapping genes. prewas outputs a binary SNP matrix that can be used for SNP-based bGWAS and will prevent the masking of minor alleles during bGWAS analysis. The optional binary gene matrix output can be used for gene-based bGWAS, which will enable users to maximize the power and evolutionary interpretability of their bGWAS studies. prewas is available for download from GitHub.
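
To make the multi-line representation of multiallelic sites concrete, here is a minimal sketch in Python. It is not the prewas API (prewas is an R package). The function name `split_multiallelic`, the site label `pos100`, and the toy alleles are all hypothetical and chosen only for illustration. Each non-reference allele observed at a site gets its own binary row, so a less frequent alternate allele is not merged with a more common one.

```python
from typing import Dict, List


def split_multiallelic(site_id: str, alleles: List[str], reference: str) -> Dict[str, List[int]]:
    """Return one binary row per non-reference allele observed at a site.

    alleles holds one allele per sample; reference is the allele treated as
    the baseline (for example, an inferred ancestral allele).
    """
    rows: Dict[str, List[int]] = {}
    # Sort the alternate alleles so the row order is deterministic.
    for alt in sorted(set(alleles) - {reference}):
        # 1 marks samples carrying this specific alternate allele, 0 otherwise.
        rows[f"{site_id}_{alt}"] = [1 if a == alt else 0 for a in alleles]
    return rows


# Hypothetical toy data: one triallelic site across five samples.
alleles = ["A", "T", "A", "G", "T"]
binary_rows = split_multiallelic("pos100", alleles, reference="A")
for name, row in binary_rows.items():
    print(name, row)
```

A single-line encoding that records only "reference versus any alternate" would give the sample carrying G the same value as the samples carrying T. Keeping the two rows separate lets each allele be tested for association on its own.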

Updated: 2020-04-20