SambaR: An R package for fast, easy and reproducible population‐genetic analyses of biallelic SNP data sets
Molecular Ecology Resources (IF 5.5), Pub Date: 2021-01-27, DOI: 10.1111/1755-0998.13339
Menno J de Jong 1, 2 , Joost F de Jong 3 , A Rus Hoelzel 1 , Axel Janke 2, 4, 5

SNP data sets can be used to infer a wealth of information about natural populations, including their structure, their genetic diversity, and the presence of loci under selection. However, SNP data analysis can be a time-consuming and challenging process, not least because many different software packages are currently needed to execute and depict the wide variety of mainstream population-genetic analyses. Here, we present SambaR, an integrative and user-friendly R package which automates and simplifies quality control and population-genetic analyses of biallelic SNP data sets. SambaR allows users to perform mainstream population-genetic analyses and to generate a wide variety of ready-to-publish graphs with a minimal number of commands (fewer than 10). These wrapper commands call functions of existing packages (including adegenet, ape, LEA, poppr, pcadapt and StAMPP) as well as new tools uniquely implemented in SambaR. We tested SambaR on publicly available online SNP data sets and found that it can process data sets of over 100,000 SNPs and hundreds of individuals within hours, given sufficient computing power. Newly developed tools implemented in SambaR facilitate optimization of filter settings, support objective interpretation of ordination analyses, enhance comparability of diversity estimates from reduced representation library SNP data sets, and generate reduced SNP panels and structure-like plots with Bayesian population assignment probabilities. SambaR facilitates rapid population-genetic analyses of biallelic SNP data sets by removing three major time sinks: file handling, software learning, and data plotting. In addition, SambaR provides a convenient platform for SNP data storage and management, as well as several new utilities, including guidance in setting appropriate data filters. The SambaR source script, manual and example data set are distributed through GitHub: https://github.com/mennodejong1986/SambaR.
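
To make the idea of quality-control filtering of biallelic SNP data more concrete, the following is a minimal Python sketch of the kind of filter such a pipeline might apply. It is not SambaR code and does not reproduce SambaR's functions or defaults: the function names (filter_snps, missing_fraction, minor_allele_count), the genotype encoding (0, 1 or 2 copies of the alternate allele, None for a missing call) and the threshold values are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

# Assumed encoding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def missing_fraction(calls: List[Genotype]) -> float:
    """Fraction of missing calls in a list of genotypes (1.0 if the list is empty)."""
    if not calls:
        return 1.0
    return sum(g is None for g in calls) / len(calls)


def minor_allele_count(calls: List[Genotype]) -> int:
    """Copies of the rarer allele among the non-missing diploid calls."""
    observed = [g for g in calls if g is not None]
    alt = sum(observed)
    ref = 2 * len(observed) - alt
    return min(alt, ref)


def filter_snps(
    genotypes: List[List[Genotype]],
    max_ind_miss: float = 0.25,  # illustrative threshold, not a SambaR default
    max_snp_miss: float = 0.1,   # illustrative threshold, not a SambaR default
    min_mac: int = 2,            # illustrative threshold, not a SambaR default
) -> Tuple[List[int], List[int]]:
    """Return (indices of retained individuals, indices of retained SNPs).

    Rows of `genotypes` are individuals and columns are SNPs.
    """
    # Step 1: drop individuals with too many missing calls.
    kept_inds = [
        i for i, row in enumerate(genotypes)
        if missing_fraction(row) <= max_ind_miss
    ]

    # Step 2: among retained individuals, drop SNPs that are too often
    # missing or whose minor allele is too rare.
    n_snps = len(genotypes[0]) if genotypes else 0
    kept_snps = []
    for j in range(n_snps):
        column = [genotypes[i][j] for i in kept_inds]
        if (missing_fraction(column) <= max_snp_miss
                and minor_allele_count(column) >= min_mac):
            kept_snps.append(j)
    return kept_inds, kept_snps
```

Filtering individuals before SNPs is a design choice in this sketch: a few poorly genotyped samples would otherwise inflate per-SNP missingness and cause otherwise usable loci to be discarded.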

Updated: 2021-01-27