Dsuite ‐ Fast D‐statistics and related admixture evidence from VCF files
Molecular Ecology Resources ( IF 5.5 ) Pub Date : 2020-10-04 , DOI: 10.1111/1755-0998.13265
Milan Malinsky 1 , Michael Matschiner 2, 3 , Hannes Svardal 4, 5

Patterson's D, also known as the ABBA‐BABA statistic, and related statistics such as the f4‐ratio, are commonly used to assess evidence of gene flow between populations or closely related species. Currently available implementations often require custom file formats, implement only small subsets of the available statistics, and are too computationally inefficient to evaluate all gene flow hypotheses across data sets with many populations or species. Here, we present a new software package, Dsuite, an efficient implementation allowing genome‐scale calculations of the D and f4‐ratio statistics across all combinations of tens or hundreds of populations or species directly from a variant call format (VCF) file. Our program also implements statistics suited for application to genomic windows, providing evidence of whether introgression is confined to specific loci, and it can aid in interpreting a system of f4‐ratio results with the use of the “f‐branch” method. Dsuite is available at https://github.com/millanek/Dsuite, is straightforward to use, is substantially more computationally efficient than comparable programs, and provides a convenient suite of tools and statistics, including some not previously available in any software package. Thus, Dsuite facilitates the assessment of evidence for gene flow, especially across larger genomic data sets.
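
To make the ABBA‐BABA statistic mentioned above concrete, the following is a minimal Python sketch of Patterson's D computed from per-site derived-allele frequencies in a trio of populations (P1, P2, P3) plus an outgroup. It uses the standard frequency-based definition of the ABBA and BABA site patterns and is not taken from the Dsuite source code; the function name `patterson_d` and the tuple layout of the input are assumptions made only for this illustration, and it omits the significance testing that a real analysis needs.

```python
from typing import Iterable, Tuple

# Hypothetical input layout (an assumption for this sketch): one tuple per
# biallelic site holding the derived-allele frequency in P1, P2, P3 and the
# outgroup, in that order.
Site = Tuple[float, float, float, float]


def patterson_d(sites: Iterable[Site]) -> float:
    """Return Patterson's D = (ABBA - BABA) / (ABBA + BABA).

    A positive value indicates an excess of allele sharing between P2 and P3,
    a negative value an excess between P1 and P3.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        # Expected weight of the ABBA pattern: derived in P2 and P3 only.
        abba += (1.0 - p1) * p2 * p3 * (1.0 - p_out)
        # Expected weight of the BABA pattern: derived in P1 and P3 only.
        baba += p1 * (1.0 - p2) * p3 * (1.0 - p_out)
    total = abba + baba
    if total == 0.0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / total


if __name__ == "__main__":
    # Toy data: two pure ABBA sites and one pure BABA site.
    toy_sites = [
        (0.0, 1.0, 1.0, 0.0),
        (1.0, 0.0, 1.0, 0.0),
        (0.0, 1.0, 1.0, 0.0),
    ]
    print(patterson_d(toy_sites))
```

With the toy data, the ABBA total is 2 and the BABA total is 1, so D is 1/3. Evaluating this for every trio in a data set with many populations is the kind of workload the abstract describes as impractical for earlier tools.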

Updated: 2020-10-04