mixIndependR: a R package for statistical independence testing of loci in database of multi-locus genotypes
BMC Bioinformatics (IF 3), Pub Date: 2021-01-06, DOI: 10.1186/s12859-020-03945-0
Bing Song, August E. Woerner, John Planz

Multi-locus genotype data are widely used in population genetics and disease studies. When the utility of multi-locus data is evaluated, the independence of markers is a common consideration in genomic assessments. Non-random associations are generally tested pairwise through linkage disequilibrium; however, the dependence within a panel may involve triplets, quartets, or larger sets of loci. Compatible, user-friendly software is therefore needed for testing and assessing global linkage disequilibrium in mixed genetic data. This study describes a software package for testing the mutual independence of mixed genetic datasets. Mutual independence is defined as the absence of non-random associations among all subsets of the tested panel. The new R package “mixIndependR” calculates basic genetic parameters such as allele frequency, genotype frequency, heterozygosity, Hardy–Weinberg equilibrium, and linkage disequilibrium (LD) by mutual independence from population data, regardless of marker type: single nucleotide polymorphisms, short tandem repeats, insertions and deletions, or any other genetic markers. A novel method of assessing the dependence of mixed genetic panels is developed in this study and implemented in the software package. Overall independence is tested by comparing the observed distributions of two common summary statistics (the number of heterozygous loci [K] and the number of shared alleles [X]) with their expected distributions under the assumption of mutual independence. The package “mixIndependR” is compatible with all categories of genetic markers and detects overall non-random associations. Compared with pairwise disequilibrium testing, the approach described herein tends to have higher power, especially when the number of markers is large. With this package, more versatile or more powerful genetic panels can be developed, such as mixed panels that combine different kinds of markers.
In population genetics, the package “mixIndependR”, as a more powerful method of detecting LD, makes it possible to learn more about population admixture, natural selection, genetic drift, and population demographics. Moreover, this new approach can optimize variant selection in disease studies and contribute to panel combination for treatments in multimorbidity. Application of this approach to real data is expected in the future and may substantially advance genetic technology. The R package mixIndependR is available on the Comprehensive R Archive Network (CRAN) at: https://cran.r-project.org/web/packages/mixIndependR/index.html
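
The idea of comparing the observed distribution of K with its expectation under mutual independence can be illustrated with a minimal Python sketch. This is not the mixIndependR API, and all function names below are hypothetical. It assumes that each locus is in Hardy–Weinberg proportions, so that the per-locus heterozygosity is 1 minus the sum of squared allele frequencies, and that under mutual independence K is a sum of independent Bernoulli variables (a Poisson binomial distribution). The chi-square-style discrepancy is used purely for illustration; the abstract does not state which test the package applies.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Expected heterozygosity 1 - sum(p_a^2) at one locus (assumes HWE).

    `genotypes` is a list of (allele, allele) tuples, one per individual.
    """
    alleles = [a for pair in genotypes for a in pair]
    counts = Counter(alleles)
    total = len(alleles)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def expected_k_distribution(het_probs):
    """P(K = k) for k = 0..L when the L loci are mutually independent.

    K is a sum of independent Bernoulli(h_i) variables, so the
    distribution is built by adding one locus at a time.
    """
    dist = [1.0]  # before any locus is added, K = 0 with probability 1
    for h in het_probs:
        new = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            new[k] += p * (1.0 - h)  # this locus is homozygous
            new[k + 1] += p * h      # this locus is heterozygous
        dist = new
    return dist


def observed_k_counts(individuals):
    """Number of individuals that are heterozygous at exactly k loci."""
    n_loci = len(individuals[0])
    counts = [0] * (n_loci + 1)
    for person in individuals:
        k = sum(1 for a, b in person if a != b)
        counts[k] += 1
    return counts


def chi_square_statistic(observed, expected_probs):
    """Sum of (observed - expected)^2 / expected over all values of K."""
    n = sum(observed)
    stat = 0.0
    for obs, p in zip(observed, expected_probs):
        exp = n * p
        if exp > 0:
            stat += (obs - exp) ** 2 / exp
    return stat


# Toy data (invented for illustration): 4 individuals typed at 2 loci,
# one SNP-like marker and one STR-like marker.
individuals = [
    [("A", "G"), (12, 12)],
    [("A", "A"), (12, 14)],
    [("G", "G"), (14, 15)],
    [("A", "G"), (12, 15)],
]
per_locus = list(zip(*individuals))  # regroup genotypes by locus
het = [expected_heterozygosity(list(locus)) for locus in per_locus]
expected = expected_k_distribution(het)
observed = observed_k_counts(individuals)
stat = chi_square_statistic(observed, expected)
```

Because the genotypes are compared only for equality of allele labels, the same code handles SNP, STR, and indel markers alike, which mirrors the marker-agnostic design described in the abstract. With samples as small as this toy dataset, the expected counts are far too low for a chi-square approximation to be reliable; a real analysis would need many more individuals.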

Updated: 2021-01-07