GEN2VCF: a converter for human genome imputation output format to VCF format.
Genes & Genomics ( IF 2.1 ) Pub Date : 2020-08-16 , DOI: 10.1007/s13258-020-00982-0
Dong Mun Shin 1, 2 , Mi Yeong Hwang 1 , Bong-Jo Kim 1 , Keun Ho Ryu 2, 3 , Young Jin Kim 1

Background

In human genome-wide association studies, genotype imputation is an essential tool for improving association mapping power. When IMPUTE software is used for imputation, its output (GEN format) must be converted to variant call format (VCF) with imputed genotype dosage before association analysis. However, this conversion currently requires chaining multiple software packages in a pipeline and takes a large amount of processing time.

Objective

We developed GEN2VCF, a fast and convenient GEN format to VCF conversion tool with dosage support.

Methods

The performance of GEN2VCF was compared with that of BCFtools, QCTOOL, and Oncofunco. The test data set was a GEN-formatted file covering a 1 Mb region for 5000 samples. To assess performance across sample sizes, tests were run on 1000 to 5000 samples in steps of 1000. Runtime and memory usage were used as performance measures.

Results

GEN2VCF showed substantially better performance in both runtime and memory usage. Its runtime and memory usage were at least 1.4-fold and 7.4-fold lower, respectively, than those of the other methods.

Conclusions

GEN2VCF provides users with efficient conversion from GEN format to VCF, including the best-guess genotype, genotype posterior probabilities, and genotype dosage, and it can be combined flexibly with other software packages in a pipeline.
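To make the three output fields concrete, here is a minimal Python sketch of how one GEN record could be mapped to a VCF data line. It is not the GEN2VCF implementation. It assumes the five-column GEN layout (SNP ID, rsID, position, allele A, allele B) followed by three probabilities per sample, treats allele A as REF and allele B as ALT, and takes the chromosome as an argument because this GEN layout does not carry it. The function name `gen_line_to_vcf` and the `threshold` parameter for calling a best-guess genotype are hypothetical.

```python
# Illustrative sketch only; names and defaults are assumptions, not GEN2VCF's API.
GENOTYPES = ("0/0", "0/1", "1/1")  # AA, AB, BB with allele A as REF, B as ALT


def gen_line_to_vcf(line, chrom, threshold=0.9):
    """Convert one GEN record to a VCF data line with GT:GP:DS per sample."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("expected five leading columns")
    _snp_id, rsid, pos, allele_a, allele_b = fields[:5]
    probs = [float(x) for x in fields[5:]]
    if len(probs) % 3 != 0:
        raise ValueError("expected three probabilities per sample")

    calls = []
    for i in range(0, len(probs), 3):
        p_aa, p_ab, p_bb = triple = probs[i:i + 3]
        # Dosage is the expected count of allele B: P(AB) + 2 * P(BB).
        dosage = p_ab + 2.0 * p_bb
        # Best-guess genotype: most probable call, or missing below threshold.
        best = max(range(3), key=lambda k: triple[k])
        gt = GENOTYPES[best] if triple[best] >= threshold else "./."
        calls.append(f"{gt}:{p_aa:g},{p_ab:g},{p_bb:g}:{dosage:g}")

    # QUAL, FILTER, and INFO are left as missing (".") in this sketch.
    return "\t".join(
        [chrom, pos, rsid, allele_a, allele_b, ".", ".", ".", "GT:GP:DS", *calls]
    )


if __name__ == "__main__":
    record = "snp1 rs1 1000 A G 0.9 0.1 0 0 0.2 0.8"
    print(gen_line_to_vcf(record, chrom="1"))
```

A real VCF file would also need header lines that declare the GT, GP, and DS FORMAT fields and list the sample names, which the GEN file itself does not contain.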




Updated: 2020-08-16