A fast and accurate enumeration-based algorithm for haplotyping a triploid individual.
Algorithms for Molecular Biology (IF 1.5), Pub Date: 2018-06-01, DOI: 10.1186/s13015-018-0129-0
Jingli Wu 1 , Qian Zhang 2

BACKGROUND Haplotype assembly, the reconstruction of haplotypes from sequence data, is one of the major computational problems in bioinformatics. Most current methods for haplotype assembly are designed for diploid individuals. In recent years, genomes with more than two sets of homologous chromosomes have attracted many research groups interested in the genomics of disease, phylogenetics, botany and evolution. However, methods for reconstructing polyploid haplotypes are still lacking.

RESULTS In this work, the minimum error correction with genotype information (MEC/GI) model, an important combinatorial model for haplotyping a single individual, is used to study the haplotype reconstruction problem for a triploid individual. A fast and accurate enumeration-based algorithm, enumeration haplotyping triploid with least difference (EHTLD), is proposed for solving the MEC/GI model. The EHTLD algorithm reconstructs the three haplotypes site by site, following the order of the single nucleotide polymorphism (SNP) loci along them. When reconstructing a given SNP site, the algorithm enumerates three kinds of SNP values according to the genotype value of that site, and fills the site with the one that yields the minimum difference between the reconstructed haplotypes and the sequenced fragments covering that site.

CONCLUSION Extensive experimental comparisons were performed between the EHTLD algorithm and the well-known HapCompass and HapTree algorithms. In these experiments, the EHTLD algorithm reconstructed more accurate haplotypes than HapCompass and HapTree.
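The site-by-site enumeration described in the abstract can be illustrated with a small sketch. This is not the authors' EHTLD implementation; it is a simplified greedy procedure written under several assumptions that the abstract does not state: SNPs are biallelic (alleles 0 and 1), the genotype of a site is the number of 1-alleles among the three haplotypes, a fragment is a mapping from site index to observed allele, and the "difference" is the sum, over fragments covering the current site, of the Hamming distance to the closest haplotype on the sites reconstructed so far. The names `candidate_columns`, `mismatches` and `greedy_triploid_phase` are hypothetical.

```python
from itertools import permutations


def candidate_columns(genotype):
    """Distinct ways to place `genotype` copies of allele 1 on three haplotypes.

    A heterozygous site (genotype 1 or 2) yields three candidates;
    a homozygous site (genotype 0 or 3) yields exactly one.
    """
    base = (1,) * genotype + (0,) * (3 - genotype)
    return sorted(set(permutations(base)))


def mismatches(fragment, haplotype, upto):
    """Hamming distance between a fragment and a haplotype on sites <= upto."""
    return sum(
        1
        for site, allele in fragment.items()
        if site <= upto and haplotype[site] != allele
    )


def greedy_triploid_phase(fragments, genotypes):
    """Fill the three haplotypes one SNP site at a time (assumed cost model)."""
    haps = [[], [], []]
    for j, genotype in enumerate(genotypes):
        covering = [f for f in fragments if j in f]
        best_col, best_cost = None, None
        for col in candidate_columns(genotype):
            # Tentatively extend each haplotype with this candidate column.
            for hap, allele in zip(haps, col):
                hap.append(allele)
            # Each covering fragment is charged its distance to the nearest haplotype.
            cost = sum(min(mismatches(f, hap, j) for hap in haps) for f in covering)
            for hap in haps:
                hap.pop()
            # Strict comparison: ties keep the first candidate in sorted order.
            if best_cost is None or cost < best_cost:
                best_col, best_cost = col, cost
        for hap, allele in zip(haps, best_col):
            hap.append(allele)
    return haps


if __name__ == "__main__":
    # Toy input: three error-free fragments, one per underlying haplotype.
    genotypes = [1, 1, 2]
    fragments = [
        {0: 1, 1: 0, 2: 1},
        {0: 0, 1: 1, 2: 1},
        {0: 0, 1: 0, 2: 0},
    ]
    for hap in greedy_triploid_phase(fragments, genotypes):
        print(hap)
```

On this toy input the sketch recovers the three underlying haplotypes, up to their order. Because every candidate column is built from the genotype value, the result always agrees with the genotype at each site, which is the constraint the "GI" part of the MEC/GI model imposes. Ties between equally good candidates are broken arbitrarily here; the published algorithm may resolve them differently.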

Updated: 2019-11-01