HiC-Hiker: a probabilistic model to determine contig orientation in chromosome-length scaffolds with Hi-C.
Bioinformatics (IF 5.8), Pub Date: 2020-05-05, DOI: 10.1093/bioinformatics/btaa288
Ryo Nakabayashi, Shinichi Morishita

De novo assembly of reference-quality genomes used to require enormously laborious tasks. In particular, building genome markers for ordering assembled contigs along chromosomes is extremely time-consuming, so such markers are available only for well-established model organisms. To resolve this issue, recent studies have demonstrated that Hi-C can be a powerful and cost-effective means of producing chromosome-length scaffolds for non-model species that lack genome marker resources, because the Hi-C contact frequency between a pair of loci can be a good estimator of their genomic distance, even when a large gap separates them. Indeed, state-of-the-art methods such as 3D-DNA are now widely used for locating contigs in chromosomes. However, reducing errors in contig orientation remains challenging because shorter contigs have fewer contacts with their neighboring contigs. These orientation errors lower the accuracy of gene prediction, read alignment, and synteny block estimation in comparative genomics.

Updated: 2020-07-03