Prospects of telomere-to-telomere assembly in barley: Analysis of sequence gaps in the MorexV3 reference genome
Plant Biotechnology Journal (IF 10.1), Pub Date: 2022-03-25, DOI: 10.1111/pbi.13816
Pavla Navrátilová 1 , Helena Toegelová 1 , Zuzana Tulpová 1 , Yi-Tzu Kuo 2 , Nils Stein 2, 3 , Jaroslav Doležel 1 , Andreas Houben 2 , Hana Šimková 1 , Martin Mascher 2, 4

The first gapless, telomere-to-telomere (T2T) sequence assemblies of plant chromosomes were reported recently. However, sequence assemblies of most plant genomes remain fragmented. Only recent breakthroughs in accurate long-read sequencing have made it possible to achieve highly contiguous sequence assemblies with a few tens of contigs per chromosome, a number small enough to allow for a systematic inquiry into the causes of the remaining sequence gaps and the approaches and resources needed to close them. Here, we analyse sequence gaps in the current reference genome sequence of barley cv. Morex (MorexV3). Optical map and raw sequence data, complemented by ChIP-seq data for centromeric histone variant CENH3, were used to estimate the abundance of centromeric, ribosomal DNA, and subtelomeric repeats in the barley genome. These estimates were compared with copy numbers in the MorexV3 pseudomolecule sequence. We found that almost all centromeric sequences and 45S ribosomal DNA repeat arrays were absent from the MorexV3 pseudomolecules and that the majority of sequence gaps can be attributed to assembly breakdown in long stretches of satellite repeats. However, missing sequences cannot fully account for the difference between assembly size and flow cytometric genome size estimates. We discuss the prospects of gap closure with ultra-long sequence reads.

Updated: 2022-03-25