Novel G-quadruplex prone sequences emerge in the complete assembly of the human X chromosome
Biochimie (IF 3.9), Pub Date: 2021-09-09, DOI: 10.1016/j.biochi.2021.09.004
Natália Bohálová 1 , Jean-Louis Mergny 2 , Václav Brázda 3

G-quadruplexes are non-B secondary structures with regulatory functions and therapeutic potential. Improvements in sequencing methods recently allowed the completion of the first human chromosome, which is now available as a gapless, end-to-end assembly, with the previously remaining gaps filled and newly identified regions added. We compared the presence of G-quadruplex forming sequences in the current human reference genome (GRCh38) and in the new end-to-end assembly of the X chromosome constructed by high-coverage ultra-long-read nanopore sequencing. This comparison revealed that, even though the corrected length of the X chromosome assembly is, surprisingly, 1.14% shorter than expected, the number of G-quadruplex forming sequences found in this gapless chromosome is significantly higher, with 493 new motifs having G4Hunter scores above 1.4 and 23 new sequences with G4Hunter scores above 3.5. This observation reflects the improved precision of the new sequencing approaches and points to an underestimation of G-quadruplex propensity in the previous, widely used version of the human genome assembly, especially for motifs with a high G4Hunter score, which are expected to be very stable. These G-quadruplex forming sequences probably remained undiscovered in earlier genome datasets because of previously unresolved G-rich and repetitive genomic regions. These observations allow precise targeting of these important regulatory regions.
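The abstract reports counts of motifs above G4Hunter score thresholds without describing how the score is computed. The following is a minimal sketch of the scoring scheme as it is commonly described for G4Hunter (each G in a run scores the run length, capped at 4; C runs score the same with a negative sign; A and T score 0; a sequence's score is the mean). The function names, the 25-nt window, and the use of the absolute value to catch C-rich windows (G-rich on the opposite strand) are assumptions for illustration, not the authors' pipeline.

```python
def g4hunter_base_scores(seq):
    """Per-base scores: G runs positive, C runs negative, magnitude capped at 4."""
    seq = seq.upper()
    scores = []
    i = 0
    while i < len(seq):
        base = seq[i]
        j = i
        # Extend j to the end of the current run of identical bases.
        while j < len(seq) and seq[j] == base:
            j += 1
        run = j - i
        if base == "G":
            value = min(run, 4)
        elif base == "C":
            value = -min(run, 4)
        else:
            value = 0
        scores.extend([value] * run)
        i = j
    return scores


def g4hunter_score(seq):
    """Mean per-base score of a sequence (0.0 for an empty sequence)."""
    scores = g4hunter_base_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


def windows_above(seq, window=25, threshold=1.4):
    """Return (start, mean) for every window whose |mean score| exceeds threshold.

    The window size and the abs() strand handling are illustrative assumptions.
    """
    scores = g4hunter_base_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if abs(mean) > threshold:
            hits.append((start, mean))
    return hits


if __name__ == "__main__":
    telomeric = "GGGTTAGGGTTAGGGTTAGGG"
    print(round(g4hunter_score(telomeric), 3))
```

Under this scheme a score above 3.5 requires a window made almost entirely of long G (or C) runs, which is consistent with the abstract's point that such motifs sit in G-rich, repetitive regions that earlier assemblies left unresolved.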




Updated: 2021-09-12