Automated annotation of human centromeres with HORmon
Genome Research (IF 7), Pub Date: 2022-06-01, DOI: 10.1101/gr.276362.121
Olga Kunyavskaya 1 , Tatiana Dvorkina 1 , Andrey V Bzikadze 2 , Ivan A Alexandrov 1 , Pavel A Pevzner 3

Recent advances in long-read sequencing have made it possible to address long-standing questions about the architecture and evolution of human centromeres. They have also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats [HORs]). Despite a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres and for evolutionary studies of centromeres across multiple species. Although monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we show that they should be integrated into a single framework in which HOR inference affects monomer inference and vice versa. We thus developed the HORmon algorithm, which integrates monomer and HOR inference and automatically generates human monomers and HORs that are largely consistent with previous semi-manual inference.
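
To make the two alphabets concrete, the toy sketch below rewrites a monocentromere (a string over a monomer alphabet) in a HOR alphabet. It is an illustration only and not the HORmon algorithm: the single-letter monomer labels, the symbol "H" for a canonical HOR copy, and the function names hor_decompose and run_length are all assumptions made for this example, and the HOR is taken as given rather than inferred.

```python
from itertools import groupby


def hor_decompose(monocentromere: str, hor: str) -> list[str]:
    """Greedy left-to-right rewrite of a monocentromere in a HOR alphabet.

    Each full copy of `hor` becomes the symbol "H" (hypothetical label);
    any monomer that does not start a full copy is kept as-is.
    """
    if not hor:
        raise ValueError("hor must contain at least one monomer")
    out = []
    i = 0
    k = len(hor)
    while i < len(monocentromere):
        if monocentromere[i:i + k] == hor:
            out.append("H")  # one canonical HOR copy
            i += k
        else:
            out.append(monocentromere[i])  # leftover monomer
            i += 1
    return out


def run_length(symbols: list[str]) -> list[tuple[str, int]]:
    """Collapse consecutive identical symbols into (symbol, count) pairs."""
    return [(s, sum(1 for _ in group)) for s, group in groupby(symbols)]


# Toy monocentromere: three-monomer HOR "ABC" with one divergent copy "ABD".
decomposition = hor_decompose("ABCABCABDABC", "ABC")
print(run_length(decomposition))
```

In this toy input the divergent copy "ABD" stays in the monomer alphabet while the intact copies collapse to "H". A fixed HOR like this one cannot capture the coupling the abstract argues for, in which the choice of monomers changes which HORs are found and vice versa.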

Updated: 2022-06-01