Inter-chromosomal k-mer distances
BMC Genomics (IF 4.4), Pub Date: 2021-09-06, DOI: 10.1186/s12864-021-07952-0
Alon Kafri, Benny Chor, David Horn

Inversion Symmetry is a generalization of the second Chargaff rule, stating that the count of a string of k nucleotides on a single chromosomal strand equals the count of its inverse (reverse-complement) k-mer. It holds for many species, both eukaryotes and prokaryotes, for ranges of k which may vary from 7 to 10 as chromosomal lengths vary from 2 Mbp to 200 Mbp. Building on this formalism we introduce the concept of k-mer distances between chromosomes.

We formulate two k-mer distance measures, D1 and D2, which depend on k. D1 takes into account all k-mers (for a single k) appearing on single strands of the two compared chromosomes, whereas D2 takes into account both strands of each chromosome. Both measures reflect dissimilarities in global chromosomal structures.

After defining the various distance measures and summarizing their properties, we also define proximities that rely on the existence of synteny blocks between chromosomes of different bacterial strains. Comparing pairs of strains of bacteria, we find negative correlations between synteny proximities and k-mer distances, thus establishing the meaning of the latter as measures of evolutionary distances among bacterial strains. The synteny measures we use are appropriate for closely related bacterial strains, where considerable sections of chromosomes demonstrate high direct or reversed equality. These measures are not appropriate for comparing different bacteria or eukaryotes.

K-mer structural distances can be defined for all species. Because of the arbitrariness of strand choices, we employ only the D2 measure when comparing chromosomes of different species. The results for comparisons of various eukaryotes display interesting behavior which is partially consistent with conventional understanding of evolutionary genomics. In particular, we define ratios of minimal k-mer distances (KDR) between unmasked and masked chromosomes of two species, which correlate with both short and long evolutionary scales.
k-mer distances reflect dissimilarities among global chromosomal structures. They carry information which aggregates all mutations. As such, they can complement traditional evolution studies, which mainly concentrate on coding regions.
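
The abstract does not give the formulas for D1 and D2, so the following Python sketch does not reproduce them. It only illustrates the two ingredients the abstract names: counting k-mers on a single strand versus on both strands, and comparing the resulting k-mer profiles of two sequences. The function `frequency_distance` is a hypothetical stand-in (half the L1 distance between normalized k-mer frequencies), chosen purely for illustration, and all function names are assumptions.

```python
from collections import Counter

# Translation table mapping each nucleotide to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement (the 'inverse') of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers on one strand, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def both_strand_counts(counts: Counter) -> Counter:
    """Combine single-strand counts with those of the complementary strand.

    A k-mer seen n times on one strand implies its reverse complement
    is seen n times on the other strand.
    """
    total = Counter()
    for kmer, n in counts.items():
        total[kmer] += n
        total[reverse_complement(kmer)] += n
    return total


def frequency_distance(a: Counter, b: Counter) -> float:
    """Hypothetical distance, NOT the paper's D1 or D2.

    Half the L1 distance between normalized k-mer frequencies,
    ranging from 0 (identical profiles) to 1 (disjoint profiles).
    Both inputs must be non-empty.
    """
    total_a, total_b = sum(a.values()), sum(b.values())
    keys = set(a) | set(b)
    return 0.5 * sum(abs(a[x] / total_a - b[x] / total_b) for x in keys)


if __name__ == "__main__":
    k = 2
    seq = "AAACCG"
    flipped = reverse_complement(seq)  # the same molecule read from the other strand

    single = frequency_distance(kmer_counts(seq, k), kmer_counts(flipped, k))
    double = frequency_distance(
        both_strand_counts(kmer_counts(seq, k)),
        both_strand_counts(kmer_counts(flipped, k)),
    )
    print("single-strand distance:", single)
    print("both-strand distance:", double)
```

In this toy case the single-strand comparison reports a nonzero distance between a sequence and its own reverse complement, while the both-strand comparison reports zero. That is one way to see why a strand-independent measure is the natural choice when the strand assignment is arbitrary, as the abstract notes for comparisons across species.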

Updated: 2021-09-06