Mycobacterium tuberculosis complex lineage 5 exhibits high levels of within-lineage genomic diversity and differing gene content compared to the type strain H37Rv
Microbial Genomics ( IF 4.0 ) Pub Date : 2021-07-09 , DOI: 10.1099/mgen.0.000437
C N'Dira Sanoussi 1, 2, 3 , Mireia Coscolla 4 , Boatema Ofori-Anyinam 5, 6 , Isaac Darko Otchere 7 , Martin Antonio 8 , Stefan Niemann 9, 10 , Julian Parkhill 11, 12 , Simon Harris 11 , Dorothy Yeboah-Manu 7 , Sebastien Gagneux 13, 14 , Leen Rigouts 2, 3 , Dissou Affolabi 1 , Bouke C de Jong 2 , Conor J Meehan 2, 15

Pathogens of the Mycobacterium tuberculosis complex (MTBC) are considered to be monomorphic, with little gene content variation between strains. Nevertheless, several genotypic and phenotypic factors separate strains of the different MTBC lineages (L), especially L5 and L6 (traditionally termed Mycobacterium africanum) strains, from each other. This genome variability and gene content, especially of L5 strains, has not been fully explored, yet it may be important for pathobiology and for current approaches to genomic analysis of MTBC strains, including transmission studies. By comparing the genomes of 355 L5 clinical strains (including 3 complete genomes and 352 Illumina whole-genome sequenced isolates) to each other and to H37Rv, we identified multiple genes that were differentially present or absent between H37Rv and L5 strains. Additionally, considerable gene content variability was found across L5 strains, including a split in the L5.3 sub-lineage into L5.3.1 and L5.3.2. These gene content differences had a small knock-on effect on transmission cluster estimation, with clustering rates influenced by the selected reference genome, and with potential overestimation of recent transmission when using H37Rv as the reference genome. We conclude that full capture of the gene diversity, especially for high-resolution outbreak analysis, requires a variation of the single H37Rv-centric reference genome mapping approach currently used in most whole-genome sequencing data analysis pipelines. Moreover, the high within-lineage gene content variability suggests that the pan-genome of M. tuberculosis is at least several kilobases larger than previously thought, implying that a concatenated or reference-free genome assembly (de novo) approach may be needed for particular questions.

Updated: 2021-07-12