A publicly accessible database for Clostridioides difficile genome sequences supports tracing of transmission chains and epidemics.
Microbial Genomics ( IF 4.0 ) Pub Date : 2020-08-01 , DOI: 10.1099/mgen.0.000410
Martinique Frentrup 1 , Zhemin Zhou 2 , Matthias Steglich 1, 3 , Jan P Meier-Kolthoff 1 , Markus Göker 1 , Thomas Riedel 1, 3 , Boyke Bunk 1 , Cathrin Spröer 1 , Jörg Overmann 1, 3, 4 , Marion Blaschitz 5 , Alexander Indra 5 , Lutz von Müller 6 , Thomas A Kohl 7, 8 , Stefan Niemann 7, 8 , Christian Seyboldt 9 , Frank Klawonn 10, 11 , Nitin Kumar 12 , Trevor D Lawley 12 , Sergio García-Fernández 13, 14 , Rafael Cantón 13, 14 , Rosa Del Campo 13, 14 , Ortrud Zimmermann 15 , Uwe Groß 15 , Mark Achtman 2 , Ulrich Nübel 1, 3, 4

Clostridioides difficile is the primary infectious cause of antibiotic-associated diarrhea. Local transmissions and international outbreaks of this pathogen have been previously elucidated by bacterial whole-genome sequencing, but comparative genomic analyses at the global scale were hampered by the lack of specific bioinformatic tools. Here we introduce a publicly accessible database within EnteroBase (http://enterobase.warwick.ac.uk) that automatically retrieves and assembles C. difficile short-reads from the public domain, and calls alleles for core-genome multilocus sequence typing (cgMLST). We demonstrate that comparable levels of resolution and precision are attained by EnteroBase cgMLST and single-nucleotide polymorphism analysis. EnteroBase currently contains 18 254 quality-controlled C. difficile genomes, which have been assigned to hierarchical sets of single-linkage clusters by cgMLST distances. This hierarchical clustering is used to identify and name populations of C. difficile at all epidemiological levels, from recent transmission chains through to epidemic and endemic strains. Moreover, it puts newly collected isolates into phylogenetic and epidemiological context by identifying related strains among all previously published genome data. For example, HC2 clusters (i.e. chains of genomes with pairwise distances of up to two cgMLST alleles) were statistically associated with specific hospitals (P<10⁻⁴) or single wards (P=0.01) within hospitals, indicating they represented local transmission clusters. We also detected several HC2 clusters spanning more than one hospital that by retrospective epidemiological analysis were confirmed to be associated with inter-hospital patient transfers. In contrast, clustering at level HC150 correlated with k-mer-based classification and was largely compatible with PCR ribotyping, thus enabling comparisons to earlier surveillance data.
EnteroBase enables contextual interpretation of a growing collection of assembled, quality-controlled C. difficile genome sequences and their associated metadata. Hierarchical clustering rapidly identifies database entries that are related at multiple levels of genetic distance, facilitating communication among researchers, clinicians and public-health officials who are combatting disease caused by C. difficile.
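
The abstract defines an HC2 cluster as a chain of genomes linked by pairwise distances of up to two cgMLST alleles, which is single-linkage clustering at a fixed threshold. The Python sketch below illustrates that idea on invented toy profiles; it is not EnteroBase's implementation. The function names, the toy data, and the rule that loci with a missing call are skipped when counting differences are all assumptions made for illustration.

```python
from itertools import combinations

MISSING = None  # assumed placeholder for a locus with no allele call


def allele_distance(a, b):
    """Count loci where both profiles have a call and the calls differ.

    Skipping loci with a missing call is an assumption of this sketch.
    """
    return sum(
        1
        for x, y in zip(a, b)
        if x is not MISSING and y is not MISSING and x != y
    )


def single_linkage_clusters(profiles, threshold):
    """Group genomes connected by chains of pairwise distances <= threshold."""
    names = list(profiles)
    parent = {n: n for n in names}  # union-find forest

    def find(n):
        # Walk to the root, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if allele_distance(profiles[a], profiles[b]) <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


# Invented toy profiles: six loci per genome, integer allele numbers.
toy_profiles = {
    "A": [1, 1, 1, 1, 1, 1],
    "B": [1, 1, 1, 1, 2, 2],  # 2 alleles from A
    "C": [1, 1, 2, 2, 2, 2],  # 2 alleles from B, 4 from A
    "D": [9, 9, 9, 9, 9, 9],  # 6 alleles from every other genome
}

hc2 = single_linkage_clusters(toy_profiles, threshold=2)
```

In this toy data, A and C differ at four loci but still share a cluster at threshold 2, because B links them. That chaining is what "single-linkage" means. Running the same function at increasing thresholds (for example 2, then 150) yields nested cluster sets, which is the sense in which the clustering is hierarchical.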

Updated: 2020-08-27