Universal whole-sequence-based plasmid typing and its utility to prediction of host range and epidemiological surveillance.
Microbial Genomics (IF 4.0), Pub Date: 2020-10-01, DOI: 10.1099/mgen.0.000435
James Robertson, Kyrylo Bessonov, Justin Schonfeld, John H E Nash

Bacterial plasmids play a large role in allowing bacteria to adapt to changing environments and can pose a significant risk to human health if they confer virulence and antimicrobial resistance (AMR). Plasmids differ significantly in the taxonomic breadth of host bacteria in which they can successfully replicate; this breadth is commonly referred to as ‘host range’ and is usually described in qualitative terms as ‘narrow’ or ‘broad’. Understanding the host range potential of plasmids is of great interest because of their ability to disseminate traits such as AMR through bacterial populations and into human pathogens. We developed the MOB-suite to facilitate characterization of plasmids and introduced a whole-sequence-based classification system that clusters complete plasmid sequences using Mash distances (https://github.com/phac-nml/mob-suite). We updated the MOB-suite database from 12 091 to 23 671 complete sequences, representing 17 779 unique plasmids. Taking advantage of new algorithms for rapidly calculating average nucleotide identity (ANI), we compared clustering characteristics using two different distance measures (Mash and ANI) and three clustering algorithms on the unique set of plasmids. The plasmid nomenclature is designed to group together highly similar plasmids that are unlikely to have multiple representatives within a single cell. Based on our results, we determined that Mash with complete-linkage clustering at a Mash distance of 0.06 produced highly homogeneous clusters while maintaining cluster size. The taxonomic distribution of plasmid biomarker sequences for replication and relaxase typing, in combination with MOB-suite whole-sequence-based clusters, has been examined in detail for all high-quality publicly available plasmid sequences.
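As an illustration of the clustering step described above, the following is a minimal sketch of complete-linkage clustering cut at a distance of 0.06. It is not the MOB-suite implementation: the function name, the plasmid identifiers, and the distance values are all hypothetical, and treating the threshold as inclusive is an assumption.

```python
from itertools import combinations


def complete_linkage_clusters(names, dist, threshold=0.06):
    """Agglomerative clustering with complete linkage.

    names: list of plasmid identifiers (hypothetical here)
    dist: dict mapping frozenset({a, b}) -> pairwise distance
    Merging stops once the closest pair of clusters is farther
    apart than `threshold` (inclusive cut-off is an assumption).
    """
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        # Complete linkage: cluster distance is the LARGEST pairwise distance.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest complete-linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected

    return sorted(sorted(c) for c in clusters)


# Made-up Mash-like distances between four hypothetical plasmids.
names = ["pA", "pB", "pC", "pD"]
dist = {
    frozenset(("pA", "pB")): 0.01,
    frozenset(("pA", "pC")): 0.05,
    frozenset(("pB", "pC")): 0.08,
    frozenset(("pA", "pD")): 0.30,
    frozenset(("pB", "pD")): 0.31,
    frozenset(("pC", "pD")): 0.29,
}
clusters = complete_linkage_clusters(names, dist)
print(clusters)
```

In this toy input, pC is within 0.06 of pA but not of pB, so complete linkage keeps it out of the pA/pB cluster; single linkage would have merged it. That property is what makes every member of a complete-linkage cluster lie within the threshold of every other member.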
We have incorporated prediction of plasmid replication host range into the MOB-suite based on observed distributions of these sequence features in combination with known plasmid hosts from the literature. Host range is reported as the highest taxonomic rank that covers all of the plasmids which share replicon or relaxase biomarkers or belong to the same MOB-suite cluster code. Reporting host range based on these criteria allows for comparisons of host range between studies and provides information for plasmid surveillance.
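The host range rule can be sketched in a few lines. The version below reads "the highest taxonomic rank that covers all of the plasmids" as the most specific rank at which all observed hosts still share a single taxon; that reading, the `RANKS` list, the `host_range` function, and the dictionary layout are assumptions for illustration, not the MOB-suite code or output format.

```python
# Ranks ordered from broadest to most specific (illustrative subset).
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def host_range(lineages):
    """Return (rank, taxon) for the most specific rank shared by all hosts.

    lineages: list of dicts mapping rank -> taxon name, one per observed host.
    Returns None when the hosts share no rank at all (or the list is empty).
    """
    result = None
    for rank in RANKS:
        taxa = {lineage.get(rank) for lineage in lineages}
        # Stop at the first rank where hosts disagree or data is missing.
        if len(taxa) != 1 or None in taxa:
            break
        result = (rank, taxa.pop())
    return result


# Hosts of a hypothetical group of plasmids sharing one replicon type.
hosts = [
    {"phylum": "Pseudomonadota", "class": "Gammaproteobacteria",
     "order": "Enterobacterales", "family": "Enterobacteriaceae",
     "genus": "Escherichia", "species": "Escherichia coli"},
    {"phylum": "Pseudomonadota", "class": "Gammaproteobacteria",
     "order": "Enterobacterales", "family": "Enterobacteriaceae",
     "genus": "Salmonella", "species": "Salmonella enterica"},
    {"phylum": "Pseudomonadota", "class": "Gammaproteobacteria",
     "order": "Enterobacterales", "family": "Enterobacteriaceae",
     "genus": "Klebsiella", "species": "Klebsiella pneumoniae"},
]
predicted = host_range(hosts)
print(predicted)
```

Reporting a rank and taxon rather than a "narrow" or "broad" label is what makes the result comparable between studies: two groups of plasmids can be compared directly by the rank at which their hosts converge.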

Updated: 2020-10-27