AB_SA: Accessory genes-Based Source Attribution - tracing the source of Salmonella enterica Typhimurium environmental strains.
Microbial Genomics (IF 4.0), Pub Date: 2020-07-01, DOI: 10.1099/mgen.0.000366
Laurent Guillier 1,2, Michèle Gourmelon 3, Solen Lozach 3, Sabrina Cadel-Six 1, Marie-Léone Vignaud 1, Nanna Munck 4, Tine Hald 4, Federica Palma 1

Attributing pathogenic strains isolated from the environment or from human cases to their sources is challenging. The pathogens usually colonize multiple animal hosts, including livestock, which contaminate the food-production chain and the environment (e.g. soil and water), posing an additional public-health burden and making the source hard to identify. Genomic data open up new opportunities for developing statistical models that indicate the likely source of pathogen contamination. Here, we propose a computationally fast and efficient multinomial logistic regression source-attribution classifier that predicts the animal source of bacterial isolates from 'source-enriched' loci extracted from the accessory-genome profiles of a pangenomic dataset. Guided by the accuracy of the model's self-attribution step, the modeller selects the number of candidate accessory genes that best fits the model for calculating the likelihood of (source) category membership. The Accessory genes-Based Source Attribution (AB_SA) method was applied to a dataset of strains of Salmonella enterica Typhimurium and its monophasic variant (S. enterica 1,4,[5],12:i:-). The model was trained on 69 strains with known animal-source categories (i.e. poultry, ruminant and pig). The AB_SA method identified 8 genes as predictors among the 2802 accessory genes. The self-attribution accuracy was 80%. The AB_SA model was then able to classify 25 of the 29 S. enterica Typhimurium and S. enterica 1,4,[5],12:i:- isolates collected from the environment (considered to be of unknown source) into a specific category (i.e. animal source) with a probability above 85%. The AB_SA method described here provides a user-friendly and valuable tool for performing source-attribution studies in only a few steps. AB_SA is written in R and freely available at https://github.com/lguillier/AB_SA.
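
The classification idea in the abstract can be illustrated with a minimal sketch. This is not the AB_SA R implementation: it is a plain-Python multinomial (softmax) logistic regression fitted by gradient descent, and the gene presence/absence profiles, the four-gene panel, the function names and the hyperparameters are all hypothetical values chosen for illustration. Only the three source categories and the 85% probability cut-off come from the abstract.

```python
import math

SOURCES = ["poultry", "ruminant", "pig"]  # source categories named in the abstract

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, profile):
    """weights[k] = [bias, w_1, ..., w_d] for source k; profile is 0/1 per gene."""
    scores = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], profile)) for w in weights]
    return softmax(scores)

def fit(X, y, n_classes, lr=0.5, epochs=2000, l2=0.001):
    """Full-batch gradient descent on the multinomial cross-entropy loss."""
    n, d = len(X), len(X[0])
    weights = [[0.0] * (d + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * (d + 1) for _ in range(n_classes)]
        for profile, label in zip(X, y):
            probs = predict_proba(weights, profile)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grads[k][0] += err
                for j in range(d):
                    grads[k][j + 1] += err * profile[j]
        for k in range(n_classes):
            for j in range(d + 1):
                penalty = l2 * weights[k][j] if j > 0 else 0.0  # bias is not penalised
                weights[k][j] -= lr * (grads[k][j] / n + penalty)
    return weights

def attribute(weights, profile, threshold=0.85):
    """Return (source, probability); source is None below the probability cut-off."""
    probs = predict_proba(weights, profile)
    best = max(range(len(probs)), key=probs.__getitem__)
    return (SOURCES[best] if probs[best] >= threshold else None), probs[best]

# Hypothetical presence/absence of 4 candidate accessory genes in 9 training strains.
X_train = [
    [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 0, 0],  # poultry
    [0, 1, 0, 1], [0, 1, 0, 0], [0, 1, 0, 1],  # ruminant
    [0, 0, 1, 0], [0, 0, 1, 1], [0, 0, 1, 0],  # pig
]
y_train = [0, 0, 0, 1, 1, 1, 2, 2, 2]

weights = fit(X_train, y_train, n_classes=len(SOURCES))

# Self-attribution: re-classify the training strains and measure accuracy.
hits = 0
for profile, label in zip(X_train, y_train):
    probs = predict_proba(weights, profile)
    hits += max(range(len(probs)), key=probs.__getitem__) == label
accuracy = hits / len(X_train)
print("self-attribution accuracy:", accuracy)

# Attribute isolates of unknown source (hypothetical profiles).
print(attribute(weights, [1, 0, 0, 1]))
print(attribute(weights, [0, 0, 0, 0]))
```

In this toy setting, an isolate carrying none of the source-enriched genes is left unattributed because no category reaches the cut-off. A modeller could rerun the fit with different numbers of candidate genes and compare the self-attribution accuracy, which mirrors the selection step described in the abstract.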

Updated: 2020-08-20