Fingerprinting cities: differentiating subway microbiome functionality.
Biology Direct ( IF 5.7 ) Pub Date : 2019-10-30 , DOI: 10.1186/s13062-019-0252-y
Chengsheng Zhu 1 , Maximilian Miller 1, 2, 3 , Nick Lusskin 1 , Yannick Mahlich 1, 2, 3, 4 , Yanran Wang 1 , Zishuo Zeng 1 , Yana Bromberg 1, 4

BACKGROUND Accumulating evidence suggests that the human microbiome impacts individual and public health. City subway systems are human-dense environments, where passengers often exchange microbes. The MetaSUB project participants collected samples from subway surfaces in different cities and performed metagenomic sequencing. Previous studies focused on the taxonomic composition of these microbiomes, and no explicit functional analysis had been done until now.
RESULTS As part of the 2018 CAMDA challenge, we functionally profiled the available ~400 subway metagenomes and built a predictor of city of origin. In cross-validation, our model reached 81% accuracy when only the top-ranked city assignment was considered and 95% accuracy when the second-ranked city was taken into account as well. Notably, this performance was achievable only when the distribution of cities was similar in the training and testing sets. To ensure that our method is applicable without such biased assumptions, we balanced our training data so that all represented cities were accounted for equally well. After balancing, the performance of our method was slightly lower (76/94%, respectively, for one or two top-ranked cities), but still consistently high. The balanced model has the added benefit of being independent of how cities are represented in the training set. In testing, our unbalanced model reached an (over-estimated) performance of 90/97%, while our balanced model attained a more reliable 63/90% accuracy. While, by definition of our model, we could not predict microbiome origins that were not seen in training, our balanced model correctly judged such samples to be NOT-from-training-cities over 80% of the time. Our function-based outlook on microbiomes also allowed us to note similarities between both regionally close and far-away cities. Curiously, we identified depletion of mycobacterial functions as a signature of cities in New Zealand, while photosynthesis-related functions fingerprinted New York, Porto and Tokyo.
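The paired figures reported above (for example 81/95%) are top-1 and top-2 accuracies: a prediction counts as correct if the true city is the highest-ranked assignment, or among the two highest-ranked assignments. The following is a minimal sketch of that metric only. The function name `top_k_accuracy`, the per-city score dictionaries, and the toy values are assumptions made for illustration; they are not the authors' code or data.

```python
from typing import Dict, List


def top_k_accuracy(scores: List[Dict[str, float]], truth: List[str], k: int) -> float:
    """Fraction of samples whose true city is among the k highest-scoring cities."""
    if len(scores) != len(truth):
        raise ValueError("scores and truth must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for sample_scores, true_city in zip(scores, truth):
        # Rank cities by descending score and keep the k best.
        ranked = sorted(sample_scores, key=sample_scores.get, reverse=True)
        if true_city in ranked[:k]:
            hits += 1
    return hits / len(truth)


# Toy scores for four hypothetical samples (illustrative, not study data).
toy_scores = [
    {"New York": 0.7, "Porto": 0.2, "Tokyo": 0.1},
    {"New York": 0.5, "Porto": 0.4, "Tokyo": 0.1},
    {"New York": 0.1, "Porto": 0.3, "Tokyo": 0.6},
    {"New York": 0.6, "Porto": 0.1, "Tokyo": 0.3},
]
toy_truth = ["New York", "Porto", "Tokyo", "Porto"]

top1 = top_k_accuracy(toy_scores, toy_truth, k=1)
top2 = top_k_accuracy(toy_scores, toy_truth, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

On this toy input the second sample is missed at top-1 but recovered at top-2, which is why the second figure in each reported pair is never lower than the first.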
CONCLUSIONS We demonstrated the power of our high-speed function annotation method, mi-faser, by analysing ~400 shotgun metagenomes in 2 days, with the results recapitulating functional signals of different city subway microbiomes. We also showed the importance of balanced data in avoiding over-estimated performance. Our results revealed similarities between both geographically close (Ofa and Ilorin) and distant (Boston and Porto, Lisbon and New York) city subway microbiomes. The photosynthesis-related functional signatures of NYC were previously unseen in taxonomy studies, highlighting the strength of functional analysis.
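The point about balanced data can be illustrated with one common balancing strategy, random undersampling, in which every city is reduced to the size of the least-represented city. The abstract does not say how the training data were balanced, so this is only a sketch under that assumption; `balance_by_undersampling`, the sample identifiers, and the city counts are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import List, Sequence, Tuple


def balance_by_undersampling(
    samples: Sequence[Tuple[str, str]], seed: int = 0
) -> List[Tuple[str, str]]:
    """Return a subset of (sample_id, city) pairs with equal counts per city."""
    if not samples:
        raise ValueError("at least one sample is required")
    by_city = defaultdict(list)
    for sample_id, city in samples:
        by_city[city].append((sample_id, city))
    # Every city is cut down to the size of the smallest city.
    smallest = min(len(group) for group in by_city.values())
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    balanced = []
    for city in sorted(by_city):
        balanced.extend(rng.sample(by_city[city], smallest))
    return balanced


# Hypothetical, deliberately skewed training set: 5 / 2 / 3 samples per city.
toy_training = (
    [(f"ny_{i}", "New York") for i in range(5)]
    + [(f"po_{i}", "Porto") for i in range(2)]
    + [(f"tk_{i}", "Tokyo") for i in range(3)]
)

balanced_training = balance_by_undersampling(toy_training)
city_counts = Counter(city for _, city in balanced_training)
print(dict(city_counts))
```

Undersampling discards data from well-represented cities, which is one plausible reason a balanced model can score lower in cross-validation while being less dependent on the city distribution of the training set.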

Updated: 2020-04-22