Optimizing predictive models to prioritize viral discovery in zoonotic reservoirs
bioRxiv - Ecology Pub Date : 2021-08-05 , DOI: 10.1101/2020.05.22.111344
Daniel J. Becker , Gregory F. Albery , Anna R. Sjodin , Timothée Poisot , Laura M. Bergner , Tad A. Dallas , Evan A. Eskew , Maxwell J. Farrell , Sarah Guth , Barbara A. Han , Nancy B. Simmons , Michiel Stock , Emma C. Teeling , Colin J. Carlson

Despite global investment in One Health disease surveillance, it remains difficult, and often very costly, to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can be used to guide sampling prioritization, but predictions from any given model may be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. Here, we use bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of likely reservoir hosts. In the first quarter of 2020, we generated an ensemble of eight statistical models that predict host-virus associations and developed priority sampling recommendations for potential bat reservoirs and potential bridge hosts for SARS-CoV-2. Over more than a year, we tracked the discovery of 40 new bat hosts of betacoronaviruses, validated initial predictions, and dynamically updated our analytic pipeline. We find that ecological trait-based models perform extremely well at predicting these novel hosts, whereas network methods consistently perform roughly as well as or worse than expected at random. These findings illustrate the importance of ensembling as a buffer against variation in model quality and highlight the value of including host ecology in predictive models. Our revised models show improved performance and predict over 400 bat species globally that could be undetected hosts of betacoronaviruses. Although 20 species of horseshoe bats (Rhinolophus spp.) are known to be the primary reservoir of SARS-like viruses, we find at least three-fourths of plausible betacoronavirus reservoirs in this bat genus might still be undetected.
Our study is the first to demonstrate through systematic validation that machine learning models can help optimize wildlife sampling for undiscovered viruses and illustrates how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.

Updated: 2021-08-09