An infodemiological framework for tracking the spread of SARS-CoV-2 using integrated public data
Pattern Recognition Letters (IF 5.1), Pub Date: 2022-04-26, DOI: 10.1016/j.patrec.2022.04.030
Zhimin Liu, Zuodong Jiang, Geoffrey Kip, Kirti Snigdha, Jennings Xu, Xiaoying Wu, Najat Khan, Timothy Schultz

The outbreak of the SARS-CoV-2 novel coronavirus has caused a health crisis of immeasurable magnitude. Signals from heterogeneous public data sources could serve as early predictors of infection waves, particularly in the early phases of the pandemic, when infection data were scarce. In this article, we characterize temporal pandemic indicators by leveraging an integrated set of public data and apply them to a Prophet model to predict COVID-19 trends. An effective natural language processing pipeline was first built to extract time-series signals of specific articles from a news corpus. Bursts in these temporal signals were then identified with Kleinberg's burst detection algorithm. Across different US states, the correlations of weekly COVID-19 case numbers with Google Trends for COVID-19 related terms, COVID-19 news volume, and publicly available wastewater SARS-CoV-2 measurements were generally high at lags ranging from 0 to 3 weeks, indicating that these signals are strong predictors of viral spread. Incorporating the time-series signals of these effective predictors significantly improved the performance of the Prophet model, which was able to predict COVID-19 case numbers one and two weeks ahead with average mean absolute error rates of 0.38 and 0.46, respectively, across different states.
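
The lagged-correlation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`pearson`, `lagged_correlations`), the choice of Pearson correlation, and the synthetic weekly series are all assumptions made for illustration. It correlates a public signal at week t with case counts at week t + lag, for lags of 0 to 3 weeks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(signal, cases, max_lag=3):
    """Correlate signal[t] with cases[t + lag] for lag = 0..max_lag (weeks)."""
    result = {}
    for lag in range(max_lag + 1):
        shifted_cases = cases[lag:]                   # cases observed `lag` weeks later
        aligned_signal = signal[:len(signal) - lag]   # drop weeks with no later case value
        result[lag] = pearson(aligned_signal, shifted_cases)
    return result


# Synthetic weekly data (invented for illustration, not from the paper):
# the case series simply echoes the signal two weeks later.
signal = [1, 3, 7, 12, 9, 5, 3, 2, 4, 8, 11, 6]
cases = [0, 0] + signal[:-2]

correlations = lagged_correlations(signal, cases)
best_lag = max(correlations, key=correlations.get)
print(best_lag, correlations)
```

On this toy input the highest correlation occurs at a lag of two weeks, because the case series was constructed that way. With real data, a signal that peaks at a positive lag would be a candidate leading indicator to pass to the forecasting model as an additional regressor.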




Updated: 2022-04-30