Data-driven test strategy for COVID-19 using machine learning: A study in Lahore, Pakistan

Socio-Economic Planning Sciences (IF 6.1), Pub Date: 2021-06-08, DOI: 10.1016/j.seps.2021.101091
Chuanli Huang ₁,₂, Min Wang ₃, Warda Rafaqat ₁, Salman Shabbir ₄, Liping Lian ₅, Jun Zhang ₁, Siuming Lo ₂, Weiguo Song ₁

Aims

We aimed to give a preliminary analysis of the weaknesses of the current test strategy and to propose a data-driven strategy that adapts itself to the changing dynamics of the pandemic. We also examined how the choice of driving data, over time and across space, affects the strategy.

Methods

A mathematical definition of the test strategy was given. Using real COVID-19 test data collected in Lahore from March to July, a significance analysis of candidate features was conducted. A machine learning method based on logistic regression and priority ranking was proposed for the data-driven test strategy. With performance assessed by the area under the receiver operating characteristic curve (AUC), a time series analysis and a spatial cross-test were conducted.
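The combination of logistic regression, priority ranking, and AUC described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two binary features (`symptom`, `contact`), the coefficients used to generate the data, the sample sizes, and the capacity of 40 tests are all hypothetical, and the data are synthetic rather than the Lahore records.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def prioritize(scores, capacity):
    """Indices of the `capacity` highest-scoring people, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:capacity]

def make_synthetic(n, rng):
    # Synthetic data only: feature names and coefficients are made up.
    X, y = [], []
    for _ in range(n):
        symptom = 1.0 if rng.random() < 0.3 else 0.0
        contact = 1.0 if rng.random() < 0.3 else 0.0
        p = sigmoid(-3.0 + 2.0 * symptom + 1.5 * contact)
        X.append([symptom, contact])
        y.append(1 if rng.random() < p else 0)
    return X, y

rng = random.Random(0)
X_train, y_train = make_synthetic(400, rng)
X_test, y_test = make_synthetic(400, rng)

w = fit_logistic(X_train, y_train)
scores = [predict_proba(w, x) for x in X_test]
test_auc = auc(scores, y_test)

# Spend a limited test budget on the people with the highest predicted risk.
selected = prioritize(scores, capacity=40)
rate_selected = sum(y_test[i] for i in selected) / len(selected)
base_rate = sum(y_test) / len(y_test)
print(f"AUC={test_auc:.3f}  positive rate: top-40={rate_selected:.3f}, all={base_rate:.3f}")
```

Retraining `fit_logistic` on successive time windows, or fitting on one district and scoring another, would be one way to mimic the time series analysis and the spatial cross-test. That split is an assumption here, since the abstract does not specify how either was set up.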

Results

A shift in risk factors accounted for the failure of the current test strategy. Under strictly limited test capacity, the proposed data-driven strategy raised the positive detection rate from 2.54% to 28.18% and the recall rate from 8.05% to 89.35%. Test resources could be used far more efficiently: 89.35% of all positive cases could be detected with only 48.17% of the original number of tests. The strategy adapted itself as the pandemic developed, and the strategy driven by local data proved optimal.

Conclusions

We recommend generalizing such a data-driven test strategy to respond better to the still-developing global pandemic. In addition, COVID-19 data systems should be built at a finer spatial resolution to support local applications.
