Preventive healthcare policies in the US: solutions for disease management using Big Data Analytics.
Journal of Big Data ( IF 8.1 ) Pub Date : 2020-06-23 , DOI: 10.1186/s40537-020-00315-8
Feras A Batarseh 1 , Iya Ghassib 2 , Deri Sondor Chong 3 , Po-Hsuan Su 3

Data-driven healthcare policy discussions are gaining traction after the COVID-19 outbreak and ahead of the 2020 US presidential elections. The US has a hybrid healthcare structure: the system does not provide universal coverage, although a few years ago the country enacted a mandate (the Affordable Care Act, ACA) that provides coverage for the majority of Americans. The US has the highest health expenditure per capita of all western and developed countries; however, most Americans do not tap into the benefits of preventive healthcare. It is estimated that only 8% of Americans undergo routine preventive screenings. At the national level, very few states (15 out of 50) have above-average preventive healthcare metrics. In the literature, many studies focus on the cure of diseases (research areas such as drug discovery and disease prediction), whereas only a minority have examined data-driven preventive measures, a matter that Americans and policy makers ought to place at the forefront of national issues. In this work, we present solutions for preventive practices and policies through Machine Learning (ML) methods. ML is morally neutral; its behavior depends on the data used to train the models. We make the case that Big Data is an imperative paradigm for healthcare. We examine disparities in clinical data for US patients by developing correlation and imputation methods for data completeness, and we identify non-conventional patterns. The data lifecycle followed is methodical and deliberate; 1000+ clinical, demographic, and laboratory variables are collected from the Centers for Disease Control and Prevention (CDC). Multiple statistical methods are deployed (Pearson correlations, Cramér's V, MICE, and ANOVA). Unsupervised ML models are also examined (K-modes and K-prototypes for clustering). The results presented in the paper offer pointers to preventive chronic disease tests, and the models are tested and evaluated.
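Of the statistical methods named in the abstract, Cramér's V is the one used to measure association between two categorical variables. The sketch below is only an illustration of the standard formula, V = sqrt(chi² / (n · (min(r, c) − 1))), written with the Python standard library; it is not the authors' implementation. The function name `cramers_v` and the toy values are hypothetical and are not taken from the paper or from CDC data.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of category labels.

    Returns a value in [0, 1]: 0 means no association, 1 means each
    category of one variable maps to exactly one category of the other.
    No bias correction is applied (an assumption of this sketch).
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count per (row, column) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # a constant variable carries no association

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_n in row_totals.items():
        for c, c_n in col_totals.items():
            expected = r_n * c_n / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    return sqrt(chi2 / (n * (k - 1)))


# Hypothetical toy labels, invented purely for illustration.
insured = ["yes", "yes", "no", "no", "yes", "no"]
screened = ["yes", "yes", "no", "no", "no", "no"]
print(round(cramers_v(insured, screened), 3))
```

With 1000+ variables, a pairwise table of such scores would be one plausible way to shortlist related categorical variables before imputation or clustering; whether the paper proceeds this way is not stated in the abstract.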

Updated: 2020-06-23