Predictive modeling of infant mortality
Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2021-01-18, DOI: 10.1007/s10618-020-00728-2
Antonia Saravanou, Clemens Noelke, Nicholas Huntington, Dolores Acevedo-Garcia, Dimitrios Gunopulos

The infant mortality rate (IMR) is defined as the number of infants, out of every thousand born, who do not survive until their first birthday. IMR is an important metric not only because it provides information about infant births in an area, but also because it reflects the general health status of a society. In the United States of America, the IMR is higher than in many other developed countries, despite the country's high level of prosperity. Notably, the U.S.A. exhibits strong and persistent inequalities in the IMR across racial and ethnic groups (Kochanek et al. in Natl Vital Stat Rep 65(4):1–122, 2006). In this paper, we study predictive models for the problem of infant mortality. We implement traditional machine learning models and state-of-the-art neural network models with various combinations of features extracted from birth certificates. These combinations include features that can be summarized as socio-economic and ethical features related to the mother and the father of the infant, and medical measurements taken during the pregnancy and the delivery. We approach the classification problem of infant mortality, that is, whether an infant will survive until their first birthday, both as a binary problem and as a multi-class problem based on the time of death. We focus on understanding and exploring the importance of the features extracted from birth certificates. For example, we compare the performance of models trained on the general population with that of models trained on subsets of the population, e.g., individual races. Our experimental evaluation compares different predictive models (including those used by epidemiology researchers), various combinations of features, different distributions in the training set, and feature importance.
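To make the IMR definition and the two classification framings concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the usual convention that the denominator is the number of live births. The function names, the `age_at_death_days` argument, and the 28-day neonatal/postneonatal split are illustrative assumptions; the abstract does not state which time-of-death classes the paper uses.

```python
from typing import Optional


def infant_mortality_rate(infant_deaths: int, live_births: int) -> float:
    """Deaths before the first birthday per 1,000 live births (assumed denominator)."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000.0 * infant_deaths / live_births


def binary_label(age_at_death_days: Optional[int]) -> int:
    """Binary framing: 1 if the infant died before the first birthday, else 0.

    None means no death was recorded (hypothetical encoding).
    """
    return int(age_at_death_days is not None and age_at_death_days < 365)


def multiclass_label(age_at_death_days: Optional[int]) -> str:
    """Multi-class framing based on time of death.

    The 28-day cutoff is the standard epidemiological neonatal boundary,
    used here as an assumption rather than the paper's class definition.
    """
    if age_at_death_days is None or age_at_death_days >= 365:
        return "survived"
    if age_at_death_days < 28:
        return "neonatal"
    return "postneonatal"
```

For example, under these assumptions `infant_mortality_rate(6, 1000)` gives a rate of 6 deaths per 1,000 live births, and an infant who died at 10 days of age is labeled 1 in the binary framing and "neonatal" in the multi-class framing.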




Updated: 2021-01-19