A Framework for Integrating Domain Knowledge in Logistic Regression with Application to Hospital Readmission Prediction
International Journal on Artificial Intelligence Tools (IF 1.0), Pub Date: 2019-10-01, DOI: 10.1142/s0218213019600066
Sandro Radovanović 1, Boris Delibašić 1, Miloš Jovanović 1, Milan Vukićević 1, Milija Suknović 1

It is commonly understood that machine learning algorithms discover and extract knowledge from the data at hand. However, a large amount of knowledge is already available in machine-readable format and is ready for inclusion in machine learning algorithms and models. In this paper, we propose a framework that integrates domain knowledge, in the form of ontologies/hierarchies, into logistic regression using stacked generalization. Namely, relations from the ontology/hierarchy are used in a stacking manner to obtain higher-level, more abstract concepts, and the obtained concepts are then used for prediction. The problem we address is unplanned 30-day hospital readmission, which is considered one of the major problems in healthcare. The proposed framework yields better results than Ridge, Lasso, and Tree Lasso Logistic Regression. Results suggest that, on average, the proposed framework improves AUC by up to 9.5% on pediatric datasets and up to 4% on morbidly obese patients' datasets, and improves AUPRC by up to 5.7% on pediatric datasets and up to 2.6% on morbidly obese patients' datasets. This indicates that the inclusion of domain knowledge improves the predictive performance of Logistic Regression.
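The stacking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how concept-level models are built, so the sketch assumes one plausible reading, in which a logistic regression is fitted on the leaf features under each parent concept and its out-of-fold predicted probabilities become the higher-level features for a final logistic regression. The hierarchy (`respiratory`, `other`), the toy data, and all function names are hypothetical.

```python
import itertools
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logreg(X, y, l2=0.01, lr=0.5, epochs=300):
    """L2-penalised logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def columns(X, idx):
    # Select the leaf columns that belong to one parent concept.
    return [[row[j] for j in idx] for row in X]

def out_of_fold(X, y, k=3):
    """Out-of-fold probabilities, so the meta-model never sees a prediction
    made by a model that was trained on that same row."""
    n = len(X)
    oof = [0.0] * n
    for f in range(k):
        test = [i for i in range(n) if i % k == f]
        train = [i for i in range(n) if i % k != f]
        model = fit_logreg([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, predict_proba(model, [X[i] for i in test])):
            oof[i] = p
    return oof

def fit_stacked(X, y, hierarchy):
    """hierarchy maps a parent concept name to its leaf column indices."""
    parents = sorted(hierarchy)
    concept_cols = [out_of_fold(columns(X, hierarchy[p]), y) for p in parents]
    Z = [list(row) for row in zip(*concept_cols)]  # one column per concept
    base = {p: fit_logreg(columns(X, hierarchy[p]), y) for p in parents}
    meta = fit_logreg(Z, y)
    return parents, base, meta

def predict_stacked(stacked, X, hierarchy):
    parents, base, meta = stacked
    cols = [predict_proba(base[p], columns(X, hierarchy[p])) for p in parents]
    return predict_proba(meta, [list(row) for row in zip(*cols)])

# Hypothetical toy data: four binary diagnosis codes under two parent concepts.
# The label depends only on the "respiratory" codes (made up for illustration).
X = [list(bits) for bits in itertools.product([0, 1], repeat=4)] * 3
y = [1 if row[0] or row[1] else 0 for row in X]
hierarchy = {"respiratory": [0, 1], "other": [2, 3]}

stacked = fit_stacked(X, y, hierarchy)
probs = predict_stacked(stacked, X, hierarchy)
```

In this sketch the meta-model sees only one feature per parent concept, which is the sense in which the leaf codes are replaced by "higher, more abstract concepts". A deeper hierarchy could be handled by repeating the same step level by level, though whether the paper does so is not stated in the abstract.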

Updated: 2019-10-01