Towards Predicting Risk of Coronary Artery Disease from Semi-Structured Dataset, Interdisciplinary Sciences: Computational Life Sciences

Interdisciplinary Sciences: Computational Life Sciences (IF 4.8), Pub Date: 2020-03-19, DOI: 10.1007/s12539-020-00363-x
Smita Roy ₁ , Asif Ekbal ₁ , Samrat Mondal ₁ , Maunendra Sankar Desarkar ₂ , Shubham Chattopadhyay ₃

Many kinds of disease-related data are now available, and researchers are constantly attempting to mine useful information from them. Medical data are not always homogeneous or structured, and most are time-stamped. Special care is therefore required to prevent information loss when mining such data. Mining medical data is also challenging because inaccurate predictions are often unacceptable in this domain. In this paper, we analyze a partially annotated coronary artery disease (CAD) dataset that was originally in semi-structured form. We create a set of well-defined features from the dataset and then build predictive models for CAD risk identification using different supervised learning algorithms. We then further improve the performance of the models using a feature selection technique. Experiments show interesting results, which are expected to help medical practitioners investigate CAD risk in patients.
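
The abstract describes a pipeline of feature construction, feature selection, and supervised classification, but it does not name the algorithms or the selection technique used. The sketch below is therefore only an illustration of that general shape, not the authors' method: it swaps in a simple filter-style score (class-mean separation) for feature selection and a nearest-centroid classifier as the supervised model. The function names, the column meanings (`age`, `systolic_bp`, `noise`), and all data values are hypothetical and made up for the example.

```python
from statistics import mean, pstdev


def select_features(rows, labels, k):
    """Filter-style selection: keep the k features whose class means differ most.

    Score per feature = |mean(class 1) - mean(class 0)| / (sum of class stds).
    This scoring rule is an assumption chosen for illustration.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-9  # avoid division by zero
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def fit_centroids(rows, labels, features):
    """Compute one centroid per class, using only the selected features."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [mean(r[j] for r in members) for j in features]
    return centroids


def predict(row, centroids, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((row[j] - c) ** 2 for j, c in zip(features, centroid))
    return min(centroids, key=lambda cls: squared_distance(centroids[cls]))


# Toy records, entirely made up: columns are [age, systolic_bp, noise].
rows = [
    [45, 120, 7], [50, 125, 3], [38, 118, 9], [42, 122, 1],
    [63, 150, 8], [70, 160, 2], [66, 155, 6], [59, 148, 4],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = at risk (hypothetical annotation)

selected = select_features(rows, labels, k=2)   # drops the uninformative column
model = fit_centroids(rows, labels, selected)
prediction = predict([65, 152, 5], model, selected)
```

On this toy data the third column has identical means in both classes, so the filter discards it before the classifier is fitted. A real study would evaluate on held-out records rather than on the training set.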

Updated: 2020-03-19
