Classifiers for Predicting Coronary Artery Disease Based on Gene Expression Profiles in Peripheral Blood Mononuclear Cells
International Journal of General Medicine (IF 2.1), Pub Date: 2021-09-15, DOI: 10.2147/ijgm.s329005
Jie Liu 1, 2 , Xiaodong Wang 1, 2 , Junhua Lin 1 , Shaohua Li 1 , Guoxiong Deng 1, 2 , Jinru Wei 1, 2

Objective: Coronary artery disease (CAD) is a serious global health concern. Current diagnostic methods for CAD involve risk to the patient and are costly, so better diagnostic tools are needed. We defined four classifiers based on gene expression profiles in peripheral blood mononuclear cells and determined their potential for CAD detection.
Methods: We downloaded a CAD-related data set (GSE113079) from the Gene Expression Omnibus (GEO) database. We identified differentially expressed genes (DEGs) in peripheral blood mononuclear cells between CAD samples and healthy controls, and analyzed the DEGs for functional enrichment. To create a robust CAD classifier, feature selection was applied to the DEGs using principal component analysis. Least absolute shrinkage and selection operator (LASSO) logistic regression, random forest, and support vector machine (SVM) models were then created. A gene set variation analysis (GSVA) score was also computed, and gene set enrichment analysis (GSEA) was performed. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC).
Results: In the training set, we found 135 up-regulated genes and 104 down-regulated genes in CAD patients compared with controls. The DEGs were involved in pathways associated with CAD, such as calcium and interleukin-17 signaling. Twenty genes were identified as optimal features and used to generate the logistic classifier based on LASSO. The AUC for this classifier was 1.00 in the training set and 0.997 in the test set. Using the 20 DEGs, SVM and random forest classifiers were also generated and showed high diagnostic efficacy, with AUCs of 0.997 and 1.00, respectively, in the training set. A GSVA score was also established using the top 20 significant DEGs; it showed an AUC of 0.971 in the training set and 0.989 in the test set. Furthermore, GSEA showed autophagy and the proteasome to be major pathways involving the DEGs.
Conclusion: We identified a set of genes specific for CAD whose expression can be measured non-invasively. From these genes, we defined four diagnostic classifiers using different methods.
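
Every classifier in the abstract is compared by AUC. As a minimal sketch of what that metric measures, the function below computes AUC as the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = case, 0 = control).

    Uses the pairwise definition: the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs, made up for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1.00, as reported for some models in the training set, means every case scored above every control under this pairwise definition; 0.5 corresponds to chance-level ranking.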

Keywords: coronary artery disease, diagnosis, gene expression, classifier


Updated: 2021-09-15