Neuroscience Letters (IF 2.5), Pub Date: 2020-12-24, DOI: 10.1016/j.neulet.2020.135596
Lulu Zhu, Xulong Wu, Bingyi Xu, Zhi Zhao, Jialei Yang, Jianxiong Long, Li Su
Background
Schizophrenia (SCZ) is a highly heritable mental disorder with a substantial disease burden and high mortality. Machine learning (ML) methods can be used to identify individuals with SCZ from blood gene expression data with high accuracy.
Methods
This study aimed to differentiate patients with SCZ from healthy individuals using messenger RNA (mRNA) expression levels in peripheral blood. Samples from 48 patients with SCZ and 50 controls were analyzed with five ML algorithms: artificial neural networks, extreme gradient boosting, support vector machine (SVM), decision tree, and random forest. The expression of six mRNAs was measured using quantitative real-time polymerase chain reaction (qRT-PCR).
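The design described here pairs five algorithms with feature sets drawn from six mRNAs. As a rough sketch of how large that search space could be, the following enumerates every non-empty gene subset for each algorithm. The gene symbols are the ones reported in the Results. Treating all subsets as candidates is an assumption, since the abstract only says that "various combinations" were modeled, and the algorithm labels are plain strings rather than calls into any ML library.

```python
from itertools import combinations

# Gene symbols and algorithm names as listed in the abstract.
GENES = ["GNAI1", "FYN", "PRKCA", "YWHAZ", "PRKCB", "LYN"]
ALGORITHMS = [
    "artificial neural network",
    "extreme gradient boosting",
    "support vector machine",
    "decision tree",
    "random forest",
]


def gene_subsets(genes):
    """Yield every non-empty subset of genes, smallest subsets first."""
    for size in range(1, len(genes) + 1):
        yield from combinations(genes, size)


def candidate_models(genes, algorithms):
    """Pair each algorithm with each gene subset.

    ASSUMPTION: an exhaustive search; the abstract does not say
    which subsets were actually evaluated.
    """
    return [(algo, subset) for algo in algorithms for subset in gene_subsets(genes)]


models = candidate_models(GENES, ALGORITHMS)
print(len(list(gene_subsets(GENES))))  # 2**6 - 1 = 63 subsets
print(len(models))                     # 5 algorithms x 63 subsets = 315
```

Under this assumption, the six-gene SVM model singled out in the Results corresponds to one entry of the list: `("support vector machine", tuple(GENES))`.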
Results
The relative expression levels of GNAI1 (P < 0.001), PRKCA (P < 0.001), and PRKCB (P = 0.021) increased in the SCZ group, whereas those of FYN (P < 0.001), LYN (P = 0.022), and YWHAZ (P < 0.001) decreased in the SCZ group. We generated models with various combinations of genes based on five ML algorithms. The SVM model with six factors (GNAI1, FYN, PRKCA, YWHAZ, PRKCB, and LYN genes) was the best model for distinguishing patients with SCZ from healthy individuals (AUC = 0.993, sensitivity = 1.000, specificity = 0.895, and Youden index = 0.895).
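The Youden index reported above follows from sensitivity and specificity as J = sensitivity + specificity - 1, so a model with perfect sensitivity has J equal to its specificity. A minimal sketch of that arithmetic is shown here; the function names are chosen for illustration and do not come from the paper.

```python
def sensitivity(tp, fn):
    """True positive rate: fraction of patients correctly classified."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: fraction of controls correctly classified."""
    return tn / (tn + fp)


def youden_index(sens, spec):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sens + spec - 1.0


# Values reported in the abstract for the six-gene SVM model.
reported_j = youden_index(1.000, 0.895)
print(round(reported_j, 3))  # 0.895, matching the reported Youden index
```

The check confirms that the reported sensitivity, specificity, and Youden index are mutually consistent. The abstract does not give the underlying confusion-matrix counts, so `sensitivity` and `specificity` are included only to show how those inputs would be derived.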
Conclusions
This study suggests that combining several genes in an ML model discriminates patients with SCZ from healthy individuals better than using a single gene. The combination of GNAI1, FYN, PRKCA, YWHAZ, PRKCB, and LYN under the SVM model can be used as a diagnostic biomarker for SCZ.
Title (translated from Chinese): Machine learning algorithms for the diagnosis of schizophrenia based on peripheral blood gene expression