Classification algorithms applied to blood-based transcriptome meta-analysis to predict idiopathic Parkinson's disease.
Computers in Biology and Medicine (IF 7.0), Pub Date: 2020-08-01, DOI: 10.1016/j.compbiomed.2020.103925
Marcelo Falchetti, Rui Daniel Prediger, Alfeu Zanotto-Filho

Diagnosis of Parkinson's disease (PD) remains a challenge in clinical practice, mostly due to the lack of peripheral blood markers. Transcriptomic analysis of blood samples has emerged as a potential means of identifying biomarkers and gene signatures of PD. In this context, classification algorithms can help detect data patterns, such as phenotypes and transcriptional signatures, with potential diagnostic application. In this study, we performed a gene expression meta-analysis of the blood transcriptome of PD patients and controls in order to identify a gene set capable of predicting PD with classification algorithms. We examined microarray data from public repositories and, after systematic review, selected 4 independent cohorts (GSE6613, GSE57475, GSE72267 and GSE99039) comprising 711 samples (388 idiopathic PD and 323 healthy individuals). Initially, analysis of differentially expressed genes showed minimal overlap among datasets. To circumvent this, we carried out a meta-analysis of 17,712 genes across datasets and calculated weighted mean Hedges' g effect sizes. Starting from the 100 genes with the largest positive effect sizes and the 100 with the largest negative effect sizes, collinearity recognition and recursive predictor elimination algorithms were used to generate a 59-gene signature of idiopathic PD. This signature was fine-tuned with 9 classification algorithms applied to 4 sample size-adjusted training groups to create 36 models. Of these, 33 showed accuracy higher than the no-information rate (the accuracy obtained by always predicting the majority class), and 2 models built on Support Vector Machine Regression achieved the best accuracy in predicting PD and healthy control samples. In summary, the gene meta-analysis followed by the machine learning methodology employed herein identified a gene set capable of accurately predicting idiopathic PD in blood samples.
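
To make the effect-size step concrete, the following is a minimal sketch of how a weighted mean Hedges' g could be computed for a single gene across cohorts. The abstract does not state which weighting scheme was used; inverse-variance (fixed-effect) weighting is assumed here, and the function names and toy expression values are hypothetical, not taken from the paper.

```python
import math
from statistics import mean, variance


def hedges_g(case, control):
    """Return (g, var_g) for one gene in one cohort.

    g is Cohen's d multiplied by the small-sample correction factor J.
    """
    n1, n2 = len(case), len(control)
    # Pooled variance of the two groups (sample variances, n - 1 denominator).
    pooled_var = ((n1 - 1) * variance(case) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    d = (mean(case) - mean(control)) / math.sqrt(pooled_var)
    # Small-sample bias correction (common approximation of J).
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    # Large-sample approximation of the variance of d, then scaled by J^2.
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def weighted_mean_g(effects):
    """Combine per-cohort (g, var_g) pairs into one weighted mean.

    Assumption: inverse-variance weights, as in a fixed-effect meta-analysis.
    """
    weights = [1 / var_g for _, var_g in effects]
    total = sum(w * g for w, (g, _) in zip(weights, effects))
    return total / sum(weights)


if __name__ == "__main__":
    # Hypothetical expression values for one gene in two cohorts.
    cohorts = [
        ([7.1, 7.4, 6.9, 7.6], [6.5, 6.8, 6.4, 6.9]),
        ([8.0, 8.3, 7.9], [7.7, 7.8, 8.1]),
    ]
    per_cohort = [hedges_g(case, control) for case, control in cohorts]
    print("weighted mean g:", weighted_mean_g(per_cohort))
```

Ranking genes by a combined effect size of this kind, rather than by per-dataset differential expression, is what allows a signature to be drawn from cohorts whose individual gene lists barely overlap.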




Updated: 2020-08-01