Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data
BMC Medical Genomics (IF 2.7), Pub Date: 2021-08-31, DOI: 10.1186/s12920-021-00985-0
Xue Jiang 1, Miao Chen 1, Weichen Song 1, Guan Ning Lin 1,2

Clinically, behavioral, cognitive, and mental functions are affected as neurodegenerative disease progresses. To date, the molecular pathogenesis of these complex diseases remains unclear. With the rapid development of sequencing technologies, computational methods can now decode the molecular mechanisms underlying different clinical phenotypes at the genome-wide transcriptomic level. Our previous studies have shown that it is difficult to distinguish disease genes from non-disease genes. Therefore, to explore the molecular pathogenesis underlying complex clinical phenotypes precisely, it is better to identify biomarkers corresponding to different disease stages or clinical phenotypes. In this study, we designed a label propagation-based semi-supervised feature selection approach (LPFS) to prioritize disease-associated genes corresponding to different disease stages or clinical phenotypes. We pioneered the combination of label propagation clustering and feature selection in a single framework. LPFS prioritizes disease genes related to different disease stages or phenotypes by alternating between label propagation clustering on a sample network and feature selection on gene expression profiles. GO and KEGG pathway enrichment analyses and gene functional analysis were then carried out to explore the molecular mechanisms of specific disease phenotypes, and thus to decode the changes in individual behavioral and mental characteristics during neurodegenerative disease progression. Extensive experiments were conducted to verify the performance of LPFS on Huntington’s disease gene expression data. The experimental results showed that LPFS performs better than state-of-the-art methods.
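The alternation between label propagation on a sample network and feature selection on expression profiles can be illustrated with a minimal sketch. This is not the authors' LPFS objective or code, which the abstract does not specify. It assumes an RBF sample graph with a median-distance bandwidth, the standard closed-form label spreading solution, and a Fisher-style between/within variance ratio as the feature score. All function names and parameters (`rbf_affinity`, `propagate_labels`, `fisher_scores`, `alternate_lp_fs`, `alpha`, `n_keep`, `n_iter`) are hypothetical.

```python
import numpy as np


def rbf_affinity(X):
    """RBF similarity between samples (rows of X), bandwidth set by the median distance."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    positive = d2[d2 > 0]
    scale = np.median(positive) if positive.size else 1.0
    W = np.exp(-d2 / scale)
    np.fill_diagonal(W, 0.0)  # no self-loops in the sample network
    return W


def propagate_labels(W, y, n_classes, alpha=0.9):
    """Closed-form label spreading; y holds class indices, with -1 for unlabeled samples."""
    d = W.sum(axis=1)
    d[d == 0] = 1.0
    S = W / np.sqrt(np.outer(d, d))  # symmetrically normalized affinity
    Y = np.zeros((len(y), n_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0
    F = np.linalg.solve(np.eye(len(y)) - alpha * S, Y)
    pred = F.argmax(axis=1)
    pred[labeled] = y[labeled]  # keep the known labels fixed
    return pred


def fisher_scores(X, labels, n_classes):
    """Per-feature ratio of between-class to within-class variation."""
    mu = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in range(n_classes):
        Xc = X[labels == c]
        if len(Xc) == 0:
            continue
        between += len(Xc) * (Xc.mean(axis=0) - mu) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)


def alternate_lp_fs(X, y, n_classes, n_keep, n_iter=5):
    """Alternate label propagation (on selected features) with feature re-ranking."""
    selected = np.arange(X.shape[1])  # start from all features (genes)
    for _ in range(n_iter):
        W = rbf_affinity(X[:, selected])
        labels = propagate_labels(W, y, n_classes)
        scores = fisher_scores(X, labels, n_classes)
        selected = np.argsort(scores)[::-1][:n_keep]
    return selected, labels
```

Here `X` would be a samples-by-genes expression matrix and `y` a vector of known stage or phenotype labels with `-1` marking unlabeled samples. Each pass rebuilds the sample graph from the currently selected genes, so the label estimates and the gene ranking refine each other.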
GO and KEGG enrichment analysis of key gene sets showed that the TGF-beta signaling pathway, cytokine-cytokine receptor interaction, immune response, and inflammatory response were gradually affected during Huntington’s disease progression. In addition, we found that the expression of SLC4A11, ZFP474, AMBP, TOP2A, PBK, CCDC33, APSL, DLGAP5, and Al662270 changed markedly as the disease developed. In this study, we designed a label propagation-based semi-supervised feature selection model to precisely select key genes of different disease phenotypes. We applied the model to gene expression data from Huntington’s disease mice to decode the disease mechanisms. We found that many cell types, including astrocytes, microglia, and GABAergic neurons, could be involved in the pathological process.
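The abstract does not say which statistical test was used for the GO and KEGG enrichment analysis. A common choice for this kind of over-representation analysis is the one-sided hypergeometric test, sketched below under that assumption. The function name and arguments are hypothetical, and multiple-testing correction across terms is left out.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn from n_background genes,
    of which n_in_term are annotated to the GO term or KEGG pathway."""
    upper = min(n_in_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

A small p-value means that the selected gene set overlaps the annotated set more than would be expected from a random draw of the same size from the background.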

Updated: 2021-08-31