Prediction of chemoresistance trait of cancer cell lines using machine learning algorithms and systems biology analysis
Journal of Big Data (IF 8.1), Pub Date: 2021-07-05, DOI: 10.1186/s40537-021-00477-z
Atousa Ataei 1 , Albert A. Rizvanov 1 , Niloufar Seyed Majidi 2 , S. Shahriar Arab 2 , Javad Zahiri 3 , Mehrdad Rostami 4

Most current cancer treatments are invasive and cause a broad spectrum of side effects. Furthermore, cancer drug resistance, known as chemoresistance, is a major obstacle during treatment. This study aims to predict the resistance of several cancer cell lines to the drug cisplatin. Data were obtained from the NCBI GEO database, normalized, and corrected for batch effects with the ComBat software. To select appropriate features for machine learning, feature selection/reduction was performed using the Fisher Score method. Six different machine learning algorithms were then used to distinguish cisplatin-resistant from cisplatin-sensitive samples in cancer cell lines. In addition, differentially expressed genes (DEGs) between all sensitive and resistant samples were identified. The selected genes were tested for enrichment in biological pathways using the Enrichr database. Topological analysis was then performed on the constructed networks using Cytoscape software. Finally, the biological roles of the genes produced by these analyses were investigated through a literature review. Among the six classifiers trained to distinguish cisplatin-resistant samples from sensitive ones, the KNN and Naïve Bayes algorithms were proposed as the most suitable classifiers according to the calculated measures. Furthermore, the systems biology analysis identified several potential chemoresistance genes, among which PTGER3, YWHAH, CTNNB1, ANKRD50, EDNRB, ACSL6, and IFNG are topologically more important than the others. These predictions pave the way for further experimental research.
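The abstract names the Fisher Score method for feature selection but gives no formula or implementation. As a rough illustration only, the sketch below uses one common textbook definition of the Fisher score (class-size-weighted spread of the class means divided by class-size-weighted within-class variance) to rank features. The function names `fisher_scores` and `top_k_features` and the toy expression values are hypothetical and are not taken from the paper, which may have used a different variant or tool.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature (column) of X.

    X: list of samples, each a list of numeric feature values.
    y: list of class labels, one per sample.
    Assumed definition (not confirmed by the paper):
        sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # weighted spread of the class means
        within = 0.0   # weighted variance inside each class
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class variance: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Return the indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: 4 samples x 3 features (not real expression values).
X_toy = [
    [5.0, 1.0, 3.0],
    [5.2, 3.0, 3.1],
    [1.0, 2.0, 2.9],
    [1.2, 4.0, 3.0],
]
y_toy = ["resistant", "resistant", "sensitive", "sensitive"]

print(top_k_features(X_toy, y_toy, 2))
```

In this toy data the first feature separates the two classes far better than the others, so it ranks first. The retained feature indices would then be used to subset the data before training classifiers such as KNN or Naïve Bayes.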




Updated: 2021-07-05