Development of an absolute assignment predictor for triple-negative breast cancer subtyping using machine learning approaches
Computers in Biology and Medicine (IF 7.7), Pub Date: 2020-12-09, DOI: 10.1016/j.compbiomed.2020.104171
Fadoua Ben Azzouz 1, Bertrand Michel 2, Hamza Lasla 1, Wilfried Gouraud 1, Anne-Flore François 3, Fabien Girka 3, Théo Lecointre 3, Catherine Guérin-Charbonnel 1, Philippe P Juin 4, Mario Campone 5, Pascal Jézéquel 6

Triple-negative breast cancer (TNBC) heterogeneity is one of the main obstacles to precision medicine for this disease. Recent concordant transcriptomics studies have shown that TNBC can be divided into at least three subtypes with potential therapeutic implications. Although a few studies have attempted to predict TNBC subtype from transcriptomics data, the resulting subtyping was only partially sensitive and was limited by batch effects and by dependence on a given dataset, which may hinder the transition to routine diagnostic testing. Therefore, we sought to build an absolute predictor (i.e., intra-patient diagnosis) based on machine learning algorithms with a limited number of probes. To that end, we first introduced binary comparisons between probes for each patient (indicators) and based the predictive analysis on this transformed data. Probe selection combined filter and wrapper methods for variable selection, using cross-validation. We then tested three prediction models (random forest, gradient boosting [GB], and extreme gradient boosting) using this optimal subset of indicators as inputs. Nested cross-validation consistently allowed us to choose the best model. The results showed that the fifty selected indicators highlighted the biological characteristics associated with each TNBC subtype. The GB model based on this subset of indicators performed better than the other models.
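To illustrate the idea of per-patient probe binary comparison, here is a minimal Python sketch. The abstract does not give the exact definition of an indicator, so the sketch assumes the simplest form: for a pair of probes, the indicator is 1 when the first probe is expressed above the second in that patient, and 0 otherwise. The function name `pairwise_indicators`, the probe names, and the expression values are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_indicators(expression):
    """Return {(probe_a, probe_b): 0 or 1} for a single patient.

    `expression` maps probe name -> expression value for one sample.
    Assumption (not stated in the abstract): the indicator equals 1 when
    probe_a is expressed strictly above probe_b, and 0 otherwise.
    """
    probes = sorted(expression)
    return {
        (a, b): int(expression[a] > expression[b])
        for a, b in combinations(probes, 2)
    }


# Hypothetical probe names and values, for illustration only.
patient = {"probe_A": 7.2, "probe_B": 5.1, "probe_C": 9.8}
indicators = pairwise_indicators(patient)

# An increasing per-sample distortion (here a scale and an offset, as a
# stand-in for a batch effect) leaves every indicator unchanged, because
# the ordering of the probes within the patient is preserved.
shifted = {probe: 1.5 * value + 2.0 for probe, value in patient.items()}
same_after_shift = pairwise_indicators(shifted) == indicators
```

Because each indicator depends only on the ordering of probes within one sample, this kind of feature can be computed for a single patient without reference to a cohort, which is consistent with the "absolute" (intra-patient) goal described above. Whether the authors used exactly this encoding is an assumption.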




Updated: 2020-12-11