Predictability of drug-induced liver injury by machine learning.
Biology Direct (IF 5.7), Pub Date: 2020-02-13, DOI: 10.1186/s13062-020-0259-4
Marco Chierici 1, Margherita Francescatto 1, Nicole Bussola 1,2, Giuseppe Jurman 1, Cesare Furlanello 1

BACKGROUND Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life-threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessment of Massive Data Analysis (CAMDA) group proposed the CMap Drug Safety challenge, which focuses on DILI prediction.

METHODS AND RESULTS The challenge data included Affymetrix GeneChip expression profiles for the two cancer cell lines MCF7 and PC3, treated with 276 drug compounds and empty vehicles. Binary DILI labels and a recommended train/test split for developing predictive classifiers were also provided. We devised three deep learning architectures for DILI prediction on the challenge data and compared them to random forest and multi-layer perceptron classifiers. On a subset of the data, and for some of the models, we additionally tested several strategies for balancing the two DILI classes and for identifying alternative informative train/test splits. All the models were trained with the MAQC data analysis protocol (DAP), i.e., 10x5 cross-validation over the training set. In all the experiments, classification performance in both cross-validation and external validation gave Matthews correlation coefficient (MCC) values below 0.2. We observed minimal differences between the two cell lines. Notably, deep learning approaches did not improve classification performance.

DISCUSSION We extensively tested multiple machine learning approaches for the DILI classification task and obtained poor to mediocre performance. The results suggest that the CMap expression data on the two cell lines MCF7 and PC3 are not sufficient for accurate DILI label prediction.

REVIEWERS This article was reviewed by Maciej Kandula and Paweł P. Labaj.
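To make the evaluation terms in the abstract concrete, the following is a minimal sketch of the binary MCC and of a repeated stratified cross-validation split. It is not the authors' code. It reads "10x5" as 10 repetitions of 5-fold cross-validation, which is an assumption here, and the names `mcc` and `repeated_stratified_folds`, the seed, and the toy labels are all hypothetical.

```python
import math
import random
from collections import defaultdict


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when a row or column of the confusion
    # matrix is empty and the ratio is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def repeated_stratified_folds(labels, n_repeats=10, n_folds=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx), keeping class ratios per fold.

    Assumption: "10x5" means n_repeats=10 reshuffles of a 5-fold split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for r in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold gets its share.
            for k, i in enumerate(shuffled):
                folds[k % n_folds].append(i)
        for f in range(n_folds):
            test = sorted(folds[f])
            train = sorted(i for g in range(n_folds) if g != f for i in folds[g])
            yield r, f, train, test


if __name__ == "__main__":
    toy_labels = [0] * 10 + [1] * 5  # synthetic labels, not challenge data
    splits = list(repeated_stratified_folds(toy_labels))
    print(len(splits), "train/test splits")
    # A classifier that always predicts one class has an MCC of 0.
    print(mcc(toy_labels, [0] * len(toy_labels)))
```

MCC ranges from -1 to 1, and a constant or random predictor scores around 0, so the reported values below 0.2 indicate performance only slightly better than chance even when the two DILI classes are imbalanced.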

Updated: 2020-04-22