An ensemble learning approach for modeling the systems biology of drug-induced injury
Biology Direct ( IF 5.5 ) Pub Date : 2021-01-12 , DOI: 10.1186/s13062-020-00288-x
Joaquim Aguirre-Plans 1 , Janet Piñero 1 , Terezinha Souza 2 , Giulia Callegaro 3 , Steven J Kunnen 3 , Ferran Sanz 1 , Narcis Fernandez-Fuentes 4, 5 , Laura I Furlong 1 , Emre Guney 1 , Baldo Oliva 1

Drug-induced liver injury (DILI) is an adverse reaction in which commonly used drugs damage the liver. DILI is estimated to affect around 20 in 100,000 inhabitants worldwide each year. Despite being one of the main causes of liver failure, the pathophysiology and mechanisms of DILI are poorly understood. In the present study, we developed an ensemble learning approach based on different features (CMap gene expression, chemical structures, drug targets) to predict drugs that might cause DILI and to gain a better understanding of the mechanisms linked to the adverse reaction. We searched for gene signatures in CMap gene expression data using two approaches: phenotype-gene association data from DisGeNET, and a non-parametric test comparing the gene expression of DILI-Concern and No-DILI-Concern drugs (as per DILIrank definitions). The average accuracy of the classifiers in both approaches was 69%. Using chemical structures as features, we obtained an accuracy of 65%. Combining both types of features produced an accuracy of around 63%, but improved the accuracy on the independent hold-out test to 67%. Using drug-target associations as features yielded the best accuracy (70%) in the independent hold-out test. With CMap gene expression data, searching for a specific gene signature among the landmark genes improves the quality of the classifiers, but the classifiers remain limited by the intrinsic noise of the dataset. With chemical structures as features, the structural diversity of the known DILI-causing drugs hampers prediction, a problem similar to the one encountered with gene expression information. Combining both feature types did not improve the quality of the classifiers but increased their robustness, as shown on independent hold-out tests. Using drug-target associations as features improved the prediction, especially the specificity, and the results were comparable to those of previous studies.
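
As a rough illustration of how predictions from classifiers trained on different feature types could be combined and scored, here is a minimal sketch. The abstract does not say how the ensemble combines its members, so the majority vote below is an assumption, not the authors' method. The function names, the feature-type keys, and the toy labels are all hypothetical and carry no real data from the study.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(predictions: Dict[str, List[int]]) -> List[int]:
    """Combine per-feature-type predictions by majority vote.

    Labels: 1 = DILI-Concern, 0 = No-DILI-Concern.
    Assumption: simple unweighted voting; the source does not specify this.
    """
    models = list(predictions.values())
    n_drugs = len(models[0])
    combined = []
    for i in range(n_drugs):
        votes = Counter(model[i] for model in models)
        # Ties resolve to 1 (flag the drug); an arbitrary choice for this sketch.
        combined.append(1 if votes[1] >= votes[0] else 0)
    return combined


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of drugs whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def specificity(y_true: List[int], y_pred: List[int]) -> float:
    """True negatives divided by all actual negatives (No-DILI-Concern drugs)."""
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tn / (tn + fp)


# Made-up toy labels for six hypothetical drugs (not data from the study).
y_true = [1, 1, 0, 0, 1, 0]
predictions = {
    "gene_expression": [1, 0, 0, 1, 1, 0],
    "chemical_structure": [1, 1, 1, 0, 0, 0],
    "drug_targets": [0, 1, 0, 0, 1, 0],
}

ensemble = majority_vote(predictions)
print("ensemble accuracy:", accuracy(y_true, ensemble))
print("ensemble specificity:", specificity(y_true, ensemble))
```

Specificity is reported separately from accuracy because the abstract singles it out as the metric most improved by drug-target features; it measures how rarely a No-DILI-Concern drug is wrongly flagged.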

Updated: 2021-01-13