Predicting Cell-Penetrating Peptides: Building and Interpreting Random Forest based prediction Models
bioRxiv - Bioinformatics Pub Date : 2020-10-16 , DOI: 10.1101/2020.10.15.341149
Shilpa Yadahalli , Chandra S. Verma

Targeting intracellular pathways with peptide drugs is increasingly desirable, but its application is often limited by the poor cell permeability of peptides. Understanding the cellular permeability of peptides remains a major challenge, and very few structure-activity relationships are known. Fortunately, there exists a class of peptides called Cell-Penetrating Peptides (CPPs), which can cross cell membranes and can also deliver biologically active cargo into cells. Discovering the patterns that make peptides cell-permeable has a variety of applications in drug delivery. In the current study, we build prediction models for CPPs, exploring features that cover a range of properties based on amino acid sequences, using Random Forest classifiers, which are often more interpretable than other ensemble machine learning algorithms. While obtaining prediction accuracies of ~96%, we also interpret our prediction models using TreeInterpreter, LIME and SHAP to decipher the contributions of important features and the optimal feature space for the CPP class. We propose that our work might offer an intuitive guide for incorporating features that impart cell-penetrability into the design of novel CPPs.
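To make "features based on amino acid sequences" concrete, here is a minimal sketch of how a peptide string might be turned into a numeric feature vector before being passed to a classifier. The abstract does not list the actual features used, so the function name `sequence_features`, the choice of features (per-residue fractions, length, a crude net charge from K/R versus D/E counts) and the example peptide (the well-known TAT 47-57 sequence) are illustrative assumptions, not the authors' method.

```python
from collections import Counter
from typing import Dict

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_features(seq: str) -> Dict[str, float]:
    """Hypothetical feature extractor: simple sequence-derived descriptors.

    These descriptors are assumptions chosen for illustration; the
    abstract does not specify which features the study used.
    """
    seq = seq.upper()
    if not seq or any(ch not in AMINO_ACIDS for ch in seq):
        raise ValueError("sequence must be non-empty and use the 20 standard residues")

    counts = Counter(seq)
    n = len(seq)

    # Fraction of each residue type (amino acid composition).
    features = {f"frac_{aa}": counts[aa] / n for aa in AMINO_ACIDS}
    features["length"] = float(n)
    # Crude net charge: cationic (K, R) minus anionic (D, E) residues.
    # Histidine and terminal groups are deliberately ignored here.
    features["net_charge"] = float(
        counts["K"] + counts["R"] - counts["D"] - counts["E"]
    )
    features["frac_cationic"] = (counts["K"] + counts["R"]) / n
    return features


# Usage: the TAT (47-57) peptide, a widely cited CPP.
tat = sequence_features("YGRKKRRQRRR")
print(tat["length"], tat["net_charge"], round(tat["frac_R"], 3))
```

A table of such vectors, one row per peptide with a CPP/non-CPP label, is the kind of input a Random Forest classifier would typically be trained on. Per-feature attributions from tools such as TreeInterpreter, LIME or SHAP would then refer back to columns like `net_charge` or `frac_R`.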

Updated: 2020-10-17