Prediction of cell penetrating peptides and their uptake efficiency using random forest-based feature selections
AIChE Journal (IF 3.5), Pub Date: 2022-05-25, DOI: 10.1002/aic.17781
Peng Liu, Yijie Ding, Ying Rong, Dong Chen

Cell penetrating peptides (CPPs) are short peptides that can carry biomolecules of varying sizes across the cell membrane into the cytoplasm. Correctly identifying CPPs is the basis for studying their functions and mechanisms. Here, we propose a novel predictor that identifies CPPs and predicts their uptake efficiency. In our method, five feature descriptors encode each sequence, and their outputs are combined into a hybrid feature vector. A wrapper method built on a random forest then couples feature selection with the prediction process to find the features that are crucial for identifying CPPs. Jackknife cross-validation shows that our predictor is comparable to state-of-the-art CPP predictors. Our method also reduces the feature dimension, which improves computational efficiency and avoids overfitting, so the predictor can be applied to large-scale CPP datasets.
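
The pipeline described in the abstract (encode, select features with a wrapper, score by jackknife) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the five descriptors, so two simple ones (amino acid composition, plus net charge and length) stand in for them; a 1-nearest-neighbour classifier stands in for the random forest to keep the example dependency-free; the greedy forward search is only one possible wrapper strategy; and the peptide strings are synthetic toy data, not sequences from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_composition(seq):
    """Fraction of each of the 20 standard residues (assumed descriptor)."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def net_charge_and_length(seq):
    """Toy descriptor: (K + R - D - E) / length, and the length itself."""
    positive = sum(seq.count(a) for a in "KR")
    negative = sum(seq.count(a) for a in "DE")
    return [(positive - negative) / len(seq), float(len(seq))]

def encode(seq):
    """Hybrid feature vector: concatenation of the descriptor outputs."""
    return amino_acid_composition(seq) + net_charge_and_length(seq)

def jackknife_accuracy(X, y, cols):
    """Leave-one-out accuracy of a 1-NN stand-in classifier using only `cols`."""
    correct = 0
    for i in range(len(X)):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((X[i][c] - X[j][c]) ** 2 for c in cols),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def wrapper_forward_selection(X, y, max_features):
    """Greedy wrapper: add the column that most improves jackknife accuracy."""
    selected, best_score = [], 0.0
    while len(selected) < min(max_features, len(X[0])):
        candidates = [c for c in range(len(X[0])) if c not in selected]
        # Ties in accuracy go to the lowest column index.
        score, neg_col = max(
            (jackknife_accuracy(X, y, selected + [c]), -c) for c in candidates
        )
        if score <= best_score:
            break  # no candidate improves the score, so stop early
        selected.append(-neg_col)
        best_score = score
    return selected, best_score

# Synthetic toy peptides (hypothetical): 1 = CPP-like, 0 = non-CPP-like.
peptides = ["RRRRRRRR", "KKRRKKRR", "RKRKRKRA", "GRKKRRQR",
            "DEDEDEDE", "AAGGAAGG", "EEDDSSGG", "DSGEDSGA"]
y = [1, 1, 1, 1, 0, 0, 0, 0]
X = [encode(p) for p in peptides]

selected, score = wrapper_forward_selection(X, y, max_features=3)
print("selected columns:", selected, "jackknife accuracy:", score)
```

The point of the sketch is the coupling the abstract describes: each candidate feature subset is judged by the cross-validated accuracy of the classifier itself, so the search keeps only the columns that help prediction and the final model works in a lower-dimensional space.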

Updated: 2022-05-25