iPredCNC: Computational prediction model for cancerlectins and non-cancerlectins using novel cascade features subset selection
Chemometrics and Intelligent Laboratory Systems (IF 3.7), Pub Date: 2019-12-01, DOI: 10.1016/j.chemolab.2019.103876
Zaheer Ullah Khan, Farman Ali, Irfan Ahmad, Maqsood Hayat, Dechang Pi

Abstract: Lectins are a special class of proteins that play a crucial role in tumor cell differentiation because of their strong binding affinity for certain saccharide (carbohydrate) groups. They are also closely related to proteins that initiate tumor cell survival, growth, metastasis, carcinoma, and different stages of tumor development. Differentiating the specific functions of proteins remains challenging in the post-genomic era. This task is vital in therapeutic cancer studies, but the relevant wet-lab experiments are expensive and time-consuming. To address this, several computational sequence-based methods have been proposed to differentiate the specific functions of proteins. In the current study, we developed a fast and accurate machine learning model for cancerlectins, based on cascade feature selection and several sequence-based feature descriptors. With a multilayer perceptron, the model yielded 85.21% accuracy, 87.84% sensitivity, 81.92% specificity, and an AUC of 0.922 under k-fold and stratified k-fold cross-validation tests. These empirical results support the validity and robustness of the proposed model compared with existing approaches. The proposed methodology could be a handy tool in cancer therapeutics research, drug design, and academic studies. All source code and data for this manuscript are freely available via http://www.github.com/zaheeerkhancs/iPredCNC .
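For readers unfamiliar with the evaluation terms in the abstract, the following is a minimal sketch of stratified k-fold splitting and of the accuracy, sensitivity, and specificity metrics. It is not the authors' code: the function names `stratified_k_fold` and `binary_metrics`, the round-robin fold assignment, and the toy labels are all assumptions made for illustration, and the feature selection, the multilayer perceptron, and the AUC computation are left out.

```python
from collections import defaultdict


def stratified_k_fold(labels, k):
    """Yield (train_idx, test_idx) pairs whose folds keep roughly the class ratio.

    Hypothetical helper for illustration; not taken from the iPredCNC repository.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the k folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (recall on positives), and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Toy data (assumed): 1 = cancerlectin, 0 = non-cancerlectin.
toy_labels = [1] * 6 + [0] * 4
splits = list(stratified_k_fold(toy_labels, k=2))
toy_metrics = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

In a full evaluation, a classifier would be trained on each `train_idx` subset and scored on the matching `test_idx` subset, with the per-fold metrics averaged at the end.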

Updated: 2019-12-01