iBLP: An XGBoost-Based Predictor for Identifying Bioluminescent Proteins
Computational and Mathematical Methods in Medicine Pub Date : 2021-01-07 , DOI: 10.1155/2021/6664362
Dan Zhang 1 , Hua-Dong Chen 2 , Hasan Zulfiqar 1 , Shi-Shi Yuan 1 , Qin-Lai Huang 1 , Zhao-Yue Zhang 1 , Ke-Jun Deng 1

Bioluminescent proteins (BLPs) are a class of proteins that are widely distributed in many living organisms and emit light through various mechanisms, including bioluminescence and chemiluminescence. Bioluminescence is commonly used in analytical methods for studying cellular processes, such as gene expression analysis, drug discovery, cellular imaging, and toxicity determination. However, identifying bioluminescent proteins is challenging because they share poor sequence similarity with one another. In this paper, we briefly review the development of computational identification of BLPs and then propose a novel prediction framework for identifying BLPs, based on the eXtreme gradient boosting algorithm (XGBoost) and sequence-derived features. To train the models, we collected BLP data from bacteria, eukaryotes, and archaea. To obtain more effective prediction models, we then examined the performance of different feature extraction methods and their combinations, as well as of different classification algorithms. Finally, based on the optimal model, a novel predictor named iBLP was constructed to identify BLPs. The robustness of iBLP was demonstrated by experiments on training and independent datasets. Comparison with other published methods further demonstrated that the proposed method is powerful and could provide good performance for BLP identification. The webserver and software package for BLP identification are freely available at http://lin-group.cn/server/iBLP.
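To make the phrase "sequence-derived features" concrete, the following is a minimal sketch of two common encodings of a protein sequence as a fixed-length numeric vector. The abstract does not say which feature extraction methods were examined, so amino acid composition and gapped dipeptide composition are assumptions chosen only for illustration, and the function names are hypothetical rather than part of the iBLP software.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines feature positions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def amino_acid_composition(sequence):
    """Return 20 values: the fraction of each standard residue in the sequence.

    Illustrative feature only (assumption); non-standard letters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in INDEX)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def dipeptide_composition(sequence, gap=0):
    """Return 400 values: the fraction of each residue pair separated by `gap`.

    gap=0 counts adjacent residues; gap=1 skips one residue between the pair.
    Pairs containing a non-standard letter are ignored.
    """
    seq = sequence.upper()
    features = [0.0] * (len(AMINO_ACIDS) ** 2)
    n_pairs = 0
    for i in range(len(seq) - gap - 1):
        first, second = seq[i], seq[i + gap + 1]
        if first in INDEX and second in INDEX:
            features[INDEX[first] * len(AMINO_ACIDS) + INDEX[second]] += 1.0
            n_pairs += 1
    if n_pairs == 0:
        return features
    return [value / n_pairs for value in features]


def encode(sequence):
    """Concatenate both encodings into one 420-dimensional feature vector."""
    return amino_acid_composition(sequence) + dipeptide_composition(sequence)
```

Vectors produced this way for labeled BLP and non-BLP sequences could then be passed to a gradient boosting classifier such as `xgboost.XGBClassifier`; the classifier settings and the feature combination actually used by iBLP are not given in the abstract.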

Updated: 2021-01-07