iPseU-Layer: Identifying RNA Pseudouridine Sites Using Layered Ensemble Model.
Interdisciplinary Sciences: Computational Life Sciences (IF 4.8), Pub Date: 2020-03-13, DOI: 10.1007/s12539-020-00362-y
Yashuang Mu 1,2, Ruijun Zhang 3, Lidong Wang 3, Xiaodong Liu 4

Pseudouridine is one of the most prevalent post-transcriptional RNA modifications. Identifying pseudouridine sites is an essential step toward understanding RNA functions, RNA structure stabilization, the translation process, and RNA stability; however, high-throughput experimental techniques remain expensive and time-consuming in lab explorations and biochemical processes. Developing an efficient machine-learning-based method for identifying pseudouridine sites is therefore important for both academic research and drug development. Motivated by this, we present an effective layered ensemble model, designated iPseU-Layer, for the identification of RNA pseudouridine sites. The proposed iPseU-Layer approach is built on three machine learning layers: a feature selection layer, a feature extraction and fusion layer, and a prediction layer. The feature selection layer reduces the dimensionality and can be regarded as a data pre-processing stage. The feature extraction and fusion layer uses an ensemble method, implemented through various machine learning algorithms, to generate outputs. The prediction layer applies a classic random forest to produce the final predictions. Furthermore, we systematically validate the model using cross-validation tests and an independent test, comparing it with current state-of-the-art models. The proposed iPseU-Layer shows promising predictive performance in terms of sensitivity, specificity, accuracy and Matthews correlation coefficient. Collectively, these findings indicate that the iPseU-Layer framework is a feasible and effective strategy for the prediction of RNA pseudouridine sites.
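The four evaluation measures named in the abstract have standard definitions for binary classification. The sketch below computes them from confusion-matrix counts. The function name `binary_metrics`, the dictionary keys, and the example counts are illustrative assumptions; they are not the paper's code or results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, accuracy and MCC from confusion-matrix counts.

    tp/fn count true pseudouridine sites predicted as positive/negative;
    tn/fp count non-sites predicted as negative/positive.
    Undefined ratios (zero denominator) are reported as 0.0 by convention here.
    """
    sn = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity (recall on positives)
    sp = tn / (tn + fp) if (tn + fp) else 0.0          # specificity (recall on negatives)
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0          # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0  # Matthews correlation coefficient
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts for illustration only; not results reported in the paper.
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(example)
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts, so it is less easily inflated when positive and negative sites are imbalanced.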

Updated: 2020-03-13