CSBPI_Site: Multi-Information Sources of Features to RNA Binding Sites Prediction
Current Bioinformatics ( IF 4 ) Pub Date : 2021-05-31 , DOI: 10.2174/1574893615666210108093950
Lichao Zhang 1 , Zihong Huang 1 , Liang Kong 2

Background: RNA-binding proteins establish posttranscriptional gene regulation by coordinating the maturation, editing, transport, stability, and translation of cellular RNAs. Immunoprecipitation experiments can identify interactions between RNA and proteins, but they are limited by the experimental environment and materials. It is therefore essential to construct computational models that identify the functional sites.

Objective: Although some computational methods have been proposed to predict RNA binding sites, the accuracy could be further improved. Moreover, it is necessary to construct a dataset with more samples to design a reliable model. Here we present a computational model based on multi-information sources to identify RNA binding sites.

Methods: We construct an accurate computational model named CSBPI_Site, based on extreme gradient boosting. The specifically designed 15-dimensional feature vector captures four types of information (chemical shift, chemical bond, chemical properties and position information).

Results: Leave-one-out cross-validation yielded a satisfactory accuracy of 0.86 and an AUC of 0.89. The accuracies differed only slightly (ranging from 0.83 to 0.85) among the three classifier algorithms, which shows that the novel features are stable and suit multiple classifiers. These results show that the proposed method is effective and robust for identifying noncoding RNA binding sites.
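
The leave-one-out protocol behind these accuracy and AUC figures can be sketched as follows. This is only a minimal illustration under loud assumptions: the data are synthetic 15-dimensional vectors, and a nearest-centroid scorer stands in for extreme gradient boosting so that the example needs nothing beyond the Python standard library. All function names (`make_toy_data`, `score_positive`, and so on) are hypothetical and are not taken from CSBPI_Site.

```python
import random


def make_toy_data(n_per_class=20, n_features=15, seed=0):
    """Synthetic stand-in data (NOT the paper's dataset): two classes of 15-dim vectors."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            # Class 1 is shifted by +1.0 in every feature so the toy task is learnable.
            X.append([rng.gauss(label * 1.0, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def centroid(rows):
    """Feature-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def score_positive(train_X, train_y, x):
    """Stand-in classifier (assumption): nearest-centroid score, larger means more like class 1."""
    c0 = centroid([r for r, t in zip(train_X, train_y) if t == 0])
    c1 = centroid([r for r, t in zip(train_X, train_y) if t == 1])
    return sq_dist(x, c0) - sq_dist(x, c1)


def leave_one_out_scores(X, y):
    """Hold out each sample once, fit on the remaining samples, and score the held-out one."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        scores.append(score_positive(train_X, train_y, X[i]))
    return scores


def accuracy(y, scores, threshold=0.0):
    """Fraction of samples whose thresholded score matches the true label."""
    preds = [1 if s > threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def auc(y, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


X, y = make_toy_data()
loo_scores = leave_one_out_scores(X, y)
print(f"LOO accuracy: {accuracy(y, loo_scores):.2f}")
print(f"LOO AUC: {auc(y, loo_scores):.2f}")
```

Because every sample is scored by a model that never saw it, the pooled scores give one accuracy and one AUC for the whole dataset. Swapping in a gradient-boosting classifier would only change `score_positive`; the evaluation loop stays the same.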

Conclusion: Our method based on multi-information sources effectively represents binding-site information among ncRNAs. The satisfactory prediction results for the Diels-Alder ribozyme obtained with CSBPI_Site indicate that our model is valuable for identifying functional sites.




Updated: 2021-05-31