Prediction of N7-methylguanosine sites in human RNA based on optimal sequence features.
Genomics (IF 3.4), Pub Date: 2020-07-25, DOI: 10.1016/j.ygeno.2020.07.035
Yu-He Yang 1, Chi Ma 1, Jia-Shu Wang 1, Hui Yang 1, Hui Ding 1, Shu-Guang Han 1, Yan-Wen Li 2

N7-methylguanosine (m7G) is a ubiquitous post-transcriptional RNA modification that is vital for maintaining RNA function and protein translation. Computational tools can make it easier to predict m7G sites in RNA sequences. In this work, we designed a sequence-based method to identify m7G sites in human RNA sequences. First, several kinds of sequence features were extracted to encode m7G and non-m7G samples. Subsequently, we used mRMR, F-score, and Relief to obtain the optimal feature subset, that is, the one producing the maximum prediction accuracy. In 10-fold cross-validation, a support vector machine (SVM) achieved the highest accuracy, 94.67%, for identifying m7G sites in the human genome. We also examined the performance of other algorithms and found that the SVM-based model outperformed them. These results indicate that the predictor could be a useful tool for studying m7G. A prediction model is available at https://github.com/MapFM/m7g_model.git.
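
The encoding and ranking steps described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not name the sequence features used, so the k-mer frequency encoding here is an assumption, and the F-score follows one commonly used formulation (between-class mean separation divided by the sum of within-class sample variances), which may differ from the paper's variant. The function names and the toy sequences are hypothetical, and the SVM training and 10-fold cross-validation steps are omitted.

```python
from itertools import product
from statistics import mean, variance

ALPHABET = "ACGU"  # RNA bases; an assumption about how samples are written


def kmer_frequencies(seq, k=2):
    """Encode a sequence as normalized k-mer frequencies (illustrative encoding)."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def f_score(pos_values, neg_values):
    """F-score of one feature: class-mean separation over within-class variance."""
    overall = mean(pos_values + neg_values)
    numerator = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    # statistics.variance is the sample variance (divides by n - 1)
    denominator = variance(pos_values) + variance(neg_values)
    if denominator == 0:
        # No within-class spread: perfectly separated if the means differ
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def rank_features(pos_matrix, neg_matrix):
    """Return (feature indices sorted by descending F-score, list of scores)."""
    n_features = len(pos_matrix[0])
    scores = [
        f_score([row[j] for row in pos_matrix], [row[j] for row in neg_matrix])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy, made-up sequences purely for demonstration (not real m7G data)
positives = ["GGAGGACGG", "AGGAGGUGG", "GGAGGAGGC"]
negatives = ["UUCUUACUU", "CUUAUUCUU", "UUAUUCUUA"]

pos_features = [kmer_frequencies(s) for s in positives]
neg_features = [kmer_frequencies(s) for s in negatives]
order, scores = rank_features(pos_features, neg_features)
top_features = order[:5]  # indices of the five highest-ranked 2-mers
```

In a fuller pipeline, the top-ranked features would be added to the classifier one at a time, and the subset giving the best cross-validated accuracy would be kept as the optimal subset.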




Updated: 2020-07-31