A comprehensive comparison and analysis of computational predictors for RNA N6-methyladenosine sites of Saccharomyces cerevisiae.
Briefings in Functional Genomics (IF 4), Pub Date: 2019-11-19, DOI: 10.1093/bfgp/elz018
Xiaolei Zhu 1, 2 , Jingjing He 2 , Shihao Zhao 1 , Wei Tao 1 , Yi Xiong 3 , Shoudong Bi 1

N6-methyladenosine (m6A) modification, one of the most common post-transcriptional modifications in RNA, has been reported to be closely associated with many biological processes. Over the past decade, several tools for predicting m6A sites in Saccharomyces cerevisiae have been developed and made freely available online. However, the quality of their predictions is difficult to quantify and compare. In this study, an independent dataset, M6Atest6540, was compiled to systematically evaluate nine publicly available m6A prediction tools for S. cerevisiae. The results indicate that RAM-ESVM achieved the best performance on M6Atest6540; however, most models performed substantially worse than reported in their original papers. The benchmark dataset Met2614, which was used as the training dataset for the nine methods, was further analyzed using a position bias index. The analysis showed that the bias of Met2614 differs significantly from that of the RNA segments around m6A sites recorded in RMBase. Moreover, a new dataset, newMet2614, was collected by randomly selecting RNA segments from non-redundant data recorded in RMBase, and three different kinds of features were extracted. Comparing the models built on Met2614 and newMet2614 with these features shows that models built on newMet2614 generalize better. Our results also indicate that position-specific propensity-based features outperform other features, although they are also easily over-fitted on a biased dataset.
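
The abstract does not say how the position-specific propensity-based features are computed. As an illustration only, the sketch below assumes one common formulation: for each position in a fixed-length RNA segment, take the frequency of each nucleotide among positive (m6A) samples minus its frequency among negative samples, then encode a segment by looking up the value for its nucleotide at each position. The function names and the toy 3-nt segments are hypothetical and are not taken from the paper or from any of the nine tools.

```python
from collections import Counter

NUCLEOTIDES = "ACGU"


def position_frequencies(segments):
    """Per-position nucleotide frequencies for equal-length RNA segments."""
    length = len(segments[0])
    if any(len(s) != length for s in segments):
        raise ValueError("all segments must have the same length")
    table = []
    for i in range(length):
        counts = Counter(s[i] for s in segments)
        table.append({n: counts[n] / len(segments) for n in NUCLEOTIDES})
    return table


def propensity_table(positives, negatives):
    """Assumed propensity: positive-set frequency minus negative-set frequency."""
    pos = position_frequencies(positives)
    neg = position_frequencies(negatives)
    if len(pos) != len(neg):
        raise ValueError("positive and negative segments differ in length")
    return [{n: p[n] - q[n] for n in NUCLEOTIDES} for p, q in zip(pos, neg)]


def encode(segment, table):
    """Map a segment to one propensity value per position."""
    return [table[i][nt] for i, nt in enumerate(segment)]


# Toy data (hypothetical): 3-nt segments with the candidate A in the centre.
positives = ["GAC", "GAC", "AAC", "GAU"]
negatives = ["UAG", "CAA", "UAC", "GAG"]
table = propensity_table(positives, negatives)
print(encode("GAC", table))  # [0.5, 0.0, 0.5]
```

Because the lookup table is built from labelled training segments, features of this kind absorb whatever positional skew the training set carries, which is consistent with the abstract's remark that they over-fit easily on a biased dataset. In a cross-validation setting the table would need to be rebuilt from the training folds only.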

Updated: 2019-11-01