miRDetect: A combinatorial approach for automated detection of novel miRNA precursors from plant EST data using homology and Random Forest classification.
Genomics (IF 4.4), Pub Date: 2020-05-05, DOI: 10.1016/j.ygeno.2020.05.002
Garima Ayachit 1 , Himanshu Pandya 1 , Jayashankar Das 2

Identifying microRNAs (miRNAs) in plants is a crucial step toward understanding pathway mechanisms and gene regulation. A number of tools have been developed for detecting miRNAs from small RNA-seq data. However, pipelines for detecting miRNAs from EST datasets are lacking, even though a large EST resource is publicly available and the detection method is known. Here we present miRDetect, a Python implementation that detects novel miRNA precursors from plant EST data by combining homology with machine learning. Using 112 features, 10-fold cross-validation was applied to choose the best classifier based on ROC, accuracy, MCC, and F1 score. miRDetect achieved a classification accuracy of 93.35% with a Random Forest classifier and outperformed other precursor detection tools. The miRDetect pipeline aids in identifying novel plant miRNA precursors using this mixed approach and should be helpful to researchers with a limited informatics background.
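The model-selection step described above (scoring candidate classifiers over cross-validation folds by accuracy, MCC, and F1) can be illustrated with a minimal sketch. This is not the miRDetect code: the function names `fold_metrics` and `rank_classifiers`, the confusion counts, and the choice to rank by mean MCC are all assumptions made for illustration, since the abstract does not state how the metrics were combined.

```python
import math
from statistics import mean


def fold_metrics(tp, fp, tn, fn):
    """Return accuracy, F1 and MCC for one fold's confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


def rank_classifiers(folds_by_model):
    """Average per-fold metrics for each model and pick the highest mean MCC.

    Ranking by MCC alone is an assumption of this sketch, not the paper's rule.
    """
    summary = {}
    for name, folds in folds_by_model.items():
        per_fold = [fold_metrics(*counts) for counts in folds]
        summary[name] = {
            key: mean(m[key] for m in per_fold)
            for key in ("accuracy", "f1", "mcc")
        }
    best = max(summary, key=lambda name: summary[name]["mcc"])
    return best, summary


if __name__ == "__main__":
    # Hypothetical (tp, fp, tn, fn) counts per fold; not data from the paper.
    demo = {
        "random_forest": [(45, 5, 45, 5), (46, 4, 44, 6)],
        "other_model": [(40, 10, 40, 10), (41, 9, 39, 11)],
    }
    best, summary = rank_classifiers(demo)
    print(best, summary[best])
```

MCC is used as the tie-breaking metric here because it accounts for all four confusion-matrix cells, which matters when precursor and non-precursor classes are imbalanced; a real comparison would also include ROC AUC, which requires classifier scores rather than hard counts.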




Updated: 2020-05-05