Novel Human miRNA-Disease Association Inference Based on Random Forest
Molecular Therapy - Nucleic Acids (IF 6.5) Pub Date: 2018-10-11, DOI: 10.1016/j.omtn.2018.10.005
Xing Chen 1, Chun-Chun Wang 1, Jun Yin 1, Zhu-Hong You 2

Since the first microRNA (miRNA) was discovered, many studies have confirmed associations between miRNAs and human complex diseases. Obtaining and exploiting miRNA-disease association information also plays an increasingly important role in improving the treatment of complex diseases. However, because traditional experimental methods are costly, many researchers have proposed computational methods to predict potential miRNA-disease associations. In this work, we developed a machine-learning model, Random Forest for miRNA-disease association (RFMDA) prediction. The training sample set for RFMDA was constructed from the Human microRNA Disease Database (HMDD) version 2.0, and the feature vectors representing miRNA-disease samples were defined by integrating miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity. The Random Forest algorithm was first employed to infer miRNA-disease associations. In addition, a filter-based method was implemented to select robust features from the miRNA-disease feature set, so that related miRNA-disease pairs could be efficiently distinguished from unrelated pairs. RFMDA achieved areas under the curve (AUCs) of 0.8891, 0.8323, and 0.8818 ± 0.0014 under global leave-one-out cross-validation, local leave-one-out cross-validation, and 5-fold cross-validation, respectively, which were higher than those of many previous computational models. To further evaluate the accuracy of RFMDA, we carried out three types of case studies on four human complex diseases. As a result, 43 (esophageal neoplasms), 46 (lymphoma), 47 (lung neoplasms), and 48 (breast neoplasms) of the top 50 predicted disease-related miRNAs were verified by experiments in the different kinds of case studies. The results of cross-validation and the case studies indicate that RFMDA is a reliable model for predicting miRNA-disease associations.
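
The abstract names Gaussian interaction profile kernel similarity as one ingredient of the feature vectors but does not give its formula. The sketch below uses the definition commonly found in the association-prediction literature, with the bandwidth normalized by the mean squared norm of the interaction profiles. That definition, the toy association matrix, the function names, and the way a pair's feature vector is assembled (concatenating one miRNA similarity row with one disease similarity row) are all assumptions for illustration, not details confirmed by this abstract. The functional and semantic similarities, the feature filter, and the Random Forest classifier itself are not shown.

```python
import math


def gip_kernel_similarity(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel similarity between the rows of a
    binary association matrix (assumed commonly used definition)."""
    n = len(profiles)
    # Bandwidth: gamma' divided by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = sim[j][i] = math.exp(-gamma * sq_dist)
    return sim


def pair_feature_vector(mirna_idx, disease_idx, mirna_sim, disease_sim):
    """Hypothetical feature layout: the miRNA's similarity row followed by
    the disease's similarity row."""
    return list(mirna_sim[mirna_idx]) + list(disease_sim[disease_idx])


# Toy data (made up): 4 miRNAs (rows) x 3 diseases (columns), 1 = known association.
toy_assoc = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
]

mirna_gip = gip_kernel_similarity(toy_assoc)
# Disease profiles are the columns of the association matrix.
disease_profiles = [list(col) for col in zip(*toy_assoc)]
disease_gip = gip_kernel_similarity(disease_profiles)

features = pair_feature_vector(0, 2, mirna_gip, disease_gip)
```

In a fuller pipeline, vectors like `features` would be built for known (positive) and sampled unknown (negative) miRNA-disease pairs and passed to a Random Forest classifier. How the original work combines the three similarity measures and samples negatives is not stated in the abstract.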




Updated: 2018-10-11