Weighted Matrix Factorization on multi-relational data for LncRNA-Disease Association prediction
Methods (IF 4.8), Pub Date: 2020-02-01, DOI: 10.1016/j.ymeth.2019.06.015
Yuehui Wang, Guoxian Yu, Jun Wang, Guangyuan Fu, Maozu Guo, Carlotta Domeniconi

Accumulating evidence shows that long non-coding RNAs (lncRNAs) play important roles in many critical biological processes and affect the development and progression of various human diseases. It is therefore necessary to identify lncRNA-disease associations precisely, and data-integrative models can improve the identification precision. However, most current models first project heterogeneous data onto homologous networks and then merge these networks into a composite one for integrative prediction. This projection overrides the individual structure of the heterogeneous data, and the combination is affected by noisy networks, so performance is compromised. To address this, we introduce a Weighted Matrix Factorization model on multi-relational data to predict LncRNA-Disease Associations (WMFLDA). WMFLDA first uses a heterogeneous network to capture the inter- and intra-associations between different types of nodes (including genes, lncRNAs, and Disease Ontology terms). It then presets weights for the inter-association and intra-association matrices of the network and cooperatively decomposes these matrices into low-rank ones to explore the underlying relationships between nodes. Next, it jointly optimizes the low-rank matrices and the weights. Finally, WMFLDA approximates the lncRNA-disease association matrix using the optimized matrices and weights, which yields the prediction. WMFLDA performs much better than related data-integrative solutions across different experimental settings and evaluation metrics. It not only respects the intrinsic structures of individual data sources but also fuses them selectively.
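The pipeline described in the abstract (preset weights, joint low-rank decomposition of the inter- and intra-association matrices, joint optimization of factors and weights, then reconstruction of the lncRNA-disease matrix) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The tri-factorization form `R_ij ≈ G_i S_ij G_j^T`, the plain gradient-descent updates, the entropy-regularised softmax rule for the weights, the function name `fit_weighted_factorization`, all hyperparameters, and the random toy matrices are assumptions made for illustration only; the abstract does not specify the objective or the update rules.

```python
import numpy as np


def fit_weighted_factorization(relations, sizes, rank=3, steps=2000,
                               lr=0.01, temperature=10.0, seed=0):
    """Approximate every R_ij by G_i @ S_ij @ G_j.T under learned relation weights.

    relations: dict mapping (type_i, type_j) -> association matrix R_ij
    sizes:     dict mapping node type -> number of nodes of that type
    (Hypothetical helper; the objective and updates are illustrative assumptions.)
    """
    rng = np.random.default_rng(seed)
    keys = list(relations)
    G = {t: rng.normal(scale=0.3, size=(n, rank)) for t, n in sizes.items()}
    S = {k: rng.normal(scale=0.3, size=(rank, rank)) for k in keys}
    w = np.full(len(keys), 1.0 / len(keys))  # preset weights: uniform

    def residuals():
        return {(i, j): G[i] @ S[(i, j)] @ G[j].T - relations[(i, j)]
                for i, j in keys}

    def sq_errors():
        return np.array([np.sum(E ** 2) for E in residuals().values()])

    def objective(err):
        # Weighted squared error plus an entropy term (assumed regulariser).
        entropy = np.sum(w * np.log(np.clip(w, 1e-12, None)))
        return float(w @ err + temperature * entropy)

    history = [objective(sq_errors())]
    for _ in range(steps):
        # Gradient step on all low-rank factors with the weights held fixed.
        E = residuals()
        grad_G = {t: np.zeros_like(G[t]) for t in G}
        grad_S = {}
        for wk, (i, j) in zip(w, keys):
            Eij, Sij = E[(i, j)], S[(i, j)]
            grad_G[i] += 2 * wk * Eij @ G[j] @ Sij.T
            grad_G[j] += 2 * wk * Eij.T @ G[i] @ Sij
            grad_S[(i, j)] = 2 * wk * G[i].T @ Eij @ G[j]
        for t in G:
            G[t] -= lr * grad_G[t]
        for k in keys:
            S[k] -= lr * grad_S[k]
        # Weight step with the factors held fixed: relations that are
        # reconstructed poorly (for example noisy ones) receive less weight.
        err = sq_errors()
        w = np.exp(-(err - err.min()) / temperature)
        w /= w.sum()
        history.append(objective(err))
    return G, S, dict(zip(keys, w)), history


# Synthetic toy data (not from the paper): random binary association matrices.
rng = np.random.default_rng(1)
sizes = {"lncRNA": 6, "gene": 5, "disease": 4}
gene_gene = (rng.random((5, 5)) < 0.4).astype(float)
relations = {
    ("lncRNA", "disease"): (rng.random((6, 4)) < 0.4).astype(float),
    ("lncRNA", "gene"): (rng.random((6, 5)) < 0.4).astype(float),
    ("gene", "disease"): (rng.random((5, 4)) < 0.4).astype(float),
    ("gene", "gene"): np.maximum(gene_gene, gene_gene.T),  # intra-association
}

G, S, weights, history = fit_weighted_factorization(relations, sizes)
target = ("lncRNA", "disease")
scores = G["lncRNA"] @ S[target] @ G["disease"].T  # predicted association scores
print(weights)
print(history[0], history[-1])
```

In this sketch each node type keeps its own factor matrix, so no data source is projected onto another type's network, and the weight step is where the selective fusion happens. A real implementation would likely add constraints, regularisation on the factors, and a held-out evaluation, none of which are shown here.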

Updated: 2020-02-01