Attentional multi-level representation encoding based on convolutional and variance autoencoders for lncRNA-disease association prediction.
Briefings in Bioinformatics (IF 6.8), Pub Date: 2020-05-23, DOI: 10.1093/bib/bbaa067
Nan Sheng, Hui Cui, Tiangang Zhang, Ping Xuan

As the abnormalities of long non-coding RNAs (lncRNAs) are closely related to various human diseases, identifying disease-related lncRNAs is important for understanding the pathogenesis of complex diseases. Most current data-driven methods for disease-related lncRNA candidate prediction are based on diseases and lncRNAs. Those methods, however, fail to consider the deeply embedded node attributes of lncRNA-disease pairs, which contain multiple relations and representations across lncRNAs, diseases and miRNAs. Moreover, the low-dimensional feature distribution at the pairwise level has not been taken into account. We propose a prediction model, VADLP, to extract, encode and adaptively integrate multi-level representations. First, a triple-layer heterogeneous graph is constructed with weighted inter-layer and intra-layer edges to integrate the similarities and correlations among lncRNAs, diseases and miRNAs. We then define three representations: node attributes, pairwise topology and feature distribution. Node attributes are derived from the graph by an embedding strategy to represent the lncRNA-disease associations, which are inferred via their common lncRNAs, diseases and miRNAs. Pairwise topology is formulated by a random walk algorithm and encoded by a convolutional autoencoder to represent the hidden topological relations between an lncRNA and a disease. The new feature distribution is modeled by a variance autoencoder to reveal the underlying lncRNA-disease relationship. Finally, an attentional representation-level integration module is constructed to adaptively fuse the three representations for lncRNA-disease association prediction. The proposed model is tested on a public dataset with a comprehensive set of evaluations. Our model outperforms six state-of-the-art lncRNA-disease prediction models with statistical significance. The ablation study shows that each of the three representations makes an important contribution. In particular, the improved recall rates under different top $k$ values demonstrate that our model is effective at ranking true disease-related lncRNAs among the top candidates. Case studies of three cancers further demonstrate the capacity of our model to discover potential disease-related lncRNAs.
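
The graph construction, random walk and top-$k$ recall mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not say which random walk variant is used, so random walk with restart is assumed here, and the function names, the restart probability of 0.5 and the 2 x 2 toy matrices are all hypothetical. The embedding, autoencoder and attention stages are not shown.

```python
def place_block(adj, block, row0, col0, mirror=False):
    """Copy a weight block into adj at (row0, col0); optionally mirror it."""
    for i, row in enumerate(block):
        for j, weight in enumerate(row):
            adj[row0 + i][col0 + j] = float(weight)
            if mirror:
                adj[col0 + j][row0 + i] = float(weight)


def build_heterogeneous_adjacency(lnc_sim, dis_sim, mir_sim,
                                  lnc_dis, lnc_mir, dis_mir):
    """Stack three layers (lncRNAs, diseases, miRNAs) into one weighted graph."""
    n_lnc, n_dis, n_mir = len(lnc_sim), len(dis_sim), len(mir_sim)
    n = n_lnc + n_dis + n_mir
    adj = [[0.0] * n for _ in range(n)]
    d0, m0 = n_lnc, n_lnc + n_dis  # offsets of the disease and miRNA layers
    # Intra-layer edges: similarities within each node type.
    place_block(adj, lnc_sim, 0, 0)
    place_block(adj, dis_sim, d0, d0)
    place_block(adj, mir_sim, m0, m0)
    # Inter-layer edges: associations between node types (kept symmetric).
    place_block(adj, lnc_dis, 0, d0, mirror=True)
    place_block(adj, lnc_mir, 0, m0, mirror=True)
    place_block(adj, dis_mir, d0, m0, mirror=True)
    return adj


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W^T p + r * e_seed, with W row-normalised."""
    n = len(adj)
    trans = []
    for row in adj:
        total = sum(row)
        # Rows with no edges would leak probability; the toy data has none.
        trans.append([w / total if total > 0 else 0.0 for w in row])
    p = [0.0] * n
    p[seed] = 1.0
    for _ in range(max_iter):
        nxt = [restart if j == seed else 0.0 for j in range(n)]
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - restart) * p[i] * trans[i][j]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


def recall_at_k(scores, positives, k):
    """Fraction of known positives that appear among the k highest scores."""
    if not positives:
        raise ValueError("positives must not be empty")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(1 for i in ranked[:k] if i in positives) / len(positives)


# Hypothetical toy data: 2 lncRNAs, 2 diseases, 2 miRNAs.
lnc_sim = [[1.0, 0.3], [0.3, 1.0]]
dis_sim = [[1.0, 0.2], [0.2, 1.0]]
mir_sim = [[1.0, 0.5], [0.5, 1.0]]
lnc_dis = [[1, 0], [0, 1]]
lnc_mir = [[1, 0], [0, 1]]
dis_mir = [[1, 0], [0, 1]]

adj = build_heterogeneous_adjacency(lnc_sim, dis_sim, mir_sim,
                                    lnc_dis, lnc_mir, dis_mir)
scores = random_walk_with_restart(adj, seed=0)   # walk starts at lncRNA 0
disease_scores = scores[2:4]                     # entries of the disease layer
recall = recall_at_k(disease_scores, positives={0}, k=1)
```

In this toy graph the walk from lncRNA 0 gives disease 0 a higher score than disease 1, because the two are directly associated. A model such as the one described would feed topology vectors of this kind into later encoding stages instead of ranking on them directly.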

Updated: 2020-05-23