GAEBic: A Novel Biclustering Analysis Method for miRNA-Targeted Gene Data Based on Graph Autoencoder
Journal of Computer Science and Technology (IF 1.9), Pub Date: 2021-03-31, DOI: 10.1007/s11390-021-0804-3
Li Wang, Hao Zhang, Hao-Wu Chang, Qing-Ming Qin, Bo-Rui Zhang, Xue-Qing Li, Tian-Heng Zhao, Tian-Yue Zhang

Unlike traditional clustering analysis, a biclustering algorithm works simultaneously on both dimensions of the data: samples (rows) and variables (columns). In recent years, biclustering methods have developed rapidly and been widely applied in biological data analysis, text clustering, recommendation systems, and other fields. Traditional clustering algorithms do not adapt well to high-dimensional and/or large-scale data. At present, most biclustering algorithms are designed for large-scale differentially expressed biological data. However, there is little discussion of cluster mining on binary data such as miRNA-targeted gene data. Here, we propose GAEBic, a novel biclustering method for miRNA-targeted gene data based on a graph autoencoder. GAEBic applies a graph autoencoder to capture the similarity of sample sets or variable sets, and adopts a new irregular clustering strategy to mine biclusters with excellent generalization. Using the miRNA-targeted gene data of soybean, we benchmark several types of biclustering algorithms and find that GAEBic performs better than Bimax, Bibit, and the Spectral Biclustering algorithm in terms of target gene enrichment. This biclustering method achieves comparable performance on the high-throughput miRNA data of soybean, and it can also be used for other species.
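
To make the notion of a bicluster in binary data concrete, the following minimal sketch scores a candidate bicluster (a subset of rows together with a subset of columns) by the fraction of 1-entries it contains. This is not the GAEBic method; it only illustrates the kind of object a biclustering algorithm searches for in a binary miRNA-by-gene matrix. The function name `bicluster_density` and the toy matrix `targets` are hypothetical and chosen for illustration.

```python
from typing import Sequence


def bicluster_density(
    matrix: Sequence[Sequence[int]],
    rows: Sequence[int],
    cols: Sequence[int],
) -> float:
    """Return the fraction of 1-entries in the submatrix rows x cols."""
    if not rows or not cols:
        raise ValueError("a bicluster needs at least one row and one column")
    cells = [matrix[r][c] for r in rows for c in cols]
    return sum(cells) / len(cells)


# Hypothetical toy data: rows are miRNAs, columns are genes,
# and a 1 means "this miRNA targets this gene".
targets = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]

# miRNAs 0, 1, 3 all target genes 0 and 1: an all-ones bicluster.
dense = bicluster_density(targets, rows=[0, 1, 3], cols=[0, 1])

# An arbitrary row/column selection is usually much sparser.
mixed = bicluster_density(targets, rows=[0, 2], cols=[0, 3])

print(dense, mixed)
```

Note that the selected rows and columns need not be adjacent in the matrix, which is what distinguishes a bicluster from a contiguous block.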




Updated: 2021-04-14