NMFGO: Gene Function Prediction via Nonnegative Matrix Factorization with Gene Ontology.
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 4.5), Pub Date: 2018-07-31, DOI: 10.1109/tcbb.2018.2861379
Guoxian Yu, Keyao Wang, Guangyuan Fu, Maozu Guo, Jun Wang

Gene Ontology (GO) is a controlled vocabulary of terms that describe the molecular functions, biological roles, and cellular locations of gene products (i.e., proteins and RNAs). It hierarchically organizes more than 43,000 GO terms in a directed acyclic graph. A gene is generally annotated with several of these GO terms, so accurately predicting the associations between genes and this massive set of terms is a difficult challenge. To address this challenge, we propose a matrix factorization based approach called NMFGO. NMFGO stores the available GO annotations of genes in a gene-term association matrix and adopts an ontological structure based taxonomic similarity measure to capture the GO hierarchy. Next, it factorizes the association matrix into two low-rank matrices via nonnegative matrix factorization regularized with the GO hierarchy. After that, it employs a semantic similarity based k nearest neighbor classifier in the subspace approximated by the low-rank matrices to predict gene functions. An empirical study on three model species (S. cerevisiae, H. sapiens, and A. thaliana) shows that NMFGO is robust to the input parameters and achieves significantly better prediction performance than GIC, TO, dRW-kNN, and NtN, which were re-implemented following the instructions in the original papers. The supplementary file and demo code of NMFGO are available at http://mlda.swu.edu.cn/codes.php?name=NMFGO.
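The pipeline described in the abstract (association matrix, hierarchy-regularized factorization, nearest-neighbor prediction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a standard graph-regularized NMF with multiplicative updates as a stand-in for the paper's objective, a hand-made term similarity matrix `S` in place of the taxonomic similarity measure, and cosine similarity between latent gene vectors in place of the semantic similarity used by NMFGO's classifier. The function names, the toy data, and the parameters `rank`, `lam`, and `k` are all hypothetical.

```python
import numpy as np


def graph_regularized_nmf(A, S, rank, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Factorize A (genes x terms) as U @ V.T with U, V >= 0.

    Assumption: the GO hierarchy enters through a term-term similarity
    matrix S, used as a graph regularizer lam * tr(V.T @ (D - S) @ V).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_terms = A.shape
    U = rng.random((n_genes, rank))   # latent gene factors
    V = rng.random((n_terms, rank))   # latent term factors
    D = np.diag(S.sum(axis=1))        # degree matrix of the term graph
    for _ in range(n_iter):
        # Multiplicative updates keep both factors nonnegative.
        U *= (A @ V) / (U @ (V.T @ V) + eps)
        V *= (A.T @ U + lam * (S @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V


def knn_scores(U, A, k=2):
    """Score terms for each gene from its k nearest genes in latent space.

    Assumption: cosine similarity between rows of U replaces the
    semantic similarity described in the abstract.
    """
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    Un = U / np.maximum(norms, 1e-12)
    sim = Un @ Un.T
    np.fill_diagonal(sim, -np.inf)            # a gene is not its own neighbor
    scores = np.zeros_like(A, dtype=float)
    for i in range(A.shape[0]):
        nbrs = np.argsort(sim[i])[-k:]        # indices of the k most similar genes
        w = sim[i, nbrs]
        # Similarity-weighted vote over the neighbors' known annotations.
        scores[i] = (w @ A[nbrs]) / max(w.sum(), 1e-12)
    return scores


# Toy data (hypothetical): 4 genes x 4 GO terms, 1 = known annotation.
A = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

# Toy term similarity (hypothetical): terms 0/1 and terms 2/3 are close in the hierarchy.
S = np.array([
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
])

U, V = graph_regularized_nmf(A, S, rank=2)
scores = knn_scores(U, A, k=1)
print(np.round(scores, 2))
```

In this sketch, a high value in `scores[i, j]` suggests term `j` as a candidate annotation for gene `i`. A larger `lam` pulls the latent vectors of similar terms closer together, which is one simple way to let the hierarchy influence the factorization.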

Last updated: 2020-03-07