An automated approach for determining the number of components in non-negative matrix factorization with application to mutational signature learning
Machine Learning: Science and Technology (IF 6.013), Pub Date: 2021-01-01, DOI: 10.1088/2632-2153/abc60a
Gal Gilad, Itay Sason, Roded Sharan

Non-negative matrix factorization (NMF) is a popular method for finding a low-rank approximation of a matrix, thereby revealing the latent components behind it. In genomics, NMF is widely used to interpret mutation data and derive the underlying mutational processes and their activities. A key challenge in the use of NMF is determining the number of components, or rank, of the factorization. Here we propose a novel method, CV2K, which chooses this number automatically from the data using a detailed cross-validation procedure combined with a parsimony consideration. We apply our method to mutational signature analysis and demonstrate its utility on both simulated and real data sets. In comparison to previous approaches, some of which involve human assessment, CV2K leads to improved predictions across a wide range of data sets.
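To make the idea of cross-validated rank selection concrete, the following is a minimal sketch in Python with NumPy. It is not the CV2K procedure from the paper. It only illustrates the general pattern the abstract describes: hide a random subset of matrix entries, fit NMF on the remaining entries, score each candidate rank by its error on the hidden entries, and then prefer the smallest rank that performs about as well as the best one. The function names (`masked_nmf`, `heldout_error`, `choose_rank`), the squared-error loss, the 10% hold-out fraction, and the 5% tolerance rule are all assumptions made for illustration.

```python
import numpy as np


def masked_nmf(V, mask, k, n_iter=200, seed=0, eps=1e-9):
    """Fit V ~ W @ H on entries where mask is True (squared-error loss).

    Uses multiplicative updates restricted to the observed entries.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    M = mask.astype(float)
    MV = M * V  # held-out entries contribute nothing to the fit
    for _ in range(n_iter):
        W *= (MV @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ MV) / (W.T @ (M * (W @ H)) + eps)
    return W, H


def heldout_error(V, k, holdout_frac=0.1, seed=0):
    """Mean squared reconstruction error on randomly hidden entries."""
    rng = np.random.default_rng(seed)
    mask = rng.random(V.shape) >= holdout_frac  # True = training entry
    W, H = masked_nmf(V, mask, k, seed=seed)
    resid = (V - W @ H)[~mask]
    return float(np.mean(resid ** 2))


def choose_rank(V, ks, n_repeats=5, holdout_frac=0.1, tol=0.05):
    """Pick the smallest k whose median held-out error is within tol of the best.

    The tolerance rule is a hypothetical stand-in for a parsimony
    consideration; it is not the criterion used by CV2K.
    """
    errors = {
        k: float(np.median([heldout_error(V, k, holdout_frac, seed=r)
                            for r in range(n_repeats)]))
        for k in ks
    }
    best_err = min(errors.values())
    chosen = min(k for k in ks if errors[k] <= (1.0 + tol) * best_err)
    return chosen, errors


if __name__ == "__main__":
    # Synthetic non-negative matrix with 3 latent components plus small noise.
    rng = np.random.default_rng(42)
    V = rng.random((30, 3)) @ rng.random((3, 20)) + 0.01 * rng.random((30, 20))
    k, errs = choose_rank(V, ks=[1, 2, 3, 4, 5])
    print("chosen rank:", k)
    print("median held-out error per rank:", errs)
```

Hiding individual entries rather than whole rows or columns means every sample and every feature still contributes to the fit, so the held-out error measures how well a given rank generalizes rather than how well it memorizes the training entries.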




Updated: 2021-01-01