Knowledge-Based Systems (IF 8.8), Pub Date: 2021-06-01, DOI: 10.1016/j.knosys.2021.107184
Martin Trnecka, Marketa Trneckova
A key step in applying Boolean matrix factorization (BMF) is establishing the correct model order for the data, that is, deciding where the knowledge stops and the noise starts, or, put simply, deciding the proper number of factors that describe the data well. There are two main approaches to BMF: the Discrete Basis Problem (DBP) and the Approximation Factorization Problem (AFP). Although a model order selection technique exists for DBP, no technique is tailored to AFP. We show that the number of factors established for DBP cannot be used in AFP, and we present a novel way of establishing the proper number of factors that reflects the nature of AFP. Moreover, we show that the number of factors established for AFP is, from a knowledge-representation viewpoint, better than that for DBP.
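To make the notion of model order concrete, the following minimal Python sketch builds a small Boolean matrix from two planted factors and counts the reconstruction error when only the first k factors are used. It is an illustration only, not the selection method proposed in the paper. The toy matrices and the helper names `boolean_product`, `reconstruction_error`, and `smallest_order_within` are assumptions introduced here. The last helper follows the commonly stated AFP view (find the fewest factors whose error stays within a prescribed bound), whereas DBP is usually stated as minimizing the error for a prescribed number of factors.

```python
import numpy as np


def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is OR over l of (A[i, l] AND B[l, j])."""
    return (A.astype(int) @ B.astype(int)) > 0


def reconstruction_error(I, A, B):
    """Number of entries in which I and the Boolean product of A and B differ."""
    return int(np.sum(I.astype(bool) != boolean_product(A, B)))


def smallest_order_within(errors, eps):
    """AFP-style view (illustrative): smallest k whose error is at most eps."""
    for k, err in enumerate(errors):
        if err <= eps:
            return k
    return None


# Toy data (hypothetical): 5 objects, 4 attributes, 2 planted factors.
A = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1],
              [0, 1]], dtype=bool)      # object-factor matrix
B = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=bool)  # factor-attribute matrix
I = boolean_product(A, B)               # noise-free input matrix

# Error when only the first k factors are kept, for k = 0, 1, 2.
errors = [reconstruction_error(I, A[:, :k], B[:k, :]) for k in range(3)]
print(errors)
print(smallest_order_within(errors, eps=6))
```

In this noise-free toy case the error reaches zero at k = 2, so the model order is obvious. With noisy data the error curve typically keeps decreasing as factors are added, which is why a principled stopping rule is needed.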
Title: Model order selection for approximate Boolean matrix factorization problem