DCA based approaches for bi-level variable selection and application for estimate multiple sparse covariance matrices
Neurocomputing (IF 6), Pub Date: 2021-09-21, DOI: 10.1016/j.neucom.2021.09.039
Hoai An Le Thi 1, 2 , Duy Nhat Phan 1 , Tao Pham Dinh 3

Variable selection plays an important role in analyzing high-dimensional data and is a fundamental problem in machine learning. When the data possess group structures in which individual variables are also scientifically meaningful, we are naturally interested in selecting important groups as well as important variables within the selected groups. This is referred to as bi-level variable selection, which is much more complex than the selection of individual variables. In recent years, research on variable selection has been very active, but most of the work has focused on individual variable selection. There is therefore a need to develop more effective approaches for bi-level variable selection. Since DC (Difference of Convex functions) programming and DCA (DC Algorithm), powerful tools in nonconvex programming, have been successfully investigated for individual variable selection, we believe that they can be efficiently exploited for the more difficult bi-level variable selection task. In that direction, we investigate in this work DC approximations of the mixed zero norm ℓ_{0,0} and the combined norm ℓ_0 + ℓ_{q,0}. The resulting approximate problems are then formulated as DC programs, for which DCA-based algorithms are introduced. As an application, these DCA schemes are developed for estimating multiple sparse covariance matrices sharing some common structures, such as the locations or weights of non-zero elements. The experimental results on both simulated and real datasets indicate the efficiency of our algorithms.
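As a rough illustration of the DC programming idea mentioned in the abstract, the sketch below applies DCA to a much simpler problem than the bi-level one studied in the paper: individual variable selection on a denoising objective, with the zero norm replaced by a capped-ℓ1 surrogate. It is not the authors' algorithm. The surrogate, the objective, and all names (`dca_capped_l1`, `lam`, `theta`) are assumptions chosen for illustration. The surrogate min(1, θ|t|) is written as the difference of two convex functions, θ|t| − max(θ|t| − 1, 0); each DCA iteration linearizes the second part and solves the remaining convex problem in closed form by soft-thresholding.

```python
def soft_threshold(v, tau):
    """Minimizer of 0.5 * (x - v)**2 + tau * |x| over scalar x."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def objective(x, a, lam, theta):
    """0.5 * ||x - a||^2 + lam * sum_i min(1, theta * |x_i|)."""
    fit = 0.5 * sum((xi - ai) ** 2 for xi, ai in zip(x, a))
    penalty = sum(min(1.0, theta * abs(xi)) for xi in x)
    return fit + lam * penalty


def dca_capped_l1(a, lam, theta, max_iter=100, tol=1e-10):
    """DCA sketch for the objective above (illustrative, hypothetical setup).

    DC decomposition: objective = g(x) - h(x), with
      g(x) = 0.5 * ||x - a||^2 + lam * theta * ||x||_1       (convex)
      h(x) = lam * sum_i max(theta * |x_i| - 1, 0)           (convex)
    """
    x = list(a)  # start from the observed vector
    for _ in range(max_iter):
        # Step 1: pick a subgradient y of h at the current iterate.
        y = [
            lam * theta * (1.0 if xi > 0 else -1.0) if theta * abs(xi) > 1.0 else 0.0
            for xi in x
        ]
        # Step 2: minimize g(x) - <y, x>, which separates per coordinate
        # and reduces to soft-thresholding of a_i + y_i.
        x_new = [soft_threshold(ai + yi, lam * theta) for ai, yi in zip(a, y)]
        change = max(abs(u - v) for u, v in zip(x_new, x))
        x = x_new
        if change < tol:
            break
    return x


if __name__ == "__main__":
    a = [3.0, 0.2, -2.5, 0.05]
    x = dca_capped_l1(a, lam=0.5, theta=2.0)
    print(x)
```

In this toy setting the small entries are set to zero while the large ones are left unshrunk, which is the usual motivation for nonconvex surrogates over the plain ℓ1 penalty. Extending the idea to group-level and bi-level penalties requires a different surrogate and subproblem, as developed in the paper itself.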




Updated: 2021-10-01