An active set Newton-CG method for ℓ1 optimization☆
Introduction
Sparse recovery has received a lot of attention in recent decades, because many applications, such as signal and image processing and data mining/classification, can be formulated as sparse recovery problems. Sparse recovery refers to recovering a sparse vector from a set of linear measurements, which is one of the most fundamental issues in compressed sensing (CS) [14]. Mathematically, it can be expressed as
$$\min_{x \in \mathbb{R}^n} \|x\|_0 \quad \text{s.t.} \quad Ax = b, \tag{1.1}$$
where $\|x\|_0$ counts the number of nonzero entries of $x$, $A \in \mathbb{R}^{m \times n}$ (usually $m \ll n$) is called a sensing matrix, and $b \in \mathbb{R}^m$ is a measurement vector. Problem (1.1) is NP-hard and difficult to solve in practice. To find the solution of (1.1), alternative models have been used, which include the basis pursuit (BP) problem
$$\min_{x \in \mathbb{R}^n} \|x\|_1 \quad \text{s.t.} \quad Ax = b, \tag{1.2}$$
the closely related ℓ1-regularized least squares problem
$$\min_{x \in \mathbb{R}^n} \mu\|x\|_1 + \tfrac{1}{2}\|Ax - b\|_2^2, \tag{1.3}$$
the lasso problem
$$\min_{x \in \mathbb{R}^n} \tfrac{1}{2}\|Ax - b\|_2^2 \quad \text{s.t.} \quad \|x\|_1 \le t, \tag{1.4}$$
and the BP denoising problem
$$\min_{x \in \mathbb{R}^n} \|x\|_1 \quad \text{s.t.} \quad \|Ax - b\|_2 \le \sigma, \tag{1.5}$$
where $\mu$, $t$ and $\sigma$ are positive parameters and $\|\cdot\|_2$ denotes the Euclidean norm of vectors. The theory of penalty functions shows that the solution of (1.3) approaches the solution of (1.2) as $\mu$ goes to zero. The lasso problem (1.4) and the BP denoising problem (1.5) are equivalent to (1.3) for appropriate choices of the parameters $t$ and $\sigma$. It is therefore important to study problem (1.3). In this paper, we consider the more general ℓ1-regularized optimization problem
$$\min_{x \in \mathbb{R}^n} f(x) + \mu\|x\|_1, \tag{1.6}$$
where $f$ is continuously differentiable and $\mu > 0$.
Recently, many different approaches have been proposed for solving (1.6). A large class of first-order algorithms for solving problem (1.6) is based on the iterative shrinkage thresholding algorithm (ISTA) [13], [17] or its variants. To accelerate convergence, a two-step ISTA (TwIST) algorithm was developed in [5], and a sequential subspace optimization technique was added to ISTA in [16]. Beck and Teboulle [4] constructed a fast iterative shrinkage-thresholding algorithm, called FISTA, that retains the simplicity of ISTA but possesses a better global rate of convergence. Furthermore, to improve the practical performance of the above methods, Wright et al. [35] introduced sparse reconstruction by separable approximation (SpaRSA) for solving (1.6). Hager et al. [23] analyzed the convergence rate of SpaRSA and proposed an improved version of SpaRSA based on a cyclic version of the BB iteration and an adaptive choice of the reference function value in the line search. Hale et al. [22] proposed a fixed point continuation algorithm (FPC_BB) that embeds ISTA [13], [17] in a continuation strategy. Wen et al. [33], [34] proposed the FPC active set (FPC_AS) algorithm, which combines shrinkage, subspace optimization and continuation techniques. Cheng and Dai [10] proposed gradient-based methods with an active set strategy to solve (1.6). Liang et al. [25] proposed a general forward-backward splitting method which identifies the active manifold in a finite number of iterations and exhibits local linear convergence.
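To make the shrinkage iteration behind ISTA concrete, the following is a minimal sketch for the special case $f(x) = \tfrac{1}{2}\|Ax - b\|_2^2$. It is an illustration only, not code from any of the cited papers: the function names, the fixed step size `step` (assumed to be at most the reciprocal of the largest eigenvalue of $A^T A$) and the fixed iteration count are all assumptions made for this example.

```python
def soft_threshold(v, tau):
    """Componentwise shrinkage: sign(v_i) * max(|v_i| - tau, 0)."""
    return [(abs(vi) - tau) * (1 if vi > 0 else -1) if abs(vi) > tau else 0.0
            for vi in v]


def matvec(A, x):
    """Dense matrix-vector product with A stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def grad_least_squares(A, b, x):
    """Gradient of f(x) = 0.5 * ||Ax - b||^2, i.e. A^T (Ax - b)."""
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]


def ista(A, b, mu, step, iters=200):
    """Hypothetical fixed-step ISTA for min mu*||x||_1 + 0.5*||Ax - b||^2."""
    x = [0.0] * len(A[0])
    for _ in range(iters):
        g = grad_least_squares(A, b, x)
        # Gradient step on f followed by shrinkage with threshold step * mu.
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, g)], step * mu)
    return x
```

With an identity sensing matrix the iteration reduces to a single shrinkage of `b`, which makes the thresholding effect easy to see: entries of `b` with magnitude at most `mu` are set to zero.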
Various types of algorithms have also been designed to solve equivalent constrained reformulations of problem (1.6) or (1.3), for instance interior point methods [28], the projected gradient method [18] and the alternating direction method of multipliers SALSA [1], [6]. Other algorithms for ℓ1 minimization include coordinate-wise descent methods [32], Bregman iterative regularization based methods [37], a reduced-space algorithm [11], second-order methods [8], [9], [27], [36], [29], quasi-Newton methods [24], [26], gradient methods [30] for minimizing the more general function $J(x) + H(x)$, where $J$ is nonsmooth, $H$ is smooth, and both are convex, and a smoothed penalty algorithm (SPA) [2]. We refer to [7], [14], [19], [8], [31], [26], [11], [25] for more advances in this area.
As pointed out by the authors in [33], [34], ISTA is very efficient in obtaining a superset of the support, but it is not efficient in recovering the signal values. This motivated them to develop the FPC_AS algorithm, which is divided into two stages that are performed repeatedly. At the first stage, "nonmonotone line search (NMLS)", a first-order iterative "shrinkage" method is used to estimate the support of the solution. At the second stage, "subspace optimization", a smaller smooth subproblem is solved to recover the magnitudes of x. Theoretically, the authors in [33], [34] showed that there exists an accumulation point of the sequence $\{x^k\}$ generated by FPC_AS that is a stationary point of problem (1.6). Our approach is partially motivated by our belief that a second-order method should be faster than the first-order iterative shrinkage method. To accelerate FPC_AS, we propose an active set Newton-CG method to solve problem (1.6). We first investigate the active set identification technique of ISTA and establish some of its properties. Based on this identification technique, we propose the algorithm to solve (1.6). Specifically, the active variables and free variables are defined by the identification technique at each iteration. At each iteration, the same direction as that of the first stage of the FPC_AS method [33], [34] is used to update the active variables, while a second-order method is used to solve a smooth subproblem in order to update the free variables. Hence the method is distinct from the FPC_AS method [33], [34]. In addition, the nonmonotone line search [20] of the proposed method is different from that of the FPC_AS method. Under appropriate conditions, we show that every accumulation point of the sequence $\{x^k\}$ generated by the proposed algorithm is a stationary point of problem (1.6).
Numerical experiments with logistic regression problems and compressive sensing problems demonstrate that the proposed approach is competitive with several known methods.
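The idea of updating the free variables with a second-order step can be pictured with the sketch below. It is not the authors' algorithm, whose subproblem and safeguards are not shown in this excerpt. It assumes that a Hessian `H` of f is available as a dense matrix, that the signs of the free variables are held fixed so that $\mu\|x\|_1$ is smooth on them, and that the reduced Newton system is solved approximately by a plain conjugate gradient loop. All function and parameter names are hypothetical.

```python
def conjugate_gradient(hess_vec, rhs, tol=1e-10, max_iter=50):
    """Approximately solve H d = rhs for a symmetric positive definite H."""
    d = [0.0] * len(rhs)
    r = list(rhs)                      # residual rhs - H d for d = 0
    p = list(r)
    rr = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rr <= tol:                  # squared residual norm small enough
            break
        Hp = hess_vec(p)
        pHp = sum(pi * hi for pi, hi in zip(p, Hp))
        if pHp <= 0.0:                 # nonpositive curvature: keep current d
            break
        alpha = rr / pHp
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return d


def reduced_newton_direction(H, g, x, mu, free):
    """Newton-CG direction on the free variables; zero on all other indices.

    Assumes x[i] != 0 for i in free, so mu * |x_i| has gradient mu * sign(x_i).
    """
    sign = lambda v: 1.0 if v > 0 else -1.0
    rhs = [-(g[i] + mu * sign(x[i])) for i in free]
    # Hessian-vector product restricted to the rows and columns in `free`.
    hess_vec = lambda p: [sum(H[i][j] * pj for j, pj in zip(free, p))
                          for i in free]
    d_free = conjugate_gradient(hess_vec, rhs)
    d = [0.0] * len(x)
    for i, di in zip(free, d_free):
        d[i] = di
    return d
```

Because only Hessian-vector products on the free indices are needed, the cost of each CG step scales with the size of the estimated support rather than with the full dimension, which is the usual motivation for combining an active set estimate with Newton-CG.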
The remainder of the paper is organized as follows. Some notation and properties related to (1.6) are given in Section 2. In Section 3, we propose the algorithm. In Section 4, we establish the global convergence of the algorithm. Some numerical results are reported in Section 5, and conclusions are drawn in the last section.
Section snippets
Notation and properties
Let $x^*$ be a stationary point of problem (1.6). We define the active set to be the set of indices corresponding to the zero components of $x^*$ and the inactive set to be the support of $x^*$, i.e., $\{i : x_i^* = 0\}$ and $\{i : x_i^* \neq 0\}$, respectively. Furthermore, the active set and the support of $x^*$ can each be subdivided into two sets, defined in terms of the $i$th component of the gradient vector
Active set estimate of ISTA
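In this section, we investigate the active set identification technique of ISTA and establish some of its properties. Consider the generic iteration of ISTA [13], [17]:
$$x^{k+1} = \arg\min_{x \in \mathbb{R}^n} \; \mu\|x\|_1 + (x - x^k)^T \nabla f(x^k) + \frac{1}{2\lambda}\|x - x^k\|_2^2,$$
where $\lambda$ is a given positive constant. From the optimality condition of the above problem, we get
$$0 \in \mu\,\partial\|x^{k+1}\|_1 + \nabla f(x^k) + \frac{1}{\lambda}(x^{k+1} - x^k).$$
Then, we get that the indices of the zero variables at $x^{k+1}$ belong to the following set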
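The set referred to above is cut off in this excerpt. As a hedged illustration, the sketch below uses the standard characterization of a shrinkage step: a component of the next ISTA iterate is zero exactly when $|x_i - \lambda g_i| \le \lambda\mu$, where $g$ is the gradient of f at the current point. The function name and the parameter names `lam` (step constant) and `mu` (regularization weight) are assumptions, and the paper's own set may be defined differently.

```python
def estimate_active_set(x, g, mu, lam):
    """Split indices by whether one shrinkage step would zero the component.

    Index i is estimated active when |x_i - lam * g_i| <= lam * mu,
    and free otherwise. This mirrors plain soft thresholding only.
    """
    active, free = [], []
    for i, (xi, gi) in enumerate(zip(x, g)):
        if abs(xi - lam * gi) <= lam * mu:
            active.append(i)
        else:
            free.append(i)
    return active, free
```

For example, with `x = [2.0, 0.0, -0.1]`, `g = [-1.0, -0.5, 0.3]` and `mu = lam = 1.0`, only the first component survives the threshold, so it is the single free index.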
The new algorithm
In this section, based on the active set identification technique in Section 3, we develop a fast Newton-CG method for solving ℓ1 optimization problems. We make the following assumptions on the objective function.
Assumption 4.1 The level set $\Omega$ is bounded. In some neighbourhood of $\Omega$, $f$ is continuously differentiable and its gradient is Lipschitz continuous, i.e., there exists a constant $L > 0$ such that $\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$ for all $x$, $y$ in that neighbourhood.
Numerical experiments
In this section, we present some numerical experiments to test the performance of the proposed algorithm and compare it with the following five state-of-the-art ℓ1-minimization algorithms.
FPC_AS [33]. FPC_AS is divided into two stages that are performed repeatedly. At the first stage, a first-order method based on “shrinkage” is performed to obtain a working index set. At the second stage, it utilizes a second-order method to solve a smooth subproblem defined by the working index set. The two
Conclusion
In this paper, we investigated the active set identification technique used by ISTA and established some of its properties. Based on this technique, we proposed a Newton-CG method. Under appropriate conditions, we showed that the method, which is based on a nonmonotone line search technique, is globally convergent. The numerical results presented in Section 5 demonstrate the effectiveness of the algorithm for solving ℓ1-regularized nonconvex problems and some standard
Acknowledgements
The authors thank the two anonymous referees for their valuable comments and suggestions, which greatly helped to improve the quality of this manuscript.
References (39)
- et al. Subspace optimization methods for linear least squares with non-quadratic regularization. Appl. Comput. Harmon. Anal. (2007)
- et al. Fast image recovery using variable splitting and constrained optimization. IEEE Trans. Image Process. (2010)
- et al. A first-order smoothed penalty method for compressed sensing. SIAM J. Optim. (2011)
- et al. Two point step size gradient methods. IMA J. Numer. Anal. (1988)
- et al. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. (2009)
- et al. A new twist: two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Trans. Image Process. (2007)
- Local linear convergence of the alternating direction method of multipliers on quadratic or linear program. SIAM J. Optim. (2013)
- et al. From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Rev. (2009)
- et al. A family of second-order methods for convex ℓ1-regularized optimization. Math. Program., Ser. A (2016)
- et al. An inexact successive quadratic approximation method for regularized optimization. Math. Program., Ser. B (2016)
- Gradient-based method with active set strategy for ℓ1 optimization. Math. Comp.
- A reduced-space algorithm for minimizing ℓ1-regularized convex functions. SIAM J. Optim.
- Trust-region methods
- An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math.
- Compressed sensing. IEEE Trans. Inform. Theory
- Benchmarking optimization software with performance profiles. Math. Program.
- An EM algorithm for wavelet-based image restoration. IEEE Trans. Image Process.
- Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. IEEE J. Sel. Top. Signal Process.
- A second-order method for strongly convex ℓ1-regularization problems. Math. Program., Ser. A
☆ Supported by the Chinese NSF Grant (nos. 11971106, 11371154, 11331012 and 81173633), the Key Project of Chinese National Programs for Fundamental Research and Development (no. 2015CB856002), the China National Funds for Distinguished Young Scientists (no. 11125107), the Ministry of Education, Humanities and Social Sciences project (no. 17JYJAZH011) and the Natural Science Foundation of Guangdong Province (no. 2018A030313229).