Abstract
In this paper, we explore sparsity and homogeneity of regression coefficients while incorporating prior constraint information. Sparsity means that only a small fraction of the regression coefficients are nonzero, and homogeneity means that the regression coefficients are grouped and take exactly the same value within each group. We propose a general pairwise fusion approach for sparsity and homogeneity detection that accommodates prior convex constraints. We develop a modified alternating direction method of multipliers (ADMM) algorithm to compute the estimators and establish its convergence. Incorporating the prior information improves the efficiency of both sparsity and homogeneity detection. The proposed method is further illustrated by simulation studies and an analysis of an ozone dataset.
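To make the ADMM strategy mentioned in the abstract concrete, the sketch below applies scaled ADMM to the simplest special case, a plain lasso with a single splitting variable (no fusion penalty and no prior constraints). The function names, synthetic data, and tuning values are illustrative assumptions, not the paper's implementation; the paper's algorithm additionally handles the pairwise fusion matrix \(\mathbf{L}\) and the convex constraint set.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the L1 norm: componentwise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def admm_lasso(X, y, lam=0.1, rho=1.0, n_iter=200):
    """Scaled ADMM for min_beta 0.5*||X beta - y||^2 + lam*||eta||_1
    subject to beta - eta = 0 (u is the scaled dual variable)."""
    n, p = X.shape
    eta = np.zeros(p)
    u = np.zeros(p)
    Xty = X.T @ y
    A = X.T @ X + rho * np.eye(p)  # factor once; reused every iteration
    for _ in range(n_iter):
        beta = np.linalg.solve(A, Xty + rho * (eta - u))  # beta-update (ridge-type)
        eta = soft_threshold(beta + u, lam / rho)         # eta-update (prox of L1)
        u = u + beta - eta                                # scaled dual ascent
    return eta  # exactly sparse estimate

# Synthetic example: 3 active coefficients out of 10.
rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [2.0, 2.0, -1.5]
y = X @ true_beta + 0.01 * rng.standard_normal(n)
est = admm_lasso(X, y, lam=0.1)
```

The eta-update returns an exactly sparse vector, which is why the splitting variable, rather than beta itself, is reported as the estimate.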
References
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Breiman, L., Friedman, J.H.: Estimating optimal transformations for multiple regression and correlation. J. Am. Stat. Assoc. 80(391), 580–598 (1985)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001)
Fan, Y., Tang, C.Y.: Tuning parameter selection in high dimensional penalized likelihood. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 75(3), 531–552 (2013)
Geyer, C.J.: On the asymptotics of convex stochastic optimization. Unpublished manuscript (1996)
Ke, Z.T., Fan, J., Wu, Y.: Homogeneity pursuit. J. Am. Stat. Assoc. 110(509), 175–194 (2015)
Ma, S., Huang, J.: A concave pairwise fusion approach to subgroup analysis. J. Am. Stat. Assoc. 112(517), 410–423 (2017)
Silvapulle, M.J., Sen, P.K.: Constrained Statistical Inference: Order, Inequality, and Shape Constraints. Wiley, Hoboken (2011)
Stahlecker, P.: A priori Information und Minimax-Schätzung im linearen Regressionsmodell. Athenäum (1987)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 58, 267–288 (1996)
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 67(1), 91–108 (2005)
Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475–494 (2001)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 68(1), 49–67 (2006)
Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38(2), 894–942 (2010)
Zhu, Y., Shen, X., Pan, W.: Simultaneous grouping pursuit and feature selection over an undirected graph. J. Am. Stat. Assoc. 108(502), 713–725 (2013)
Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101(476), 1418–1429 (2006)
Appendix
A. Proof of Proposition 2.1
By the definition of \(\varvec{\eta}_1^{(m+1)}\) and \(\varvec{\eta}_2^{(m+1)}\), for any \(\varvec{\eta}_1\) and \(\varvec{\eta}_2\), we have
Denote \(\Xi (\varvec{\beta }) = \{(\varvec{\eta }_1, \varvec{\eta }_2) : \varvec{\beta }^{(m+1)}- \varvec{\eta }_1 ={\mathbf {0}},\ {\mathbf {L}}\varvec{\beta }^{(m+1)}- \varvec{\eta }_2 ={\mathbf {0}} \}\).
Then,
For any integer t, \({\mathbf {v}}_1^{(m+t-1)} = {\mathbf {v}}_1^{(m)} + \tau _1\sum _{i=1}^{t-1}(\varvec{\beta }_1^{(m+i)} - \varvec{\eta }_1^{(m+i)} ) \), and \({\mathbf {v}}^{(m+t-1)}_2={\mathbf {v}}^{(m)}_2+\tau _2 \sum \nolimits _{i=1}^{t-1}({\mathbf {L}}\varvec{\beta }^{(m+i)}- \varvec{\eta }^{(m+i)}_2)\), then we have
Since \(S_n( \varvec{\beta },\varvec{\eta }_1,\varvec{\eta }_2, {\mathbf {v}}_1, {\mathbf {v}}_2 )\) is differentiable with respect to \(\varvec{\beta }\) and convex with respect to \((\varvec{\eta }_1, \varvec{\eta }_2)\), Theorem 4.1 in [16] implies that the limit of \((\varvec{\beta }^{(m)}, \varvec{\eta }_1^{(m)}, \varvec{\eta }_2^{(m)} )\) exists; denote it by \((\varvec{\beta }^{*}, \varvec{\eta }_1^{*}, \varvec{\eta }_2^{*} )\). Therefore,
and for any \(t\ge 0\)
Hence, \(\lim _{m\rightarrow \infty }\Vert {\mathbf {r}}^{(m)}_1\Vert ^{2}=r_1^{*}= \Vert \varvec{\beta }^{*}- \varvec{\eta }_1^{*}\Vert ^2 = 0\) and \(\lim _{m\rightarrow \infty }\Vert {\mathbf {r}}^{(m)}_2\Vert ^{2}=r_2^{*}= \Vert {\mathbf {L}}\varvec{\beta }^{*}- \varvec{\eta }_2^{*}\Vert ^2 = 0\).
By definition, \(\varvec{\beta }^{(m+1)}\) is the minimizer of \(S_n\bigl (\varvec{\beta }, \varvec{\eta }^{(m)}_1,\varvec{\eta }_2^{(m)},{\mathbf {v}}_1^{(m)}, {\mathbf {v}}_2^{(m)} \bigr )\); hence
and further,
where the last equality is based on \({\mathbf {v}}_1^{(m+1)} = {\mathbf {v}}_1^{(m)} + \tau _1(\varvec{\beta }^{(m+1)} - \varvec{\eta }_1^{(m+1)} ), {\mathbf {v}}_2^{(m+1)} = {\mathbf {v}}_2^{(m)} + \tau _2({\mathbf {L}}\varvec{\beta }^{(m+1)} - \varvec{\eta }_2^{(m+1)} )\). Therefore,
Since \(\Vert \varvec{\beta }^{*} - \varvec{\eta }_1^{*} \Vert ^{2}= \Vert {\mathbf {L}}\varvec{\beta }^{*} - \varvec{\eta }_2^{*}\Vert ^2=0\),
Consequently, we have \(\lim _{m\rightarrow \infty }{\mathbf {s}}_1^{(m+1)}+{\mathbf {s}}_2^{(m+1)}={\mathbf {0}}\).
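For readability, the residual quantities used in this proof can be written out in the standard ADMM notation of [5]; the exact sign conventions below are an assumption, since the paper's displayed definitions are not reproduced here.

```latex
% Primal residuals for the two splitting constraints:
\mathbf{r}_1^{(m)} = \varvec{\beta}^{(m)} - \varvec{\eta}_1^{(m)},
\qquad
\mathbf{r}_2^{(m)} = \mathbf{L}\varvec{\beta}^{(m)} - \varvec{\eta}_2^{(m)},
% Dual residuals (up to sign), both living in the beta-space:
\mathbf{s}_1^{(m+1)} = \tau_1\bigl(\varvec{\eta}_1^{(m+1)} - \varvec{\eta}_1^{(m)}\bigr),
\qquad
\mathbf{s}_2^{(m+1)} = \tau_2\,\mathbf{L}^{\mathrm T}\bigl(\varvec{\eta}_2^{(m+1)} - \varvec{\eta}_2^{(m)}\bigr).
```

Proposition 2.1 thus states that both the primal residuals and the sum of the dual residuals vanish in the limit, the usual certificate that the ADMM iterates approach a feasible stationary point.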
B. Proof of Theorem 2.2
This proof adapts the argument of [11] to the prior constraint case. Denote \({\mathbf {u}}= (u_1,\ldots , u_p )^{\mathrm {T}}\), and
and note that \(V_n\) is minimized at \(\sqrt{n}(\widehat{\varvec{\beta }}_n -\varvec{\beta })\). First note that, for finite-dimensional \(p\), the central limit theorem gives
For the \(L_1\) penalty,
and
Then, for finite-dimensional \(p\), we have \(V_n({\mathbf {u}}) \rightarrow _{d} V({\mathbf {u}})\), where
Since \(V_n({\mathbf {u}})\) is convex and \(V({\mathbf {u}})\) has a unique minimum, it follows from [9] that
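Stated explicitly, the concluding argmin-convergence step reads as follows; this is a sketch in the notation above, assuming \(V\) attains its unique minimum at \(\arg\min_{\mathbf u} V(\mathbf u)\):

```latex
\sqrt{n}\bigl(\widehat{\varvec{\beta}}_n - \varvec{\beta}\bigr)
  \;=\; \arg\min_{\mathbf u} V_n(\mathbf u)
  \;\longrightarrow_{d}\;
  \arg\min_{\mathbf u} V(\mathbf u),
```

i.e., convexity of \(V_n\) together with uniqueness of the minimizer of \(V\) upgrades pointwise convergence in distribution of the objectives to convergence in distribution of their minimizers.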
Li, Y., Jin, B. Pairwise Fusion Approach Incorporating Prior Constraint Information. Commun. Math. Stat. 8, 47–62 (2020). https://doi.org/10.1007/s40304-018-0168-3
Keywords
- Alternating direction method of multipliers
- Prior constraint information
- Sparsity
- Homogeneity
- Linear regression