Abstract
When specifying a kink regression model, an excessive number of explanatory variables and corresponding coefficients may be included, leading to over-parameterization and multicollinearity. Motivated by these problems, five sparse estimation methods, namely LASSO, sparse Ridge, SCAD, MCP, and Bridge, are considered for simultaneous variable selection and parameter estimation in the kink regression model, as alternatives to Ordinary Least Squares (OLS). To compare the performance of these sparse estimators, both simulation studies and a real-data application are conducted. The simulation results demonstrate the superior performance of the sparse estimators over the non-sparse estimator in terms of selection accuracy and prediction. However, no single sparse estimator emerges as clearly most appropriate for estimating the kink regression. In the application study, by contrast, the comparison indicates that SCAD is the preferable penalty function for applying kink regression to the life expectancy data, as it yields the lowest EBIC and the highest \({\text{Adj - }}R^{2}\).
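The approach described above can be illustrated with a minimal sketch: a kink regression profiles an unknown threshold \(\kappa\), and at each candidate threshold a penalized (here LASSO) fit performs variable selection over the kink basis and the covariates. This is not the paper's implementation; the simulated data, the threshold grid, and the penalty level `alpha` are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Simulated data (assumption, for illustration only): one kink variable x
# with true threshold 2.0, plus 10 covariates of which only the first matters.
n, p = 300, 10
x = rng.uniform(0, 4, n)
Z = rng.normal(size=(n, p))
y = (1.5 * np.minimum(x - 2.0, 0)      # slope below the kink
     - 0.8 * np.maximum(x - 2.0, 0)    # slope above the kink
     + 2.0 * Z[:, 0]
     + rng.normal(scale=0.3, size=n))

def kink_design(x, Z, kappa):
    """Kink basis (x - kappa)_- and (x - kappa)_+ stacked with covariates Z."""
    neg = np.minimum(x - kappa, 0)
    pos = np.maximum(x - kappa, 0)
    return np.column_stack([neg, pos, Z])

# Profile the threshold over a grid; at each candidate, fit a LASSO and keep
# the threshold giving the smallest residual sum of squares.
best = None
for kappa in np.linspace(0.5, 3.5, 61):
    X = kink_design(x, Z, kappa)
    fit = Lasso(alpha=0.05).fit(X, y)
    rss = np.sum((y - fit.predict(X)) ** 2)
    if best is None or rss < best[0]:
        best = (rss, kappa, fit)

rss, kappa_hat, fit = best
print("estimated threshold:", kappa_hat)
print("nonzero coefficient indices:", np.flatnonzero(fit.coef_ != 0))
```

In practice the penalty level would be chosen by cross-validation or an information criterion such as EBIC, and the same profiling loop applies with the SCAD, MCP, or Bridge penalties in place of the LASSO.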
Acknowledgments
The author is grateful to two reviewers for several helpful suggestions and discussions. Thanks also go to Dr. Laxmi Worachai for her helpful comments.
Funding
This study is funded by the Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University (Grant number: R000023389).
Ethics declarations
Conflict of interest
The author declares that there is no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by the author.
Additional information
Communicated by Vladik Kreinovich.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Yamaka, W. Sparse estimations in kink regression model. Soft Comput 25, 7825–7838 (2021). https://doi.org/10.1007/s00500-021-05797-z