Abstract
Information-based model selection criteria such as the AIC and BIC employ check loss functions to measure the goodness of fit for quantile regression models. Model selection using a check loss function is robust due to its resistance to outlying observations. In the present study, we suggest modifying the check loss function to achieve a more efficient goodness of fit. Because the cusp of the check loss is quadratically adjusted in the modified version, greater efficiency (or variance reduction) in the model selection is expected. Because we focus on model selection here, we do not modify the model-fitting process. Generalized cross-validation is another common method for choosing smoothing parameters in quantile smoothing splines. We describe how this can be adjusted using the modified check loss to increase efficiency. The proposed generalized cross-validation is designed to reflect the target quantile and sample size. Two real data sets and simulation studies are presented to evaluate its performance using linear and nonlinear quantile regression models.
References
Chen, J., & Chen, Z. (2012). Extended BIC for small-n-large-p sparse GLM. Statistica Sinica, 22, 555–574.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.
Golub, G. H., Heath, M., & Wahba, G. (1979). Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics, 21(2), 215–223.
He, X., Ng, P., & Portnoy, S. (1998). Bivariate quantile smoothing splines. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 60(3), 537–550.
Jung, Y., MacEachern, S. N., & Kim, H. J. (2021). Modified check loss for efficient estimation via model selection in quantile regression. Journal of Applied Statistics, 48(5), 866–886.
Koenker, R. (1994). Quantile smoothing splines. Biometrika, 81(4), 673–680.
Koenker, R. (2005). Quantile regression. Cambridge University Press.
Koenker, R., & Bassett, G. (1978). Regression quantiles. Econometrica, 46(1), 33–50.
Konishi, S., & Kitagawa, G. (1996). Generalised information criteria in model selection. Biometrika, 83(4), 875–890.
Lee, Y., MacEachern, S. N., & Jung, Y. (2012). Regularization of case-specific parameters for robustness and efficiency. Statistical Science, 27(3), 350–372.
Muggeo, V. M., Sciandra, M., & Augugliaro, L. (2012). Quantile regression via iterative least squares computations. Journal of Statistical Computation and Simulation, 82(11), 1557–1569.
Nychka, D., Furrer, R., Paige, J., & Sain, S. (2017). fields: Tools for spatial data. R package version 11.6.
Nychka, D., Gray, G., Haaland, P., Martin, D., & O’Connell, M. (1995). A nonparametric regression approach to syringe grading for quality improvement. Journal of the American Statistical Association, 90(432), 1171–1178.
Ronchetti, E. (1985). Robust model selection in regression. Statistics & Probability Letters, 3(1), 21–23.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Yuan, M. (2006). GACV for quantile smoothing splines. Computational Statistics & Data Analysis, 50(3), 813–829.
Acknowledgements
Yoonsuh Jung’s work is partially supported by National Research Foundation of Korea (NRF) grants funded by the Korean government (MSIT) (No. 2019R1F1A1040515 and No. 2019R1A4A1028134).
Supplementary Information
Appendix
In practice, \(\sigma\) is unknown and is estimated by \({\hat{\sigma }}=\sum _{i=1}^{n}\rho _q(u_i)/n\). Maximizing the log likelihood with \({\hat{\sigma }}\) substituted is thus equivalent to maximizing \(-n\log ({\hat{\sigma }})-n+n\log \{q(1-q)\}\).
Because the second and the third terms in the last equation are constant, maximizing the log likelihood is equivalent to minimizing \(n\log ({\hat{\sigma }})\). Then, adding the penalty term \(\alpha (n,k)\) yields the information-based criterion in (2).
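This equivalence can be verified numerically. The sketch below assumes the standard asymmetric Laplace working density \(f(u)=\frac{q(1-q)}{\sigma }\exp \{-\rho _q(u)/\sigma \}\) that underlies the check-loss likelihood; the residuals are simulated only for illustration.

```python
import numpy as np

def rho(u, q):
    """Check loss: rho_q(u) = u * (q - I(u < 0))."""
    u = np.asarray(u, dtype=float)
    return u * (q - (u < 0))

rng = np.random.default_rng(1)
q = 0.3
u = rng.normal(size=50)            # illustrative residuals u_i
n = len(u)

sigma_hat = rho(u, q).sum() / n    # hat{sigma} = sum_i rho_q(u_i) / n

# Asymmetric Laplace log likelihood, profiled at sigma_hat
loglik = n * np.log(q * (1 - q)) - n * np.log(sigma_hat) - rho(u, q).sum() / sigma_hat

# Closed form after substitution: the last term collapses to -n
closed = -n * np.log(sigma_hat) - n + n * np.log(q * (1 - q))

print(np.isclose(loglik, closed))  # True: only -n*log(sigma_hat) depends on the fit
```

Since the second and third terms of the closed form do not depend on the fitted model, ranking models by the profiled log likelihood is the same as ranking them by \(n\log ({\hat{\sigma }})\).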
1.1 Simulations for a rule \(c_{q,n}\)
Although \(c_{q,n}\) in GCV is not explicitly expressed in (5), it was originally set to 1 in Nychka et al. (1995). However, this value was not satisfactory in our experiments. To find an appropriate value for \(c_{q,n}\) and to provide an empirical rule, we tried several candidates. Tables 11 and 12 show the MSE when \(c_{q,n}=1\) under Model 1 and Model 2. Tables 13 and 14 contain the results when \(c_{q,n} = n^{(q - 0.5)^2}\). Finally, the current results (Tables 8 and 9) come from our rule \(c_{q,n} = n^{|q - 0.5|}\). We chose this rule because the overall MSE and the reduction in MSE when GCV is replaced by EGCV are the most satisfactory.
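The chosen rule reduces to the original value 1 of Nychka et al. (1995) at the median and grows with both the sample size and the distance of the target quantile from 0.5. A minimal illustration (the helper name `c_rule` is ours, not notation from the paper):

```python
def c_rule(q, n):
    """Empirical rule c_{q,n} = n^{|q - 0.5|} for the GCV constant."""
    return n ** abs(q - 0.5)

print(c_rule(0.5, 100))   # 1.0 -- recovers c = 1 at the median for any n
print(c_rule(0.9, 100))   # 100**0.4, roughly 6.31 in the far tail
```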
1.2 Modified check loss in model fitting process
In our work, the modified check loss is used only for the model selection process. However, using it in the model fitting process can further improve the overall modeling procedure. Lee et al. (2012) provide the theoretical justification for, and an intuitive explanation of, the modified check loss in terms of model fitting. Recent work by Jung et al. (2021) presents some theoretical properties of the modified check loss under the framework of cross-validation. In this section, we replace the check loss with the modified check loss in the model fitting process. The results are summarized in Table 15. Here, GCV and EGCV use the check loss for model fitting and the modified check loss for model selection, while \(GCV^{M}\) and \(EGCV^{M}\) employ the modified check loss for both model fitting and model selection. Overall, we see a clear pattern. The reduction is larger when the modified check loss is used in the model fitting process (\(GCV^{M}\) and \(EGCV^{M}\) show lower MSE than GCV and EGCV, respectively). When the modified check loss is used in the model selection process, we still observe some improvement (EGCV and \(EGCV^{M}\) show lower MSE than GCV and \(GCV^{M}\), respectively). A final comparison is between using the modified check loss only in the fitting part (\(GCV^{M}\)) and using it only in the selection process (EGCV). Employing the modified check loss only in the fitting process yields better results. This reflects the general fact that the model fitting procedure matters more than tuning parameter selection, so improving the model fitting produces a greater reduction in MSE than improving the model selection. However, even after improving the model fitting process, there is still room for further improvement; Table 15 clearly shows this by comparing \(GCV^{M}\) and \(EGCV^{M}\). Note that the main topic of this paper is the improvement of the model selection procedure.
The improvement in the model fitting part achieved by the modified check loss is shown extensively in Lee et al. (2012). Therefore, in this paper we focus on its role in the model selection part only, although using the modified check loss for both model fitting and model selection produces the best results. Finally, the difference among all four methods gradually diminishes as the sample size increases, because the modification is designed to vanish as n grows.
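Lee et al. (2012) and Jung et al. (2021) give the precise form of the modified check loss; as a rough sketch of the idea only, the code below replaces the cusp of the check loss with a quadratic piece on a neighborhood \([-\delta , \delta ]\). Here `delta` is a hypothetical smoothing constant chosen for illustration, not the authors' exact specification.

```python
import numpy as np

def check_loss(u, q):
    """Standard check loss rho_q(u) = u * (q - I(u < 0))."""
    u = np.asarray(u, dtype=float)
    return u * (q - (u < 0))

def modified_check_loss(u, q, delta=0.5):
    """Sketch only: quadratic near the cusp, check-loss slopes outside.

    delta is an assumed smoothing constant; the two branches meet
    continuously at |u| = delta, where both equal w * delta / 2.
    """
    u = np.asarray(u, dtype=float)
    w = np.where(u >= 0, q, 1 - q)        # asymmetric slopes of the check loss
    quad = w * u**2 / (2 * delta)         # smooth bowl replacing the cusp
    lin = w * (np.abs(u) - delta / 2)     # same slope as the check loss far out
    return np.where(np.abs(u) <= delta, quad, lin)

q, delta = 0.3, 0.5
print(modified_check_loss(delta, q, delta))  # 0.075 = w * delta / 2 at the boundary
print(check_loss(delta, q))                  # 0.15: same slope outside, losses shifted
```

Because the quadratic piece shrinks with `delta`, a rule that lets the adjustment window vanish with the sample size recovers the ordinary check loss asymptotically, consistent with the vanishing modification described above.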
Shin, W., Kim, M. & Jung, Y. Efficient information-based criteria for model selection in quantile regression. J. Korean Stat. Soc. 51, 245–281 (2022). https://doi.org/10.1007/s42952-021-00137-1