Robust regression based on shrinkage with application to Living Environment Deprivation

  • Original Paper
  • Stochastic Environmental Research and Risk Assessment

Abstract

A robust estimator is proposed for the parameters of the linear regression model. It is based on the notion of shrinkage, often used in finance and previously studied for outlier detection in multivariate data. A thorough simulation study investigates the efficiency under normal and heavy-tailed errors, the robustness under contamination, the computational time, and the affine equivariance and breakdown value of the regression estimator. Two classical data sets often used in the literature, and a real socioeconomic data set on the Living Environment Deprivation of areas in Liverpool (UK), are studied. The results from the simulations and the real data examples show the advantages of the proposed robust estimator in regression.
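
The abstract does not spell out the estimator, so the following is only a minimal sketch of the two general ideas it names: shrinking a covariance estimate toward a scaled identity target, and reading regression coefficients off the joint covariance of predictor and response. It is not the authors' method. The function names, the fixed intensity `eta`, and the use of the plain sample mean and covariance (which are not robust) are all assumptions made for illustration.

```python
from statistics import fmean


def shrunk_covariance(x, y, eta):
    """Shrink the 2x2 sample covariance of (x, y) toward nu * I.

    eta in [0, 1] is a hypothetical, hand-picked shrinkage intensity;
    a real procedure would choose it from the data.
    """
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    nu = (sxx + syy) / 2.0  # average variance: scale of the identity target
    # The target has zero off-diagonal entries, so sxy is only scaled down.
    return (
        (1 - eta) * sxx + eta * nu,
        (1 - eta) * sxy,
        (1 - eta) * syy + eta * nu,
    )


def regression_from_covariance(x, y, eta=0.0):
    """Slope and intercept implied by the (shrunk) joint covariance."""
    sxx, sxy, _ = shrunk_covariance(x, y, eta)
    slope = sxy / sxx
    intercept = fmean(y) - slope * fmean(x)
    return slope, intercept


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]  # exactly y = 2x + 1
ols = regression_from_covariance(x, y, eta=0.0)     # no shrinkage
shrunk = regression_from_covariance(x, y, eta=0.2)  # slope pulled toward zero
```

With `eta=0.0` the sketch reduces to ordinary least squares; any `eta > 0` inflates the predictor variance relative to the cross-covariance and so pulls the slope toward zero. A robust variant would additionally replace the mean and covariance with robust location and scatter estimates.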

Acknowledgements

The authors are grateful to the editor and the referee for their constructive and valuable comments. This research was partially supported by MINISTERIO DE ECONOMIA, INDUSTRIA Y COMPETITIVIDAD, Award Number: ECO2015-66593-P.

Author information

Corresponding author

Correspondence to Elisa Cabana.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 116 KB)

About this article

Cite this article

Cabana, E., Lillo, R.E. & Laniado, H. Robust regression based on shrinkage with application to Living Environment Deprivation. Stoch Environ Res Risk Assess 34, 293–310 (2020). https://doi.org/10.1007/s00477-020-01774-4

  • DOI: https://doi.org/10.1007/s00477-020-01774-4
