
Cost-sensitive selection of variables by ensemble of model sequences

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Many applications require the collection of data on different variables, or measurements over many system performance metrics; we refer to these broadly as measures or variables. Data collection along each measure often incurs a cost, so it is desirable to account for the cost of measures in modeling. This is a fairly new class of problems in the area of cost-sensitive learning. A few attempts have been made to incorporate costs in combining and selecting measures, but existing studies either do not strictly enforce a budget constraint or are not the ‘most’ cost-effective. Focusing on classification problems, we propose a computationally efficient approach that finds a near-optimal model under a given budget by exploring the most ‘promising’ part of the solution space. Instead of outputting a single model, we produce a model schedule: a list of models sorted by model cost and expected predictive accuracy. The schedule can be used to choose the model with the best predictive accuracy under a given budget, or to trade off between the budget and the predictive accuracy. Experiments on benchmark datasets show that our approach compares favorably to competing methods.
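To make the model-schedule idea concrete, below is a minimal Python sketch of how such a schedule could be consumed. The entry fields and the helper best_under_budget are illustrative assumptions, not code from the paper; in practice the accuracy estimates would come from something like cross-validation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ModelEntry:
    """One entry of a model schedule: a fitted model's measure cost
    and its estimated predictive accuracy (fields are hypothetical)."""
    cost: float          # total acquisition cost of the measures used
    accuracy: float      # e.g., cross-validated predictive accuracy
    variables: tuple     # indices of the measures the model relies on

def best_under_budget(schedule: List[ModelEntry],
                      budget: float) -> Optional[ModelEntry]:
    """Return the most accurate model whose measure cost fits the
    budget, or None if every model exceeds it."""
    feasible = [m for m in schedule if m.cost <= budget]
    return max(feasible, key=lambda m: m.accuracy, default=None)

# Example: a toy schedule sorted by cost, queried at budget 10.
schedule = [
    ModelEntry(cost=3.0,  accuracy=0.81, variables=(0,)),
    ModelEntry(cost=7.5,  accuracy=0.88, variables=(0, 2)),
    ModelEntry(cost=12.0, accuracy=0.91, variables=(0, 2, 5)),
]
print(best_under_budget(schedule, budget=10.0))  # -> the 0.88 model
```

Because the schedule is sorted by cost, the feasible prefix under any budget can also be scanned directly to read off the budget-versus-accuracy trade-off.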



Acknowledgements

We thank the editors and the anonymous reviewers for their helpful comments and suggestions. This work was partially supported by the University Industry Collaborative Award (R32020110000000) from the University of Massachusetts Dartmouth.

Author information


Corresponding author

Correspondence to Donghui Yan.


Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yan, D., Qin, Z., Gu, S. et al. Cost-sensitive selection of variables by ensemble of model sequences. Knowl Inf Syst 63, 1069–1092 (2021). https://doi.org/10.1007/s10115-021-01551-x

