Abstract
Many applications require collecting data on a variety of variables or measurements, such as system performance metrics; we refer to these broadly as measures or variables. Data collection along each measure often incurs a cost, so it is desirable to account for the cost of measures in modeling. This is a fairly new class of problems in the area of cost-sensitive learning. A few attempts have been made to incorporate costs when combining and selecting measures, but existing studies either do not strictly enforce a budget constraint or are not the ‘most’ cost-effective. Focusing on classification problems, we propose a computationally efficient approach that finds a near-optimal model under a given budget by exploring the most ‘promising’ part of the solution space. Instead of outputting a single model, we produce a model schedule: a list of models sorted by model cost and expected predictive accuracy. The schedule can be used to choose the model with the best predictive accuracy under a given budget, or to trade off budget against predictive accuracy. Experiments on benchmark datasets show that our approach compares favorably to competing methods.
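To make the model-schedule idea concrete, here is a minimal sketch (not the paper's implementation) of how such a schedule could be consumed downstream. It assumes a hypothetical representation of the schedule as a list of (cost, estimated accuracy, model) tuples; all names and values are illustrative.

```python
# Hypothetical illustration of consuming a model schedule: given
# (cost, estimated_accuracy, model) entries, pick the most accurate
# model whose cost fits within the budget.

from typing import Any, List, Tuple


def best_model_under_budget(
    schedule: List[Tuple[float, float, Any]], budget: float
) -> Any:
    """Return the model with the highest estimated accuracy whose
    cost does not exceed `budget`; raise if none is affordable."""
    affordable = [entry for entry in schedule if entry[0] <= budget]
    if not affordable:
        raise ValueError("no model in the schedule fits the budget")
    # Maximize the estimated accuracy (second field of each entry).
    return max(affordable, key=lambda entry: entry[1])[2]


# Made-up costs and accuracies, purely for illustration.
schedule = [(1.0, 0.71, "model_a"), (2.5, 0.83, "model_b"), (6.0, 0.90, "model_c")]
print(best_model_under_budget(schedule, budget=3.0))  # -> "model_b"
```

The same schedule also supports the budget-accuracy trade-off described above: scanning the sorted list shows how much accuracy each additional unit of budget would buy.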
Acknowledgements
We thank the editors and the anonymous reviewers for their helpful comments and suggestions. This work was partially supported by the University Industry Collaborative Award (R32020110000000) from the University of Massachusetts Dartmouth.
Cite this article
Yan, D., Qin, Z., Gu, S. et al. Cost-sensitive selection of variables by ensemble of model sequences. Knowl Inf Syst 63, 1069–1092 (2021). https://doi.org/10.1007/s10115-021-01551-x