Subset selection via continuous optimization with applications to network design

Environmental Monitoring and Assessment

Abstract

Choosing a subset of representative items from a set of alternatives is an important problem in many scientific fields, such as environmental science and statistics. For most practical problems, however, a computationally efficient solution method is not known to exist. While this problem has attracted significant attention, the majority of specifically designed algorithms either do not scale well with problem size or are not available as a usable open-source package. In this study, we show that any global continuous optimization technique can be used to solve the representative subset selection problem. This is achieved by designing a simple transformation that embeds the problem’s discrete space into a larger continuous space. The proposed methodology is applied to design problems in environmental and statistical domains. We evaluate the proposed method using several open-source global optimization packages and show that it compares favorably with existing direct methods.
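
The abstract does not spell out the transformation itself, so the following is only a minimal sketch of one common way such an embedding can work, not necessarily the construction used in the article. It assumes a continuous vector in the unit cube is decoded into a fixed-size subset by taking the indices of its largest entries. The names `decode`, `min_gap`, and `random_search`, the site coordinates, and the spacing criterion are all hypothetical, and plain random search stands in for whichever global continuous optimizer one would actually plug in.

```python
import random
from itertools import combinations


def decode(x, k):
    """Assumed embedding: map a continuous vector x to the size-k subset
    made of the indices of its k largest entries (ties broken by index)."""
    order = sorted(range(len(x)), key=lambda i: (-x[i], i))
    return tuple(sorted(order[:k]))


def min_gap(subset, sites):
    """Hypothetical design criterion: the smallest pairwise distance
    between the chosen sites (larger means better spread)."""
    return min(abs(sites[i] - sites[j]) for i, j in combinations(subset, 2))


def random_search(objective, n, k, iters=2000, seed=0):
    """Stand-in for a global continuous optimizer over the cube [0, 1]^n.
    It only ever sees continuous points; decode() supplies the subset."""
    rng = random.Random(seed)
    best_subset, best_value = None, float("-inf")
    for _ in range(iters):
        x = [rng.random() for _ in range(n)]
        subset = decode(x, k)
        value = objective(subset)
        if value > best_value:
            best_subset, best_value = subset, value
    return best_subset, best_value


# Illustrative candidate monitoring sites on a line (made-up coordinates).
sites = [0.0, 0.1, 0.2, 0.5, 0.55, 0.9, 1.0]
best_subset, best_value = random_search(
    lambda s: min_gap(s, sites), n=len(sites), k=3
)
print(best_subset, best_value)
```

Because the optimizer only interacts with the objective through `decode`, the random search could be swapped for any continuous global method without touching the subset logic, which is the property the abstract emphasizes.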

Acknowledgments

I am thoroughly grateful to the anonymous reviewers and the editor for their valuable and constructive remarks and suggestions.

Funding

This work was supported by the Australian Research Council Centre of Excellence for Mathematical & Statistical Frontiers under grant number CE140100049.

Author information

Corresponding author

Correspondence to Radislav Vaisman.

Ethics declarations

This article does not contain any studies with human or animal subjects performed by any of the authors.

Conflict of interest

The author declares that there are no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Vaisman, R. Subset selection via continuous optimization with applications to network design. Environ Monit Assess 192, 361 (2020). https://doi.org/10.1007/s10661-019-7938-6

  • DOI: https://doi.org/10.1007/s10661-019-7938-6
