Testing the equality of a large number of populations

  • Original Paper
TEST

Abstract

Given k independent samples with finite but arbitrary dimension, this paper deals with the problem of testing for the equality of their distributions, which can be continuous, discrete or mixed. In contrast to the classical setting, where k is assumed to be fixed and the sample size from each population increases without bound, here k is assumed to be large and the size of each sample is either bounded or small in comparison with k. The asymptotic distribution of two test statistics is stated under the null hypothesis of the equality of the k distributions as well as under alternatives, which allows us to study the asymptotic power of the resulting tests. Specifically, it is shown that both test statistics are asymptotically distribution-free under the null hypothesis. The finite-sample performance of the tests based on the asymptotic null distribution is studied via simulation. An application of the proposal to a real data set is included. The use of the proposed procedure for infinite-dimensional data, as well as other possible extensions, is discussed.

Acknowledgements

The authors thank the Associate Editor and two anonymous referees for their constructive comments and suggestions, which helped to improve the presentation. M.D. Jiménez-Gamero has been partially supported by Grants MTM2017-89422-P (Spanish Ministry of Economy, Industry and Competitiveness, the State Agency of Investigation, the European Regional Development Fund) and P18-FR-2369 (Junta de Andalucía). M. Cousido-Rocha has received financial support from the SiDOR research group through the Grant Competitive Reference Group, 2016-2019 (ED431C 2016/040), funded by the Consellería de Cultura, Educación e Ordenación Universitaria, Xunta de Galicia, and also from Grant MTM2017-89422-P. M.V. Alba-Fernández and F. Jiménez-Jiménez acknowledge the financial support provided by the Grant EI_SEJ5_2019 (Universidad de Jaén).

Author information

Corresponding author

Correspondence to M. D. Jiménez-Gamero.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 234 KB)

About this article

Cite this article

Jiménez-Gamero, M.D., Cousido-Rocha, M., Alba-Fernández, M.V. et al. Testing the equality of a large number of populations. TEST 31, 1–21 (2022). https://doi.org/10.1007/s11749-021-00769-9

  • DOI: https://doi.org/10.1007/s11749-021-00769-9
