Abstract
Pearson’s chi-square statistic is well established for testing goodness-of-fit of various hypotheses about observed frequency distributions in contingency tables. A general formula for ANOVA-like decompositions of Pearson’s statistic is given under the independence assumption, along with extensions to higher-order tables. Mathematically, the formula makes the terms in the partitions, and the orthogonality among them, obvious. Practically, it enables simultaneous analyses of marginal and joint probabilities in contingency tables under a variety of hypotheses about the marginal probabilities. Specifically, this framework accommodates the specification of theoretically driven probabilities as well as the well-known cases in which the marginal probabilities are fixed or estimated from the data. The former allows tests of prescribed marginal probabilities, while the latter allows tests of the associations among variables after eliminating the marginal effects. Mixtures of these two cases are also permitted. Examples are given to illustrate the tests.
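To make the idea of an ANOVA-like partition concrete, the following is a minimal sketch of the classical two-way case only, not the general formula of the paper. It assumes prescribed row and column probabilities and an independence hypothesis, and splits Pearson's statistic into a row-marginal term, a column-marginal term, and an association term. The function name `partition_chi2` and the example counts are hypothetical, chosen purely for illustration.

```python
def partition_chi2(counts, row_probs, col_probs):
    """Split Pearson's statistic for a two-way table into three additive terms.

    counts    : list of rows of observed frequencies
    row_probs : hypothesised row marginal probabilities (assumed to sum to 1)
    col_probs : hypothesised column marginal probabilities (assumed to sum to 1)
    """
    a, b = list(row_probs), list(col_probs)
    rows, cols = range(len(a)), range(len(b))
    n = sum(sum(row) for row in counts)

    # Observed proportions and their margins.
    p = [[x / n for x in row] for row in counts]
    p_row = [sum(row) for row in p]
    p_col = [sum(col) for col in zip(*p)]

    # Total statistic against the independence model a_i * b_j.
    total = n * sum((p[i][j] - a[i] * b[j]) ** 2 / (a[i] * b[j])
                    for i in rows for j in cols)

    # Marginal terms: departures of the observed margins from the prescribed ones.
    row_term = n * sum((p_row[i] - a[i]) ** 2 / a[i] for i in rows)
    col_term = n * sum((p_col[j] - b[j]) ** 2 / b[j] for j in cols)

    # Association term: what remains after removing both marginal departures.
    assoc_term = n * sum(
        (p[i][j] - p_row[i] * b[j] - a[i] * p_col[j] + a[i] * b[j]) ** 2
        / (a[i] * b[j])
        for i in rows for j in cols)

    return {"total": total, "rows": row_term,
            "columns": col_term, "association": assoc_term}


# Made-up 2 x 3 table with prescribed uniform margins (illustrative values only).
example = partition_chi2([[10, 20, 30], [20, 20, 20]],
                         [0.5, 0.5], [1 / 3, 1 / 3, 1 / 3])
print(example)
```

In this sketch the three terms add up to the total because the cross-products between them vanish under the weights `1 / (a[i] * b[j])`. If the observed margins are passed in as `row_probs` and `col_probs`, the two marginal terms are zero and the association term is the usual chi-square statistic for independence.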
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Cite this article
Lombardo, R., Takane, Y. & Beh, E.J. Familywise decompositions of Pearson’s chi-square statistic in the analysis of contingency tables. Adv Data Anal Classif 14, 629–649 (2020). https://doi.org/10.1007/s11634-019-00374-7
DOI: https://doi.org/10.1007/s11634-019-00374-7
Keywords
- Pearson’s chi-square statistic
- Orthogonal projectors
- Familywise decompositions
- Hypothesised probabilities
- Observed proportions