Abstract
This paper studies the relationships among a set of categorical (ordinal) variables collected in a contingency table. Besides the marginal and conditional (in)dependencies, thoroughly analyzed in the literature, we consider context-specific independencies, which hold only in a subspace of the outcome space of the conditioning variables. To this end we use hierarchical multinomial marginal models and provide several original results about the representation of context-specific independencies through these models. The theoretical results are supported by an application concerning the degree of innovation of Italian enterprises.
References
Agresti A (2013) Categorical data analysis. Wiley, Hoboken
Albert JH (1996) Bayesian selection of log-linear models. Can J Stat 24(3):327–347
Bartolucci F, Colombi R, Forcina A (2007) An extended class of marginal link functions for modelling contingency tables by equality and inequality constraints. Stat Sin 17:691–711
Bergsma WP, Rudas T (2002) Marginal models for categorical data. Ann Stat 30(1):140–159
Boutilier C, Friedman N, Goldszmidt M, Koller D (1996) Context-specific independence in Bayesian networks. In: Proceedings of the twelfth international conference on Uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc., pp 115–123
Cazzaro M, Colombi R (2008) Modelling two way contingency tables with recursive logits and odds ratios. Stat Methods Appl 17(4):435–453
Cazzaro M, Colombi R (2014) Marginal nested interactions for contingency tables. Commun Stat Theory Methods 43(13):2799–2814
Colombi R, Forcina A (2014) A class of smooth models satisfying marginal and context specific conditional independencies. J Multivar Anal 126:75–85
Colombi R, Giordano S, Cazzaro M (2014) hmmm: an R package for hierarchical multinomial marginal models. J Stat Softw 59(11):1–25
Dellaportas P, Forster JJ (1999) Markov chain Monte Carlo model determination for hierarchical and graphical log-linear models. Biometrika 86(3):615–633
Drton M (2009) Likelihood ratio tests and singularities. Ann Stat 37(2):979–1012
Forcina A (2012) Smoothness of conditional independence models for discrete data. J Multivar Anal 106:49–56
Glonek GFV, McCullagh P (1995) Multivariate logistic models. J R Stat Soc Ser B (Methodological) 57:533–546
Højsgaard S (2004) Statistical inference in context specific interaction models for contingency tables. Scand J Stat 31(1):143–158
Istat (2012) Italian innovation survey on SM enterprises
La Rocca L, Roverato A (2017) Discrete graphical models. Handbook of graphical models. Handbooks of modern statistical methods. Chapman and Hall/CRC, Boca Raton
Lauritzen SL (1996) Graphical models, vol 17. Clarendon Press, Oxford
Meyer D, Zeileis A, Hornik K (2017) vcd: Visualizing Categorical Data. R package version 1.4-4
Nicolussi F, Cazzaro M (2017) Context-specific independencies for ordinal variables in chain regression models. arXiv:1712.05229
Ntzoufras I, Tarantola C (2013) Conjugate and conditional conjugate Bayesian analysis of discrete graphical models of marginal independence. Comput Stat Data Anal 66:161–177
Ntzoufras I, Tarantola C, Lupparelli M (2018) Probability based independence sampler for Bayesian quantitative learning in graphical log-linear marginal models. Working paper No. 149. University of Pavia, Department of Economics and Management
Nyman H, Pensar J, Koski T, Corander J (2014) Stratified graphical models—context-specific independence in graphical models. Bayesian Anal 9(4):883–908
Nyman H, Pensar J, Koski T, Corander J (2016) Context specific independence in graphical log-linear models. Comput Stat 31(4):1493–1512
R Core Team (2014) R: a language and environment for statistical computing. R foundation for statistical computing, Vienna, Austria
Roverato A (2017) Graphical models for categorical data. Cambridge University Press, Cambridge
Rudas T, Bergsma WP, Németh R (2010) Marginal log-linear parameterization of conditional independence models. Biometrika 97(4):1006–1012
Sadeghi K, Rinaldo A (2018) Markov properties of discrete determinantal point processes. arXiv:1810.02294v2
Appendix: Proofs and further results
We now prove Theorems 1, 2, and 3. To do so, we first need to state and prove two preliminary results: Lemma 1 and Corollary 1.
Lemma 1
Given an HMM parameter \(\eta _{\mathcal {L}}^{\mathcal {M}}(\varvec{i}_{\mathcal {L}})\) whose interaction set can be expressed as the union of two disjoint sets, \(\mathcal {L}=L\cup C\), both contained in \(\mathcal {M}\), the parameter can be decomposed as follows
Proof of Lemma 1
By Proposition 1 of Bartolucci et al. (2007), each parameter \(\eta _{\mathcal {L}}^{\mathcal {M}}(\varvec{i}_{\mathcal {L}})\), where \(\mathcal {L}=L\cup C\), can be rewritten as
where \(\eta ^{\mathcal {M}}_{L}(\varvec{i}_{L}|\varvec{i}_{ J},\varvec{i}^*_{C\backslash J},\varvec{I}_{\mathcal {M}\backslash \mathcal {L}})\) is the HMM parameter \(\varvec{\eta }^{\mathcal {M}}_{L}\) evaluated in the conditional distribution where the variables in \(X_J\) assume values \(\varvec{i}_{ J}\) and the variables in \(X_{C\backslash J}\) are set to the categories \(\varvec{i}^*_{C\backslash J}\).
When the set C is composed of only one index, \(C=\gamma _1\), the decomposition in formula (13) becomes
that corresponds to formula (12).
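The single-index case can be checked numerically in a familiar special case. The sketch below is an illustration under stated assumptions, not the paper's displayed formula: it takes a hypothetical 2x2x2 table with \(\mathcal {M}=\mathcal {L}=\{A,B,C\}\), uses baseline coding with the last category as reference, and fixes the sign of each contrast as "level against reference". The table values and the names `eta_abc` and `conditional_log_or` are made up for the example. Under these assumptions the three-way interaction equals the difference between the log odds ratio of A and B evaluated at the current level of C and the one evaluated at the reference level of C.

```python
import numpy as np

# Hypothetical strictly positive 2x2x2 joint table, indexed as p[a, b, c].
# The numbers are invented for illustration only.
p = np.array([[[0.10, 0.05], [0.05, 0.20]],
              [[0.15, 0.10], [0.05, 0.30]]])
log_p = np.log(p)

# Baseline contrast (an assumed convention): first level against the last one.
contrast = np.array([1.0, -1.0])

# Three-way baseline interaction: the contrast applied along every axis.
eta_abc = np.einsum("a,b,c,abc->", contrast, contrast, contrast, log_p)

def conditional_log_or(c):
    """Log odds ratio of A and B in the slice C = c.

    The slice is not renormalised because the odds ratio is unchanged
    when a table is multiplied by a constant.
    """
    return np.einsum("a,b,ab->", contrast, contrast, log_p[:, :, c])

# Parameter at the current level of C minus the one at the reference level.
difference = conditional_log_or(0) - conditional_log_or(1)
print(eta_abc, difference)
```

The two printed numbers agree up to floating-point error, which is the one-index decomposition in this toy setting.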
When two variables belong to the set C, \(C=\left\{ \gamma _1,\gamma _2\right\} \), by applying formula (13) only to \(\gamma _{1}\) we get
Note that the first addend on the right-hand side can be further decomposed by using (13) as:
Now, by considering the HMM parameter \(\eta ^{\mathcal {M}}_{\mathcal {L}\backslash \gamma _2}(\varvec{i}_{\mathcal {L}\backslash \gamma _2}|\varvec{i}^*_{\gamma _2},\varvec{I}_{\mathcal {M}\backslash \mathcal {L}})\) and by applying formula (13) to \(\gamma _1\), we get
The last addend on the right-hand side of (16) is exactly the first addend on the right-hand side of (17). Thus, by substituting (16) and (17) into formula (15), we get:
that again corresponds to formula (12).
In general, when the set C is composed of k variables, \(C=\left\{ \gamma _1,\ldots ,\gamma _k\right\} \), we apply formula (13) focusing on only one variable of C. Thus, at the first step we get
Then, we apply formula (13) recursively, focusing on only one variable of C at a time, to any parameter in the formula without any index \(\varvec{i}^*\) in the conditioning set
where \(\gamma _{jp}=\cup _{i=1}^{j}\gamma _{i}\).
Now, we take into account all the parameters having both \(\varvec{i}\) and \(\varvec{i}^*\) in the conditioning set. Let us denote such a parameter by \(\eta ^{\mathcal {M}}_{L}(\varvec{i}_{L}|\varvec{i}_{A},\varvec{i}^*_{B},\varvec{I}_{\mathcal {M}\backslash LAB})\). It appears as the last term on the right-hand side of the decomposition (21), obtained by applying (13) to \(\eta ^{\mathcal {M}}_{\mathcal {L}\backslash B}(\varvec{i}_{\mathcal {L}\backslash B}|\varvec{i}^*_{B},{\varvec{I}}_{\mathcal {M}\backslash LAB})\):
Thus, we can isolate the term \(\eta ^{\mathcal {M}}_{L}(\varvec{i}_{L}|\varvec{i}_{A},\varvec{i}^*_{B},\varvec{I}_{\mathcal {M}\backslash LAB})\) as follows:
Now, in formula (20), we replace each addend like \(\eta ^{\mathcal {M}}_{L}(\varvec{i}_{L}|\varvec{i}_{A},\varvec{i}^*_{B},\varvec{I}_{\mathcal {M}\backslash LAB})\) with the expression in formula (22) and we apply this procedure recursively to each addend like \(\eta ^{\mathcal {M}}_{L}(\varvec{i}_{L}|\varvec{i}^*_{A\backslash J},\varvec{i}_{BJ},\varvec{I}_{\mathcal {M}\backslash LAB})\). In this way we finally obtain exactly formula (12). \(\square \)
Corollary 1
A parameter \(\varvec{\eta }_{L}^{\mathcal {M}}\) can be decomposed as the sum of higher-order parameters as follows:
where \(\mathcal {L}=L\cup C\) and \(C\cap L=\emptyset \).
Proof of Corollary 1
The proof follows directly by isolating the first term on the right-hand side of formula (12) of Lemma 1. \(\square \)
We are now ready to prove the theorems.
Proof of Theorem 1
Let us consider the parameters \(\varvec{\eta }^{\mathcal {M}}_{\mathcal {L}}\) when \(\mathcal {L}=(A\cup B\cup C)\subseteq \mathcal {M}\). By Lemma 1, each such parameter can be decomposed as
where \(\eta ^{\mathcal {M}}_{\mathcal {L}\backslash C}(\varvec{i}_{\mathcal {L}\backslash C}|\varvec{i}_{C},\varvec{I}_{\mathcal {M}\backslash \mathcal {L}})\) is the marginal parameter \(\varvec{\eta }^{\mathcal {M}}_{\mathcal {L}\backslash C}\) evaluated in the conditional distribution \((A\cup B|C=\varvec{i}_C)\). The term \(\eta ^{\mathcal {M}}_{\mathcal {L}\backslash C}(\varvec{i}_{\mathcal {L}\backslash C}|\varvec{i}_{C},\varvec{I}_{\mathcal {M}\backslash \mathcal {L}})\) is equal to zero if and only if the CSI in formula (4) holds. Thus,
Note that under the baseline aggregation criterion the cell \(\varvec{i}^*_{J}\) coincides with \(\mathbf I _{J}\); thus, from formula (2), the parameter \(\eta _{\mathcal {L}\backslash J}^{\mathcal {M}}(\varvec{i}_{\mathcal {L}\backslash J}|\varvec{i}^*_{J},\varvec{I}_{\mathcal {M}\backslash \mathcal {L}})\) is equal to \(\eta _{\mathcal {L}\backslash J}^{\mathcal {M}}(\varvec{i}_{\mathcal {L}\backslash J},\varvec{I}_{(\mathcal {M}\backslash \mathcal {L})J})\).
Finally, since the previous decomposition holds for each set \(q\in \mathcal {Q}=\left\{ q\subseteq (A \cup B):\right. \) \(\left. A\cap q\ne \emptyset ,B\cap q\ne \emptyset \right\} \), formula (7) follows. \(\square \)
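The key step of this proof is that a parameter evaluated in the conditional distribution \((A\cup B|C=\varvec{i}_C)\) vanishes exactly when the context-specific independence holds in that context. The following sketch illustrates this on a toy example, assuming two binary variables A and B and a three-level variable C. The conditional tables and the helper name `conditional_log_or` are hypothetical and chosen only so that A and B are independent in the first context of C and dependent in the other two.

```python
import numpy as np

# Hypothetical conditional tables of (A, B) given each level of C.
independent = np.outer([0.3, 0.7], [0.6, 0.4])    # A and B independent
dependent_1 = np.array([[0.4, 0.1], [0.1, 0.4]])  # positive association
dependent_2 = np.array([[0.1, 0.3], [0.4, 0.2]])  # negative association
p_c = np.array([0.5, 0.3, 0.2])                   # assumed marginal of C

# Joint table joint[a, b, c]: independence holds only in the context C = 0.
joint = np.stack([independent, dependent_1, dependent_2], axis=2) * p_c

def conditional_log_or(table, c):
    """Log odds ratio of A and B in the conditional distribution given C = c."""
    block = table[:, :, c] / table[:, :, c].sum()
    return np.log(block[0, 0] * block[1, 1] / (block[0, 1] * block[1, 0]))

log_ors = [conditional_log_or(joint, c) for c in range(3)]
print(log_ors)
```

Only the first entry of `log_ors` is zero (up to rounding), so the independence is specific to that context and is not an ordinary conditional independence of A and B given C.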
Proof of Theorem 2
Following the proof of Theorem 1, all the considerations up to the decomposition in formula (25) still hold. However, under the local aggregation criterion we have \(\varvec{i}^*_{J}\ne \mathbf I _{J}\), so the identity \(\eta ^{\mathcal {M}}_{\mathcal {L}\backslash J}(\varvec{i}_{\mathcal {L}\backslash J}|\varvec{i}^*_{J},\varvec{I}_{\mathcal {M}\backslash \mathcal {L}})=\eta ^{\mathcal {M}}_{\mathcal {L}\backslash J}(\varvec{i}_{\mathcal {L}\backslash J},\varvec{I}_{(\mathcal {M}\backslash \mathcal {L})J})\) no longer holds, since in the local coding \(\varvec{i}^*_{J}\) is equal to \((i_j+1)\) for all \(j\in J\), in short \(\varvec{i}_{J}+1\). Further, the parameter \(\eta ^{\mathcal {M}}_{\mathcal {L}\backslash C}(\varvec{i}_{\mathcal {L}\backslash C},\varvec{I}_{(\mathcal {M}\backslash \mathcal {L})C})\) is built in the conditional distribution where the variables in \(X_C\) assume the reference value \(\mathbf I _C\). Note that \(\eta ^{\mathcal {M}}_{\mathcal {L}\backslash J}(\varvec{i}_{\mathcal {L}\backslash J}|\varvec{i}_{J}+1,\varvec{I}_{(\mathcal {M}\backslash \mathcal {L})J})\) does not belong to the HMM parametrization. We now remark that the following relationship holds between the baseline parameters, \(\varvec{\eta }(\cdot )_{\textit{b}}\), and the local parameters, \(\varvec{\eta }(\cdot )_{\textit{l}}\):
When the variables in the conditioning set C are coded with the local approach, it suffices to apply the decomposition (26) only to the categories of the variables in the conditioning set C in order to return to the baseline approach. Thus we can rewrite (25) as:
where \(\eta _{A\cup B\cup c}^{\mathcal {M}}\) are the local parameters, exactly those appearing in formula (8). As in the proof of Theorem 1, the previous equivalence must hold for each subset q of \(A\cup B\) with at least one element in A and one element in B. \(\square \)
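The passage from local to baseline coding used in this proof can be illustrated in the simplest case of the logits of a single variable. The sketch below does not reproduce the paper's general relationship between baseline and local parameters; it only checks the standard telescoping identity, assuming the last category is the reference and each logit is written with the current category in the numerator. The probability vector and the function names are hypothetical.

```python
import numpy as np

def baseline_logits(p):
    """log(p_i / p_I): each category against the last (reference) one."""
    p = np.asarray(p, dtype=float)
    return np.log(p[:-1] / p[-1])

def local_logits(p):
    """log(p_i / p_{i+1}): each category against the next one."""
    p = np.asarray(p, dtype=float)
    return np.log(p[:-1] / p[1:])

# Hypothetical distribution of a four-level ordinal variable.
p = np.array([0.1, 0.2, 0.3, 0.4])

# Baseline logit at level i = sum of the local logits at levels i, ..., I-1.
rebuilt = np.cumsum(local_logits(p)[::-1])[::-1]
print(baseline_logits(p))
print(rebuilt)
```

Both printed vectors coincide, which shows in this toy case why a statement written in local coding can be translated into baseline coding by summing over the categories between the current level and the reference level.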
Proof of Theorem 3
Let us consider the inequality CSI statement as listed in formula (5).
When the set of categories \(\varvec{i}'_C\) in formula (5) is equal to \(\mathbf I _C\), i.e. when the CSI holds only when all variables in C assume the last level, the parameters \(\eta ^{\mathcal {M}}_{\mathcal {L}}(\varvec{i}_{\mathcal {L}})\) are null. Indeed, when \(\mathcal {L}=q\cup c\), with \(c\subseteq C\) and \(c\ne \emptyset \), the parameter is equal to \(\eta ^{\mathcal {M}}_{\mathcal {L}}(\varvec{i}_{q},\mathbf I _c)\), which is null by definition, however the variables are coded (baseline, local or continuation). When \(c=\emptyset \), instead, the parameter becomes \(\eta ^{\mathcal {M}}_{q}(\varvec{i}_{q})\), which is a contrast of logits or a higher-order parameter evaluated in the (conditional) contingency table of \(X_q|X_C=I_C\). Hence, the parameter is null if and only if the independence holds, as shown in Example 4. Thus, since the parameters \(\eta ^{\mathcal {M}}_{\mathcal {L}}(\varvec{i}_{q},\mathbf I _c)\), \(\forall q\in \mathcal {Q}\) and \(\forall c \subseteq C\), are evaluated in the conditional distribution \(X_C=\mathbf I _C\) where the CSI holds, they are equal to zero.
When \(\varvec{i}'_C\) is equal to \( (\mathbf I _{C\backslash j},\mathbf I _{j}-1)\), that is, when every variable is at its last level except variable j, which assumes the level \(\mathbf I _j-1\), all the parameters \(\eta _{\mathcal {L}}^{\mathcal {M}}(\varvec{i}_{\mathcal {L}})\) with \(\mathcal {L}=q\cup c\) and \(c\subseteq C\backslash j\) are equal to zero, as before. Moreover, in the parameter \(\eta ^{\mathcal {M}}_{\mathcal {L}}(\varvec{i}_{qj})\), whichever aggregation criterion is chosen (baseline, local or continuation), the variable \(X_j\) takes value \(I_j-1\) or \(I_j\). Since the CSI (5) holds in each of these distributions, this parameter is also equal to zero, and vice versa.
In general, when the CSI in (5) holds for a generic \(\varvec{i}'_C\), the parameters \(\eta _{\mathcal {L}}^{\mathcal {M}}(\varvec{i}_{\mathcal {L}})\), with \(\mathcal {L}=q \cup c\) and any \(\varvec{i}_c\) greater than or equal to \(\varvec{i}'_c\), involve the following categories of each variable \(X_j\) in \(X_C\): \(i_j\) together with \(I_j\) (baseline approach), or \(i_j+1\) (local approach), or \(((i_j+1)+\cdots +I_j)\) (continuation approach). Since the CSI holds in all these cells, the parameters are equal to zero, and vice versa. \(\square \)
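The argument above relies on the fact that, whichever aggregation criterion is used, an interaction computed in a table where independence holds is zero. The sketch below checks this for the simplest higher-order parameter, the log odds ratio of a two-way table, under the three codings described in the proof: category \(i\) against \(I\) (baseline), against \(i+1\) (local), or against the pooled categories \(i+1,\ldots ,I\) (continuation). The function names, the example tables and the orientation of each contrast are assumptions made for the illustration.

```python
import numpy as np

def coding_matrices(n_levels, kind):
    """Indicator rows selecting the two sets of categories that are compared."""
    eye = np.eye(n_levels)
    num = eye[:-1]                                   # category i
    if kind == "baseline":
        den = np.tile(eye[-1], (n_levels - 1, 1))    # last category I
    elif kind == "local":
        den = eye[1:]                                # category i + 1
    elif kind == "continuation":
        den = np.triu(np.ones((n_levels, n_levels)), k=1)[:-1]  # i+1, ..., I
    else:
        raise ValueError(f"unknown coding: {kind}")
    return num, den

def log_odds_ratios(table, kind):
    """Generalised log odds ratios of a two-way table under the given coding."""
    num_r, den_r = coding_matrices(table.shape[0], kind)
    num_c, den_c = coding_matrices(table.shape[1], kind)

    def log_mass(rows, cols):
        # Log of the probability mass in each (row set, column set) rectangle.
        return np.log(rows @ table @ cols.T)

    return (log_mass(num_r, num_c) + log_mass(den_r, den_c)
            - log_mass(num_r, den_c) - log_mass(den_r, num_c))

# Hypothetical tables: one with independence, one without.
independent = np.outer([0.2, 0.3, 0.5], [0.1, 0.4, 0.3, 0.2])
dependent = np.array([[0.20, 0.05, 0.05],
                      [0.05, 0.20, 0.05],
                      [0.05, 0.05, 0.30]])

for kind in ("baseline", "local", "continuation"):
    print(kind,
          np.allclose(log_odds_ratios(independent, kind), 0.0),
          np.allclose(log_odds_ratios(dependent, kind), 0.0))
```

For every coding the first flag is `True` and the second is `False`: the interactions vanish in the independent table and not in the dependent one, regardless of how the categories are aggregated.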
Cite this article
Nicolussi, F., Cazzaro, M. Context-specific independencies in hierarchical multinomial marginal models. Stat Methods Appl 29, 767–786 (2020). https://doi.org/10.1007/s10260-019-00503-8
DOI: https://doi.org/10.1007/s10260-019-00503-8