Quasi-independence models with rational maximum likelihood estimator
Introduction
Huh (Huh, 2014) classified the varieties with rational maximum likelihood estimator using Kapranov's Horn uniformization (Kapranov, 1991). In spite of this classification, it can be difficult to tell a priori whether a given model has rational MLE. Duarte, Marigliano, and Sturmfels (Duarte et al., 2019) have since applied Huh's ideas to varieties that are closures of discrete statistical models. In the present paper, we study this problem for a family of discrete statistical models called quasi-independence models, also commonly known as independence models with structural zeros. Because quasi-independence models have a simple structure whose description is determined by a bipartite graph, they are a natural test case for applying Huh's theory. Our complete classification of quasi-independence models with rational MLE is the main result of the present paper (Theorem 1.3, Theorem 5.4).
Let X and Y be two discrete random variables with m and n states, respectively. Quasi-independence models describe the situation in which some combinations of states of X and Y cannot occur together, but X and Y are otherwise independent of one another. This condition is known as quasi-independence in the statistics literature (Bishop et al., 2007). Quasi-independence models are basic models that arise in data analysis with log-linear models. For example, quasi-independence models arise in the biomedical field as rater agreement models (Agresti, 1992; Rapallo, 2005) and in engineering to model system failures at nuclear plants (Colombo and Ihm, 1988). There is a great deal of literature regarding hypothesis testing under the assumption of quasi-independence; see, for example, (Bocci and Rapallo, 2019; Goodman, 1994; Smith and McDonald, 1995). Results about existence and uniqueness of the maximum likelihood estimate in quasi-independence models, as well as explicit computations in some cases, can be found in (Bishop et al., 2007, Chapter 5).
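In order to define quasi-independence models, let $S \subset [m] \times [n]$ be a set of indices, where $[m] = \{1, 2, \dots, m\}$. These correspond to a matrix with structural zeros whose observed entries are given by the indices in $S$. We often use $S$ to refer to both the set of indices and the matrix representation of this set, and we abbreviate the ordered pairs $(i,j)$ in $S$ by $ij$. For all $r$, we denote by $\Delta_{r-1}$ the open $(r-1)$-dimensional probability simplex in $\mathbb{R}^r$,
$$\Delta_{r-1} := \Big\{ x \in \mathbb{R}^r : x_i > 0 \text{ for all } i \text{ and } \sum_{i=1}^{r} x_i = 1 \Big\}.$$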
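Definition 1.1 Let $S \subset [m] \times [n]$. Index the coordinates of $\mathbb{R}^{m+n}$ by $(s_1, \dots, s_m, t_1, \dots, t_n) = (s, t)$. Let $\mathbb{R}^S$ denote the real vector space of dimension $\#S$ whose coordinates are indexed by $S$. Define the monomial map $\phi^S : \mathbb{R}^{m+n} \to \mathbb{R}^S$ by
$$\phi^S_{ij}(s, t) = s_i t_j.$$
The quasi-independence model associated to $S$ is the model,
$$\mathcal{M}_S := \phi^S(\mathbb{R}^{m+n}) \cap \Delta_{\#S - 1}.$$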
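As a minimal sketch of Definition 1.1, the following Python snippet evaluates a monomial parametrization of the form $p_{ij} \propto s_i t_j$ on the observed cells and rescales so that the result lies in the probability simplex. The function name `quasi_independence_point`, the 0-based indexing, and the particular choices of `S`, `s`, and `t` are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def quasi_independence_point(S, s, t):
    """Return p with p[i, j] = s[i] * t[j] / Z for (i, j) in S.

    S is a list of 0-based index pairs (an assumption of this sketch),
    s and t are positive parameters, and Z rescales the point so that
    its coordinates sum to one.
    """
    weights = {(i, j): Fraction(s[i]) * Fraction(t[j]) for (i, j) in S}
    total = sum(weights.values())
    return {ij: w / total for ij, w in weights.items()}


# Hypothetical 3x3 table with structural zeros in cells (0, 2) and (2, 0).
S = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1), (2, 2)]
p = quasi_independence_point(S, s=[1, 2, 3], t=[1, 1, 2])

for ij in sorted(p):
    print(ij, p[ij])
```

Dividing by the total only rescales the parameters $s$, so the normalized point is still in the image of the monomial map. Exact rational arithmetic is used here so that relations such as $p_{00} p_{11} = p_{01} p_{10}$ on fully observed 2 × 2 blocks can be checked without rounding error.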
We note that the Zariski closure of $\mathcal{M}_S$ is a toric variety since it is parametrized by monomials. To any quasi-independence model, we can associate a bipartite graph in the following way.
Definition 1.2 The bipartite graph associated to $S$, denoted $G_S$, is the bipartite graph with independent sets $[m]$ and $[n]$ with an edge between $i$ and $j$ if and only if $(i,j) \in S$. The graph $G_S$ is chordal bipartite if every cycle of length greater than or equal to 6 has a chord. The graph $G_S$ is doubly chordal bipartite if every cycle of length greater than or equal to 6 has at least two chords. We say that $S$ is doubly chordal bipartite if $G_S$ is doubly chordal bipartite.
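The conditions in Definition 1.2 can be checked directly on small examples. The sketch below is a brute-force check, assuming 0-based index pairs: it enumerates every simple cycle of length at least 6 in the bipartite graph of `S` and counts its chords. The function names are hypothetical, the running time is exponential, and this is not the method used in the paper. The test set `S_double_square` is a 6-cycle with exactly one chord, which is our reading of the double-square structure discussed in Section 4.

```python
def chords_per_long_cycle(S):
    """Chord counts of all simple cycles of length >= 6 in the graph of S.

    Row i becomes vertex ("r", i) and column j becomes vertex ("c", j).
    Each cycle is reported once per direction, which does not affect
    the checks below.
    """
    adj = {}
    for i, j in S:
        adj.setdefault(("r", i), set()).add(("c", j))
        adj.setdefault(("c", j), set()).add(("r", i))
    counts = []

    def extend(path, on_path):
        start, last = path[0], path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= 6:
                k = len(path)
                cycle_edges = {
                    frozenset((path[a], path[(a + 1) % k])) for a in range(k)
                }
                # A chord is an edge between two cycle vertices that is
                # not itself an edge of the cycle.
                counts.append(sum(
                    1
                    for a in range(k)
                    for b in range(a + 1, k)
                    if path[b] in adj[path[a]]
                    and frozenset((path[a], path[b])) not in cycle_edges
                ))
            elif nxt > start and nxt not in on_path:
                # Only visit vertices larger than the start, so each cycle
                # is rooted at its smallest vertex.
                extend(path + [nxt], on_path | {nxt})

    for v in sorted(adj):
        extend([v], {v})
    return counts


def is_chordal_bipartite(S):
    return all(c >= 1 for c in chords_per_long_cycle(S))


def is_doubly_chordal_bipartite(S):
    return all(c >= 2 for c in chords_per_long_cycle(S))


S_double_square = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1), (2, 2)]
print(is_chordal_bipartite(S_double_square))
print(is_doubly_chordal_bipartite(S_double_square))
```

A graph with no cycle of length 6 or more passes both checks vacuously, in line with the wording of the definition.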
Let $u \in \mathbb{N}^S$ be a vector of counts of independent, identically distributed (iid) data. The maximum likelihood estimate, or MLE, for $u$ in $\mathcal{M}_S$ is the distribution that maximizes the probability of observing the data $u$ over all distributions in the model. We describe the maximum likelihood estimation problem in more detail in Section 2. We say that $\mathcal{M}_S$ has rational MLE if for generic choices of $u$, the MLE for $u$ in $\mathcal{M}_S$ can be written as a rational function in the entries of $u$. We can now state the key result of this paper.
Theorem 1.3 Let $S \subset [m] \times [n]$ and let $\mathcal{M}_S$ be the associated quasi-independence model. Let $G_S$ be the bipartite graph associated to $S$. Then $\mathcal{M}_S$ has rational maximum likelihood estimate if and only if $G_S$ is doubly chordal bipartite.
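Whether or not the MLE is rational, it can be approximated numerically. The sketch below uses classical iterative proportional fitting, which is a swapped-in standard technique and not the explicit formula of this paper: starting from the uniform table on the observed cells, it alternately rescales rows and columns until the row and column sums match those of the normalized data. The function name `ipf_quasi_independence`, the 0-based indexing, the iteration count, and the example counts are all assumptions made for illustration, and the data are assumed to be strictly positive on every observed cell.

```python
def ipf_quasi_independence(u, iterations=500):
    """Approximate the MLE for counts u (a dict keyed by cells of S).

    Every rescaling multiplies a whole row or a whole column by a
    constant, so the iterate keeps the product form s_i * t_j on S.
    """
    total = sum(u.values())
    rows = sorted({i for i, _ in u})
    cols = sorted({j for _, j in u})
    # Row and column sums of the normalized data.
    row_target = {
        i: sum(c for (a, _), c in u.items() if a == i) / total for i in rows
    }
    col_target = {
        j: sum(c for (_, b), c in u.items() if b == j) / total for j in cols
    }
    p = {ij: 1.0 / len(u) for ij in u}
    for _ in range(iterations):
        row_sum = {i: sum(v for (a, _), v in p.items() if a == i) for i in rows}
        p = {(i, j): v * row_target[i] / row_sum[i] for (i, j), v in p.items()}
        col_sum = {j: sum(v for (_, b), v in p.items() if b == j) for j in cols}
        p = {(i, j): v * col_target[j] / col_sum[j] for (i, j), v in p.items()}
    return p


# Hypothetical counts on a 3x3 table with structural zeros at (0, 2), (2, 0).
u = {(0, 0): 4, (0, 1): 2, (1, 0): 3, (1, 1): 5, (1, 2): 1, (2, 1): 2, (2, 2): 6}
p_hat = ipf_quasi_independence(u)
for ij in sorted(p_hat):
    print(ij, round(p_hat[ij], 6))
```

When S is a full table, the procedure recovers the familiar independence estimate (row sum times column sum, divided by the squared sample size) after a single sweep. For tables with structural zeros it generally only converges in the limit.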
Theorem 5.4 is a strengthened version of Theorem 1.3 in which we give an explicit formula for the MLE when $G_S$ is doubly chordal bipartite. The outline of the rest of the paper is as follows. In Section 2, we introduce general log-linear models and their MLEs and discuss some key results on these topics. In Section 3, we discuss the notion of a facial submodel of a log-linear model and prove that facial submodels of models with rational MLE also have rational MLE. In Section 4, we apply the results of Section 3 to show that if $G_S$ is not doubly chordal bipartite, then $\mathcal{M}_S$ does not have rational MLE. The bulk of the paper is in Sections 5, 6 and 7, where we show that if $G_S$ is doubly chordal bipartite, then the MLE is rational and we give an explicit formula for it. Section 5 covers combinatorial features of doubly chordal bipartite graphs and gives the statement of the main result, Theorem 5.4. Sections 6 and 7 are concerned with the verification that the formula for the MLE is correct.
Log-linear models and their maximum likelihood estimates
In this section, we collect some results from the literature on log-linear models and maximum likelihood estimation in these models. These results will be important tools in the proof of Theorem 5.4.
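Let $A$ be an integer matrix with $r$ columns and entries $a_{ij}$. Denote by $\mathbf{1}$ the vector of all ones in $\mathbb{R}^r$. We assume throughout that $\mathbf{1} \in \operatorname{rowspan}(A)$.
Definition 2.1 The log-linear model associated to $A$ is the set of probability distributions,
$$\mathcal{M}_A := \{ p \in \Delta_{r-1} : \log p \in \operatorname{rowspan}(A) \}.$$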
Algebraic and combinatorial tools are well-suited for the study of log-linear
Facial submodels of log-linear models
In order to prove that a quasi-independence model with rational MLE must have a doubly chordal bipartite associated graph $G_S$, we first prove a result that applies to general log-linear models with rational MLE. Let $A$ be the matrix defining the monomial map for the log-linear model $\mathcal{M}_A$. Let $I_A$ denote the vanishing ideal of the Zariski closure of $\mathcal{M}_A$. We assume throughout that $\mathbf{1} \in \operatorname{rowspan}(A)$. Let $P_A = \operatorname{conv}(A)$, where $\operatorname{conv}(A)$ denotes the convex hull of the columns of $A$.
We assume throughout
Quasi-independence models with non-rational MLE
In this section, we show that when $S$ is not doubly chordal bipartite, the ML-degree of $\mathcal{M}_S$ is strictly greater than one. We can apply Theorem 3.2 to quasi-independence models whose associated bipartite graphs are not doubly chordal bipartite using cycles and the following “double square” structure.
Example 4.1 The minimal example of a chordal bipartite graph that is not doubly chordal bipartite is the double-square graph. The matrix of the double-square graph has the form or any permutation of
The clique formula for the MLE
In this section we state the main result of the paper, which gives the specific form of the rational maximum likelihood estimates for quasi-independence models when they exist. These are described in terms of the complete bipartite subgraphs of the associated graph $G_S$. A complete bipartite subgraph of $G_S$ corresponds to an entirely nonzero submatrix of $S$. This motivates our use of the word “clique” in the following definition.
Definition 5.1 A set of indices is a clique in S if
Intersections of cliques with a fixed column
In this section we prove some results that will set the stage for the proof of Theorem 5.4 that appears in Section 7. To prove that our formulas satisfy Birch's theorem, we need to understand what happens to sums of these formulas over certain sets of indices.
Let and let . Without loss of generality, we assume that , and that the last for all . Let We consider to be the index of a column in the matrix representation of S,
Checking the conditions of Birch's theorem
In the previous section, we wrote a formula for the sum of where i ranges over the rows of some maximal clique . Since the block induces its own maximal clique, Lemma 6.9 allows us to write the sum of the s for in the following concise way. This in turn verifies that the proposed maximum likelihood estimate has the same sufficient statistics as the normalized data , which is one of the conditions of Birch's theorem.
Corollary 7.1 Let S be DS-free. Then for any column ,
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
Jane Coons was partially supported by the US National Science Foundation (DGE 1746939). Seth Sullivant was partially supported by the US National Science Foundation (DMS 1615660).
References (18)
- et al. The maximum likelihood degree of toric varieties. J. Symb. Comput. (2019)
- Colombo and Ihm. A quasi-independence model to estimate failure rates. Reliab. Eng. Syst. Saf. (1988)
- et al. Which nonnegative matrices are slack matrices? Linear Algebra Appl. (2013)
- et al. Support sets in exponential families and oriented matroid theory. Int. J. Approx. Reason. (2011)
- Agresti. Modelling patterns of agreement and disagreement. Stat. Methods Med. Res. (1992)
- et al. Markov Bases in Algebraic Statistics, vol. 199. (2012)
- Bishop et al. Discrete Multivariate Analysis: Theory and Practice. (2007)
- Bocci and Rapallo. Exact tests to compare contingency tables under quasi-independence and quasi-symmetry. J. Algebraic Stat. (2019)
- et al. Algebraic algorithms for sampling from conditional distributions. Ann. Stat. (1998)