Compatibility, desirability, and the running intersection property
Introduction
What is compatibility?
Suppose we are given a few marginal probability functions over some variables: e.g., , , . We wonder whether there is a joint probability from which we can reproduce by marginalisation.
This is an example of the so-called marginal problem: that of the compatibility of a number of marginal assessments with a global model. This problem has long attracted interest in the literature, starting with the seminal works by Boole [14], Hoeffding [44], Fréchet [34], Kellerer [51] and Vorobev [88] (see also [20] and the references therein).
The problem is trivial when the marginal models are defined on disjoint sets of variables: in that case, we could for instance determine a compatible joint model by considering the stochastic product of the marginals. However, when those sets of variables are not disjoint, the problem is no longer trivial. More recent work [76], [80] has investigated the problem when additional constraints are placed on the joint; the problem has also appeared in other, apparently distant, contexts, such as quantum mechanics [35] and coalitional game theory [88]. It also has a notable application in polynomial optimisation, where it can dramatically reduce the computational complexity of solution algorithms by exploiting sparsity in the problem representation [56].
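The trivial case of disjoint variable sets can be illustrated with a minimal sketch for finite, precise probabilities. The tables `p_a` and `p_bc` and the helper names below are hypothetical and chosen only for illustration; the product of the two marginals is one compatible joint, not necessarily the only one.

```python
from itertools import product

# Hypothetical marginals on disjoint variable sets: p_a over A, p_bc over (B, C).
p_a = {("a0",): 0.3, ("a1",): 0.7}
p_bc = {("b0", "c0"): 0.1, ("b0", "c1"): 0.4, ("b1", "c0"): 0.5}


def product_joint(p, q):
    """Joint obtained by multiplying two marginals defined on disjoint variables."""
    return {x + y: px * qy for (x, px), (y, qy) in product(p.items(), q.items())}


def marginal(joint, positions):
    """Sum the joint over all coordinates not listed in `positions`."""
    out = {}
    for outcome, mass in joint.items():
        key = tuple(outcome[i] for i in positions)
        out[key] = out.get(key, 0.0) + mass
    return out


joint = product_joint(p_a, p_bc)
# Marginalising the product recovers both original tables (up to rounding).
recovered_a = marginal(joint, [0])
recovered_bc = marginal(joint, [1, 2])
```

Any joint with these two marginals would do equally well; the product is simply the one that makes the two groups of variables independent.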
Obviously, a necessary condition for the compatibility of a number of marginal assessments is their pairwise compatibility, that is, the equality of the marginals over common variables; in our example, this requires that
This is not enough, however. In fact, using the theory of hypergraphs, Beeri et al. [5] (see also [60]) established a necessary and sufficient condition for pairwise compatibility to imply global compatibility: the running intersection property (RIP). This requires the existence of a total order on the marginals such that if any two marginals have variables in common, then all the marginals between them in the order contain those variables too. In our example the natural order satisfies this requirement. Therefore, if Eq. (1) holds, a compatible P exists. There could actually be more than one; the iterative proportional fitting procedure (IPFP) [29] yields a sequence of probabilities that converges to the compatible joint that minimises Kullback-Leibler information relative to the starting distribution [19].
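Both conditions can be made concrete for small finite problems. The sketch below checks pairwise compatibility and searches by brute force for an order satisfying RIP as worded above. The function names and the binary example are assumptions made for illustration, and the exhaustive search over orders is only practical for a handful of marginals.

```python
from itertools import combinations, permutations


def marginalise(table, variables, keep):
    """Marginalise a table {outcome tuple: mass} over `variables` onto `keep`."""
    idx = [variables.index(v) for v in keep]
    out = {}
    for outcome, mass in table.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + mass
    return out


def pairwise_compatible(marginals, tol=1e-9):
    """Check that any two marginals agree on their common variables."""
    for (vs1, t1), (vs2, t2) in combinations(marginals, 2):
        common = [v for v in vs1 if v in vs2]
        m1 = marginalise(t1, vs1, common)
        m2 = marginalise(t2, vs2, common)
        keys = set(m1) | set(m2)
        if any(abs(m1.get(k, 0.0) - m2.get(k, 0.0)) > tol for k in keys):
            return False
    return True


def has_rip_order(domains):
    """Brute-force search for an order in which every marginal lying between
    two marginals contains all the variables those two have in common."""
    sets = [frozenset(d) for d in domains]
    for order in permutations(sets):
        if all(
            (order[i] & order[j]) <= order[k]
            for i in range(len(order))
            for j in range(i + 2, len(order))
            for k in range(i + 1, j)
        ):
            return True
    return False


# Hypothetical binary example: each pair of variables is perfectly anti-correlated.
anti = {(0, 1): 0.5, (1, 0): 0.5}
triangle = [(["A", "B"], anti), (["B", "C"], anti), (["A", "C"], anti)]
chain_domains = [["A", "B"], ["B", "C"], ["C", "D"]]

print(pairwise_compatible(triangle))              # True
print(has_rip_order([vs for vs, _ in triangle]))  # False
print(has_rip_order(chain_domains))               # True
```

The triangle shows why pairwise compatibility alone is not enough: the three tables agree on every shared variable, yet a joint would have to make A, B and C pairwise different with probability one, which is impossible for binary variables.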
The works above investigate the compatibility of probabilities; when the possibility spaces are infinite, the probabilities are usually assumed to be countably additive on a suitable σ-field. Another direction of generalisation takes into account the possible partial specification of probabilities: for instance, say that the marginals in the example are only partly known; this corresponds to replacing each of them with a set of candidate probabilities. The marginal problem then becomes checking whether there is a set of joint probabilities P from which we can recover the marginal (candidate) sets by marginalisation.
Set-based probabilistic modelling goes under the umbrella term of imprecise probabilities [4]. These include models such as possibility measures [32], belief functions [77] and coherent lower previsions [89], among others. The marginal problem has been investigated for some of these models by Studený [82], [83], Vejnarová [86] and Jirousek [49], using the IPFP; van der Gaag [36] has dealt with it by propagating inequality constraints over a tree.
The marginal problem has a generalisation to the conditional case that we shall just call the compatibility problem. In this case we have any number of conditional probabilities over a set of variables and the problem is again to verify whether they have a compatible joint.
Instances of the compatibility problem have shown up in Artificial Intelligence in the research concerned with probabilistic logic and probabilistic satisfiability [38], [41], [43], [46], [71]; in these cases the focus is on variables with finite support (or just events) and solution algorithms are often based on linear programming, although probabilistic satisfiability is NP-hard [12]. Another approach to satisfiability, which originated within de Finetti's school, is based on ‘full conditional measures’ [17], [31]; this model establishes links between conditional probabilities so as to avoid inconsistencies, and can equivalently be represented as ‘zero layers’ à la Krauss [54]. In particular, this makes it possible to deal with structural constraints (also called structural zeroes) between conditional probabilities via sequences of linear programs. With similar aims and properties, Walley et al. [91] have addressed a generalised version of probabilistic satisfiability that mixes conditional and unconditional information, allows the assessments to be imprecisely specified, and is not affected by problems due to zero probabilities.
Note in fact that compatibility requires Bayes' rule to be verified, besides the simple use of marginalisation. But Bayes' rule is not applicable in the case of zero-probability events. Neglecting this issue can lead one to overlook incompatibilities that ‘hide’ under these zero probabilities. The problem can eventually yield wrong inferences, and it is particularly subtle because it is generally unknown in advance where those zero probabilities are. Cozman and Ianni [18] have recently proposed an approach that builds on Walley et al.'s work and that, as such, deals correctly with these problems.
In a different direction, eleven years ago we observed that the compatibility problem, as well as probabilistic satisfiability, can often be simplified by taking sparsity into account through a graphical representation called coherence graphs [64, Sections 8.2–8.3].
Compatibility is such a general problem that it has a life of its own in the statistical literature too. There we can find some early work by Strassen [81], Okner [72] and Kamakura and Wedel [50], and a large body of work by Arnold et al. [1], [2], [3], who also consider the case of imprecise information. Kuo and Wang [93] have shown that the problem of zero probability is an issue also in the statistical case; in the same year we discussed the same question in the statistical literature [65]. In addition, we proved that there is an iterative procedure that converges to the compatible joint model; this is somewhat similar in spirit to the IPFP, but our procedure works for the more involved conditional case and, moreover, yields the entire set of compatible probabilities in the case of imprecision. While most work on compatibility focuses on discrete variables, Wang and Ip [92] are a relevant reference for the continuous case. Kuo et al. [55] provide one of the most recent works on the subject, with many references therein.
So there has been much work on compatibility in the conditional case across different communities (which do not seem to have talked much to each other). However, to our surprise, we could not find any work making the connection to RIP there, which is all the more surprising considering the clear connection that exists with RIP in the unconditional case.
Our aim in this paper is to establish a clear connection between RIP and compatibility in the most general possible setting: any possibility space, unrestricted domains (no σ-additivity/measurability problems), imprecise probabilities, conditional and unconditional information, no limitations due to zero probabilities.
To achieve these goals, we base our analysis on the imprecise-probability formalism of coherent sets of desirable gambles [89], [94]. As we have recently shown [96], [99], such a formalism is an equivalent reformulation of Bayesian decision theory, once it is freed of the precision constraint, with the advantage that it naturally meets all the requirements listed above. We introduce sets of desirable gambles in Section 2.
In the same section, we define compatibility in the unconditional case for sets of desirable gambles and prove in Theorem 2 that RIP and pairwise compatibility imply compatibility. This result generalises most of the previous work on the marginal problem along the lines discussed at the beginning of this section. We try to clarify this point by first specialising our results to sets of probabilities, and then by commenting on the relation of these results with previous ones.
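For precise probabilities on finite spaces, the classical construction behind results of this kind can be sketched for the simplest case of two marginals sharing one variable. This is only an illustration under stated assumptions, not the desirable-gambles argument of Theorem 2; the tables `p12` and `p23` and the function names are hypothetical.

```python
def chain_joint(p12, p23):
    """Joint over (x1, x2, x3) built as p12(x1, x2) * p23(x2, x3) / p2(x2).

    Assumes p12 and p23 agree on the marginal of the shared variable x2.
    """
    p2 = {}
    for (_, x2), mass in p12.items():
        p2[x2] = p2.get(x2, 0.0) + mass
    joint = {}
    for (x1, x2), m12 in p12.items():
        for (y2, x3), m23 in p23.items():
            # Outcomes whose shared value has zero mass get no entry at all.
            if x2 == y2 and p2[x2] > 0:
                joint[(x1, x2, x3)] = m12 * m23 / p2[x2]
    return joint


def marginal(joint, positions):
    """Sum the joint over all coordinates not listed in `positions`."""
    out = {}
    for outcome, mass in joint.items():
        key = tuple(outcome[i] for i in positions)
        out[key] = out.get(key, 0.0) + mass
    return out


# Hypothetical pairwise compatible tables: both give x2 the marginal (0.3, 0.7).
p12 = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
p23 = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.5, (1, 1): 0.2}

joint = chain_joint(p12, p23)
back12 = marginal(joint, [0, 1])  # agrees with p12 up to rounding
back23 = marginal(joint, [1, 2])  # agrees with p23 up to rounding
```

The joint built this way makes the first and third variables conditionally independent given the shared one; other compatible joints may exist.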
We move to compatibility for the conditional case in Section 3. First, we give a generalised definition of compatibility (Definition 18). The definition makes us realise that compatibility is nothing but strong coherence in Williams-Walley's theory [98, Definition 25], thus enabling us to exploit established tools in that theory to pursue our aims. This turns out to be particularly important since we verify that the conditional case cannot be reduced to the unconditional one: in the former, compatibility does not imply pairwise compatibility; pairwise compatibility needs to be replaced by Walley's notion of avoiding partial loss. We then go on to specialise some of these notions for sets of probabilities.
In Section 4 we give our main results. We start by recalling the notion of tree decomposition related to RIP: that is, our probabilistic assessments can be represented graphically so as to eventually organise the variables of our problem in a junction tree; in such a tree, nodes are clusters of variables (cliques) that satisfy RIP. We give two procedures, analogous to the standard ones of collect and distribute evidence, for the propagation of desirable gambles over the tree. Then we prove in Theorems 9 and 10 that:
- The first procedure terminates with a coherent set at the root of the tree if and only if our original assessments avoid partial loss. This is a first test of compatibility, because if that is not the case, then the original assessments are not compatible and we can stop.
- Otherwise, the second procedure yields the marginals of the joint compatible set of desirable gambles that extends our original assessments. Then the original assessments are compatible if and only if they coincide with such marginals.
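The structural part of this scheme can be sketched as follows. The sketch uses the standard textbook form of the running intersection property (each clique's intersection with the union of its predecessors lies inside a single earlier clique) and only schedules the messages; the propagation of desirable gambles itself is not reproduced. All names are hypothetical.

```python
def junction_tree_parents(cliques):
    """Given a non-empty list of cliques in a candidate order, return a parent
    index for each clique (None for the root), or None if the order violates
    the running intersection property in its standard form."""
    sets = [frozenset(c) for c in cliques]
    parents = [None]
    seen = set(sets[0])
    for i in range(1, len(sets)):
        separator = sets[i] & seen
        # The parent is the first earlier clique containing the separator.
        parent = next((j for j in range(i) if separator <= sets[j]), None)
        if parent is None:
            return None
        parents.append(parent)
        seen |= sets[i]
    return parents


def collect_schedule(parents):
    """Messages (child, parent) ordered so that each clique sends to its parent
    only after all of its own children have sent."""
    return [(i, parents[i]) for i in range(len(parents) - 1, 0, -1)]


def distribute_schedule(parents):
    """Messages (parent, child) flowing back from the root to the leaves."""
    return [(parents[i], i) for i in range(1, len(parents))]


# Hypothetical cliques over variables A, B, C, D.
parents = junction_tree_parents([{"A", "B"}, {"B", "C"}, {"C", "D"}])
print(parents)                       # [None, 0, 1]
print(collect_schedule(parents))     # [(2, 1), (1, 0)]
print(distribute_schedule(parents))  # [(0, 1), (1, 2)]
```

Because every parent has a smaller index than its children, visiting the cliques in reverse order is enough to obtain a valid collect schedule, and visiting them in forward order gives a valid distribute schedule.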
Irrespective of the proof method, let us remark that these results, being valid for desirable gambles, also hold for sets of probabilities and, in particular, for traditional, precise probability (on any possibility space).
Let us recall that in the unconditional case, RIP is often regarded as the optimal way to exploit sparsity in a problem without loss of information. We show in Section 5 that in the conditional case this is no longer true: there are very common situations where we can immediately tell whether compatibility holds without having to build a junction tree and perform a propagation. We systematise this observation by building on our past work on coherence graphs [64]. These simplify the verification of coherence by yielding a partition of the original set of assessments into so-called superblocks. Here, we extend past results on coherence graphs to desirable gambles and show in Theorem 12 that, in order to check compatibility, it is enough to check it separately on each superblock. In addition, we give a procedure to compute the compatible joint. The lesson here is that if we want to get the best out of the conditional case, we have to combine coherence graphs with junction trees.
We give our concluding views in Section 6. Appendix A contains additional remarks and observations. All the proofs of the paper have been gathered in Appendix B.
Section snippets
Sets of desirable gambles
The most general model we shall consider in this paper is that of coherent sets of desirable gambles. Let us introduce the main notions about this theory; we refer to [4, Chapter 1], [90] and [89, Chapter 3] for further details.
Definition 1. Consider a possibility space Ω. A gamble on Ω is a bounded real-valued function f: Ω → ℝ. Gambles
Compatibility of conditional models
We next consider a more general framework, in which our assessments are possibly of a conditional nature. Thus, given two disjoint subsets O and I of our set of variables N, we assume that we have a belief model about the variables in O, given information about the variables in I. The situation considered in Section 2 corresponds to the particular case where I is empty: then, what we have is marginal information about the variables in O.
Exploiting the power of tree decomposition
In this section we consider the most general version of the compatibility problem, where we have n variables over which we assess r separately coherent conditional sets of desirable gambles .
In the following we shall sometimes focus only on the variables involved in a certain set ; we denote the qualitative form of their relation by the so-called ‘template’ .
As a running example we consider the following templates over variables:
Joining coherence graphs and RIP
It is important to realise that RIP or, equivalently, tree decompositions, do not necessarily simplify the compatibility check as much as possible. Consider for instance a case where the involved assessments define only two templates: and (this actually happens in Example 5 in Appendix A); the form of these templates is enough to deduce that the associated numerical assessments, whatever they are (provided that they are separately coherent), are strongly coherent, that is, compatible. In
Conclusions
In this paper, we have initially generalised the classical result on the compatibility of a number of marginal probabilities into a global one to the case where our belief models are sets of desirable gambles. This includes as particular cases sets of probability measures and also most models of non-additive measures, such as belief functions or possibility measures. Our generalisation covers also the case of infinite possibility spaces and is not constrained by measurability issues. There are,
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
We acknowledge the financial support of projects PGC2018-098623-B-I00 and GRUPIN/IDI/2018/000176 and of the Swiss NSF grant n. IZKSZ2_162188. We would also like to thank the anonymous Referees for a number of useful remarks.
References (100)
- et al., Compatibility of partial or complete conditional probabilities specifications, J. Stat. Plan. Inference (2004)
- et al., Sum-of-squares for bounded rationality, Int. J. Approx. Reason. (2019)
- et al., Incoherence correction strategies in statistical matching, Int. J. Approx. Reason. (2012)
- et al., Correction of incoherent conditional probability assessments, Int. J. Approx. Reason. (2010)
- et al., Probabilistic satisfiability and coherence checking through integer programming, Int. J. Approx. Reason. (2015)
- et al., Bayesian conditioning in possibility theory, Fuzzy Sets Syst. (1997)
- et al., Probabilistic satisfiability, J. Complex. (1988)
- et al., Updating ambiguous beliefs, J. Econ. Theory (1993)
- Ordered valuation algebras: a generic framework for approximating inference, Int. J. Approx. Reason. (2004)
- et al., Probabilistic satisfiability with imprecise probabilities, Int. J. Approx. Reason. (2000)
- Can logic be combined with probability? Probably, J. Appl. Log.
- Exactly and almost compatible joint distributions for high-dimensional discrete conditional distributions, J. Multivar. Anal.
- Existence of extensions and product extensions for discrete probability distributions, Discrete Math.
- Updating credal networks is approximable in polynomial time, Int. J. Approx. Reason.
- Marginal extension in the theory of coherent lower previsions, Int. J. Approx. Reason.
- Coherent updating of non-additive measures, Int. J. Approx. Reason.
- Coherence graphs, Artif. Intell.
- Conditional models: coherence and inference through sequences of joint mass functions, J. Stat. Plan. Inference
- Conglomerable coherence, Int. J. Approx. Reason.
- Conformity and independence with coherent lower previsions, Int. J. Approx. Reason.
- Conglomerable natural extension, Int. J. Approx. Reason.
- Probabilistic logic, Artif. Intell.
- Characterizing coherence, correcting incoherence, Int. J. Approx. Reason.
- Accept & reject statement-based uncertainty models, Int. J. Approx. Reason.
- A valuation-based language for expert systems, Int. J. Approx. Reason.
- Axioms for probability and belief-function propagation
- Conditional independence and natural conditional functions, Int. J. Approx. Reason.
- Statistical matching of multiple sources: a look through coherence, Int. J. Approx. Reason.
- Notes on ‘Notes on conditional previsions’, Int. J. Approx. Reason.
- Towards a unified theory of imprecise probability, Int. J. Approx. Reason.
- Direct algorithms for checking consistency and making inferences from conditional probability assessments, J. Stat. Plan. Inference
- Compatibility of discrete conditional distributions with structural zeros, J. Multivar. Anal.
- Notes on conditional previsions, Int. J. Approx. Reason.
- Probability and time, Artif. Intell.
- Conditional Specification of Statistical Models
- Compatible conditional distributions, J. Am. Stat. Assoc.
- On the desirability of acyclic database schemes, J. ACM
- Dual probabilistic programming
- PyRational: a rational choice modelling framework in python
- A polarity theory for sets of desirable gambles
- Marginal models for categorical data, Ann. Stat.
- Nonserial Dynamic Programming
- Probabilistic logic under coherence: complexity and algorithms, Ann. Math. Artif. Intell.
- An introduction to chordal graphs and clique trees
- An Investigation on the Laws of Thought, on which are Founded the Mathematical Theories of Logic and Probabilities
- Probabilistic Logic in a Coherent Setting
- I-divergence geometry of probability distributions and minimization problems, Ann. Probab.
- Distributions with Given Marginals and Statistical Modelling
- The inferential complexity of Bayesian and credal networks