Compatibility, desirability, and the running intersection property

https://doi.org/10.1016/j.artint.2020.103274

Abstract

Compatibility is the problem of checking whether some given probabilistic assessments have a common joint probabilistic model. When the assessments are unconditional, the problem is well established in the literature and finds a solution through the running intersection property (RIP). This is not the case for conditional assessments. In this paper, we study the compatibility problem in a very general setting: any possibility space, unrestricted domains, imprecise (and possibly degenerate) probabilities. We extend the unconditional case to our setting, thus generalising most of the previous results in the literature. The conditional case turns out to be fundamentally different from the unconditional one. For such a case, we prove that the problem can still be solved in general by RIP but in a more involved way: by constructing a junction tree and propagating information over it. Still, RIP does not allow us to take optimal advantage of sparsity: in fact, conditional compatibility can be simplified further by joining junction trees with coherence graphs.

Introduction

What is compatibility?

Suppose we are given a few marginal probability functions over some variables: e.g., P1(X1,X2), P2(X2,X3), P3(X3,X4,X5). We wonder whether there is a joint probability P(X1,X2,X3,X4,X5) from which we can reproduce P1,P2,P3 by marginalisation.

This is an example of the so-called marginal problem: that of the compatibility of a number of marginal assessments with a global model. This problem has attracted long-standing interest in the literature, since the seminal works by Boole [14], Hoeffding [44], Fréchet [34], Kellerer [51] and Vorobev [88] (see also [20] and the references therein).

The problem is trivial when the marginal models are defined on disjoint sets of variables: in that case, we could for instance determine a compatible joint model by considering the stochastic product of the marginals. However, when those sets of variables are not disjoint, the problem is no longer trivial. More recent work has investigated the case where some additional constraints are placed on the joint [76], [80]; the problem has also appeared in other, apparently distant, contexts, such as quantum mechanics [35] or coalitional game theory [88]. It also has a very nice application in problems of polynomial optimisation, where it can dramatically reduce the computational complexity of solution algorithms by exploiting sparsity in the problem representation [56].
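The disjoint case can be made concrete with a minimal sketch. It assumes finite possibility spaces and precise probabilities stored as dictionaries; the function names `product_joint` and `marginal`, and all the numbers, are illustrative assumptions rather than anything taken from the paper.

```python
from fractions import Fraction as F

def product_joint(p1, p2):
    """Joint over (x1, x2) obtained by multiplying two marginals on disjoint variables."""
    return {(a, b): pa * pb for a, pa in p1.items() for b, pb in p2.items()}

def marginal(joint, position):
    """Sum the joint over every coordinate except the one at `position`."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[position]] = out.get(outcome[position], 0) + p
    return out

# Hypothetical marginals on two disjoint variables.
p_x1 = {"a": F(1, 3), "b": F(2, 3)}
p_x2 = {0: F(1, 4), 1: F(3, 4)}

joint = product_joint(p_x1, p_x2)

# Marginalising the product gives back the assessments we started from.
print(marginal(joint, 0) == p_x1)  # True
print(marginal(joint, 1) == p_x2)  # True
```

The product is only one of possibly many compatible joints; it is simply the easiest one to write down when no variables are shared.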

Obviously, a necessary condition for the compatibility of a number of marginal assessments is their pairwise compatibility, that is, the equality of the marginals over common variables; in our example, this requires that

P1(X2)=P2(X2) and P2(X3)=P3(X3).    (1)
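Pairwise compatibility is easy to test mechanically in the finite, precise case. The sketch below assumes binary variables and tables stored as dictionaries from assignment tuples to probabilities; the helper names and the numerical tables are hypothetical and serve only to illustrate the check.

```python
from fractions import Fraction as F
from itertools import product

def marginal(scope, table, keep):
    """Marginalise `table` (assignment tuple -> probability) from `scope` onto `keep`."""
    idx = [scope.index(v) for v in keep]
    out = {}
    for assignment, p in table.items():
        key = tuple(assignment[i] for i in idx)
        out[key] = out.get(key, 0) + p
    return out

def pairwise_compatible(models):
    """True when every two models induce the same marginal on their shared variables."""
    for i, (s1, t1) in enumerate(models):
        for s2, t2 in models[i + 1:]:
            common = [v for v in s1 if v in s2]
            m1, m2 = marginal(s1, t1, common), marginal(s2, t2, common)
            if any(m1.get(k, 0) != m2.get(k, 0) for k in set(m1) | set(m2)):
                return False
    return True

# Hypothetical binary assessments with the scopes of the running example.
p1 = (("X1", "X2"), {xs: F(1, 4) for xs in product((0, 1), repeat=2)})
p2 = (("X2", "X3"), {(0, 0): F(1, 2), (1, 1): F(1, 2)})
p3 = (("X3", "X4", "X5"), {xs: F(1, 8) for xs in product((0, 1), repeat=3)})
# A variant of P2 whose marginal on X2 disagrees with that of P1.
p2_bad = (("X2", "X3"), {(0, 0): F(3, 4), (1, 1): F(1, 4)})

print(pairwise_compatible([p1, p2, p3]))      # True
print(pairwise_compatible([p1, p2_bad, p3]))  # False
```

Models with no variables in common are trivially compatible here, since both marginalise to the empty assignment with total mass one.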

This is not enough, however. In fact, using the theory of hypergraphs, Beeri et al. [5] (see also [60]) established a necessary and sufficient condition for pairwise compatibility to imply global compatibility: the running intersection property (RIP). This requires the existence of a total order on the marginals such that if any two marginals have variables in common, then all the marginals between them in the order contain those variables too. In our example the natural order P1,P2,P3 satisfies it. Therefore, if Eq. (1) holds, a compatible P is guaranteed to exist. There could actually be more than one; the iterative proportional fitting procedure (IPFP) [29] yields a sequence of probabilities that converge to the compatible joint that maximises Kullback-Leibler information [19].
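As a rough illustration, RIP can be tested on the scopes alone. The sketch below uses a common textbook formulation, which is an assumption on our part and may differ in wording from the one used in the paper: in an ordered list of scopes, the variables each scope shares with all its predecessors must be contained in a single predecessor. The brute-force search over orderings is only meant for very small examples, and the function names are hypothetical.

```python
from itertools import permutations

def satisfies_rip(ordered_scopes):
    """Check the running intersection property for a given order of variable sets."""
    seen = set()
    for i, scope in enumerate(ordered_scopes):
        # Variables shared with everything that came earlier in the order.
        separator = set(scope) & seen
        if i > 0 and not any(separator <= set(prev) for prev in ordered_scopes[:i]):
            return False
        seen |= set(scope)
    return True

def find_rip_order(scopes):
    """Return some ordering that satisfies RIP, or None if no ordering does."""
    for candidate in permutations(scopes):
        if satisfies_rip(candidate):
            return list(candidate)
    return None

# Scopes of the running example, and a three-cycle for contrast.
chain = [{"X1", "X2"}, {"X2", "X3"}, {"X3", "X4", "X5"}]
triangle = [{"A", "B"}, {"B", "C"}, {"A", "C"}]

print(satisfies_rip(chain))      # True
print(find_rip_order(triangle))  # None
```

The three-cycle is the classical situation in which pairwise compatible marginals may fail to have a common joint, which is consistent with no ordering of its scopes passing the test.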

The works above investigate the compatibility of probabilities; when the possibility spaces are infinite, they are usually assumed to be countably additive on a suitable σ-field. Another direction of generalisation takes into account the possible partial specification of probabilities: for instance, suppose that P1,P2,P3 in the example are only partly known; this corresponds to replacing each of them with a set of candidate probabilities. The marginal problem then becomes checking whether there is a set of joint probabilities P from which we can recover the marginal (candidate) sets by marginalisation.

Set-based probabilistic modelling goes under the umbrella term of imprecise probabilities [4]. These include models of possibility measures [32], belief functions [77] or coherent lower previsions [89], among others. The marginal problem has been investigated for some of these models by Studený [82], [83], Vejnarová [86] and Jirousek [49], using the IPFP; van der Gaag [36] has dealt with it by propagating inequality constraints over a tree.

The marginal problem has a generalisation to the conditional case that we shall just call the compatibility problem. In this case we have any number of conditional probabilities over a set of variables and the problem is again to verify whether they have a compatible joint.

Instances of the compatibility problem have shown up in Artificial Intelligence in the research concerned with probabilistic logic and probabilistic satisfiability [38], [41], [43], [46], [71]; in these cases the focus is on variables with finite support (or just events) and solution algorithms are often based on linear programming, although probabilistic satisfiability is NP-hard [12]. Another approach to satisfiability, originating within de Finetti's school, is based on ‘full conditional measures’ [17], [31]; this model establishes links between conditional probabilities so as to avoid inconsistencies, and can equivalently be represented as ‘zero layers’ à la Krauss [54]. This makes it possible, in particular, to deal with structural constraints (also called structural zeroes) between conditional probabilities via sequences of linear programs. With similar aims and properties, Walley et al. [91] have addressed a generalised version of probabilistic satisfiability that mixes conditional and unconditional information, allows the assessments to be imprecisely specified, and is not affected by problems due to zero probabilities.

Note in fact that compatibility needs Bayes' rule to be verified besides the simple use of marginalisation. But Bayes' rule is not applicable in the case of zero-probability events. Neglecting this issue can lead one to overlook incompatibilities that ‘hide’ under these zero probabilities. The problem can eventually yield wrong inferences, and it is particularly subtle because it is generally unknown in advance where those zero probabilities happen to be. Cozman and Ianni [18] have recently proposed an approach that builds on Walley et al.'s work and that, as such, correctly deals with these problems.
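A toy example may help to see how an inconsistency can hide under a zero probability. The sketch below is only an illustration under simplifying assumptions (two binary variables A and B, precise numbers, hypothetical function names): a check that relies solely on the product rule P(a, b) = P(a | b) P(b) accepts a conditional assessment that does not even sum to one, because the offending conditioning event has probability zero.

```python
from fractions import Fraction as F

def product_rule_holds(joint, cond):
    """Check P(a, b) == P(a | b) * P(b) for every pair (a, b)."""
    p_b = {}
    for (a, b), p in joint.items():
        p_b[b] = p_b.get(b, 0) + p
    return all(joint[(a, b)] == cond[(a, b)] * p_b[b] for (a, b) in joint)

def conditionals_normalised(cond):
    """Check that P(. | b) sums to one for every conditioning value b."""
    totals = {}
    for (a, b), p in cond.items():
        totals[b] = totals.get(b, 0) + p
    return all(t == 1 for t in totals.values())

# Hypothetical joint, keyed by (a, b), that puts zero mass on B = 1.
joint = {(0, 0): F(1, 2), (1, 0): F(1, 2), (0, 1): F(0), (1, 1): F(0)}
# Conditional assessment that is plainly inconsistent given B = 1 (it sums to 2).
cond = {(0, 0): F(1, 2), (1, 0): F(1, 2), (0, 1): F(1), (1, 1): F(1)}

print(product_rule_holds(joint, cond))  # True
print(conditionals_normalised(cond))    # False
```

The product rule is satisfied vacuously on B = 1, so the defect is only visible to a test that looks at the conditional assessment directly.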

In a different direction, eleven years ago we observed that the compatibility problem, as well as probabilistic satisfiability, can often be simplified by taking sparsity into account through a graphical representation called coherence graphs [64, Sections 8.2–8.3].

Compatibility is so general a problem that it has a life of its own in the statistical literature as well. There we can find some early work by Strassen [81], Okner [72] and Kamakura and Wedel [50], and a large body of work by Arnold et al. [1], [2], [3], who also consider the case of imprecise information. Kuo and Wang [93] have shown that the problem of zero probability is an issue also in the statistical case; in the same year we also discussed the same question in the statistical literature [65]. In addition, we proved that there is an iterative procedure that converges to the compatible joint model; this is somewhat similar in spirit to the IPFP, but our procedure works for the more involved conditional case and moreover it yields the entire set of compatible probabilities in the case of imprecision. While most work on compatibility focuses on discrete variables, Wang and Ip [92] are a relevant reference for the continuous case. Kuo et al. [55] provide one of the most recent works on the subject, with many references therein.

So there has been much work on compatibility in the conditional case across different communities (which do not seem to have talked much to each other). However, and to our surprise, we could not find any work making the connection to RIP there, which is even more surprising considering the clear connection that exists with RIP in the unconditional case.

Our aim in this paper is to establish a clear connection between RIP and compatibility in the most general possible setting: any possibility space, unrestricted domains (no σ-additivity/measurability problems), imprecise probabilities, conditional and unconditional information, no limitations due to zero probabilities.

To achieve these goals, we base our analysis on the imprecise-probability formalism of coherent sets of desirable gambles [89], [94]. As we have recently shown [96], [99], such a formalism is an equivalent reformulation of Bayesian decision theory, once it is freed of the precision constraint, with the advantage that it naturally meets all the requirements listed above. We introduce sets of desirable gambles in Section 2.

In the same section, we define compatibility in the unconditional case for sets of desirable gambles and prove in Theorem 2 that RIP and pairwise compatibility imply compatibility. This result generalises most of the previous work on the marginal problem along the lines discussed at the beginning of this section. We try to clarify this point by first specialising our results to sets of probabilities, and then by commenting on the relation of these results with previous ones.

We move to compatibility for the conditional case in Section 3. First, we give a generalised definition of compatibility (Definition 18). The definition makes us realise that compatibility is nothing other than strong coherence in Williams-Walley's theory [98, Definition 25], which enables us to exploit established tools of that theory to pursue our aims. This turns out to be particularly important since we verify that the conditional case cannot be reduced to the unconditional one: in the former, compatibility does not imply pairwise compatibility; pairwise compatibility needs to be replaced by Walley's notion of avoiding partial loss. We then go on to specialise some of these notions for sets of probabilities.

In Section 4 we give our main results. We start by recalling the notion of tree decomposition related to RIP: i.e., that our probabilistic assessments can be represented graphically so as to eventually organise the variables of our problem in a junction tree; in such a tree, nodes are clusters of variables (cliques) that satisfy RIP. We give two procedures, analogous to the standard ones of collect and distribute evidence, for the propagation of desirable gambles over the tree. Then we prove in Theorems 9 and 10 that:

  • The first procedure terminates with a coherent set at the root of the tree if and only if our original assessments avoid partial loss. This is a first test of compatibility, because if that is not the case, then the original assessments are not compatible and we can stop.

  • Otherwise, the second procedure yields the marginals of the joint compatible set of desirable gambles that extends our original assessments. Then the original assessments are compatible if and only if they coincide with such marginals.
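The two-pass structure of such procedures can be sketched independently of what is being propagated. The code below only computes a message schedule on a tree, in the style of the standard collect and distribute phases; it is not the paper's algorithm, the content of the messages (sets of desirable gambles in the paper) is left abstract, and the function name and the example tree are hypothetical.

```python
def message_schedule(tree, root):
    """Return (collect, distribute) lists of (sender, receiver) pairs for a tree.

    tree: adjacency dict of an undirected tree; root: the node chosen as root.
    """
    collect = []
    visited = {root}

    def visit(node):
        for neighbour in tree[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                visit(neighbour)
                # A child sends only after its own subtree has reported to it.
                collect.append((neighbour, node))

    visit(root)
    # Distribution retraces the same edges outwards, in reverse order.
    distribute = [(receiver, sender) for sender, receiver in reversed(collect)]
    return collect, distribute

# Hypothetical junction tree whose nodes are clusters of variables.
tree = {
    "X1X2": ["X2X3"],
    "X2X3": ["X1X2", "X3X4X5"],
    "X3X4X5": ["X2X3"],
}
collect, distribute = message_schedule(tree, root="X2X3")
print(collect)     # [('X1X2', 'X2X3'), ('X3X4X5', 'X2X3')]
print(distribute)  # [('X2X3', 'X3X4X5'), ('X2X3', 'X1X2')]
```

In this reading, the first test described above would be applied to whatever the root holds once the collect phase is over, and the marginals would be read off the nodes after the distribute phase.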

In Appendix A.4 we also give an alternative avenue to the proof of Theorems 9 and 10, based on so-called valuation algebras [52], [78]. These are abstract representations of knowledge or information that encode primitive tools for distributed computation on a junction tree. Valuation algebras should provide more accessible proofs of distributed computation to those unfamiliar with desirability; moreover, such an avenue has turned out to be an opportunity for us to discuss more widely the interplay of logic, desirability and algebras.

Irrespective of the proof method, let us remark that these results, being valid for desirable gambles, hold also for sets of probabilities and in particular for traditional, precise, probability (on any possibility space).

Let us recall that in the unconditional case, RIP is often regarded as the optimal way to exploit sparsity in a problem without loss of information. We show in Section 5 that in the conditional case this is no longer true: there are very common situations where we can immediately tell if compatibility holds without having to build a junction tree and perform a propagation. We systematise this observation by building on our past work on coherence graphs [64]. These simplify the verification of coherence by yielding a partition of the original set of assessments into so-called superblocks. Here, we extend past results on coherence graphs to desirable gambles and show in Theorem 12 that in order to check compatibility it is enough to check it separately on superblocks. In addition, we give a procedure to compute the compatible joint. The lesson here is that if we want to get the best out of the conditional case, we have to combine coherence graphs with junction trees.

We give our concluding views in Section 6. Appendix A contains additional remarks and observations. All the proofs of the paper have been gathered in Appendix B.

Section snippets

Sets of desirable gambles

The most general model we shall consider in this paper is that of coherent sets of desirable gambles. Let us introduce the main notions about this theory; we refer to [4, Chapter 1], [90] and [89, Chapter 3] for further details.

Definition 1

Gambles

Consider a possibility space X. A gamble on X is a bounded real-valued function f: X → ℝ.

Gambles are interpreted as uncertain rewards in a linear utility scale. We denote by L(X) the set of all gambles on X, and by L+(X) ≔ {f ∈ L(X) : f ≥ 0, f ≠ 0} the set of positive gambles. We
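For intuition, on a finite possibility space a gamble can be represented as a dictionary from outcomes to real rewards (boundedness is then automatic). The sketch below assumes that a positive gamble is one that is non-negative everywhere and not identically zero; both this reading and the function names are assumptions made for illustration.

```python
def is_gamble(f, space):
    """On a finite space, a gamble is any real-valued function defined on all outcomes."""
    return set(f) == set(space) and all(isinstance(v, (int, float)) for v in f.values())

def is_positive(f):
    """Non-negative everywhere and strictly positive on at least one outcome."""
    return all(v >= 0 for v in f.values()) and any(v > 0 for v in f.values())

# Hypothetical possibility space and gambles for a coin toss.
space = ["heads", "tails"]
win_on_heads = {"heads": 1.0, "tails": 0.0}
even_bet = {"heads": 1.0, "tails": -1.0}
zero = {"heads": 0.0, "tails": 0.0}

print(is_gamble(win_on_heads, space))  # True
print(is_positive(win_on_heads))       # True
print(is_positive(even_bet))           # False
print(is_positive(zero))               # False
```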

Compatibility of conditional models

We consider next a more general framework: that where our assessments are possibly of a conditional nature. Thus, given two disjoint subsets O,I of our set of variables N, we assume that we have a belief model about the variables in O, given information about the variables in I. The situation considered in Section 2 corresponds to the particular case where I is empty: then, what we have is marginal information about the variables in O.

Exploiting the power of tree decomposition

In this section we consider the most general version of the compatibility problem, where we have n variables X1, …, Xn over which we assess r separately coherent conditional sets of desirable gambles DO1|XI1, …, DOr|XIr.

In the following we shall sometimes focus only on the variables involved in a certain set DOj|XIj; we denote the qualitative form of their relation by the so-called ‘template’ XOj|XIj.

As a running example we consider the following r=13 templates over n=15 variables: X2|X1, X2|X4, X3|X2,

Joining coherence graphs and RIP

It is important to realise that RIP or, equivalently, tree decompositions, do not necessarily simplify the compatibility check as much as possible. Consider for instance a case where the involved assessments define only two templates: X1|X2 and X2|X3 (this actually happens in Example 5 in Appendix A); the form of these templates is enough to deduce that the associated numerical assessments, whatever they are (provided that they are separately coherent), are strongly coherent, that is, compatible. In

Conclusions

In this paper, we have initially generalised the classical result on the compatibility of a number of marginal probabilities into a global one to the case where our belief models are sets of desirable gambles. This includes as particular cases sets of probability measures and also most models of non-additive measures, such as belief functions or possibility measures. Our generalisation covers also the case of infinite possibility spaces and is not constrained by measurability issues. There are,

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We acknowledge the financial support of projects PGC2018-098623-B-I00 and GRUPIN/IDI/2018/000176 and of the Swiss NSF grant n. IZKSZ2_162188. We would also like to thank the anonymous Referees for a number of useful remarks.

References (100)

  • C. Howson. Can logic be combined with probability? Probably. J. Appl. Log. (2009)
  • K.-L. Kuo et al. Exactly and almost compatible joint distributions for high-dimensional discrete conditional distributions. J. Multivar. Anal. (2017)
  • F.M. Malvestuto. Existence of extensions and product extensions for discrete probability distributions. Discrete Math. (1988)
  • D. Mauá et al. Updating credal networks is approximable in polynomial time. Int. J. Approx. Reason. (2012)
  • E. Miranda et al. Marginal extension in the theory of coherent lower previsions. Int. J. Approx. Reason. (2007)
  • E. Miranda et al. Coherent updating of non-additive measures. Int. J. Approx. Reason. (2015)
  • E. Miranda et al. Coherence graphs. Artif. Intell. (2009)
  • E. Miranda et al. Conditional models: coherence and inference through sequences of joint mass functions. J. Stat. Plan. Inference (2010)
  • E. Miranda et al. Conglomerable coherence. Int. J. Approx. Reason. (2013)
  • E. Miranda et al. Conformity and independence with coherent lower previsions. Int. J. Approx. Reason. (2016)
  • E. Miranda et al. Conglomerable natural extension. Int. J. Approx. Reason. (2012)
  • N.J. Nilsson. Probabilistic logic. Artif. Intell. (1986)
  • E. Quaeghebeur. Characterizing coherence, correcting incoherence. Int. J. Approx. Reason. (2015)
  • E. Quaeghebeur et al. Accept & reject statement-based uncertainty models. Int. J. Approx. Reason. (2015)
  • P.P. Shenoy. A valuation-based language for expert systems. Int. J. Approx. Reason. (1989)
  • P.P. Shenoy et al. Axioms for probability and belief-function propagation.
  • M. Studený. Conditional independence and natural conditional functions. Int. J. Approx. Reason. (1995)
  • B. Vantaggi. Statistical matching of multiple sources: a look through coherence. Int. J. Approx. Reason. (2008)
  • P. Vicig et al. Notes on ‘Notes on conditional previsions’. Int. J. Approx. Reason. (2007)
  • P. Walley. Towards a unified theory of imprecise probability. Int. J. Approx. Reason. (2000)
  • P. Walley et al. Direct algorithms for checking consistency and making inferences from conditional probability assessments. J. Stat. Plan. Inference (2004)
  • Y.J. Wang et al. Compatibility of discrete conditional distributions with structural zeros. J. Multivar. Anal. (2010)
  • P.M. Williams. Notes on conditional previsions. Int. J. Approx. Reason. (2007)
  • M. Zaffalon et al. Probability and time. Artif. Intell. (2013)
  • B.C. Arnold et al. Conditional Specification of Statistical Models. (1999)
  • B.C. Arnold et al. Compatible conditional distributions. J. Am. Stat. Assoc. (1989)
  • C. Beeri et al. On the desirability of acyclic database schemes. J. ACM (1983)
  • A. Benavoli. Dual probabilistic programming.
  • A. Benavoli. PyRational: a rational choice modelling framework in python.
  • A. Benavoli et al. A polarity theory for sets of desirable gambles.
  • W. Bergsma et al. Marginal models for categorical data. Ann. Stat. (2002)
  • U. Bertelé et al. Nonserial Dynamic Programming. (1972)
  • V. Biazzo et al. Probabilistic logic under coherence: complexity and algorithms. Ann. Math. Artif. Intell. (2005)
  • J. Blair et al. An introduction to chordal graphs and clique trees.
  • G. Boole. An Investigation on the Laws of Thought, on which are Founded the Mathematical Theories of Logic and Probabilities. (1854)
  • G. Coletti et al. Probabilistic Logic in a Coherent Setting. (2002)
  • I. Csiszár. I-divergence geometry of probability distributions and minimization problems. Ann. Probab. (1975)
  • C. Cuadras et al. Distributions with Given Marginals and Statistical Modelling. (2002)
  • C. de Campos et al. The inferential complexity of Bayesian and credal networks.
