Compatibility, desirability, and the running intersection property

doi:10.1016/j.artint.2020.103274

Artificial Intelligence

Volume 283, June 2020, 103274

Abstract

Compatibility is the problem of checking whether some given probabilistic assessments have a common joint probabilistic model. When the assessments are unconditional, the problem is well established in the literature and finds a solution through the running intersection property (RIP). This is not the case for conditional assessments. In this paper, we study the compatibility problem in a very general setting: any possibility space, unrestricted domains, imprecise (and possibly degenerate) probabilities. We extend the unconditional case to our setting, thus generalising most of the previous results in the literature. The conditional case turns out to be fundamentally different from the unconditional one. For such a case, we prove that the problem can still be solved in general by RIP but in a more involved way: by constructing a junction tree and propagating information over it. Still, RIP does not allow us to optimally take advantage of sparsity: in fact, conditional compatibility can be simplified further by joining junction trees with coherence graphs.

Introduction

What is compatibility?

Suppose we are given a few marginal probability functions over some variables: e.g., $P_{1} (X_{1}, X_{2})$ , $P_{2} (X_{2}, X_{3})$ , $P_{3} (X_{3}, X_{4}, X_{5})$ . We wonder whether there is a joint probability $P (X_{1}, X_{2}, X_{3}, X_{4}, X_{5})$ from which we can reproduce $P_{1}, P_{2}, P_{3}$ by marginalisation.
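For finite possibility spaces, the requirement in this example can be written out explicitly. The lowercase symbols $x_{i}$, denoting values of the variables $X_{i}$, are notation introduced here for illustration only:

```latex
\begin{align*}
P_{1}(x_{1}, x_{2}) &= \sum_{x_{3}, x_{4}, x_{5}} P(x_{1}, x_{2}, x_{3}, x_{4}, x_{5}), \\
P_{2}(x_{2}, x_{3}) &= \sum_{x_{1}, x_{4}, x_{5}} P(x_{1}, x_{2}, x_{3}, x_{4}, x_{5}), \\
P_{3}(x_{3}, x_{4}, x_{5}) &= \sum_{x_{1}, x_{2}} P(x_{1}, x_{2}, x_{3}, x_{4}, x_{5}).
\end{align*}
```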

This is an example of the so-called marginal problem: that of the compatibility of a number of marginal assessments with a global model. This problem has received long-standing interest in the literature, since the seminal works by Boole [14], Hoeffding [44], Fréchet [34], Kellerer [51] and Vorobev [88] (see also [20] and the references therein).

The problem is trivial when the marginal models are defined on disjoint sets of variables: in that case, we could for instance determine a compatible joint model by taking the stochastic product of the marginals. However, when those sets of variables are not disjoint, the problem is no longer trivial. More recent work has investigated the problem when additional constraints are placed on the joint [76], [80]; the problem has also appeared in other, apparently distant, contexts, such as quantum mechanics [35] or coalitional game theory [88]. It also has a very nice application in polynomial optimisation, where it can dramatically reduce the computational complexity of solution algorithms by exploiting sparsity in the problem representation [56].
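The disjoint case mentioned above can be illustrated with a minimal sketch for finite, precise probabilities. The table representation (dictionaries from value tuples to probabilities), the function names and the numbers are all assumptions made here for illustration; they do not come from the paper.

```python
from itertools import product

def product_joint(p_a, p_b):
    """Joint over the concatenated scopes, built as the product of two marginals.

    Each table maps a tuple of values to a probability; the two scopes are
    assumed to share no variable.
    """
    return {a + b: pa * pb for (a, pa), (b, pb) in product(p_a.items(), p_b.items())}

def marginal(table, positions):
    """Sum a table down to the coordinates listed in `positions`."""
    out = {}
    for assignment, p in table.items():
        key = tuple(assignment[i] for i in positions)
        out[key] = out.get(key, 0.0) + p
    return out

# Hypothetical marginals: p_a on (X1, X2) and p_b on (X3,), all binary.
p_a = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
p_b = {(0,): 0.25, (1,): 0.75}

joint = product_joint(p_a, p_b)   # table on (X1, X2, X3)
back_a = marginal(joint, [0, 1])  # marginal of the joint on (X1, X2)
back_b = marginal(joint, [2])     # marginal of the joint on (X3,)
```

Up to floating-point rounding, `back_a` and `back_b` reproduce `p_a` and `p_b`, which is what makes the product a compatible joint in this case.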

Obviously, a necessary condition for the compatibility of a number of marginal assessments is their pairwise compatibility, that is, the equality of the marginals over common variables; in our example, this requires that

$$P_{1} (X_{2}) = P_{2} (X_{2}) \quad \text{and} \quad P_{2} (X_{3}) = P_{3} (X_{3}). \tag{1}$$
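For finite, precise probabilities, the pairwise check described above amounts to comparing marginals on the shared variables. The following is a minimal sketch under that assumption; the function names, the table layout and the numbers are hypothetical and chosen only for illustration.

```python
def marginal(table, scope, keep):
    """Marginalise `table` (value tuple -> probability) from `scope` onto `keep`."""
    idx = [scope.index(v) for v in keep]
    out = {}
    for assignment, p in table.items():
        key = tuple(assignment[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def pairwise_compatible(t1, s1, t2, s2, tol=1e-9):
    """True if both tables induce the same marginal on their common variables."""
    common = [v for v in s1 if v in s2]
    m1 = marginal(t1, s1, common)
    m2 = marginal(t2, s2, common)
    return all(abs(m1.get(k, 0.0) - m2.get(k, 0.0)) <= tol for k in set(m1) | set(m2))

# Hypothetical binary tables matching the scopes of the running example.
s1, s2, s3 = ("X1", "X2"), ("X2", "X3"), ("X3", "X4", "X5")
p1 = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
p2 = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
p3 = {(x3, x4, x5): (0.1 if x3 == 0 else 0.15)
      for x3 in (0, 1) for x4 in (0, 1) for x5 in (0, 1)}
# A variant of p2 whose marginal on X2 disagrees with that of p1.
p2_bad = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

ok_12 = pairwise_compatible(p1, s1, p2, s2)
ok_23 = pairwise_compatible(p2, s2, p3, s3)
ok_12_bad = pairwise_compatible(p1, s1, p2_bad, s2)
```

When two scopes share no variable, both marginals reduce to the total mass, so the check passes trivially.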

This is not enough, however. In fact, using the theory of hypergraphs, Beeri et al. [5] (see also [60]) established a necessary and sufficient condition for pairwise compatibility to imply global compatibility: the running intersection property (RIP). This requires the existence of a total order on the marginals such that if any two marginals have variables in common, then all the marginals between them in the order contain those variables too. In our example the natural order $P_{1}, P_{2}, P_{3}$ satisfies it. Therefore Eq. (1) being true ensures that a compatible $P$ exists. There could actually be more than one; the iterative proportional fitting procedure (IPFP) [29] yields a sequence of probabilities that converge to the compatible joint that maximises Kullback-Leibler information [19].
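The ordering condition can be checked mechanically on the variable sets alone. The sketch below implements the condition exactly as it is worded in the paragraph above, with a brute-force search over orders that is only sensible for a handful of sets. The function names are hypothetical. The formulation commonly used in the literature is more permissive (it only asks that each set's intersection with the union of its predecessors be contained in a single predecessor) and is not what this sketch tests.

```python
from itertools import permutations

def order_satisfies_rip(scopes):
    """Check the condition, as worded above, for one fixed order of scopes.

    For every pair i < k, the variables shared by scopes[i] and scopes[k]
    must be contained in every scope strictly between them.
    """
    n = len(scopes)
    for i in range(n):
        for k in range(i + 2, n):
            common = scopes[i] & scopes[k]
            if any(not common <= scopes[j] for j in range(i + 1, k)):
                return False
    return True

def find_rip_order(scopes):
    """Brute-force search for an order meeting the condition (small inputs only)."""
    for perm in permutations(scopes):
        if order_satisfies_rip(perm):
            return list(perm)
    return None

# Scopes of the running example, and a cyclic arrangement for contrast.
chain = [{"X1", "X2"}, {"X2", "X3"}, {"X3", "X4", "X5"}]
triangle = [{"X1", "X2"}, {"X2", "X3"}, {"X1", "X3"}]

chain_ok = order_satisfies_rip(chain)
triangle_order = find_rip_order(triangle)
```

For the three scopes of the running example the natural order passes, whereas no order works for the cyclic arrangement, since the first and last sets always share a variable missing from the middle one.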

The works above investigate the compatibility of probabilities; when the possibility spaces are infinite, they are usually assumed to be countably additive on a suitable σ-field. Another direction of generalisation takes into account the possible partial specification of probabilities: for instance say that $P_{1}, P_{2}, P_{3}$ in the example are only partly known; this corresponds to replacing each of them with a set of candidate probabilities. The marginal problem then becomes checking whether there is a set of joint probabilities P from which we can recover the marginal (candidate) sets by marginalisation.

Set-based probabilistic modelling goes under the umbrella term of imprecise probabilities [4]. They include models of possibility measures [32], belief functions [77] or coherent lower previsions [89], among others. The marginal problem has been investigated for some of these models by Studený [82], [83], Vejnarová [86] and Jirousek [49], using the IPFP; van der Gaag [36] has dealt with it by propagating inequality constraints over a tree.

The marginal problem has a generalisation to the conditional case that we shall just call the compatibility problem. In this case we have any number of conditional probabilities over a set of variables and the problem is again to verify whether they have a compatible joint.

Instances of the compatibility problem have shown up in Artificial Intelligence in the research concerned with probabilistic logic and probabilistic satisfiability [38], [41], [43], [46], [71]; in these cases the focus is on variables with finite support (or just events) and solution algorithms are often based on linear programming, yet probabilistic satisfiability is NP-hard [12]. Another approach to satisfiability, originating within de Finetti's school, is based on ‘full conditional measures’ [17], [31]; this model establishes links between conditional probabilities so as to avoid inconsistencies, and can equivalently be represented as ‘zero layers’ à la Krauss [54]. This makes it possible, in particular, to deal with structural constraints (also called structural zeroes) between conditional probabilities via sequences of linear programs. With similar aims and properties, Walley et al. [91] have addressed a generalised version of probabilistic satisfiability that mixes conditional and unconditional information, that allows the assessments to be imprecisely specified, and that is not affected by problems due to zero probabilities.

Note in fact that checking compatibility requires verifying Bayes' rule in addition to the simple use of marginalisation. But Bayes' rule is not applicable in the case of zero-probability events. Neglecting this issue can lead to overlooking incompatibilities that ‘hide’ under these zero probabilities. The problem can eventually yield wrong inferences and it is particularly subtle, as it is generally unknown in advance where those zero probabilities happen to be. Cozman and Ianni [18] have recently proposed an approach that builds on Walley et al.'s work and that, as such, correctly deals with these problems.
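To see why zero probabilities are delicate, consider the precise, finite case with a single conditional assessment $P(X_{2} | X_{1})$; the notation below is introduced here only for illustration. A joint $P$ is tied to the assessment through the product rule:

```latex
\[
P(x_{1}, x_{2}) = P(x_{2} \mid x_{1}) \, P(x_{1}).
\]
```

If $P(x_{1}) = 0$, then $P(x_{1}, x_{2}) = 0$ as well, so the equation holds whatever value $P(x_{2} | x_{1})$ takes. At such an $x_{1}$ the joint places no constraint on the conditional assessment, and a check based on this equation alone cannot detect an inconsistency there.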

In a different direction, eleven years ago we observed that the compatibility problem, as well as probabilistic satisfiability, can often be simplified by taking sparsity into account through a graphical representation called coherence graphs [64, Sections 8.2–8.3].

Compatibility is such a general problem that it has a life of its own in the statistical literature too. There we can find some early work by Strassen [81], Okner [72] and Kamakura and Wedel [50], and a great bulk of work by Arnold et al. [1], [2], [3], who also consider the case of imprecise information. Kuo and Wang [93] have shown that the problem of zero probability is an issue also in the statistical case; in the same year we also discussed the same question in the statistical literature [65]. In addition, we proved that there is an iterative procedure that converges to the compatible joint model; this is somewhat similar in spirit to the IPFP, but our procedure works for the more involved conditional case and, moreover, it yields the entire set of compatible probabilities in the case of imprecision. While most work on compatibility focuses on discrete variables, Wang and Ip [92] are a relevant reference for the continuous case. Kuo et al. [55] provide one of the most recent works on the subject, with many references therein.

So there has been much work on compatibility in the conditional case across different communities (that do not seem to have talked much to each other). However, and to our surprise, we could not find any work making the connection to RIP there, which is even more surprising considering the clear connection that exists with RIP in the unconditional case.

Our aim in this paper is to establish a clear connection between RIP and compatibility in the most general possible setting: any possibility space, unrestricted domains (no σ-additivity/measurability problems), imprecise probabilities, conditional and unconditional information, no limitations due to zero probabilities.

To achieve these goals, we base our analysis on the imprecise-probability formalism of coherent sets of desirable gambles [89], [94]. As we have recently shown [96], [99], such a formalism is an equivalent reformulation of Bayesian decision theory, once it is freed of the precision constraint, with the advantage that it naturally meets all the requirements listed above. We introduce sets of desirable gambles in Section 2.

In the same section, we define compatibility in the unconditional case for sets of desirable gambles and prove in Theorem 2 that RIP and pairwise compatibility imply compatibility. This result generalises most of the previous work on the marginal problem along the lines discussed at the beginning of this section. We try to clarify this point by first specialising our results to sets of probabilities, and then by commenting on the relation of these results with previous ones.

We move to compatibility for the conditional case in Section 3. First, we give a generalised definition of compatibility (Definition 18). The definition makes us realise that compatibility is nothing other than strong coherence in Williams-Walley's theory [98, Definition 25], thus enabling us to exploit established tools in such a theory to pursue our aims. This turns out to be particularly important since we verify that the conditional case cannot be reduced to the unconditional one: in the former, compatibility does not imply pairwise compatibility; pairwise compatibility needs to be replaced by Walley's notion of avoiding partial loss. We then go on to specialise some of these notions for sets of probabilities.

In Section 4 we give our main results. We start by recalling the notion of tree decomposition related to RIP: i.e., that our probabilistic assessments can be represented graphically so as to eventually organise the variables of our problem in a junction tree; in such a tree, nodes are clusters of variables (cliques) that satisfy RIP. We give two procedures, analogous to the standard ones of collect and distribute evidence, for the propagation of desirable gambles over the tree. Then we prove in Theorems 9 and 10 that:

∘ The first procedure terminates with a coherent set at the root of the tree if and only if our original assessments avoid partial loss. This is a first test of compatibility, because if that is not the case, then the original assessments are not compatible and we can stop.
∘ Otherwise, the second procedure yields the marginals of the joint compatible set of desirable gambles that extends our original assessments. Then the original assessments are compatible if and only if they coincide with such marginals.
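The control flow of this two-stage test can be sketched as follows. This is only a skeleton: `collect`, `distribute` and `is_coherent` are hypothetical callables standing in for the two propagation procedures and for a coherence test, and the toy stand-ins at the end are placeholders rather than the paper's algorithms.

```python
def check_compatibility(assessments, collect, distribute, is_coherent):
    """Two-stage test following the two steps listed above.

    `collect`, `distribute` and `is_coherent` are hypothetical placeholders.
    Returns (compatible, marginals); marginals is None if stage one fails.
    """
    root = collect(assessments)                # first procedure: propagate to the root
    if not is_coherent(root):
        return False, None                     # partial loss is incurred: stop early
    marginals = distribute(assessments, root)  # second procedure: recover the marginals
    return marginals == assessments, marginals

# Toy stand-ins, for illustration only: assessments are plain dictionaries.
toy = {"A": 1, "B": 2}
ok, marg = check_compatibility(
    toy,
    collect=lambda a: sum(a.values()),
    distribute=lambda a, root: dict(a),
    is_coherent=lambda root: root > 0,
)
```

Returning the marginals alongside the verdict reflects the text: when stage one succeeds, the second procedure yields a usable output even if the original assessments turn out not to coincide with it.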

In Appendix A.4 we also give an alternative route to the proof of Theorems 9 and 10, based on so-called valuation algebras [52], [78]. These are abstract representations of knowledge or information that encode primitive tools for distributed computation on a junction tree. Valuation algebras should provide more accessible proofs of distributed computation to those unfamiliar with desirability; moreover, this route has turned out to be an opportunity for us to discuss more widely the interplay of logic, desirability and algebras.

Irrespective of the proof method, let us remark that these results, being valid for desirable gambles, hold also for sets of probabilities and in particular for traditional, precise, probability (on any possibility space).

Let us recall that in the unconditional case, RIP is often regarded as the optimal way to exploit sparsity in a problem without loss of information. We show in Section 5 that in the conditional case this is no longer true: there are very common situations where we can immediately tell if compatibility holds without having to build a junction tree and perform a propagation. We systematise this observation by building on our past work on coherence graphs [64]. These simplify the verification of coherence by yielding a partition of the original set of assessments into so-called superblocks. Here, we extend past results on coherence graphs to desirable gambles and show in Theorem 12 that in order to check compatibility it is enough to check it separately on superblocks. In addition, we give a procedure to compute the compatible joint. The lesson here is that if we want to get the best out of the conditional case, we have to combine coherence graphs with junction trees.

We give our concluding views in Section 6. Appendix A contains additional remarks and observations. All the proofs of the paper have been gathered in Appendix B.

Section snippets

Sets of desirable gambles

The most general model we shall consider in this paper is that of coherent sets of desirable gambles. Let us introduce the main notions about this theory; we refer to [4, Chapter 1], [90] and [89, Chapter 3] for further details.

Definition 1 (Gambles)

Consider a possibility space $X$. A gamble on $X$ is a bounded real-valued function $f : X \to \mathbb{R}$.

Gambles are interpreted as uncertain rewards in a linear utility scale. We denote by $L (X)$ the set of all gambles on $X$, and by $L^{+} (X) := \{f \in L (X) : f \geq 0, f \neq 0\}$ the set of positive gambles. We

Compatibility of conditional models

We consider next a more general framework: that where our assessments are possibly of a conditional nature. Thus, given two disjoint subsets $O, I$ of our set of variables N, we assume that we have a belief model about the variables in O, given information about the variables in I. The situation considered in Section 2 corresponds to the particular case where I is empty: then, what we have is marginal information about the variables in O.

Exploiting the power of tree decomposition

In this section we consider the most general version of the compatibility problem, where we have n variables $X_{1}, \dots, X_{n}$ over which we assess r separately coherent conditional sets of desirable gambles $D_{O_{1}} | X_{I_{1}}, \dots, D_{O_{r}} | X_{I_{r}}$ .

In the following we shall sometimes focus only on the variables involved in a certain set $D_{O_{j}} | X_{I_{j}}$ ; we denote the qualitative form of their relation by the so-called ‘template’ $X_{O_{j}} | X_{I_{j}}$ .

As a running example we consider the following $r = 13$ templates over $n = 15$ variables: $X_{2} | X_{1}, X_{2} | X_{4}, X_{3} | X_{2},$

Joining coherence graphs and RIP

It is important to realise that RIP or, equivalently, tree decompositions, do not necessarily simplify the compatibility check as much as possible. Consider for instance a case where the involved assessments define only two templates: $X_{1} | X_{2}$ and $X_{2} | X_{3}$ (this actually happens in Example 5 in Appendix A); the form of these templates is enough to deduce that the associated numerical assessments, whatever they are (provided that they are separately coherent), are strongly coherent, that is, compatible. In

Conclusions

In this paper, we have initially generalised the classical result on the compatibility of a number of marginal probabilities into a global one to the case where our belief models are sets of desirable gambles. This includes as particular cases sets of probability measures and also most models of non-additive measures, such as belief functions or possibility measures. Our generalisation covers also the case of infinite possibility spaces and is not constrained by measurability issues. There are,

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We acknowledge the financial support of projects PGC2018-098623-B-I00 and GRUPIN/IDI/2018/000176 and of the Swiss NSF grant n. IZKSZ2_162188. We would also like to thank the anonymous Referees for a number of useful remarks.

References (100)

B.C. Arnold et al., Compatibility of partial or complete conditional probabilities specifications, J. Stat. Plan. Inference (2004)
A. Benavoli et al., Sum-of-squares for bounded rationality, Int. J. Approx. Reason. (2019)
A. Brozzi et al., Incoherence correction strategies in statistical matching, Int. J. Approx. Reason. (2012)
A. Capotorti et al., Correction of incoherent conditional probability assessments, Int. J. Approx. Reason. (2010)
F.G. Cozman et al., Probabilistic satisfiability and coherence checking through integer programming, Int. J. Approx. Reason. (2015)
D. Dubois et al., Bayesian conditioning in possibility theory, Fuzzy Sets Syst. (1997)
G. Georgakopoulos et al., Probabilistic satisfiability, J. Complex. (1988)
I. Gilboa et al., Updating ambiguous beliefs, J. Econ. Theory (1993)
R. Haenni, Ordered valuation algebras: a generic framework for approximating inference, Int. J. Approx. Reason. (2004)
P. Hansen et al., Probabilistic satisfiability with imprecise probabilities, Int. J. Approx. Reason. (2000)
P.P. Shenoy et al. (2013)
B.C. Arnold et al., Conditional Specification of Statistical Models (1999)
B.C. Arnold et al., Compatible conditional distributions, J. Am. Stat. Assoc. (1989)
C. Beeri et al., On the desirability of acyclic database schemes, J. ACM (1983)
A. Benavoli, Dual probabilistic programming
A. Benavoli, PyRational: a rational choice modelling framework in python
A. Benavoli et al., A polarity theory for sets of desirable gambles
W. Bergsma et al., Marginal models for categorical data, Ann. Stat. (2002)
U. Bertelé et al., Nonserial Dynamic Programming (1972)
V. Biazzo et al., Probabilistic logic under coherence: complexity and algorithms, Ann. Math. Artif. Intell. (2005)
J. Blair et al., An introduction to chordal graphs and clique trees
G. Boole, An Investigation on the Laws of Thought, on which are Founded the Mathematical Theories of Logic and Probabilities (1854)
G. Coletti et al., Probabilistic Logic in a Coherent Setting (2002)
I. Csiszár, I-divergence geometry of probability distributions and minimization problems, Ann. Probab. (1975)
C. Cuadras et al., Distributions with Given Marginals and Statistical Modelling (2002)
C. de Campos et al., The inferential complexity of Bayesian and credal networks


