Article

Multi-Class Cost-Constrained Random Coding for Correlated Sources over the Multiple-Access Channel

by
Arezou Rezazadeh
1,*,†,
Josep Font-Segura
2,
Alfonso Martinez
2 and
Albert Guillén i Fàbregas
2,3,4
1
Department of Electrical Engineering, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden
2
Department of Information and Communication Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain
3
Institució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain
4
Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK
*
Author to whom correspondence should be addressed.
This work was carried out when the first author was with the Department of Information and Communication Technologies at Universitat Pompeu Fabra.
Entropy 2021, 23(5), 569; https://doi.org/10.3390/e23050569
Submission received: 12 April 2021 / Revised: 30 April 2021 / Accepted: 1 May 2021 / Published: 3 May 2021
(This article belongs to the Special Issue Finite-Length Information Theory)

Abstract

This paper studies a generalized version of the multi-class cost-constrained random-coding ensemble with multiple auxiliary costs for the transmission of N correlated sources over an N-user multiple-access channel. For each user, the set of messages is partitioned into classes and codebooks are generated according to a distribution depending on the class index of the source message and under the constraint that the codewords satisfy a set of cost functions. Proper choices of the cost functions recover different coding schemes including message-dependent and message-independent versions of independent and identically distributed, independent conditionally distributed, constant-composition and conditional constant-composition ensembles. The transmissibility region of the scheme is related to the Cover-El Gamal-Salehi region. A related family of correlated-source Gallager source exponent functions is also studied. The achievable exponents are compared for correlated and independent sources, both numerically and analytically.

1. Introduction

In information theory, the fundamental problem of communication over a channel is studied from two complementary perspectives. First, one characterizes the transmissibility conditions, namely the circumstances under which the error probability asymptotically vanishes as the blocklength goes to infinity. Second, one describes by means of error exponents the speed at which this error probability vanishes; the larger the exponent, the faster the error probability tends to zero. Since finding an exact expression for the error probability is very difficult, a large body of work has investigated upper and lower bounds on the average error probability, or equivalently lower and upper bounds on the error exponent. In point-to-point, that is, single-user communication, using separate source-channel random coding [1,2], possibly with expurgation [1] (Eq. 5.7.10), yields lower bounds on the error exponent. In contrast, finding an upper bound to the error exponent satisfied by every code is more challenging. Generally, the hypothesis-testing method [3] is employed to derive upper bounds for the error exponent. Two well-known upper bounds to the error exponent are the sphere-packing exponent [4] and the minimum-distance exponent [5]. In fact, for rates greater than the critical rate [1] (Sec. 5.6), the random-coding and sphere-packing bounds coincide with each other, while the expurgated and minimum-distance bounds coincide at rate zero.
For point-to-point communication, it was shown in ref. [1] (Prob. 5.16) that joint source-channel coding leads in general to a larger exponent than separate source-channel coding. Additionally, using codewords with a composition dependent on the source message leads to a larger exponent than the case where codewords are drawn according to a fixed product distribution [6,7]. Moreover, a scheme where source messages are assigned to disjoint classes and encoded by codes that depend on the class index, attains the sphere-packing exponent in those cases where it is tight [8].
Many works have been devoted to studying the transmissibility and the error exponent for a two-user multiple-access channel (MAC) [9,10,11]. Separate source-channel coding for the MAC with independent sources was studied in refs. [9,12]. In ref. [13], a universal exponent for the MAC was derived by considering separate source-channel coding. In ref. [14], a transmissible region is derived for the MAC under mismatched decoding, where the decoding rule is fixed and possibly suboptimal. In ref. [15], it was shown that using structured coding can improve the error exponent of the MAC. The maximum-error-probability criterion and the impact of feedback for the MAC were studied in ref. [16]. By considering separate source-channel coding, lower and upper bounds for the error exponent of the MAC were respectively obtained in refs. [17,18]. For the MAC with independent sources, the idea of exploiting the dependency between messages and codewords was studied in ref. [19]. In ref. [20], an achievable exponent for the MAC with independent sources was given in the dual domain, that is, as a lower dimensional problem over parameters in terms of Gallager functions. For the MAC with correlated sources, it was shown in ref. [11] that considering statistical dependency between messages and codewords leads to a larger transmissible region. However, an example presented in ref. [21] shows that one can reliably transmit information through the MAC without satisfying the reliable-transmission conditions obtained in ref. [11]. In another line of work, superposition coding with a Gács-Körner-Witsenhausen (GKW) common part is used in ref. [22] to describe sufficient conditions for lossless recoverability.
In contrast to single-user communication, the problem of reliable transmission of two correlated sources over the MAC has not been fully solved yet, and only sufficient conditions for reliable transmission have been derived. In ref. [23], by applying coding techniques, a new set of sufficient conditions was proposed. Moreover, in ref. [24] new sufficient conditions for the three-user MAC with correlated sources were studied. In ref. [25], an achievable exponent was derived in the primal domain, that is, as a multi-dimensional optimization problem over distributions that is generally difficult to analyze.
In this paper, we examine how statistical dependency between the messages and codewords improves the exponent, as well as its impact on the transmissibility region. In view of refs. [1] (Ch. 7) and [26], we study a generalized message-dependent cost-constrained random-coding ensemble with multiple cost functions. By choosing the proper cost functions, the multi-class cost-constrained ensemble subsumes multiple ensembles previously considered in the literature and recovers the transmissibility region in ref. [11].
The paper is organized as follows. In Section 2, we present the problem of transmission of N correlated sources over an N-input discrete memoryless multiple-access channel and provide the key definitions of error probability, transmissibility, random-coding ensemble, and achievable exponent. In Section 3, we review the existing random-coding ensembles, define a novel generalized multi-class cost-constrained ensemble and characterize its achievable exponent. In the discussion in Section 4, we characterize the transmissibility region for our error exponent, relate the exponent to standard Gallager source and channel functions, and provide numerical results and formulas that allow us to rank the exponents attained by the various standard random-coding ensembles.

2. Problem Formulation

We study the simultaneous transmission of N correlated, discrete, memoryless sources over a channel; users are indexed by ν N = { 1 , 2 , , N } . The source messages u ν of user ν have n symbols drawn from the alphabet U ν . We denote by u σ the ordered vector of source messages for all users in a set σ 2 N , i.e., a subset of the set of all user indices, and similarly by U σ the Cartesian product of the source alphabets in the set σ . When σ = N , u N and U N denote the ordered vector of source messages for all users and the Cartesian product of all source alphabets, respectively. The sources are memoryless and are characterized by the joint probability distribution P N
P N ( u N ) = t = 1 n P N ( u N , t ) ,
and by the symbol joint probability distribution P N . The source message and symbol marginal distributions of user ν N are denoted by P ν and P ν respectively. Assuming that the sources are independent, the marginal distributions induce new joint (mismatched) probability distributions of sets of users σ 2 N . The induced independent-message and -symbol probabilities, denoted by P σ ind and P σ ind , are given by
P σ ind ( u σ ) = ν σ P ν ( u ν ) ,
and similarly for P σ ind .
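As a concrete illustration of this source model, the following minimal sketch (with arbitrary numbers, not taken from the paper) draws n symbols of two correlated memoryless sources from a joint symbol distribution as in Equation (1), and computes the marginals and the induced independent distribution of Equation (2).

```python
# Minimal sketch with illustrative numbers only (not from the paper).
import numpy as np

rng = np.random.default_rng(0)
P_joint = np.array([[0.40, 0.10],        # P_N(u1, u2): rows index u1, columns index u2
                    [0.05, 0.45]])

n = 8
flat = rng.choice(P_joint.size, size=n, p=P_joint.ravel())
u1, u2 = np.unravel_index(flat, P_joint.shape)     # source messages u_1, u_2 of length n

P1, P2 = P_joint.sum(axis=1), P_joint.sum(axis=0)  # marginals P_1, P_2
P_ind = np.outer(P1, P2)                           # induced P^ind(u1, u2) = P_1(u1) P_2(u2)
```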
Each user ν has an encoder that maps, without cooperation with the other users, the source message u ν onto a codeword x ν ( u ν ) also of length n and with symbols drawn from the alphabet X ν . We denote the codebook of user ν by C n ν . We denote by x σ X σ n the vector of codewords for all users in a set σ 2 N . All terminals simultaneously send these codewords over a discrete memoryless multiple-access channel with output alphabet Y . The symbolwise transition probability is denoted by W, and the channel is characterized by a conditional probability distribution
W ( y | x N ) = t = 1 n W ( y t | x N , t ) ,
where y is the received sequence of length n.
Based on y , a joint decoder estimates all transmitted source messages u N according to the maximum a posteriori criterion:
u ^ N = arg max u N U N n P N ( u N ) W y | x N ( u N ) ,
where U N n denotes the set of all possible source messages u N . An error occurs if the decoded messages u ^ N differ from the transmitted u N ; we refer to u ^ N u N as an error event. The error probability for a given set of codebooks, P e ( C n N ) , is thus given by
P e ( C n N ) Pr U ^ N U N .
In our analysis, it will prove convenient to split the error event into 2 N 1 distinct types of error events indexed by the non-empty subsets in the power set of the user indices 2 N \ , for example, τ { { 1 } , { 2 } , { 1 , 2 } } for N = 2 . More precisely, the error event of type τ corresponds to the conditions u ^ ν u ν for all ν τ and u ^ ν = u ν for all ν τ c , where τ c is the complement of τ in the set of user indices N .
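For small N, the error-event types can be enumerated explicitly; the short sketch below (illustrative only) lists the non-empty subsets of the user index set.

```python
# Enumerate the 2^N - 1 error-event types, i.e. the non-empty subsets tau of N.
from itertools import chain, combinations

def error_event_types(N):
    users = range(1, N + 1)
    return [set(tau) for tau in
            chain.from_iterable(combinations(users, k) for k in range(1, N + 1))]

print(error_event_types(2))   # [{1}, {2}, {1, 2}], as in the example above
```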
We are interested in the asymptotics of the error probability for sufficiently large n, namely whether the error probability vanishes and how fast this probability tends to zero as it vanishes. The sources U N are said to be transmissible over the channel if there exists a sequence of codebooks C n N such that lim n P e ( C n N ) = 0 . To characterize the speed at which the error probability vanishes, we use the notion of exponent. An exponent E is said to be achievable if there exists a sequence of codebooks such that
lim inf n 1 n log P e ( C n N ) E .
Source transmissibility and error-exponent achievability are typically studied by means of random coding. With random coding, one generates and studies sequences of ensembles of codebooks whose codewords are randomly drawn from a distribution Q ν ( x ν | u ν ) independently for each user; as indicated by the notation, this distribution may possibly depend on the source message u ν . The random-coding probability distribution for the channel input Q N ( x N | u N ) combined for all users is given by
Q N ( x N | u N ) = ν N Q ν ( x ν | u ν ) .
The use of random coding allows us to study how the error probability averaged over the ensemble, denoted by P ¯ e , vanishes as n grows. More importantly, it shows the existence of good codes in the ensemble such that their error probability vanishes. For the point-to-point and the multiple-access channels, a number of such random-coding ensembles have been studied in the literature, as reviewed in the following section, where we also present a multi-class cost-constrained ensemble subsuming all these ensembles and characterize the achievable exponent and transmissibility region of this ensemble.

Summary of Notation Used in the Paper

Sets are usually denoted by calligraphic upper case letters, e.g., X , and the n-Cartesian product set of X is denoted by X n . The cardinality of a set such as X is denoted by | X | . The indicator function representing an error event or that an element x belongs to a set X is denoted by 1 { x X } .
The number of users is denoted by N and user indices are typically represented by ν . The set of all users is denoted by N . The power set of all subsets of N is denoted by 2 N and the complement of a subset σ 2 N is denoted by σ c ; sets in the power set of users are denoted by Greek letters, for example, τ and σ . The number of source-message classes and of cost functions for user ν are respectively denoted by K ν and L ν ; the sets of such classes and functions are respectively denoted by K ν and L ν . Indices for source classes and cost functions are typically denoted by i ν and ν respectively.
Subscripts and superscripts in a quantity A may represent sets of user indices σ . Depending on the context, the quantity represents a list or a suitable product of variables for all elements in the set σ . For instance, for σ = { 1 , 2 } , A σ = ( A 1 , A 2 ) or A σ = ( A 1 , A 2 ) . If the quantity is a probability distribution, its value for σ represents the probability distribution of the sequence, for example, Q σ i σ ( x σ ) = ν σ Q ν i ν ( x ν ) . If the quantity is a set, its value for σ is the Cartesian product, for example, U σ = U 1 × U 2 for σ = { 1 , 2 } . If σ is the empty set, then A σ = A σ = 0 . If σ is a singleton, for example, σ = { 2 } , we simply write A 2 or A 2 . We denote the operation that merges and sorts two lists A σ 1 and A σ 2 with σ 1 σ 2 = into an ordered list containing all users in the union σ 1 σ 2 by [ A σ 1 , A σ 2 ] . For sets of user indices, we denote such a merging operation by [ σ 1 , σ 2 ] and we have [ σ , σ c ] = N .
Scalar random variables are denoted by capital letters, for example, X , and lowercase letters represent a particular realisation, for example, x X . Capital bold letters denote random vectors or sequences, for example, X , while lowercase bold letters, for example, x X n , denote deterministic vectors or sequences. Probability distributions for vectors or sequences, typically of length n, (resp. for symbols) are represented by text-style letters, for example, P , Q , W (resp. math-style letters, for example, P, Q, W). Sequence symbols usually carry a subscript indicating the user index; the t-th symbol in the sequence x ν is denoted by x ν , t .
The source-symbol distribution for user ν is denoted by P ν ( u ν ) . The joint distribution for users σ is denoted by P σ ( u σ ) ; the joint distribution, computed as if the sources were independent, is denoted by P σ ind ( u σ ) . The conditional source distribution for users σ 1 given another set σ 2 is denoted by P σ 1 | σ 2 ( u σ 1 | u σ 2 ) . Vector or sequence distributions are defined analogously with P replaced by P . Channel input distributions are denoted by Q ν ( x ν ) , Q ν i ν ( x ν ) , or Q ν , u ν i ν ( x ν ) , where i ν denotes the index of the class of the source message and Q ν , u ν ( x ν ) is a shorthand for the conditional distribution Q ν ( x ν | u ν ) . Cost functions are similarly denoted by a ν ( x ν ) , a ν i ν ( x ν ) , or a ν , u ν i ν ( x ν ) . Vector or sequence distributions are defined analogously with Q or a respectively replaced by Q or a . The conditional distribution for the channel output symbol (resp. sequence) is denoted by W ( y | x N ) (resp. W ( y | x N ) ).

3. Multi-Class Cost-Constrained Ensemble with Statistical Dependency

3.1. Review of Random-Coding Ensembles

The simplest and oldest random-coding ensemble is the independent, identically distributed (iid) ensemble [1,12,17,27], where the symbols x ν , t in all codewords x ν of a given user ν are generated independently according to the same input distribution Q ν ( x ν , t ) for all source messages u ν . Throughout the paper, we shall identify ensembles by hyphenated acronyms, where the first part indicates the possible dependence of the codeword on the source message and the second part describes the generation of symbols in a codeword. This first ensemble is thus the message-independent iid (mi-iid) ensemble, since codewords have the same distribution for all source messages and symbols are independent of each other and independent of the source message symbols too. For the mi-iid ensemble, the random-coding distribution is given by
Q ν mi - iid ( x ν | u ν ) = t = 1 n Q ν ( x ν , t ) .
In the message-independent, independent-conditionally-distributed (mi-icd) ensemble, the codewords x ν of user ν are generated identically for all source messages u ν , independently of the full message u ν , and with symbols according to a set of | U ν | conditional probability distributions Q ν , u ν ( x ν ) Q ν ( x ν | u ν ) . To this end, let I u ν ( u ν ) denote the set of positions where the symbol u ν U appears in the sequence u ν , namely
I u ν ( u ν ) = t { 1 , 2 , , n } : u ν , t = u ν .
Within each subsequence of u ν where u ν , t = u ν , represented by u ν I u ν ( u ν ) , symbols are drawn independently according to Q ν , u ν ( x ν ) . For this mi-icd ensemble, codewords are generated according to
Q ν mi - icd ( x ν | u ν ) = u ν U ν t I u ν ( u ν ) Q ν , u ν ( x ν , t )
= t = 1 n Q ν , u ν , t ( x ν , t ) .
Compared to the mi-iid ensemble, the mi-icd ensemble can lead to a larger transmissible region for the multiple-access channel with correlated sources [11,21]. An example of generation of three codewords x ν ( 1 ) , x ν ( 2 ) and x ν ( 3 ) in the mi-icd ensemble is shown in Figure 1, for a given source sequence u ν = ( α , β , β , γ , β , γ , γ , α , β , α ) with source alphabet U = { α , β , γ } . To generate each codeword x ν with alphabet X = { a , c , e } , three subcodewords x ν ( I α ( u ν ) ) , x ν ( I β ( u ν ) ) and x ν ( I γ ( u ν ) ) are pairwise-independently generated with i. i. d. distributions Q ν , α = ( 1/3 , 1/3 , 1/3 ) , Q ν , β = ( 1/2 , 1/4 , 1/4 ) and Q ν , γ = ( 1/3 , 2/3 , 0 ) , respectively. Symbols generated according to Q ν , α , Q ν , β and Q ν , γ are respectively represented as green circles, blue boxes and red diamonds in the figure. In the example, I α ( u ν ) = { 1 , 8 , 10 } , I β ( u ν ) = { 2 , 3 , 5 , 9 } and I γ ( u ν ) = { 4 , 6 , 7 } . For instance, the subcodeword x ν ( 1 ) ( I γ ( u ν ) ) has three symbols, each generated independently from Q ν , γ , leading to the red-diamond symbols x ν ( 1 ) ( I γ ( u ν ) ) = ( a , a , a ) .
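The mi-icd generation rule above can be mimicked in a few lines; the sketch below reproduces the setting of the Figure 1 example (alphabets and conditional distributions as stated above), drawing each codeword symbol from the conditional distribution selected by the corresponding source symbol.

```python
# Minimal sketch of mi-icd codeword generation for the Figure 1 example.
import numpy as np

rng = np.random.default_rng(1)
X = np.array(["a", "c", "e"])                       # channel input alphabet
Q = {"alpha": [1/3, 1/3, 1/3],                      # Q_{nu,alpha}
     "beta":  [1/2, 1/4, 1/4],                      # Q_{nu,beta}
     "gamma": [1/3, 2/3, 0.0]}                      # Q_{nu,gamma}
u = ["alpha", "beta", "beta", "gamma", "beta",
     "gamma", "gamma", "alpha", "beta", "alpha"]    # source sequence u_nu

x = [rng.choice(X, p=Q[s]) for s in u]              # one mi-icd codeword of length n = 10
print("".join(x))
```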
Next, we have the message-dependent iid (md-iid) ensemble [6,8,19,25,28], where codewords for each user are generated with i. i. d. symbols according to different distributions Q ν i ν ( x ν ) that depend on the source message only through the index i ν of the class the source message belongs to. More precisely, for each user ν with source marginal distribution P ν , the i ν -th class A ν i ν , where i ν K ν = { 1 , , K ν } , is defined as the set of all source messages whose probability P ν ( u ν ) is within a given interval, that is,
A ν i ν = u ν U ν n : γ ν , i ν n < P ν ( u ν ) γ ν , i ν 1 n ,
where the thresholds γ ν , j are K ν + 1 non-negative numbers, ordered from higher to lower, such that 0 = γ ν , K ν γ ν , K ν 1 γ ν , 1 < γ ν , 0 = 1 , and min u ν P ν ( u ν ) < γ ν , K ν 1 and γ ν , 1 max u ν P ν ( u ν ) . The md-iid random-coding distribution is given by
Q ν md - iid ( x ν | u ν ) = t = 1 n Q ν i ν ( u ν ) ( x ν , t ) .
The exponent of this md-iid ensemble can be larger than that of the mi-iid ensemble for joint source-channel coding [8,20,28].
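A minimal sketch of the class allocation of Equation (12) is given below; the source distribution and thresholds are hypothetical, and the message probability is computed from the memoryless assumption of Equation (1).

```python
# Class index i_nu(u_nu): find i such that gamma_i^n < P_nu(u_nu) <= gamma_{i-1}^n.
import numpy as np

def class_index(u, P_symbol, gammas):
    """u: source message (list of symbols); gammas: [gamma_0 = 1, ..., gamma_K = 0]."""
    n = len(u)
    prob = float(np.prod([P_symbol[s] for s in u]))   # P_nu(u_nu) for a memoryless source
    for i in range(1, len(gammas)):
        if gammas[i] ** n < prob <= gammas[i - 1] ** n:
            return i
    raise ValueError("thresholds do not cover this message probability")

P_symbol = {0: 0.01, 1: 0.99}                              # hypothetical binary source
print(class_index([1] * 20, P_symbol, [1.0, 0.85, 0.0]))   # K_nu = 2 classes -> class 1
```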
In the message-dependent, independent conditional symbol distributions (md-icd) ensemble, messages in the class i ν for user ν are encoded with codewords whose symbols are generated independently according to the conditional input distribution Q ν , u ν i ν ( x ν ) . The random-coding distribution of the md-icd ensemble is thus given by
Q ν md - icd ( x ν | u ν ) = u ν U ν t I u ν ( u ν ) Q ν , u ν i ν ( u ν ) ( x ν , t ) .
In the message-independent, constant-composition (mi-cc) ensemble [29,30], codewords x ν are drawn independently with an empirical distribution Q ^ ν ( x ν ) close to a given Q ν ( x ν ) , independently of the source message u ν . For each user, codewords x ν are randomly picked from T ν n ( Q ν ) , the set of all sequences whose empirical distribution has a variational distance to Q ν of at most 1/n , that is
T ν n ( Q ν ) = x ν X ν n : max x ν Q ^ ν ( x ν ) Q ν ( x ν ) < 1 n .
For this mi-cc ensemble, the random-coding distribution is given by
Q ν mi - cc ( x ν | u ν ) = 1 | T ν n ( Q ν ) | 1 x ν T ν n ( Q ν ) .
While the mi-cc and mi-iid ensembles lead to identical transmissibility conditions, the former may achieve strictly larger exponents for suboptimal input distributions already in single-user settings [29].
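Membership in the type set T ν n ( Q ν ) of Equation (15) amounts to comparing the empirical distribution of a sequence with Q ν symbol by symbol; a minimal sketch follows (alphabet and distribution are illustrative).

```python
# Check whether a sequence belongs to T^n(Q): empirical distribution within 1/n of Q.
import numpy as np
from collections import Counter

def in_type_set(x, Q, alphabet):
    n = len(x)
    counts = Counter(x)
    Q_hat = np.array([counts[a] / n for a in alphabet])   # empirical distribution of x
    return np.max(np.abs(Q_hat - np.asarray(Q))) < 1.0 / n

alphabet = ["a", "c", "e"]
Q = [1/3, 2/3, 0.0]
print(in_type_set(["a", "c", "c"], Q, alphabet))   # True: composition (1, 2, 0)
print(in_type_set(["a", "a", "c"], Q, alphabet))   # False
```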
The message-independent, conditional constant-composition (mi-ccc) ensemble combines features of the mi-icd and mi-cc ensembles. For each subsequence u ν I u ν ( u ν ) , the corresponding subcodewords x ν I u ν ( u ν ) are drawn independently from the set T ν | I u ν ( u ν ) | ( Q ν , u ν ) of subsequences with empirical distribution close to Q ν , u ν ( x ν ) , namely
T ν | I u ν ( u ν ) | ( Q ν , u ν ) = x ν X ν | I u ν ( u ν ) | : max x ν x ν Q ^ ν ( x ν ) Q ν , u ν ( x ν ) < 1 | I u ν ( u ν ) | .
The random-coding distribution of the mi-ccc ensemble is given by
Q ν mi - ccc ( x ν | u ν ) = u ν U ν 1 T ν | I u ν ( u ν ) | ( Q ν , u ν ) 1 x ν I u ν ( u ν ) T ν | I u ν ( u ν ) | ( Q ν , u ν ) .
An example of the generation of three codewords x ν ( 4 ) , x ν ( 5 ) and x ν ( 6 ) in the mi-ccc ensemble is also shown in Figure 1 as a comparison to the mi-icd ensemble, for the same source sequence u ν , source alphabet U = { α , β , γ } and input alphabet X = { a , c , e } . Now, to generate each codeword x ν , three subcodewords x ν ( I α ( u ν ) ) , x ν ( I β ( u ν ) ) and x ν ( I γ ( u ν ) ) are pairwise-independently, uniformly drawn in the type classes with empirical distributions Q ^ ν , α , Q ^ ν , β and Q ^ ν , γ that are closest to Q ν , α , Q ν , β and Q ν , γ , respectively. Since in the example | I α ( u ν ) | = 3 , | I β ( u ν ) | = 4 and | I γ ( u ν ) | = 3 , it follows that Q ^ ν , α = ( 1/3 , 1/3 , 1/3 ) , Q ^ ν , β = ( 1/2 , 1/4 , 1/4 ) and Q ^ ν , γ = ( 1/3 , 2/3 , 0 ) . Symbols generated according to Q ^ ν , α , Q ^ ν , β and Q ^ ν , γ are respectively represented as green doubled circles, blue doubled boxes and red doubled diamonds in the figure. For instance, all subcodewords x ν ( j ) ( I γ ( u ν ) ) , for j = 4 , 5 , 6 , have three symbols jointly generated from the constant-composition type Q ^ ν , γ , that is, exactly one a and two cs.
The message-dependent, constant-composition (md-cc) ensemble combines the features of having different distributions for different messages with constant-composition random coding. For messages in the class i ν { 1 , , K ν } for user ν , codewords are drawn from the set of sequences with empirical distribution close to Q ν i ν ( x ν ) . For this ensemble, the random-coding distribution is given by
Q ν md - cc ( x ν | u ν ) = 1 | T ν n ( Q ν i ν ( u ν ) ) | 1 x ν T ν n ( Q ν i ν ( u ν ) ) .
Finally, the message-dependent, conditional constant-composition (md-ccc) ensemble combines several of the ensembles listed above. For a given message u ν = ( u ν , 1 , , u ν , n ) in the i ν -th class, that is, u ν A ν i ν , the subsequence of u ν having the same symbol u ν , that is, u ν I u ν ( u ν ) , is encoded with pairwise-independent codewords generated from the set of codewords with empirical distribution very close to Q ν , u ν i ν ( x ν ) . The random-coding distribution of the md-ccc ensemble is thus given by
Q ν md - ccc ( x ν | u ν ) = u ν U ν 1 T ν | I u ν ( u ν ) | ( Q ν , u ν i ν ( u ν ) ) 1 x ν I u ν ( u ν ) T ν | I u ν ( u ν ) | ( Q ν , u ν i ν ( u ν ) ) .

3.2. Generalized Multi-Class Cost-Constrained Ensemble

Motivated by the ensembles listed in the previous section, and inspired by refs. [1] (Ch. 7) and [26] (Sec. II), we study a generalized message-dependent multi-class cost-constrained random-coding ensemble with multiple auxiliary costs.
For each user, we partition the set of source messages into K ν disjoint classes with thresholds on the message probabilities as in Equation (12). Let the source message be in the i ν -th class, that is, i ν ( u ν ) = i ν . Given the source message u ν and the source symbol u ν , we consider the subsequence u ν I u ν ( u ν ) , where I u ν ( u ν ) is defined in Equation (9), and we denote the corresponding source subsequence and subcodeword by u ν I u ν ( u ν ) and x ν I u ν ( u ν ) respectively. For each user ν , class index i ν , and source message symbol u ν , the subcodeword x ν I u ν ( u ν ) is drawn according to a symbolwise i. i. d. distribution Q ν , u ν i ν ( x ν ) conditioned on a set of cost constraints being satisfied. We consider L ν additive cost functions a ν , u ν i ν , ν ( x ν ) , ν L ν = { 1 , , L ν } . The total cost a ν , u ν i ν , ν x ν I u ν ( u ν ) of the subcodeword x ν I u ν ( u ν ) is given by the sum of the symbol costs a ν , u ν i ν , ν , namely
a ν , u ν i ν , ν x ν I u ν ( u ν ) = j I u ν ( u ν ) a ν , u ν i ν , ν ( x ν , j ) .
We assume that the average cost ϕ ν , u ν i ν , ν under the conditional distribution Q ν , u ν i ν is zero:
ϕ ν , u ν i ν , ν = x ν X ν Q ν , u ν i ν ( x ν ) a ν , u ν i ν , ν ( x ν ) = 0 .
Finally, fix some parameters δ ν > 0 and let D ν , u ν i ν be the set of codewords for which the average empirical cost of its constituent subcodewords 1 | I u ν ( u ν ) | a ν , u ν i ν , ν x ν I u ν ( u ν ) is close to the statistical mean ϕ ν , u ν i ν , ν = 0 for all cost functions and source symbols, i.e.,
D ν , u ν i ν x ν X ν n : 1 | I u ν ( u ν ) | a ν , u ν i ν , ν x ν I u ν ( u ν ) δ ν | I u ν ( u ν ) | , u ν U ν , ν L ν .
Codewords x ν are the combination of subcodewords x ν I u ν ( u ν ) with respective positions in I u ν ( u ν ) . For this multi-class cost-constrained ensemble, the random-coding distribution is thus given by
Q ν cost ( x ν | u ν ) = 1 Ξ ν u ν U ν t I u ν ( u ν ) Q ν , u ν i ν ( x ν , t ) 1 x ν D ν , u ν i ν
= 1 Ξ ν t = 1 n Q ν , u ν , t i ν ( x ν , t ) 1 x ν D ν , u ν i ν ,
where Ξ ν is a normalizing constant and the class index is determined by the source message, i ν = i ν ( u ν ) .
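One way to realise the distribution of Equations (24)-(25) in simulation is rejection sampling: draw the symbols i.i.d. from the conditional distributions and keep the codeword only if all cost constraints of Equation (23) are met. The sketch below follows this idea; the distributions, cost vectors and δ are illustrative placeholders, not values from the paper.

```python
# Minimal rejection-sampling sketch of the multi-class cost-constrained ensemble.
import numpy as np

rng = np.random.default_rng(2)

def draw_cost_constrained(u_seq, Q_cond, costs, delta, max_tries=10_000):
    # u_seq:  source sequence (symbols are keys of Q_cond)
    # Q_cond: Q_cond[u] = conditional input distribution Q^{i_nu}_{nu,u}
    # costs:  costs[u]  = list of cost vectors a^{i_nu,l}_{nu,u}, zero-mean under Q_cond[u]
    n = len(u_seq)
    for _ in range(max_tries):
        x = [rng.choice(len(Q_cond[u]), p=Q_cond[u]) for u in u_seq]
        accepted = True
        for u in Q_cond:
            positions = [t for t in range(n) if u_seq[t] == u]       # I_u(u_nu)
            for a in costs[u]:
                if abs(sum(a[x[t]] for t in positions)) > delta:     # empirical cost too far from 0
                    accepted = False
        if accepted:
            return x
    raise RuntimeError("no codeword satisfied the cost constraints")

Q_cond = {"0": [0.5, 0.5], "1": [0.25, 0.75]}
costs  = {"0": [[+1.0, -1.0]], "1": [[+3.0, -1.0]]}   # one cost function per symbol, zero mean
print(draw_cost_constrained(list("0110"), Q_cond, costs, delta=2.0))
```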
The multi-class cost-constrained ensemble subsumes all the ensembles described in Section 3.1. First of all, the iid and icd ensembles are recovered by setting L ν = 0 and choosing the appropriate number of classes K ν and random-coding distributions Q ν , Q ν , u ν , Q ν i ν and Q ν , u ν i ν . For all these cases, the set D ν , u ν i ν includes all generated codewords and the normalizing constant is Ξ ν = 1 .
To recover the constant-composition ensembles, for which constraints force the subcodewords to belong to some set T ν n ( Q ν ) or T ν | I u ν ( u ν ) | ( Q ν , u ν i ν ) , for each of the K ν classes for user ν we set δ ν < 1 , L ν = | X ν | and bijectively map the channel input symbols to cost function indices ν ( x ν ) so that
a ν , u ν i ν , ν ( x ν ) = 1 x ν = ν Q ν , u ν i ν ( ν ) .
In case the ensemble does not depend on either i ν or u ν , these symbols are dropped from Equation (26). For example, for the md-cc ensemble, we have a ν , ν i ν ( ν ) = 1 x ν = ν Q ν i ν ( x ν ) . In addition, the codeword set D ν , u ν i ν in Equation (23) is simplified as
D ν , u ν i ν = x ν X ν n : 1 n t = 1 n 1 x ν , t = x Q ν i ν ( x ) 1 n , x X ν ,
which is the same as T n ( Q ν i ν ) given a version of Equation (15) where Q ν may depend on i ν .
Again, choosing the right number of classes K ν and random-coding distributions Q ν , Q ν , u ν , Q ν i ν , and Q ν , u ν i ν recovers the various constant-composition ensembles. By construction, the set D ν , u ν i ν includes only the (sub)codewords with empirical distribution close to respectively Q ν , Q ν , u ν , Q ν i ν , and Q ν , u ν i ν , and the normalizing constant Ξ ν is the probability of the corresponding type set (or product thereof). As an example, for the md-ccc ensemble, choosing the cost functions in Equation (26) as follows
a ν , u ν i ν , ν x ν I u ν ( u ν ) = j I u ν ( u ν ) 1 x ν , j = ν Q ν , u ν i ν ( ν )
yields the following cost-constraint set, which is equivalent to Equation (17),
D ν , u ν i ν = x ν X ν n : | j I u ( u ν ) 1 x ν , j = x I u ( u ν ) Q ν , u ν i ν ( x ) | 1 I u ( u ν ) , u U ν , x X ν .

3.3. Exponent for the Generalized Multi-Class Cost-Constrained Ensemble

Theorem 1.
For the transmission of N correlated memoryless sources with joint distribution P N , where N = { 1 , 2 , , N } , over a memoryless multiple-access channel with input x N and transition probability W ( y | x N ) , consider a random-coding multi-class cost-constrained ensemble where source messages for each user ν N are allocated, depending on their probabilities, into K ν classes with thresholds { γ ν , 0 , γ ν , 1 , , γ ν , K ν } , as in Equation (12), and encoded onto codewords randomly generated with a distribution Q ν i ν ( x ν | u ν ) that depends on the source message according to Equation (24) through symbol distributions Q ν , u ν i ν that possibly depend on the source-message class index i ν and source symbol u ν , and through L ν cost functions a ν , u ν i ν , ν , ν { 1 , 2 , , L ν } . This random-coding ensemble attains the following exponent E cost
E cost = min τ 2 N \ , i N K N max 0 ρ 1 max λ N L , U 0 , r N u N N R E τ i N ρ , λ N L , U , r N u N N ,
where the Gallager function E τ i N ρ , λ N L , U , r N u N N is given by
E τ i N ρ , λ N L , U , r N u N N = log u τ c , x τ c , y u τ , x τ P N ( u N ) 1 1 + ρ Λ N i N ( u N ) Q τ , u τ i τ ( x τ ) R N , u N i N ( x N ) Q τ c , u τ c i τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ ,
and the functions Λ σ i σ ( u σ ) and R σ , u σ i σ ( x σ ) are respectively given by
Λ σ i σ ( u σ ) = ν σ P ν ( u ν ) γ ν , i ν λ ν L γ ν , i ν 1 P ν ( u ν ) λ ν U ,
R σ , u σ i σ ( x σ ) = ν σ ν L ν e r ν u ν ν a ν , u ν i ν , ν ( x ν ) ,
and implicitly depend on the set of optimization parameters λ N L , U , r N u N N .
Proof. 
This result is proved in Appendix A. □
The random-coding exponent in Equation (30) depends on the partitioning of the source-message set into classes, the channel input distributions, and the codeword cost-constraint functions. The best possible generalized cost-constrained exponent is obtained by optimizing over the multi-class partitioning, the cost constraints and the input distributions. We briefly discuss the optimization w. r. t. the thresholds of the source-message partitioning in Appendix B. In the next section, we provide some numerical examples where we compute the optimal exponents for either independent or correlated sources, and find that the optimal number of classes is two. In ref. [31] (Sec. 3.2.1.1), we provide some indications of why this optimality of only two classes is harder to establish in multi-user scenarios, compared to the single-user case. In the next section, we use Equations (31) and (30) to respectively obtain the source and channel Gallager functions of the various ensembles in Section 3.1 and rank their achievable exponents and transmissibility regions.

4. Discussion

4.1. Gallager Functions for Correlated Sources

In this section, we evaluate the generalized Gallager function E τ i N ρ , λ N L , U , r N u N N of the multi-class cost-constrained ensemble in Equation (31) for the various ensembles described in Section 3.1. In the cases where it is possible, we relate this Gallager function to the well-known [1] correlated-source and channel Gallager functions, respectively given by:
E s , σ ( ρ , P N ) = log u σ c u σ P N ( u N ) 1 1 + ρ 1 + ρ ,
E 0 ( ρ , Q , W ) = log y x Q ( x ) W ( y | x ) 1 1 + ρ 1 + ρ ,
where σ 2 N . Using that [ u σ , u σ c ] = u N , the standard Gallager source function is given by E s ( ρ , P N ) = E s , N ( ρ , P N ) , with N = { 1 , , N } the set of user indices.
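For reference, these two functions can be evaluated numerically as follows; this is a minimal sketch using Gallager's usual sign conventions (E 0 carries a minus sign in front of the logarithm), with a toy binary channel and the source distribution used later in Equation (60).

```python
# Gallager channel function E_0(rho, Q, W) and source function E_s(rho, P_N) for sigma = N.
import numpy as np

def E0(rho, Q, W):
    # Q: input distribution of size |X|; W: |X| x |Y| channel transition matrix
    inner = (Q[:, None] * W ** (1.0 / (1.0 + rho))).sum(axis=0)
    return -np.log((inner ** (1.0 + rho)).sum())

def Es(rho, P_joint):
    # P_joint: joint source symbol distribution over all users (any array shape)
    return (1.0 + rho) * np.log((P_joint ** (1.0 / (1.0 + rho))).sum())

Q = np.array([0.5, 0.5])
W = np.array([[0.9, 0.1],
              [0.1, 0.9]])
P = np.array([[0.0005, 0.0095],
              [0.0005, 0.9895]])
rho = 0.5
print(E0(rho, Q, W) - Es(rho, P))   # one term of an iid-type exponent, cf. Equation (37)
```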
For the simple mi-iid ensemble, with only one source class and no cost constraints, we have K ν = 1 and L ν = 0 for all ν N , and Λ σ i σ ( u σ ) = R σ , u σ i σ ( x σ ) = 1 for all σ 2 N . With no statistical dependency between messages and codewords, Q ν , u ν ( x ν ) = Q ν ( x ν ) . Setting i N = 1 and λ N L , U = r N u N N = 0 in Equation (31) gives the Gallager function E τ mi - iid ( ρ , P N , Q N , W ) ,
E τ mi - iid ( ρ , P N , Q N , W ) = log u τ c , x τ c , y u τ , x τ P N ( u N ) 1 1 + ρ Q τ ( x τ ) Q τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ .
Isolating the summations over u τ c and u τ , we can split the Gallager function as
E τ mi - iid ( ρ , P N , Q N , W ) = E 0 ( ρ , Q τ , Q τ c W ) E s , τ ( ρ , P N ) ,
where Q τ c W is a shorthand for Q τ c ( x τ c ) W ( y | x N ) , the transition probability of a channel with input x τ and output ( x τ c , y ) .
For the mi-icd ensemble, we have a similar set-up as for the mi-iid ensemble, where Q ν , u ν ( x ν ) now may depend on u ν . In this case, the Gallager function E τ mi - icd ( · ) is given by Equation (36) with Q σ ( x σ ) replaced by Q σ , u σ ( x σ ) , for σ { τ , τ c } :
E τ mi - icd ( ρ , P N , Q N , U , W ) = log u τ c , x τ c , y u τ , x τ P N ( u N ) 1 1 + ρ Q τ , u τ ( x τ ) Q τ c , u τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ .
As the summations over u τ c and u τ are not independent from the rest, the Gallager function does not split into source and channel functions unless the sources are independent, in which case one can find an mi-iid ensemble with a tilted unconditional input distribution and identical exponent. To this end, and for a given conditional input distribution Q ν , u ν ( x ν ) , let us define a tilted distribution Q ν ρ ( x ν ) as
Q ν ρ ( x ν ) = u ν P ν ( u ν ) 1 1 + ρ u ¯ ν P ν ( u ¯ ν ) 1 1 + ρ Q ν , u ν ( x ν ) .
From this equation, we have the following equality:
Q ν ρ ( x ν ) u ¯ ν P ν ( u ¯ ν ) 1 1 + ρ = u ν P ν ( u ν ) 1 1 + ρ Q ν , u ν ( x ν ) .
Substituting this identity together with P N ( u N ) = P τ ( u τ ) P τ c ( u τ c ) in Equation (38) and rearranging the result, we obtain the following Gallager function for independent sources:
E τ mi - icd ( ρ , P N , Q N , U , W ) = E 0 ( ρ , Q τ ρ , W Q τ c ) E s ( ρ , P τ )
= E τ mi - iid ( ρ , P N , [ Q τ ρ , Q τ c ] , W ) .
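A minimal numerical sketch of the tilting in Equation (39), with hypothetical P ν and Q ν , u ν , is:

```python
# Tilted input distribution Q^rho_nu: mixture of Q_{nu,u} weighted by P_nu(u)^{1/(1+rho)}.
import numpy as np

def tilted_input(rho, P_source, Q_cond):
    # P_source: (|U|,) source marginal; Q_cond: (|U|, |X|) matrix with rows Q_{nu,u}(.)
    w = P_source ** (1.0 / (1.0 + rho))
    w /= w.sum()
    return w @ Q_cond                      # Q^rho_nu(x) = sum_u w(u) Q_{nu,u}(x)

P = np.array([0.01, 0.99])
Qc = np.array([[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5]])
print(tilted_input(0.5, P, Qc))
```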
For the md-iid and md-icd ensembles, there are K ν source classes per user and no cost constraints, i.e., L ν = 0 and R σ , u σ i σ ( x σ ) = 1 for ν N and σ 2 N . Setting r N u N N = 0 in Equation (31) gives the Gallager function E τ , i N md - icd ( · ) for generic i N [31] (Eq. (4.36)),
E τ , i N md - icd ( ρ , P N , Q N , U i N , W ) = log u τ c , x τ c , y u τ , x τ P N ( u N ) 1 1 + ρ Λ N i N ( u N ) Q τ , u τ i τ ( x τ ) Q τ c , u τ c i τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ .
The Gallager function E τ , i N md - iid ( · ) for the md-iid ensemble is obtained by setting Q σ , u σ ( x σ ) = Q σ ( x σ ) , independent of u ν , for σ { τ , τ c } in Equation (43). As the summations over u τ c and u τ are now independent from the rest, the Gallager function splits as
E τ , i N md - iid ( ρ , P N , Q N i N , W ) = E 0 ( ρ , Q τ i τ , Q τ c i τ c W ) E s , τ i N ( ρ , P N ) ,
where we defined E s , τ i N ( ρ , P N ) , a modified Gallager E s -function, as
E s , τ i N ( ρ , P N ) = log u τ c u τ P N ( u N ) 1 1 + ρ Λ N i N ( u N ) 1 + ρ .
The maximization w.r.t. λ N L , U in Equation (30) only affects the second term in the r. h. s. of Equation (44), since the function Λ N i N only appears in the source part of the exponent. In Appendix C, we discuss the properties of Equation (45) after the maximization w.r.t. λ N L , U as a function of ρ , and establish some connections to the Gallager source function (34) and to the source functions for the single-user md-iid ensemble in ref. [8].
The Gallager functions for the constant-composition ensembles differ from the ones considered so far in the presence of L ν = | X ν | cost functions a ν , u ν i ν , ν ( x ν ) , given in Equation (26), for each input distribution Q ν , u ν i ν ( x ν ) . These cost functions appear in the Gallager functions through the factors R σ , u σ i σ ( x σ ) , for σ { τ , τ c } , that multiply each appearance of Q σ , u σ i σ ( x σ ) in the function, and through their associated optimization parameters r N u N N . The expressions of the Gallager functions for these constant-composition ensembles can be easily inferred from this observation, so we focus on the factor R σ , u σ i σ ( x σ ) itself.
For the mi-cc and md-cc ensembles, the cost functions a ν , u ν i ν , ν ( x ν ) , factor R σ , u σ i σ ( x σ ) , and associated optimization parameter r ν u ν ν are independent of u ν ; we thus write a ν i ν , ν ( x ν ) , R σ i σ ( x σ ) , and r ν ν . The expressions in Equations (26) and (33) for L ν = | X ν | give
R ν i ν ( x ν ) = e ν X ν r ν ν 1 { x ν = ν } Q ν i ν ( ν ) .
The exponent in Equation (46) can be evaluated as
ν X ν r ν ν 1 x ν = ν Q ν i ν ( ν ) = r ν x ν ν X ν r ν ν Q ν i ν ( ν )
= α τ , ν i ν ( x ν ) ,
where we have defined a function α τ , ν i ν ( x ν ) that depends on τ and i ν through the optimization parameters r ν ν . It can be easily verified that α τ , ν i ν has zero mean, in other words, x ν α τ , ν i ν ( x ν ) Q ν i ν ( x ν ) = 0 . At this point, the parameters r ν ν may be replaced by the equivalent real-valued functions α τ , ν i ν ( x ν ) . We obtain the mi-cc Gallager function E τ mi - cc ( · ) by setting i N = 1 and λ N L , U = 0 in Equation (31),
E τ mi - cc ( ρ , α τ , N , P N , Q N , W )
= log u τ c , x τ c , y u τ , x τ P N ( u N ) 1 1 + ρ Q τ ( x τ ) e α τ , N ( x N ) Q τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ
= log x τ c , y x τ Q τ ( x τ ) e α τ , N ( x N ) Q τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ E s , τ ( ρ , P N ) ,
where we split the Gallager function into channel and source terms in analogy to Equation (37).
In ref. [31] (Eq. (4.49)), the md-cc ensemble was studied for N = 2 users in both the primal and dual domains. The md-cc Gallager function E τ md - cc ( · ) for N users is obtained by combining the derivation of Equation (50) with that of Equation (44) to yield
E τ md - cc ( ρ , α τ , N i N , P N , Q N i N , W ) = log x τ c , y x τ Q τ ( x τ ) e α τ , N i N ( x N ) Q τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ log u τ c u τ P N ( u N ) 1 1 + ρ Λ N i N ( u N ) 1 + ρ .
As in previous cases, the exponent is obtained after maximization over α τ , N i N .
Concluding our list, the cost functions a ν , u ν i ν , ν ( x ν ) , factors R σ , u σ i σ ( x σ ) , and parameters r ν u ν ν for the mi-ccc and md-ccc ensembles do depend on u ν . In analogy to Equation (48), we define a zero-mean function β τ , ν , u ν i ν ( x ν ) as
β τ , ν , u ν i ν ( x ν ) = r ν u ν x ν ν X ν r ν u ν ν Q ν , u ν i ν ( ν ) ,
and similarly for β τ , ν , u ν ( x ν ) for the mi-ccc ensemble. The Gallager function for the mi-ccc ensemble E τ mi - ccc ( · ) is obtained by combining the derivations of Equation (50) and of Equation (38),
E τ mi - ccc ( ρ , β τ , N , u N , P N , Q N , U , W ) = log u τ c , x τ c , y u τ , x τ P N ( u N ) 1 1 + ρ Q τ , u τ ( x τ ) e β τ , N , u N ( x N ) Q τ c , u τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ .
Similarly, for the md-ccc ensemble, and in agreement with the 2-user case studied in ref. [31] (Eq. (4.45)), combining the derivations of Equations (50) and (43) yields
E τ , i N md - ccc ( ρ , β τ , N , u N i N , P N , Q N , U i N , W ) = log u τ c , x τ c , y u τ , x τ P N ( u N ) 1 1 + ρ Λ N i N ( u N ) Q τ , u τ i τ ( x τ ) e β τ , N , u N i N ( x N ) Q τ c , u τ c i τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ .

4.2. Transmissibility

We may obtain the transmissibility conditions from the achievable exponents derived in Section 4.1, following the random-coding method described in ref. [1] (Th. 5.6.4). The analysis extends the transmissibility condition for joint source-channel coding in ref. [1] (Prob. 5.16), to account for statistical dependency of the codeword on the source message in the multiuser set-up. As mentioned above, the source U N is transmissible over the channel W if there exists a sequence of codes with vanishing error probability, or equivalently, with strictly positive achievable error exponent E c o s t in Equation (30). As an example, we present the derivation for the mi-icd ensemble where the class and cost functions in Equations (32) and (33) are inactive, namely Λ σ i σ ( u σ ) = R σ , u σ i σ ( x σ ) = 1 for all σ 2 N , and leave the general case of K ν > 1 classes and cost-constrained codewords as an open problem.
For the mi-icd case, and similarly to Gallager’s E 0 -function [1] (Th. 5.6.3), the Gallager function E τ mi - icd ( · ) in Equation (38) is concave (∩) with respect to ρ and satisfies E τ mi - icd ( ρ = 0 , · ) = 0 . For every τ 2 N \ , let ρ ^ τ be the optimizer given by
ρ ^ τ = arg max 0 ρ 1 E τ mi - icd ( ρ , P N , Q N , U , W ) .
Therefore, the achievable exponent is strictly positive, namely E τ mi - icd ( ρ ^ τ , · ) > 0 , as far as the slope of the E τ mi - icd ( ρ , · ) function is strictly positive at ρ = 0 , that is
ρ E τ mi - icd ( ρ , P N , Q N , U , W ) ρ = 0 > 0 .
Taking the derivative with respect to ρ at both sides of Equation (38), after some algebraic manipulations, we find that (56) is equivalent to
u τ c P τ c ( u τ c ) x τ c , y u τ , x τ P τ | τ c ( u τ | u τ c ) Q τ , u τ ( x τ ) Q τ c , u τ c ( x τ c ) W ( y | x N ) × × log P τ | τ c ( u τ | u τ c ) Q τ c , u τ c ( x τ c ) W ( y | x N ) u ¯ τ , x ¯ τ P τ | τ c ( u ¯ τ | u τ c ) Q τ , u ¯ τ ( x ¯ τ ) Q τ c , u τ c ( x τ c ) W ( y | [ x ¯ τ , x τ c ] ) > 0 .
We next write the expression in the left-hand side of the inequality (57) in terms of entropy and mutual information. We denote by H ( P ) the entropy of a source with distribution P [32] (Eq. (2.1)) and by I ( Q , W ) the mutual information of a channel W with input distribution Q [32] (Eq. (2.28)). For σ 2 N , we define a channel input distribution Q τ | σ , conditioned on the source messages u σ , as
Q τ | σ ( x τ | u σ ) = u τ U τ P τ | σ ( u τ | u σ ) Q τ , u τ ( x τ ) .
Therefore, the transmissibility condition (57) can be compactly expressed as
H P τ | τ c < I Q τ | τ c , W | P τ c Q τ c | τ c , τ 2 N \ .
As it is, Q τ c | τ c is “transparent”, as it cancels inside the fraction, and the channel law may also be written as Q τ c | τ c W , removing the conditioning in the mutual information. With N = { 1 , 2 } in Equation (59), we recover the achievable Cover-El Gamal-Salehi region [11] (Eq. (3)).
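Numerically, the transmissibility condition can be checked without computing the derivative in closed form: since E τ ( ρ = 0 ) = 0 and the function is concave in ρ, a positive value at a small ρ implies a positive slope at the origin. The sketch below assumes the Gallager functions are available as a user-supplied callable (a hypothetical placeholder, not part of the paper).

```python
# Transmissibility test: positive slope of E_tau(rho) at rho = 0 for every error type tau.
def transmissible(gallager_fn, error_types, eps=1e-4):
    # gallager_fn(rho, tau) -> E_tau(rho); finite-difference slope works since E_tau(0) = 0
    return all(gallager_fn(eps, tau) / eps > 0.0 for tau in error_types)
```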

4.3. Numerical Examples

In this section, we present two simple examples showing that the exponent of the md-iid ensemble can be larger than that of the mi-iid ensemble with only two classes (and associated input distributions) for each user. First, we consider two correlated discrete memoryless sources, N = 2 and N = { 1 , 2 } , with alphabet U ν = { 0 , 1 } for both users ν N , and probability distribution P N ( u 1 , u 2 ) given in matrix form as
P N = 0.0005 0.0095 0.0005 0.9895 .
The sources are sent over a discrete memoryless multiple-access channel with input alphabets X 1 = X 2 = { 1 , 2 , 3 , 4 , 5 , 6 } and output alphabet Y = { 1 , 2 , 3 , 4 } . The channel transition probabilities are given by a 36 × 4 matrix W, such that W ( y | x 1 , x 2 ) is found in row x 1 + 6 ( x 2 - 1 ) . The transition matrix W is given by
W = W 1 W 2 W 3 W 4 W 5 W 6 ,
where the 6 × 4 submatrices W , = 1 , , 6 are given as follows. First, the submatrix W 1 corresponds to the point-to-point channel discussed in ref. [8] (Sec. IV.C), given by
W 1 = 1 3 k 1 k 1 k 1 k 1 k 1 1 3 k 1 k 1 k 1 k 1 k 1 1 3 k 1 k 1 k 1 k 1 k 1 1 3 k 1 0.5 k 2 0.5 k 2 k 2 k 2 k 2 k 2 0.5 k 2 0.5 k 2 ,
for k 1 = 0.045 and k 2 = 0.01 . Let the m-th row of matrix W 1 be denoted by W 1 ( m ) . The matrix W 2 (resp. W 3 ) is a 6 × 4 matrix whose rows are all equal to W 1 ( 5 ) (resp. W 1 ( 6 ) ). The matrices W 4 , W 5 and W 6 are respectively given by
W 4 = W 1 ( 2 ) W 1 ( 3 ) W 1 ( 4 ) W 1 ( 1 ) W 1 ( 6 ) W 1 ( 5 ) , W 5 = W 1 ( 3 ) W 1 ( 4 ) W 1 ( 1 ) W 1 ( 2 ) W 1 ( 5 ) W 1 ( 6 ) , W 6 = W 1 ( 4 ) W 1 ( 1 ) W 1 ( 2 ) W 1 ( 3 ) W 1 ( 6 ) W 1 ( 5 ) .
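The construction of W from W 1 is mechanical and can be reproduced as follows; this is a minimal sketch whose only assumption is 0-based indexing of the rows of W 1 in the code, versus 1-based in the text.

```python
# Assemble the 36 x 4 channel matrix W of Equation (61) from W_1 and the stated row permutations.
import numpy as np

k1, k2 = 0.045, 0.01
W1 = np.array([[1 - 3*k1, k1, k1, k1],
               [k1, 1 - 3*k1, k1, k1],
               [k1, k1, 1 - 3*k1, k1],
               [k1, k1, k1, 1 - 3*k1],
               [0.5 - k2, 0.5 - k2, k2, k2],
               [k2, k2, 0.5 - k2, 0.5 - k2]])

W2 = np.tile(W1[4], (6, 1))              # all six rows equal to W1(5)
W3 = np.tile(W1[5], (6, 1))              # all six rows equal to W1(6)
W4 = W1[[1, 2, 3, 0, 5, 4]]              # rows W1(2), W1(3), W1(4), W1(1), W1(6), W1(5)
W5 = W1[[2, 3, 0, 1, 4, 5]]
W6 = W1[[3, 0, 1, 2, 5, 4]]

W = np.vstack([W1, W2, W3, W4, W5, W6])  # W(y | x1, x2) sits in row x1 + 6*(x2 - 1), 1-based
assert np.allclose(W.sum(axis=1), 1.0)   # every row is a probability distribution
```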
The optimal achievable exponent [8] (Sec. IV.C) for the single-user channel W 1 in Equation (62) is related to two different distributions Q and Q , given in vector form by
Q = 0 , 0 , 0 , 0 , 1/2 , 1/2 ,
Q = 1/4 , 1/4 , 1/4 , 1/4 , 0 , 0 .
We let each user employ these distributions in the md-iid ensemble with input distribution in Equation (13) according to the source message partitioning in Equation (12) with K ν = 2 classes per user and thresholds γ N = ( γ 1 , γ 2 ) . Since we consider two input distributions for each user, the channel Gallager function max ρ [ 0 , 1 ] E 0 ( ρ , Q τ i τ , W Q τ c i τ c ) is not concave in ρ [8]. To find the md-iid exponent E md - iid , we optimize over the class thresholds following the method in Appendix B with the Gallager function in Equation (44), exploit the properties of the source function in Equation (45) in Appendix C, and also find the optimal input distribution assignment of Q ν i ν for each ν { 1 , 2 } . In our setting, we have four possible assignments, namely
Ω 1 : Q 1 1 = Q 2 1 = Q , Q 1 2 = Q 2 2 = Q ,
Ω 2 : Q 1 1 = Q 2 2 = Q , Q 1 2 = Q 2 1 = Q ,
Ω 3 : Q 1 2 = Q 2 1 = Q , Q 1 1 = Q 2 2 = Q ,
Ω 4 : Q 1 2 = Q 2 2 = Q , Q 1 1 = Q 2 1 = Q .
We start our numerical discussion by assessing which of the possible four assignments in Equations (66)–(69) leads to a higher error exponent. For each possible pair of thresholds ( γ 1 , γ 2 ) , we numerically calculate the optimal assignment Ω ( γ N ) given by
Ω ( γ N ) = arg max Ω j min i N min τ E τ i N ( γ N ) ,
and the corresponding achievable error exponent E cost ( γ N ) as
E c o s t ( γ N ) = max Ω j min i N min τ E τ i N ( γ N ) ,
where the exponent function E τ i N ( γ N ) is given in Equation (A55). Figure 2 and Figure 3 respectively show Ω ( γ N ) and E cost ( γ N ) for the valid range of γ N . For most pairs of thresholds ( γ 1 , γ 2 ) , assignments Ω 1 and Ω 3 lead to the highest exponent among the possible assignments, while assignments Ω 2 and Ω 4 are optimal only for a marginal region. Using this information, combined with the values of the achievable exponents in Figure 3, we determine the message-dependent exponent
E md - iid = max γ N E cost ( γ N ) .
In this example, we obtained the achievable exponent E md - iid = 0.2611 , corresponding to the input distribution assignment Ω 1 in Equation (66) and optimal source message partitioning γ 1 = 0.8469 and γ 2 = 0.6581 . The optimal point γ N is shown by a white (black) bullet in Figure 2 (Figure 3).
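Structurally, the search in Equations (70)-(71) is a max-min over a grid of thresholds; the sketch below shows this structure with a hypothetical placeholder exponent_fn standing in for E τ i N ( γ N ) of Equation (A55).

```python
# Grid search over thresholds gamma_N: worst case over (tau, i_N), best case over assignments.
import itertools
import numpy as np

def md_iid_exponent(exponent_fn, assignments, error_types, class_pairs, gamma_grid):
    best = -np.inf
    for g1, g2 in itertools.product(gamma_grid, gamma_grid):
        e = max(min(exponent_fn(Om, tau, iN, (g1, g2))
                    for tau in error_types for iN in class_pairs)
                for Om in assignments)
        best = max(best, e)            # outer maximization over gamma_N, Equation (72)
    return best
```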
Alternatively, we may first optimize over γ N and then over the assignments Ω j . To do so, we solve the system of Equation (A58) in Appendix B to numerically determine the optimal thresholds γ N , and compute the exponent E cost ( Ω j ) as
E cost ( Ω j ) = min i N min τ E τ i N ( γ N ) ,
where the exponent function E τ i N ( γ N ) is given in Equation (A55). We provide in Table 1 the values of the optimal thresholds γ N and exponents E τ i N ( γ N ) under the different assignments Ω j , for the three types of error τ and the four possible user classes i N . For each assignment, the minimum over i N and τ as in Equation (73) is highlighted in gray, leading to the exponent E cost ( Ω j ) . The message-dependent exponent is then
E md - iid = max j E cost ( Ω j ) ,
recovering the error exponent E md - iid = 0.2611 for input distribution assignment Ω 1 obtained using the previous method in Equation (71).
In the second example, we consider the transmission of two independent discrete memoryless sources with identical source alphabets U ν = { 0 , 1 } and with distributions induced by the marginals of Equation (60), given by P 1 ( 0 ) = 0.01 and P 2 ( 0 ) = 0.001 . These sources are transmitted over the multiple-access channel with transition probability given by Equation (61), and are encoded using the md-iid ensemble with the input distribution assignments Ω j in Equations (66)–(69). Following the same steps as in the correlated-sources case, in Table 2 we calculate the optimal thresholds γ N and exponents E τ i N ( γ N ) for the possible input distribution assignments and determine the exponent of the md-iid ensemble using Equations (73) and (74). In this case, the optimal assignment is again Ω 1 , with optimal source message partitioning specified by the thresholds γ 1 = 0.8779 and γ 2 = 0.6933 , achieving an exponent of E md - iid = 0.2458 , slightly smaller than that obtained for correlated sources.
For the sake of completeness and purpose of comparison, we also calculate the exponent for the mi-iid ensemble described in Equation (8). In the absence of message dependence, for a given assignment Ω j , the mi-iid exponent is given by
E no - cost ( Ω j ) = min τ E τ ,
where the exponent function E τ is given by E τ = max ρ E τ mi - iid ( ρ , P N , Q N , W ) and E τ mi - iid is the Gallager function in Equation (37), described in the previous subsection. For both the correlated and independent sources described above, Table 3 presents the achievable exponents E τ for each type of error τ and input distribution assignment ( Q 1 , Q 2 ) , where Q 1 and Q 2 are either of the two distributions in Equations (64) and (65). In our numerical example for correlated sources, the assignment with highest exponent is ( Q 1 , Q 2 ) = ( Q , Q ) , giving an exponent of E mi - iid = 0.2503 , slightly smaller than that of the md-iid ensemble. In contrast, the mi-iid exponent for independent sources, according to the second part of Table 3, is found to be E mi - iid = 0.2367 with input distribution ( Q 1 , Q 2 ) = ( Q , Q ) . In this case, the md-iid exponent E md - iid is around 4 % larger than the mi-iid one; this situation is in contrast with point-to-point communication, where the gain in exponent achieved by an ensemble with two distributions is typically smaller, for example, 1 % in ref. [8]. Hence, message-dependent random coding with two class distributions, compared to iid random coding, may lead to a higher error exponent gain in the MAC than in point-to-point communication.

4.4. Comparison of the Random-Coding Achievable Error Exponents

From the numerical results presented in Section 4.3, as well as from refs. [8,20,28,31], the message-dependent ensembles attain in general a larger exponent than their message-independent counterparts. We now compare the random-coding exponents for the ensembles presented in Section 3.1, whose Gallager functions were obtained in Section 4.1.
For independent sources, we found in Equation (42) that for a given conditional input distribution Q ν , u ν ( x ν ) and ρ , there exists an iid distribution Q ν ρ given by Equation (39) with identical Gallager function. Thus, the mi-iid and mi-icd ensembles attain the same exponent, after maximization over the input distributions. Similarly, we conclude that the md-iid and md-icd ensembles attain the same exponent.
In ref. [31] (Prop. 2.9), it was proved that for point-to-point communication, the exponent of the mi-ccc ensemble may be lower than that of the mi-cc ensemble. The same steps actually prove the same result for the MAC with independent sources. Thus, for the MAC with independent sources we have
E mi - ccc ≤ E mi - cc ≤ E md - cc , E md - ccc ≤ E md - cc ,
E mi - iid ≤ E mi - cc ≤ E md - cc , E md - iid ≤ E md - cc ,
and E md - cc is thus the largest among the ensembles in Section 3.1 for an arbitrary input distribution. As discussed in ref. [29] (Th. 4), for optimal input distributions both E md - cc and E md - iid may coincide.
Concerning the optimal partitioning into message classes, for point-to-point communication it is known that partitioning the source-message set into two classes is sufficient to attain the optimal error exponent [8,31] (Prop. 2.7). However, the proof of ref. [31] (Prop. 2.7) cannot be easily generalized to the MAC with independent sources. At the same time, we could not find an example showing that assigning more than two input distributions leads to a larger exponent. Hence, finding the sufficient number of input distributions for the message-dependent exponent remains an open problem.
The comparisons in Equations (76) and (77) for correlated sources require, in general, a more sophisticated machinery and we consider here two simple cases. For the message-dependent md-icd and md-ccc ensembles, we observe that compared to E τ , i N md - icd in Equation (43) the E τ , i N md - ccc exponent in Equation (54) contains an additional term β τ , N , u N i N ( x N ) to guarantee the constant-composition distribution as in Equation (52). This allows us to recover E τ , i N md - icd by setting β τ , N , u N i N ( x N ) = 0 in E τ , i N md - ccc and to prove that E md - icd ≤ E md - ccc after maximizing w. r. t. β τ , N , u N i N ( x N ) . Similarly, for the ensembles whose input distributions do not depend on the individual source symbols, we observe that the constant-composition exponent E τ , i N md - cc in Equation (51) also contains the additional term α τ , N i N ( x N ) compared to its iid counterpart E τ , i N md - iid in Equation (44), yielding E md - iid ≤ E md - cc . Put together, for correlated sources it holds that
E md - icd ≤ E md - ccc , E md - iid ≤ E md - cc ,
suggesting that, as in the case of single-user communication, the use of constant-composition input distributions may lead to higher exponents than the symbol-wise independent distributions when transmitting correlated sources over the MAC.
Summarizing, proper choices of the cost functions recover the different coding schemes considered in Section 3.1, including message-dependent and message-independent versions of iid, independent conditionally distributed, constant-composition, and conditional constant-composition ensembles. Thanks to the flexibility of the generalized cost-constrained random-coding ensemble, the achievable exponents of the various ensembles can be compared and ranked, both numerically and analytically.

Author Contributions

Conceptualization, A.M. and A.G.i.F.; methodology, J.F.-S.; software, A.R.; formal analysis, A.R.; writing, A.R., J.F.-S. and A.M. All authors have equally contributed to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been funded in part by the European Research Council under grant 725411, and by the Spanish Ministry of Economy and Competitiveness under grant TEC2016-78434-C3-1-R.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

We start by bounding the average error probability over the generalized cost-constrained ensemble, P ¯ e . Counting ties as errors, the random coding union bound [2] (Th. 16) for joint source-channel coding is
P ¯ e u N , x N , y P N ( u N ) Q N ( x N | u N ) W ( y | x N ) min 1 , u ^ N u N Pr P N ( u ^ N ) W y | X ^ N P N ( u N ) W ( y | x N ) 1 ,
where Q N ( x N | u N ) is given by Equation (7), with every user using the generalized cost-constrained input distribution Q ν ( x ν | u ν ) as in Equation (24), and x ^ N has the same distribution as x N but conditioned on u ^ N rather than u N , i.e., Q N ( x ^ N | u ^ N ) . The summation over u ^ N u N can be split into 2 N 1 distinct types of error events indexed by the non-empty subsets in the power set of the user indices 2 N \ , e.g., τ { { 1 } , { 2 } , { 1 , 2 } } for N = 2 , such that u ^ τ c = u τ c and u ^ ν u ν for all ν τ .
Since min { 1 , a + b } min { 1 , a } + min { 1 , b } , we bound P ¯ e as
P ¯ e τ 2 N \ P ¯ e τ ,
where P ¯ e τ is in turn given by
P ¯ e τ = u N , x N , y P N ( u N ) Q N ( x N | u N ) W ( y | x N ) min 1 , u ^ N : u ^ τ c = u τ c , u ^ ν u ν , ν τ Pr P N ( u ^ N ) W y | [ x τ c , X ^ τ ] P N ( u N ) W ( y | x N ) 1 ,
where the inner probability is computed according to the distribution Q τ ( x τ | u τ ) , including only the users in the set τ , since x ^ τ c = x τ c . We recall that [ x τ c , X ^ τ ] is the sorted merger of the channel inputs for users in the sets τ c and τ , in this case x τ c and X ^ τ respectively.
Next, we split the summation over u N in Equation (A3) into classes i N K N defined by Equation (12), summing then over the messages belonging to the Cartesian product of the sets A N i N . We note that codewords are generated according to distributions that depend on the class index of the sources. Let D N , u N i N be the Cartesian product of the sets of codewords D ν , u ν i ν in Equation (23) for ν = 1 , 2 , , N , and define
$$ Q_N^{i^N}(x^N | u^N) = \prod_{\nu \in \mathcal{N}} Q_\nu^{i_\nu}(x_\nu | u_\nu), $$   (A4)
where Q ν i ν ( x ν | u ν ) is given by either Equation (24) or Equation (25). Then, the double outer summation of Equation (A3) over u N and x N can be written as
$$ \sum_{u^N, x^N} P_N(u^N) \, Q_N(x^N | u^N) = \sum_{i^N \in \mathcal{K}^N} \sum_{u^N \in \mathcal{A}_N^{i^N}} \sum_{x^N \in \mathcal{D}_{N, u^N}^{i^N}} P_N(u^N) \, Q_N^{i^N}(x^N | u^N) $$   (A5)
$$ = \sum_{i_{\tau^c} \in \mathcal{K}_{\tau^c}} \sum_{u_{\tau^c} \in \mathcal{A}_{\tau^c}^{i_{\tau^c}}} \sum_{x_{\tau^c} \in \mathcal{D}_{\tau^c, u_{\tau^c}}^{i_{\tau^c}}} P_{\tau^c}(u_{\tau^c}) \, Q_{\tau^c}^{i_{\tau^c}}(x_{\tau^c} | u_{\tau^c}) \sum_{i_\tau \in \mathcal{K}_\tau} \sum_{u_\tau \in \mathcal{A}_\tau^{i_\tau}} \sum_{x_\tau \in \mathcal{D}_{\tau, u_\tau}^{i_\tau}} P_{\tau|\tau^c}(u_\tau | u_{\tau^c}) \, Q_\tau^{i_\tau}(x_\tau | u_\tau), $$   (A6)
where we split the summations over u N and x N into separate summations over u τ c and u τ , similarly with x τ c and x τ with the corresponding rearrangements in the probabilities, and written the term Q τ c i τ c ( x τ c | u τ c ) in a similar way to Equation (A4). The inner summation of Equation (A3) can be split in an analogous manner based on the classes to which u ^ τ belongs, now indexed by the variable j τ K τ . Applying this fact together with Markov’s inequality
$$ \Pr\{A \ge 1\} \le \min_{s \ge 0} \mathbb{E}[A^s] $$   (A7)
to upper bound the probability, with a parameter $s \ge 0$ that implicitly depends on the error-event type $\tau$ and the indices $i_{\tau^c}$, $i_\tau$, and $j_\tau$, we bound the inner summation of Equation (A3) as
$$ \sum_{\substack{\hat{u}^N : \, \hat{u}_{\tau^c} = u_{\tau^c}, \\ \hat{u}_\nu \ne u_\nu, \, \nu \in \tau}} \Pr\!\left[ \frac{P_N(\hat{u}^N) \, W\bigl(y \,\big|\, [x_{\tau^c}, \hat{X}_\tau]\bigr)}{P_N(u^N) \, W(y | x^N)} \ge 1 \right] \le \sum_{j_\tau \in \mathcal{K}_\tau} \min_{s \ge 0} \sum_{\hat{u}_\tau \in \mathcal{A}_\tau^{j_\tau}} \sum_{\hat{x}_\tau \in \mathcal{D}_{\tau, \hat{u}_\tau}^{j_\tau}} Q_\tau^{j_\tau}(\hat{x}_\tau | \hat{u}_\tau) \left( \frac{P_{\tau|\tau^c}(\hat{u}_\tau | u_{\tau^c}) \, W\bigl(y \,\big|\, [x_{\tau^c}, \hat{x}_\tau]\bigr)}{P_{\tau|\tau^c}(u_\tau | u_{\tau^c}) \, W\bigl(y \,\big|\, [x_{\tau^c}, x_\tau]\bigr)} \right)^{s}, $$   (A8)
where we also used that P N ( u ^ N ) = P τ c ( u ^ τ c ) P τ | τ c ( u ^ τ | u ^ τ c ) = P τ c ( u τ c ) P τ | τ c ( u ^ τ | u τ c ) to rewrite the message probabilities in Equation (A8) and we expressed the codeword x N as [ x τ c , x τ ] .
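The Markov-type bound used above is easy to verify numerically. The Python sketch below checks Equation (A7) for an arbitrary discrete random variable $A \ge 0$; the distribution and the grid over $s$ are illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
values = rng.exponential(scale=0.7, size=6)   # support points of A >= 0 (arbitrary)
probs = rng.dirichlet(np.ones(6))             # their probabilities

p_ge_1 = probs[values >= 1.0].sum()           # Pr{A >= 1}
s_grid = np.linspace(0.0, 10.0, 1001)         # grid approximation of the min over s >= 0
markov = min((probs * values ** s).sum() for s in s_grid)

assert p_ge_1 <= markov + 1e-12               # Pr{A >= 1} <= min_s E[A^s]
print(p_ge_1, markov)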
Inserting Equations (A6) and (A8) into Equation (A3) and using the following inequality for $A \ge 0$,
$$ \min\{1, A\} \le \min_{\rho \in [0,1]} A^{\rho}, $$   (A9)
where $\rho \in [0,1]$, we further bound $\bar{P}_e^{\tau}$ as
$$ \bar{P}_e^{\tau} \le \sum_{i_{\tau^c} \in \mathcal{K}_{\tau^c}} \sum_{i_\tau \in \mathcal{K}_\tau} \sum_{j_\tau \in \mathcal{K}_\tau} \min_{s \ge 0} \min_{\rho \in [0,1]} \bar{P}_{e, i_{\tau^c} i_\tau}^{\tau, j_\tau}, $$   (A10)
where after some minor rearrangements P ¯ e , i τ c i τ τ , j τ is in turn given by
P ¯ e , i τ c i τ τ , j τ = u τ c A τ c i τ c x τ c D τ c , u τ c i τ c P τ c ( u τ c ) Q τ c i τ c ( x τ c | u τ c ) y Y n u τ A τ i τ x τ D τ , u τ i τ P τ | τ c ( u τ | u τ c ) Q τ i τ ( x τ | u τ ) W y | [ x τ c , x τ ] u ^ τ A τ j τ x ^ τ D τ , u ^ τ j τ Q τ j τ ( x ^ τ | u ^ τ ) P τ | τ c ( u ^ τ | u τ c ) W y | [ x τ c , x ^ τ ] P τ | τ c ( u τ | u τ c ) W y | [ x τ c , x τ ] s ρ .
Note that, for some conveniently chosen variables z 0 and z i τ , sets Z 0 and Z i τ , as well as functions f 0 ( z 0 ) and f i τ s ( z 0 , z i τ ) , with i τ K τ , we can express P ¯ e , i τ c i τ τ , j τ as
$$ \bar{P}_{e, i_{\tau^c} i_\tau}^{\tau, j_\tau} = \sum_{z_0 \in \mathcal{Z}_0} f_0(z_0) \left( \sum_{z_{i_\tau} \in \mathcal{Z}_{i_\tau}} f_{i_\tau}^{\,1 - s_{i_\tau, j_\tau} \rho_{i_\tau, j_\tau}}(z_0, z_{i_\tau}) \right) \left( \sum_{z_{j_\tau} \in \mathcal{Z}_{j_\tau}} f_{j_\tau}^{\,s_{i_\tau, j_\tau}}(z_0, z_{j_\tau}) \right)^{\rho_{i_\tau, j_\tau}}. $$   (A12)
In Equation (A12), the variable z 0 stands for the triplet ( u τ c , x τ c , y ) , the alphabet Z 0 for the Cartesian product A τ c i τ c × D τ c , u τ c i τ c × Y n and the function f 0 ( z 0 ) is given by P τ c ( u τ c ) Q τ c i τ c ( x τ c | u τ c ) . The variable z i τ stands for the pair ( u τ , x τ ) , the alphabet Z i τ for the Cartesian product A τ i τ × D τ , u τ i τ and the function f i τ s ( z 0 , z i τ ) is given by P τ | τ c ( u τ | u τ c ) s Q τ i τ ( x τ | u τ ) W y | [ x τ c , x τ ] s .
The optimization parameters $s$ and $\rho$ in Equation (A10) implicitly depend on the error-event type $\tau$ and the indices $i_{\tau^c}$, $i_\tau$, and $j_\tau$. For new parameters $\bar{\rho}_{i_\tau} \in [0,1]$, $i_\tau \in \mathcal{K}_\tau$, setting
$$ s_{i_\tau, j_\tau} = \frac{1}{1 + \bar{\rho}_{j_\tau}}, $$   (A13)
$$ \rho_{i_\tau, j_\tau} = \frac{\bar{\rho}_{i_\tau} (1 + \bar{\rho}_{j_\tau})}{1 + \bar{\rho}_{i_\tau}}, $$   (A14)
in Equation (A12), we obtain the following partial upper bound on Equation (A10):
min s 0 min ρ [ 0 , 1 ] P ¯ e , i τ c i τ τ , j τ min ρ ¯ τ [ 0 , 1 ] K τ z 0 Z 0 f 0 ( z 0 ) z i τ Z i τ f i τ 1 1 + ρ ¯ i τ ( z 0 , z i τ ) z j τ Z j τ f j τ 1 1 + ρ ¯ j τ ( z 0 , z j τ ) ρ ¯ i τ ( 1 + ρ ¯ j τ ) 1 + ρ ¯ i τ .
Here, we have kept implicit the dependence on τ and i τ c of the optimization parameter ρ ¯ τ . Now, applying Hölder’s inequality [33] (Th. 13) in the form
$$ \sum_{i \in \mathcal{K}} \alpha_i a_i b_i \le \left( \sum_{i \in \mathcal{K}} \alpha_i a_i^{p} \right)^{\frac{1}{p}} \left( \sum_{i \in \mathcal{K}} \alpha_i b_i^{\frac{p}{p-1}} \right)^{\frac{p-1}{p}}, \quad \text{for } p \in [1, \infty), $$   (A16)
to the expression in Equation (A15) with $p_{i_\tau, j_\tau} = 1 + \bar{\rho}_{i_\tau}$, we obtain
min s 0 min ρ [ 0 , 1 ] P ¯ e , i τ c i τ τ , j τ min ρ ¯ τ [ 0 , 1 ] K τ z 0 Z 0 f 0 ( z 0 ) z i τ Z i τ f i τ 1 1 + ρ ¯ i τ ( z 0 , z i τ ) 1 + ρ ¯ i τ 1 1 + ρ ¯ i τ z 0 Z 0 f 0 ( z 0 ) z j τ Z j τ f j τ 1 1 + ρ ¯ j τ ( z 0 , z j τ ) 1 + ρ ¯ j τ ρ ¯ i τ 1 + ρ ¯ i τ .
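As a quick sanity check, illustrative only, the reparameterization in Equations (A13) and (A14) indeed maps any pair $\bar{\rho}_{i_\tau}, \bar{\rho}_{j_\tau} \in [0,1]$ to admissible values $s \ge 0$ and $\rho \in [0,1]$, so the partial bound is legitimate:

import numpy as np

grid = np.linspace(0.0, 1.0, 101)
for rho_bar_i in grid:
    for rho_bar_j in grid:
        s = 1.0 / (1.0 + rho_bar_j)                                # Equation (A13)
        rho = rho_bar_i * (1.0 + rho_bar_j) / (1.0 + rho_bar_i)    # Equation (A14)
        assert s >= 0.0
        assert 0.0 <= rho <= 1.0 + 1e-12    # rho_bar_i * rho_bar_j <= 1 guarantees rho <= 1
print("the substitution stays within s >= 0 and rho in [0, 1]")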
Next, putting Equation (A17) back in Equation (A10) and using the following inequality, proved in Appendix A.1, for $A_i \ge 0$ and $0 \le s_i \le 1$,
$$ \sum_{i, j \in \mathcal{K}} A_i^{s_i} A_j^{1 - s_i} \le 2 |\mathcal{K}| \sum_{i \in \mathcal{K}} A_i $$   (A18)
in the double summation over i τ and j τ in Equation (A10), the following upper bound holds
$$ \sum_{i_\tau \in \mathcal{K}_\tau} \sum_{j_\tau \in \mathcal{K}_\tau} \min_{s \ge 0} \min_{\rho \in [0,1]} \bar{P}_{e, i_{\tau^c} i_\tau}^{\tau, j_\tau} \le 2 K_\tau \sum_{i_\tau \in \mathcal{K}_\tau} \min_{\rho \in [0,1]} \bar{P}_{e, i_{\tau^c} i_\tau}^{\tau}, $$   (A19)
where we have moved the optimization over $\bar{\rho}_\tau$ inside the summation over $i_\tau$ and renamed $\bar{\rho}_\tau$ as $\rho$, with the dependence on the index $i_\tau$ kept implicit. Moreover, the expression for $\bar{P}_{e, i_{\tau^c} i_\tau}^{\tau}$ is in fact given by $\bar{P}_{e, i_{\tau^c} i_\tau}^{\tau, j_\tau}$ in Equation (A11) after setting $i_\tau = j_\tau$ and $s = \frac{1}{1+\rho}$ and rearranging terms, that is,
P ¯ e , i τ c i τ τ = u τ c A τ c i τ c x τ c D τ c , u τ c i τ c P τ c ( u τ c ) Q τ c i τ c ( x τ c | u τ c ) y Y n u τ A τ i τ x τ D τ , u τ i τ P τ | τ c ( u τ | u τ c ) 1 1 + ρ Q τ i τ ( x τ | u τ ) W ( y | x N ) 1 1 + ρ 1 + ρ .
It remains to factorize Equation (A20) into a product of symbol distributions in order to obtain a single-letter expression for the exponent. We start by upper bounding the summations over the input messages u τ c and u τ . For a list of users σ with corresponding messages u σ , list of class indices i σ and some function p σ i σ ( u σ ) , we have that
$$ \sum_{u_\sigma \in \mathcal{A}_\sigma^{i_\sigma}} p_\sigma^{i_\sigma}(u_\sigma) = \sum_{u_\sigma \in \mathcal{U}_\sigma^n} p_\sigma^{i_\sigma}(u_\sigma) \, \mathbb{1}\bigl\{ u_\sigma \in \mathcal{A}_\sigma^{i_\sigma} \bigr\}, $$   (A21)
where we used the definition of the message sets A σ i σ in Equation (12) and the identity
$$ \sum_{i \in \mathcal{K}} f_i = \sum_{i \in \mathcal{N}} f_i \, \mathbb{1}\{ i \in \mathcal{K} \}. $$   (A22)
Using the upper bound
$$ \mathbb{1}\{ a < b \le c \} \le \min_{\lambda_L, \lambda_U \ge 0} \left( \frac{b}{a} \right)^{\lambda_L} \left( \frac{c}{b} \right)^{\lambda_U} $$   (A23)
for $a, b, c > 0$ with $\lambda_L, \lambda_U \ge 0$, together with the fact that the source-message classes are defined separately for each user to express the source message probabilities in terms of $P_\sigma^{\mathrm{ind}}(u_\sigma) = \prod_{\nu \in \sigma} P_\nu(u_\nu)$ similarly to Equation (2), we upper bound the r. h. s. of Equation (A21) as
$$ \sum_{u_\sigma \in \mathcal{A}_\sigma^{i_\sigma}} p_\sigma^{i_\sigma}(u_\sigma) \le \min_{\lambda_\sigma^{L,U} \ge 0} \sum_{u_\sigma \in \mathcal{U}_\sigma^n} p_\sigma^{i_\sigma}(u_\sigma) \left( \frac{P_\sigma^{\mathrm{ind}}(u_\sigma)}{\gamma_{\sigma, i_\sigma}^{\,n}} \right)^{\lambda_\sigma^{L}} \left( \frac{\gamma_{\sigma, i_\sigma - 1}^{\,n}}{P_\sigma^{\mathrm{ind}}(u_\sigma)} \right)^{\lambda_\sigma^{U}}, $$   (A24)
where we jointly wrote $\lambda_\sigma^{L}$ and $\lambda_\sigma^{U}$ as $\lambda_\sigma^{L,U}$. Defining
$$ \Lambda_\sigma^{i_\sigma}(u_\sigma) = \left( \frac{P_\sigma^{\mathrm{ind}}(u_\sigma)}{\gamma_{\sigma, i_\sigma}} \right)^{\lambda_\sigma^{L}} \left( \frac{\gamma_{\sigma, i_\sigma - 1}}{P_\sigma^{\mathrm{ind}}(u_\sigma)} \right)^{\lambda_\sigma^{U}}, $$   (A25)
and taking into account that the sources are memoryless, we obtain that the summations w. r. t. the source messages u τ and u τ c in Equation (A20) are upper bounded as
$$ \sum_{u_\sigma \in \mathcal{A}_\sigma^{i_\sigma}} p_\sigma^{i_\sigma}(u_\sigma) \le \min_{\lambda_\sigma^{L,U} \ge 0} \sum_{u_\sigma \in \mathcal{U}_\sigma^n} p_\sigma^{i_\sigma}(u_\sigma) \prod_{t=1}^{n} \Lambda_\sigma^{i_\sigma}(u_{\sigma, t}), $$   (A26)
respectively for $\sigma = \tau$ and $\sigma = \tau^c$.
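To make the class decomposition concrete, the short Python sketch below enumerates all length-$n$ sequences of a toy memoryless source, assigns each to class 1 or 2 according to a threshold $\gamma$ as in Equation (12), and checks the indicator bound of Equation (A23) termwise for the lower constraint; the source, blocklength, threshold, and multiplier values are arbitrary choices for illustration.

import itertools
import numpy as np

P = {0: 0.7, 1: 0.3}        # toy memoryless source
n, gamma = 4, 0.5           # blocklength and class threshold
lam_L = 0.3                 # a fixed lambda^L >= 0 used in the check

for u in itertools.product(P, repeat=n):
    prob = float(np.prod([P[s] for s in u]))
    in_class_1 = prob >= gamma ** n              # membership in the first class, as in Equation (12)
    bound = (prob / gamma ** n) ** lam_L         # one factor of the bound in (A23)/(A24)
    assert float(in_class_1) <= bound + 1e-12    # the indicator is dominated termwise
print("indicator bound holds for every length-n sequence")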
We proceed in a similar manner for the summations w. r. t. the codewords x τ and x τ c in Equation (A20). For a list of users σ and some function q σ i σ ( u σ , x σ ) implicitly defined, the summation over channel codewords x σ D σ , u σ i σ can be upper bounded as:
x σ D σ , u σ i σ q σ i σ ( u σ , x σ ) = x σ X σ n q σ i σ ( u σ , x σ ) 1 { x σ D σ , u σ i σ }
= x σ X σ n q σ i σ ( u σ , x σ ) ν σ υ ν U ν ν L ν 1 a ν , υ ν i ν , ν x ν I υ ν ( u ν ) δ ν
min r σ υ σ σ , r ¯ σ υ σ σ x σ X σ n q σ i σ ( u σ , x σ ) υ σ U σ σ L σ e r σ υ σ σ a σ , υ σ i σ , σ ( x σ ( I υ σ ( u σ ) ) ) + r ¯ σ υ σ σ δ σ ,
where we used Equation (A22) in Equation (A27), the fact that the codeword ensembles are defined separately for each user together with the definition of the ensemble cost constraints in Equation (23) and subcodewords x ν I u ν ( u ν ) in Equation (A28), and a variant of Equation (A23) proved in Appendix A.2,
$$ \mathbb{1}\{ |a| \le \delta \} \le \min_{r, \bar{r}} e^{r a + \bar{r} \delta} $$   (A30)
for $r \in \mathbb{R}$ and $\bar{r} \ge 0$, in Equation (A29) for each indicator function of Equation (A28), and combined the product of exponentials over $\sigma$ as a single exponential using the list notation. We continue by rewriting the double product over $\upsilon_\sigma$ and $\ell_\sigma$ in Equation (A29) as follows
υ σ U σ σ L σ e r σ υ σ σ a σ , υ σ i σ , σ ( x σ ( I υ σ ( u σ ) ) ) + r ¯ σ υ σ σ δ σ = υ σ U σ σ L σ e r ¯ σ υ σ σ δ σ t I υ σ ( u σ ) e r σ υ σ σ a σ , υ σ i σ , σ ( x σ , t ( I υ σ ( u σ ) ) )
= β σ t = 1 n R σ , u τ , t i τ ( x τ , t ) ,
where in Equation (A31) we wrote the cost function in terms of the symbol costs and in Equation (A32) we rearranged terms and introduced a factor β σ that depends on the list { r ¯ σ υ σ σ } and a function R σ , u σ i σ ( x σ ) that depends on the list { r σ υ σ σ } and are respectively given by
$$ \beta_\sigma = \prod_{\upsilon_\sigma \in \mathcal{U}_\sigma} \prod_{\ell_\sigma \in \mathcal{L}_\sigma} e^{\bar{r}_{\sigma, \upsilon_\sigma}^{\ell_\sigma} \delta_\sigma}, $$   (A33)
$$ R_{\sigma, u_\sigma}^{i_\sigma}(x_\sigma) = \prod_{\ell_\sigma \in \mathcal{L}_\sigma} e^{r_{\sigma, u_\sigma}^{\ell_\sigma} \, a_{\sigma, u_\sigma}^{i_\sigma, \ell_\sigma}(x_\sigma)}. $$   (A34)
Replacing Equation (A32) back into Equation (A29), we obtain that the summations over the codewords are upper bounded as
x σ D σ , u σ i σ q σ i σ ( u σ , x σ ) min r σ u σ σ , r ¯ σ u σ σ x σ X σ n q σ i σ ( u σ , x σ ) β σ t = 1 n R σ , u σ , t i σ ( x τ , t ) .
for both user lists σ = τ and σ = τ c .
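The role of the cost-constrained codeword sets can also be illustrated numerically. The Python sketch below draws i.i.d. candidate codewords from an input distribution $Q$ and keeps those whose empirical cost is close to the ensemble-average cost; the per-symbol normalization, the slack $\delta$, and all numerical values are illustrative and do not reproduce the exact cost functions of Equation (23), which additionally depend on the source-symbol positions.

import numpy as np

rng = np.random.default_rng(1)
Q = np.array([0.5, 0.3, 0.2])      # input distribution on X = {0, 1, 2} (illustrative)
a = np.array([0.0, 1.0, 4.0])      # a symbol cost function a(x) (illustrative)
n, delta = 200, 0.1                # blocklength and cost slack

mean_cost = float(Q @ a)                                 # ensemble-average symbol cost
codewords = rng.choice(len(Q), size=(1000, n), p=Q)      # iid candidate codewords
emp_cost = a[codewords].mean(axis=1)                     # empirical per-symbol cost
in_D = np.abs(emp_cost - mean_cost) <= delta             # cost-constrained set membership
print(f"fraction of candidates satisfying the cost constraint: {in_D.mean():.2f}")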
We now combine Equations (A26) and (A35) for σ = τ to bound the summation inside the parenthesis in Equation (A20) as
u τ A τ i τ x τ D τ , u τ i τ P τ | τ c ( u τ | u τ c ) 1 1 + ρ Q τ i τ ( x τ | u τ ) W ( y | x N ) 1 1 + ρ u τ U τ n x τ X τ n P τ | τ c ( u τ | u τ c ) 1 1 + ρ min λ τ L , U r τ u τ τ , r ¯ τ u τ τ β τ Ξ τ t = 1 n Λ τ i τ ( u τ , t ) Q τ , u τ , t i τ ( x τ , t ) R τ , u τ , t i τ ( x τ , t ) W ( y | x N ) 1 1 + ρ ,
where we expressed the distribution Q τ i τ ( x τ , u τ ) in terms of the symbol-wise iid distribution Q τ , u τ i τ ( x ) as in Equation (25). Since both source and channel are memoryless, we may now factorize and rearrange the expression in Equation (A36) into single-letter, symbolwise factors as
u τ A τ i τ x τ D τ , u τ i τ P τ | τ c ( u τ | u τ c ) 1 1 + ρ Q τ i τ ( x τ | u τ ) W ( y | x N ) 1 1 + ρ min λ τ L , U r τ u τ τ , r ¯ τ u τ τ β τ Ξ τ t = 1 n g τ i τ ( u τ c , t , x τ c , t , y t ) ,
where, for a list of users σ , the function g σ i σ ( u σ c , x σ c , y ) is defined as
g σ i σ ( u σ c , x σ c , y ) = u σ U σ x σ X σ P σ | σ c ( u σ | u σ c ) 1 1 + ρ Λ σ i σ ( u σ ) Q σ , u σ i σ ( x σ ) R σ , u σ i σ ( x σ ) W ( y | x N ) 1 1 + ρ .
Although not made explicit, the function $g_\tau^{i_\tau}(u_{\tau^c}, x_{\tau^c}, y)$ in Equation (A37) depends on several optimization parameters, namely $\rho$, $\lambda_\tau^{L,U}$, $r_{\tau, u_\tau}^{\ell_\tau}$, $\bar{r}_{\tau, u_\tau}^{\ell_\tau}$, which depend in turn on the error-event type $\tau$ and the class indices $i_\tau$ and $i_{\tau^c}$.
Again, we use Equations (A26) and (A35) for σ = τ c and the fact that the source is memoryless to upper bound the summation outside the parenthesis in Equation (A20) as
u τ c A τ c i τ c x τ c D τ c , u τ c i τ c P τ c ( u τ c ) Q τ c i τ c ( x τ c | u τ c ) min λ τ c L , U r τ c u τ c τ c , r ¯ τ c u τ c τ c u τ c U τ c n x τ c X τ c n β τ c Ξ τ c t = 1 n P τ c ( u τ c , t ) Λ τ c i τ c ( u τ c , t ) Q τ c , u τ c , t i τ c ( x τ c , t ) R τ c , u τ c , t i τ c ( x τ c , t ) .
Substituting Equations (A37) and (A39) in Equation (A20), the resulting expression back into Equation (A19) and then into Equation (A10), we get
P ¯ e τ 2 K τ i τ c K τ c i τ K τ min ρ , λ τ c L , U r τ c u τ c τ c , r ¯ τ c u τ c τ c u τ c U τ c n x τ c X τ c n β τ c Ξ τ c t = 1 n P τ c ( u τ c , t ) Λ τ c i τ c ( u τ c , t ) Q τ c , u τ c , t i τ c ( x τ c , t ) R τ c , u τ c , t i τ c ( x τ c , t ) y Y n β τ Ξ τ 1 + ρ min λ τ L , U r τ u τ τ , r ¯ τ u τ τ t = 1 n g τ i τ ( u τ c , t , x τ c , t , y t ) 1 + ρ .
Let us define now the function h σ i σ , i σ c of the user set σ and the class indices i σ and i σ c as
h σ i σ , i σ c = u σ U σ , x σ X σ , y Y P σ ( u σ ) Λ σ i σ ( u σ ) Q σ , u σ i σ ( x σ ) R σ , u σ i σ ( x σ ) g σ c i σ c ( u σ , x σ , y ) 1 + ρ .
With this definition, we can rewrite Equation (A40) in a compact manner as
$$ \bar{P}_e^{\tau} \le 2 K_\tau \sum_{i^N \in \mathcal{K}^N} \min_{\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}, \bar{r}_{N, u_N}^{\ell_N}} \beta_{\tau^c} \, \Xi_{\tau^c} \left( \beta_\tau \, \Xi_\tau \right)^{1+\rho} \prod_{t=1}^{n} h_{\tau^c}^{i_{\tau^c}, i_\tau}, $$   (A42)
where we have also combined the complementary sets $i_{\tau^c}$ and $i_\tau$ into $i^N$, and similarly for $\lambda_N^{L}$, $\lambda_N^{U}$, $r_{N, u_N}^{\ell_N}$, and $\bar{r}_{N, u_N}^{\ell_N}$. Finally, substituting Equation (A42) into Equation (A10) and then back into Equation (A2), taking (minus) the logarithm of the bound on $\bar{P}_e$, dividing the result by $n$, and taking the limit as $n \to \infty$, we obtain a lower bound $E^{\mathrm{cost}}$ to the exponent of the generalized cost-constrained ensemble, namely
$$ E^{\mathrm{cost}} = \min_{\tau, i^N} \max_{\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}} \Bigl\{ - \log h_{\tau^c}^{i_{\tau^c}, i_\tau} \Bigr\}, $$   (A43)
where we have used that, as $n \to \infty$, the quantities $2 K_\tau$, $\beta_{\tau^c} \Xi_{\tau^c}$, and $(\beta_\tau \Xi_\tau)^{1+\rho}$ are subexponential in the blocklength $n$ and do not contribute to the exponent, accordingly removed $\bar{r}_{\tau^c, u_{\tau^c}}^{\ell_{\tau^c}}$ and $\bar{r}_{\tau, u_\tau}^{\ell_\tau}$ from the optimization parameter list, and finally used that the exponential decay of the error probability in Equation (A2) will be dominated by the worst error type $\tau$ and the worst class assignment $i^N$. It will prove convenient to express the exponent in terms of a Gallager function $E_\tau^{i^N}\bigl(\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}\bigr)$, defined as
$$ E_\tau^{i^N}\bigl(\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}\bigr) = - \log h_{\tau^c}^{i_{\tau^c}, i_\tau}. $$   (A44)
Substituting the expression for $h_{\tau^c}^{i_{\tau^c}, i_\tau}$ in Equation (A41), where $\Lambda_\tau^{i_\tau}(u_\tau)$ and $R_{\nu, u_\nu}^{i_\nu}(x_\nu)$ are respectively given by Equations (A25) and (A34), we may express $E_\tau^{i^N}\bigl(\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}\bigr)$ as
E τ i N ( ρ , λ N L , U , r N u N N ) = log u τ c , x τ c , y P τ c ( u τ c ) Λ τ c i τ c ( u τ c ) Q τ c , u τ c i τ c ( x τ c ) R τ c , u τ c i τ c ( x τ c ) = log ( u τ , x τ P τ | τ c ( u τ | u τ c ) 1 1 + ρ Λ τ i τ ( u τ ) Q τ , u τ i τ ( x τ ) R τ , u τ i τ ( x τ ) W ( y | x N ) 1 1 + ρ 1 + ρ ,
or equivalently in the alternative form
E τ i N ρ , λ N L , U , r N u N N = log u τ c , x τ c , y ( u τ , x τ P N ( u N ) 1 1 + ρ Λ N i N ( u N ) Q τ , u τ i τ ( x τ ) R N , u N i N ( x N ) Q τ c , u τ c i τ c ( x τ c ) W ( y | x N ) 1 1 + ρ 1 + ρ ,
where in Equation (A46) we have moved the product
P τ c ( u τ c ) Λ τ c i τ c ( u τ c ) Q τ c , u τ c i τ c ( x τ c ) R τ c , u τ c i τ c ( x τ c )
inside the parenthesis and merged terms in $\tau$ and $\tau^c$ as done above, as well as redefined the optimization parameters $\lambda_{\tau^c}^{L}(1+\rho)$, $\lambda_{\tau^c}^{U}(1+\rho)$, and $r_{\tau^c, u_{\tau^c}}^{\ell_{\tau^c}}(1+\rho)$ as $\lambda_{\tau^c}^{L}$, $\lambda_{\tau^c}^{U}$, and $r_{\tau^c, u_{\tau^c}}^{\ell_{\tau^c}}$, respectively.

Appendix A.1. Proof of Equation (A18)

A sketch of the proof of the inequality in Equation (A18) proceeds as follows:
$$ \sum_{i, j \in \mathcal{K}} A_i^{s_i} A_j^{1 - s_i} \le \sum_{i, j \in \mathcal{K}} \bigl( s_i A_i + (1 - s_i) A_j \bigr) $$   (A47)
$$ \le \sum_{i, j \in \mathcal{K}} (A_i + A_j) $$   (A48)
$$ = 2 |\mathcal{K}| \sum_{i \in \mathcal{K}} A_i, $$   (A49)
where Equation (A47) follows from the inequality between the arithmetic and geometric means, and in Equation (A48) we used that $0 \le s_i \le 1$.
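A brute-force numerical check of Equation (A18), with randomly drawn $A_i \ge 0$ and $s_i \in [0,1]$ (values arbitrary), reads:

import numpy as np

rng = np.random.default_rng(2)
for _ in range(100):
    K = int(rng.integers(1, 6))
    A = rng.uniform(0.0, 5.0, size=K)
    s = rng.uniform(0.0, 1.0, size=K)
    lhs = sum(A[i] ** s[i] * A[j] ** (1.0 - s[i]) for i in range(K) for j in range(K))
    rhs = 2.0 * K * A.sum()
    assert lhs <= rhs + 1e-9
print("Equation (A18) verified on random instances")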

Appendix A.2. Proof of Equation (A30)

We have the following
$$ \mathbb{1}\{ |a| \le \delta \} = \mathbb{1}\{ -\delta \le a \} \, \mathbb{1}\{ a \le \delta \} $$   (A50)
$$ = \mathbb{1}\{ e^{-\delta} \le e^{a} \} \, \mathbb{1}\{ e^{a} \le e^{+\delta} \} $$   (A51)
$$ \le e^{r^{-} (a + \delta)} \, e^{r^{+} (\delta - a)} $$   (A52)
$$ = e^{r a + \bar{r} \delta} $$   (A53)
for $r^{-}, r^{+} \ge 0$, or equivalently $r = r^{-} - r^{+} \in \mathbb{R}$ and $\bar{r} = r^{-} + r^{+} \ge 0$. The bound in Equation (A53) can be optimized w. r. t. $r$ and $\bar{r}$.
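The same bound can be checked numerically; in the Python sketch below, $r^{-}$ and $r^{+}$ are drawn at random, so the derived pair $(r, \bar{r})$ automatically satisfies the implicit constraint $\bar{r} \ge |r|$ of the parameterization.

import numpy as np

rng = np.random.default_rng(3)
for _ in range(1000):
    a, delta = rng.normal(), abs(rng.normal())
    r_minus, r_plus = rng.uniform(0.0, 3.0, size=2)    # r^-, r^+ >= 0 as in (A52)
    r, r_bar = r_minus - r_plus, r_minus + r_plus      # the reparameterization in (A53)
    indicator = 1.0 if abs(a) <= delta else 0.0
    assert indicator <= np.exp(r * a + r_bar * delta) + 1e-12
print("the bound in Equation (A30) holds on random draws")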

Appendix B. Computation of the Optimum Multi-Class Thresholds

In this section we find some conditions describing the optimum partitioning of the source-message set into classes for the optimization of the exponent in Equation (30). For simplicity, let each user ν N have two classes, K ν = 2 .
From the class definition in Equation (12) with $K_\nu = 2$, we have that $\gamma_{\nu,2} = 0$ and $\gamma_{\nu,0} = 1$, so we need to find just one optimum $\gamma_{\nu,1}$ for each user, which we redefine as $\gamma_\nu$. Optimizing the exponent in Equation (30) over $\gamma_N$ gives
$$ \max_{0 \le \gamma_N \le 1} E^{\mathrm{cost}} = \max_{0 \le \gamma_N \le 1} \; \min_{i^N \in \mathcal{K}^N} \; \min_{\tau} \; \max_{\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}} E_\tau^{i^N}\bigl(\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}\bigr), $$   (A54)
where one of the parameters $\lambda_N^{L}$ or $\lambda_N^{U}$ is zero for each $i^N$, as the corresponding constraint is absent. For each $\gamma_N$, we have a minimization over $2^N$ assignments $i^N$. Following the same steps as in refs. [31] (Sec. 4.1.2) and [31] (Lemma 4.3), we find that $E_\tau^{i^N}(\gamma_N)$, defined with some abuse of notation as
$$ E_\tau^{i^N}(\gamma_N) = \max_{\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}} E_\tau^{i^N}\bigl(\rho, \lambda_N^{L,U}, r_{N, u_N}^{\ell_N}\bigr), $$   (A55)
is a non-decreasing (resp. non-increasing) function with respect to $\gamma_\nu$ for $i^N = [i_\nu, i_{\nu^c}]$ with $i_\nu = 1$ (resp. $i_\nu = 2$), irrespective of the values of $i_{\nu^c}$ and of $\nu$. For the sake of completeness, we present an independent proof of this fact here. Let $i_\nu = 1$ and $\tau$ be arbitrary. Using Equation (31), the function $E_\tau^{i^N}(\gamma_N, \rho)$ has the form $-\log\bigl( \sum_z f_1(z) \, \gamma_\nu^{-\lambda_\nu^{L}} \bigr)$ for some function $f_1(z)$, as all $\gamma_N$ are independent from each other, regardless of the value of $i_{\nu^c}$. Since $\lambda_\nu^{L} \ge 0$, the function $E_\tau^{i^N}(\gamma_N, \rho)$ in Equation (A55) is non-decreasing with respect to $\gamma_\nu$. When $i_\nu = 2$, this function $E_\tau^{i^N}(\gamma_N, \rho)$ has the form $-\log\bigl( \sum_z f_2(z) \, \gamma_\nu^{\lambda_\nu^{U}} \bigr)$ for some $f_2(z)$, and is therefore non-increasing. This behavior does not change after maximizing over $\rho$. As the minimum of monotonic functions is monotonic, the function $E_\tau^{i^N}(\gamma_N)$ is non-decreasing (non-increasing) with respect to $\gamma_\nu$ when $i_\nu = 1$ ($i_\nu = 2$).
For any ν and fixed γ ν c , we may write the optimization problem in Equation (A54) as
$$ \max_{\gamma_{\nu^c}} \max_{\gamma_\nu} \min_{i_\nu} \min_{i_{\nu^c}} \min_{\tau} E_\tau^{[i_\nu, i_{\nu^c}]}\bigl( [\gamma_\nu, \gamma_{\nu^c}] \bigr). $$   (A56)
The optimization problem max γ ν min i ν min i ν c min τ E τ [ i ν , i ν c ] [ γ ν , γ ν c ] satisfies the following lemma, proved in Appendix B.1, with γ = γ ν , i = i ν , and k i ( γ ) = min i ν c min τ E τ [ i ν , i ν c ] [ γ ν , γ ν c ] .
Lemma A1.
Let $k_1(\gamma)$ and $k_2(\gamma)$ be respectively continuous non-decreasing and non-increasing functions with respect to $\gamma \in [0,1]$. The optimal $\gamma^\star$ maximizing $\min_{i=1,2} k_i(\gamma)$ satisfies the following equation
$$ k_1(\gamma^\star) = k_2(\gamma^\star). $$   (A57)
When Equation (A57) does not have any solution, we have $\gamma^\star = 0$ if $k_1(0) > k_2(0)$, and $\gamma^\star = 1$ otherwise.
Therefore, the optimal $\gamma_\nu^\star$ satisfies
$$ \min_{i_{\nu^c}} \min_{\tau} E_\tau^{[1, i_{\nu^c}]}\bigl( [\gamma_\nu^\star, \gamma_{\nu^c}] \bigr) = \min_{i_{\nu^c}} \min_{\tau} E_\tau^{[2, i_{\nu^c}]}\bigl( [\gamma_\nu^\star, \gamma_{\nu^c}] \bigr), $$   (A58)
if Equation (A58) has a solution. If not, we have $\gamma_\nu^\star = 0$ when $\min_{i_{\nu^c}} \min_{\tau} E_\tau^{[1, i_{\nu^c}]}([0, \gamma_{\nu^c}]) > \min_{i_{\nu^c}} \min_{\tau} E_\tau^{[2, i_{\nu^c}]}([0, \gamma_{\nu^c}])$, and $\gamma_\nu^\star = 1$ otherwise. Since Equation (A58) holds for any $\nu$, evaluating it for each $\nu$ gives a system of equations for the computation of the optimal thresholds.
In ref. [31] (Sec. 3.2.1.1), we give a graphical interpretation of the solutions to Equation (A58) and outline the relevant differences with the single-user case. We observe a strong coupling between the exponent and the thresholds that prevents us from finding the optimal number of classes, suggesting that, unlike in the single-user case, two classes might not be sufficient.
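Before turning to the proof, a minimal numerical sketch of how Lemma A1 can be used to locate a threshold is given below, assuming $k_1$ and $k_2$ are available as black-box monotonic functions; the two curves in the example are placeholders, not the actual exponent functions of the paper.

def optimal_threshold(k1, k2, tol=1e-10):
    """Maximize min(k1, k2) over [0, 1] for non-decreasing k1 and non-increasing k2 (Lemma A1)."""
    if k1(0.0) >= k2(0.0):      # k1 starts above k2: the minimum is k2, maximized at 0
        return 0.0
    if k1(1.0) <= k2(1.0):      # k1 stays below k2: the minimum is k1, maximized at 1
        return 1.0
    lo, hi = 0.0, 1.0           # otherwise bisect on the sign of k1 - k2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if k1(mid) < k2(mid) else (lo, mid)
    return 0.5 * (lo + hi)

# placeholder curves with the monotonicity required by the lemma
gamma_star = optimal_threshold(lambda g: 0.1 + 0.5 * g, lambda g: 0.6 - 0.4 * g)
print(gamma_star)   # approximately 0.5556, where the two placeholder curves cross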

Proof of Lemma A1

The relative behaviour of a non-decreasing function with a non-increasing function can be categorized in three cases.
  • If $k_1(0) < k_2(0)$ and $k_1(1) > k_2(1)$, there exists a $\gamma^\star$ such that $k_1(\gamma^\star) = k_2(\gamma^\star)$. In this case, the function $\min_i k_i(\gamma)$ is non-decreasing on $[0, \gamma^\star)$ and non-increasing on $(\gamma^\star, 1]$. Thus, the maximum over $\gamma$ of $\min_i k_i(\gamma)$ occurs at $\gamma = \gamma^\star$.
  • If $k_1(0) < k_2(0)$ and $k_1(1) < k_2(1)$, then $k_1(\gamma)$ and $k_2(\gamma)$ do not cross in $\gamma \in [0,1]$. Hence, we have $\min_i k_i(\gamma) = k_1(\gamma)$, and since this is a non-decreasing function, the maximum over $\gamma$ occurs at $\gamma^\star = 1$.
  • When $k_1(0) \ge k_2(0)$, we have $\min_i k_i(\gamma) = k_2(\gamma)$ and hence $\gamma^\star = 0$.

Appendix C. Properties of the Modified Gallager Source Function

In this appendix, we study the modified Gallager source function $E_{s,\tau}^{i^N}$ in Equation (45) involved in the achievable exponent for the md-iid ensemble. For the sake of simplicity, we consider the rather illustrative case of $N = 2$ users, each having a $K_\nu = 2$-class partition of the source messages with class indices $i^N$, where $\mathcal{N} = \{1, 2\}$. From the definition of the sets $\mathcal{A}_\nu^{i_\nu}$ in Equation (12) with $\gamma_{\nu,0} = 1$, $\gamma_{\nu,1} = \gamma_\nu$ and $\gamma_{\nu,2} = 0$, the two message sets
$$ \mathcal{A}_\nu^{1}(\gamma_\nu) = \bigl\{ u_\nu \in \mathcal{U}_\nu^n : P_\nu(u_\nu) \ge \gamma_\nu^n \bigr\}, $$   (A59)
$$ \mathcal{A}_\nu^{2}(\gamma_\nu) = \bigl\{ u_\nu \in \mathcal{U}_\nu^n : P_\nu(u_\nu) < \gamma_\nu^n \bigr\}, $$   (A60)
are specified using a single threshold $\gamma_\nu$ for each user $\nu \in \{1, 2\}$. With some abuse of notation, we include the optimization w. r. t. $\lambda_N^{L,U}$ and make explicit the dependence on the thresholds $\gamma_N$ in the expression of the source function $E_{s,\tau}^{i^N}$ in Equation (45), namely
$$ E_{s,\tau}^{i^N}(\rho, P_N, \gamma_N) = \min_{\lambda_N^{L,U} \ge 0} \log \sum_{u_{\tau^c}} \left( \sum_{u_\tau} P_N(u^N)^{\frac{1}{1+\rho}} \, \Lambda_N^{i^N}(u^N) \right)^{1+\rho}. $$   (A61)
For i ν = 1 , the set A ν 1 ( γ ν ) in Equation (A59) has no upper threshold, hence we find that the optimal parameter λ ν U in this case is λ ^ ν U = 0 . Similarly for i ν = 2 , we obtain that λ ^ ν L = 0 . As a consequence and without any loss of generality, we define λ ν = λ ν L for i ν = 1 , and λ ν = λ ν U for i ν = 2 , and further simplify Equation (A61) to the following optimization problem
$$ E_{s,\tau}^{i^N}(\rho, P_N, \gamma_N) = \min_{\lambda_N \ge 0} \log \sum_{u_{\tau^c}} \left( \sum_{u_\tau} P_N(u^N)^{\frac{1}{1+\rho}} \left( \frac{\gamma_1}{P_1(u_1)} \right)^{(-1)^{i_1} \lambda_1} \left( \frac{\gamma_2}{P_2(u_2)} \right)^{(-1)^{i_2} \lambda_2} \right)^{1+\rho}, $$   (A62)
where we also used the definition of the functions Λ σ i σ in Equation (32) with σ = { 1 , 2 } . We recall that P 1 and P 2 are the marginal distributions for users ν = 1 and ν = 2 , respectively, and the indices i ν { 1 , 2 } indicate that user ν transmits a source message selected from the class A ν i ν ( γ ν ) in Equations (A59) and (A60). It can be shown that the objective function in the r. h. s. of Equation (A62) is convex w. r. t. both λ 1 and λ 2 . Hence, the minimizers λ ^ 1 and λ ^ 2 in the source function E s , τ i N ( ρ , P N , γ N ) are respectively given by λ ^ 1 = max { λ 1 , 0 } and λ ^ 2 = max { λ 2 , 0 } , where λ 1 and λ 2 are the unique solution after setting the partial derivatives of the r. h. s. of Equation (A62) to zero. Two special cases can be obtained from Equation (A62).
The first case is when γ ν = 1 for ν { 1 , 2 } , implying that no message partition happens whatsoever. In such a case, we have that λ ^ 1 = λ ^ 2 = 0 and Equation (A62) reduces to the joint source-channel coding source function for correlated-sources in Equation (34), i.e.,
$$ E_{s,\tau}(\rho, P_N) = \log \sum_{u_{\tau^c}} \left( \sum_{u_\tau} P_N(u^N)^{\frac{1}{1+\rho}} \right)^{1+\rho}. $$   (A63)
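For a small joint pmf, Equation (A63) can be evaluated directly; in the Python sketch below the joint pmf and the value of $\rho$ are illustrative, the rows of the array index $u_1$ and the columns index $u_2$, and the argument tau_axis selects which users are summed inside the bracket.

import numpy as np

def E_s_tau(rho, P_joint, tau_axis):
    """Correlated-sources Gallager source function of Equation (A63) for N = 2 users."""
    inner = np.sum(P_joint ** (1.0 / (1.0 + rho)), axis=tau_axis)   # sum over u_tau
    return float(np.log(np.sum(inner ** (1.0 + rho))))              # outer sum over u_tau^c

P_joint = np.array([[0.40, 0.10],
                    [0.15, 0.35]])            # illustrative joint pmf of (u_1, u_2)
print(E_s_tau(0.5, P_joint, tau_axis=0))      # error type tau = {1}: u_1 summed inside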
The second one is the case of independent sources. Substituting P N = P 1 P 2 in Equation (A62), after some algebra, we obtain that E s , τ i N ( ρ , P N , γ N ) can be split into two terms as
$$ E_{s,\tau}^{i^N}(\rho, P_N, \gamma_N) = E_s^{i_\tau}(\rho, P_\tau, \gamma_\tau) + E_s^{i_{\tau^c}}(0, P_{\tau^c}, \gamma_{\tau^c}), $$   (A64)
where we defined the function E s i ( ρ , P , γ ) as
$$ E_s^{i}(\rho, P, \gamma) = \min_{\lambda \ge 0} \log \left( \sum_{u} P(u)^{\frac{1}{1+\rho}} \left( \frac{\gamma}{P(u)} \right)^{(-1)^{i} \lambda} \right)^{1+\rho} $$   (A65)
for arbitrary class index i { 1 , 2 } , source distribution P and threshold γ . First, we find that the unique solution after setting the derivative of the r. h. s. of Equation (A65) to zero, denoted as λ , is implicitly given by
$$ \frac{\sum_{u} P(u)^{\frac{1}{1+\alpha}} \log P(u)}{\sum_{u} P(u)^{\frac{1}{1+\alpha}}} = \log(\gamma), $$   (A66)
where we made the convenient change of variable
$$ \frac{1}{1+\alpha} = \frac{1}{1+\rho} - (-1)^{i} \lambda. $$   (A67)
Although not made explicit, $\lambda^\star$ depends on $(i, P, \rho, \gamma)$. When $\lambda^\star < 0$, or equivalently when
$$ (-1)^{i} \left( \frac{1}{1+\rho} - \frac{1}{1+\alpha} \right) < 0, $$   (A68)
we have that $\hat{\lambda} = \max(0, \lambda^\star) = 0$, implying that Equation (A65) simplifies to
E s i ( ρ , P , γ ) = E s ( ρ , P ) ,
where E s ( ρ , P ) is the Gallager source function
$$ E_s(\rho, P) = \log \left( \sum_{u} P(u)^{\frac{1}{1+\rho}} \right)^{1+\rho}. $$   (A70)
Otherwise, when $\hat{\lambda} = \lambda^\star \ge 0$, a regime given by the following inequality
$$ (-1)^{i} \left( \frac{1}{1+\rho} - \frac{1}{1+\alpha} \right) \ge 0, $$   (A71)
we may substitute $\lambda = \lambda^\star$ in the objective function in Equation (A65) to obtain
$$ E_s^{i}(\rho, P, \gamma) = (1+\rho) \log \sum_{u} P(u)^{\frac{1}{1+\alpha}} + \frac{\alpha - \rho}{1+\alpha} \log(\gamma), $$   (A72)
where we wrote the expression in terms of α . Using Equation (A66) into Equation (A72) to replace log ( γ ) , we get
$$ E_s^{i}(\rho, P, \gamma) = (1+\rho) \log \sum_{u} P(u)^{\frac{1}{1+\alpha}} + \frac{\alpha - \rho}{1+\alpha} \, \frac{\sum_{u} P(u)^{\frac{1}{1+\alpha}} \log P(u)}{\sum_{u} P(u)^{\frac{1}{1+\alpha}}}. $$   (A73)
After some algebra, we are able to express the former equation in terms of the derivative of the E s -function in Equation (A70), given by
$$ E_s'(\rho, P) = \log \sum_{u} P(u)^{\frac{1}{1+\rho}} - \frac{1}{1+\rho} \, \frac{\sum_{u} P(u)^{\frac{1}{1+\rho}} \log P(u)}{\sum_{u} P(u)^{\frac{1}{1+\rho}}}, $$   (A74)
and the E s -function itself, as
$$ E_s^{i}(\rho, P, \gamma) = E_s(\alpha, P) + (\rho - \alpha) \, E_s'(\alpha, P). $$   (A75)
We may finally combine Equations (A69) and (A75), with the respective ranges in Equations (A68) and (A71) to write the E s i ( ρ , P , γ ) function in Equation (A65) piecewise as
$$ E_s^{1}(\rho, P, \gamma) = \begin{cases} E_s(\rho, P) & \frac{1}{1+\rho} \ge \frac{1}{1+\alpha}, \\[4pt] E_s(\alpha, P) + (\rho - \alpha) E_s'(\alpha, P) & \frac{1}{1+\rho} < \frac{1}{1+\alpha}, \end{cases} $$   (A76)
and
$$ E_s^{2}(\rho, P, \gamma) = \begin{cases} E_s(\rho, P) & \frac{1}{1+\rho} < \frac{1}{1+\alpha}, \\[4pt] E_s(\alpha, P) + (\rho - \alpha) E_s'(\alpha, P) & \frac{1}{1+\rho} \ge \frac{1}{1+\alpha}, \end{cases} $$   (A77)
where $\alpha$ is the solution to the implicit Equation (A66), hence recovering the source error exponent functions of the md-iid ensemble described in ref. [8] (Lemma 1). The source functions $E_s^{1}(\rho, P, \gamma)$ and $E_s^{2}(\rho, P, \gamma)$ follow the Gallager function in Equation (A70) for a certain interval of $\rho$, and are the straight-line tangent to it beyond that interval. The tangent point $\alpha$ is a function of the distribution $P$ and of the multi-class threshold $\gamma$.
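The characterization in Equations (A66) and (A74)–(A77) translates directly into a short numerical routine. The Python sketch below solves the implicit Equation (A66) for $\alpha$ by bisection and evaluates $E_s^{i}(\rho, P, \gamma)$ piecewise; the marginal distribution, the threshold $\gamma$, the bisection bracket, and all function names are illustrative choices, and the routine assumes that $\log\gamma$ is attainable by the tilted mean within the bracket.

import numpy as np

def E_s(rho, P):
    """Gallager source function, Equation (A70)."""
    return (1.0 + rho) * np.log(np.sum(P ** (1.0 / (1.0 + rho))))

def E_s_prime(rho, P):
    """Derivative of the Gallager source function, Equation (A74)."""
    w = P ** (1.0 / (1.0 + rho))
    return np.log(w.sum()) - np.sum(w * np.log(P)) / ((1.0 + rho) * w.sum())

def tilted_alpha(P, gamma, lo=-0.99, hi=50.0, iters=200):
    """Solve the implicit Equation (A66) for alpha by bisection (the tilted mean decreases in alpha)."""
    def tilted_mean(alpha):
        w = P ** (1.0 / (1.0 + alpha))
        return np.sum(w * np.log(P)) / np.sum(w)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if tilted_mean(mid) > np.log(gamma) else (lo, mid)
    return 0.5 * (lo + hi)

def E_s_class(i, rho, P, gamma):
    """Piecewise class-dependent source function E_s^i, Equations (A76) and (A77)."""
    alpha = tilted_alpha(P, gamma)
    on_gallager_branch = (rho <= alpha) if i == 1 else (rho > alpha)
    if on_gallager_branch:
        return E_s(rho, P)
    return E_s(alpha, P) + (rho - alpha) * E_s_prime(alpha, P)   # tangent line, Equation (A75)

P = np.array([0.5, 0.3, 0.2])     # illustrative marginal source distribution
for rho in (0.1, 0.5, 1.0):
    print(rho, E_s_class(1, rho, P, gamma=0.35), E_s_class(2, rho, P, gamma=0.35))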
Once E s i ( ρ , P , γ ) in Equation (A65) is fully characterized, we may now discuss the correlated-sources error function E s , τ i N ( ρ , P N , γ N ) in Equation (A64) in terms of the error type τ . We start with the third error type τ = { 1 , 2 } , for which since τ c = , we have that
E s , τ i N ( ρ , P N , γ N ) = E s i 1 ( ρ , P 1 , γ 1 ) + E s i 2 ( ρ , P 2 , γ 2 ) ,
namely the superposition of two $E_s^{i}$ functions as the ones in Equations (A76) and (A77), one for each user. For the remainder of this appendix, we consider the more informative error types $\tau = \{1\}$ and $\tau = \{2\}$ for the four possible pairs of class indices $i_1$ and $i_2$ in Equations (A59) and (A60), since in this case $E_{s,\tau}^{i^N}$ in Equation (A64) is either directly an $E_s(\rho, P_\tau)$ function or the straight-line tangent to it, in both cases shifted by a constant term given by $E_s^{i_{\tau^c}}(0, P_{\tau^c}, \gamma_{\tau^c})$.
Figure A1 shows the family of $E_{s,\tau}^{i^N}$ source functions for independent and correlated sources, respectively, as a function of $\rho$, where $P_N$ is given by Equation (60) and $\tau = \{1\}$. For independent sources, we observe that the source functions $E_{s,\tau}^{1,1}$ and $E_{s,\tau}^{2,1}$ follow the solid blue line depicting $E_s(\rho, P_\tau)$ as in Equation (A70) for a certain interval of $\rho$, and then take the tangent line beyond it. A similar behavior is observed for the source functions $E_{s,\tau}^{1,2}$ and $E_{s,\tau}^{2,2}$, which in this case follow or are tangent to the solid black line, i.e., the solid blue Gallager source function shifted by the constant term $E_s^{i_{\tau^c}}(0, P_{\tau^c}, \gamma_{\tau^c})$ as in Equation (A64).
For correlated sources, the source functions E s , τ 1 , 1 and E s , τ 2 , 1 follow the generalized Gallager’s source function given by Equation (A63) for a certain interval, but unlike independent sources they are not straight lines but a curve tangent to E s , τ beyond that interval. Some intuition about this fact can be gained from the primal form of the source function E s , τ i N . Consider, for instance, the source function E s , τ 2 , 1 in Figure A1 for correlated sources, for which i 1 = 2 and i 2 = 1 . The primal form of this source function E s , τ 2 , 1 can be obtained as a constrained optimization problem w. r. t. some auxiliary joint distribution P ^ N . The interval in ρ where E s , τ 2 , 1 does not follow E s , τ in the dual form (approximately for ρ 0.5 in the figure) corresponds to the case where only one of the two constraints on the auxiliary distribution P ^ N is actually active in the primal form, where the constraint is given by u N P ^ N ( u N ) log P ν ( u ν ) = log ( γ ν ) . This implies that, unlike the case of independent sources where each source has its auxiliary distribution P ^ 1 and P ^ 2 constrained, for correlated sources the joint auxiliary distribution P ^ N is not fully constrained but is the union of joint distributions with one constrained marginal distribution. This partial constraint manifests as a curve in ρ , rather than a straight line, in the dual form. A similar behavior is observed for E s , τ 1 , 2 and E s , τ 2 , 2 , which instead of following the source function for joint source-channel coding in Equation (A63) for some intervals of ρ , they follow the curve
min λ ν 0 log u τ c u τ P N ( u N ) 1 1 + ρ P ν ( u ν ) γ ν ( 1 ) i ν λ ν 1 + ρ 1 + ρ ,
corresponding to Equation (A62) when the constraint for one source is not active, i.e., λ ^ ν c = 0 .
Figure A1. Example of the source functions $E_{s,\tau}^{i^N}$ in Equation (A62) for independent and correlated sources and error type $\tau \in \{\{1\}, \{2\}\}$.

References

  1. Gallager, R. Information Theory and Reliable Communication; John Wiley & Sons: Hoboken, NJ, USA, 1968.
  2. Polyanskiy, Y.; Poor, H.V.; Verdú, S. Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359.
  3. Jelinek, F. Probabilistic Information Theory—Discrete and Memoryless Models; McGraw-Hill Book Company: New York, NY, USA, 1968.
  4. Shannon, C.; Gallager, R.; Berlekamp, E. Lower bounds to error probability for coding on discrete memoryless channels. I. Inf. Control 1967, 10, 65–103.
  5. Shannon, C.; Gallager, R.; Berlekamp, E. Lower bounds to error probability for coding on discrete memoryless channels. II. Inf. Control 1967, 10, 522–552.
  6. Csiszár, I. Joint source-channel error exponent. Probl. Control Inf. Theory 1980, 9, 315–328.
  7. Zhong, Y.; Alajaji, F.; Campbell, L.L. On the joint source-channel coding error exponent for discrete memoryless systems. IEEE Trans. Inf. Theory 2006, 52, 1450–1468.
  8. Tauste Campo, A.; Vazquez-Vilar, G.; Guillén i Fàbregas, A.; Martinez, A.; Koch, T. A derivation of the source-channel error exponent using nonidentical product distributions. IEEE Trans. Inf. Theory 2014, 60, 3209–3217.
  9. Ahlswede, R. Multi-way communication channels. In Proceedings of the 2nd International Symposium on Information Theory, Tsaghkadzor, Armenia, USSR, 2–8 September 1971.
  10. Slepian, D.; Wolf, J.K. A coding theorem for multiple access channel with correlated sources. Bell Syst. Tech. J. 1973, 52, 1037–1076.
  11. Cover, T.M.; El Gamal, A.; Salehi, M. Multiple access channels with arbitrarily correlated sources. IEEE Trans. Inf. Theory 1980, 26, 648–657.
  12. Gallager, R. A perspective on multiple access channels. IEEE Trans. Inf. Theory 1985, 31, 124–142.
  13. Liu, Y.S.; Hughes, B. A new universal random coding bound for the multiple-access channel. IEEE Trans. Inf. Theory 1996, 42, 376–386.
  14. Lapidoth, A. Mismatched decoding and the multiple-access channel. IEEE Trans. Inf. Theory 1996, 42, 1439–1452.
  15. Haim, E.; Kochman, Y.; Erez, U. Improving the MAC error exponent using distributed structure. In Proceedings of the 2011 IEEE International Symposium on Information Theory (ISIT), Saint Petersburg, Russia, 31 July–5 August 2011.
  16. Cai, N. The maximum error probability criterion, random encoder, and feedback, in multiple input channels. Entropy 2014, 16, 1211–1242.
  17. Nazari, A.; Anastasopoulos, A.; Pradhan, S.S. Error exponent for multiple-access channels: Lower bounds. IEEE Trans. Inf. Theory 2014, 60, 5095–5115.
  18. Nazari, A.; Pradhan, S.S.; Anastasopoulos, A. Error exponent for multiple access channels: Upper bounds. IEEE Trans. Inf. Theory 2015, 61, 3605–3621.
  19. Farkas, L.; Kói, T. Random access and source-channel coding error exponents for multiple access channels. IEEE Trans. Inf. Theory 2015, 61, 3029–3040.
  20. Rezazadeh, A.; Font-Segura, J.; Martinez, A.; Guillén i Fàbregas, A. Multiple-access channel with independent sources: Error exponent analysis. In Proceedings of the 2018 IEEE Information Theory Workshop (ITW), Guangzhou, China, 25–29 November 2018.
  21. Dueck, G. A note on the multiple access channel with correlated sources (Corresp.). IEEE Trans. Inf. Theory 1981, 27, 232–235.
  22. Padakandla, A. Communicating correlated sources over MAC and interference channels II: Joint source-channel coding. IEEE Trans. Inf. Theory 2021.
  23. Padakandla, A. Communicating correlated sources over a MAC. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017.
  24. Heidari, M.; Shirani, F.; Pradhan, S.S. New sufficient conditions for multiple-access channel with correlated sources. In Proceedings of the 2016 IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, 10–15 July 2016.
  25. Rezazadeh, A.; Font-Segura, J.; Martinez, A.; Guillén i Fàbregas, A. An achievable error exponent for the multiple access channel with correlated sources. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017.
  26. Scarlett, J.; Martinez, A.; Guillén i Fàbregas, A. Mismatched decoding: Error exponents, second-order rates and saddlepoint approximations. IEEE Trans. Inf. Theory 2014, 60, 2647–2666.
  27. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  28. Rezazadeh, A.; Font-Segura, J.; Martinez, A.; Guillén i Fàbregas, A. Joint source-channel coding for the multiple-access channel with correlated sources. In Proceedings of the 2019 International Symposium on Information Theory (ISIT), Paris, France, 7–12 July 2019.
  29. Gallager, R. Fixed Composition Arguments and Lower Bounds to Error Probability. Available online: http://web.mit.edu/gallager/www/notes/notes5.pdf (accessed on 3 May 2021).
  30. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Cambridge University Press: Cambridge, UK, 2011.
  31. Rezazadeh, A. Error Exponent Analysis for the Multiple-Access Channel with Correlated Sources. Ph.D. Thesis, Universitat Pompeu Fabra, Barcelona, Spain, 2019.
  32. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2006.
  33. Hardy, G.H.; Littlewood, J.E.; Pólya, G. Inequalities, 2nd ed.; Cambridge University Press: Cambridge, UK, 1934.
Figure 1. Example of codewords $x_\nu(1)$, $x_\nu(2)$ and $x_\nu(3)$ in the mi-icd ensemble and $x_\nu(4)$, $x_\nu(5)$ and $x_\nu(6)$ in the mi-ccc ensemble, for a given source sequence $u_\nu$.
Figure 2. Correlated-sources optimal assignment $\Omega(\gamma_N)$ in Equation (70) for all pairs of thresholds $(\gamma_1, \gamma_2)$.
Figure 3. Correlated-sources error exponent $E^{\mathrm{cost}}(\gamma_N)$ in Equation (71) for all pairs of thresholds $(\gamma_1, \gamma_2)$.
Table 1. Correlated-sources optimal thresholds $\gamma_N$ and exponents $E_\tau^{i^N}(\gamma_N)$ in Equation (73) for assignments $\Omega_j$ in Equations (66)–(69). For each assignment, the minimum over $i^N$ and $\tau$ is highlighted in gray.

Assignment Ω1 (γ1 = 0.8469, γ2 = 0.6581)       Assignment Ω2 (γ1 = 1, γ2 = 1)
(i1, i2)      (1,1)   (1,2)   (2,1)   (2,2)    (1,1)   (1,2)   (2,1)   (2,2)
τ = {1}       0.3131  0.2735  0.3120  0.2611   0.0642  0.3268  0.1005  0.3604
τ = {2}       0.3986  0.4369  0.2611  0.4119   0.3959  0.3986  0.4323  0.3110
τ = {1,2}     0.2611  0.2972  0.2630  0.2883   0.2108  0.2108  0.2360  0.2637

Assignment Ω3 (γ1 = 0.5605, γ2 = 0.6709)       Assignment Ω4 (γ1 = 0.6985, γ2 = 0.9033)
(i1, i2)      (1,1)   (1,2)   (2,1)   (2,2)    (1,1)   (1,2)   (2,1)   (2,2)
τ = {1}       0.3120  0.2503  0.2763  0.2897   0.0879  0.3605  0.0879  0.3112
τ = {2}       0.2503  0.3898  0.5675  0.5731   0.3664  0.2503  0.4720  0.4684
τ = {1,2}     0.2630  0.2816  0.2503  0.3012   0.2360  0.2632  0.2097  0.2097
Table 2. Independent-sources md-iid optimal thresholds $\gamma_N$ and exponents $E_\tau^{i^N}(\gamma_N)$ in Equation (73) for assignments $\Omega_j$ in Equations (66)–(69). For each assignment, the minimum over $i^N$ and $\tau$ is highlighted in gray.

Assignment Ω1 (γ1 = 0.8779, γ2 = 0.6933)       Assignment Ω2 (γ1 = 0.8776, γ2 = 1)
(i1, i2)      (1,1)   (1,2)   (2,1)   (2,2)    (1,1)   (1,2)   (2,1)   (2,2)
τ = {1}       0.3343  0.2458  0.3089  0.2458   0.0913  0.3341  0.0913  0.3089
τ = {2}       0.3850  0.3987  0.2458  0.3788   0.4555  0.3850  0.4357  0.2459
τ = {1,2}     0.2730  0.2870  0.2685  0.2863   0.3430  0.2728  0.2956  0.2685

Assignment Ω3 (γ1 = 0.61, γ2 = 0.7043)         Assignment Ω4 (γ1 = 0.7092, γ2 = 1)
(i1, i2)      (1,1)   (1,2)   (2,1)   (2,2)    (1,1)   (1,2)   (2,1)   (2,2)
τ = {1}       0.3089  0.2367  0.2681  0.3078   0.0913  0.3117  0.0913  0.2648
τ = {2}       0.2367  0.3672  0.5425  0.5538   0.4269  0.2367  0.5393  0.4683
τ = {1,2}     0.2685  0.2811  0.2367  0.3133   0.3006  0.2685  0.2740  0.2164
Table 3. Mi-iid exponents $E_\tau$ in Equation (75) for two correlated and two independent sources vs. several input distribution assignments $(Q_1, Q_2)$. For each assignment, the minimum over $\tau$ is highlighted in gray.

Correlated Sources
(Q1, Q2)      (Q, Q)   (Q, Q)   (Q, Q)   (Q, Q)
τ = {1}       0.2682   0.0642   0.3120   0.0879
τ = {2}       0.3986   0.3986   0.2503   0.3696
τ = {1,2}     0.2097   0.2097   0.2630   0.2360

Independent Sources
(Q1, Q2)      (Q, Q)   (Q, Q)   (Q, Q)   (Q, Q)
τ = {1}       0.2648   0.3089   0.0627   0.0865
τ = {2}       0.3850   0.2367   0.3850   0.3559
τ = {1,2}     0.2164   0.2685   0.2164   0.2421