
Minimax Theorems for Finite Blocklength Lossy Joint Source-Channel Coding over an Arbitrarily Varying Channel


Abstract

Motivated by applications in the security of cyber-physical systems, we pose the finite blocklength communication problem in the presence of a jammer as a zero-sum game between the encoder-decoder team and the jammer, allowing both the communicating team and the jammer only locally randomized strategies. The communicating team's problem is nonconvex under locally randomized codes, and hence, in general, a minimax theorem need not hold for this game. However, we show that approximate minimax theorems hold in the sense that the minimax and maximin values of the game approach each other asymptotically. In particular, for rates strictly below a critical threshold, both the minimax and maximin values approach zero, and for rates strictly above it, they both approach unity. We then show a second-order minimax theorem: for rates approaching the threshold along a specific scaling, the minimax and maximin values approach the same constant value, which is neither zero nor one. Critical to these results are our finite blocklength bounds on the minimax and maximin values of the game and our second-order, dispersion-based bounds.

Fig. 1.


References

  1. Vora, A.S. and Kulkarni, A.A., A Minimax Theorem for Finite Blocklength Joint Source-Channel Coding over an AVC, in 2019 National Conf. on Communications (NCC 2019), Bangalore, India, Feb. 20–23, 2019, pp. 1–6. https://doi.org/10.1109/NCC.2019.8732205

  2. Humayed, A., Lin, J., Li, F., and Luo, B., Cyber-Physical Systems Security—A Survey, IEEE Internet of Things J., 2017, vol. 4, no. 6, pp. 1802–1831. https://doi.org/10.1109/JIOT.2017.2703172


  3. Slay, J. and Miller, M., Lessons Learned from the Maroochy Water Breach, Critical Infrastructure Protection (Proc. Int. Conf. ICCIP–2007, Hanover, NH, USA, Mar. 19–21, 2007), Goetz, E. and Shenoi, S., Eds., Boston: Springer, 2008. https://doi.org/10.1007/978-0-387-75462-8_6

  4. Langner, R., Stuxnet: Dissecting a Cyberwarfare Weapon, IEEE Secur. Priv., 2011, vol. 9, no. 3, pp. 49–51. https://doi.org/10.1109/MSP.2011.67


  5. Maschler, M., Solan, E., and Zamir, S., Game Theory, Cambridge: Cambridge Univ. Press, 2013.


  6. Kulkarni, A.A. and Coleman, T.P., An Optimizer’s Approach to Stochastic Control Problems with Nonclassical Information Structures, IEEE Trans. Autom. Control, 2015, vol. 60, no. 4, pp. 937–949. https://doi.org/10.1109/TAC.2014.2362596


  7. Ahlswede, R., A Note on the Existence of the Weak Capacity for Channels with Arbitrarily Varying Channel Probability Functions and Its Relation to Shannon’s Zero Error Capacity, Ann. Math. Statist., 1970, vol. 41, no. 3, pp. 1027–1033. https://doi.org/10.1214/aoms/1177696979


  8. Polyanskiy, Y., Poor, H.V., and Verdú, S., Channel Coding Rate in the Finite Blocklength Regime, IEEE Trans. Inform. Theory, 2010, vol. 56, no. 5, pp. 2307–2359. https://doi.org/10.1109/TIT.2010.2043769


  9. Kostina, V. and Verdú, S., Lossy Joint Source-Channel Coding in the Finite Blocklength Regime, IEEE Trans. Inform. Theory, 2013, vol. 59, no. 5, pp. 2545–2575. https://doi.org/10.1109/TIT.2013.2238657


  10. Csiszár, I. and Körner, J., Information Theory: Coding Theorems for Discrete Memoryless Systems, Cambridge, UK: Cambridge Univ. Press, 2011, 2nd ed.


  11. Kosut, O. and Kliewer, J., Finite Blocklength and Dispersion Bounds for the Arbitrarily-Varying Channel, in Proc. 2018 IEEE Int. Symp. on Information Theory (ISIT–2018), Vail, CO, USA, June 17–22, 2018, pp. 2007–2011. https://doi.org/10.1109/ISIT.2018.8437724

  12. Jose, S.T. and Kulkarni, A.A., Linear Programming-Based Converses for Finite Blocklength Lossy Joint Source-Channel Coding, IEEE Trans. Inform. Theory, 2017, vol. 63, no. 11, pp. 7066–7094. https://doi.org/10.1109/TIT.2017.2738634


  13. Jose, S.T. and Kulkarni, A.A., Improved Finite Blocklength Converses for Slepian–Wolf Coding via Linear Programming, IEEE Trans. Inform. Theory, 2019, vol. 65, no. 4, pp. 2423–2441. https://doi.org/10.1109/TIT.2018.2873623


  14. Borden, J.M., Mason, D.M., and McEliece, R.J., Some Information Theoretic Saddlepoints, SIAM J. Control Optim., 1985, vol. 23, no. 1, pp. 129–143. https://doi.org/10.1137/0323011


  15. Hegde, M.V., Stark, W.E., and Teneketzis, D., On the Capacity of Channels with Unknown Interference, IEEE Trans. Inform. Theory, 1989, vol. 35, no. 4, pp. 770–783. https://doi.org/10.1109/18.32154


  16. Başar, T. and Wu, Y.-W., A Complete Characterization of Minimax and Maximin Encoder-Decoder Policies for Communication Channels with Incomplete Statistical Description, IEEE Trans. Inform. Theory, 1985, vol. 31, no. 4, pp. 482–489. https://doi.org/10.1109/TIT.1985.1057076


  17. Hughes, B. and Narayan, P., Gaussian Arbitrarily Varying Channels, IEEE Trans. Inform. Theory, 1987, vol. 33, no. 2, pp. 267–284. https://doi.org/10.1109/TIT.1987.1057288


  18. Jose, S.T. and Kulkarni, A.A., On a Game Between a Delay-Constrained Communication System and a Finite State Jammer, in Proc. 2018 IEEE Conf. on Decision and Control (CDC’2018), Miami, FL, USA, Dec. 17–19, 2018, pp. 5063–5068. https://doi.org/10.1109/CDC.2018.8618987

  19. Jose, S.T. and Kulkarni, A.A., Shannon Meets von Neumann: A Minimax Theorem for Channel Coding in the Presence of a Jammer, IEEE Trans. Inform. Theory, 2020, vol. 66, no. 5, pp. 2842–2859. https://doi.org/10.1109/TIT.2020.2971682


  20. Blackwell, D., Breiman, L., and Thomasian, A.J., The Capacities of Certain Channel Classes under Random Coding, Ann. Math. Statist., 1960, vol. 31, no. 3, pp. 558–567. https://doi.org/10.1214/aoms/1177705783


  21. Ahlswede, R., Elimination of Correlation in Random Codes for Arbitrarily Varying Channels, Z. Wahrsch. Verw. Gebiete, 1978, vol. 44, no. 2, pp. 159–175. https://doi.org/10.1007/BF00533053


  22. Lapidoth, A. and Narayan, P., Reliable Communication under Channel Uncertainty, IEEE Trans. Inform. Theory, 1998, vol. 44, no. 6, pp. 2148–2177. https://doi.org/10.1109/18.720535


  23. Cover, T.M. and Thomas, J.A., Elements of Information Theory, Hoboken, NJ: Wiley, 2012, 2nd ed.


  24. Kostina, V. and Verdú, S., Fixed-Length Lossy Compression in the Finite Blocklength Regime, IEEE Trans. Inform. Theory, 2012, vol. 58, no. 6, pp. 3309–3338. https://doi.org/10.1109/TIT.2012.2186786


  25. Shannon, C.E., Coding Theorems for a Discrete Source with a Fidelity Criterion, IRE Nat. Conv. Rec., 1959, Part 4, pp. 142–163.


  26. Conforti, M., Cornuéjols, G., and Zambelli, G., Integer Programming, New York: Springer, 2014.


  27. Kosut, O. and Kliewer, J., Dispersion of the Discrete Arbitrarily-Varying Channel with Limited Shared Randomness, in Proc. 2017 IEEE Int. Symp. on Information Theory (ISIT’2017), Aachen, Germany, June 25–30, 2017, pp. 1242–1246. https://doi.org/10.1109/ISIT.2017.8006727


Acknowledgment

We thank an anonymous reviewer for a careful reading of an earlier version of this paper and for constructive comments.


Translated from Problemy Peredachi Informatsii, 2021, Vol. 57, No. 2, pp. 3–35 https://doi.org/10.31857/S0555292321020017.

Appendix

We begin with the following central limit theorem due to Berry and Esseen (see [8]).

Theorem 14

(Berry–Esseen CLT). Fix \(n\in\mathbb{N}\). Let \(W_1,\ldots,W_n\) be independent random variables with finite third absolute moments. Then, for \(t\in\mathbb{R}\), we have

$$\Biggl|\boldsymbol{\rm{P}}\Biggl(\frac{1}{n}\sum\limits_{i=1}^n W_i>D_n+t\sqrt{\frac{V_n}{n}} \Biggr)-\mathrm{Q}(t)\Biggr|\le\frac{B_n}{\sqrt{n}},$$

where \(\mathrm{Q}\) is the complementary Gaussian function, and

$$\begin{aligned}D_n &= \frac{1}{n}\sum\limits_{i=1}^n\boldsymbol{\rm{E}}[W_i], & V_n &=\frac{1}{n}\sum\limits_{i=1}^n{\rm{Var}}[W_i],\\ A_n &=\frac{1}{n}\sum\limits_{i=1}^n\boldsymbol{\rm{E}}\bigl[|W_i-\boldsymbol{\rm{E}}[W_i]|^3\bigr],\qquad & B_n &=\frac{c_0A_n}{V_n^{3/2}},\quad c_0 >0.\end{aligned}$$
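Theorem 14 can be probed numerically. The following sketch is ours, not the paper's: the i.i.d. Bernoulli model for the \(W_i\) and the constant \(c_0=0.56\) are illustrative assumptions, the latter being one admissible value of the universal constant. It estimates the left-hand side by Monte Carlo and compares it with the bound \(B_n/\sqrt{n}\).

```python
import math
import random

def Q(t):
    # Complementary standard Gaussian CDF, Q(t) = P(N(0,1) > t)
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def berry_esseen_check(n=500, t=1.0, p=0.3, trials=5000, seed=0):
    # W_i i.i.d. Bernoulli(p): D_n = p, V_n = p(1-p),
    # A_n = E|W_i - p|^3 = p(1-p)(p^2 + (1-p)^2).
    rng = random.Random(seed)
    thresh = p + t * math.sqrt(p * (1 - p) / n)
    hits = sum(
        sum(rng.random() < p for _ in range(n)) / n > thresh
        for _ in range(trials)
    )
    emp_gap = abs(hits / trials - Q(t))
    c0 = 0.56  # one admissible value of the universal constant (assumption)
    A = p * (1 - p) * (p * p + (1 - p) ** 2)
    bound = c0 * A / (p * (1 - p)) ** 1.5 / math.sqrt(n)
    return emp_gap, bound
```

For \(n=500\) and \(t=1\), the empirical gap stays well inside the bound.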

Proof of Lemma 1.

Fix \(\theta\in\mathcal{T}^n\) and consider \(K\) independent copies \(\left\{F_i,\Phi_i\right\}_{i=1}^K\) of the random code. Using Hoeffding's inequality, we write

$$\boldsymbol{\rm{P}} \biggl(\frac{1}{K}\sum\limits_ie_{\boldsymbol{d},\theta}(F_i,\Phi_i)\ge\varepsilon'\biggr)\le\exp\Bigl(-2K\bigl(\varepsilon'-\boldsymbol{\rm{E}}[e_{\boldsymbol{d},\theta}(F,\Phi)]\bigr)^2\Bigr).$$

Since \(e_{\boldsymbol{d}}(\psi)=\max\limits_{\theta\in\mathcal{T}^n}\boldsymbol{\rm{E}}[e_{\boldsymbol{d},\theta}(F,\Phi)]<\varepsilon\), we have that \(\boldsymbol{\rm{E}}[e_{\boldsymbol{d},\theta}(F,\Phi)] < \varepsilon\;\forall \theta \in\mathcal{T}^n\). Using this and \(\varepsilon'\) from (18), we have that for all \(\theta\in\mathcal{T}^n\),

$$\varepsilon'-\boldsymbol{\rm{E}}[e_{\boldsymbol{d},\theta}(F,\Phi)]>\varepsilon'-\varepsilon>\sqrt{\log|\mathcal{T}^n|/2K},$$

and hence

$$\begin{aligned}\boldsymbol{\rm{P}} \biggl(\frac{1}{K}\sum\limits_ie_{\boldsymbol{d},\theta}(F_i,\Phi_i)\ge\varepsilon'\biggr) & \le\exp\Bigl(-2K\bigl(\varepsilon'-\boldsymbol{\rm{E}}[e_{\boldsymbol{d},\theta}(F,\Phi)]\bigr)^2\Bigr)\\&< \exp \biggl(-2K \frac{\log |\mathcal{T}^n|}{2K}\biggr)=\frac{1}{|\mathcal{T}^n|}. \end{aligned}$$

Thus, we write

$$\begin{aligned}\boldsymbol{\rm{P}} \biggl(\frac{1}{K}\sum\limits_ie_{\boldsymbol{d},\theta}(F_i,\Phi_i)<\varepsilon'\; \forall\theta\in\mathcal{T}^n\biggr)&= 1-\boldsymbol{\rm{P}} \biggl(\frac{1}{K}\sum\limits_ie_{\boldsymbol{d},\theta}(F_i,\Phi_i)\ge\varepsilon'\; \mbox{for some}\,\, \theta\in\mathcal{T}^n\biggr)\\ &\ge 1-\sum\limits_{\theta\in\mathcal{T}^n} \boldsymbol{\rm{P}}\biggl(\frac{1}{K}\sum\limits_ie_{\boldsymbol{d},\theta}(F_i,\Phi_i)\ge\varepsilon'\biggr)\\ &> 1-\frac{|\mathcal{T}^n|}{|\mathcal{T}^n|}=0.\end{aligned}$$

Thus, the event \(\Bigl\{\sum\limits_ie_{\boldsymbol{d},\theta}(F_i,\Phi_i)/K<\varepsilon'\;\forall\theta\in\mathcal{T}^n\Bigr\}\) has nonzero probability, and hence there exist K deterministic codes \(\left\{f_i,\varphi_i\right\}_{i=1}^K\) such that for all \(\theta\in\mathcal{T}^n\),

$$\frac{1}{K}\sum\limits_i e_{\boldsymbol{d},\theta}(f_i,\varphi_i)<\varepsilon',$$

where \(\varepsilon'>\varepsilon+\sqrt{\log |\mathcal{T}^n|/2K}\). △
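The derandomization argument above is easy to simulate. In the toy model below (our illustration: the error of each sampled code against each state is taken to be an independent Bernoulli with mean \(p_{\mathrm{err}}<\varepsilon\)), drawing \(K\) codes and checking the empirical average against \(\varepsilon'=\varepsilon+\sqrt{\log|\mathcal{T}^n|/2K}\) succeeds for all states simultaneously, as the union bound predicts.

```python
import math
import random

def derandomize(eps=0.1, p_err=0.05, num_states=50, K=500, trials=50, seed=0):
    # eps' = eps + sqrt(log|T^n| / (2K)), as in Lemma 1;
    # num_states plays the role of |T^n|.
    rng = random.Random(seed)
    eps_prime = eps + math.sqrt(math.log(num_states) / (2 * K))
    good = 0
    for _ in range(trials):
        # Model: the error of each of the K sampled codes against a state
        # theta is an independent Bernoulli(p_err) with p_err < eps
        # (a simplifying assumption made purely for this illustration).
        good += all(
            sum(rng.random() < p_err for _ in range(K)) / K < eps_prime
            for _ in range(num_states)
        )
    # Fraction of trials in which ALL states met the eps' threshold at once
    return good / trials
```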

Proof of Theorem 9.

We weaken the bound in (19) by taking \(P_{X_a}(x_a)=\prod\limits_{i=1}^{d_n} P^*_{\mathbb{X}}(x_i)\), \(P_{\mathbb{X}}^*\in\Pi_{\mathbb{X}}\). Let \(\mathcal{A}\), \(Z\), \(K\), and \(d_n\) be as defined in (24)–(27). Let \(U_i:=i_{\mathbb{X}^*;\mathbb{Y}_{T_{\theta}}}(X_{ai};Y_{ai})\), where \((X_{ai}, Y_{ai}) \sim P^*_{\mathbb{X}}\times\sum\limits_{\bar\theta\in\mathcal{T}}T_{\theta}(\bar\theta)P_{\mathbb{Y}|\mathbb{X},\Theta=\bar\theta}\) for all \(i\). Since the channel is memoryless, \(i_{X^*;Y_{q^*}}(X_a;Y_a)=\sum\limits_{i=1}^{d_n}U_i\). Thus, taking \(\gamma'=\log(\sqrt{d_n}K)\), we can write \(\boldsymbol{\rm{P}}\bigl(i_{X^*;Y_{q^*}}(X_a;Y_a)\le\gamma'\bigr)=\boldsymbol{\rm{P}}\biggl(\,\sum\limits_{i=1}^{d_n}U_i\le\log (\sqrt{d_n}K)\!\biggr)\). From [10, Lemma 12.10], we have \(C\le\boldsymbol{\rm{E}}\bigl[i_{\mathbb{X};\mathbb{Y}_{T_{\theta}}}(X_{ai};Y_{ai})\bigr]=\boldsymbol{\rm{E}}[U_i]\) for all \(T_{\theta}\in\mathcal{P}_{d_n}(\mathcal{T})\), and hence \(d_nC\le\sum\limits_{i=1}^{d_n}\boldsymbol{\rm{E}}[U_i]\). Also, substituting for \(\log K\) from equation (27), we get

$$\begin{aligned}\boldsymbol{\rm{P}}\Biggl(\sum\limits_{i=1}^{d_n}U_i\le\log \sqrt{d_n}+\log K\Biggr) & \le\boldsymbol{\rm{P}}\Biggl(\sum\limits_{i=1}^{d_n} U_i\le\log \frac{\sqrt{d_n}}{\exp d_n\delta}+\sum\limits_{i=1}^{d_n} \boldsymbol{\rm{E}}[U_i]\Biggr)\\ & \le\boldsymbol{\rm{P}}\Biggl(\Biggl|\sum\limits_{i=1}^{d_n} (U_i-\boldsymbol{\rm{E}}[U_i])\Biggr|\ge\log \frac{\exp d_n\delta}{\sqrt{d_n}}\Biggr),\end{aligned}$$

where the last inequality follows from the triangle inequality. Using Chebyshev’s inequality and taking the supremum over \(\theta_a\), we get

$$\begin{array}{ll}\displaystyle\sup\limits_{\theta_a\in\mathcal{T}^{d_n}} \boldsymbol{\rm{P}}\Biggl(\Biggl|\sum\limits_{i=1}^{d_n} (U_i - \boldsymbol{\rm{E}}[U_i])\Biggr|\ge d_n\delta-\log \sqrt{d_n}\Biggr)\\ \quad\quad\quad\quad\quad\,\,\quad\le\displaystyle\sup\limits_{\theta_a\in\mathcal{T}^{d_n}} \frac{d_n}{(d_n \delta-\log \sqrt{d_n})^2}{\rm{Var}} \bigl(i_{\mathbb{X}^*;\mathbb{Y}_{T_{\theta_a}}}(\mathbb{X}_a;\mathbb{Y}_a)\bigr)\le\frac{V_0}{d_n\Bigl(\delta-\frac{\log \sqrt{d_n}}{d_n}\Bigr)^2},\quad\end{array}$$
(33)

where the last inequality in (33) follows from equation (25).

The bounds on the other two terms given as \(\boldsymbol{\rm{P}}(Z(X_a,\bar{X}_a,Y_a)=0,(X_a,Y_a)\in\mathcal{A} \,|\,\theta_a)\le 1/\sqrt{d_n}K\) and \(\boldsymbol{\rm{P}}(Z(X_a,\bar{x}_a,Y_a)=0, (X_a,Y_a)\in\mathcal{A} \,|\,\bar{X}_a=\bar{x}_a,\theta_a)\le (d_n+1)^{|\mathcal{X}|^2|\Theta||\mathcal{Y}|} \exp(-d_n\eta)\) follow from the proof of Theorem 4 in [11]. Using (33) and the above bounds, we get the required bound. △
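The Chebyshev step leading to (33) can be sanity-checked numerically. Below, centered i.i.d. uniform variables stand in for the \(U_i-\boldsymbol{\rm{E}}[U_i]\) (an assumption made purely for illustration), and the empirical probability that the centered sum exceeds \(d_n\delta-\log\sqrt{d_n}\) is compared with the Chebyshev bound.

```python
import math
import random

def chebyshev_step(n=1000, delta=0.1, trials=500, seed=0):
    # Centered i.i.d. uniforms on [-1/2, 1/2] stand in for U_i - E[U_i];
    # their variance 1/12 plays the role of the variance proxy V_0 in (33).
    rng = random.Random(seed)
    margin = n * delta - math.log(math.sqrt(n))
    hits = sum(
        abs(sum(rng.random() - 0.5 for _ in range(n))) >= margin
        for _ in range(trials)
    )
    emp = hits / trials
    bound = (n / 12.0) / margin ** 2  # Chebyshev: n * Var / margin^2
    return emp, bound
```

As expected, the empirical tail probability is far below the (already small) Chebyshev bound at these parameters.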

Proof of Theorem 10.

To derive the bound, we construct a random joint source-channel code \((F,\Phi)\) or equivalently a distribution \(\psi\) on the set of codes \(\{(f,\varphi)\mid f\colon\,\mathcal{S}^k\to\mathcal{X}^n,\:\varphi\colon\,\mathcal{Y}^n\to\mathcal{S}^k\}\). From Theorem 6, it suffices to choose the distributions \(P_X\) and \(P_{\widehat{S}}\). Let \(P^*_{\mathbb{X}}\) be a distribution from the set \(\Pi_{\mathbb{X}}\). Define \(P_X(x):=\prod\limits_{i=1}^nP^*_{\mathbb{X}}(x_i)\) for \(x\in\mathcal{X}^n\). Further, take \(P_{\widehat{S}}(\widehat{s})=\prod\limits_{i=1}^kP_{\widehat{\mathbb{S}}}^*(\widehat{s}_i)\), \(\widehat{s}\in\mathcal{S}^k\), where \(P_{\widehat{\mathbb{S}}}^*\) achieves the optimum in (2). Clearly, with this choice of distributions, we have that

$$i_{X^*;Y_{q^*}}(X;Y)=\sum\limits_{i=1}^ni_{\mathbb{X}^*;\mathbb{Y}_{q_{\Theta}^*}}(X_i;Y_i),\qquad j_S(S,\boldsymbol{d})=\sum\limits_{j=1}^kj_{\mathrm{S}}(S_j,\boldsymbol{d}).$$

Recall the error terms corresponding to the random code from Theorem 8. Writing the maximization over \(\theta_b\in\mathcal{T}^n\) as a maximization over \(q\in\mathcal{P}(\mathcal{T}^n)\), we can write the bound as

$$\max_{q\in\mathcal{P}(\mathcal{T}^n)} \biggl[\boldsymbol{\rm{E}} \biggl[\exp\biggl(-\biggl|i_{X^*;Y_{q^*}}(X_b;Y_b)-\log\frac{\overline\gamma}{P_{\widehat{S}}(\mathcal{B}_{\boldsymbol{d}}(S))}\biggr|^+\biggr)\biggr]+e^{1-\overline\gamma}\biggr],$$

where \(q(\theta)=\prod\limits_{i=1}^nq_i(\theta_i)\) with \(q_i\in\mathcal{P}(\mathcal{T})\), \(\theta\in\mathcal{T}^n\). Let \(h(X_b,Y_b,S):=\sum\limits_{j=1}^ni_{\mathbb{X}^*;\mathbb{Y}_{q_{\Theta}^*}}(X_{bj};Y_{bj})-\log(\overline\gamma/P_{\widehat{S}}(\mathcal{B}_{\boldsymbol{d}}(S)))\). Further, define the set \(\mathcal{D}\) as

$$\mathcal{D}:=\biggl\{s\in\mathcal{S}^k:\: \log\frac{1}{P_{\widehat{S}}(\mathcal{B}_{\boldsymbol{d}}(s))}\le\sum\limits_{i=1}^kj_{\mathrm{S}}(s_i,\boldsymbol{d})+\Bigl(\bar{c}-\frac{1}{2}\Bigr)\log k+c\biggr\},$$

where \(\bar{c}\) and \(c\) are constants defined in [9, Lemma 5]. Define the random variable \(W_\ell\) as

$$W_\ell=W_\ell(n,k):=\begin{cases} i_{\mathbb{X}^*;\mathbb{Y}_{q_{\Theta}^*}}(X_{b\ell};Y_{b\ell}) & \text{if}\ \ell\le n,\\ -j_{\mathrm{S}}(S_{\ell-n},\boldsymbol{d}) & \text{if}\ n<\ell\le n+k.\end{cases}$$
(34)

The expectation \(\boldsymbol{\rm{E}} [\exp(-|h(X_b,Y_b,S)|^+)]\) can be written as

$$\begin{array}{ll}\boldsymbol{\rm{E}} \bigl[\exp(-|h(X_b,Y_b,S)|^+) \mathbb{I}\{S\in\mathcal{D}\}\bigr]+\boldsymbol{\rm{E}}\bigl[\exp(-|h(X_b,Y_b,S)|^+)\mathbb{I}\{S \notin \mathcal{D}\}\bigr]\\ \quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\le\displaystyle\boldsymbol{\rm{E}}\Biggl[\exp\Biggl(-\Biggl|\sum\limits_{\ell=1}^{n+k} W_\ell-\log\bigl(k^{(\bar{c}-\frac{1}{2})}\exp(c)\overline\gamma\bigr)\Biggr|^+\Biggr)\Biggr]+\frac{K_0}{\sqrt{k}},\quad\end{array}$$
(35)

where the first term in (35) follows by using the definition of the set \(\mathcal{D}\) and the second term follows by using \(\boldsymbol{\rm{E}} \bigl[\exp(-|h(X_b,Y_b,S)|^+)\mathbb{I}\{S \notin \mathcal{D}\}\bigr]\le\boldsymbol{\rm{P}}(S \notin \mathcal{D})\) and applying Lemma 5 from [9].

We define the following moments to be used in bounding the first term in (35) using the Berry–Esseen CLT:

$$\begin{aligned}D_{n+k}(q) &= \frac{1}{n+k}\sum\limits_{\ell=1}^{n+k}\boldsymbol{\rm{E}}[W_\ell], & V_{n+k}(q) &=\frac{1}{n+k}\sum\limits_{\ell=1}^{n+k}{\rm{Var}}[W_\ell],\\ A_{n+k}(q) &=\frac{1}{n+k}\sum\limits_{\ell=1}^{n+k}\boldsymbol{\rm{E}}\bigl[|W_\ell-\boldsymbol{\rm{E}}[W_\ell]|^3\bigr],\qquad & B_{n+k}(q) &= \frac{c_0A_{n+k}(q)}{V_{n+k}^{3/2}(q)},\quad c_0 >0.\quad\end{aligned}$$
(36)

Note that the moments are computed with respect to the distribution \(P_X\times\sum\limits_{\theta} q(\theta)P_{Y|X,\boldsymbol{\Theta}=\theta}\times P_S\). Next, we define the following set:

$$\mathcal{H}=\Biggl\{(x,y,s)\in\mathcal{X}^n \times\mathcal{Y}^n\times\mathcal{S}^k:\:\frac{1}{n+k}\sum\limits_{\ell=1}^{n+k}W_\ell>D_{n+k}(q)-t_{k,n}\sqrt{\frac{V_{n+k}(q)}{n+k}}\Biggr\},$$

where \(t_{k,n}>0\) will be chosen later. For the sake of brevity, we define the term in the \(\exp\) as follows:

$$g(X_b,Y_b,S)=\sum\limits_{\ell=1}^{n+k} W_\ell-\log\bigl(k^{(\bar{c}-\frac{1}{2})}\exp(c)\overline\gamma\bigr),$$
(37)
$$\Gamma_{n+k}(q)=(n+k)\Biggl(D_{n+k}(q)-t_{k,n}\sqrt{\frac{V_{n+k}(q)}{n+k}}\Biggr)-\log\bigl(k^{(\bar{c}-\frac{1}{2})} \exp(c)\overline\gamma\bigr).$$
(38)

Thus, the first term on the right-hand side of (35) is \(\boldsymbol{\rm{E}} [\exp(-|g(X_b,Y_b,S)|^+)]\), which can be written as

$$\begin{array}{ll}\boldsymbol{\rm{E}} \bigl[\exp(-|g(X_b,Y_b,S)|^+)\mathbb{I}\{(X_b,Y_b,S)\in\mathcal{H}\}\bigr] +\boldsymbol{\rm{E}} \bigl[\exp(-|g(X_b,Y_b,S)|^+)\mathbb{I}\{(X_b,Y_b,S) \notin \mathcal{H}\}\bigr]\\ \quad\quad\quad\quad\quad\quad\quad\quad\le\boldsymbol{\rm{E}} \bigl[\exp(-|\Gamma_{n+k}(q)|^+) \mathbb{I}\{(X_b,Y_b,S)\in\mathcal{H}\} \bigr]+\boldsymbol{\rm{E}}\bigl[\mathbb{I}\{(X_b,Y_b,S) \notin \mathcal{H}\} \bigr],\quad\end{array}$$
(39)

where the first term in (39) follows by using the definition of the set \(\mathcal{H}\) and the second term follows by using \(\exp(-|\cdot|^+)\le 1\). Using the above results and (35), we get the following bound:

$$\begin{array}{lll}\displaystyle\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)} \biggl[\boldsymbol{\rm{E}} \biggl[\exp\biggl(-\biggl|i_{X^*;Y_{q^*}}(X_b;Y_b)-\log\frac{\overline\gamma}{P_{\widehat{S}}(\mathcal{B}_{\boldsymbol{d}}(S))}\biggr|^+\biggr)\biggr]+e^{1-\overline\gamma}\biggr]\\ \,\,\quad\quad\quad\quad\quad\quad\quad\quad\le\max\limits_{q \in\mathcal{P}(\mathcal{T}^n)}\boldsymbol{\rm{E}}\bigl[\exp(-|\Gamma_{n+k}(q)|^+) \mathbb{I}\{(X_b,Y_b,S)\in\mathcal{H}\}\bigr]\\ \,\,\,\,\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad \strut+\displaystyle\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)} \boldsymbol{\rm{P}} ((X_b,Y_b,S) \notin\mathcal{H})+e^{1-\overline\gamma}+\frac{K_0}{\sqrt{k}}.\quad\end{array}$$
(40)

To upper bound the first term in (40), we compute the maximum of \(\exp(-|\Gamma_{n+k}(q)|^+)\) over all distributions. Since \(\exp(-|\cdot|^+)\) is nonincreasing in its argument, we get

$$\max_{q\in\mathcal{P}(\mathcal{T}^n)}\exp(-|\Gamma_{n+k}(q)|^+ )\le\exp\Bigl(-\Bigl|\min_{q\in\mathcal{P}(\mathcal{T}^n)}\Gamma_{n+k}(q) \Bigr|^+\Bigr).$$

To compute the minimum, we consider the following:

$$\begin{aligned}D_{n+k}(q) &= \frac{1}{n+k}\sum\limits_{i=1}^n\boldsymbol{\rm{E}}\bigl[i_{\mathbb{X}^*;\mathbb{Y}_{q_{\Theta}^*}}(\mathbb{X}_{bi};\mathbb{Y}_{bi})\bigr] -\frac{k}{n+k} R(\boldsymbol{d}),\\V_{n+k}(q) &= \frac{1}{n+k}\sum\limits_{i=1}^n{\rm{Var}}\bigl(i_{\mathbb{X}^*;\mathbb{Y}_{q_{\Theta}^*}}(\mathbb{X}_{bi};\mathbb{Y}_{bi})\bigr) + \frac{k}{n+k}V_{\mathrm{S}}(\boldsymbol{d}),\end{aligned}$$

where the moments are with respect to \(P_{\mathbb{X}}^*\times\sum\limits_{\theta\in\mathcal{T}}q_i(\theta)P_{\mathbb{Y}|\mathbb{X},\Theta=\theta}\). Thus, the minimum is given as

$$\begin{array}{ll} \displaystyle\min\limits_{q\in\mathcal{P}(\mathcal{T}^n)}(n+k)\Biggl(D_{n+k}(q)-t_{k,n}\sqrt{\frac{V_{n+k}(q)}{n+k}}\Biggr)\\[-5pt] \quad\quad\,\,\,\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad=\displaystyle nC-kR(\boldsymbol{d})-(n+k)\max\limits_{q_i\in\Pi_{\Theta}}t_{k,n}\sqrt{\frac{V_{n+k}(q)}{n+k}}+O(1),\quad\end{array}$$
(41)

where the maximum in the second term in (41) is restricted to \(\Pi_{\Theta}\) by using [8, Lemmas 63 and 64]. Since \(P_{\mathbb{X}}^*\in\Pi_{\mathbb{X}}\) and \(q_{\Theta}^*\in\Pi_{\Theta}\) with \(|\Pi_{\Theta}|=1\), we have \(\max\limits_{q_i\in\Pi_{\Theta}}V_{n+k}(q)=nV_{\mathrm{C}}\). Thus, we have that

$$\min_{q\in\mathcal{P}(\mathcal{T}^n)}(n+k) \Biggl(D_{n+k}(q)-t_{k,n}\sqrt{\frac{V_{n+k}(q)}{n+k}}\Biggr)=nC-kR(\boldsymbol{d})-t_{k,n} \sqrt{nV_{\mathrm{C}}+k V_{\mathrm{S}}(\boldsymbol{d})}+O(1).$$

We choose \(t_{k,n}\) as \(t_{k,n}=(nC-kR(\boldsymbol{d})-\bar{c}\log k-\log \overline\gamma-c)/\sqrt{n V_{\mathrm{C}}+k V_{\mathrm{S}}(\boldsymbol{d})}\). Thus, we get \(\min\limits_{q\in\mathcal{P}(\mathcal{T}^n)}\Gamma_{n+k}(q)\ge \frac{1}{2} \log k+O(1)\). Substituting in (40), we get the following upper bound:

$$\begin{array}{ll}\displaystyle\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)}\frac{1}{\sqrt{k}}\boldsymbol{\rm{P}}((X_b,Y_b,S)\in\mathcal{H})+\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)} \boldsymbol{\rm{P}} ((X_b,Y_b,S)\notin \mathcal{H} )+e^{1-\overline\gamma}+\frac{K_0}{\sqrt{k}}\\ \quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\le\displaystyle\mathrm{Q}(t_{k,n})+\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)}\frac{B_{n+k}(q)}{\sqrt{n+k}}+\frac{K_0+2}{\sqrt{k}},\quad\end{array}$$
(42)

where (42) follows by taking \(\overline\gamma=(\log_{e}k)/2+1\) and bounding \(\boldsymbol{\rm{P}} ((X_b,Y_b,S) \notin \mathcal{H})\) using the Berry–Esseen CLT.

Recall that \(B_{n+k}(q)\) is given as \(B_{n+k}(q)=c_0A_{n+k}(q)/V_{n+k}^{3/2}(q)\). Assuming \(\min\limits_{\ell\in\{1,\ldots,n+k\}}{\rm{Var}}[W_\ell]\ne 0\), from the definitions of \(A_{n+k}(q)\) and \(V_{n+k}(q)\) we can bound \(B_{n+k}(q)\) as

$$B_{n+k}(q)\le\frac{c_0 \max\limits_\ell\boldsymbol{\rm{E}}\bigl[|W_\ell-\boldsymbol{\rm{E}}[W_\ell]|^3\bigr]}{\Bigl(\min\limits_\ell{\rm{Var}}[W_\ell]\Bigr)^{3/2}}.$$
(43)

From (22) and (23), we have that the right-hand side is finite for all q and independent of k and n. Thus, we have that \(\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)} B_{n+k}(q)\) is a finite constant. Substituting \(t_{k,n}\) from above, taking \(\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)} B_{n+k}(q)\le B\) with \(B>0\) and using (42), we get the required bound. △
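To see the resulting second-order scaling concretely, the following sketch evaluates the leading part of \(t_{k,n}\) and the corresponding Gaussian term \(\mathrm{Q}(t_{k,n})\); the \(O(\log k)\) and \(\overline\gamma\) corrections from the proof are dropped, and the numerical values used for \(C\), \(R(\boldsymbol{d})\), \(V_{\mathrm{C}}\), and \(V_{\mathrm{S}}(\boldsymbol{d})\) are hypothetical.

```python
import math

def Q(t):
    # Complementary standard Gaussian CDF
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def dispersion_threshold(n, k, C, R, Vc, Vs):
    # Leading part of t_{k,n} from the proof of Theorem 10, with the
    # O(log k) and log-gamma corrections dropped:
    # t = (n*C - k*R(d)) / sqrt(n*V_C + k*V_S(d))
    t = (n * C - k * R) / math.sqrt(n * Vc + k * Vs)
    return t, Q(t)
```

For instance, with hypothetical values \(n=k=1000\), \(C=0.5\), \(R(\boldsymbol{d})=0.4\), and \(V_{\mathrm{C}}=V_{\mathrm{S}}(\boldsymbol{d})=0.25\), the rate sits several dispersion standard deviations below capacity and the Gaussian term is already negligible.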

Proof of Theorem 12.

From (11), it suffices to construct \(q\), \(P_{\bar{Y}_q}\), the random variable \(U\), and \(\gamma\) to get a lower bound on \(\underline{\nu}(k,n)\). Take \(q(\theta)=q^*(\theta)=\prod\limits_{i=1}^n q^*_{\Theta}(\theta_i)\), where \(q^*_{\Theta}\in\Pi_{\Theta}\). Let \(\mathrm{U}\) be the number of types in \(\mathcal{P}_n(\mathcal{X})\), and let \(\mathcal{U}=\{1,\ldots,\mathrm{U}\}\) be the set of indices corresponding to these types. Thus, for a given sequence \(x\in\mathcal{X}^n\), \(U\) maps it to its type, which is denoted by some index \(u\in\mathcal{U}\). Further, let \(P_{\overline{Y}_q|U}\) be defined as

$$P_{\overline{Y}_q|U}(y\,|\,u)=(P_Xq^*P_{Y|X,\boldsymbol{\Theta}})(y)=\sum\limits_{x,\theta} P_X(x)q^*(\theta)P_{Y|X,\boldsymbol{\Theta}}(y\,|\,x,\theta),$$

where \(P_X(x)=\prod\limits_{i=1}^nT_x(x_i)\), \(x\in\mathcal{X}^n\), with \(T_x\in\mathcal{P}_n(\mathcal{X})\) being the type corresponding to the index \(u\). Thus, we have that \(i_{X;\overline{Y}_q|U}(x;Y\,|\,u)=\sum\limits_{i=1}^ni_{\mathbb{X};\mathbb{Y}_{q^*}}(x_i;Y_i)\), where

$$i_{\mathbb{X};\mathbb{Y}_{q^*}}(x';y):=\log \frac{(q_{\Theta}^*P_{\mathbb{Y}|\mathbb{X},\Theta})(y\,|\,x')}{(T_xq_{\Theta}^* P_{\mathbb{Y}|\mathbb{X},\Theta})(y)},\quad x'\in\mathcal{X},\quad y\in\mathcal{Y},$$

and \(j_S(s,\boldsymbol{d})=\sum\limits_{j=1}^kj_{\mathrm{S}}(s_j,\boldsymbol{d})\).

Since \(q\) is a product (i.i.d.) distribution, we effectively have a channel where

$$Y \sim \prod\limits_{i=1}^n\sum\limits_{\theta_i\in\mathcal{T}}q_{\Theta}^*(\theta_i) P_{\mathbb{Y}|\mathbb{X}=x_i,\Theta=\theta_i}$$

when the input is \(x=(x_1,\ldots,x_n)\). Thus, the left-hand side of (31) is a converse bound for a standard DMC without a jammer, with the channel given by the averaged channel above. Following the line of argument in [9, Appendix C], we get the following inequality:

$$\begin{array}{ll}\displaystyle\max\limits_{q,P_{\overline{Y}_q}, \mathrm{U}} \,\sup\limits_{\gamma>0} \Biggl[\sum\limits_sP_S(s)\min_x \Biggl[\boldsymbol{\rm{P}}\bigl(j_S(s,\boldsymbol{d})-i_{X;\overline{Y}_q|U}(x;Y\,|\,U)\le\gamma\bigr)+\exp \bigl(j_S(s,\boldsymbol{d})-\gamma\bigr)\\ \quad\,\,\quad\strut\times\displaystyle\sum\limits_{u=1}^{\mathrm{U}} \sum\limits_yP_{U|X}(u\,|\,x)P_{\overline{Y}_q|U}(y\,|\,u)\mathbb{I}\bigl\{j_S(s,\boldsymbol{d})-i_{X;\overline{Y}_q|U}(x;y\,|\,u)>\gamma\bigr\}\Biggr] - \frac{\mathrm{U}}{\exp(\gamma)}\Biggr]\\ \quad\ge\displaystyle\boldsymbol{\rm{P}}\Biggl(\sum\limits_{i=1}^ni_{\mathbb{X};\mathbb{Y}_{q_{\Theta}^*}}(x_i^*;Y_i)-\sum\limits_{j=1}^k j_{\mathrm{S}}(S_j,\boldsymbol{d})\le -\gamma\Biggr)-\frac{K_1}{k}-\frac{K_2}{\sqrt{n}}-(n+1)^{|\mathcal{X}|-1}\exp(-\gamma),\quad\end{array}$$
(44)

where the second term on the left-hand side is bounded below by zero, \(K_1\) and \(K_2\) are constants, and \(x^*=(x_1^*,\ldots,x_n^*)\) is a sequence whose type \(T_{x^*}\) achieves the minimum

$$\min_{T_x\in\mathcal{P}_n(\mathcal{X})} |T_x-P_{\mathbb{X}}^*|$$
(45)

where \(P_{\mathbb{X}}^*\in\Pi_{\mathbb{X}}\). Let

$$W_\ell=W_\ell(n,k):=\begin{cases}i_{\mathbb{X};\mathbb{Y}_{q_{\Theta}^*}}(x_\ell^*;Y_\ell) & \text{if}\ \ell\le n,\\-j_{\mathrm{S}}(S_{\ell-n},\boldsymbol{d}) & \text{if}\ n<\ell\le n+k.\end{cases}$$

Define the following moments of the random variable \(W_\ell\):

$$\begin{aligned}D_{n+k} &= \frac{1}{n+k}\sum\limits_{\ell=1}^{n+k}\boldsymbol{\rm{E}}[W_\ell], & V_{n+k} &=\frac{1}{n+k}\sum\limits_{\ell=1}^{n+k}{\rm{Var}}[W_\ell],\\ A_{n+k} &=\frac{1}{n+k}\sum\limits_{\ell=1}^{n+k}\boldsymbol{\rm{E}}\bigl[|W_\ell-\boldsymbol{\rm{E}}[W_\ell]|^3\bigr],\qquad &B_{n+k}' &= \frac{c_0A_{n+k}}{V_{n+k}^{3/2}},\quad c_0 >0.\end{aligned}$$

From the Berry–Esseen CLT, we have

$$\boldsymbol{\rm{P}}\Biggl(\sum\limits_{\ell=1}^{n+k}W_\ell\le -\gamma\Biggr)\ge\mathrm{Q}\left(\frac{D_{n+k}+\frac{\gamma}{n+k}}{\sqrt{\frac{V_{n+k}}{n+k}}}\right)-\frac{B_{n+k}'}{\sqrt{n+k}}.$$
(46)

We also have the following inequalities from [9, Appendix C]:

$$D_{n+k}\le\frac{n}{n+k}C-\frac{k}{n+k}R(\boldsymbol{d}),$$
(47)
$$V_{n+k} \ge\frac{n}{n+k}V_{\mathrm{C}}+\frac{k}{n+k}V_{\mathrm{S}}(\boldsymbol{d})-\frac{K_3}{n+k},$$
(48)

where \(K_3>0\) is some constant. Further, from (22) and (23), we can show that \(A_{n+k}\) is bounded and hence \(B'_{n+k}\) is bounded by a constant \(B'>0\). Using (47) and (48), we get

$$\mathrm{Q}\left(\frac{D_{n+k}+\frac{\gamma}{n+k}}{\sqrt{\frac{V_{n+k}}{n+k}}}\right)\ge \mathrm{Q}\biggl(\frac{nC-kR(\boldsymbol{d})+\gamma}{\sqrt{nV_{\mathrm{C}}+kV_{\mathrm{S}}(\boldsymbol{d})-K_3}}\biggr).$$

Substituting the above in (46), taking \(\gamma=(|\mathcal{X}|-1/2)\log (n+1)\) and using (44), we get the required bound. △
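The type-quantization step (45) asks for an \(n\)-type close to \(P^*_{\mathbb{X}}\). The minimal sketch below (greedy per-coordinate rounding with a one-coordinate repair, our illustration rather than the exact minimizer) shows that a type within \(O(1/n)\) of a given distribution in total variation always exists.

```python
def nearest_type(p, n):
    # Round a distribution p to an n-type (integer counts summing to n)
    # by per-coordinate rounding plus a one-coordinate repair; a greedy
    # stand-in for the exact minimizer in (45).
    counts = [round(pi * n) for pi in p]
    counts[0] += n - sum(counts)  # repair so the counts sum to n
    t = [c / n for c in counts]
    tv = 0.5 * sum(abs(a - b) for a, b in zip(t, p))  # total variation
    return t, tv
```

For example, rounding \((0.3, 0.25, 0.45)\) to a type with denominator \(n=7\) gives \((2/7, 2/7, 3/7)\), at total-variation distance well below \(1/n\) times the alphabet size.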


Cite this article

Vora, A., Kulkarni, A. Minimax Theorems for Finite Blocklength Lossy Joint Source-Channel Coding over an Arbitrarily Varying Channel. Probl Inf Transm 57, 99–128 (2021). https://doi.org/10.1134/S0032946021020010
