Abstract
Motivated by applications in the security of cyber-physical systems, we pose the finite blocklength communication problem in the presence of a jammer as a zero-sum game between the encoder-decoder team and the jammer, by allowing the communicating team as well as the jammer only locally randomized strategies. The communicating team's problem is nonconvex under locally randomized codes, and hence, in general, a minimax theorem need not hold for this game. However, we show that approximate minimax theorems hold in the sense that the minimax and maximin values of the game approach each other asymptotically. In particular, for rates strictly below a critical threshold, both the minimax and maximin values approach zero, and for rates strictly above it, they both approach unity. We then show a second-order minimax theorem, i.e., for rates exactly approaching the threshold along a specific scaling, the minimax and maximin values approach the same constant value, which is neither zero nor one. Critical to these results is our derivation of finite blocklength bounds on the minimax and maximin values of the game and our derivation of second-order dispersion-based bounds.
References
Vora, A.S. and Kulkarni, A.A., A Minimax Theorem for Finite Blocklength Joint Source-Channel Coding over an AVC, in 2019 National Conf. on Communications (NCC 2019), Bangalore, India, Feb. 20–23, 2019, pp. 1–6. https://doi.org/10.1109/NCC.2019.8732205
Humayed, A., Lin, J., Li, F., and Luo, B., Cyber-Physical Systems Security—A Survey, IEEE Internet of Things J., 2017, vol. 4, no. 6, pp. 1802–1831. https://doi.org/10.1109/JIOT.2017.2703172
Slay, J. and Miller, M., Lessons Learned from the Maroochy Water Breach, Critical Infrastructure Protection (Proc. Int. Conf. ICCIP–2007, Hanover, NH, USA, Mar. 19–21, 2007), Goetz, E. and Shenoi, S., Eds., Boston: Springer, 2008. https://doi.org/10.1007/978-0-387-75462-8_6
Langner, R., Stuxnet: Dissecting a Cyberwarfare Weapon, IEEE Secur. Priv., 2011, vol. 9, no. 3, pp. 49–51. https://doi.org/10.1109/MSP.2011.67
Maschler, M., Solan, E., and Zamir, S., Game Theory, Cambridge: Cambridge Univ. Press, 2013.
Kulkarni, A.A. and Coleman, T.P., An Optimizer’s Approach to Stochastic Control Problems with Nonclassical Information Structures, IEEE Trans. Autom. Control, 2015, vol. 60, no. 4, pp. 937–949. https://doi.org/10.1109/TAC.2014.2362596
Ahlswede, R., A Note on the Existence of the Weak Capacity for Channels with Arbitrarily Varying Channel Probability Functions and Its Relation to Shannon’s Zero Error Capacity, Ann. Math. Statist., 1970, vol. 41, no. 3, pp. 1027–1033. https://doi.org/10.1214/aoms/1177696979
Polyanskiy, Y., Poor, H.V., and Verdú, S., Channel Coding Rate in the Finite Blocklength Regime, IEEE Trans. Inform. Theory, 2010, vol. 56, no. 5, pp. 2307–2359. https://doi.org/10.1109/TIT.2010.2043769
Kostina, V. and Verdú, S., Lossy Joint Source-Channel Coding in the Finite Blocklength Regime, IEEE Trans. Inform. Theory, 2013, vol. 59, no. 5, pp. 2545–2575. https://doi.org/10.1109/TIT.2013.2238657
Csiszár, I. and Körner, J., Information Theory: Coding Theorems for Discrete Memoryless Systems, Cambridge, UK: Cambridge Univ. Press, 2011, 2nd ed.
Kosut, O. and Kliewer, J., Finite Blocklength and Dispersion Bounds for the Arbitrarily-Varying Channel, in Proc. 2018 IEEE Int. Symp. on Information Theory (ISIT–2018), Vail, CO, USA, June 17–22, 2018, pp. 2007–2011. https://doi.org/10.1109/ISIT.2018.8437724
Jose, S.T. and Kulkarni, A.A., Linear Programming-Based Converses for Finite Blocklength Lossy Joint Source-Channel Coding, IEEE Trans. Inform. Theory, 2017, vol. 63, no. 11, pp. 7066–7094. https://doi.org/10.1109/TIT.2017.2738634
Jose, S.T. and Kulkarni, A.A., Improved Finite Blocklength Converses for Slepian–Wolf Coding via Linear Programming, IEEE Trans. Inform. Theory, 2019, vol. 65, no. 4, pp. 2423–2441. https://doi.org/10.1109/TIT.2018.2873623
Borden, J.M., Mason, D.M., and McEliece, R.J., Some Information Theoretic Saddlepoints, SIAM J. Control Optim., 1985, vol. 23, no. 1, pp. 129–143. https://doi.org/10.1137/0323011
Hegde M.V., Stark W.E., and Teneketzis, D., On the Capacity of Channels with Unknown Interference, IEEE Trans. Inform. Theory, 1989, vol. 35, no. 4, pp. 770–783. https://doi.org/10.1109/18.32154
Başar, T. and Wu, Y.-W., A Complete Characterization of Minimax and Maximin Encoder-Decoder Policies for Communication Channels with Incomplete Statistical Description, IEEE Trans. Inform. Theory, 1985, vol. 31, no. 4, pp. 482–489. https://doi.org/10.1109/TIT.1985.1057076
Hughes, B. and Narayan, P., Gaussian Arbitrarily Varying Channels, IEEE Trans. Inform. Theory, 1987, vol. 33, no. 2, pp. 267–284. https://doi.org/10.1109/TIT.1987.1057288
Jose, S.T. and Kulkarni, A.A., On a Game Between a Delay-Constrained Communication System and a Finite State Jammer, in Proc. 2018 IEEE Conf. on Decision and Control (CDC’2018), Miami, FL, USA, Dec. 17–19, 2018, pp. 5063–5068. https://doi.org/10.1109/CDC.2018.8618987
Jose, S.T. and Kulkarni, A.A., Shannon Meets von Neumann: A Minimax Theorem for Channel Coding in the Presence of a Jammer, IEEE Trans. Inform. Theory, 2020, vol. 66, no. 5, pp. 2842–2859. https://doi.org/10.1109/TIT.2020.2971682
Blackwell, D., Breiman, L., and Thomasian, A.J., The Capacities of Certain Channel Classes under Random Coding, Ann. Math. Statist., 1960, vol. 31, no. 3, pp. 558–567. https://doi.org/10.1214/aoms/1177705783
Ahlswede, R., Elimination of Correlation in Random Codes for Arbitrarily Varying Channels, Z. Wahrsch. Verw. Gebiete, 1978, vol. 44, no. 2, pp. 159–175. https://doi.org/10.1007/BF00533053
Lapidoth, A. and Narayan, P., Reliable Communication under Channel Uncertainty, IEEE Trans. Inform. Theory, 1998, vol. 44, no. 6, pp. 2148–2177. https://doi.org/10.1109/18.720535
Cover, T.M. and Thomas, J.A., Elements of Information Theory, Hoboken, NJ: Wiley, 2012, 2nd ed.
Kostina, V. and Verdú, S., Fixed-Length Lossy Compression in the Finite Blocklength Regime, IEEE Trans. Inform. Theory, 2012, vol. 58, no. 6, pp. 3309–3338. https://doi.org/10.1109/TIT.2012.2186786
Shannon, C.E., Coding Theorems for a Discrete Source with a Fidelity Criterion, IRE Nat. Conv. Rec., 1959, Part 4, pp. 142–163.
Conforti, M., Cornuéjols, G., and Zambelli, G., Integer Programming, New York: Springer, 2014.
Kosut, O. and Kliewer, J., Dispersion of the Discrete Arbitrarily-Varying Channel with Limited Shared Randomness, in Proc. 2017 IEEE Int. Symp. on Information Theory (ISIT’2017), Aachen, Germany, June 25–30, 2017, pp. 1242–1246. https://doi.org/10.1109/ISIT.2017.8006727
Acknowledgment
We thank an anonymous reviewer for their careful reading and constructive comments on an earlier version of this paper.
Translated from Problemy Peredachi Informatsii, 2021, Vol. 57, No. 2, pp. 3–35 https://doi.org/10.31857/S0555292321020017.
Appendix
We begin with the following central limit theorem due to Berry and Esseen (see [8]).
Theorem 14
(Berry–Esseen CLT). Fix \(n\in\mathbb{N}\). Let \(W_1,\ldots,W_n\) be independent random variables with finite third absolute moments. Then, for \(t\in\mathbb{R}\), we have
where \(\mathrm{Q}\) is the complementary Gaussian cumulative distribution function.
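The displayed inequality and the accompanying moment definitions did not survive extraction; a standard form of the bound for independent, not necessarily identically distributed summands (see [8]; the names \(\mu\), \(\sigma\), and \(T\) below are our notation, filled in from the classical statement rather than taken from the paper) reads:

```latex
\[
  \sup_{t\in\mathbb{R}}\;
  \left|\,\boldsymbol{\rm{P}}\!\left(\sum_{i=1}^{n} W_i > t\right)
  - \mathrm{Q}\!\left(\frac{t-\mu}{\sigma}\right)\right|
  \le \frac{6T}{\sigma^{3}},
\]
where
\[
  \mu = \sum_{i=1}^{n}\boldsymbol{\rm{E}}[W_i],\qquad
  \sigma^{2} = \sum_{i=1}^{n}{\rm{Var}}[W_i],\qquad
  T = \sum_{i=1}^{n}\boldsymbol{\rm{E}}\bigl[\,|W_i-\boldsymbol{\rm{E}}[W_i]|^{3}\bigr].
\]
```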
Proof of Lemma 1.
Fix a \(\theta\in\mathcal{T}^n\) and consider K independent copies of the random code as \(\left\{F_i,\Phi_i\right\}_{i=1}^K\). Using Hoeffding’s inequality, we write
Since \(e_{\boldsymbol{d}}(\psi)=\max\limits_{\theta\in\mathcal{T}^n}\boldsymbol{\rm{E}}[e_{\boldsymbol{d},\theta}(F,\Phi)]<\varepsilon\), we have that \(\boldsymbol{\rm{E}}[e_{\boldsymbol{d},\theta}(F,\Phi)] < \varepsilon\;\forall \theta \in\mathcal{T}^n\). Using this and \(\varepsilon'\) from (18), we have that for all \(\theta\in\mathcal{T}^n\),
and hence
Thus, we write
Thus, the event \(\Bigl\{\sum\limits_ie_{\boldsymbol{d},\theta}(F_i,\Phi_i)/K<\varepsilon'\;\forall\theta\in\mathcal{T}^n\Bigr\}\) has nonzero probability, and hence there exist K deterministic codes \(\left\{f_i,\varphi_i\right\}_{i=1}^K\) such that for all \(\theta\in\mathcal{T}^n\),
where \(\varepsilon'>\varepsilon+\sqrt{\log |\mathcal{T}^n|/2K}\). △
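The derandomization step above can be checked numerically. The sketch below is a minimal illustration with hypothetical values of \(|\mathcal{T}^n|\), \(\varepsilon\), and \(\varepsilon'\) (not the paper's construction): it computes the smallest K for which the union bound \(|\mathcal{T}^n|\exp(-2K(\varepsilon'-\varepsilon)^2)\) drops below one, so that K deterministic codes with error below \(\varepsilon'\) against every jammer state must exist.

```python
import math

def min_codes_K(log_T: float, eps: float, eps_prime: float) -> int:
    """Smallest K with |T^n| * exp(-2K(eps'-eps)^2) < 1 (Hoeffding + union bound)."""
    delta = eps_prime - eps
    assert delta > 0
    # Need K > log|T^n| / (2 delta^2), matching eps' > eps + sqrt(log|T^n| / 2K).
    return math.floor(log_T / (2 * delta ** 2)) + 1

# Hypothetical numbers: |T^n| = 3^100 jammer sequences, eps = 0.1, eps' = 0.15.
log_T = 100 * math.log(3)
K = min_codes_K(log_T, 0.1, 0.15)
log_failure_prob = log_T - 2 * K * 0.05 ** 2  # log of the union bound
print(K, log_failure_prob < 0)
```

With these placeholder values the required K grows like \(\log|\mathcal{T}^n|/(\varepsilon'-\varepsilon)^2\), i.e., linearly in the blocklength for a fixed gap.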
Proof of Theorem 9.
We weaken the bound in (19) by taking \(P_{X_a}(x_a)=\prod\limits_{i=1}^{d_n} P^*_{\mathbb{X}}(x_i)\) with \(P_{\mathbb{X}}^*\in\Pi_{\mathbb{X}}\). Let \(\mathcal{A}\), Z, K, and \(d_n\) be as defined in (24)–(27). Let \(U_i:=i_{\mathbb{X}^*;\mathbb{Y}_{T_{\theta}}}(X_{ai};Y_{ai})\), where \((X_{ai}, Y_{ai}) \sim P^*_{\mathbb{X}}\times\sum\limits_{\bar\theta\in\mathcal{T}}T_{\theta}(\bar\theta)P_{\mathbb{Y}|\mathbb{X},\Theta=\bar\theta}\) for all i. Since the channel is memoryless, \(i_{X^*;Y_{q^*}}(X_a;Y_a)=\sum\limits_{i=1}^{d_n}U_i\). Thus, taking \(\gamma'=\log(\sqrt{d_n}K)\), we can write \(\boldsymbol{\rm{P}}\bigl(i_{X^*;Y_{q^*}}(X_a;Y_a)\le\gamma'\bigr)=\boldsymbol{\rm{P}}\biggl(\,\sum\limits_{i=1}^{d_n}U_i\le\log (\sqrt{d_n}K)\!\biggr)\). From [10, Lemma 12.10], we have \(C\le\boldsymbol{\rm{E}}\bigl[i_{\mathbb{X};\mathbb{Y}_{T_{\theta}}}(X_{ai};Y_{ai})\bigr]=\boldsymbol{\rm{E}}[U_i]\) for all \(T_{\theta}\in\mathcal{P}_{d_n}(\mathcal{T})\), and hence \(d_nC\le\sum\limits_{i=1}^{d_n}\boldsymbol{\rm{E}}[U_i]\). Substituting for \(\log K\) from equation (27), we get
where the last equation follows from the triangle inequality. Using Chebyshev’s inequality and taking the supremum over \(\theta_a\), we get
where the last inequality in (33) follows from equation (25).
The bounds on the other two terms given as \(\boldsymbol{\rm{P}}(Z(X_a,\bar{X}_a,Y_a)=0,(X_a,Y_a)\in\mathcal{A} \,|\,\theta_a)\le 1/\sqrt{d_n}K\) and \(\boldsymbol{\rm{P}}(Z(X_a,\bar{x}_a,Y_a)=0, (X_a,Y_a)\in\mathcal{A} \,|\,\bar{X}_a=\bar{x}_a,\theta_a)\le (d_n+1)^{|\mathcal{X}|^2|\Theta||\mathcal{Y}|} \exp(-d_n\eta)\) follow from the proof of Theorem 4 in [11]. Using (33) and the above bounds, we get the required bound. △
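The Chebyshev step can be illustrated numerically. In the sketch below a hypothetical binary symmetric channel stands in for the arbitrarily varying channel, and all values (crossover p, blocklength d_n, rate) are placeholders, not quantities from the paper: when the rate is below capacity, the tail \(\boldsymbol{\rm{P}}(\sum_i U_i \le \log(\sqrt{d_n}K))\) is bounded by \({\rm{Var}}[\sum_i U_i]/(d_nC-\gamma')^2\).

```python
import math

def bsc_capacity_and_dispersion(p: float):
    """Capacity C = 1 - h(p) and info-density variance V for a BSC(p), in bits."""
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    C = 1 - h
    V = p * (1 - p) * math.log2((1 - p) / p) ** 2
    return C, V

p, d_n, rate = 0.11, 1000, 0.4                # hypothetical: rate < C ~ 0.5
C, V = bsc_capacity_and_dispersion(p)
gamma = 0.5 * math.log2(d_n) + rate * d_n     # log2(sqrt(d_n) * K), with K = 2^(rate*d_n)
cheb = d_n * V / (d_n * C - gamma) ** 2       # Chebyshev bound on the lower tail
print(round(C, 3), round(cheb, 3))
```

The bound decays like \(1/d_n\) for a fixed rate gap, which is what drives the vanishing error below the threshold.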
Proof of Theorem 10.
To derive the bound, we construct a random joint source-channel code \((F,\Phi)\) or equivalently a distribution \(\psi\) on the set of codes \(\{(f,\varphi)\mid f\colon\,\mathcal{S}^k\to\mathcal{X}^n,\:\varphi\colon\,\mathcal{Y}^n\to\mathcal{S}^k\}\). From Theorem 6, it suffices to choose the distributions \(P_X\) and \(P_{\widehat{S}}\). Let \(P^*_{\mathbb{X}}\) be a distribution from the set \(\Pi_{\mathbb{X}}\). Define \(P_X(x):=\prod\limits_{i=1}^nP^*_{\mathbb{X}}(x_i)\) for \(x\in\mathcal{X}^n\). Further, take \(P_{\widehat{S}}(\widehat{s})=\prod\limits_{i=1}^kP_{\widehat{\mathbb{S}}}^*(\widehat{s}_i)\), \(\widehat{s}\in\mathcal{S}^k\), where \(P_{\widehat{\mathbb{S}}}^*\) achieves the optimum in (2). Clearly, with this choice of distributions, we have that
Recall the error terms corresponding to the random code from Theorem 8. Writing the maximization over \(\theta_b\in\mathcal{T}^n\) as maximization over \(q\in\mathcal{P}(\mathcal{T}^n)\), we can write the bound as
where \(q(\theta)=\prod\limits_{i=1}^nq_i(\theta_i)\) with \(q_i\in\mathcal{P}(\mathcal{T})\), \(\theta\in\mathcal{T}^n\). Let \(h(X_b,Y_b,S):=\sum\limits_{j=1}^ni_{\mathbb{X}^*;\mathbb{Y}_{q_{\Theta}^*}}(X_{bj};Y_{bj})-\log(\overline\gamma/P_{\widehat{S}}(\mathcal{B}_{\boldsymbol{d}}(S)))\). Further, define the set \(\mathcal{D}\) as
where \(\bar{c}\) and c are constants defined in [9, Lemma 5]. Define the random variable \(W_\ell\) as
The expectation \(\boldsymbol{\rm{E}} [\exp(-|h(X_b,Y_b,S)|^+)]\) can be written as
where the first term in (35) follows by using the definition of the set \(\mathcal{D}\) and the second term follows by using \(\boldsymbol{\rm{E}} \bigl[\exp(-|h(X_b,Y_b,S)|^+)\mathbb{I}\{S \notin \mathcal{D}\}\bigr]\le\boldsymbol{\rm{P}}(S \notin \mathcal{D})\) and applying Lemma 5 from [9].
We define the following moments to be used in bounding the first term in (35) using the Berry–Esseen CLT:
Note that the moments are computed with respect to the distribution \(P_X\times\sum\limits_{\theta} q(\theta)P_{Y|X,\boldsymbol{\Theta}=\theta}\times P_S\). Next, we define the following set:
where \(t_{k,n}>0\) will be chosen later. For the sake of brevity, we define the term in the \(\exp\) as follows:
Thus, we can write (35) as \(\boldsymbol{\rm{E}} [\exp(-|g(X_b,Y_b,S)|^+)]\), which can be written as
where the first term in (39) follows by using the definition of the set \(\mathcal{H}\) and the second term follows by using \(\exp(-|\cdot|^+)\le 1\). Using the above results and (35), we get the following bound:
To upper bound the first term in (40), we compute the maximum of \(\exp(-|\Gamma_{n+k}(q)|^+)\) over all distributions \(q\in\mathcal{P}(\mathcal{T}^n)\). Since \(\exp(-|\cdot|^+)\) is decreasing in \(\Gamma_{n+k}(q)\), we get
To compute the minimum, we consider the following:
where the moments are with respect to \(P_{\mathbb{X}}^*\times\sum\limits_{\theta\in\mathcal{T}}q_i(\theta)P_{\mathbb{Y}|\mathbb{X},\Theta=\theta}\). Thus, the minimum is given as
where the maximum in the second term in (41) is restricted to \(\Pi_{\Theta}\) by using [8, Lemmas 63 and 64]. Since \(P_{\mathbb{X}}^*\in\Pi_{\mathbb{X}}\), \(q_{\Theta}^*\in\Pi_{\Theta}\), and \(|\Pi_{\Theta}|=1\), we have \(\max\limits_{q_i\in\Pi_{\Theta}}V_{n+k}(q)=nV_{\mathrm{C}}\). Thus, we have that
We choose \(t_{k,n}\) as \(t_{k,n}=(nC-kR(\boldsymbol{d})-\bar{c}\log k-\log \overline\gamma-c)/\sqrt{n V_{\mathrm{C}}+k V_{\mathrm{S}}(\boldsymbol{d})}\). Thus, we get \(\min\limits_{q\in\mathcal{P}(\mathcal{T}^n)}\Gamma_{n+k}(q)\ge \frac{1}{2} \log k+O(1)\). Substituting in (40), we get the following upper bound:
where (42) follows by taking \(\overline\gamma=(\log_{e}k)/2+1\) and bounding \(\boldsymbol{\rm{P}} ((X_b,Y_b,S) \notin \mathcal{H})\) using the Berry–Esseen CLT.
Recall that \(B_{n+k}(q)\) is given as \(B_{n+k}(q)=c_0A_{n+k}(q)/V_{n+k}^{3/2}(q)\). Assuming \(\min\limits_{\ell\in\{1,\ldots,n+k\}}{\rm{Var}}[W_\ell]\ne 0\), from the definitions of \(A_{n+k}(q)\) and \(V_{n+k}(q)\) we can bound \(B_{n+k}(q)\) as
From (22) and (23), we have that the right-hand side is finite for all q and independent of k and n. Thus, we have that \(\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)} B_{n+k}(q)\) is a finite constant. Substituting \(t_{k,n}\) from above, taking \(\max\limits_{q\in\mathcal{P}(\mathcal{T}^n)} B_{n+k}(q)\le B\) with \(B>0\) and using (42), we get the required bound. △
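The role of \(t_{k,n}\) in the resulting bound can be made concrete with a small numerical sketch. All constants below (\(C\), \(V_{\mathrm{C}}\), \(R(\boldsymbol{d})\), \(V_{\mathrm{S}}(\boldsymbol{d})\), \(\bar{c}\), \(c\)) are hypothetical placeholders, not values from the paper; the point is only that the achievable error behaves like \(\mathrm{Q}(t_{k,n})\) up to the stated logarithmic corrections.

```python
import math

def Q(t: float) -> float:
    """Complementary Gaussian CDF, Q(t) = P(N(0,1) > t)."""
    return 0.5 * math.erfc(t / math.sqrt(2))

def t_kn(n, k, C, V_C, R_d, V_S, c_bar, c, gamma_bar):
    """Second-order term t_{k,n} as chosen in the proof of Theorem 10."""
    num = n * C - k * R_d - c_bar * math.log(k) - math.log(gamma_bar) - c
    return num / math.sqrt(n * V_C + k * V_S)

# Hypothetical channel/source constants, for illustration only.
n, k = 2000, 1000
C, V_C, R_d, V_S = 0.5, 0.9, 0.95, 1.2
gamma_bar = math.log(k) / 2 + 1                 # the choice made below (42)
t = t_kn(n, k, C, V_C, R_d, V_S, c_bar=0.5, c=1.0, gamma_bar=gamma_bar)
print(t, Q(t))                                  # Q(t) is the leading error term
```

When \(nC-kR(\boldsymbol{d})\) scales like \(\sqrt{n V_{\mathrm{C}}+k V_{\mathrm{S}}(\boldsymbol{d})}\), the value \(\mathrm{Q}(t_{k,n})\) converges to a constant strictly between zero and one, matching the second-order regime of the main theorems.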
Proof of Theorem 12.
From (11), it suffices to construct q, \(P_{\bar{Y}_q}\), the random variable U, and \(\gamma\) to get a lower bound on \(\underline{\nu}(k,n)\). Take \(q(\theta)=q^*(\theta)=\prod\limits_{i=1}^n q^*_{\Theta}(\theta_i)\), where \(q^*_{\Theta}\in\Pi_{\Theta}\). Let \(\mathrm{U}\) be the number of types in \(\mathcal{P}_n(\mathcal{X})\), and let \(\mathcal{U}=\{1,\ldots,\mathrm{U}\}\) be the set of indices of these types. Thus, for a given sequence \(x\in\mathcal{X}^n\), U maps it to the index \(u\in\mathcal{U}\) of its type. Further, let \(P_{\overline{Y}_q|U}\) be defined as
where \(P_X(x)=\prod\limits_{i=1}^nT_x(x_i)\), \(x\in\mathcal{X}^n\), with \(T_x\in\mathcal{P}_n(\mathcal{X})\) being the type corresponding to the index u. Thus, we have that \(i_{X;\overline{Y}_q|U}(x;Y\,|\,u)=\sum\limits_{i=1}^ni_{\mathbb{X};\mathbb{Y}_{q^*}}(x_i;Y_i)\), where
and \(j_S(s,\boldsymbol{d})=\sum\limits_{j=1}^kj_{\mathrm{S}}(s_j,\boldsymbol{d})\).
Since q is taken as an i.i.d. distribution, effectively, we have a channel where
when the input is \(x=(x_1,\ldots,x_n)\). Thus, the left-hand side of (31) is a converse bound for a standard DMC without a jammer, with the channel given by the above averaged channel. Following the line of argument given in [9, Appendix C], we get the following inequality:
where the second term in the left-hand side is bounded below by zero, \(K_1\) and \(K_2\) are some constants and \(x^*=(x_1^*,\ldots,x_n^*)\) is a sequence such that its type \(T_{x^*}\) minimizes
where \(P_{\mathbb{X}}^*\in\Pi_{\mathbb{X}}\). Let
Define the following moments of the random variable \(W_\ell\):
From the Berry–Esseen CLT, we have
We also have the following inequalities from [9, Appendix C]:
where \(K_3>0\) is some constant. Further, from (22) and (23), we can show that \(A_{n+k}\) is bounded and hence \(B'_{n+k}\) is bounded by a constant \(B'>0\). Using (47) and (48), we get
Substituting the above in (46), taking \(\gamma=(|\mathcal{X}|-1/2)\log (n+1)\) and using (44), we get the required bound. △
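The choice \(\gamma=(|\mathcal{X}|-1/2)\log (n+1)\) appears tied to the polynomial growth of the number of types over \(\mathcal{X}\). The sketch below verifies the standard type-counting bound \(|\mathcal{P}_n(\mathcal{X})|=\binom{n+|\mathcal{X}|-1}{|\mathcal{X}|-1}\le(n+1)^{|\mathcal{X}|-1}\) used in such method-of-types arguments (a generic illustration, not code from the paper):

```python
import math

def num_types(n: int, alphabet_size: int) -> int:
    """Exact number of types (empirical distributions) of length-n sequences."""
    return math.comb(n + alphabet_size - 1, alphabet_size - 1)

for n in (10, 100, 1000):
    for m in (2, 3, 5):
        exact = num_types(n, m)
        # Polynomial bound used in method-of-types converses.
        assert exact <= (n + 1) ** (m - 1)
        print(n, m, exact, (n + 1) ** (m - 1))
```

Since the number of types is only polynomial in n while error probabilities decay exponentially, a logarithmic-in-n choice of \(\gamma\) absorbs the type-enumeration cost.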
Vora, A., Kulkarni, A. Minimax Theorems for Finite Blocklength Lossy Joint Source-Channel Coding over an Arbitrarily Varying Channel. Probl Inf Transm 57, 99–128 (2021). https://doi.org/10.1134/S0032946021020010