Universal Gaps for XOR Games from Estimates on Tensor Norm Ratios

Aubrun, Guillaume; Lami, Ludovico; Palazuelos, Carlos; Szarek, Stanisław J.; Winter, Andreas

doi:10.1007/s00220-020-03688-2

Open access
Published: 07 March 2020

Volume 375, pages 679–724, (2020)

Communications in Mathematical Physics

Abstract

We define and study XOR games in the framework of general probabilistic theories, which encompasses all physical models whose predictive power obeys minimal requirements. The bias of an XOR game under local or global strategies is shown to be given by a certain injective or projective tensor norm, respectively. The intrinsic (i.e. model-independent) advantage of global over local strategies is thus connected to a universal function r(n, m) called ‘projective–injective ratio’. This is defined as the minimal constant $\rho $ such that $\Vert \cdot \Vert _{X\otimes _\pi Y}\leqslant \rho \,\Vert \cdot \Vert _{X\otimes _\varepsilon Y}$ holds for all Banach spaces of dimensions $\dim X=n$ and $\dim Y=m$, where $X\otimes _\pi Y$ and $X \otimes _\varepsilon Y$ are the projective and injective tensor products. By requiring that $X=Y$, one obtains a symmetrised version of the above ratio, denoted by $r_s(n)$. We prove that $r(n,m)\geqslant 19/18$ for all $n,m\geqslant 2$, implying that injective and projective tensor products are never isometric. We then study the asymptotic behaviour of r(n, m) and $r_s(n)$, showing that, up to log factors: $r_s(n)$ is of the order $\sqrt{n}$ (which is sharp); r(n, n) is at least of the order $n^{1/6}$; and r(n, m) grows at least as $\min \{n,m\}^{1/8}$. These results constitute our main contribution to the theory of tensor norms. In our proof, a crucial role is played by an ‘$\ell _1/\ell _2/\ell _{\infty }$ trichotomy theorem’ based on ideas by Pisier, Rudelson, Szarek, and Tomczak-Jaegermann. The main operational consequence we draw is that there is a universal gap between local and global strategies in general XOR games, and that this grows as a power of the minimal local dimension. In the quantum case, we are able to determine this gap up to universal constants. As a corollary, we obtain an improved bound on the scaling of the maximal quantum data hiding efficiency against local measurements.

1 Introduction

One of the most prominent conceptual contributions of the celebrated 1964 paper by Bell [1] is to point out that the implications of the quantum mechanical predictions extend far beyond the very same formalism that is used to deduce them, and shed light on some of the deepest secrets of Nature. The scenario considered by Bell features two distant parties who share a quantum entangled state and make local quantum measurements on it. The main contribution of [1] is to show that the resulting correlations cannot be explained by any local ‘hidden variable’ theory. Here we want to stress that the argument does not depend on the correctness of quantum mechanics as the ultimate theory of Nature, but rather only on the accuracy of its predictions concerning the above experimental setting. In other words, any alternative theory that leads to the same predictions in the same setting will also be subjected to Bell’s theorem. In this spirit, we deem it important to understand what features of information processing in composite systems are truly intrinsic, meaning that they are common to all conceivable physical theories.

A suitable way to formalise the concept of a physical theory in this context is provided by the mathematical machinery of general probabilistic theories (GPTs) [2,3,4]. It is sometimes convenient to think of GPTs as generalisations of finite-dimensional quantum mechanics, where the set of unnormalised states is not assumed to be the cone of positive semidefinite matrices, but it is rather taken to be an arbitrary convex cone in a finite-dimensional real vector space. As is well known, a GPT makes the host vector space a Banach space in a canonical way, by equipping it with a so-called base norm.

The starting point of our investigation is the study of XOR games in the rich landscape of GPTs. We remind the reader that any XOR game can be equivalently cast in terms of state discrimination queries subjected to locality constraints, so that our analysis applies equally well to these problems. XOR games are arguably the simplest examples of two-prover one-round games and feature two cooperating players, Alice and Bob, and a third party known as the referee. The referee asks the players some ‘questions’ by sending them states of some physical system modelled by a GPT. The correctness of the one-bit answers the players provide upon measuring the state depends only on their parity. According to whether Alice and Bob are allowed to carry out product or global measurements, one talks about local or global strategies (see the two paragraphs right after Definition 3 for the precise definition of these strategies). In general, the winning probability can be significantly larger in the latter than in the former case.

While the quantitative details of this phenomenon will in general depend on the particular physical system modelling the questions, our work is instead motivated by the wish to understand which behaviours are universal, and thus pertain to the intrinsic nature of XOR games. This line of investigation brings us to develop an extensive connection with the theory of tensor norms on finite-dimensional Banach spaces, which has already proved to be instrumental in the study of classical and quantum XOR games [5]. While in these more standard settings one deals with specific examples of tensor norms, the analysis of games played over arbitrary GPT models requires a systematic understanding of general tensor norms.

The main problem we investigate here asks for the maximal gap that can be guaranteed to exist between the winning probabilities associated with global and local strategies in XOR games played over GPTs of fixed local dimensions n and m. In analogy with the classical case, we show that such winning probabilities are given by simple expressions involving respectively the projective and injective tensor norms induced by the local GPTs through their native Banach space structures. Comparing them in a model-independent fashion prompts us to investigate the least constant of domination of the injective over the projective tensor norm over all pairs of normed spaces of fixed dimensions. We call this function ‘projective/injective ratio’, or ‘$\pi /\varepsilon $ ratio’ for short. When seen from the point of view of pure mathematics, this universal function of (n, m) encodes some information regarding Grothendieck’s theory of tensor products of Banach spaces. At the same time, the operational interpretation we construct here guarantees that the same object captures some intrinsic feature of general XOR games.

Our main result is that the $\pi /\varepsilon $ ratio associated with two n-dimensional Banach spaces scales at least as $n^{1/6}$ (up to logarithmic factors), implying that global strategies for XOR games are intrinsically much more effective than local ones in a precise asymptotic sense. We ask the question whether this scaling can be improved up to $n^{1/2}$, and bolster this hypothesis by showing that it holds true (again, up to log factors) when two copies of the same space are considered. Interestingly, this question is intimately connected to the problem of estimating the radius of the weak Banach–Mazur compactum, which has also been conjectured to be of order $n^{1/2}$ [6, 7]. We also consider the problem of establishing dimension-independent lower bounds for the $\pi /\varepsilon $ ratio. We prove that for all pairs of Banach spaces X, Y of dimension at least 2 (and possibly infinite), there is a nonzero tensor in $X\otimes Y$ whose projective norm is at least 19/18 times its injective norm. In particular, these norms are always different, which seems to be a new observation. This should be compared with the famous construction by Pisier [8] of an infinite-dimensional Banach space X such that the injective and projective norms on $X \otimes X$ are equivalent. Finally, we solve the problem of computing the $\pi /\varepsilon $ ratio for some specific examples of physically relevant Banach spaces. Most notably, we establish that it is of the order $\min \{n,m\}^{3/2}$ for $S_1^{n,\mathrm {sa}}\otimes S_1^{m,\mathrm {sa}}$, where $S_1^{k,\mathrm {sa}}$ stands for the space of $k\times k$ Hermitian matrices endowed with the trace norm. The importance of this special case stems from the fact that $S_1^{k,\mathrm {sa}}$ is the natural Banach space associated with a k-level quantum system.

The rest of the paper is structured as follows. Throughout this section, we provide very brief introductions to the GPT formalism (Sect. 1.1), to the theory of tensor norms (Sect. 1.2), and to XOR games (Sect. 1.3). In Sect. 2, we state our main results and broadly discuss some of the proof techniques we developed. Section 3 presents some general properties of the $\pi /\varepsilon $ ratio, connecting it with other concepts in functional analysis. There, we find the universal lower bound of 19/18, and solve the quantum mechanical case up to multiplicative constants. Section 4 deals with the problem of determining the asymptotic scaling of the $\pi /\varepsilon $ ratio, either for two copies of the same space, or in the fully general setting of two normed spaces of different dimensions. In order to improve the accessibility of the paper we added several Appendices, where some extra information can be found. “Appendix A” investigates how our operational interpretation of the injective tensor norm is affected by the introduction of a bounded amount of two-way communication, while “Appendix B” provides a proof of the useful fact that any normed space is 2-isomorphic to a base norm space. Finally, “Appendix C” gathers the functional-analytic background that is used throughout the paper and which may be unfamiliar to a non-specialist reader. In particular, we sketch there proofs of various statements which can be deduced from known results, but some elucidation is needed for non-experts.

1.1 General probabilistic theories

The origins of the formalism of general probabilistic theories (GPTs) lie in the attempt to axiomatise quantum mechanics, rebuilding it upon operationally motivated postulates rather than upon more evasive concepts such as ‘wave function’ and ‘microscopic system’. Although these ideas can be found already in some antecedent works [9, 10], the first major contributions were made by the ‘Marburg school’ led by Ludwig [11, 12]. This resulted in an intense debate around the nascent GPT formalism, which took place in a series of papers published in Communications in Mathematical Physics [13,14,15,16,17,18]. For an account of the early development of the field, we refer the interested reader to [2]. A more modern point of view can be found in [4, Chapter 1]. This foundationally motivated interest has seen a revival in the last two decades, with much effort being focused on attempts to ‘reconstruct’ quantum mechanics starting from first principles [19,20,21,22,23]. At the same time, GPTs have become central to quantum information science, as they provide indispensable tools to analyse information processing beyond classical theories, see for instance [24,25,26,27,28,29,30,31,32]. An introduction to the GPT framework that will suffice for our purposes can be found in [4, Chapter 2] (see also [32, Section 2]). Throughout this subsection, we limit ourselves to recalling the basics and to fixing the notation.

Definition 1

A general probabilistic theory is a triple (V, C, u), where: (i) V is a finite-dimensional real vector space; (ii) $C\subset V$ is a closed, convex, salient and generating cone; and (iii) u, called the order unit or the unit effect, is a functional in the interior of the dual cone $C^*:=\{x^* \in V^*:\, x^*(x)\geqslant 0\ \forall \, x\in C \}$. GPTs will be denoted by capital letters such as A, B etc., which – with a slight abuse of notation – identify also the underlying physical systems. We call $\dim V$ the dimension of the GPT.

On the mathematical level, we can think of (V, C) as an ordered vector space, the ordering being given by $x\leqslant y\Leftrightarrow y-x\in C$. Also the dual vector space can be thought of as ordered by the dual cone $C^*$. In this language, the functional u is said to be strictly positive, since $x\geqslant 0$ and $u(x)=0$ implies $x=0$. The states of the physical system modelled by (V, C, u) are represented by vectors in $C\cap u^{-1}(1)=:\Omega $. The compact convex set $\Omega $ is called the state space of the GPT, and accordingly we will sometimes refer to C as the cone of unnormalised states. Convexity here plays an operationally relevant role, as the process of preparing a system in a state $\omega _0$ with probability p and $\omega _1$ with probability $1-p$, and later forgetting the value of the binary random variable associated with its preparation, leaves the system in the state $p\omega _0+(1-p)\omega _1$.

The GPT formalism allows us to make probabilistic predictions of the outcomes of measurements performed on a certain state. In this context, a measurement is a finite collection $(e_i)_{i\in I}$ of functionals in the order interval [0, u] (generically called effects) that add up to the order unit, i.e. such that $\sum _i e_i=u$ (normalisation). The probability of obtaining the outcome i upon measuring the state $\omega \in \Omega $ is evaluated as $e_i(\omega )$. Throughout this paper, we will always make the so-called no-restriction hypothesis, which guarantees that all normalised collections of effects identify a physically legitimate measurement [33]. We denote by $\mathbf{M }$ the set of all measurements associated with a certain GPT, adding a subscript to identify it if needed.

Equipping an ordered vector space (V, C) with a GPT structure entails selecting a special positive functional on it, i.e. the unit effect u. In turn, this special functional can be used to define a norm on the dual space $V^*$. By definition, the unit ball of this norm is the order interval $[-u,u]$, and for $x^*\in V^*$ one has

$$\begin{aligned} \Vert x^*\Vert :=\min \left\{ t\geqslant 0:\ x^*\in t[-u,u] \right\} . \end{aligned}$$

(1)

This choice makes $V^*$ a so-called order unit space [34, 35]. The corresponding Banach space structure induced on V is that of a base norm space [36]. The norm on V is given by either of the two expressions

$$\begin{aligned} \Vert x\Vert&= \max _{x^*\in [-u,u]} |x^*(x)| \end{aligned}$$

(2)

$$\begin{aligned}&= \min \left\{ u(x_+ + x_-):\ x=x_+-x_-,\ x_\pm \geqslant 0 \right\} . \end{aligned}$$

(3)

The equivalence is an easy consequence of the strong duality of conic optimisation programs [37]; alternatively, it can be established by checking that the convex body $K :={{\,\mathrm{\mathrm {conv}}\,}}(\Omega \cup -\Omega ) \subset V$ is the unit ball for the base norm, while $\Vert x^*\Vert = \sup \{ |x^*(x)| \, : \, x \in K \}$ for any $x^* \in V^*$.

This more or less exhausts the description of single systems within the GPT framework. Note that dynamics is not part of this very basic picture, which is limited to so-called ‘prepare-and-measure procedures’. Time evolution can be accounted for within this formalism, but this goes beyond the scope of the present paper. What we will need here is instead the extension of the formalism to the case of composite systems. We will be mainly concerned with the simplest case of a bipartite system formed by two subsystems A and B described by local GPTs $(V_A, C_A, u_A)$ and $(V_B, C_B, u_B)$. The theory modelling the composite AB will be denoted by $AB=(V_{AB}, C_{AB}, u_{AB})$. Under very reasonable assumptions [38, 39], the main one being that bipartite states are always uniquely determined by the statistics produced by local measurements (a principle that goes under the name of local tomography), one can identify $V_{AB}$ with the tensor product of the local vector spaces, i.e. $V_{AB}\simeq V_A\otimes V_B$. When this identification is made, one obtains also $u_{AB}=u_A \otimes u_B$. To fully specify the joint system, one still needs to identify the cone of unnormalised states $C_{AB}$. It turns out that such a choice cannot be made a priori on the ground of some indisputable axiom, but has to be based on some information regarding the actual physics of the system. However, the operational interpretation of the theory puts some nontrivial constraints on $C_{AB}$, in the form of a lower and upper bound with respect to the inclusion relation. Namely, we have

$$\begin{aligned} C_A \underset{\text {min}}{\otimes }C_B \subseteq C_{AB}\subseteq C_A \underset{\text {max} }{\otimes } C_B\, , \end{aligned}$$

(4)

where

$$\begin{aligned} C_A \underset{\text {min}}{\otimes }C_B&:={{\,\mathrm{\mathrm {conv}}\,}}\left( C_A \otimes C_B \right) , \end{aligned}$$

(5)

$$\begin{aligned} C_A \underset{\text {max} }{\otimes } C_B&:=\left( C_A^*\underset{\text {min}}{\otimes }C_B^*\right) ^* . \end{aligned}$$

(6)

The two constructions (5) and (6) are called minimal and maximal tensor product, respectively. In (4), the lower bound comes from the fact that any tensor product of local states must represent a valid state, while – dually – the fact that any tensor product of local effects must be an effect of the joint system leads to the upper bound. In (5) we used the notation $C_A\otimes C_B :=\left\{ x\otimes y:\, x\in C_A,\, y\in C_B \right\} $. In what follows, we will call admissible any composite AB whose associated cone $C_{AB}$ satisfies (4). Also, we will denote by $A\underset{\text {min}}{\otimes }B$ and $A \underset{\text {max} }{\otimes } B$ those corresponding to the choices (5) and (6) for $C_{AB}$.

We conclude this brief presentation of the GPT formalism by discussing the two physically most relevant examples, i.e. classical probability theory and quantum mechanics. Classical probability theory can be viewed as the GPT

$$\begin{aligned} \mathrm {Cl}_d :=\left( {\mathbf {R}}^d, {\mathbf {R}}^d_+, u \right) , \end{aligned}$$

(7)

where ${\mathbf {R}}_+^d$ is the cone of entrywise non-negative vectors, and $u(x):=\sum _{i=1}^d x_i$ for all $x\in {\mathbf {R}}^d$. The induced base norm coincides with the $\ell _1$-norm $\Vert x\Vert _{\ell _1}:=\sum _{i=1}^d |x_i|$. Composing classical systems is easy, for when either $C_A$ or $C_B$ is simplicial (i.e. a linear image of ${\mathbf {R}}_+^d$) minimal and maximal tensor product coincide.
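The coincidence of the classical base norm with the $\ell _1$-norm can be checked numerically. The following Python sketch is only an illustration: the function names are hypothetical, and it uses the observation that for $\mathrm {Cl}_d$ the order interval $[-u,u]$ is the cube $[-1,1]^d$, whose vertices are the sign vectors. It evaluates (2) by enumerating those vertices and evaluates the objective of (3) at the positive and negative parts of x; since the latter is a feasible decomposition and the two values agree, it is optimal.

```python
import itertools
import numpy as np

def base_norm_dual(x):
    """Formula (2) for Cl_d: maximise |x*(x)| over [-u, u] = [-1, 1]^d.

    A linear functional attains its maximum over a cube at a vertex,
    so it suffices to enumerate all sign vectors.
    """
    return max(abs(float(np.dot(s, x)))
               for s in itertools.product([-1.0, 1.0], repeat=len(x)))

def base_norm_decomposition(x):
    """Objective of formula (3) at the positive and negative parts of x."""
    x_plus = np.maximum(x, 0.0)    # entrywise positive part, lies in R^d_+
    x_minus = np.maximum(-x, 0.0)  # entrywise negative part, lies in R^d_+
    return float(np.sum(x_plus + x_minus))  # u(x_+ + x_-)

x = np.array([0.5, -0.25, 0.0, -1.0])
norms = (base_norm_dual(x), base_norm_decomposition(x), float(np.sum(np.abs(x))))
print(norms)
```

The brute-force enumeration has cost $2^d$ and is meant for small d only.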

An n-level quantum mechanical system is modelled by the GPT

$$\begin{aligned} \mathrm {QM}_{n} :=\left( {\mathsf {M}}_{n}^{\mathrm {sa}}, \mathrm {PSD}_{n}, {{\,\mathrm{tr}\,}}\right) , \end{aligned}$$

(8)

where ${\mathsf {M}}_{n}^{\mathrm {sa}}$ is the space of $n\times n$ Hermitian matrices, $\text {PSD}_n$ the cone of positive semidefinite matrices, and ${{\,\mathrm{tr}\,}}$ represents the trace functional. The quantum mechanical base norm is the appropriate non-commutative generalisation of the $\ell _1$-norm, i.e. the trace norm $\Vert X\Vert _1:={{\,\mathrm{tr}\,}}\sqrt{X^\dag X}$. The base norm space of $n\times n$ Hermitian matrices endowed with the trace norm will be denoted by $S_1^{n,\mathrm {sa}}$. In contrast with the classical case, for quantum mechanics composition rules become an issue. In fact, the standard quantum mechanical composition rule dictates that if $A=\text {QM}_{n}$ and $B=\text {QM}_m$ then $AB=\text {QM}_{nm}$. The corresponding cone $C_{AB}$ is well known to make both inclusions in (4) strict: provided $n, m \geqslant 2$, we have

$$\begin{aligned} \mathrm {PSD}_n \underset{\text {min}}{\otimes }\mathrm {PSD}_m \subsetneq \mathrm {PSD}_{nm} \subsetneq \mathrm {PSD}_n \underset{\text {max} }{\otimes } \mathrm {PSD}_m. \end{aligned}$$

(9)

Indeed, the left member of (9) is the cone of separable operators, and thus the strictness of the left inclusion reflects the existence of entanglement. Since the positive semidefinite cone is selfdual, it follows by duality that the right inclusion is also strict. Note that the right member of (9) is the cone of the so-called block-positive operators, which can serve as entanglement witnesses.
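Both strict inclusions in (9) can be illustrated for $n=m=2$ with a short numerical sketch. The helper names below are hypothetical, and the sketch relies on the standard fact (not stated above) that every separable operator has a positive semidefinite partial transpose. The maximally entangled state is positive semidefinite but its partial transpose is not, so it is not separable; conversely, that partial transpose is not positive semidefinite, yet it takes non-negative values on product vectors. The latter is only sampled here on random product vectors, not proved.

```python
import numpy as np

# Maximally entangled two-qubit state |phi> = (|00> + |11>)/sqrt(2).
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
bell = np.outer(phi, phi)  # rank-one projector, an element of PSD_4

def partial_transpose(op, n=2, m=2):
    """Transpose the second tensor factor of an (n*m) x (n*m) matrix."""
    return op.reshape(n, m, n, m).transpose(0, 3, 2, 1).reshape(n * m, n * m)

witness = partial_transpose(bell)  # equals half the swap operator

rng = np.random.default_rng(0)

def random_product_vector(n=2, m=2):
    """Random complex product vector x (x) y, with Kronecker ordering."""
    x = rng.normal(size=n) + 1j * rng.normal(size=n)
    y = rng.normal(size=m) + 1j * rng.normal(size=m)
    return np.kron(x, y)

min_eig_bell = np.linalg.eigvalsh(bell).min()        # about 0: bell is PSD
min_eig_witness = np.linalg.eigvalsh(witness).min()  # about -1/2: not PSD
min_product_value = min(
    float(np.real(np.vdot(v, witness @ v)))
    for v in (random_product_vector() for _ in range(1000))
)
print(min_eig_bell, min_eig_witness, min_product_value)
```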

1.2 Tensor norms

We start by recalling the basic theory of tensor products of normed spaces. In what follows $B_X:=\left\{ x\in X:\ \Vert x\Vert \leqslant 1\right\} $ will denote the unit ball of a Banach space X. There are at least two canonical ways in which one can construct a norm on a generic tensor product $X\otimes Y$ of finite-dimensional real Banach spaces [40,41,42]. The injective norm of a tensor $z\in X\otimes Y$ is defined by the expression

$$\begin{aligned} \Vert z\Vert _{X \otimes _\varepsilon Y} :=\max \left\{ (x^*\otimes y^*)(z):\ x^*\in B_{X^*},\ y^*\in B_{Y^*} \right\} , \end{aligned}$$

(10)

while its projective norm is given by

$$\begin{aligned} \Vert z\Vert _{X \otimes _\pi Y} :=\min \left\{ \sum \nolimits _i \Vert x_i\Vert \Vert y_i\Vert :\ z=\sum \nolimits _i x_i\otimes y_i \right\} . \end{aligned}$$

(11)
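As a concrete illustration of (10) and (11), the following Python sketch evaluates both norms on $\ell _1^n\otimes \ell _1^m$, identifying a tensor with an $n\times m$ real matrix. It rests on two standard facts that are assumptions of the sketch: the dual unit balls are cubes, so the maximum in (10) is attained at sign vectors, and $\ell _1^n\otimes _\pi \ell _1^m$ is isometric to $\ell _1^{nm}$, so (11) reduces to the sum of the absolute values of the entries. The function names are hypothetical.

```python
import itertools
import numpy as np

def injective_norm_l1_l1(z):
    """Norm (10) on l_1^n (x) l_1^m: maximise s^T z t over sign vectors s, t."""
    n, m = z.shape
    return max(
        float(np.array(s) @ z @ np.array(t))
        for s in itertools.product([-1.0, 1.0], repeat=n)
        for t in itertools.product([-1.0, 1.0], repeat=m)
    )

def projective_norm_l1_l1(z):
    """Norm (11) on l_1^n (x) l_1^m: entrywise l_1 norm of the matrix."""
    return float(np.abs(z).sum())

# The 2 x 2 Hadamard-type matrix, viewed as a tensor in l_1^2 (x) l_1^2.
z = np.array([[1.0, 1.0],
              [1.0, -1.0]])
inj = injective_norm_l1_l1(z)
proj = projective_norm_l1_l1(z)
print(inj, proj, proj / inj)
```

For this tensor the projective norm is twice the injective one. The enumeration over sign vectors has cost $2^{n+m}$ and is only practical for small dimensions.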

In Sect. 1.1, we learnt that the vector space associated with a GPT carries a natural norm, i.e. the base norm given by (3). Since a joint system AB lives on the tensor product $V_A\otimes V_B$ of the local vector spaces, it is natural to ask whether either of the above tensor norms admits an operational interpretation in this context. Indeed, it turns out that [32, Proposition 22]

$$\begin{aligned} \Vert \cdot \Vert _{AB}\leqslant \Vert \cdot \Vert _{A\scriptstyle \underset{\text {min} }{\otimes }B} = \Vert \cdot \Vert _{V_A \otimes _\pi V_B} \end{aligned}$$

(12)

for all admissible composites AB. The last equality tells us that the projective norm corresponds to the base norm associated with the minimal tensor product of the two theories.

One may thus be led to conjecture that an analogous identity exists between $\Vert \cdot \Vert _{V_A \otimes _\varepsilon V_B}$ and $\Vert \cdot \Vert _{A\scriptstyle \underset{\text {max} }{\otimes }B}$, but the example of two classical probability theories reveals that this is not the case. We will find an adequate operational interpretation for the injective tensor norm in the forthcoming Sect. 1.3.

The most elementary property of injective and projective norms is perhaps the inequality

$$\begin{aligned} \Vert \cdot \Vert _{X\otimes _\varepsilon Y} \leqslant \Vert \cdot \Vert _{X \otimes _\pi Y}\, , \end{aligned}$$

(13)

valid for all X, Y. Moreover, since the space $X\otimes Y$ is of finite dimension, these two norms will always be equivalent, i.e. there will exist a constant $1\leqslant C<\infty $ such that

$$\begin{aligned} \Vert \cdot \Vert _{X\otimes _\varepsilon Y} \leqslant \Vert \cdot \Vert _{X \otimes _\pi Y} \leqslant C \Vert \cdot \Vert _{X\otimes _\varepsilon Y}\, . \end{aligned}$$

(14)

Denote by $\rho (X,Y)$ the smallest constant C satisfying this inequality. It is straightforward to verify that it is formally given by the following optimisation:

$$\begin{aligned} \rho (X,Y) :=\sup _{0\ne z\in X\otimes Y} \frac{\Vert z\Vert _{X\otimes _\pi Y}}{\Vert z\Vert _{X\otimes _\varepsilon Y}}\, . \end{aligned}$$

(15)

For reasons that will soon become clear, in this paper we are interested in studying the range of values of the function $\rho (X,Y)$ across all pairs of spaces of fixed dimensions.

Definition 2

The projective/injective ratio, or $\pi /\varepsilon $ ratio for short, is the following universal function over pairs of integers $n,m \geqslant 2$:

$$\begin{aligned} r(n,m) :=\inf _{\begin{array}{c} \dim X =n \\ \dim Y =m \end{array}} \rho (X,Y)\, , \end{aligned}$$

(16)

where the optimisation is understood to be over all pairs of finite-dimensional Banach spaces X, Y of fixed dimensions n, m. A slight modification of the above function (16) yields the symmetric projective/injective ratio:

$$\begin{aligned} r_s(n) :=\inf _{\dim X=n} \rho (X,X)\, , \end{aligned}$$

(17)

where $n\geqslant 2$ and the infimum is taken over all Banach spaces of dimension n.

One could equally well investigate analogous quantities where the infimum in the above optimisations is replaced by a supremum; however, it turns out that these can be evaluated exactly. In fact, it has been shown that [32, Proposition 21]

$$\begin{aligned} R(n,m) :=\sup _{\begin{array}{c} \dim X=n \\ \dim Y=m \end{array}} \rho (X,Y) = \min \{n,m\}\, . \end{aligned}$$

(18)
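The value $\min \{n,m\}$ in (18) can be seen at work in a simple example with $n=m$. The sketch below, whose function names are hypothetical, takes $X=\ell _1^n$ and $Y=\ell _\infty ^n$ and writes a tensor as a matrix with rows $z_i\in Y$. It assumes the standard identifications $\Vert z\Vert _{\ell _1^n\otimes _\pi Y}=\sum _i \Vert z_i\Vert _Y$ and $\Vert z\Vert _{\ell _1^n\otimes _\varepsilon Y}=\max _{s\in \{\pm 1\}^n} \Vert \sum _i s_i z_i\Vert _Y$. For the identity tensor the ratio of the two norms equals n, which shows that the supremum in (18) is at least n in this case.

```python
import itertools
import numpy as np

def projective_norm_l1_linf(z):
    """Sum over rows z_i of the sup-norm ||z_i||_inf."""
    return float(np.abs(z).max(axis=1).sum())

def injective_norm_l1_linf(z):
    """Maximum over sign vectors s of || sum_i s_i z_i ||_inf."""
    n = z.shape[0]
    return max(
        float(np.abs(np.array(s) @ z).max())
        for s in itertools.product([-1.0, 1.0], repeat=n)
    )

n = 4
identity_tensor = np.eye(n)  # sum_i e_i (x) e_i
ratio = projective_norm_l1_linf(identity_tensor) / injective_norm_l1_linf(identity_tensor)
print(ratio)
```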

In light of this, in the rest of the paper we shall be concerned with the $\pi /\varepsilon $ ratios as constructed in Definition 2. By considering the examples $X=\ell _1^n$ and $Y=\ell _2^m$ in (16) (and assuming without loss of generality that $n\leqslant m$), one can see that

$$\begin{aligned} 1\leqslant r(n,m) \leqslant \sqrt{\min \{n,m\}} \qquad \forall \ n,m\, . \end{aligned}$$

(19)

For an explicit proof, see the discussion preceding (59). To upper bound the symmetrised ratio one can consider two copies of $\ell _1^n$, which yields the slightly worse estimate [32, Example 29] (we compute a sharper upper bound on $\rho (\ell _1^n,\ell _1^n)$, which is equivalent to $\sqrt{\pi /2}\sqrt{n}$ as n tends to infinity, in Proposition 14)

$$\begin{aligned} 1\leqslant r_s(n) \leqslant \rho (\ell _1^n,\ell _1^n) \leqslant \sqrt{2n} \qquad \forall \ n\, . \end{aligned}$$

(20)
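The upper bound in (19) can be sanity-checked numerically for $X=\ell _1^n$ and $Y=\ell _2^m$ with $n\leqslant m$. The following sketch is not a proof; its function names are hypothetical, and it assumes the same row-wise identifications as for any space of the form $\ell _1^n\otimes Y$: the projective norm is $\sum _i \Vert z_i\Vert _2$ and the injective norm is $\max _{s\in \{\pm 1\}^n}\Vert \sum _i s_i z_i\Vert _2$, where $z_i$ are the rows of z. A tensor with orthonormal rows attains the ratio $\sqrt{n}$, and randomly sampled tensors do not exceed it.

```python
import itertools
import numpy as np

def projective_norm_l1_l2(z):
    """Sum over rows z_i of the Euclidean norm ||z_i||_2."""
    return float(np.linalg.norm(z, axis=1).sum())

def injective_norm_l1_l2(z):
    """Maximum over sign vectors s of || sum_i s_i z_i ||_2."""
    n = z.shape[0]
    return max(
        float(np.linalg.norm(np.array(s) @ z))
        for s in itertools.product([-1.0, 1.0], repeat=n)
    )

def norm_ratio(z):
    return projective_norm_l1_l2(z) / injective_norm_l1_l2(z)

n, m = 3, 5
z_orth = np.eye(n, m)            # rows are orthonormal vectors of R^m
ratio_orth = norm_ratio(z_orth)  # compare with sqrt(n)

rng = np.random.default_rng(1)
max_random_ratio = max(norm_ratio(rng.normal(size=(n, m))) for _ in range(200))
print(ratio_orth, max_random_ratio, np.sqrt(n))
```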

Note that although $r(n,n)\leqslant r_s(n)$ for all n, it may conceivably happen that $r(n,n) < r_s(n)$. In other words, it is possible that the infimum in (16) is not achieved on two copies of the same space even when $n = m$. However, we do not know this for a fact, even when $n=2$ [cf. (77), (78)].

The above inequalities exhaust the elementary properties of the $\pi /\varepsilon $ ratios, and leave open many interesting questions, whose thorough investigation constitutes our main contribution. For a summary of the results we obtain on these quantities, we refer the reader to Sect. 2.

1.3 XOR games

A simple but extremely useful setting where different physical models can be studied and compared from the point of view of information processing is that defined by XOR games. In these games, a referee interacts with two players Alice and Bob, who can cooperate with each other in order to maximise their winning probability. In the classical setting, the referee chooses a pair of questions according to a publicly known distribution and sends one question to each player. Then, the players are requested to provide a one-bit answer each, and the winning condition of the game, for a given pair of questions, only depends on the parity of the answers. In the basic local setting, the players can agree in advance on a strategy for their answers but they are not allowed to communicate with each other once the game has started.

These games are arguably central in theoretical computer science, mainly because of their simplicity and broad applicability to different topics such as interactive proof systems, hardness of approximation, and the PCP theorem. In addition, XOR games have played a major role in quantum information theory since they were first considered in [43]. In fact, these games had already been implicitly considered in the context of the study of quantum nonlocality [1, 44], by means of their equivalent formulation in terms of correlation Bell inequalities. Their systematic study was initiated by Tsirelson [45]. Far from being purely theoretical objects, in recent years these games have been crucial in the development of device-independent quantum cryptography and random number generators.

Motivated by their relevance for theory and applications, and drawing from previous works that put forth suitable quantum generalisations [46, 47], in this paper we introduce XOR games in the context of GPTs. In this more general setting, the two players’ system will be described by some bipartite GPT $AB=(V_A\otimes V_B, C_{AB}, u_A\otimes u_B)$. The referee samples the questions from a finite alphabet I, the probability of drawing i being denoted by $p_i$. The answers are represented by a collection of bits $(c_{i})_{i\in I}\in \{0,1\}^{|I|}$, while the questions are described by states $\omega _i\in C_{AB}\subset V_A\otimes V_B$. Upon being asked the question $\omega _i$, the players output answers $a\in \{0,1\}$ and $b\in \{0,1\}$, respectively, and the winning condition takes the form $a\oplus b=c_i$. The players’ behaviour can be modelled by a suitable measurement $M=(g_{ab})_{ab \in \{00,01,10,11\} }\in \mathbf{M }_{AB}$ over AB, with $g_{ab}(\omega _i)$ representing the probability that the answers a, b are given when the question $\omega _i$ has been asked. We can then formalise the notion of XOR game as follows.

Definition 3

An XOR game G is a quadruple $(AB, \omega , p, c)$, where: (i) AB is a bipartite GPT; (ii) $\omega = (\omega _i)_{i\in I}$ is a discrete collection of states over AB; (iii) p is a probability distribution over the set I; and (iv) $c=(c_i)_{i\in I}$ is a set of bits. A strategy for the players is a measurement $M=(g_{ab})_{ab \in \{00,01,10,11\}}$ over AB.

Before we delve into the study of XOR games over GPTs, let us point out some caveats in the terminology. According to the standard conventions, a classical XOR game is more than an XOR game played over the composite $AB=\mathrm {Cl}_{nm}$ formed by two classical GPTs $A=\mathrm {Cl}_{n}$ and $B=\mathrm {Cl}_{m}$ defined by (7). In fact, it is usually understood that in this case the questions are taken from the standard basis of $V_{AB}={\mathbf {R}}^{nm}$, i.e. $\omega _{xy}:=v_x\otimes v_y$, where $1\leqslant x\leqslant n$ and $1\leqslant y\leqslant m$ are integers, and $v_k$ is the k-th vector in the standard basis of ${\mathbf {R}}^k$. In view of the perfect local distinguishability of the questions, one usually refers directly to the labels x and y as the questions. In compliance with the established conventions, from now on we will stick to the above definition of a classical XOR game.

The prototypical (and simplest) example of a strategy, called a local strategy, consists in the players performing a product measurement $(e_a \otimes f_b)_{a,b\in \{0,1\}}\in \mathbf{M }_A\otimes \mathbf{M }_B$. The opposite case is naturally that of a global strategy, corresponding to the case of Alice and Bob having access to global measurements $(g_{ab})_{a,b\in \{0,1\}}\in {\mathbf {M}}_{AB}$, but one can equally well consider some intermediate scenarios where the players are allowed a bounded amount of communication before they are required to output the answers. Some of these variations on the theme are examined in “Appendix A”. From the above picture it follows that XOR games can be equivalently formulated as instances of state discrimination problems, possibly subjected to some special constraints dictated by locality.

One is usually interested in maximising the success probability of the players given a certain set of measurements they have access to. Since an XOR game can always be won with probability 1/2 by just answering randomly, it is standard to quantify the effectiveness of the players’ strategy by introducing the bias $\beta (G)$ of the game G:

$$\begin{aligned} \beta (G) :=P_{\text {winning}}(G) - P_{\text {losing}}(G) = 2 P_{\text {winning}}(G) - 1 \, . \end{aligned}$$

(21)

It is understood that the bias depends also on the strategy adopted by the players, which we specify with a subscript. The following result yields explicit expressions for the bias corresponding to local and global strategies.

Theorem 1

Consider an XOR game $G=(AB,\omega , p,c)$, where $AB=(V_A\otimes V_B, C_{AB}, u_A\otimes u_B)$. Define $z_G:=\sum _{i\in I} p_i (-1)^{c_i}\omega _i\in V_A\otimes V_B$. Then the biases corresponding to local and global strategies evaluate to

$$\begin{aligned} \beta _{\mathrm {LO}}(G)&= \left\| z_G\right\| _{V_A\otimes _\varepsilon V_B}\, , \end{aligned}$$

(22)

$$\begin{aligned} \beta _{\mathrm {ALL}}(G)&= \left\| z_G \right\| _{AB}\, , \end{aligned}$$

(23)

respectively. Here, $\Vert \cdot \Vert _{AB}$ is the base norm associated with the GPT AB. In particular, we obtain $\beta _{\mathrm {ALL}}(G) \leqslant \left\| z_G \right\| _{V_A \otimes _\pi V_B}$ for all composites AB, and the bound is achieved when $AB=A\underset{\text {min}}{\otimes }B$.

Proof

It is not difficult to realise that when the players adopt global strategies, their task is equivalent to a state discrimination problem with all measurements on the composite system being available. Hence, (23) follows from [32, Lemma 7]. As for (22), note that a pair of local measurements $(e_a \otimes f_b)_{a,b\in \{0,1\}}$ yields

$$\begin{aligned}&P_{\text {winning}}(G) - P_{\text {losing}}(G)\\&\quad = \sum _i p_i \left( \sum _{a,b:\, a\oplus b=c_i} (e_a\otimes f_b)(\omega _i) - \sum _{a,b:\, a\oplus b\ne c_i} (e_a\otimes f_b)(\omega _i) \right) \\&\quad = \sum _i p_i \sum _{a,b} (-1)^{a+b+c_i} (e_a\otimes f_b)(\omega _i) \\&\quad = \left( \left( \sum \nolimits _a (-1)^a e_a \right) \otimes \left( \sum \nolimits _b (-1)^b f_b\right) \right) (z_G)\, . \end{aligned}$$

That the optimisation over all local measurements yields (22) is a consequence of the elementary fact that

$$\begin{aligned} \left\{ e_0-e_1:\ e_0\in [0,u_A],\ e_0+e_1=u_A \right\} = \left[ -u_A, u_A\right] \, , \end{aligned}$$

(24)

and analogously for system B. The last claim follows from (12). $\square $
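As a sanity check of Theorem 1 in the simplest setting, the following sketch evaluates both biases for the CHSH game, viewed as a classical XOR game. It assumes that the base norm of $\mathrm {Cl}_n$ is the $\ell _1$ norm, so that the interval $[-u_A,u_A]$ is the unit ball of $\ell _\infty ^n$ and its extreme points are the sign vectors; the helper names `local_bias` and `global_bias` are ours and purely illustrative.

```python
import itertools
import numpy as np

def local_bias(z):
    """Injective norm on l1^n (x) l1^m: maximise a^T z b over sign vectors a, b."""
    n, m = z.shape
    best = 0.0
    for a in itertools.product([-1.0, 1.0], repeat=n):
        # For fixed a, the optimal b matches the signs of the row vector a^T z.
        best = max(best, np.abs(np.array(a) @ z).sum())
    return best

def global_bias(z):
    """Projective norm on l1^n (x) l1^m: sum of the absolute values of the entries."""
    return np.abs(z).sum()

# CHSH: uniform questions x, y in {0, 1}; the players win when a XOR b = x AND y.
p = np.full((2, 2), 0.25)
c = np.array([[0, 0], [0, 1]])
z_chsh = p * (-1.0) ** c          # the matrix z_G of Theorem 1

print(local_bias(z_chsh))   # 0.5
print(global_bias(z_chsh))  # 1.0
```

Under these assumptions the local bias is 1/2 and the global bias is 1, so the global/local ratio for this particular game equals 2.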

Remark

The value of the bias under local strategies depends only on the local structure of the GPTs A and B, and is thus independent of the particular rule we chose to compose them.

Theorem 1 generalises the mathematical description of the non-entangled bias of classical and quantum XOR games [5, 47], and yields the operational interpretation of the injective norm we were seeking. In “Appendix A” we show that this interpretation is ‘robust’, in the sense that even allowing the players a bounded amount of communication before they output the answers does not increase the bias by more than a factor equal to the product of the dimensions of the GPTs used to carry the messages. The same type of argument also shows that not much can be gained if the players have access to a pre-shared physical system of bounded dimension.

For classical XOR games more is true: namely, Tsirelson’s theorem [45] states that even assistance by entangled quantum states of arbitrarily large dimension does not allow for a significant improvement over product strategies. However, this has to be regarded as a peculiar feature of quantum systems, deeply linked with the underlying Hilbert space structure through Grothendieck’s inequality. It is therefore not surprising that it does not carry over to the general GPT setting. In fact, it turns out that a classical XOR game can always be won with probability 1 if one allows for assistance from a well-chosen set of non-signalling correlations^{Footnote 1} (we point out that non-signalling correlations can be viewed as GPTs [4, Section 2.3.4]).

We now illustrate the main motivation behind our investigation.

Let $n,m\geqslant 2$ be two integers. We can imagine a game that starts with the referee asking Alice and Bob to name two local GPTs A, B of fixed dimensions n, m, which they will have to manipulate later. Since the referee has no control over the experimental capabilities of the other two players, these are free to pick any A and B, subject to the constraints $\dim V_A = n$ and $\dim V_B=m$. Once this choice has been made, it is communicated to the referee, who physically constructs A and B, combines them in a bipartite system AB that is a legitimate GPT but is otherwise of the referee’s choice, selects an appropriate XOR game G over AB, and plays it with Alice and Bob. The goal of the referee is to make the global/local bias ratio of G as large as possible – ideally, one would aim for a global bias close to 1 and a local bias close to 0 – by suitably choosing the composite AB and the game G. Conversely, Alice and Bob wish to keep the bias ratio as close as possible to 1, by making the appropriate choice of local systems A and B. In other words, they want to be able to win the game with a reasonable probability without having to resort to global strategies. Given $n,m\geqslant 2$, how large is the global/local bias ratio the referee can hope to achieve?

Having specified the setting, we now proceed to perform the quantitative analysis that will ultimately lead us to identify our main object of study. Let us start by assuming that Alice and Bob have already named GPTs A, B, that the referee has also selected a composite AB, and that the only choice that remains to be done is that of the game G. Start by observing that (3) implies that every element $z\in V_A\otimes V_B$ such that $\Vert z\Vert _{AB}\leqslant 1$ for some legitimate composite AB is such that $z=z_G$ for some game G on AB (with just two questions). Hence, we can equivalently parametrise XOR games with vectors z in the unit ball of $\Vert \cdot \Vert _{AB}$. The maximal global/local bias ratio the referee can hope to achieve then reads

$$\begin{aligned} \sup _{G} \frac{\beta _{\text {ALL}}(G)}{\beta _{\text {LO}}(G)} = \sup _{0\ne z\in V_A\otimes V_B} \frac{\Vert z\Vert _{AB}}{\Vert z\Vert _{V_A\otimes _\varepsilon V_B}}\, , \end{aligned}$$

(25)

where the supremum on the l.h.s. is over all games on a fixed composite AB. Since the referee is also free to choose the optimal composite AB for given local systems A and B, we can also optimise over all composition rules. This is easily done by means of (12), and yields

$$\begin{aligned} \sup _{AB, G} \frac{\beta _{\text {ALL}}(G)}{\beta _{\text {LO}}(G)} = \sup _{0\ne z\in V_A\otimes V_B} \frac{\Vert z\Vert _{V_A \otimes _\pi V_B}}{\Vert z\Vert _{V_A\otimes _\varepsilon V_B}} = \rho \left( V_A, V_B\right) \, , \end{aligned}$$

(26)

where the last step is an application of the definition (15). The above equation (26) is important because it connects the theory of XOR games over GPTs to that of tensor norms, and it can be used as a starting point to investigate some intrinsic aspects of the behaviour of information processing in bipartite systems. For instance, in [32] the optimal performance of data hiding against ‘locally constrained sets of measurements’ is connected to the quantity $\sup _{A,B} \rho (V_A, V_B)$, the supremum running over all GPTs A and B of fixed dimensions.^{Footnote 2} In the setting we study here, instead, Alice and Bob’s goal is to minimise the global/local bias ratio in (26) by making a clever choice of A and B in the first place. Hence, the relevant quantity is

$$\begin{aligned} r_{\mathrm {bn}}(n, m) :=\inf _{\begin{array}{c} \dim A = n \\ \dim B = m \end{array}} \rho \left( V_A, V_B\right) , \end{aligned}$$

(27)

the infimum running over all GPTs A, B (equivalently, over all base norm spaces $V_A, V_B$) of fixed dimensions. If we find that $r_{\mathrm {bn}}(n,m)>1$ for all n, m, then there is a point in claiming that global strategies for XOR games perform better than local ones, independently of the underlying physical theories. If we manage to determine the asymptotic scaling of the quantity (27), we will even be able to make this statement quantitative. Comparing (27) with (16), it is elementary to observe that

$$\begin{aligned} r_{\mathrm {bn}}(n,m)\geqslant r(n,m) \end{aligned}$$

(28)

for all $n,m \geqslant 2$, as the infimum that defines the r.h.s. is over all pairs of Banach spaces, while that on the l.h.s. includes only base norm spaces. In spite of this, thanks to the result of “Appendix B” we know that the two sides of (28) are in fact comparable. Namely, Lemma B.2 tells us that for all $n,m \geqslant 2$ one has

$$\begin{aligned} r_{\mathrm {bn}}(n,m)&\leqslant 4\, r(n,m)\, , \end{aligned}$$

(29)

$$\begin{aligned} r_{\mathrm {bn}}(n,m)&\leqslant 2 + r(n-1, m-1)\, . \end{aligned}$$

(30)

In light of the above equivalences, in the rest of the paper we shall study the function r instead of $r_{\mathrm {bn}}$. This simplifies our investigation considerably.

Before we present our results, let us comment on the optimisation over the composition rules performed in (26)–(27). While we argued above that ours may be the most natural choice, it is conceivable to consider a modified scenario where Alice and Bob fix not only the systems A, B, but also the composite AB. Instead of (26), one should rather compute $\inf _{AB} \sup _G \beta _{\mathrm {ALL}}(G)/\beta _{\mathrm {LO}}(G)$, and then take also the infimum over A and B of fixed dimensions, as in (27). We will not consider this alternative scenario further. Remarkably, the choice of the composite is irrelevant when either A or B is a classical theory. In this case, we are also able to give sharp estimates of the r.h.s. of (26) (see Remarks after Proposition 15 and Lemma 19).

2 Main Results

Throughout this section, we present our main results on the universal functions r(n, m) and $r_s(n)$ introduced in (16) and (17), respectively. The discussion of the elementary properties of these objects, as conducted in Sect. 1.2, left many natural questions open. For instance, is it the case that $r(n,m)\geqslant c>1$ for all $n,m\geqslant 2$ and for some universal constant c? In other words, is it true that for any pair of Banach spaces X, Y of dimension at least 2 there is a tensor $z\in X\otimes Y$ for which $\Vert z\Vert _{X \otimes _\pi Y} \geqslant c\Vert z\Vert _{X \otimes _\varepsilon Y}$? Looking only at large enough dimensions, how do r(n, m) and $r_s(n)$ behave asymptotically in n, m? Some insight into this latter question was provided by Pisier [48], who showed that $r(n,m)\rightarrow \infty $ when $\min \{n,m\}\rightarrow \infty $, but with no asymptotic growth explicitly stated.

Our first result answers the first of the above questions in the affirmative.

Theorem 2

For any pair of Banach spaces X, Y with $\dim X \geqslant 2$ and $\dim Y \geqslant 2$, we have $\rho (X,Y) \geqslant 19/18$. Equivalently, there exists a nonzero tensor $z \in X \otimes Y$ such that

$$\begin{aligned} \Vert z\Vert _{X \otimes _\pi Y} \geqslant \frac{19}{18} \Vert z\Vert _{X \otimes _\varepsilon Y}. \end{aligned}$$

Consequently, the function r(n, m) defined in (16) satisfies

$$\begin{aligned} r(n,m) \geqslant \frac{19}{18}\qquad \forall \ n,m\geqslant 2\, . \end{aligned}$$

(31)

Theorem 2 also applies to infinite-dimensional spaces and shows that the injective and projective tensor products cannot be isometric. Our proof of Theorem 2 requires a variation on Auerbach’s lemma that is susceptible to an intuitive geometrical interpretation (Lemma 16).

The following open problem asks whether the injective and projective tensor norms are always $\sqrt{2}$ apart. The value of $\sqrt{2}$ would be optimal, since $\rho (\ell _1^2,\ell _2^2) = \sqrt{2}$ (for a proof of this fact, see (59)).

Problem 3

Is it true that for any pair of Banach spaces X, Y with $\dim X, \dim Y \geqslant 2$, we have $\rho (X,Y) \geqslant \sqrt{2}$? In other words, does there always exist a nonzero tensor $z \in X \otimes Y$ such that

$$\begin{aligned} \Vert z\Vert _{X \otimes _\pi Y} \geqslant \sqrt{2}\, \Vert z\Vert _{X \otimes _\varepsilon Y}\ ? \end{aligned}$$

(32)

We now move on to the analysis of the asymptotic behaviour of the function $r_s(n)$. Developing functional analysis techniques from [49], we arrive at a lower estimate of $\rho (X,X)$ involving the Banach–Mazur distance of X from the Hilbert space of the same dimension (Theorem 17). Since by John’s theorem (Theorem C.7) this cannot exceed the square root of the dimension, we deduce the following estimate, which is sharp up to logarithmic factors (compare with (20)).

Theorem 4

The function $r_s$ defined in (17) satisfies

$$\begin{aligned} \sqrt{2n} \geqslant r_s(n) \geqslant c\, \frac{\sqrt{n}}{(\log n)^3} \end{aligned}$$

(33)

for some universal constant $c>0$.

Note

Throughout the paper we will denote universal (always strictly positive) constants by c, C, $C'$ etc. Unless explicitly indicated, these symbols do not necessarily refer to the same numerical values when they appear in different formulae.

The investigation of the function r(n, m) poses more substantial technical hurdles. For simplicity, we start by looking at r(n, n). Our main result is the following.

Theorem 5

The function r defined in (16) satisfies

$$\begin{aligned} \sqrt{n} \geqslant r(n,n) \geqslant c\, \frac{n^{1/6}}{(\log n)^{4/3}} \end{aligned}$$

(34)

for some universal constant $c>0$.

The above estimate shows in particular that r(n, n) grows at least as a power of n. In turn, this implies that global strategies for XOR games are intrinsically asymptotically (much) more efficient than local ones, which was one of our main claims. Our proof of Theorem 5 rests upon two main ingredients: (1) a lower estimate for $\rho \left( \ell _1^n, X\right) $, $\rho \left( \ell _\infty ^n, X\right) $ and $\rho \left( \ell _2^n, X\right) $ when $\dim X\geqslant n$ (Lemma 19), which can be handled using known facts about p-summing norms; and (2) a ‘trichotomy theorem’ inspired by previous results in [7, 50], which states that every Banach space hosts sufficiently well-behaved subspaces on which the norm is similar enough to either an $\ell _1$-norm, or an $\ell _\infty $-norm, or a Euclidean norm (Theorem 20).

Due to the technical complexity of managing many different estimates simultaneously, for the case $n\ne m$ we could not obtain an exponent as good as 1/6. Nevertheless, we were able to ensure that there is power law scaling in $\min \{n,m\}$.

Theorem 6

For all $n,m\geqslant 2$, the function r(n, m) satisfies

$$\begin{aligned} \sqrt{\min \{n,m\}} \geqslant r(n,m) \geqslant c\, \frac{\, \min \{n,m\}^{1/8}\, }{\log \min \{n,m\}} \end{aligned}$$

(35)

for some universal constant $c>0$.

The exponents we obtained in Theorems 5 and 6 are unlikely to be optimal. We present the following problem concerning the scaling of the projective/injective ratio.

Problem 7

Does there exist a universal constant $c>0$ such that

$$\begin{aligned} r(n,m)\geqslant c \min \{n,m\}^{1/2} \end{aligned}$$

(36)

for all positive integers n, m? In other words, for all pairs of finite-dimensional Banach spaces X, Y, does there exist a nonzero tensor $z\in X\otimes Y$ such that

$$\begin{aligned} \Vert z\Vert _{X\otimes _\pi Y} \geqslant c \min \{\dim X, \dim Y\}^{1/2}\, \Vert z\Vert _{X\otimes _\varepsilon Y}\ ? \end{aligned}$$

(37)

While the value of r(n, m) grows with the dimension, new phenomena appear when considering infinite-dimensional spaces. Indeed, a famous construction by Pisier [8], solving negatively a conjecture by Grothendieck, entails that there exists an infinite-dimensional Banach space X such that $\rho (X,X)<\infty $. Using the information from Lemma B.1, we conclude that the same behaviour occurs in the realm of GPTs: there exist infinite-dimensional GPTs A, B such that local and global strategies are equivalent up to a universal constant in any possible composite AB.

The study of asymptotic behaviours in the general setting should not distract us from the fact that certain GPT models are of prime importance because of their compliance with known physics. Therefore, one of our results is the determination of the quantity $\rho (X,Y)$ when X, Y are the base norm spaces corresponding to quantum mechanical systems, i.e. the Banach spaces $S_1^{k,\mathrm {sa}}$. The following constitutes a notable improvement over [51, Lemma 20], [52, Theorem 15] and [32, Eq. (72)], as detailed below.

Theorem 8

Denoting by $S_1^{k,\mathrm {sa}}$ the space of $k\times k$ Hermitian matrices endowed with the trace norm, the best constant of domination of $\Vert \cdot \Vert _{X \otimes _\varepsilon Y}$ over $\Vert \cdot \Vert _{X \otimes _\pi Y}$ on $S_1^{n,\mathrm {sa}}\otimes S_1^{m,\mathrm {sa}}$ satisfies

$$\begin{aligned} c \min \{n,m\}^{3/2} \leqslant \rho \left( S_1^{n,\mathrm {sa}}, S_1^{m,\mathrm {sa}}\right) \leqslant C \min \{n,m\}^{3/2} \end{aligned}$$

(38)

for some constants C, $c>0$. More precisely, we have the following estimate for the upper bound in the above relation:

$$\begin{aligned} \rho \left( S_1^{n,\mathrm {sa}}, S_1^{m,\mathrm {sa}}\right) \leqslant 4 \min \{n,m\}^{3/2} - 2\sqrt{2} (\sqrt{2}-1)\sqrt{\min \{n,m\}}. \end{aligned}$$

(39)

Note

When comparing (38) with the other estimates on r(n, m) that we presented throughout this section, one should remember that the dimension of the space $S_1^{n,\mathrm {sa}}$ is $n^2$ rather than n. Hence, curiously, for $X=S_1^{n,\mathrm {sa}}$ the quantity $\rho (X,X)\approx n^{3/2}$ is of the same order as the geometric mean between the theoretical minimum $r_s(n^2)\approx n$ and the absolute maximum $R(n^2,n^2)=n^2$.

Although we will not consider complex spaces in this work, it is worth mentioning that the same estimate holds (with a slight modification in the constants) if the real space $S_1^{k,\mathrm {sa}}$ is replaced by the complex space $S_1^k$ of $k\times k$ matrices endowed with the trace norm.

Among other things, the above Theorem 8 enables us to give a new upper bound on the maximal efficiency of quantum mechanical data hiding against local measurements, quantitatively encoded in the data hiding ratio function $R_{\mathrm {LO}}(n,m)$ defined in [32].

Corollary 9

For all Hermitian matrices Z acting on a bipartite system ${\mathbf {C}}^n\otimes {\mathbf {C}}^m$, we have

$$\begin{aligned} \Vert Z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}} \leqslant \Vert Z\Vert _{\mathrm {LO}} \leqslant \Vert Z\Vert _1 \leqslant \Vert Z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\pi S_1^{m,\mathrm {sa}}} \leqslant 4\min \{n,m\}^{3/2} \Vert Z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}}\, , \end{aligned}$$

(40)

where $\Vert \cdot \Vert _{\mathrm {LO}}$ is the distinguishability norm under local measurements [52]. In other words, the data hiding ratio against local measurements can be upper bounded as

$$\begin{aligned} R_{\mathrm {LO}}(n,m)\leqslant 4 \min \{n,m\}^{3/2}\, . \end{aligned}$$

(41)

The above result improves upon several previously known estimates. In [51, Lemma 20], an inequality analogous to (41) was shown, that featured an exponent 2 on the r.h.s.; moreover, the relation proven in [52, Theorem 15] implies that $R_{\mathrm {LO}}(n,m)\leqslant \sqrt{153\, n m}$, which can be worse than (41) e.g. when $n^2 \leqslant m$ or $m^2\leqslant n$.
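The comparison between the two upper bounds can be made concrete with a few lines of arithmetic. The sketch below simply evaluates the right-hand side of (41) and the quantity $\sqrt{153\, n m}$ on some sample dimensions; the function names and the sample values of n, m are ours and carry no special meaning.

```python
import math

def bound_eq41(n, m):
    """Right-hand side of (41): 4 * min(n, m)^(3/2)."""
    return 4.0 * min(n, m) ** 1.5

def bound_sqrt153(n, m):
    """The estimate sqrt(153 n m) implied by [52, Theorem 15]."""
    return math.sqrt(153.0 * n * m)

# Two pairs with n^2 <= m, and one balanced pair for contrast.
for n, m in [(2, 4), (3, 9), (10, 10)]:
    print(n, m, bound_eq41(n, m), bound_sqrt153(n, m))
```

For the pairs with $n^2\leqslant m$ the bound (41) is the smaller of the two, while for $n=m=10$ the estimate $\sqrt{153\, n m}$ is slightly smaller, so neither bound dominates the other for all dimensions.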

3 First Bounds on the $\pi /\varepsilon $ Ratio

3.1 Some notions of functional analysis

We start by reminding the reader of some facts in elementary linear algebra. Given a pair of finite-dimensional vector spaces X, Y, there is a canonical isomorphism between the tensor product space $X\otimes Y$ and the space of linear maps $X^*\rightarrow Y$. We will write this correspondence as

$$\begin{aligned} X\otimes Y \ni z\longmapsto \widetilde{z} \in \mathcal {L}(X^*,Y)\, . \end{aligned}$$

(42)

Note that one has $\widetilde{F(z)}= \widetilde{z}\,^*$, where on the l.h.s. we have the flip operator $F:X\otimes Y\rightarrow Y\otimes X$ defined by $F(x\otimes y):=y\otimes x$, while on the r.h.s. $(\cdot )^*$ stands for the adjoint (transposition) operation $\mathcal {L}(X^*,Y)\rightarrow \mathcal {L}(Y^*, X)$. It is also easy to see that one has

$$\begin{aligned} w(z) = {{\,\mathrm{tr}\,}}\left[ \widetilde{z}\,^* \widetilde{w}\right] = {{\,\mathrm{tr}\,}}\left[ \widetilde{z}\, \widetilde{w}\,^*\right] \qquad \forall \ z\in X\otimes Y,\quad \forall \ w\in (X\otimes Y)^*=X^*\otimes Y^*\, , \end{aligned}$$

(43)

where ${{\,\mathrm{tr}\,}}$ denotes the trace.

In this paper we are interested in tensor products of finite-dimensional real Banach spaces, so from now on X and Y will denote a pair of such objects. We already encountered the concepts of injective and projective tensor products [see (10) and (11)]. Below we discuss some elementary properties of these constructions. For a start, the injective and projective norms are dual to each other, in the sense that

$$\begin{aligned} \Vert \cdot \Vert _{(X\otimes _\varepsilon Y)^*} = \Vert \cdot \Vert _{X^*\otimes _\pi Y^*}\, . \end{aligned}$$

(44)

By means of the correspondence (42), it is possible to translate tensor norms into the language of operators. One has

$$\begin{aligned} \Vert z\Vert _{X\otimes _\varepsilon Y}&= \left\| \widetilde{z}:X^*\rightarrow Y\right\| , \end{aligned}$$

(45)

$$\begin{aligned} \Vert z\Vert _{X\otimes _\pi Y}&= \left\| \widetilde{z}:X^*\rightarrow Y\right\| _N , \end{aligned}$$

(46)

where $\Vert \cdot \Vert _N$ is the so-called nuclear norm [53, Section 8], which may be more familiar to some readers. This is particularly transparent for the tensor product of two Euclidean spaces: if $X = \ell _2^m$, $Y =\ell _2^n$, then $\Vert z\Vert _{X\otimes _\varepsilon Y} = \left\| \widetilde{z}\right\| _{\infty }$ and $\Vert z\Vert _{X\otimes _\pi Y} = \left\| \widetilde{z}\right\| _{1}$, where $\Vert M\Vert _{\infty }$ and $\Vert M\Vert _{1}$ denote the operator norm and the trace norm of a matrix M. The following well-known lemma covers special cases which are simple: the projective tensor product with $\ell _1^n$ and the injective tensor product with $\ell _{\infty }^n$.

Lemma 10

If $(e_i)_{1 \leqslant i \leqslant n}$ denotes the canonical basis of ${\mathbf {R}}^n$, then for all vectors $x_1,\cdots ,x_n$ in a Banach space X,

$$\begin{aligned}&\left\| \sum _{i=1}^n e_i \otimes x_i \right\| _{\ell _1^n \otimes _\pi X} = \sum _{i=1}^n \Vert x_i\Vert , \end{aligned}$$

(47)

$$\begin{aligned}&\left\| \sum _{i=1}^n e_i \otimes x_i \right\| _{\ell _{\infty }^n \otimes _\varepsilon X} = \max _{1 \leqslant i \leqslant n} \Vert x_i\Vert . \end{aligned}$$

(48)

Proof

The upper bound in (47) is immediate from the definition. Conversely, given a decomposition of $\sum e_i \otimes x_i$ as $\sum \alpha ^k \otimes \xi ^k$, we have

$$\begin{aligned} \sum _k \Vert \alpha ^k\Vert _1 \Vert \xi ^k\Vert = \sum _k \sum _{i=1}^n |\alpha ^k_i| \Vert \xi ^k\Vert \geqslant \sum _{i=1}^n \Big \Vert \sum _k \alpha ^k_i \xi ^k \Big \Vert = \sum _{i=1}^n \Vert x_i\Vert \end{aligned}$$

and equality in (47) follows once we take the infimum over decompositions. The proof of (48) is immediate once we realize that the supremum in the definition of the injective norm can be restricted to extreme points of the unit ball. $\square $
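Lemma 10 can be illustrated numerically in the special case $X=\ell _2^m$, which we choose here only for convenience. The sketch below checks (48) by sampling the unit sphere of $\ell _1^n$ (the dual of $\ell _\infty ^n$), and then exhibits a dual certificate for the lower bound in (47): by (44) and (48), the tensor $w=\sum _i e_i\otimes x_i/\Vert x_i\Vert $ has norm 1 in the dual of $\ell _1^n\otimes _\pi \ell _2^m$, and pairing it with $\sum _i e_i\otimes x_i$ gives $\sum _i \Vert x_i\Vert $. The random data and variable names are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3
x = rng.standard_normal((n, m))      # rows are the vectors x_1, ..., x_n in l2^m
row_norms = np.linalg.norm(x, axis=1)

# (48): the injective norm is the supremum of ||sum_i phi_i x_i|| over the unit
# ball of l1^n, whose extreme points are +-e_i; this gives max_i ||x_i||.
inj_linf = row_norms.max()
phi = rng.standard_normal((1000, n))
phi /= np.abs(phi).sum(axis=1, keepdims=True)   # random points with ||phi||_1 = 1
assert np.linalg.norm(phi @ x, axis=1).max() <= inj_linf + 1e-12

# (47): the normalised rows form a tensor w of injective norm 1 in
# l_inf^n (x) l2^m, and w(z) = sum_i <x_i/||x_i||, x_i> = sum_i ||x_i||.
w = x / row_norms[:, None]
pairing = np.sum(w * x)
print(pairing, row_norms.sum())
```

The two printed numbers agree up to rounding, matching the lower bound obtained in the proof by taking the infimum over decompositions.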

The Banach–Mazur distance between two normed spaces X, Y with the same finite dimension is defined as [53]

$$\begin{aligned} {{\,\mathrm{d}\,}}(X,Y) :=\inf _u \left\{ \Vert u : X \rightarrow Y \Vert \cdot \Vert u^{-1} : Y \rightarrow X \Vert \right\} , \end{aligned}$$

(49)

where the infimum is taken over invertible linear maps u from X to Y. It satisfies the multiplicative version of the triangle inequality, i.e. ${{\,\mathrm{d}\,}}(X,Z) \leqslant {{\,\mathrm{d}\,}}(X,Y){{\,\mathrm{d}\,}}(Y,Z)$. Another elementary property is the fact that ${{\,\mathrm{d}\,}}(X,Y)={{\,\mathrm{d}\,}}(X^*,Y^*)$. When ${{\,\mathrm{d}\,}}(X,Y) \leqslant \lambda $, we say that X is $\lambda $-isomorphic to Y. Similarly, X is $\lambda $-Euclidean if it is $\lambda $-isomorphic to $\ell _2^{\dim X}$.

The Banach–Mazur distance only makes sense for a pair of spaces of equal dimension. When X, Y are normed spaces such that $\dim X \leqslant \dim Y$, one may define as a substitute the factorisation constant of X through Y as

$$\begin{aligned} {{\,\mathrm{f}\,}}(X,Y) :=\inf _{u,v} \left\{ \Vert u : X \rightarrow Y \Vert \cdot \Vert v : Y \rightarrow X \Vert \, : \, vu = \mathrm {Id}_X \right\} . \end{aligned}$$

(50)

We point out that ${{\,\mathrm{d}\,}}(X,Y) = {{\,\mathrm{f}\,}}(X,Y)$ if $\dim X=\dim Y$. Moreover, ${{\,\mathrm{f}\,}}(X,Y)$ is finite if and only if $\dim X\leqslant \dim Y$ (otherwise the above infimum is over an empty set). In order to circumvent this restriction, it is sometimes relevant to consider a relaxed version of the above quantity, where we allow the factorisation to be realised only after averaging. This leads to the definition of the weak factorisation constant as

$$\begin{aligned} {{\,\mathrm{wf}\,}}(X,Y) :=\inf _{u,v} \left\{ {{\,\mathrm{{\mathbf {E}}}\,}}\left[ \Vert u : X \rightarrow Y \Vert \cdot \Vert v : Y \rightarrow X \Vert \right] \, : \, {{\,\mathrm{{\mathbf {E}}}\,}}\left[ vu \right] = \mathrm {Id}_X \right\} , \end{aligned}$$

(51)

where now the infimum is taken over pairs of operator-valued random variables (u, v). We get the inequality ${{\,\mathrm{wf}\,}}(X,Y) \leqslant {{\,\mathrm{f}\,}}(X,Y)$ by restricting the infimum to constant random variables. Note also that the quantity ${{\,\mathrm{wf}\,}}(X,Y)$ is well-defined without any restriction on the dimensions of X and Y. It is easy to verify that the factorisation constants dualise, in the sense that

$$\begin{aligned} {{\,\mathrm{f}\,}}(X,Y)&= {{\,\mathrm{f}\,}}(X^*, Y^*)\, , \end{aligned}$$

(52)

$$\begin{aligned} {{\,\mathrm{wf}\,}}(X,Y)&= {{\,\mathrm{wf}\,}}(X^*, Y^*)\, . \end{aligned}$$

(53)

We may also consider a symmetric variant of the weak factorisation constant, called the weak Banach–Mazur distance and defined as

$$\begin{aligned} {{\,\mathrm{wd}\,}}(X,Y) = \max \{ {{\,\mathrm{wf}\,}}(X,Y), {{\,\mathrm{wf}\,}}(Y,X)\}\, . \end{aligned}$$

The family of all equivalence classes of n-dimensional normed spaces up to isometries can be turned into a compact metric space by the introduction of the distance $\log {{\,\mathrm{d}\,}}$. We now review some classical facts about the geometry of this space, called the Banach–Mazur compactum of dimension n. For a more complete introduction, we refer the reader to the excellent monograph [53]. A general upper bound valid for any n-dimensional normed space is the estimate following from John’s theorem (Theorem C.7)

$$\begin{aligned} {{\,\mathrm{d}\,}}(X,\ell _2^n) \leqslant \sqrt{n}. \end{aligned}$$

(54)

As a consequence of this and the multiplicative triangle inequality, for any pair of n-dimensional spaces X, Y, we have

$$\begin{aligned} {{\,\mathrm{d}\,}}(X,Y) \leqslant n . \end{aligned}$$

(55)

This bound is essentially sharp: Gluskin proved, via a random construction, the existence of spaces X, Y such that ${{\,\mathrm{d}\,}}(X,Y) \geqslant cn$ for some $c>0$ [54]. However, the estimate (55) can be improved in many specific cases. In particular, a question that is of relevance to us is that of the distance between a space and its dual: it was proved by Bourgain and Milman [49] that whenever $\dim (X)=n$, we have

$$\begin{aligned} {{\,\mathrm{d}\,}}(X,X^*) \leqslant C n^{5/6} \log ^C n. \end{aligned}$$

(56)

In a similar vein, (55) can be improved if we switch to the weak Banach–Mazur distance: a result by Rudelson [7] asserts that

$$\begin{aligned} {{\,\mathrm{wd}\,}}(X,Y) \leqslant C n^{13/14} \log ^C n \end{aligned}$$

(57)

whenever $\dim (X)=\dim (Y)=n$. We also point out that the exact diameter of the Banach–Mazur compactum is known in dimension 2: namely,

$$\begin{aligned} \max _{\begin{array}{c} \dim X = 2 \\ \dim Y = 2 \end{array}} {{\,\mathrm{d}\,}}(X,Y) = \frac{3}{2}\, , \end{aligned}$$

(58)

and the equality is achieved iff the unit balls of X and Y are the images of a square and a regular hexagon through a linear (invertible) map [55].

3.2 Basic properties

We start the investigation of the elementary properties of the parameter $\rho (X,Y)$ by presenting a simple calculation in the case $X=\ell _1^n$, $Y=\ell _2^m$, where $n\leqslant m$ (we point out that the calculation can be rephrased in terms of 1-summing norms, see Proposition 13). For a matrix $z\in \ell _1^n\otimes \ell _2^m$, we write

$$\begin{aligned} \Vert z\Vert _{\ell _1^n\otimes _\varepsilon \ell _2^m}&= \left( \sup _{\sigma \in \{\pm 1\}^n} \sum \nolimits _{j=1}^m \left( \sum \nolimits _{i=1}^n z_{ij} \sigma _i \right) ^2\right) ^{1/2} \\&\geqslant \left( {{\,\mathrm{{\mathbf {E}}}\,}}_\sigma \sum \nolimits _{j=1}^m \left( \sum \nolimits _{i=1}^n z_{ij} \sigma _i \right) ^2\right) ^{1/2} \\&= \left( \sum \nolimits _{j=1}^m \sum \nolimits _{i=1}^n z_{ij}^2\right) ^{1/2} \\&\geqslant \frac{1}{\sqrt{n}}\sum \nolimits _{i=1}^n \left( \sum \nolimits _{j=1}^m z_{ij}^2\right) ^{1/2} \\&= \frac{1}{\sqrt{n}} \Vert z\Vert _{\ell _1^n\otimes _\pi \ell _2^m}\, , \end{aligned}$$

where to obtain the first inequality we randomised over $\sigma $, assuming that $\sigma _1, \ldots , \sigma _n$ are i.i.d. $\pm 1$ Bernoulli random variables, and the last equality uses Lemma 10. The above computation shows that $\Vert z\Vert _{\ell _1^n\otimes _\pi \ell _2^m}\leqslant \sqrt{n} \Vert z\Vert _{\ell _1^n\otimes _\varepsilon \ell _2^m}$, implying that $\rho (\ell _1^n, \ell _2^m)\leqslant \sqrt{n}$. That this upper bound is in fact tight can be seen by considering the matrix z with entries $z_{ij} = \delta _{i,j}$ (i.e. the identity if $m=n$), for which $\Vert z\Vert _{\ell _1^n \otimes _\pi \ell _2^m} = n$ (from Lemma 10) and $\Vert z\Vert _{\ell _1^n \otimes _\varepsilon \ell _2^m} = \sqrt{n}$ (as seen from the above computation). We conclude that

$$\begin{aligned} \rho (\ell _1^n, \ell _2^m) = \sqrt{n} \qquad \forall \ n\leqslant m\, , \end{aligned}$$

(59)

entailing the upper bound (19) on r(n, m).
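The computation leading to (59) is easy to reproduce numerically for small n. The following sketch evaluates the injective norm on $\ell _1^n\otimes \ell _2^m$ by brute force over sign vectors $\sigma $, and the projective norm via Lemma 10 as the sum of the Euclidean norms of the rows. The choice $n=3$, $m=5$, the random test matrices and the function names are assumptions made for illustration; the brute-force search is exponential in n and is only meant for tiny examples.

```python
import itertools
import numpy as np

def inj_l1_l2(z):
    """Norm of z in l1^n (x)_eps l2^m: max over sign vectors sigma of ||sigma^T z||_2."""
    n = z.shape[0]
    return max(np.linalg.norm(np.array(s) @ z)
               for s in itertools.product([-1.0, 1.0], repeat=n))

def proj_l1_l2(z):
    """Norm of z in l1^n (x)_pi l2^m: sum of the l2 norms of the rows (Lemma 10)."""
    return np.linalg.norm(z, axis=1).sum()

n, m = 3, 5
z_id = np.eye(n, m)                       # the matrix with entries z_ij = delta_ij
ratio_id = proj_l1_l2(z_id) / inj_l1_l2(z_id)

# Random matrices should never exceed the ratio sqrt(n).
rng = np.random.default_rng(1)
ratios = [proj_l1_l2(z) / inj_l1_l2(z)
          for z in rng.standard_normal((200, n, m))]
print(ratio_id, max(ratios), np.sqrt(n))
```

The ratio for `z_id` equals $\sqrt{n}$ up to rounding, while the random samples stay below this value, in agreement with (59).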

We now move on to investigating some more general properties of the function $\rho (X,Y)$. We start by stating a very useful reformulation in terms of operators rather than tensors.

Lemma 11

For any pair of finite-dimensional normed spaces X, Y, we have

$$\begin{aligned} \rho (X,Y) = \sup \left\{ {{\,\mathrm{tr}\,}}(vu) \, : \ \Vert u : X \rightarrow Y^* \Vert \leqslant 1, \ \Vert v : Y^* \rightarrow X \Vert \leqslant 1 \right\} . \end{aligned}$$

(60)

Proof

The statement follows from the properties of injective and projective norms under the correspondence (42). We have

$$\begin{aligned} \rho (X,Y)&= \sup _{0\ne z\in X\otimes Y} \frac{\Vert z\Vert _{X\otimes _\pi Y}}{\Vert z\Vert _{X\otimes _\varepsilon Y}} \\&{\mathop {=}\limits ^{{{1}}}} \sup _{0\ne z\in X\otimes Y}\sup _{0\ne w\in X^*\otimes Y^*}\frac{w(z)}{\Vert w\Vert _{X^*\otimes _\varepsilon Y^*} \Vert z\Vert _{X\otimes _\varepsilon Y}} \\&= \sup \left\{ w(z)\,:\ \Vert w\Vert _{X^*\otimes _\varepsilon Y^*}\leqslant 1,\ \Vert z\Vert _{X\otimes _\varepsilon Y}\leqslant 1 \right\} \\&{\mathop {=}\limits ^{{{2}}}} \sup \left\{ {{\,\mathrm{tr}\,}}\left[ \left( \widetilde{z}\right) ^* \widetilde{w} \right] \, : \ \Vert \widetilde{w}:X\rightarrow Y^*\Vert \leqslant 1,\ \Vert \widetilde{z}: X^*\rightarrow Y\Vert \leqslant 1 \right\} \\&{\mathop {=}\limits ^{{{3}}}} \sup \left\{ {{\,\mathrm{tr}\,}}\left[ vu \right] \, : \ \Vert u:X\rightarrow Y^*\Vert \leqslant 1, \ \Vert v:Y^*\rightarrow X\Vert \leqslant 1 \right\} . \end{aligned}$$

The above passages can be justified as follows: 1: we used the duality relation (44); 2: we applied (43), (45) and (46); 3: we just renamed $v:=\left( \widetilde{z}\right) ^*$ and $u:=\widetilde{w}$. $\square $
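To see Lemma 11 at work in a case where everything is computable, one can take $X=Y=\ell _2^n$ and identify $Y^*$ with $\ell _2^n$. As recalled in Sect. 3.1, the injective and projective norms then become the operator norm and the trace norm, so both sides of (60) can be sampled with a singular value decomposition. The sketch below is only a random experiment under these assumptions, not a proof; the helper `contraction` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

def contraction(rng, n):
    """Random n x n matrix rescaled to have operator norm exactly 1."""
    a = rng.standard_normal((n, n))
    return a / np.linalg.norm(a, 2)

# Tensor side of (60) for X = Y = l2^n: ratio of trace norm to operator norm.
ratios = []
for _ in range(200):
    s = np.linalg.svd(rng.standard_normal((n, n)), compute_uv=False)
    ratios.append(s.sum() / s.max())

# Operator side of (60): tr(vu) over pairs of contractions u, v.
traces = [np.trace(contraction(rng, n) @ contraction(rng, n)) for _ in range(200)]
trace_id = np.trace(np.eye(n) @ np.eye(n))   # u = v = Id gives the value n

print(max(ratios), max(traces), trace_id)
```

Neither sampled quantity exceeds n, and the value n is attained by $u=v=\mathrm {Id}$, which is the behaviour (60) predicts for Euclidean spaces.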

The following proposition gathers several estimates of the function $\rho (X,Y)$.

Proposition 12

Let X, $X'$ and Y be finite-dimensional normed spaces. Then

$$\begin{aligned} \rho (X,Y)&= \rho (Y,X) = \rho (X^*,Y^*) = \rho (Y^*,X^*)\, , \end{aligned}$$

(61)

$$\begin{aligned} \rho (X,X^*)&= \dim (X)\, , \end{aligned}$$

(62)

$$\begin{aligned} \rho (X',Y)&\leqslant {{\,\mathrm{wf}\,}}(X',X) \rho (X,Y) \end{aligned}$$

(63)

$$\begin{aligned}&\leqslant {{\,\mathrm{f}\,}}(X',X) \rho (X,Y) \, , \end{aligned}$$

(64)

$$\begin{aligned} \rho (X',Y)&\leqslant {{\,\mathrm{d}\,}}(X',X) \rho (X,Y) \ \ \ (\text {assuming } \dim (X) = \dim (X') ), \end{aligned}$$

(65)

$$\begin{aligned} \rho (X,Y)&\geqslant \frac{\dim X}{{{\,\mathrm{wf}\,}}(X,Y^*)}\, , \end{aligned}$$

(66)

$$\begin{aligned} \rho (X,Y)&\leqslant \min \left\{ \dim (X), \dim (Y)\right\} . \end{aligned}$$

(67)

In particular, when $\dim (X) =\dim (Y)=n$ we have

$$\begin{aligned} \rho (X,Y) \geqslant \frac{n}{{{\,\mathrm{wd}\,}}(X,Y^*)} \geqslant \frac{n}{{{\,\mathrm{d}\,}}(X,Y^*)}\, . \end{aligned}$$

(68)

Proof

The identities (61) can be obtained for instance from (60) by exchanging the role of u and v and/or taking their duals $u^*, v^*$. Remember that one has ${{\,\mathrm{tr}\,}}(uv) = {{\,\mathrm{tr}\,}}(vu) = {{\,\mathrm{tr}\,}}(u^* v^*) = {{\,\mathrm{tr}\,}}(v^*u^*)$. Also (62) is elementary, and follows by taking $u=v=\mathrm {Id}_X$ in (60).

To show (63), pick some operators $u:X'\rightarrow Y^*$ and $v:Y^*\rightarrow X'$ of norm no larger than 1, and consider random variables $u':X'\rightarrow X$ and $v':X\rightarrow X'$ such that ${{\,\mathrm{{\mathbf {E}}}\,}}[v'u']=\mathrm {Id}_{X'}$. Since for all realisations of $u'$ and $v'$ the operator $\frac{uv'}{\Vert v'\Vert }:X\rightarrow Y^*$ has norm no larger than 1, and an analogous reasoning holds for $\frac{u'v}{\Vert u'\Vert }:Y^*\rightarrow X$, using (60) we deduce that

$$\begin{aligned} \frac{{{\,\mathrm{tr}\,}}(u'vuv')}{\Vert u'\Vert \Vert v'\Vert } \leqslant \rho (X,Y)\, . \end{aligned}$$

Then we can write

$$\begin{aligned} {{\,\mathrm{tr}\,}}(vu)&= {{\,\mathrm{tr}\,}}\left( vu {{\,\mathrm{{\mathbf {E}}}\,}}[v'u']\right) \\&= {{\,\mathrm{{\mathbf {E}}}\,}}\left[ {{\,\mathrm{tr}\,}}\left( vuv'u'\right) \right] \\&= {{\,\mathrm{{\mathbf {E}}}\,}}\left[ {{\,\mathrm{tr}\,}}\left( u'vuv'\right) \right] \\&= {{\,\mathrm{{\mathbf {E}}}\,}}\left[ \Vert u'\Vert \Vert v'\Vert \frac{{{\,\mathrm{tr}\,}}(u'vuv')}{\Vert u'\Vert \Vert v'\Vert }\right] \\&\leqslant \rho (X,Y) {{\,\mathrm{{\mathbf {E}}}\,}}\left[ \Vert u'\Vert \Vert v'\Vert \right] . \end{aligned}$$

Taking the supremum over u, v and the infimum over $u',v'$ subject to the above constraints, and using (51) and (60), we finally obtain (63). Since ${{\,\mathrm{wf}\,}}(X',X) \leqslant {{\,\mathrm{f}\,}}(X',X)$, (64) follows as well. To prove (65), note that ${{\,\mathrm{d}\,}}(X,X')={{\,\mathrm{f}\,}}(X,X')$ whenever $\dim (X)=\dim (X')$. Finally, (66) also follows immediately:

$$\begin{aligned} \dim X&{\mathop {=}\limits ^{{{1}}}} \rho (X,X^*) \end{aligned}$$

(69)

$$\begin{aligned}&{\mathop {\leqslant }\limits ^{{{2}}}} {{\,\mathrm{wf}\,}}(X,Y^*) \rho (Y^*,X^*) \end{aligned}$$

(70)

$$\begin{aligned}&{\mathop {=}\limits ^{{{3}}}} {{\,\mathrm{wf}\,}}(X,Y^*) \rho (X,Y)\, \end{aligned}$$

(71)

where: 1: follows from (62) and (61); 2: is an application of (63); and 3: is again a consequence of (61).

The estimate (67) was proved in [32, Proposition 21] (compare with (18)) as a consequence of Auerbach’s lemma [56, Vol. I, Sec. 1.c.3]. An alternative proof can be given using the following fact from linear algebra (left as an exercise for the reader): every linear map w on ${\mathbf {R}}^n$ that is a contraction for some norm satisfies ${{\,\mathrm{tr}\,}}(w)\leqslant \mathrm {rank}(w)$. To recover (67), apply this observation to the composite map $w=vu$, with u, v as in (60). $\square $

Remark

It follows in particular from (64) that any upper bound on $\rho (X,Y)$ is also valid for 1-complemented subspaces of X and Y. Recall that a subspace $X' \subseteq X$ is $\varvec{\lambda }$-complemented if there is a surjective projection $P:X \rightarrow X'$ with $\Vert P\Vert \leqslant \lambda $. The complementation hypothesis cannot be omitted. To give a concrete example, consider a cn-dimensional subspace $X \subseteq \ell _1^{n}$ with ${{\,\mathrm{d}\,}}(X,\ell _2^{cn}) \leqslant C$ (the existence of such a subspace is well known and follows for example from the Dvoretzky–Milman theorem (Theorem C.6) since $k_*(\ell _1^n) \geqslant cn$): we have $\rho (X,X) = \Theta (n)$ while $\rho (\ell _1^{n},\ell _1^{n}) = \Theta (\sqrt{n}).$

Remark

From (65) we see in particular that the function $\rho (\cdot , \cdot )$ defined on the product of the Banach–Mazur compacta of dimensions n and m is continuous (with respect to the product metric). In particular, this implies that the infima in (16) and (17) are always achieved. We will make use of this fact without further mention in what follows.

We point out that weaker versions of Theorems 4 and 5 follow easily by combining Proposition 12 and ‘off-the-shelf’ results. More precisely, the lower bound $r_s(n) \geqslant cn^{1/6}/(\log n)^C$ is an immediate consequence of (68) and (56), and the lower bound $r(n,n) \geqslant cn^{1/14}/(\log n)^C$ follows from (66) and (57). Interestingly, the special case $n=m$ of Problem 7 would follow from (68) if one could prove Rudelson’s conjecture that ${{\,\mathrm{wf}\,}}(X,Y)\leqslant C \sqrt{n}$ for all n-dimensional Banach spaces X, Y.

In the case where one of the spaces is $\ell _1^n$, the quantity $\rho (X,Y)$ can be rephrased in terms of 1-summing norms (the quantities $\pi _1^{(n)}(u)$ and $\pi _1(u)$ are defined in “Appendix 5”).

Proposition 13

For every finite-dimensional normed space X, we have

$$\begin{aligned} \rho (\ell _1^n,X) = \pi _1^{(n)}(\mathrm {Id}_X) \leqslant \pi _1(\mathrm {Id}_X) . \end{aligned}$$

Proof

Using Lemma 11, we have

$$\begin{aligned} \rho (\ell _1^n,X) = \rho (X,\ell _1^n) = \sup \{ {{\,\mathrm{tr}\,}}(vu) \ : \ \Vert u : X \rightarrow \ell _{\infty }^n \Vert \leqslant 1, \ \Vert v : \ell _{\infty }^n \rightarrow X \Vert \leqslant 1 \}. \end{aligned}$$

(72)

We rewrite the norms which appear in (72) in a more tangible way. If $v : \ell _{\infty }^n \rightarrow X$ and $v_i=v(e_i)$, then

$$\begin{aligned} \Vert v\Vert&= \sup \left\{ \left\| \sum _{i=1}^n \varepsilon _i v_i \right\| \ : \ \varepsilon _i=\pm 1 \right\} . \end{aligned}$$

Next, if $u : X \rightarrow \ell _\infty ^n$ is given by $x \mapsto (\langle f_i , x\rangle )_{i=1}^n$ for some $f_i \in X^*$, then $\Vert u\Vert = \max _i \Vert f_i\Vert $. Finally, if u and v are as above, then ${{\,\mathrm{tr}\,}}(vu) ={{\,\mathrm{tr}\,}}(uv) = \sum _{i=1}^n \langle f_i , v_i\rangle $. Combining these facts, and noting that the supremum of $\langle f_i , v_i\rangle $ over $\Vert f_i\Vert \leqslant 1$ equals $\Vert v_i\Vert $, we are led to

$$\begin{aligned} \rho (\ell _1^n,X) = \sup \left\{ \sum _{i=1}^n \Vert v_i\Vert \, : \ v_i \in X,\ \max _{\varepsilon _i=\pm 1} \left\| \sum _{i=1}^n \varepsilon _i v_i\right\| \leqslant 1 \right\} . \end{aligned}$$

(73)

Comparing with (C4) and using the relation

$$\begin{aligned} \max _{\varepsilon _i=\pm 1} \left\| \sum _{i=1}^n \varepsilon _i v_i\right\| = \sup _{\phi \in B_{X^*}} \sum _{i=1}^n |\phi (v_i)| \end{aligned}$$

shows that $\rho (\ell _1^n,X) = \pi _1^{(n)}(\mathrm {Id}_X)$ (the general inequality $\pi _1^{(n)}(\cdot ) \leqslant \pi _1(\cdot )$ is immediate from the definitions). $\square $
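The relation between the two 'weak sum' expressions used at the end of the above proof can be checked numerically in a simple case. The following is a minimal sketch, assuming $X=\ell _1^d$ (so that the unit ball of $X^*=\ell _\infty ^d$ has the sign vectors as extreme points); the function names are ours and purely illustrative.

```python
import itertools
import random


def l1_norm(x):
    return sum(abs(t) for t in x)


def weak_sum_via_signs(vectors):
    """max over signs eps_i = +-1 of || sum_i eps_i v_i ||_1."""
    d = len(vectors[0])
    best = 0.0
    for eps in itertools.product((1, -1), repeat=len(vectors)):
        combo = [sum(e * v[k] for e, v in zip(eps, vectors)) for k in range(d)]
        best = max(best, l1_norm(combo))
    return best


def weak_sum_via_functionals(vectors):
    """sup over phi in the unit ball of l_infty^d of sum_i |phi(v_i)|.

    The maximised function is convex in phi, so scanning the extreme
    points of the cube (the sign vectors) is enough.
    """
    d = len(vectors[0])
    best = 0.0
    for phi in itertools.product((1, -1), repeat=d):
        total = sum(abs(sum(p * t for p, t in zip(phi, v))) for v in vectors)
        best = max(best, total)
    return best


rng = random.Random(0)
vectors = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
lhs = weak_sum_via_signs(vectors)
rhs = weak_sum_via_functionals(vectors)
print(lhs, rhs)
```

Both quantities agree up to rounding, as expected from exchanging the two maxima.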

With this connection at hand, we are able to give an improved upper bound on the parameter $r_s(n)$.

Proposition 14

For every $n \geqslant 1$, we have

$$\begin{aligned} r_s(n) \leqslant \rho (\ell _1^n,\ell _1^n) \leqslant \pi _1(\mathrm {Id}_{\ell _1^n}) = \frac{n}{ {{\,\mathrm{{\mathbf {E}}}\,}}\left| \sum _{i=1}^n \varepsilon _i \right| } {\mathop {\sim }\limits ^{n\rightarrow \infty }} \sqrt{\frac{\pi n}{2}}, \end{aligned}$$

where $\varepsilon _1,\ldots , \varepsilon _n\in \{\pm 1\}$ are independent random variables with ${\mathbf {P}}(\varepsilon _i=1)={\mathbf {P}}(\varepsilon _i=-1)=1/2$.

Proof

This is a simple consequence of Proposition 13 and (C5). $\square $
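The closed form appearing in Proposition 14 is easy to evaluate exactly for moderate n. The sketch below computes $n/{\mathbf {E}}|\sum _i \varepsilon _i|$ from the binomial distribution and compares it with the asymptotic value $\sqrt{\pi n/2}$; the function names are illustrative and not taken from the paper.

```python
import math


def expected_abs_rademacher_sum(n):
    """Exact value of E|eps_1 + ... + eps_n| for independent uniform signs."""
    total = sum(math.comb(n, k) * abs(n - 2 * k) for k in range(n + 1))
    return total / 2 ** n


def upper_bound_on_rs(n):
    """The quantity n / E|sum_i eps_i| from Proposition 14."""
    return n / expected_abs_rademacher_sum(n)


for n in (1, 2, 10, 100, 1000):
    exact = upper_bound_on_rs(n)
    asymptotic = math.sqrt(math.pi * n / 2)
    print(n, exact, asymptotic, exact / asymptotic)
```

The last printed column approaches 1 as n grows, in agreement with the stated asymptotics.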

3.3 Universal lower bounds

In this section we prove that the injective and projective tensor products of any two Banach spaces cannot be isometric, unless one of them is 1-dimensional. Going further, Problem 3 asks whether the injective and projective norms are always at least $\sqrt{2}$ apart. As a partial answer, we present a proof in the special case when one of the spaces is $\ell _{\infty }^n$ (Proposition 15). Then, we solve a weaker version of Problem 3, with $\sqrt{2}$ replaced by the value 19/18 (Theorem 2). Finally, we discuss the special case when both dimensions are equal to 2.

Proposition 15

If Y is a Banach space with $\dim (Y) \geqslant 2$, then for any $n \geqslant 2$,

$$\begin{aligned} \rho (\ell _{\infty }^n,Y) \geqslant \sqrt{2}. \end{aligned}$$

Proof

Since $\ell _{\infty }^2$ is 1-complemented in $\ell _{\infty }^n$, in view of (64), it suffices to consider $n=2$. In that case, there are explicit formulas for both the projective and injective norms: for any $z=e_1 \otimes y_1 + e_2 \otimes y_2 \in \ell _\infty ^2 \otimes Y$, we have

$$\begin{aligned} \Vert z\Vert _{\ell _{\infty }^2 \otimes _\pi Y}= \frac{1}{2} \left( \Vert y_1+y_2\Vert + \Vert y_1-y_2\Vert \right) \ \ \text {and} \ \ \Vert z\Vert _{\ell _{\infty }^2 \otimes _\varepsilon Y}= \max \left\{ \Vert y_1\Vert ,\Vert y_2\Vert \right\} . \end{aligned}$$

Both formulas can be derived from Lemma 10, the first using the fact that the map $(a,b) \mapsto (a+b,a-b)$ is an isometry between $\ell _{1}^2$ and $\ell _{\infty }^2$. It remains to justify that any Banach space Y contains two vectors $y_1$ and $y_2$ such that

$$\begin{aligned} \Vert y_1+y_2\Vert + \Vert y_1-y_2\Vert \geqslant 2\sqrt{2} \max \left\{ \Vert y_1\Vert ,\Vert y_2\Vert \right\} . \end{aligned}$$

(74)

This follows from properties of the so-called modulus of uniform convexity of Y, a real function defined for $\varepsilon \in [0,2]$ by

$$\begin{aligned} \delta _Y(\varepsilon ) = \inf \left\{ 1 - \frac{\Vert y_1+y_2\Vert }{2} \ : \ \Vert y_1\Vert =\Vert y_2\Vert =1, \ \Vert y_1-y_2\Vert \geqslant \varepsilon \right\} . \end{aligned}$$

It is known [57] that for any Banach space Y and any $\varepsilon \in [0,2]$, we have $\delta _Y(\varepsilon ) \leqslant 1-\sqrt{1-{\varepsilon ^2}/{4}}$ (the value obtained for a Euclidean space). Applying this inequality with $\varepsilon =\sqrt{2}$ shows the existence of unit vectors $y_1$, $y_2$ such that $\Vert y_1-y_2\Vert = \sqrt{2}$ and $\Vert y_1+y_2\Vert \geqslant \sqrt{2}$, and therefore (74) is satisfied. $\square $
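To make the two explicit formulas for $\ell _\infty ^2\otimes Y$ concrete, one can evaluate them for a given norm on Y and a given pair $y_1, y_2$. The following sketch does so for $Y=\ell _2^2$ and $Y=\ell _1^2$ with the canonical basis vectors; the helper names are hypothetical, and the code only evaluates the formulas stated in the proof rather than computing tensor norms from first principles.

```python
import math


def projective_norm(norm, y1, y2):
    """(||y1 + y2|| + ||y1 - y2||) / 2 for z = e_1 (x) y1 + e_2 (x) y2."""
    plus = [a + b for a, b in zip(y1, y2)]
    minus = [a - b for a, b in zip(y1, y2)]
    return 0.5 * (norm(plus) + norm(minus))


def injective_norm(norm, y1, y2):
    """max(||y1||, ||y2||) for the same tensor z."""
    return max(norm(y1), norm(y2))


def l2(x):
    return math.sqrt(sum(t * t for t in x))


def l1(x):
    return sum(abs(t) for t in x)


y1, y2 = (1.0, 0.0), (0.0, 1.0)
ratio_l2 = projective_norm(l2, y1, y2) / injective_norm(l2, y1, y2)
ratio_l1 = projective_norm(l1, y1, y2) / injective_norm(l1, y1, y2)
print(ratio_l2, ratio_l1)
```

In the Euclidean case this pair achieves exactly the ratio $\sqrt{2}$ required by (74), while for $\ell _1^2$ the same pair gives a larger ratio.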

Remark

An immediate consequence of Proposition 15 is that there is always a gap at least as large as $\sqrt{2}$ between local and global bias for XOR games played over a system AB in which either A or B is a classical theory (7).

We now proceed to establish the universal lower bound $r(n,m)\geqslant 19/18$, formalised earlier as Theorem 2. Our main technical tool is the following variant of Auerbach’s lemma, illustrated in Fig. 1.

Lemma 16

Let X be a Banach space of dimension at least 2. Then there exist vectors $e_1$, $e_2 \in X$, $e_1^*$, $e_2^* \in X^*$ such that for any $i,j \in \{1,2\}$ we have $\Vert e_i\Vert _X = \Vert e_j^*\Vert _{X^*} = 1$, $e_j^*(e_i) = \delta _{i,j}$, and moreover

$$\begin{aligned} \Vert e_1 + e_2 \Vert _X \leqslant 3/2. \end{aligned}$$

Proof

It is enough to prove the lemma when $\dim X =2$, since the general case follows by considering any 2-dimensional subspace $Y \subseteq X$ and extending the linear forms.

Suppose now $\dim X=2$. Without loss of generality we may assume that $X = ({\mathbf {R}}^2,\Vert \cdot \Vert _X)$, and identify as well $X^*$ with ${\mathbf {R}}^2$. By applying a suitable linear transformation, we may assume that the variational problem $\max \{ |\det (f,g)| : \, f,g \in B_{X^*} \}$ is achieved when $(f,g)=(e_1,e_2)$, the canonical basis. It is clear that $\Vert e_1\Vert _{X^*} = \Vert e_2\Vert _{X^*} =1 $, and one checks that $\Vert e_1\Vert _X=\Vert e_2\Vert _X=1$. Let us show this explicitly for $e_1$. On the one hand, $1 = \langle e_1 , e_1\rangle \leqslant \Vert e_1\Vert _{X^*} \Vert e_1\Vert _X = \Vert e_1\Vert _X$. On the other hand, if $\Vert e_1\Vert _X>1$ one could find a vector $f'\in B_{X^*}$ such that $\langle f' , e_1\rangle >1$, which would imply that $|\det (f',e_2)| >1 = |\det (e_1,e_2)|$, in contradiction with the assumption that the pair $(e_1,e_2)$ achieves the maximum in the above variational expression.

Define $\alpha = \Vert e_1+e_2\Vert _X$ and $\beta = \Vert e_1-e_2\Vert _X$, and let $\phi $, $\psi \in B_{X^*}$ such that $\phi (e_1+e_2)=\alpha $ and $\psi (e_1-e_2)=\beta $. Write $\phi =(\phi _1,\phi _2)$ and $\psi =(\psi _1,\psi _2)$, so that $\alpha = \phi _1+\phi _2$ and $\beta = \psi _1 - \psi _2$. We compute

$$\begin{aligned} 1 \geqslant \det (\psi ,\phi ) = \phi _2 \psi _1 - \phi _1 \psi _2 \geqslant \alpha + \beta -2. \end{aligned}$$

To derive the last inequality, note that

$$\begin{aligned} \phi _2 \psi _1 - \phi _1 \psi _2 - ( \alpha + \beta - 2) = (1 - \phi _2)(1-\psi _1) + (1-\phi _1)(1+\psi _2) \end{aligned}$$

is nonnegative since $|\phi _i| \leqslant 1$ and $|\psi _j| \leqslant 1$. We proved that $\alpha + \beta \leqslant 3$, and therefore either $\alpha \leqslant 3/2$ or $\beta \leqslant 3/2$. In the first case the conclusion is immediate; in the second case it suffices to replace $e_2$ by $-e_2$. $\square $
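The algebraic identity used in the proof of Lemma 16 can be confirmed with exact rational arithmetic. The sketch below samples the coordinates of $\phi $ and $\psi $ in $[-1,1]$ and checks both the identity and the nonnegativity of its right-hand side; it is a sanity check of the algebra only, and the sampling scheme is our own choice.

```python
import random
from fractions import Fraction


def both_sides(phi, psi):
    """Return (lhs, rhs) of the identity, with alpha = phi1 + phi2, beta = psi1 - psi2."""
    phi1, phi2 = phi
    psi1, psi2 = psi
    alpha = phi1 + phi2
    beta = psi1 - psi2
    lhs = phi2 * psi1 - phi1 * psi2 - (alpha + beta - 2)
    rhs = (1 - phi2) * (1 - psi1) + (1 - phi1) * (1 + psi2)
    return lhs, rhs


rng = random.Random(1)


def random_coordinate():
    # A rational number in [-1, 1] with denominator 100.
    return Fraction(rng.randint(-100, 100), 100)


results = [
    both_sides(
        (random_coordinate(), random_coordinate()),
        (random_coordinate(), random_coordinate()),
    )
    for _ in range(1000)
]
identity_holds = all(lhs == rhs for lhs, rhs in results)
rhs_nonnegative = all(rhs >= 0 for _, rhs in results)
print(identity_holds, rhs_nonnegative)
```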

Proof of Theorem 2

Apply Lemma 16 to both X and Y, and consider the tensor

$$\begin{aligned} z = 5 e_1 \otimes e_1 + 5 e_1 \otimes e_2 + 5 e_2 \otimes e_1 - 4 e_2 \otimes e_2 \in X \otimes Y . \end{aligned}$$

Consider also

$$\begin{aligned} w^* = e_1^* \otimes e_1^* + e_1^* \otimes e_2^* + e_2^* \otimes e_1^* - e_2^* \otimes e_2^* \in X^* \otimes Y^* . \end{aligned}$$

Since the linear forms $e_i^*$ are bounded in absolute value by 1 on the unit ball, an argument following closely the proof of the CHSH inequality [44] shows that $\Vert w^*\Vert _{X^* \otimes _\varepsilon Y^*} \leqslant 2$. Together with the fact that $w^*(z)=19$, this implies that

$$\begin{aligned} \Vert z\Vert _{X \otimes _{\pi } Y} \geqslant \frac{19}{2}. \end{aligned}$$

(75)

It remains to upper bound $\Vert z\Vert _{X \otimes _{\varepsilon } Y}$. Given $\phi \in B_{X^*}$ and $\psi \in B_{Y^*}$, consider the numbers

$$\begin{aligned} a = \phi (e_1), \ b = \phi (e_2), \ c = \psi (e_1), \ d = \psi (e_2) . \end{aligned}$$

Both pairs (a, b) and (c, d) belong to the hexagon

$$\begin{aligned} H = \left\{ (x,y) \in {\mathbf {R}}^2 : \ |x| \leqslant 1,\, |y| \leqslant 1,\, |x+y| \leqslant 3/2 \right\} . \end{aligned}$$

Under these constraints it can be proved that

$$\begin{aligned} 5ac+5ad+5bc-4bd \leqslant 9\, . \end{aligned}$$

Indeed, it suffices to verify this inequality when (a, b) and (c, d) are extreme points of H; this yields a total of 36 different combinations to check. Finally, we have

$$\begin{aligned} \Vert z\Vert _{X \otimes _{\varepsilon } Y} = \sup _{\phi \in B_{X^*} ,\, \psi \in B_{Y^*}} (\phi \otimes \psi )(z) \leqslant 9\, . \end{aligned}$$

(76)

Combining (75) with (76) gives $\Vert z\Vert _{X \otimes _\pi Y} \geqslant \frac{19}{18} \Vert z\Vert _{X \otimes _\varepsilon Y}$, as needed. $\square $
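The finite verification invoked in the above proof is straightforward to automate. The sketch below enumerates the six extreme points of the hexagon H, evaluates the bilinear form on all 36 pairs, and also recomputes the CHSH-type bound on $w^*$ and the pairing $w^*(z)$. Since the form is bilinear, its maximum over $H\times H$ is attained at extreme points; the variable names are ours.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
# Extreme points of H = {|x| <= 1, |y| <= 1, |x + y| <= 3/2}.
HEXAGON_VERTICES = [(1, HALF), (HALF, 1), (-1, 1), (-1, -HALF), (-HALF, -1), (1, -1)]

# Coefficients of z and w* on e1(x)e1, e1(x)e2, e2(x)e1, e2(x)e2.
Z_COEFFS = (5, 5, 5, -4)
W_COEFFS = (1, 1, 1, -1)


def bilinear(coeffs, p, q):
    a, b = p
    c, d = q
    k11, k12, k21, k22 = coeffs
    return k11 * a * c + k12 * a * d + k21 * b * c + k22 * b * d


pairs = list(product(HEXAGON_VERTICES, repeat=2))
injective_bound = max(bilinear(Z_COEFFS, p, q) for p, q in pairs)

# CHSH-type bound: the forms e_i^* take values in [-1, 1] on the unit ball.
square = [(s, t) for s in (1, -1) for t in (1, -1)]
chsh_bound = max(bilinear(W_COEFFS, p, q) for p, q in product(square, repeat=2))

# Biorthogonality e_j^*(e_i) = delta_ij reduces w*(z) to a sum of products.
pairing = sum(w * z for w, z in zip(W_COEFFS, Z_COEFFS))

ratio = Fraction(pairing, chsh_bound) / injective_bound
print(len(pairs), injective_bound, chsh_bound, pairing, ratio)
```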

Remark

The proof of Lemma 16 gives more information, namely that

$$\begin{aligned} \Vert e_1+e_2\Vert _X \leqslant \alpha , \ \Vert e_1 - e_2 \Vert _X \leqslant \beta \end{aligned}$$

for some real numbers $\alpha $, $\beta \in [1,2]$ such that $\alpha + \beta \leqslant 3$. This extra information can presumably be used to improve the lower bound in Theorem 2 to $\Vert z\Vert _{X \otimes _\pi Y} \geqslant \frac{8}{7} \Vert z\Vert _{X \otimes _\varepsilon Y}$ for an appropriate choice of z depending on $\alpha $, $\beta $. However, since our arguments for that would rely heavily on computer assistance (and since the bound 8/7 is unlikely to be optimal), we do not present them.

Before we move on, let us discuss the special case of 2-dimensional spaces. Although we are not yet able to evaluate the two quantities r(2, 2) and $r_s(2)$ exactly, we can show that

$$\begin{aligned} \frac{4}{3}&< r(2,2)\leqslant \sqrt{2}\, , \end{aligned}$$

(77)

$$\begin{aligned} \frac{4}{3}&< r_s(2) \leqslant \sqrt{3}\, . \end{aligned}$$

(78)

To these inequalities we have to add the obvious fact that $r(2,2)\leqslant r_s(2)$. To justify the lower bound in (77) (and hence also that in (78)), we observe that combining (58) and (68) yields $\rho (X,Y)\geqslant 4/3$ for all 2-dimensional spaces X, Y. Equality is possible iff ${{\,\mathrm{d}\,}}(X,Y)=3/2$, which happens iff the unit balls of X and Y are simultaneous linear images of a square and a regular hexagon. Without loss of generality, this is the same as saying that X is isometric to $\ell _1^2$. By Proposition 15 (recall that $\ell _1^2$ and $\ell _\infty ^2$ are isometric), this ensures that $\rho (X,Y)=\rho (\ell _1^2,Y)\geqslant \sqrt{2}>4/3$. Hence, it must be the case that $r(2,2)>4/3$ strictly.

As we have already seen, the upper bound in (77) can be found by evaluating $\rho (\ell _1^2,\ell _2^2)=\sqrt{2}$, which is a special case of (59). The upper bound in (78), instead, is obtained by considering a space whose unit ball is a cleverly chosen octagon [4, Appendix D].

3.4 An important special case: quantum theory

In this section we will study the case where both parties are described by a quantum model. Before we start, let us recall some notation that we have already partially introduced. We denote by ${\mathsf {M}}_{k}^{\mathrm {sa}}$ the real vector space of $k\times k$ Hermitian matrices. By equipping it with a Schatten norm $\Vert \cdot \Vert _p$, defined by $\Vert z\Vert _p:=\left( {{\,\mathrm{tr}\,}}|z|^p\right) ^{1/p}$, we turn this space into a Banach space, which we denote by $S_p^{k,\mathrm {sa}}$. In what follows, we will be interested in the two particular cases $p=1$ and $p=\infty $, whose corresponding norms are the trace norm and the operator norm, respectively. For simplicity, we will make the canonical identification $(S_1^{k,\mathrm {sa}})^* = S_\infty ^{k,\mathrm {sa}}$. Accordingly, the action of $(S_1^{k,\mathrm {sa}})^*$ on $S_1^{k,\mathrm {sa}}$ is given simply by the Hilbert–Schmidt inner product, i.e. $y(x) = {{\,\mathrm{tr}\,}}[xy]$ for $x\in S_1^{k,\mathrm {sa}}$ and $y\in S_\infty ^{k,\mathrm {sa}}$. As for the tensor product, remember that ${\mathsf {M}}_{n}^{\mathrm {sa}}\otimes {\mathsf {M}}_{m}^{\mathrm {sa}}={\mathsf {M}}_{nm}^{\mathrm {sa}}$ canonically. We now proceed to prove Theorem 8, hence determining the scaling of the function $\rho (S_1^{n,\mathrm {sa}}, S_1^{m,\mathrm {sa}})$ with respect to n and m. The proof of Corollary 9 appears at the end of the present section.

Note

From now on, in some of the proofs we will find it convenient to adopt Dirac’s notation for vectors and functionals in (or acting on) ${\mathbf {R}}^n$ and ${\mathbf {C}}^n$. This will be done without further comments.

Proof of Theorem 8

In order to establish (38), we have to show the existence of two constants $c,C>0$ such that

$$\begin{aligned} c \min \{n, m\}^{3/2} \leqslant \sup _{0\ne z\in {\mathsf {M}}_{n}^{\mathrm {sa}} \otimes {\mathsf {M}}_{m}^{\mathrm {sa}}} \frac{\Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\pi S_1^{m,\mathrm {sa}}}}{\Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}}} \leqslant C \min \{n, m\}^{3/2} . \end{aligned}$$

(79)

We break down the argument to prove (79) into two parts.

Step 1: lower bound. We assume without loss of generality that $n \leqslant m$, and consider two Hilbert–Schmidt orthonormal bases $(x_i)_{1 \leqslant i \leqslant n^2}$ and $(y_j)_{1 \leqslant j \leqslant m^2}$ of ${\mathsf {M}}_{n}^{\mathrm {sa}}$ and ${\mathsf {M}}_{m}^{\mathrm {sa}}$, respectively. We form the random tensor

$$\begin{aligned} z = \sum _{i=1}^{n^2} \sum _{j=1}^{m^2} g_{ij} x_i \otimes y_j, \end{aligned}$$

(80)

where $(g_{ij})$ are independent N(0, 1) Gaussian random variables. Let us observe that the distribution of z does not depend on the choice of the local orthonormal bases. We use the results from Corollary C.4:

$$\begin{aligned}&{{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\varepsilon } S_1^{m,\mathrm {sa}}}\leqslant C \sqrt{n}\,m^{3/2}, \end{aligned}$$

(81)

$$\begin{aligned}&{{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_{\infty }^{n,\mathrm {sa} } \otimes _{\varepsilon } S_{\infty }^{m, \mathrm {sa}}} \leqslant C' \sqrt{m}. \end{aligned}$$

(82)

By duality, (82) implies a lower bound on the projective norm of z in $S_1^{n,\mathrm {sa}}\otimes _\pi S_1^{m,\mathrm {sa}}$. More precisely, using the duality between $S_1^{n,\mathrm {sa}}\otimes _{\pi } S_1^{m,\mathrm {sa}}$ and $S_{\infty }^{n,\mathrm {sa}} \otimes _{\varepsilon } S_{\infty }^{m,\mathrm {sa}}$ together with the Cauchy–Schwarz inequality, we obtain

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left( \sum _{i,j} g_{ij}^2 \right) ^{1/2} = {{\,\mathrm{{\mathbf {E}}}\,}}\sqrt{{{\,\mathrm{tr}\,}}[z^2]} \leqslant \sqrt{{{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\pi } S_1^{m,\mathrm {sa}}}} \sqrt{ {{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_{\infty }^{n,\mathrm {sa}} \otimes _{\varepsilon } S_{\infty }^{m,\mathrm {sa}}}}. \end{aligned}$$

(83)

Since the l.h.s. of (83) is of order nm, combining (83) and (82) yields the lower bound

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\pi } S_1^{m,\mathrm {sa}}} \geqslant c n^2 m^{3/2}. \end{aligned}$$

(84)

Using the above relation together with (81), we see that the random variable

$$\begin{aligned} U :=C \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\pi } S_1^{m,\mathrm {sa}}} - cn^{3/2} \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\varepsilon } S_1^{m,\mathrm {sa}}} \end{aligned}$$

has a nonnegative expectation. In particular, the event $\{U \geqslant 0 \}$ is nonempty, from which it follows that $\rho (S_1^{n,\mathrm {sa}},S_1^{m,\mathrm {sa}}) \geqslant cC^{-1} n^{3/2}$.
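The claim that the left-hand side of (83) is of order nm can be made quantitative: $\left( \sum _{i,j} g_{ij}^2\right) ^{1/2}$ is the Euclidean norm of a standard Gaussian vector in dimension $N=n^2m^2$, whose expectation is $\sqrt{2}\,\Gamma ((N+1)/2)/\Gamma (N/2)$. The sketch below evaluates this standard formula and compares it with nm; the function names are illustrative.

```python
import math


def mean_gaussian_norm(dimension):
    """E of the Euclidean norm of a standard Gaussian vector in R^dimension."""
    log_ratio = math.lgamma((dimension + 1) / 2) - math.lgamma(dimension / 2)
    return math.sqrt(2.0) * math.exp(log_ratio)


def lhs_of_83(n, m):
    # The tensor z has n^2 * m^2 independent N(0, 1) coefficients.
    return mean_gaussian_norm(n * n * m * m)


for n, m in [(2, 2), (2, 3), (3, 5), (10, 10)]:
    value = lhs_of_83(n, m)
    print(n, m, value, value / (n * m))
```

The ratio in the last column stays slightly below 1 (by Jensen's inequality) and tends to 1 as the dimensions grow.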

Step 2: upper bound. As before, we assume, without loss of generality, that $n\leqslant m$. Let us consider an element $z\in {\mathsf {M}}_{n}^{\mathrm {sa}} \otimes {\mathsf {M}}_{m}^{\mathrm {sa}}$ such that $\Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\varepsilon } S_1^{m,\mathrm {sa}}}=1$. By Corollary C.12, there is a state $\varphi $ such that

$$\begin{aligned} \Vert \widetilde{z}(x) \Vert _1 \leqslant 2 \sqrt{2} \left( \varphi (x^2) \right) ^{1/2} \end{aligned}$$

(85)

for every $x \in {\mathsf {M}}_{n}^{\mathrm {sa}}$ (the notation $\widetilde{z}$ was introduced in (42)). In Dirac’s notation, the spectral decomposition of $\varphi $ reads

$$\begin{aligned} \varphi =\sum _{j=1}^n\lambda _j \vert {u_j}\rangle \!\langle {u_j}\vert \, , \end{aligned}$$

where $(\vert {u_j}\rangle )_j$ is an orthonormal basis of ${\mathbf {C}}^n$, and $(\lambda _j)_j$ is a probability distribution. Then, it is clear that $E_{jk}:=\vert {u_j}\rangle \!\langle {u_k}\vert $, with $j,k=1,\cdots , n$, defines a Hilbert–Schmidt orthonormal basis of the space of $n\times n$ complex matrices. Using that $(E_{jk})_{jk}$, $(E_{kj})_{jk}$ is a biorthogonal system, we can write

$$\begin{aligned} z=\sum _{j,k=1}^n E_{kj}\otimes \widetilde{z}(E_{jk})\in S_1^n\otimes S_1^m\, . \end{aligned}$$

If we define the Hermitian matrices $F_{jk}:=E_{jk}+E_{kj}$ and $H_{jk}:=i(E_{jk}-E_{kj})$, one can easily check that $F_{jk} \otimes \widetilde{z}(F_{jk}) + H_{jk} \otimes \widetilde{z}(H_{jk}) = 2 [E_{jk} \otimes \widetilde{z}(E_{kj}) + E_{kj} \otimes \widetilde{z}(E_{jk})]$ and therefore

$$\begin{aligned} z=\sum _j E_{jj}\otimes \widetilde{z}(E_{jj})+\frac{1}{2}\sum _{j < k}\left( F_{jk}\otimes \widetilde{z}(F_{jk})+H_{jk}\otimes \widetilde{z}(H_{jk})\right) , \end{aligned}$$

(86)

where all indices range from 1 to n. We then obtain the following:

$$\begin{aligned} \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\pi S_1^{m,\mathrm {sa}}}&{\mathop {\leqslant }\limits ^{{{1}}}} \sum _j \Vert E_{jj}\Vert _{1}\Vert \widetilde{z}(E_{jj})\Vert _{1} +\frac{1}{2}\sum _{j<k} \left( \Vert F_{jk}\Vert _{1}\Vert \widetilde{z}(F_{jk})\Vert _{1}+\Vert H_{jk}\Vert _{1}\Vert \widetilde{z}(H_{jk})\Vert _{1}\right) \\&{\mathop {\leqslant }\limits ^{{{2}}}} \sum _j \Vert \widetilde{z}(E_{jj})\Vert _{1} + \sum _{j<k} \left( \Vert \widetilde{z}(F_{jk})\Vert _{1}+\Vert \widetilde{z}(H_{jk})\Vert _{1}\right) \\&{\mathop {\leqslant }\limits ^{{{3}}}} 2\sqrt{2} \sum _j \sqrt{\varphi (E_{jj}^2)} + 2\sqrt{2} \sum _{j<k} \left( \sqrt{\varphi (F_{jk}^2)}+\sqrt{\varphi (H_{jk}^2)}\right) \\&{\mathop {=}\limits ^{{{4}}}} 2\sqrt{2} \sum _j \sqrt{\lambda _j} + 4\sqrt{2} \sum _{j<k} \sqrt{\lambda _j+\lambda _k} \\&{\mathop {\leqslant }\limits ^{{{5}}}} 2\sqrt{2} \left( \sqrt{n}\left( \sum \nolimits _j \lambda _j\right) ^{1/2} + 2\sqrt{\frac{n(n-1)}{2}} \left( \sum \nolimits _{j < k} (\lambda _j +\lambda _k) \right) ^{1/2} \right) \\&{\mathop {=}\limits ^{{{6}}}} 2\sqrt{2} \left( \sqrt{n} + 2\sqrt{\frac{n(n-1)}{2}} \sqrt{n-1}\right) \\&= 4 n^{3/2} - 2\sqrt{2} (\sqrt{2} -1) \sqrt{n}\, . \end{aligned}$$

The justification of the above steps is as follows: 1: we used the decomposition (86) as an ansatz into the minimisation that defines the projective norm (11); 2: we observed that $\Vert E_{jj}\Vert _1=1$ and $\Vert F_{jk}\Vert _1=2=\Vert H_{jk}\Vert _1$ for all $j<k$; 3: follows from (85); 4: we evaluated $\varphi (E_{jj}^2) = \varphi (E_{jj})=\lambda _j$ and

$$\begin{aligned} \varphi (F_{jk}^2) = \varphi (H_{jk}^2) = \varphi (|u_j\rangle \!\langle u_j| + |u_k\rangle \!\langle u_k|)= \lambda _j +\lambda _k\, ; \end{aligned}$$

5: is an application of the Cauchy–Schwarz inequality; 6: we computed

$$\begin{aligned} \sum _{j<k} (\lambda _j +\lambda _k) = \frac{1}{2} \sum _{j\ne k} (\lambda _j+\lambda _k) = \frac{1}{2} \left( \sum _j (n-1) \lambda _j + \sum _k (n-1) \lambda _k\right) = n-1\, , \end{aligned}$$

and remembered that $\sum _j \lambda _j =1$. This completes the proof of (39), which in turn implies (38) with $C=4$. $\square $
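The arithmetic in steps 4 to 6 of the above chain can be checked numerically. The sketch below evaluates the expression obtained after step 4 for a given probability vector $(\lambda _j)_j$ and compares it with the closed form $4n^{3/2}-2\sqrt{2}(\sqrt{2}-1)\sqrt{n}$; the function names are ours. For the uniform distribution the Cauchy–Schwarz step is an equality, so the two values coincide.

```python
import math
import random


def value_after_step_4(lams):
    """2*sqrt(2) * sum_j sqrt(l_j) + 4*sqrt(2) * sum_{j<k} sqrt(l_j + l_k)."""
    n = len(lams)
    diagonal = sum(math.sqrt(l) for l in lams)
    off_diagonal = sum(
        math.sqrt(lams[j] + lams[k]) for j in range(n) for k in range(j + 1, n)
    )
    return 2 * math.sqrt(2) * diagonal + 4 * math.sqrt(2) * off_diagonal


def value_after_step_6(n):
    return 2 * math.sqrt(2) * (
        math.sqrt(n) + 2 * math.sqrt(n * (n - 1) / 2) * math.sqrt(n - 1)
    )


def closed_form(n):
    return 4 * n ** 1.5 - 2 * math.sqrt(2) * (math.sqrt(2) - 1) * math.sqrt(n)


rng = random.Random(2)
n = 6
weights = [rng.random() for _ in range(n)]
random_state = [w / sum(weights) for w in weights]
uniform_state = [1.0 / n] * n

print(value_after_step_4(random_state), closed_form(n))
print(value_after_step_4(uniform_state), value_after_step_6(n), closed_form(n))
```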

Proof of Corollary 9

The claim (40) derives from (79). Indeed: (i) the norm $\Vert \cdot \Vert _{\mathrm {LO}}$ as defined by [52] satisfies $\Vert \cdot \Vert _{\mathrm {LO}}\leqslant \Vert \cdot \Vert _1$ e.g. by Helstrom’s theorem; (ii) the inequality $\Vert \cdot \Vert _{1}\leqslant \Vert \cdot \Vert _{S_1^{n,\mathrm {sa}}\otimes _\pi S_1^{m,\mathrm {sa}}}$ follows from (12) combined with the fact that the standard quantum mechanical composition rule yields a legitimate composite in the GPT sense; and (iii) $\Vert \cdot \Vert _{\mathrm {LO}}\geqslant \Vert \cdot \Vert _{S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}}$ by [32, Proposition 22]. $\square $

4 Asymptotic Lower Bounds on the $\pi /\varepsilon $ Ratio

4.1 A lower bound for two copies of the same theory

In this section we prove that $r_s(n)\geqslant c\sqrt{n}/\log ^3 n$ for a certain universal constant $c > 0$, which is the technically challenging part of the statement of Theorem 4. In fact, remember that the example of $\ell _1^n$ shows that $r_s(n)\leqslant \sqrt{2n}$ (as reported in (20)), hence the aforementioned result is optimal up to logarithmic factors. For an n-dimensional Banach space X we denote $d_X={{\,\mathrm{d}\,}}(X,\ell _2^n)$, where ${{\,\mathrm{d}\,}}$ is the Banach–Mazur distance.

Theorem 17

There exists a universal constant $c>0$ such that for every Banach space X of dimension n, we have

$$\begin{aligned} \rho (X,X)\geqslant \frac{cn}{d_X\log ^3 n}\, . \end{aligned}$$

In particular, $r_s(n)\geqslant c\sqrt{n}/\log ^3 n$.

Proof

The lower bound on $r_s(n)$ follows immediately by combining (17) and the well-known estimate $d_X \leqslant \sqrt{n}$ in (54). We now set out to prove (17). We may assume that X is equal to $({\mathbf {R}}^n, \Vert \cdot \Vert _X)$, with

$$\begin{aligned} \Vert \mathrm {Id}:\ell _2^n\rightarrow X\Vert \cdot \Vert \mathrm {Id}:X\rightarrow \ell _2^n\Vert = d_X\, . \end{aligned}$$

(87)

Our main tool is the following lemma, whose proof we postpone. It is based on ideas from [49] (see also [53, Lemma 46.2] and the comments below it). We point out that the assumption on the norm is not a restriction: since $d_X \leqslant \sqrt{n}$, X is isometric to $({\mathbf {R}}^n,\Vert \cdot \Vert )$ for a norm $\Vert \cdot \Vert $ on ${\mathbf {R}}^n$ satisfying $\frac{1}{\sqrt{n}} |\cdot | \leqslant \Vert \cdot \Vert \leqslant |\cdot |$.

Lemma 18

Consider a Banach space $X = ({\mathbf {R}}^n, \Vert \cdot \Vert _X)$, and assume that $\frac{1}{\sqrt{n}} |\cdot | \leqslant \Vert \cdot \Vert _X \leqslant |\cdot |$, where $|\cdot |$ is the standard Euclidean norm. Then there exist orthonormal vectors $\vert {f_i}\rangle \in {\mathbf {R}}^n$, $i=1,\cdots , k$, with $k\geqslant cn/\log (n)$, such that

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^k g_i \vert {f_i}\rangle \right\| _X {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^kg_i \langle {f_i}\vert \right\| _{X^*}\leqslant Cn \log n\, . \end{aligned}$$

(88)

Here, c and C are universal constants and $(g_i)_{i=1}^n$ is a sequence of independent N(0, 1) Gaussian random variables.

Let us consider the vectors $(\vert {f_i}\rangle )_{1 \leqslant i \leqslant k} \in {\mathbf {R}}^n$ from Lemma 18, and form the random tensors $\vert {z}\rangle =\sum _{i,j=1}^kg_{ij} \vert {f_i}\rangle \otimes \vert {f_j}\rangle \in X\otimes X$ and $\langle {z}\vert =\sum _{i,j=1}^kg_{ij} \langle {f_i}\vert \otimes \langle {f_j}\vert \in X^*\otimes X^*$, where $(g_{ij})$ are independent N(0, 1) Gaussian random variables. It is clear that

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\langle z|z\rangle = {{\,\mathrm{{\mathbf {E}}}\,}}\sum _{i,j=1}^k g_{ij}^2=k^2\geqslant \frac{c^2 n^2}{\log ^2 n}\, . \end{aligned}$$

(89)

On the other hand, according to Chevet’s inequality (Theorem C.3), we have

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \vert {z}\rangle \right\| _{X\otimes _\varepsilon X}&= {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^kg_{ij}\vert {f_i}\rangle \otimes \vert {f_j}\rangle \right\| _{X\otimes _\varepsilon X}{\leqslant } 2{{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^kg_i\vert {f_i}\rangle \right\| _X\Vert \mathrm {Id}_{K}:\ell _2^k\rightarrow X\Vert , \end{aligned}$$

(90)

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \langle {z}\vert \right\| _{X^*\otimes _\varepsilon X^*}&= {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^kg_{ij}\langle f_i|\otimes \langle f_j|\right\| _{X^*\otimes _\varepsilon X^*}\nonumber \\&\leqslant 2{{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^kg_i\langle f_i|\right\| _{X^*}\Vert \mathrm {Id}_{K}:\ell _2^k\rightarrow X^*\Vert , \end{aligned}$$

(91)

where $\mathrm {Id}_K$ denotes the identity map restricted to $K:=\text {span}\{\vert {f_i}\rangle : \, 1 \leqslant i \leqslant k\}$. We can then write

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\langle z|z\rangle&{\mathop {\leqslant }\limits ^{{{1}}}} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^k g_{ij} \vert {f_i}\rangle \otimes \vert {f_j}\rangle \right\| _{X\otimes _\pi X}\left\| \sum _{i,j=1}^k g_{ij} \langle {f_i}\vert \otimes \langle {f_j}\vert \right\| _{X^*\otimes _\varepsilon X^*} \\&{\mathop {\leqslant }\limits ^{{{2}}}} \rho (X,X) {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^k g_{ij} \vert {f_i}\rangle \otimes \vert {f_j}\rangle \right\| _{X\otimes _\varepsilon X}\left\| \sum _{i,j=1}^k g_{ij} \langle {f_i}\vert \otimes \langle {f_j}\vert \right\| _{X^*\otimes _\varepsilon X^*} \\&{\mathop {\leqslant }\limits ^{{{3}}}} \rho (X,X)\left( {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^kg_{ij}\vert {f_i}\rangle \otimes \vert {f_j}\rangle \right\| ^2_{X\otimes _\varepsilon X}\right) ^{\frac{1}{2}}\\&\qquad \quad \left( {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^kg_{ij}\langle { f_i}\vert \otimes \langle { f_j}\vert \right\| ^2_{X^*\otimes _\varepsilon X^*}\right) ^{\frac{1}{2}}\\&{\mathop {\leqslant }\limits ^{{{4}}}} C_2^2 \rho (X,X)\ {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^kg_{ij}\vert {f_i}\rangle \otimes \vert {f_j}\rangle \right\| _{X\otimes _\varepsilon X}{{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i,j=1}^kg_{ij} \langle {f_i}\vert \otimes \langle {f_j}\vert \right\| _{X^*\otimes _\varepsilon X^*} \\&{\mathop {\leqslant }\limits ^{{{5}}}} 4C_2^2 \rho (X,X)\ {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^k g_i \vert {f_i}\rangle \right\| _X {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^k g_i \langle {f_i}\vert \right\| _{X^*}\\&\quad \quad \quad \Vert \mathrm {Id}_{K}:\ell _2^k\rightarrow X\Vert \cdot \Vert \mathrm {Id}_{K}:\ell _2^k\rightarrow X^*\Vert \\&{\mathop {\leqslant }\limits ^{{{6}}}} 4 C_2^2 \rho (X,X) d_X\ {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^k g_i \vert {f_i}\rangle \right\| _X {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^k g_i \langle {f_i}\vert \right\| _{X^*} \\&{\mathop {\leqslant }\limits ^{{{7}}}} C' \rho (X,X) d_X\ n \log n\, . \end{aligned}$$

The justification of the above steps is as follows: 1: we used the duality of injective and projective norm (44); 2: follows by definition of $\rho (X,X)$; 3: is an application of the Cauchy–Schwarz inequality; 4: is the $p=2$ case of the Khintchine–Kahane inequality (Theorem 2); 5: derives from (90) and (91); 6: can be derived from (87), using the fact that orthogonal projections onto subspaces of Hilbert spaces have norm 1; 7: is the statement of Lemma 18. Combining the above estimate with the lower bound in (89), we deduce that

$$\begin{aligned} \rho (X,X)\geqslant c\, \frac{n}{d_X \log ^3 n}\, , \end{aligned}$$

which concludes the proof. $\square $

Proof of Lemma 18

According to the $MM^*$-estimate (Theorem C.1), there exists an isomorphism $T:\ell _2^n\rightarrow X$ such that

$$\begin{aligned} \ell _X(T)\ell _{X^*}((T^{-1})^*)\leqslant Cn \log n. \end{aligned}$$

(92)

Moreover, since $\ell _X(T)=\ell _X(T\circ U)$ for every unitary U, it can be assumed that T is positive definite. By the spectral theorem, T can be written as

$$\begin{aligned} T=\sum _{i=1}^n\lambda _i \vert {f_i}\rangle \!\langle {f_i}\vert \, , \end{aligned}$$

for some positive numbers $\lambda _i$ and $(|f_i\rangle )_{i=1}^n$ an orthonormal basis of ${\mathbf {R}}^n$. Then, inequality (92) implies that

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^n\lambda _ig_i |f_i\rangle \right\| _X {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^n\lambda _i^{-1}g_i \langle f_i|\right\| _{X^*} \leqslant Cn \log n\, . \end{aligned}$$

Using the inequalities $\Vert | f_i \rangle \Vert _X \geqslant 1/\sqrt{n}$ and $\Vert \langle f_j | \Vert _{X^*} \geqslant 1$ together with Jensen's inequality (or Lemma C.5), we see that $\lambda _i\lambda _j^{-1} \leqslant C n^{3/2} \log n$ holds for any indices i, j. Let us denote $m=\min \{\lambda _k \, : \, 1 \leqslant k \leqslant n\}$ and $M=\max \{\lambda _k \, : \, 1 \leqslant k \leqslant n\}$. It follows that

$$\begin{aligned} \frac{M}{m}\leqslant Cn^{3/2} \log n\, . \end{aligned}$$

(93)

Now, (93) implies that the sets

$$\begin{aligned} A_s = \left\{ 1\leqslant j\leqslant n \, : \, 2^{s-1}\leqslant \frac{\lambda _j}{m}\leqslant 2^s \right\} \end{aligned}$$

with $s=1,\ldots , r$, cover $\{1,\ldots , n\}$ for a certain $r\leqslant C'\log n$ (they form a partition up to those indices j for which $\lambda _j/m$ is a power of 2, which may be assigned to either of the two adjacent sets). By the pigeonhole principle, there exists a set $A_{s_0}$ such that $|A_{s_0}|\geqslant \frac{n}{C'\log n}$. Now, consider the set of orthonormal vectors $\{|f_j\rangle \, : \, j\in A_{s_0}\}$. Applying Lemma C.5, we see that

$$\begin{aligned}&{{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i\in A_{s_0}}g_i | f_i\rangle \right\| _X {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i\in A_{s_0}}g_i \langle f_i|\right\| _{X^*}\\&\leqslant {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i\in A_{s_0}}\frac{\lambda _i}{2^{s_0-1}m }g_i |f_i\rangle \right\| _X {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i\in A_{s_0}}2^{s_0} m \lambda _i^{-1}g_i \langle f_i|\right\| _{X^*}\\&\leqslant 2 {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i\in A_{s_0}}\lambda _ig_i |f_i\rangle \right\| _X{{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i\in A_{s_0}}\lambda _i^{-1}g_i \langle f_i|\right\| _{X^*} \\&\leqslant Cn \log n\, , \end{aligned}$$

completing the argument. $\quad \square $
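The counting behind the choice of $A_{s_0}$ can be spelled out as follows. This is only a sketch: the integer r is defined here for illustration, and the constants $C, C'$ are not tracked. Since $1\leqslant \lambda _j/m\leqslant M/m$ for every j, each index lies in some $A_s$ with $1\leqslant s\leqslant r$, where by (93) and for $n\geqslant 2$

$$\begin{aligned} r := \max \left\{ 1,\ \left\lceil \log _2 \frac{M}{m} \right\rceil \right\} \leqslant 1+\log _2\left( Cn^{3/2}\log n\right) \leqslant C'\log n\, . \end{aligned}$$

Consequently,

$$\begin{aligned} n \leqslant \sum _{s=1}^r |A_s| \leqslant r \max _{1\leqslant s\leqslant r} |A_s| \quad \Longrightarrow \quad \max _{1\leqslant s\leqslant r} |A_s| \geqslant \frac{n}{r} \geqslant \frac{n}{C'\log n}\, . \end{aligned}$$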

4.2 A lower bound for any pair of theories

The aim of this section is to prove Theorems 5 and 6, which provide general lower bounds on $\rho (X,Y)$ as functions of the dimensions n, m only. As discussed in Sect. 2, our strategy requires two preliminary results that allow us to reduce the problem to the more manageable special case where either X or Y is one of the ‘classical’ spaces $\ell _1^n$, $\ell _2^n$ or $\ell _{\infty }^n$. We start by presenting the solution to these special cases.

Lemma 19

For every finite-dimensional normed space X with $\dim X \geqslant n$, we have

(a)
$\rho (\ell _1^n,X) \geqslant \sqrt{n/2},$
(b)
$\rho (\ell _2^n,X) \geqslant \sqrt{n},$
(c)
$\rho (\ell _{\infty }^n,X) \geqslant \sqrt{n/2}.$

Proof

We already know from Proposition 13 that $\rho (\ell _1^n,X) = \pi _1^{(n)}(\mathrm {Id}_X)$. Moreover (this is especially clear from (73)), we have $\rho (\ell _1^n,Y) \leqslant \rho (\ell _1^n,X)$ whenever Y is a subspace of X. Consequently, it suffices to prove (a) in the case when $\dim X=n$. In that case, we argue that

$$\begin{aligned} \pi _1^{(n)}(\mathrm {Id}_X) \geqslant \pi _2^{(n)}(\mathrm {Id}_X) \geqslant \frac{1}{\sqrt{2}} \pi _2(\mathrm {Id}_X) \geqslant \sqrt{n/2}, \end{aligned}$$

where we used points 1., 2. and 3. from Proposition C.10.

Part (b) is a direct consequence of Proposition C.8 together with the formulation of $\rho (\ell _2^n,X)$ from Lemma 11.

Finally, (c) follows from (a) since $\rho (\ell _{\infty }^n,X) = \rho (\ell _1^n,X^*)$, cf. (61). $\square $

Remark

In light of the discussion at the end of Sect. 1.3 (see also the Remark after Proposition 15), we see that Lemma 19 entails the following: the gap between local and global bias for XOR games played over any system AB in which e.g. $A=\mathrm {Cl}_n$ is a classical theory (defined in (7)) is at least $\sqrt{n/2}$ whenever $\dim B \geqslant n$.

The following result is a variant of the ‘$\ell _1$/$\ell _2$/$\ell _{\infty }$ trichotomy’ which is based on ideas from Pisier [48], Rudelson [7], and Szarek–Tomczak-Jaegermann [50].

Theorem 20

Let X be a normed space of dimension n. Then for every $1 \leqslant A \leqslant \sqrt{n}$ at least one of the following holds:

1.
X contains a subspace of dimension $d:=c \sqrt{n}$ which is $C A \sqrt{\log n}$-isomorphic to $\ell _{\infty }^d$.
2.
$X^*$ contains a subspace of dimension d which is $C A \sqrt{\log n}$-isomorphic to $\ell _{\infty }^d$.
3.
X contains a $C\log n$-complemented 4-Euclidean subspace of dimension $c A^2 / \log n$.

Here, C and c are universal constants.

Proof

By the $MM^*$-estimate (Theorem C.1), we may assume that $X = ({\mathbf {R}}^n,\Vert \cdot \Vert _X)$ with

$$\begin{aligned} \ell _X(\mathrm {Id}) \leqslant C \sqrt{n \log n} \ \ \text {and} \ \ \ell _{X^*}(\mathrm {Id}) \leqslant C \sqrt{n \log n} . \end{aligned}$$

Let ${\mathcal {E}}$ be the John ellipsoid of X as defined in Theorem C.7, $(|e_i\rangle )_{1 \leqslant i \leqslant n}$ be the semiaxes of ${\mathcal {E}}$ and $(\lambda _i)$ their lengths, i.e. ${\mathcal {E}} = T(B_2^n)$ where $T = \sum \lambda _i \vert {e_i}\rangle \!\langle {e_i}\vert $. Assume also that $\lambda _1 \leqslant \lambda _2 \leqslant \cdots \leqslant \lambda _n$. Note that we can assume that T is of this form because $T\circ u$ defines the same ellipsoid for every orthogonal transformation u. We consider the following dichotomy.

Case (i): $\lambda _{n/3} \leqslant A/\sqrt{n}$. Let $E = {{\,\mathrm{\mathrm {span}}\,}}\{ |e_i \rangle \, : \, 1 \leqslant i \leqslant n/3 \}$ and $P_E$ be the orthogonal projection onto E. We note that $P_E$ is orthogonal for both the standard Euclidean structure in ${\mathbf {R}}^n$ and the Euclidean structure induced by ${\mathcal {E}}$ (i.e. using $(\lambda _i |e_i\rangle )$ as an orthonormal basis). We apply Theorem C.9 to $P_E$ in order to produce an m-dimensional subspace of X which is R-isomorphic to $\ell _{\infty }^m$, for $m=c\sqrt{n}$ and $R = C \ell '_X(P_E)$, where we denote by $\ell '_X$ the $\ell _X$-norm computed using the Euclidean structure induced by ${\mathcal {E}}$. We use Lemma C.5 to obtain the bound

$$\begin{aligned} \ell '_X(P_E) = {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^{n/3} g_i \lambda _i | e_i \rangle \right\| _X \leqslant \lambda _{n/3} \ell _X(P_E) \leqslant \frac{A}{\sqrt{n}} \ell _X(\mathrm {Id}) \leqslant C A \sqrt{\log n} , \end{aligned}$$

and we conclude that X contains a subspace which is $C A \sqrt{\log n}$-isomorphic to $\ell _{\infty }^{c \sqrt{n}}$.

Case (ii): $\lambda _{n/3} > A/\sqrt{n}$. Let $F = {{\,\mathrm{\mathrm {span}}\,}}\{ |e_i \rangle \, : \, i > n/3 \}$ and denote by $\mathrm {Id}_F : F \rightarrow {\mathbf {R}}^n$ the identity map restricted to F. We have

$$\begin{aligned} \Vert \mathrm {Id}_F : \ell _2^n \rightarrow X \Vert \leqslant \frac{\sqrt{n}}{A}. \end{aligned}$$

To see the previous bound, just write $\mathrm {Id}_F=(T\circ T^{-1}|_F)$ and use that $\Vert T^{-1}|_F:\ell _2^n\rightarrow \ell _2^n\Vert \leqslant \sqrt{n}/A$ and $\Vert T:\ell _2^n\rightarrow X\Vert \leqslant 1$.
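The two operator norm bounds just quoted can be checked directly. The following is a sketch; it assumes, as is standard, that the John ellipsoid ${\mathcal {E}}$ of Theorem C.7 is contained in the unit ball $B_X$, and it uses the ordering $\lambda _1 \leqslant \cdots \leqslant \lambda _n$ together with the hypothesis of case (ii).

$$\begin{aligned}&T^{-1}|_F = \sum _{i>n/3} \lambda _i^{-1} \vert {e_i}\rangle \!\langle {e_i}\vert \quad \Longrightarrow \quad \Vert T^{-1}|_F:\ell _2^n\rightarrow \ell _2^n\Vert = \max _{i>n/3} \lambda _i^{-1} \leqslant \lambda _{n/3}^{-1} < \frac{\sqrt{n}}{A}\, , \\&T(B_2^n) = {\mathcal {E}} \subseteq B_X \quad \Longrightarrow \quad \Vert T:\ell _2^n\rightarrow X\Vert \leqslant 1\, . \end{aligned}$$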

We apply the same dichotomy to $X^*$. If case (i) occurs for either X or $X^*$, we are done. It remains to consider the situation when case (ii) occurs for both. This means that there exist subspaces $F_1$ and $F_2$ of dimension 2n/3 such that

$$\begin{aligned}&\Vert \mathrm {Id}_{F_1} : \ell _2^n \rightarrow X \Vert \leqslant \sqrt{n}/A, \\&\Vert \mathrm {Id}_{F_2} : \ell _2^n \rightarrow X^* \Vert \leqslant \sqrt{n}/A. \end{aligned}$$

Consider the subspace $F= F_1 \cap F_2$ (note that $\dim F \geqslant n/3$). We are going to apply the Dvoretzky–Milman theorem (Theorem C.6) to both $X \cap F$ and $X^* \cap F$ (that is, to the space F with the norms inherited from X and from $X^*$ respectively). The corresponding Dvoretzky dimensions are

$$\begin{aligned} k_*(X \cap F)= & {} \left( \frac{\ell _{X \cap F}(\mathrm {Id}_F)}{\Vert \mathrm {Id}_F : \ell _2^n \rightarrow X \Vert } \right) ^2 \geqslant \frac{A^2}{n} \ell _{X \cap F} (\mathrm {Id}_F)^2 = A^2\ell _X(P_F)^2/n , \\ k_*(X^* \cap F)= & {} \left( \frac{\ell _{X^* \cap F}(\mathrm {Id}_F)}{\Vert \mathrm {Id}_F : \ell _2^n \rightarrow X^* \Vert } \right) ^2 \geqslant \frac{A^2}{n} \ell _{X^* \cap F}(\mathrm {Id}_F)^2 = A^2 \ell _{X^*}(P_F)^2/n. \end{aligned}$$

On the other hand, if we consider the random vector $g = \sum _{i=1}^{\dim F} g_i f_i$, where $(g_i)$ are independent N(0, 1) Gaussian random variables and $(f_i)$ is an orthonormal basis in F, we compute

$$\begin{aligned}&\frac{n}{3} \leqslant \dim (F) = {{\,\mathrm{{\mathbf {E}}}\,}}| g |^2 {\mathop {\leqslant }\limits ^{{{1}}}} \left( {{\,\mathrm{{\mathbf {E}}}\,}}\left\| g \right\| ^2_X \right) ^{1/2} \left( {{\,\mathrm{{\mathbf {E}}}\,}}\left\| g \right\| ^2_{X^*} \right) ^{1/2} \\&\qquad \qquad \qquad \qquad \qquad \qquad {\mathop {\leqslant }\limits ^{{{2}}}} C {{\,\mathrm{{\mathbf {E}}}\,}}\Vert g\Vert _X {{\,\mathrm{{\mathbf {E}}}\,}}\Vert g\Vert _{X^*} = C \ell _X(P_F) \ell _{X^*}(P_F), \end{aligned}$$

where 1 follows from the Cauchy–Schwarz inequality together with the inequality $|g|^2 \leqslant \Vert g\Vert _X\Vert g\Vert _{X^*}$, and 2 from the Khintchine–Kahane inequalities (Theorem 2). Since $\ell _{X^*}(P_F) \leqslant \ell _{X^*}(\mathrm {Id}) \leqslant C \sqrt{n \log n}$, we have $\ell _X(P_F) \geqslant c \sqrt{n/\log n}$, and similarly $\ell _{X^*}(P_F) \geqslant c \sqrt{n/\log n}$. It follows that

$$\begin{aligned} k_*(X \cap F) \geqslant c A^2 / \log n \ \ \text {and} \ \ k_*(X^* \cap F) \geqslant c A^2 / \log n. \end{aligned}$$
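The passage from the Gaussian estimate to these bounds on the Dvoretzky dimensions can be unpacked as follows. This is a sketch in which C and c denote universal constants whose values may change from one occurrence to the next.

$$\begin{aligned}&\frac{n}{3} \leqslant C\, \ell _X(P_F)\, \ell _{X^*}(P_F) \leqslant C\, \ell _X(P_F) \sqrt{n \log n} \quad \Longrightarrow \quad \ell _X(P_F) \geqslant c \sqrt{\frac{n}{\log n}}\, , \\&k_*(X \cap F) \geqslant \frac{A^2}{n}\, \ell _X(P_F)^2 \geqslant \frac{A^2}{n} \cdot \frac{c^2 n}{\log n} = \frac{c^2 A^2}{\log n}\, . \end{aligned}$$

The same computation with the roles of X and $X^*$ exchanged gives the bound on $k_*(X^* \cap F)$.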

By the Dvoretzky–Milman theorem (Theorem C.6) and the remark following it, there is a subspace $E \subseteq F$ of dimension $c A^2 / \log n$ such that both $X \cap E$ and $X^* \cap E$ are 4-Euclidean. Moreover, using the extra information given by Theorem C.6, we have

$$\begin{aligned} \Vert \mathrm {Id}_E : \ell _2^n \rightarrow X \Vert\leqslant & {} \frac{2 \ell _X(P_F)}{\sqrt{n/3}} , \\ \Vert \mathrm {Id}_E : \ell _2^n \rightarrow X^* \Vert\leqslant & {} \frac{2 \ell _{X^*}(P_F)}{\sqrt{n/3}} . \end{aligned}$$

Since $\Vert P_E : X \rightarrow \ell _2^n \Vert = \Vert P_E : \ell _2^n \rightarrow X^* \Vert = \Vert \mathrm {Id}_E : \ell _2^n \rightarrow X^* \Vert $, we have

$$\begin{aligned} \Vert P_E : X \rightarrow X \Vert&\leqslant \Vert P_E : X \rightarrow \ell _2^n \Vert \cdot \Vert P_E : \ell _2^n \rightarrow X \Vert \\&\leqslant \frac{C \ell _X(P_F) \ell _{X^*}(P_F)}{n} \\&\leqslant \frac{C \ell _X(\mathrm {Id}) \ell _{X^*}(\mathrm {Id})}{n} \\&\leqslant C' \log n \end{aligned}$$

and therefore E is a $C'\log n$-complemented 4-Euclidean subspace of X. $\square $
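For completeness, the constants in the last chain of inequalities can be traced as follows. This is a sketch: C denotes the constant appearing in the bounds on $\ell _X(\mathrm {Id})$ and $\ell _{X^*}(\mathrm {Id})$ at the start of the proof, and we use that $\Vert P_E : \ell _2^n \rightarrow X \Vert = \Vert \mathrm {Id}_E : \ell _2^n \rightarrow X \Vert $, since $P_E$ maps the Euclidean unit ball onto its intersection with E.

$$\begin{aligned} \Vert P_E : X \rightarrow X \Vert \leqslant \frac{2 \ell _{X^*}(P_F)}{\sqrt{n/3}} \cdot \frac{2 \ell _X(P_F)}{\sqrt{n/3}} = \frac{12\, \ell _X(P_F)\, \ell _{X^*}(P_F)}{n} \leqslant \frac{12 \left( C\sqrt{n \log n}\right) ^2}{n} = 12\, C^2 \log n\, . \end{aligned}$$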

Proof of Theorem 5

As usual, the difficult part is to establish the lower bound on r(n, n), while the upper bound is reported in (19) (set $n=m$). Let X and Y be n-dimensional normed spaces, and $A>1$ be a number whose value will be optimised later. Theorem 20 implies in particular that at least one of the following occurs (here we use the classical fact that a subspace isometric to $\ell _{\infty }^m$ is automatically 1-complemented):

(i)
${{\,\mathrm{f}\,}}\left( \ell _{\infty }^{c\sqrt{n}},X\right) \leqslant CA \sqrt{\log n}$;
(ii)
${{\,\mathrm{f}\,}}\left( \ell _{1}^{c\sqrt{n}},X\right) \leqslant CA \sqrt{\log n}$; or
(iii)
${{\,\mathrm{f}\,}}\left( \ell _{2}^{cA^2/\log n},X\right) \leqslant C \log n$.

If (i) holds, then by Lemma 19 and (64) we obtain

$$\begin{aligned} \rho (X,Y) \geqslant \rho \left( \ell _{\infty }^{c\sqrt{n}},Y\right) \Big / {{\,\mathrm{f}\,}}\left( \ell _{\infty }^{c\sqrt{n}},X\right) \geqslant \frac{cn^{1/4}}{A \sqrt{\log n}}\, . \end{aligned}$$
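The numerical form of this bound can be recovered as follows. This is a sketch; it uses Lemma 19(c) with $d=c\sqrt{n}$ in place of n, which applies because $\dim Y = n \geqslant d$, and c, C are universal constants that may change between occurrences.

$$\begin{aligned} \rho \left( \ell _{\infty }^{d},Y\right) \geqslant \sqrt{\frac{d}{2}} = \sqrt{\frac{c}{2}}\, n^{1/4} \quad \text {and} \quad \mathrm {f}\left( \ell _{\infty }^{d},X\right) \leqslant CA \sqrt{\log n} \quad \Longrightarrow \quad \rho (X,Y) \geqslant \frac{\sqrt{c/2}\ n^{1/4}}{CA \sqrt{\log n}}\, . \end{aligned}$$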

A similar estimate applies when X satisfies (ii), or when Y satisfies (i) or (ii). The only uncovered case is when X and Y both satisfy (iii), and we have then

$$\begin{aligned} \rho (X,Y) \geqslant \rho \left( \ell _{2}^{cA^2/\log n},\ell _{2}^{cA^2/\log n}\right) \Big / (C \log n)^2 \geqslant \frac{c A^2}{\log ^3 n}\, . \end{aligned}$$

The optimal choice is $A = n^{1/12} (\log n)^{5/6}$, which gives the announced lower bound. $\square $
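The choice of A can be motivated by balancing the two competing lower bounds. The following is a sketch in which universal constants are dropped. The bound in case (iii) relies on the standard fact that on $\ell _2^k\otimes \ell _2^k$ the projective and injective norms are the trace and operator norms, so that $\rho (\ell _2^k,\ell _2^k)=k$, here with $k=cA^2/\log n$.

$$\begin{aligned}&\frac{n^{1/4}}{A \sqrt{\log n}} = \frac{A^2}{\log ^3 n} \quad \Longleftrightarrow \quad A^3 = n^{1/4} (\log n)^{5/2} \quad \Longleftrightarrow \quad A = n^{1/12} (\log n)^{5/6}\, , \\&\frac{A^2}{\log ^3 n} = n^{1/6} (\log n)^{5/3-3} = \frac{n^{1/6}}{(\log n)^{4/3}}\, . \end{aligned}$$

With this value of A, every case therefore yields $\rho (X,Y) \geqslant c\, n^{1/6}/(\log n)^{4/3}$.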

Proof of Theorem 6

The upper bound on r(n, m) follows again from (19), so we focus on the lower bound. Let X and Y be normed spaces of respective dimensions n and m (with $n \leqslant m$), and $A>1$ be a number whose value will be optimised later. As in the previous proof, we combine Theorem 20 (applied only to X) and Lemma 19. In case (i), we have

$$\begin{aligned} \rho (X,Y) \geqslant \rho \left( \ell _{\infty }^{c \sqrt{n}},Y\right) \Big /{{\,\mathrm{f}\,}}\left( \ell _{\infty }^{c \sqrt{n}},X\right) \geqslant \frac{cn^{1/4}}{A \sqrt{\log n}}\, . \end{aligned}$$

Case (ii) is similar by duality. In case (iii), we have

$$\begin{aligned} \rho (X,Y) \geqslant \rho \left( \ell _{2}^{cA^2/\log n},Y\right) \Big /{{\,\mathrm{f}\,}}\left( \ell _{2}^{cA^2/\log n},X\right) \geqslant \frac{cA}{(\log n)^{3/2}}. \end{aligned}$$

The optimal choice $A = n^{1/8} \sqrt{\log n}$ always gives the lower bound $\rho (X,Y) \geqslant cn^{1/8}/\log n$, concluding the proof. $\square $
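Both the bound in case (iii) and the value of A can be checked by a short computation. This is a sketch with $k=cA^2/\log n$, in which universal constants may change between occurrences and are dropped in the balancing step; the first line uses Lemma 19(b), which applies provided $\dim Y = m \geqslant k$.

$$\begin{aligned}&\rho \left( \ell _{2}^{k},Y\right) \geqslant \sqrt{k} = \frac{\sqrt{c}\, A}{\sqrt{\log n}} \quad \text {and} \quad \mathrm {f}\left( \ell _{2}^{k},X\right) \leqslant C \log n \quad \Longrightarrow \quad \rho (X,Y) \geqslant \frac{cA}{(\log n)^{3/2}}\, , \\&\frac{n^{1/4}}{A \sqrt{\log n}} = \frac{A}{(\log n)^{3/2}} \quad \Longleftrightarrow \quad A^2 = n^{1/4} \log n \quad \Longleftrightarrow \quad A = n^{1/8} \sqrt{\log n}\, , \\&\frac{A}{(\log n)^{3/2}} = \frac{n^{1/8}}{\log n}\, . \end{aligned}$$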

5 Conclusions

In this paper, we have defined and investigated XOR games from the foundational standpoint provided by general probabilistic theories. This has led us to identify a deep connection between the minimum relative increase in the bias when global strategies displace local ones on the one hand, and the so-called projective/injective ratio on the other. The existence of such a connection is made possible by the fact that all norms on a given vector space can be well approximated by suitable base norms induced by GPTs.

The projective/injective ratio r(n, m) is a universal function over pairs of integers that encodes some structural information about the theory of Banach spaces. For instance, we have shown that n/r(n, n) provides a lower bound on the diameter of the Banach–Mazur compactum in dimension n as measured by the weak distance [6, 7]. We have also proved that r(n, m) is always lower bounded by a universal constant strictly larger than 1. This shows the remarkable fact that injective and projective tensor products can never be isometric, even though Pisier’s celebrated construction [8] demonstrates that they can be isomorphic in the case where the spaces have infinite dimension. Along the way, we developed an Auerbach-type lemma that may be of independent interest.

The main results we have presented concern the asymptotic behaviour of the ratio r(n, m) and of its symmetrised version $r_s(n)$. In this context, we were able to show that, up to logarithmic factors, $r_s(n)$ is of the order $\sqrt{n}$. We showed that r(n, m) grows at least as $\min \{n,m\}^{1/8}$, and that one can improve the exponent to 1/6 if $n=m$. The proofs of these latter results follow by putting together an understanding of the projective/injective ratio in tensor products of the form $\ell _p^d\otimes X$, with $p=1,2,\infty $, and a ‘trichotomy theorem’ that identifies in any normed space a sufficiently large subspace that is close in the Banach–Mazur distance to either (a) $\ell _1^d$, or (b) $\ell _2^d$, or (c) $\ell _\infty ^d$. The main technical hurdle consists in establishing the additional requirement that in case (b) the chosen subspace is also well-complemented. As we have discussed, our findings draw on previous ideas by Pisier, Rudelson, Szarek, and Tomczak-Jaegermann.

Finally, although our primary subject of study is the intrinsic theory of XOR games played over general physical systems, it would be futile to deny that quantum systems hold great importance, due to their omnipresence in Nature as we currently understand it. In this spirit, we determined the exact scaling of the maximal global/local bias ratio in quantum XOR games, finding that it is of the order $\min \{n,m\}^{3/2}$, with n, m being the local Hilbert space dimensions. Interestingly, this implies a new bound on the maximal strength of quantum mechanical data hiding against local operations.

Our work leaves a number of open problems that we believe are worth investigating. Let us recall briefly some of them. First, it would be interesting to compute exactly the absolute minimum of r(n, m) across all pairs of integers, which one may conjecture to be equal to $\sqrt{2}$. A perhaps more profound question is to determine the best exponent $\gamma _{\text {opt}}$ such that $r(n,m)\geqslant c\min \{n,m\}^{\gamma _{\text {opt}}}$ for all n, m. We ask whether $\gamma _{\text {opt}}=1/2$. As we have seen, the simplified statement with $n=m$ would follow from Rudelson’s conjecture [7] that the Banach–Mazur compactum in dimension n has a diameter of the order $\sqrt{n}$ with respect to the weak distance.

Notes

In fact, one can pick the assisting distribution to reproduce directly the answers a, b the players have to give. For a classical XOR game defined by questions x, y and correct answers $c_{xy}\in \{0,1\}$, it suffices to define the assisting probability distribution by $p(ab|xy)=1/2$ if $a\oplus b=c_{xy}$ and 0 otherwise.
For convenience, here we are extending the definition of ‘locally constrained sets of measurements’ with respect to that given in [32], including also the scenario corresponding to an XOR game.

References

Bell, J.S.: On the Einstein–Podolsky–Rosen paradox. Physics 1(3), 195–200 (1964)
Hartkämper, A., Neumann, H.: Foundations of Quantum Mechanics and Ordered Linear Spaces: Advanced Study Institute held in Marburg 1973. Springer, Berlin (1974)
Barrett, J.: Information processing in generalized probabilistic theories. Phys. Rev. A 75(3), 032304 (2007)
Lami, L.: Non-classical correlations in quantum mechanics and beyond. Ph.D. thesis, Universitat Autònoma de Barcelona (2017). Preprint arXiv:1803.02902
Palazuelos, C., Vidick, T.: Survey on nonlocal games and operator space theory. J. Math. Phys. 57(1), 015220 (2016)
Tomczak-Jaegermann, N.: The weak distance between finite-dimensional Banach spaces. Math. Nachr. 119, 291–307 (1984)
Rudelson, M.: Estimates of the weak distance between finite-dimensional Banach spaces. Isr. J. Math. 89(1–3), 189–204 (1995)
Pisier, G.: Counterexamples to a conjecture of Grothendieck. Acta Math. 151(1), 181–208 (1983)
Segal, I.E.: Postulates for general quantum mechanics. Ann. Math. 48(4), 930–948 (1947)
Mackey, G.: Mathematical Foundations of Quantum Mechanics. Benjamin, New York (1963)
Ludwig, G.: Versuch einer axiomatischen Grundlegung der Quantenmechanik und allgemeinerer physikalischer Theorien. Z. Phys. 181(3), 233–260 (1964)
Ludwig, G.: An Axiomatic Basis for Quantum Mechanics: Derivation of Hilbert Space Structure, vol. 1. Springer, Berlin (1985)
Ludwig, G.: Attempt of an axiomatic foundation of quantum mechanics and more general theories II. Commun. Math. Phys. 4(5), 331–348 (1967)
Ludwig, G.: Attempt of an axiomatic foundation of quantum mechanics and more general theories III. Commun. Math. Phys. 9(1), 1–12 (1968)
Dähn, G.: Attempt of an axiomatic foundation of quantum mechanics and more general theories IV. Commun. Math. Phys. 9(3), 192–211 (1968)
Stolz, P.: Attempt of an axiomatic foundation of quantum mechanics and more general theories V. Commun. Math. Phys. 11(4), 303–313 (1969)
Davies, E.B., Lewis, J.T.: An operational approach to quantum probability. Commun. Math. Phys. 17(3), 239–260 (1970)
Edwards, C.M.: The operational approach to algebraic quantum theory I. Commun. Math. Phys. 16(3), 207–230 (1970)
Hardy, L.: Quantum theory from five reasonable axioms (2001). Preprint arXiv:quant-ph/0101012
D’Ariano, G.M.: On the missing axiom of quantum mechanics. AIP Conf. Proc. 810(1), 114–130 (2006)
Wilce, A.: Four and a half axioms for finite-dimensional quantum probability. In: Ben-Menahem, Y., Hemmo, M. (eds.) Probability in Physics. Springer, pp. 281–298 (2012)
Masanes, L., Müller, M.P.: A derivation of quantum theory from physical requirements. New J. Phys. 13(6), 063001 (2011)
Barnum, H., Müller, M.P., Ududec, C.: Higher-order interference and single-system postulates characterizing quantum theory. New J. Phys. 16(12), 123029 (2014)
Popescu, S., Rohrlich, D.: Quantum nonlocality as an axiom. Found. Phys. 24(3), 379–385 (1994)
Barrett, J., Linden, N., Massar, S., Pironio, S., Popescu, S., Roberts, D.: Nonlocal correlations as an information-theoretic resource. Phys. Rev. A 71, 022101 (2005)
Brassard, G., Buhrman, H., Linden, N., Méthot, A.A., Tapp, A., Unger, F.: Limit on nonlocality in any world in which communication complexity is not trivial. Phys. Rev. Lett. 96, 250401 (2006)
Linden, N., Popescu, S., Short, A.J., Winter, A.: Quantum nonlocality and beyond: limits from nonlocal computation. Phys. Rev. Lett. 99, 180502 (2007)
Barnum, H., Barrett, J., Leifer, M., Wilce, A.: Generalized no-broadcasting theorem. Phys. Rev. Lett. 99(24), 240501 (2007)
Barnum, H., Gaebler, C.P., Wilce, A.: Ensemble steering, weak self-duality, and the structure of probabilistic theories. Found. Phys. 43(12), 1411–1427 (2013)
Barnum, H., Barrett, J., Leifer, M., Wilce, A.: Teleportation in general probabilistic theories. Proc. Sympos. Appl. Math. 71, 25–48 (2012)
Jenčová, A.: Incompatible measurements in a class of general probabilistic theories. Phys. Rev. A 98, 012133 (2018)
Lami, L., Palazuelos, C., Winter, A.: Ultimate data hiding in quantum mechanics and beyond. Commun. Math. Phys. 361(2), 661–708 (2018)
Janotta, P., Lal, R.: Generalized probabilistic theories without the no-restriction hypothesis. Phys. Rev. A 87, 052131 (2013)
Ellis, A.J.: The duality of partially ordered normed linear spaces. J. Lond. Math. Soc. 1(1), 730–744 (1964)
Ellis, A.J.: Linear operators in partially ordered normed vector spaces. J. Lond. Math. Soc. 1(1), 323–332 (1966)
Edwards, D.A.: On the homeomorphic affine embedding of a locally compact cone into a Banach dual space endowed with the vague topology. Proc. Lond. Math. Soc. 3(3), 399–414 (1964)
Boyd, S.P., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Kläy, M., Randall, C., Foulis, D.: Tensor products and probability weights. Int. J. Theor. Phys. 26(3), 199–219 (1987)
Wilce, A.: Tensor products in generalized measure theory. Int. J. Theor. Phys. 31(11), 1915–1928 (1992)
Grothendieck, A.: Résumé de la théorie métrique des produits tensoriels topologiques. Bol. Soc. Mat. São Paulo 8, 1–79 (1953)
Defant, A., Floret, K.: Tensor Norms and Operator Ideals, vol. 176. Elsevier, Hoboken (1992)
Ryan, R.A.: Introduction to Tensor Products of Banach Spaces. Springer, Berlin (2013)
Cleve, R., Hoyer, P., Toner, B., Watrous, J.: Consequences and limits of nonlocal strategies. In: 19th IEEE Annual Conference on Computational Complexity, 2004. Proceedings, pp. 236–249. IEEE (2004)
Clauser, J.F., Horne, M.A., Shimony, A., Holt, R.A.: Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett. 23, 880–884 (1969)
Tsirel’son, B.S.: Quantum analogues of the Bell inequalities. The case of two spatially separated domains. J. Sov. Math. 36(4), 557–570 (1987)
Buscemi, F.: All entangled quantum states are nonlocal. Phys. Rev. Lett. 108, 200401 (2012)
Regev, O., Vidick, T.: Quantum XOR games. ACM Trans. Comput. Theory 7(4), 15:1–15:43 (2015)
Pisier, G.: Un théorème sur les opérateurs linéaires entre espaces de Banach qui se factorisent par un espace de Hilbert. Ann. Sci. École Norm. Sup. (4) 13(1), 23–43 (1980)
Bourgain, J., Milman, V.D.: Distances between normed spaces, their subspaces and quotient spaces. Integral Equ. Oper. Theory 9(1), 31–46 (1986)
Szarek, S.J., Tomczak-Jaegermann, N.: On the nontrivial projection problem. Adv. Math. 221(2), 331–342 (2009)
Brandão, F.G.S.L., Horodecki, M.: Exponential decay of correlations implies area law. Commun. Math. Phys. 333(2), 761–798 (2015)
Matthews, W., Wehner, S., Winter, A.: Distinguishability of quantum states under restricted families of measurements with an application to quantum data hiding. Commun. Math. Phys. 291(3), 813–843 (2009)
Tomczak-Jaegermann, N.: Banach–Mazur Distances and Finite-Dimensional Operator Ideals. Pitman Monographs and Surveys in Pure and Applied Mathematics, vol. 38. Longman Scientific & Technical, Harlow (1989). copublished in the United States with John Wiley & Sons, Inc., New York
Gluskin, E.D.: Diameter of the Minkowski compactum is approximately equal to $n$. Funct. Anal. Appl. 15(1), 57–58 (1981)
Stromquist, W.: The maximum distance between two-dimensional Banach spaces. Math. Scand. 48(2), 205–225 (1981)
Lindenstrauss, J., Tzafriri, L.: Classical Banach Spaces I and II, vol. 97. Springer, Berlin (1977)
Nordlander, G.: The modulus of convexity in normed linear spaces. Ark. Mat. 4, 15–17 (1960)
Junge, M., Palazuelos, C., Villanueva, I.: Classical versus quantum communication in XOR games. Quantum Inf. Process. 17(5), 117 (2018)
Pisier, G.: The Volume of Convex Bodies and Banach Space Geometry. Cambridge Tracts in Mathematics, vol. 94. Cambridge University Press, Cambridge (1989)
Aubrun, G., Szarek, S.J.: Alice and Bob Meet Banach. Mathematical Surveys and Monographs, vol. 223. American Mathematical Society, Providence (2017). The interface of asymptotic geometric analysis and quantum information theory
Latała, R., Oleszkiewicz, K.: Gaussian measures of dilatations of convex symmetric sets. Ann. Probab. 27(4), 1922–1938 (1999)
Ball, K.: An elementary introduction to modern convex geometry. In: Flavors of Geometry. Mathematical Sciences Research Institute Publications, vol. 31, pp. 1–58. Cambridge Univ. Press, Cambridge (1997)
Vershynin, R.: John’s decompositions: selecting a large part. Isr. J. Math. 122, 253–277 (2001)
Gordon, Y.: On $p$-absolutely summing constants of Banach spaces. Isr. J. Math. 7, 151–163 (1969)
König, H., Tomczak-Jaegermann, N.: Bounds for projection constants and 1-summing norms. Trans. Am. Math. Soc. 320(2), 799–823 (1990)
Grünbaum, B.: Projection constants. Trans. Am. Math. Soc. 95(3), 451–465 (1960)
Haagerup, U.: The Grothendieck inequality for bilinear forms on $C^\ast $-algebras. Adv. Math. 56(2), 93–116 (1985)
Pisier, G.: Grothendieck’s theorem for noncommutative $C^{\ast } $-algebras, with an appendix on Grothendieck’s constants. J. Funct. Anal. 29(3), 397–415 (1978)

Acknowledgements

Open Access funding provided by Projekt DEAL. We are grateful to Gilles Pisier for sharing with us an unpublished proof of the estimate $r_s(n)\geqslant n^{1/10 - o(1)}$, and to Marius Junge for helpful comments on some of our results. We thank the Institut Henri Poincaré for support and hospitality during the programme ‘Analysis in Quantum Information Theory’ when part of the work on this paper was performed. GA was supported in part by ANR (France) under the Grant StoQ (2014-CE25-0003). LL acknowledges financial support from the European Research Council (ERC) under the Starting Grant GQCOP (Grant No. 637352). CP is partially supported by the Spanish ‘Ramón y Cajal Programme’ (RYC-2012-10449), the Spanish ‘Severo Ochoa Programme’ for Centres of Excellence (SEV-2015-0554) and the Grant MTM2014-54240-P, funded by Spanish MINECO. The research of SJS was supported in part by a Grant DMS-1600124 from the National Science Foundation (U.S.A.). AW acknowledges support from the Spanish MINECO, Project FIS2016-86681-P, with the support of FEDER funds, and the Generalitat de Catalunya, CIRIT Project 2014-SGR-966.

Author information

Authors and Affiliations

Institut Camille Jordan, Université Claude Bernard Lyon 1, 43 boulevard du 11 novembre 1918, 69622, Villeurbanne Cedex, France
Guillaume Aubrun
School of Mathematical Sciences, Centre for the Mathematics and Theoretical Physics of Quantum Non-Equilibrium Systems, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
Ludovico Lami
Institut für Theoretische Physik und IQST, Universität Ulm, Albert-Einstein-Allee 11, 89069, Ulm, Germany
Ludovico Lami
Departamento de Análisis Matemático y Matemática Aplicada, Universidad Complutense de Madrid, Plaza de Ciencias s/n, 28040, Madrid, Spain
Carlos Palazuelos
Instituto de Ciencias Matemáticas, C/ Nicolás Cabrera, 13-15, 28049, Madrid, Spain
Carlos Palazuelos
Department of Mathematics, Applied Mathematics and Statistics, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH, 44106, USA
Stanisław J. Szarek
Institut de Mathématiques de Jussieu-PRG, Sorbonne Université, 4 place Jussieu, 75005, Paris, France
Stanisław J. Szarek
Física Teòrica: Informació i Fenòmens Quàntics, Departament de Física, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain
Andreas Winter
ICREA – Institució Catalana de Recerca i Estudis Avançats, Pg. Lluis Companys 23, 08010, Barcelona, Spain
Andreas Winter

Authors

Guillaume Aubrun
View author publications
You can also search for this author in PubMed Google Scholar
Ludovico Lami
Carlos Palazuelos
Stanisław J. Szarek
Andreas Winter

Corresponding author

Correspondence to Ludovico Lami.

Additional information

Communicated by M. M. Wolf.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A. More on XOR Games in GPTs

In this appendix we demonstrate that the connection drawn by Theorem 1 between injective norms and the bias of XOR games under local strategies is, in a certain sense, robust. Namely, we will show that allowing the players to use a bounded amount of back-and-forth communication does not make the bias larger than a constant times the same injective norm as in (22). In other words, the bias does not grow by more than a constant factor with respect to the purely local case.

Our argument does not require the communication to be classical. In fact, in principle the players are allowed to exchange any physical system described by a GPT. For instance, Alice could initiate the protocol by manipulating the subsystem A corresponding to her share of the question so as to prepare a bipartite state of a new system $A_1M_1$; the subsystem $M_1$ is sent to Bob, while Alice keeps $A_1$ for later use; then, Bob employs $M_1$ together with his share of the question B to prepare a message $M'_1$ to be sent to Alice and a record $B_1$ for later use. After N such rounds, Alice will have sent the systems $M_1,\ldots , M_N$, and Bob will have sent the systems $M'_1,\ldots , M'_N$. The total dimension of the systems exchanged is thus

$$\begin{aligned} L_\leftrightarrow :=(\dim (M_1)\ldots \dim (M_N)) (\dim (M'_1)\ldots \dim (M'_N))\, . \end{aligned}$$

(A1)

In what follows, we will refer to such a setting as a ‘local strategy assisted by two-way communication of total dimension $L_\leftrightarrow $’. We now deal with the problem of bounding the corresponding bias.
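As a small illustration of (A1), the following sketch computes $L_\leftrightarrow $ from the dimensions of the exchanged systems. The function name and the example dimensions are assumptions made here for illustration and do not appear elsewhere in the paper.

```python
from math import prod

def total_comm_dimension(dims_alice_to_bob, dims_bob_to_alice):
    """Total dimension of the exchanged systems, as in (A1).

    dims_alice_to_bob: dimensions of M_1, ..., M_N (sent by Alice)
    dims_bob_to_alice: dimensions of M'_1, ..., M'_N (sent by Bob)
    """
    if len(dims_alice_to_bob) != len(dims_bob_to_alice):
        raise ValueError("each player sends exactly one system per round")
    return prod(dims_alice_to_bob) * prod(dims_bob_to_alice)

# Hypothetical example with N = 3 rounds
print(total_comm_dimension([2, 3, 2], [4, 1, 2]))  # 96
```

Note that the quantity is multiplicative in the message dimensions, so it grows exponentially with the number of rounds whenever each message has dimension at least 2.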

Note

We will often consider complicated compositions of maps acting on different systems. The convention we adopt is to omit all occurrences of the identity map acting on the untouched systems. In this way, if $T:A\rightarrow BC$ and $S:B\rightarrow DE$ are linear maps, we write ST instead of $(S_B\otimes \mathrm {Id}_C)\, T_A$.

Proposition A.1

Let $G=(AB, \omega , p,c)$ be an XOR game over a bipartite GPT AB, and set $z_G = \sum _{i} p_i (-1)^{c_i}\omega _i$ as in Theorem 1. The bias $\beta _{\leftrightarrow }(G)$ of G under local strategies assisted by two-way communication of total dimension $L_\leftrightarrow $ can be upper bounded as

$$\begin{aligned} \beta _{\leftrightarrow }(G)\leqslant & {} \sup _{\dim W\leqslant L_\leftrightarrow } \left\| \widetilde{z}_G\otimes \mathrm {Id}_W : V_A^* \otimes _\varepsilon W \longrightarrow V_B\otimes _\pi W \right\| \nonumber \\\leqslant & {} L_\leftrightarrow \, \Vert z_G\Vert _{V_A\otimes _\varepsilon V_B} = L_\leftrightarrow \, \beta _{\mathrm {LO}}(G)\, , \end{aligned}$$

(A2)

where the optimisation is over all normed spaces W of dimension up to $L_\leftrightarrow $, and $\widetilde{z}_G:V_A^*\rightarrow V_B$ is the linear map associated with the tensor $z_G\in V_A\otimes V_B$ according to (42).

Proof

The N rounds of communication can be represented by linear maps $T_\alpha : V_{A_{\alpha -1}} \otimes V_{M_{\alpha -1}'}\rightarrow V_{A_\alpha }\otimes V_{M_\alpha }$ and $S_\alpha : V_{B_{\alpha -1}}\otimes V_{M_\alpha } \rightarrow V_{B_\alpha }\otimes V_{M'_\alpha }$, for $\alpha =1,\ldots , N$, where for convenience we identified $A_0:=A$, $B_0:=B$, and $V_{M'_0}:={\mathbf {R}}$. After the communication stage has taken place, Alice is left with the systems $A_N M'_N$, while Bob will have only $B_N$. They then perform local measurements to output the answers. These can be conveniently represented as $\left\{ \frac{u+\varphi }{2}, \frac{u-\varphi }{2}\right\} $ (on Alice’s side) and $\left\{ \frac{u+\psi }{2}, \frac{u-\psi }{2}\right\} $ (on Bob’s side), where $\varphi \in V_{A_N}^*\otimes V_{M'_N}^*$ and $\psi \in V_{B_N}^*$. A reasoning analogous to that in the proof of Theorem 1 shows that the bias for this strategy will be given by

$$\begin{aligned} \beta = (\varphi \otimes \psi )\left( (S_N T_N)\ldots (S_1 T_1)(z_G)\right) =:w(z_G) = {{\,\mathrm{tr}\,}}[\widetilde{z}_G\, \widetilde{w}\,^* ]\, , \end{aligned}$$

(A3)

where we observed that the validity of the above equation for all $z_G$ defines a functional $w\in V_{A}^*\otimes V_B^*$ (which depends on $\varphi $, $\psi $, and all the maps $T_\alpha , S_\alpha $, for $\alpha =1,\ldots , N$), and for the last step we used (43).

Now, we claim that the rank of the operator $\widetilde{w}:V_A\rightarrow V_B^*$ satisfies

$$\begin{aligned} {{\,\mathrm{rk}\,}}\left( \widetilde{w}\right) \leqslant L_\leftrightarrow \, . \end{aligned}$$

(A4)

This can be verified straightforwardly by considering for all $\alpha $ families of vectors $\{x_{\alpha , j_\alpha }\in V_{M_\alpha }\}_{j_\alpha =1,\ldots , \dim (M_\alpha )}$, $\{y_{\alpha , k_\alpha }\in V_{M'_\alpha }\}_{k_\alpha =1,\ldots , \dim (M_\alpha ')}$ and families of maps $\big \{{\overline{T}}_{\alpha , j_\alpha }:V_{A_{\alpha -1}} \otimes V_{M_{\alpha -1}'}\rightarrow V_{A_\alpha }\big \}_{j_\alpha =1,\ldots , \dim (M_\alpha )}$, $\big \{{\overline{S}}_{\alpha , k_\alpha }: V_{B_{\alpha -1}}\otimes V_{M_\alpha } \rightarrow V_{B_\alpha } \big \}_{k_\alpha =1,\ldots , \dim ( M_\alpha ')}$ such that one can expand

$$\begin{aligned} T_\alpha = \sum _{j_\alpha =1}^{\dim (M_\alpha )} {\overline{T}}_{\alpha , j_\alpha } \otimes x_{\alpha , j_\alpha }\, ,\qquad S_\alpha = \sum _{k_\alpha =1}^{\dim (M_\alpha ')} {\overline{S}}_{\alpha , k_\alpha } \otimes y_{\alpha , k_\alpha }\, . \end{aligned}$$

Defining the ‘reduced’ maps ${\overline{T}}_{\alpha , j_\alpha } y_{\alpha -1,k_{\alpha -1}}: V_{A_{\alpha -1}} \rightarrow V_{A_\alpha }$ (for $\alpha =2,\ldots , N$) and ${\overline{S}}_{\alpha , k_\alpha }x_{\alpha ,j_{\alpha }}: V_{B_{\alpha -1}} \rightarrow V_{B_\alpha }$ (for $\alpha =1,\ldots , N$), we see that

$$\begin{aligned} w= & {} \sum _{\{j_\alpha ,\, k_\alpha \}_\alpha } \!\!\varphi \left( y_{N,k_N}\otimes ({\overline{T}}_{N,j_N}y_{N-1,k_{N-1}})\!\ldots \!({\overline{T}}_{2,j_2}y_{1,k_1}) {\overline{T}}_{1,j_1} \right) \\&\,\otimes \, \psi \left( ({\overline{S}}_{N,k_N} x_{N,j_N})\!\ldots \! ({\overline{S}}_{1,k_1} x_{1,j_1}) \right) , \end{aligned}$$

where the first tensor factors are functionals in $V_A^*$, and the second belong to $V_{B}^*$. Since the above sum contains exactly $L_\leftrightarrow $ terms, each of which is a product functional and hence corresponds to an operator of rank at most one, we see that (A4) follows.
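The counting argument behind (A4) can be illustrated numerically. The sketch below is an illustration only: the helper names, the dimensions and the random integer vectors are assumptions, not objects from the proof. It builds a sum of a given number of product tensors and confirms, with exact rational arithmetic, that the rank of the associated matrix does not exceed the number of terms.

```python
from fractions import Fraction
import random

def rank(matrix):
    """Rank of a matrix with Fraction entries, by Gaussian elimination."""
    rows = [row[:] for row in matrix]
    rk = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def sum_of_products(dim_a, dim_b, n_terms, rng):
    """Matrix of sum_k f_k (x) g_k for random integer vectors f_k, g_k."""
    w = [[Fraction(0)] * dim_a for _ in range(dim_b)]
    for _ in range(n_terms):
        f = [rng.randint(-3, 3) for _ in range(dim_a)]
        g = [rng.randint(-3, 3) for _ in range(dim_b)]
        for i in range(dim_b):
            for j in range(dim_a):
                w[i][j] += g[i] * f[j]
    return w

rng = random.Random(0)
w = sum_of_products(dim_a=6, dim_b=5, n_terms=3, rng=rng)
print(rank(w) <= 3)  # True
```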

As it turns out, w also satisfies the inequality

$$\begin{aligned} \Vert \widetilde{w}:V_A\rightarrow V_B^*\Vert \leqslant 1\, . \end{aligned}$$

(A5)

To see why this is the case, observe that the bias $\beta = w(z_G)$ cannot be larger than the maximal bias achievable by global strategies, as given by Theorem 1. This implies that $w(z_G)\leqslant \Vert z_G\Vert _{V_A\otimes _\pi V_B}$. Since this has to hold for all $z_G\in V_A\otimes V_B$, and injective and projective tensor products are dual to each other by (44), we deduce that $1\geqslant \Vert w\Vert _{(V_A\otimes _\pi V_B)^*}=\Vert w\Vert _{V_A^*\otimes _\varepsilon V_B^*}=\Vert \widetilde{w}:V_A\rightarrow V_B^*\Vert $, where the last equality is an application of (45).

Putting together (A3), (A4), and (A5), we see that

$$\begin{aligned} \beta _{\leftrightarrow }(G) \leqslant \sup _{\begin{array}{c} \\ w\in V_A^*\otimes V_B^* \\ {{\,\mathrm{rk}\,}}(\widetilde{w})\leqslant L_\leftrightarrow \\ \Vert \widetilde{w}:V_A\rightarrow V_B^*\Vert \leqslant 1 \end{array}} |w(z_G)| = \sup _{\begin{array}{c} \\ \Vert F:V_A\rightarrow V_B^*\Vert \leqslant 1 \\ {{\,\mathrm{rk}\,}}(F)\leqslant L_\leftrightarrow \end{array}} {{\,\mathrm{tr}\,}}\left[ \widetilde{z}_G\, F^*\right] . \end{aligned}$$

As follows from elementary linear algebra, an operator $F:V_A\rightarrow V_B^*$ satisfies ${{\,\mathrm{rk}\,}}(F)\leqslant L_\leftrightarrow $ and $\Vert F\Vert \leqslant 1$ if and only if it can be factorised as $F = f_2 f_1$, where $f_1:V_A\rightarrow W$, $f_2:W\rightarrow V_B^*$ are linear maps, W is a suitable Banach space of dimension $\dim W\leqslant L_\leftrightarrow $, and $\Vert f_1\Vert , \Vert f_2\Vert \leqslant 1$. Using this observation, we can rewrite the upper bound in the above inequality as

$$\begin{aligned} \beta _{\leftrightarrow }(G) \leqslant \sup _{\begin{array}{c} \\ \dim W\leqslant L_\leftrightarrow \\ \Vert f_1:V_A\rightarrow W\Vert ,\, \Vert f_2:W\rightarrow V_B^*\Vert \leqslant 1 \end{array}} {{\,\mathrm{tr}\,}}\left[ \widetilde{z}_G\, f_1^* f_2^*\right] . \end{aligned}$$

(A6)

Defining the tensors $u\in V_A^*\otimes W$ and $v\in V_B^*\otimes W^*$ such that $\widetilde{u}=f_1$ and $\widetilde{v}=f_2^*$, we can rewrite

$$\begin{aligned} {{\,\mathrm{tr}\,}}\left[ \widetilde{z}_G\, f_1^* f_2^*\right] = v\left( (\widetilde{z}_G\otimes \mathrm {Id}_W)(u) \right) . \end{aligned}$$

At the same time, the constraints $\Vert f_1:V_A\rightarrow W\Vert , \Vert f_2:W\rightarrow V_B^*\Vert \leqslant 1$ become simply $\Vert u\Vert _{V_A^* \otimes _\varepsilon W},\, \Vert v\Vert _{V_B^*\otimes _\varepsilon W^*}\leqslant 1$. Using once again (44), the bound in (A6) translates to

$$\begin{aligned} \beta _\leftrightarrow (G)&\leqslant \sup _{\begin{array}{c} \\ \dim W\leqslant L_\leftrightarrow \\ \Vert u\Vert _{V_A^*\otimes _\varepsilon W},\, \Vert v\Vert _{V_B^*\otimes _\varepsilon W^*}\leqslant 1 \end{array}} v\left( (\widetilde{z}_G\otimes \mathrm {Id}_W)(u) \right) \\&= \sup _{\begin{array}{c} \\ \dim W\leqslant L_\leftrightarrow \\ \Vert u\Vert _{V_A^*\otimes _\varepsilon W} \leqslant 1 \end{array}} \left\| (\widetilde{z}_G\otimes \mathrm {Id}_W)(u) \right\| _{V_B\otimes _\pi W} \\&= \sup _{\dim W\leqslant L_\leftrightarrow } \left\| \widetilde{z}_G\otimes \mathrm {Id}_W: V_A^*\otimes _\varepsilon W \rightarrow V_B\otimes _\pi W\right\| , \end{aligned}$$

which proves the first upper bound in (A2). To obtain the other inequalities, we write

$$\begin{aligned} \left\| \widetilde{z}_G\otimes \mathrm {Id}_W: V_A^*\otimes _\varepsilon W \rightarrow V_B\otimes _\pi W\right\|&{\mathop {\leqslant }\limits ^{{{1}}}} L_\leftrightarrow \left\| \widetilde{z}_G\otimes \mathrm {Id}_W: V_A^*\otimes _\pi W \rightarrow V_B\otimes _\pi W\right\| \\&{\mathop {=}\limits ^{{{2}}}} L_\leftrightarrow \left\| \widetilde{z}_G : V_A^* \rightarrow V_B \right\| \\&{\mathop {=}\limits ^{{{3}}}} L_\leftrightarrow \left\| z_G\right\| _{V_A \otimes _\varepsilon V_B}\, . \end{aligned}$$

The above steps are easy to justify: 1: we employed the inequality

$$\begin{aligned} \Vert \cdot \Vert _{V_A^*\otimes _\varepsilon W} \geqslant \frac{1}{\dim W}\, \Vert \cdot \Vert _{V_A^*\otimes _\pi W} \geqslant \frac{1}{L_\leftrightarrow }\, \Vert \cdot \Vert _{V_A^*\otimes _\pi W}\, , \end{aligned}$$

which derives from (18) (in turn proven in [32, Proposition 21]); 2: follows because the extreme points of the unit ball of $V_A^*\otimes _\pi W$ are product vectors; 3: is an application of (45). $\square $

Remark

By the same kind of arguments, one can also show that sharing a physical system of bounded dimension does not help to increase the bias by more than a constant factor. We omit the details.

Finally, let us emphasise that here we have shown an upper bound for the bias of an XOR game with back-and-forth communication. In the work [58], the authors studied classical XOR games with both one-way classical communication and one-way quantum communication. It turns out that in that case, the bias of the games can be exactly expressed in terms of certain norms of the corresponding operator $\widetilde{z}_G:\ell _\infty \rightarrow \ell _1$.

Appendix B. Every Normed Space is 2-Isomorphic to a Base Norm Space

In this appendix we justify our choice of characterising the intrinsic difference between global and local strategies in XOR games by means of the projective/injective ratio as defined by (16) instead of (27), as discussed in Sect. 1.3. This corresponds to letting the optimisation run over all pairs of Banach spaces of fixed dimensions instead of restricting it to the base norm spaces alone, and does not lead to a significant loss of information because of the inequalities (29) and (30), whose proof we present here. Let us start with a preliminary result.

Lemma B.1

Every Banach space (possibly infinite-dimensional) is 2-isomorphic to a base norm space.

Proof

Let X be a Banach space. Pick a vector $x\in X$ such that $\Vert x\Vert =1$, and consider the associated norming functional $x^*\in X^*$, which satisfies $\Vert x^*\Vert =1$ and $x^*(x)=1$. Calling $B_X$ the unit ball of X, construct the set $F:=(x^*)^{-1}(1/2)\cap B_X$, and then set $B:={{\,\mathrm{cl}\,}}{{\,\mathrm{\mathrm {conv}}\,}}\left( F \cup (-F)\right) $, where the closure is possibly needed only in the infinite-dimensional case. It is not difficult to verify that B is the unit ball of the base norm space induced on X by the positive cone ${\mathbf {R}}_+\!\cdot \! F=\{v\in X:\, x^*(v)\geqslant \Vert v\Vert /2\}$ and the unit functional $u:=2x^*$ [2, p. 26]. Since $B\subseteq B_X$, it suffices to check that $B_X\subseteq 2B$ to establish the claim. To this end, we pick $y\in B_X$ and we check that $\frac{y}{2}\in B$. We can assume without loss of generality that $x^*(y)\geqslant 0$, while $|x^*(y)|\leqslant 1$ holds by construction. We now distinguish two cases.

If $x^*(y)\geqslant 1/2$, we can write
$$\begin{aligned} \frac{y}{2} = x^*(y) \frac{y}{2 x^*(y)} \in x^*(y) B \subseteq B\, , \end{aligned}$$
where we used the fact that $2x^*(y)\geqslant 1$.
The case where $0\leqslant x^*(y)<1/2$ is significantly less transparent. Figure 2 conveys the geometric intuition behind the proof. An analytical argument is as follows. Call $k:=x^*(y)$, and define the two vectors
$$\begin{aligned} z_\pm :=\frac{1}{2(1\mp k)}\left( \pm (1\mp 2k) x +y\right) . \end{aligned}$$
Observe that $z_+$ lies at the intersection of the segment joining y and x with the plane $x^*=1/2$. Analogously, $z_-$ lies at the intersection of the segment joining y and $-x$ with the plane $x^*=-1/2$. In particular, $z_\pm \in B$. We now try to obtain a multiple of y by taking a convex combination of $z_+$ and $z_-$. Setting
$$\begin{aligned} p(k) :=\frac{(1+2k)(1-k)}{2(1-2k^2)}\, , \end{aligned}$$
which satisfies $1/2 \leqslant p(k) < 1$ for all $0\leqslant k<1/2$, we can write
$$\begin{aligned} \frac{y}{2(1-2k^2)} = p(k) z_+ + (1-p(k)) z_- \in B\, . \end{aligned}$$
By rescaling the vector on the l.h.s. we see that $y/2\in B$.
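The algebra of the second case can be verified with exact rational arithmetic. The sketch below is only a sanity check under illustrative assumptions: the function name is hypothetical, vectors are represented by coordinate lists, and $x^*$ is taken to be the first coordinate functional. It checks the convex-combination identity and the range of $p(k)$, not the membership $z_\pm \in B$.

```python
from fractions import Fraction

def check_case_two(k, x, y):
    """Check y / (2(1-2k^2)) = p z_+ + (1-p) z_- coordinate-wise.

    k plays the role of x^*(y), with 0 <= k < 1/2;
    x and y are lists of Fractions of equal length.
    """
    z_plus = [((1 - 2 * k) * xi + yi) / (2 * (1 - k)) for xi, yi in zip(x, y)]
    z_minus = [(-(1 + 2 * k) * xi + yi) / (2 * (1 + k)) for xi, yi in zip(x, y)]
    p = (1 + 2 * k) * (1 - k) / (2 * (1 - 2 * k * k))
    lhs = [yi / (2 * (1 - 2 * k * k)) for yi in y]
    rhs = [p * zp + (1 - p) * zm for zp, zm in zip(z_plus, z_minus)]
    # With x^* the first coordinate, z_+ and z_- must lie on x^* = 1/2 and x^* = -1/2
    on_planes = z_plus[0] == Fraction(1, 2) and z_minus[0] == Fraction(-1, 2)
    return Fraction(1, 2) <= p < 1 and lhs == rhs and on_planes

k = Fraction(1, 3)
x = [Fraction(1), Fraction(0)]   # x^*(x) = 1
y = [k, Fraction(3, 4)]          # x^*(y) = k
print(check_case_two(k, x, y))   # True
```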

This concludes the proof. $\square $

We are now ready to prove the inequalities (29) and (30) discussed in the main text.

Lemma B.2

The functions r(n, m) and $r_{\mathrm {bn}}(n, m)$ defined by (16) and (27) satisfy (29) and (30):

$$\begin{aligned} r_{\mathrm {bn}}(n, m)&\leqslant 4\, r(n, m)\, ,\\ r_{\mathrm {bn}}(n, m)&\leqslant 2 + r(n-1, m-1)\, , \end{aligned}$$

for all integers $n,m\geqslant 2$.

Proof

Lemma B.1 proves that for all Banach spaces X there is a base norm Banach space $X'$ such that ${{\,\mathrm{d}\,}}(X,X')\leqslant 2$, where ${{\,\mathrm{d}\,}}$ is the Banach–Mazur distance (49). We apply this to a pair of finite-dimensional Banach spaces X, Y, with $\dim X=n$ and $\dim Y=m$, obtaining two base norm Banach spaces $X'$ and $Y'$ of the same dimension that are 2-isomorphic to X and Y, respectively. We find that

$$\begin{aligned} r_{\mathrm {bn}}(n,m)&\leqslant \rho (X',Y') \\&\leqslant {{\,\mathrm{d}\,}}(X',X) \rho (X,Y') \\&\leqslant {{\,\mathrm{d}\,}}(X',X) {{\,\mathrm{d}\,}}(Y',Y) \rho (X,Y) \\&\leqslant 4\rho (X,Y)\, , \end{aligned}$$

where we used (65) twice, once for each of the two arguments of $\rho $ (this is possible as $\rho $ is symmetric, see (61)). Taking the infimum over all pairs X, Y yields (29).

We now move on to proving the second inequality (30). The main idea of the argument is to construct, given a pair of Banach spaces X, Y of dimensions $n-1,m-1$, another pair of base norm spaces $X',Y'$ of dimensions n, m such that $\rho (X',Y')\approx \rho (X,Y)$. This can be done by setting $X':=X\oplus _\infty {\mathbf {R}}$, where $\Vert (x,a)\Vert _{X'}= \max \{\Vert x\Vert _X,|a|\}$ for all $x\in X$ and $a\in {\mathbf {R}}$, and analogously for $Y'$. It is not difficult to check that $\Vert \cdot \Vert _{X'}$ is in fact the base norm induced by the cone $C:=\{(x,a): a\geqslant \Vert x\Vert _X\}$ and the order unit $u_{X'}\in (X')^*$ defined by $u_{X'}(x,a)=a$ for all $x\in X$ and $a\in {\mathbf {R}}$. Thus, $X'$ and $Y'$ are base norm spaces. Incidentally, this is a systematic way of associating ‘centrally symmetric’ GPTs to Banach spaces, see [32, Section 6.1]. We now proceed to show that

$$\begin{aligned} \rho (X',Y')\leqslant 2 + \rho (X,Y)\, , \end{aligned}$$

(B1)

using a similar technique to that employed in the proof of [32, Proposition 26]. Take

$$\begin{aligned} z' = \begin{pmatrix} z &{} s \\ t &{} a \end{pmatrix}\in X'\otimes Y'\, , \end{aligned}$$

where $z\in X\otimes Y$, $s\in X$, $t\in Y$, and $a\in {\mathbf {R}}$. Using the fact that a unit functional $\varphi \in B_{(X')^*}$ acts as $\varphi (x,a) = p x^*(x) \pm (1-p)a$, for some $x^* \in B_{X^*}$ and $p\in [0,1]$, it is not difficult to show that

$$\begin{aligned} \Vert z'\Vert _{X'\otimes _\varepsilon Y'} = \max \left\{ \Vert z\Vert _{X\otimes _\varepsilon Y}, \Vert s\Vert , \Vert t\Vert , |a| \right\} . \end{aligned}$$

(B2)

We now give an upper estimate of the corresponding projective norm. Taking vectors $x_i\in X$ and $y_i\in Y$ such that $z=\sum _i x_i\otimes y_i$ and $\Vert z\Vert _{X\otimes _\pi Y} = \sum _i \Vert x_i\Vert \Vert y_i\Vert $, and for an arbitrary $p\in [0,1]$, we consider the decomposition

$$\begin{aligned}&z' = \begin{pmatrix} 0 &{} s \\ 0 &{} pa \end{pmatrix} + \begin{pmatrix} 0 &{} 0 \\ t &{} (1-p)a \end{pmatrix} + \begin{pmatrix} z &{} 0 \\ 0 &{} 0 \end{pmatrix}\\&\quad = (s,pa)\otimes (0,1) + (0,1)\otimes (t,(1-p)a) + \sum _i (x_i,0)\otimes (y_i,0)\, , \end{aligned}$$

which yields the estimate

$$\begin{aligned} \begin{aligned} \Vert z'\Vert _{X'\otimes _\pi Y'}&\leqslant \min _{p\in [0,1]} \left\{ \max \{\Vert s\Vert ,p|a|\} + \max \{\Vert t\Vert ,(1-p)|a|\} +\sum _i \Vert x_i\Vert \Vert y_i\Vert \right\} \\&= \max \left\{ \Vert s\Vert +\Vert t\Vert ,|a|\right\} + \sum _i \Vert x_i\Vert \Vert y_i\Vert \\&= \max \left\{ \Vert s\Vert +\Vert t\Vert ,|a|\right\} + \Vert z\Vert _{X\otimes _\pi Y} \\&\leqslant \max \left\{ \Vert s\Vert +\Vert t\Vert ,|a|\right\} + \rho (X,Y) \Vert z\Vert _{X\otimes _\varepsilon Y} \\&\leqslant \left( 2 + \rho (X,Y)\right) \Vert z'\Vert _{X'\otimes _\varepsilon Y'}\, . \end{aligned} \end{aligned}$$

(B3)
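The minimisation over $p$ in the first step of (B3) can be checked directly. In the sketch below the function names are hypothetical, and the non-negative numbers `s` and `t` stand in for $\Vert s\Vert $ and $\Vert t\Vert $. It exhibits an explicit minimiser and compares the result with the closed form $\max \{\Vert s\Vert +\Vert t\Vert , |a|\}$ and with a grid search over $p$.

```python
from fractions import Fraction

def split_cost(s, t, a, p):
    """Value of max{s, p|a|} + max{t, (1-p)|a|} for a given p in [0, 1]."""
    return max(s, p * abs(a)) + max(t, (1 - p) * abs(a))

def best_split(s, t, a):
    """One minimiser p: the ratio s/|a| clamped to [0, 1]."""
    if a == 0:
        return Fraction(0)
    return min(Fraction(1), Fraction(s) / abs(a))

def check(s, t, a, grid=100):
    """Compare the explicit minimiser with the closed form and a grid search."""
    p = best_split(s, t, a)
    grid_min = min(split_cost(s, t, a, Fraction(i, grid)) for i in range(grid + 1))
    return split_cost(s, t, a, p) == max(s + t, abs(a)) == grid_min

cases = [(1, 2, 5), (3, 1, 2), (0, 0, 4), (2, 2, -4), (1, 1, 0)]
print(all(check(s, t, a) for s, t, a in cases))  # True
```

The lower bound is immediate: the cost is at least $\Vert s\Vert +\Vert t\Vert $, and also at least $p|a|+(1-p)|a|=|a|$, for every $p$.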

Optimising over all $z'\in X'\otimes Y'$ and using (15) gives the estimate in (B1). We can now write

$$\begin{aligned} r_{\mathrm {bn}}(n,m)&\leqslant \inf _{\begin{array}{c} \dim X = n-1 \\ \dim Y = m-1 \end{array}} \rho (X',Y') \\&\leqslant \inf _{\begin{array}{c} \dim X = n-1 \\ \dim Y = m-1 \end{array}} \left\{ 2 + \rho (X,Y) \right\} \\&= 2 + r(n-1,m-1)\, . \end{aligned}$$

This concludes the proof. $\square $

Appendix C. Functional Analytic Tools

1. The $\ell $-norm and the $MM^*$-estimate

Let X be a real Banach space. Given a linear map $T:\ell _2^n\rightarrow X$, the $\ell $-norm of T is defined as

$$\begin{aligned} \ell _X(T)= {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^ng_i T|e_i\rangle \right\| _X, \end{aligned}$$

(C1)

where $(|e_i\rangle )_{i=1}^n$ is an orthonormal basis of ${\mathbb {R}}^n$ and $(g_i)_{i=1}^n$ is a sequence of independent N(0, 1) Gaussian random variables. We point out that several authors prefer to define $\ell $-norms via the second moment, i.e. $\ell _X(T)= \left( {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^ng_i T|e_i\rangle \right\| _X^2 \right) ^{1/2}$. However, both definitions give equivalent norms in view of Theorem C.2. Also, note that the invariance of the Gaussian measure under unitary transformations implies that $\ell _X(T)=\ell _X(T\circ U)$ for every unitary $U:\ell _2^n\rightarrow \ell _2^n$ and, in particular, (C1) does not depend on the choice of orthonormal basis.

The following theorem will be crucial for us.

Theorem C.1

($MM^*$-estimate). Let X be an n-dimensional Banach space. Then there exists an isomorphism $T:\ell _2^n\rightarrow X$ such that

$$\begin{aligned} \ell _X(T)\ell _{X^*}((T^{-1})^*)\leqslant Cn \log n. \end{aligned}$$

That statement is a direct consequence of Lewis’ theorem ([59, Theorem 3.1]) and a well known estimate on the so-called K-convexity constant of a Banach space. The reader can find a detailed proof of Theorem C.1 in [59, Theorem 3.11] or [60, Theorem 7.10].

2. Some Gaussian inequalities

We will make use of Khintchine–Kahane inequalities (see for instance [59, Corollary 4.9] or, for optimal constants, [61, Corollary 3]).

Theorem C.2

(Khintchine–Kahane inequalities). For every $1< p<\infty $ there exists a universal constant $C_p>0$ such that for every Banach space X and every sequence of elements $(x_i)_{i=1}^n\subset X$ we have

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^n g_ix_i\right\| _X\leqslant \left( {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^n g_i x_i\right\| _X^p\right) ^{\frac{1}{p}}\leqslant C_p {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^n g_i x_i\right\| _X. \end{aligned}$$
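For orientation, consider the scalar case $X={\mathbf {R}}$ with $p=2$; this worked example is an illustration and is not used in the proofs. The sum $\sum _i g_i x_i$ is then a centred Gaussian variable with variance $\sigma ^2 = \sum _i x_i^2$, so that

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left| \sum _{i=1}^n g_i x_i\right| = \sqrt{\frac{2}{\pi }}\, \sigma \, ,\qquad \left( {{\,\mathrm{{\mathbf {E}}}\,}}\left| \sum _{i=1}^n g_i x_i\right| ^2\right) ^{\frac{1}{2}} = \sigma \, . \end{aligned}$$

In particular, any admissible constant for $p=2$ must satisfy $C_2\geqslant \sqrt{\pi /2}$.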

We will also make use of Chevet’s inequality ([53, Theorem 43.1]).

Theorem C.3

(Chevet’s inequality). Let X and Y be real Banach spaces. Define the Gaussian random tensor $z=\sum _{i=1}^m \sum _{j=1}^n g_{ij}x_i\otimes y_j\in X\otimes Y$, where $(g_{ij})$ are independent N(0, 1) Gaussian random variables, and $(x_i)_{i=1}^m \subset X$, $(y_j)_{j=1}^n\subset Y$ are sequences of elements. Then,

$$\begin{aligned}&{{\,\mathrm{{\mathbf {E}}}\,}}\left\| z\right\| _{X \otimes _{\varepsilon } Y}\leqslant \sup _{x^*\in B_{X^*}}\left( \sum _{i=1}^m |x^*(x_i)|^2\right) ^{\frac{1}{2}} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{j=1}^n g_j y_j \right\| _{Y}\\&\quad \qquad \qquad \qquad \qquad + \sup _{y^*\in B_{Y^*}}\left( \sum _{j=1}^n |y^*(y_j)|^2\right) ^{\frac{1}{2}} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^m g_i x_i \right\| _{X}, \end{aligned}$$

where $(g_{i})_i$ is a sequence of independent N(0, 1) Gaussian random variables.

Note that, given a Banach space Z and $(z_i)_{i=1}^n\subset Z$, we have

$$\begin{aligned} \sup _{z^*\in B_{Z^*}}\left( \sum _{i=1}^n |z^*(z_i)|^2\right) ^{\frac{1}{2}}=\Vert T:\ell _2^n\rightarrow Z\Vert . \end{aligned}$$

Here T is the linear map defined by $T|e_i\rangle =z_i$ for every $i=1,\cdots , n$, where $(|e_i\rangle )_i$ is an orthonormal basis of $\ell _2^n$.

Here is a typical application of Chevet’s inequality. Fix integers m, n, and consider $(x_i)_{1 \leqslant i \leqslant m^2}$ and $(y_j)_{1 \leqslant j \leqslant n^2}$ orthonormal bases of ${\mathsf {M}}_{m}^{\mathrm {sa}}$ and ${\mathsf {M}}_{n}^{\mathrm {sa}}$ respectively, with respect to the Hilbert–Schmidt inner product. We form the random tensor

$$\begin{aligned} z = \sum _{i=1}^{m^2} \sum _{j=1}^{n^2} g_{ij} x_i \otimes y_j, \end{aligned}$$

(C2)

where $(g_{ij})$ are independent N(0, 1) Gaussian random variables.

Corollary C.4

Let z be defined as in (C2). Remember that we denote by $S_{p}^{n,\mathrm {sa}}$ the space of $n\times n$ Hermitian matrices equipped with the Schatten norm $\Vert \cdot \Vert _p$. Then

$$\begin{aligned}&{{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_1^{m,\mathrm {sa}}\otimes _{\varepsilon } S_1^{n,\mathrm {sa}}} \leqslant C \sqrt{mn} \max \{m,n\} , \\&{{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_{\infty }^{m,\mathrm {sa}} \otimes _{\varepsilon } S_{\infty }^{n,\mathrm {sa}}} \leqslant C \max \{\sqrt{m},\sqrt{n}\}. \end{aligned}$$

Proof

In both cases we apply Theorem C.3 and need to estimate all quantities appearing on the r.h.s. The random matrix $G_m = \sum _{j=1}^{m^2} g_j x_j$ is distributed according to the Gaussian Unitary Ensemble. It is well known (see for example [60, Proposition 6.24]) that as m tends to infinity,

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\Vert G_m\Vert _{\infty } \sim 2 \sqrt{m}, \ \ \ {{\,\mathrm{{\mathbf {E}}}\,}}\Vert G_m\Vert _{1} \leqslant m^{3/2} . \end{aligned}$$

On the other hand, it is also well known (and easy to check) that

$$\begin{aligned} ||\mathrm {Id}:S_2^{m,\mathrm {sa}}\rightarrow S_\infty ^{m,\mathrm {sa}}||\leqslant 1, \ \ \ ||\mathrm {Id}:S_2^{m,\mathrm {sa}}\rightarrow S_1^{m,\mathrm {sa}}||\leqslant \sqrt{m}. \end{aligned}$$

Since $S_2^{m,\mathrm {sa}}$ is isometric to $\ell _2^{m^2}$, we can apply Theorem C.3 to conclude that

$$\begin{aligned}&{{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_1^{m,\mathrm {sa}}\otimes _{\varepsilon } S_1^{n,\mathrm {sa}}} \leqslant n^{\frac{3}{2}}\sqrt{m}+ m^{\frac{3}{2}}\sqrt{n} \leqslant 2\sqrt{mn} \max \{m,n\}, \\&{{\,\mathrm{{\mathbf {E}}}\,}}\Vert z\Vert _{S_{\infty }^{m,\mathrm {sa}} \otimes _{\varepsilon } S_{\infty }^{n,\mathrm {sa}}} \leqslant 2 \sqrt{m}+ 2\sqrt{n} \leqslant 4 \max \{\sqrt{m},\sqrt{n}\}; \end{aligned}$$

hence the result follows. $\square $
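For the reader’s convenience, the bookkeeping behind the first bound can be spelled out as follows; this is merely a restatement of the steps above. In Chevet’s inequality, the supremum attached to the first factor equals $\Vert \mathrm {Id}:S_2^{m,\mathrm {sa}}\rightarrow S_1^{m,\mathrm {sa}}\Vert \leqslant \sqrt{m}$ and multiplies ${{\,\mathrm{{\mathbf {E}}}\,}}\Vert G_n\Vert _1\leqslant n^{3/2}$, and symmetrically for the second term. Therefore

$$\begin{aligned} \sqrt{m}\, n^{\frac{3}{2}} + \sqrt{n}\, m^{\frac{3}{2}} = \sqrt{mn}\, (n+m) \leqslant 2\sqrt{mn}\, \max \{m,n\}\, . \end{aligned}$$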

Finally, we will use the following lemma, whose proof is elementary.

Lemma C.5

(Contraction principle). Let $(\alpha _i)_{i=1}^n$ and $(\beta _i)_{i=1}^n$ be two sequences of numbers with $0\leqslant \alpha _i\leqslant \beta _i$ for every i. Let $(g_i)_{i=1}^n$ be a sequence of independent N(0, 1) Gaussian random variables. Then, for every Banach space X and every $x_1,\cdots , x_n\in X$, we have

$$\begin{aligned} {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^n\alpha _i g_ix_i\right\| _X\leqslant {{\,\mathrm{{\mathbf {E}}}\,}}\left\| \sum _{i=1}^n \beta _i g_ix_i\right\| _X. \end{aligned}$$

3. Dvoretzky–Milman theorem

We also need Milman’s version of Dvoretzky’s theorem (see e.g. [60, Theorem 7.19]). Let $\Vert \cdot \Vert _X$ be a norm on ${\mathbf {R}}^n$ and consider the space $X = ({\mathbf {R}}^n,\Vert \cdot \Vert _X)$. The Dvoretzky dimension of X is defined as

$$\begin{aligned} k_*(X) = \left( \frac{\ell _X(\mathrm {Id})}{\Vert \mathrm {Id}: \ell _2^n \rightarrow X \Vert }\right) ^2. \end{aligned}$$
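As an illustration of this definition, the following sketch estimates $k_*(X)$ by Monte Carlo for two standard examples. The function name, the sample size and the choice $n=16$ are assumptions, and the operator norm $\Vert \mathrm {Id}:\ell _2^n\rightarrow X\Vert $ is supplied by hand ($\sqrt{n}$ for $\ell _1^n$ and $1$ for $\ell _\infty ^n$). For $X=\ell _1^n$ the exact value is $k_*(\ell _1^n)=2n/\pi $, which serves as a check.

```python
import math
import random

def dvoretzky_dimension(norm, op_norm, n, samples=20000, seed=0):
    """Monte Carlo estimate of k_*(X) = (l_X(Id) / ||Id: l_2^n -> X||)^2.

    norm:    the norm of X on R^n
    op_norm: the value of ||Id: l_2^n -> X||, supplied by the caller
    """
    rng = random.Random(seed)
    mean = sum(norm([rng.gauss(0.0, 1.0) for _ in range(n)])
               for _ in range(samples)) / samples
    return (mean / op_norm) ** 2

n = 16
k_l1 = dvoretzky_dimension(lambda v: sum(map(abs, v)), math.sqrt(n), n)
k_linf = dvoretzky_dimension(lambda v: max(map(abs, v)), 1.0, n)
print(k_l1, 2 * n / math.pi)  # estimate next to the exact value for l_1^n
print(k_linf)
```

The contrast between the two outputs reflects the well-known fact that $k_*(\ell _1^n)$ is proportional to $n$, whereas $k_*(\ell _\infty ^n)$ is only of order $\log n$.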

Theorem C.6

(Dvoretzky–Milman theorem). Consider a normed space $X=({\mathbf {R}}^n,\Vert \cdot \Vert _X)$ and let $E \subseteq {\mathbf {R}}^n$ be a random subspace of dimension $k \leqslant c\, k_*(X)$. Then, with large probability,

$$\begin{aligned} \frac{\ell _X(\mathrm {Id})}{2\sqrt{n}} |x| \leqslant \Vert x\Vert _X \leqslant \frac{2\ell _X(\mathrm {Id})}{\sqrt{n}} |x| \end{aligned}$$

for every $x \in E$, where $|\cdot |$ is the standard Euclidean norm on ${\mathbf {R}}^n$. In particular, the space $X \cap E$ is 4-Euclidean.

Remark

In Theorem C.6 it is understood that E is distributed according to the Haar measure on the Grassmann manifold (see e.g. [60]). The expression ‘with large probability’ means that the probability of failure tends to zero exponentially fast as n tends to infinity; we need only to know that the intersection of two such events of large probability is nonempty.

4. John ellipsoid

The following theorem is a classical result about convex bodies (see [62] for a modern proof).

Theorem C.7

(John’s theorem). For every n-dimensional normed space X with unit ball $B_X$, there is a unique ellipsoid ${\mathcal {E}}$ of maximal volume under the constraint ${\mathcal {E}} \subseteq B_X$. The ellipsoid ${\mathcal {E}}$ is called the John ellipsoid of X and satisfies $B_X \subseteq \sqrt{n} {\mathcal {E}}$. Consequently, we have ${{\,\mathrm{d}\,}}(X,\ell _2^n) \leqslant \sqrt{n}$.

We also use a variant of John’s theorem. It can be for example deduced from [59, Corollary 3.9].

Proposition C.8

Let X be a finite-dimensional normed space with $\dim (X) \geqslant n$. Then there exist maps $u : \ell _2^n \rightarrow X$ and $v:X \rightarrow \ell _2^n$ such that $vu = \mathrm {Id}_{\ell _2^n}$, $\Vert u\Vert =1$ and $\Vert v\Vert \leqslant \sqrt{n}$.

We will rely on a technical result which guarantees that certain normed spaces contain large-dimensional cubes; in that formulation it is due to Vershynin [63] (improving on Rudelson [7]).

Theorem C.9

(Theorem 6.2 in [63]). Let $X=({\mathbf {R}}^n,\Vert \cdot \Vert _X)$ be an n-dimensional normed space whose John ellipsoid is $B_2^n$. Let P be an orthogonal projection and $k = \mathrm {rank} (P)$. Then there are $m \geqslant c k/\sqrt{n}$ contact points $(x_j)_{1 \leqslant j \leqslant m}$ such that

$$\begin{aligned} \max _{1 \leqslant j \leqslant m} | \langle x , x_j\rangle | \leqslant \Vert x\Vert _X \leqslant C \sqrt{\frac{n}{k}}\ \ell _X(P) \max _{1 \leqslant j \leqslant m} | \langle x , x_j\rangle | \end{aligned}$$

(C3)

for every $x \in {{\,\mathrm{\mathrm {span}}\,}}\{ Px_j \ : \ 1 \leqslant j \leqslant m \}$. In particular, the space X contains a subspace which is R-isomorphic to $\ell _{\infty }^m$ for $R=C \ell _X(P) \sqrt{n/k}$.

5. p-summing norms

Let $u : X \rightarrow Y$ be a linear map between finite-dimensional normed spaces. Fix $p \in [1,\infty )$; we only need $p=1$ and $p=2$ in the present paper. For an integer N, we define a quantity $\pi _p^{(N)}(u)$ to be the smallest constant K such that, for every N vectors $x_1,\cdots ,x_N \in X$, we have

$$\begin{aligned} \left( \sum _{k=1}^N \Vert u(x_k)\Vert ^p \right) ^{1/p} \leqslant K \sup _{\phi \in B_{X^*}} \left( \sum _{k=1}^N |\phi (x_k)|^p \right) ^{1/p}. \end{aligned}$$

(C4)

The quantity $\pi _p(u) = \sup \{ \pi _p^{(N)}(u) \ : \ N \geqslant 1\}$ is called the $\varvec{p}$-summing norm of the operator u.

Proposition C.10

Consider finite-dimensional normed spaces X, Y, and a linear operator $u : X \rightarrow Y$. We have

1. $\pi _1^{(N)}(u) \geqslant \pi _2^{(N)}(u)$,
2. $\pi _2^{(\dim X)}(u) \geqslant \frac{1}{\sqrt{2}} \pi _2(u)$,
3. $\pi _2(\mathrm {Id}_X) = \sqrt{\dim X}$.

A general reference about p-summing norms (with detailed bibliography) is [53]; parts 1–3 of Proposition C.10 appear there respectively as Proposition 9.6, Theorem 18.4 and Proposition 9.11.

We also need a specific result about the 1-summing norm of the identity map on $\ell _1^n$, which appears as [64, Theorem 2(4)]: we have

$$\begin{aligned} \pi _1(\mathrm {Id}_{\ell _1^n}) = \frac{n}{ {{\,\mathrm{{\mathbf {E}}}\,}}\left| \sum _{i=1}^n \varepsilon _i \right| }, \end{aligned}$$

(C5)

where $(\varepsilon _i)$ is a sequence of independent random variables with ${\mathbf {P}}(\varepsilon _i=1)={\mathbf {P}}(\varepsilon _i=-1) = \frac{1}{2}$. For a more transparent derivation, one may use the fact that $\ell _1^n$ has enough symmetries in the sense of [53, §16], which implies that $\pi _1(\mathrm {Id}_{\ell _1^n}) = n/{{\,\mathrm{f}\,}}(\ell _1^n,\ell _\infty )$ (see e.g. [65]). In turn, the quantity ${{\,\mathrm{f}\,}}(X,\ell _\infty )$ (defined in (50) and also referred to as the projection constant of a normed space X, see [53, §32]) can be calculated directly when $X=\ell _1^n$ and equals ${{\,\mathrm{{\mathbf {E}}}\,}}|\varepsilon _1 + \cdots + \varepsilon _n|$; an early reference for the last result is [66, Theorem 3].
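The right-hand side of (C5) is easy to evaluate exactly for small n. The sketch below uses hypothetical function names and only the binomial distribution of the number of positive signs; it computes ${{\,\mathrm{{\mathbf {E}}}\,}}|\varepsilon _1+\cdots +\varepsilon _n|$ in rational arithmetic and then the resulting value of $\pi _1(\mathrm {Id}_{\ell _1^n})$.

```python
from fractions import Fraction
from math import comb

def expected_abs_rademacher_sum(n):
    """Exact E|eps_1 + ... + eps_n| for independent uniform signs."""
    # Exactly k signs equal +1 with probability C(n, k) / 2^n; the sum is then 2k - n
    return sum(Fraction(comb(n, k) * abs(2 * k - n), 2 ** n) for k in range(n + 1))

def pi_1_identity_l1(n):
    """Right-hand side of (C5): n / E|eps_1 + ... + eps_n|."""
    return n / expected_abs_rademacher_sum(n)

for n in (1, 2, 3, 4):
    print(n, pi_1_identity_l1(n))
# prints:
# 1 1
# 2 2
# 3 2
# 4 8/3
```

Since ${{\,\mathrm{{\mathbf {E}}}\,}}|\varepsilon _1+\cdots +\varepsilon _n|$ is of order $\sqrt{n}$ by the central limit theorem, the values produced in this way grow like a constant times $\sqrt{n}$.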

6. Non-commutative Grothendieck inequality

Let us recall Grothendieck’s inequality for bilinear forms on $C^*$-algebras. Here, we state [67, Theorem 1.1], which improved the original proof in [68].

Theorem C.11

(Grothendieck’s inequality for $C^*$-algebras). Let $V:{\mathcal {A}}\times {\mathcal {B}}\rightarrow {\mathbf {C}}$ be a bilinear form on a pair of $C^*$-algebras ${\mathcal {A}}$ and ${\mathcal {B}}$. Then, there exist two states $\varphi _1$ and $\varphi _2$ on ${\mathcal {A}}$ and two states $\psi _1$ and $\psi _2$ on ${\mathcal {B}}$ such that

$$\begin{aligned} |V(x,y)|\leqslant \Vert V\Vert \left( \varphi _1(x^*x)+\varphi _2(xx^*)\right) ^{\frac{1}{2}}\left( \psi _1(y^*y)+\psi _2(yy^*)\right) ^{\frac{1}{2}} \qquad \forall \ x\in {\mathcal {A}}, \, y\in {\mathcal {B}}, \end{aligned}$$

(C6)

where

$$\begin{aligned} \Vert V\Vert :=\sup _{\substack{x\in {\mathcal {A}},\, y\in {\mathcal {B}}\\ \Vert x\Vert , \Vert y\Vert \leqslant 1}} |V(x,y)| \end{aligned}$$

(C7)

is the norm of V.

By applying the previous theorem to the particular case ${\mathcal {A}}=S_{\infty }^n$ (the $C^*$-algebra of $n \times n$ complex matrices endowed with the operator norm) and ${\mathcal {B}}=S_{\infty }^m$, we deduce the following corollary.

Corollary C.12

Let $z\in S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}$ be a tensor, and let $\widetilde{z}:S_{\infty }^{n, \mathrm {sa}}\rightarrow S_1^{m,\mathrm {sa}}$ be the linear map associated with it according to (42). Then, there exists a state $\varphi $ on $S_{\infty }^{n}$ such that

$$\begin{aligned} \Vert \widetilde{z}(x)\Vert _{S_1^{m,\mathrm {sa}}}\leqslant 2\sqrt{2}\, \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}}\big (\varphi (x^2)\big )^{\frac{1}{2}}\qquad \forall \ x\in S_\infty ^{n,\mathrm {sa}}\, . \end{aligned}$$

(C8)

Proof

We start by remarking that since $S_1^{k,\mathrm {sa}}$ can be thought of as a (real) subspace of the (complex) Banach space of all $k\times k$ complex matrices endowed with the trace norm, denoted by $S_1^k$, we can consider z also as a tensor in $S_1^n\otimes _\varepsilon S_1^m$. According to [47, Claim 4.7], we have that

$$\begin{aligned} \Vert z\Vert _{S_1^n\otimes _{\varepsilon } S_1^m}\leqslant \sqrt{2}\, \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}}. \end{aligned}$$

Indeed, to see this just notice that [47, Definition 4.3] and [47, Definition 4.6] correspond to $\Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _\varepsilon S_1^{m,\mathrm {sa}}}$ and $\Vert z\Vert _{S_1^n\otimes _\varepsilon S_1^m}$, respectively. Now we consider the bilinear form $V_z: S_\infty ^n\times S_\infty ^m\rightarrow {\mathbf {C}}$ defined by $V_z(x,y):={{\,\mathrm{tr}\,}}\left[ (x\otimes y)z\right] $, whose norm can be verified to coincide with the injective norm of the tensor z, i.e.

$$\begin{aligned} \Vert V_z\Vert = \Vert z\Vert _{S_1^n\otimes _{\varepsilon } S_1^m} . \end{aligned}$$

Applying Theorem C.11 to $V_z$ then yields

$$\begin{aligned} \left| {{\,\mathrm{tr}\,}}[\widetilde{z}(x) y] \right|&= \left| {{\,\mathrm{tr}\,}}[(x\otimes y) z]\right| \\&\leqslant \Vert z\Vert _{S_1^n\otimes _{\varepsilon } S_1^m} \left( \varphi _1(x^*x)+\varphi _2(xx^*)\right) ^{\frac{1}{2}} \left( \psi _1(y^*y)+\psi _2(yy^*)\right) ^{\frac{1}{2}} \\&\leqslant \sqrt{2}\, \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\varepsilon } S_1^{m,\mathrm {sa}}} \left( \varphi _1(x^*x)+\varphi _2(xx^*)\right) ^{\frac{1}{2}} \left( \psi _1(y^*y)+\psi _2(yy^*)\right) ^{\frac{1}{2}} . \end{aligned}$$

Taking the supremum over all $y\in S_{\infty }^{m,\mathrm {sa}}$ such that $\Vert y\Vert _{\infty }\leqslant 1$, using the fact that $\psi (y^*y)\leqslant 1$ and $\psi (yy^*)\leqslant 1$ for all such y and for all states $\psi $ on $S_{\infty }^m$, and finally defining $\varphi :=(\varphi _1+\varphi _2)/2$, we obtain precisely (C8). $\square $
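
The constant $2\sqrt{2}$ in (C8) can be traced explicitly; what follows is only an expanded version of the last step of the proof, with $\varphi =(\varphi _1+\varphi _2)/2$ as defined there. For self-adjoint x one has $x^*x=xx^*=x^2$, hence $\varphi _1(x^*x)+\varphi _2(xx^*)=2\varphi (x^2)$, while for self-adjoint y with $\Vert y\Vert _\infty \leqslant 1$ one has $\psi _1(y^*y)+\psi _2(yy^*)\leqslant 2$. Therefore

$$\begin{aligned} \left| {{\,\mathrm{tr}\,}}[\widetilde{z}(x) y] \right|&\leqslant \sqrt{2}\, \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\varepsilon } S_1^{m,\mathrm {sa}}} \big ( 2\varphi (x^2)\big )^{\frac{1}{2}} \cdot 2^{\frac{1}{2}} \\&= 2\sqrt{2}\, \Vert z\Vert _{S_1^{n,\mathrm {sa}}\otimes _{\varepsilon } S_1^{m,\mathrm {sa}}} \big (\varphi (x^2)\big )^{\frac{1}{2}} . \end{aligned}$$

By the duality between the trace norm and the operator norm, the supremum of the left-hand side over such y equals $\Vert \widetilde{z}(x)\Vert _{S_1^{m,\mathrm {sa}}}$.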

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Aubrun, G., Lami, L., Palazuelos, C. et al. Universal Gaps for XOR Games from Estimates on Tensor Norm Ratios. Commun. Math. Phys. 375, 679–724 (2020). https://doi.org/10.1007/s00220-020-03688-2

Received: 26 November 2018
Accepted: 08 December 2019
Published: 07 March 2020
Issue Date: April 2020
DOI: https://doi.org/10.1007/s00220-020-03688-2
