1 Introduction

The bilinear Hilbert transform (BHT) of two complex-valued Schwartz functions \(f_0,f_1 \in \mathscr {S}(\mathbb {R};\mathbb {C})\) is given by

$$\begin{aligned} {{\,\mathrm{BHT}\,}}(f_0,f_1)(x) := {\text {p.v.}} \int _\mathbb {R}f_0(x-t)f_1(x+t) \, \frac{\mathrm {d}t}{t}. \end{aligned}$$

The \(L^p\) bounds

$$\begin{aligned} \Vert {{\,\mathrm{BHT}\,}}(f_{0},f_{1}) \Vert _{L^{p}(\mathbb {R})} \lesssim _{p_0,p_1} \Vert f_{0} \Vert _{L^{p_{0}}(\mathbb {R})} \Vert f_{1} \Vert _{L^{p_{1}}(\mathbb {R})} \qquad \forall f_0,f_1 \in \mathscr {S}(\mathbb {R};\mathbb {C}),\nonumber \\ \end{aligned}$$
(1.1)

with \(p_0,p_1 \in (1,\infty ]\) and \(p \in (2/3,\infty )\) such that \(p^{-1}=p_{0}^{-1}+p_{1}^{-1}\), were first proven by Lacey and Thiele [28, 29]. Their proof extended techniques developed by Carleson and Fefferman in their proofs of Carleson’s theorem on the almost-everywhere convergence of Fourier series [8, 17]. These techniques are now referred to as ‘time-frequency’ or ‘wave packet’ analysis. In order to streamline and modularise these techniques, Do and Thiele developed a theory of ‘outer-\(L^p\)’ spaces, yielding proofs of \(L^p\) bounds for the BHT in which the key difficulties are cleanly compartmentalised [16].

The outer-\(L^p\) technique is not applied directly to the BHT, but rather to its associated trilinear form \({{\,\mathrm{BHF}\,}}\), given by dualising with a third function \(f_2\in \mathscr {S}(\mathbb {R};\mathbb {C}) \):

$$\begin{aligned} {{\,\mathrm{BHF}\,}}(f_0,f_1,f_2) := \int _\mathbb {R}{{\,\mathrm{p.v.}\,}}\int _\mathbb {R}f_0(x-t)f_1(x+t)f_2(x) \, \frac{\mathrm {d}t}{t} \, \mathrm {d}x. \end{aligned}$$
(1.2)

For \(p_{0},p_{1},p\in [1,\infty ]\), the estimate (1.1) is equivalent, with \(p_2 := p^\prime \) the conjugate exponent of p, to the bound

$$\begin{aligned} |{{\,\mathrm{BHF}\,}}(f_0,f_1,f_2)| \lesssim _{p_0,p_1,p} \prod _{u=0}^{2} \Vert f_u \Vert _{L^{p_u}(\mathbb {R})} \qquad \forall f_0, f_1, f_2 \in \mathscr {S}(\mathbb {R};\mathbb {C}). \end{aligned}$$
(1.3)

The trilinear form \({{\,\mathrm{BHF}\,}}\) is a nontrivial linear combination of the Hölder form (which satisfies the desired \(L^p\) bounds by Hölder’s inequality) and another trilinear form:

$$\begin{aligned} \begin{aligned}&\int _\mathbb {R}f_0(x)f_1(x)f_2(x) \, \mathrm {d}x - \frac{i}{\pi } {{\,\mathrm{BHF}\,}}(f_0,f_1,f_2) \\&\quad = \int _{\mathbb {R}^{3}_{+}} \mathcal {E}[f_{0}](x,\eta -t^{-1},t)\,\mathcal {E}[f_{1}](x,\eta +t^{-1},t)\,\mathcal {E}[f_{2}](x,-2\eta ,t) \, \mathrm {d}x \, \mathrm {d}\eta \, \mathrm {d}t \\&\quad =: \int _{\mathbb {R}^{3}_{+}} \mathcal {E}_{0}[f_{0}](x,\eta ,t)\,\mathcal {E}_{1}[f_{1}](x,\eta ,t)\,\mathcal {E}_{2}[f_{2}](x,\eta ,t) \, \mathrm {d}x \, \mathrm {d}\eta \, \mathrm {d}t. \end{aligned} \end{aligned}$$
(1.4)

Here \(\mathbb {R}^{3}_{+} = \mathbb {R}\times \mathbb {R}\times (0,\infty )\) is the extended phase plane, which parametrises the underlying translation, modulation, and dilation symmetries of \({{\,\mathrm{BHF}\,}}\). The functions \(\mathcal {E}[f_u]\) are representations of the functions \(f_u \in \mathscr {S}(\mathbb {R};\mathbb {C})\) as functions \(\mathbb {R}^{3}_{+} \rightarrow \mathbb {C}\). We refer to \(\mathcal {E}\) as an embedding map; the modified embedding maps \(\mathcal {E}_u\) differ from \(\mathcal {E}\) by a change of variables.Footnote 1

The outer-\(L^p\) technique factorises \(L^p\) bounds for the trilinear form in (1.4) into a chain of inequalities:

The first inequality is a Radon–Nikodym-style domination; the classical integral over \(\mathbb {R}^{3}_{+}\) is controlled by an iterated outer-\(L^1\) quasinorm. This quasinorm is defined with respect to certain outer measures \(\mu \) and \(\nu \) on \(\mathbb {R}^{3}_{+}\), along with a size \(S^1\) on functions on \(\mathbb {R}^{3}_{+}\), which measures functions on distinguished subsets of \(\mathbb {R}^{3}_{+}\). The second inequality is a Hölder inequality for the iterated outer-\(L^p\) quasinorms. This involves further sizes \(\mathbb {S}_{u}\), \(u \in \{0,1,2\}\), which are connected to the size \(S^1\) by a ‘size-Hölder’ inequality. The first two inequalities follow from general properties of outer-\(L^p\) spaces. The third inequality follows from the bounds

(1.5)

which carry most of the difficulty of the problem. These are modulation invariant Carleson embedding bounds, so named because the operators \(\mathcal {E}_u\) are modulation invariant in the sense that

$$\begin{aligned} \mathcal {E}_u[e^{2\pi i z \cdot \xi } f(z)](x,\eta ,s) = \mathcal {E}_u[f](x,\eta +\xi ,s), \end{aligned}$$

and the outer-\(L^p\) quasinorms are invariant with respect to translation in the second variable.Footnote 2 These bounds do not follow from general properties of outer-\(L^p\) spaces, making them an interesting object of study in their own right. The abstract outer-\(L^p\) theory offers one useful reduction in this direction: to prove the bounds (1.5), it suffices to prove weak endpoint bounds and argue by an outer-\(L^p\) version of the Marcinkiewicz interpolation theorem.

In this paper we consider functions \(f :\mathbb {R}\rightarrow X\) valued in a complex Banach space X. Banach-valued analysis has a rich history which we do not attempt to summarise here; we simply point the reader to the recent volumes [19, 20]. Embedding bounds from the Bochner space \(L^p(\mathbb {R};X)\) into outer-\(L^p\) spaces on the upper half-space \(\mathbb {R}\times (0,\infty )\) have been proven by Di Plinio and Ou [15], with applications to Banach-valued multilinear singular integrals; the upper half-space parametrises translation and dilation symmetries, but not modulation symmetries. We would like to prove such embedding bounds into outer-\(L^p\) spaces on \(\mathbb {R}\times \mathbb {R}\times (0,\infty )\), in order to incorporate modulation invariance.Footnote 3 As a first step we prove these for a discrete model of the real line—the 3-Walsh model—in which many technical difficulties in time-frequency analysis are removed, while the core features of the analysis remain.

In the 3-Walsh model, the real line is replaced by the 3-Walsh group

$$\begin{aligned} \mathbb {W}= \mathbb {W}_3 := \bigg \{ x \in \prod _{n \in \mathbb {Z}} \mathbb {Z}/ 3\mathbb {Z}: [x]_n = 0 \text { for } n \text { sufficiently large}\bigg \}, \end{aligned}$$
(1.6)

where \([x]_n\) denotes the n-th component of x, and with group operation inherited from the infinite product. Up to measure zero, the Haar measure on \(\mathbb {W}\) can be identified with the Lebesgue measure on \([0,\infty )\), and the group operation on \(\mathbb {W}\) corresponds to ternary digitwise addition modulo 3 (i.e. ternary digitwise addition without carry) on \([0,\infty )\). The dual group of \(\mathbb {W}\) can be identified with \(\mathbb {W}\) itself, and the Walsh–Fourier transform of the characteristic function of a triadic interval is again such a characteristic function (up to a suitable Walsh modulation). Thus one can construct ‘Walsh wave packets’, supported in a given triadic interval of \([0,\infty )\), with frequency support in another given triadic interval. In this way, \(\mathbb {W}\) supports an idealised time-frequency analysis that is not possible on \(\mathbb {R}\), as a compactly supported function on \(\mathbb {R}\) cannot have compactly supported Fourier transform.
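The identification of the group operation with carry-free ternary addition can be illustrated on integer points of \([0,\infty )\). The following is a minimal sketch, not part of the paper: the function names `walsh_add` and `walsh_neg` are hypothetical, and only non-negative integers (elements of \(\mathbb {W}\) whose digits of negative index vanish) are handled.

```python
def walsh_add(a: int, b: int, p: int = 3) -> int:
    """Add two non-negative integers digit by digit in base p, discarding carries."""
    total, place = 0, 1
    while a or b:
        total += ((a % p + b % p) % p) * place
        a, b, place = a // p, b // p, place * p
    return total


def walsh_neg(a: int, p: int = 3) -> int:
    """Digit-wise additive inverse: each base-p digit d is replaced by (-d) mod p."""
    total, place = 0, 1
    while a:
        total += ((-(a % p)) % p) * place
        a, place = a // p, place * p
    return total


# 5 has ternary digits 12 and 7 has ternary digits 21, so they cancel digit by digit.
print(walsh_add(5, 7), walsh_neg(5), walsh_add(4, 4))
```

In contrast with ordinary addition, every element here has order dividing 3, which reflects the product structure in (1.6).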

We work with the 3-Walsh group, although the 2-Walsh group (defined by replacing 3 by 2 in the definition) is more commonly used in the literature. We have made this choice because the 3-Walsh group leads to a more natural discrete model of \({{\,\mathrm{BHF}\,}}\) than the 2-Walsh group. Our arguments work equally well for any choice of integer parameter greater than or equal to 2.

In our application of the 3-Walsh model, the role of the extended phase plane \(\mathbb {R}^{3}_{+}\) is taken by the set \(3\mathbb {P}\) of all tritiles: i.e. the set of all rectangles \(\mathbf {P} = I_\mathbf {P} \times \omega _\mathbf {P}\) of area 3 in \([0,\infty ) \times [0,\infty )\) (identified with \(\mathbb {W}\times \mathbb {W}\)) whose sides are triadic intervals. The tritile \(\mathbf {P}\) roughly corresponds to the point \((x_{\mathbf {P}},\xi _{\mathbf {P}},|I_{\mathbf {P}}|)\), where \(x_{\mathbf {P}}\) and \(\xi _{\mathbf {P}}\) are the centres of \(I_{\mathbf {P}}\) and \(\omega _{\mathbf {P}}\) respectively. Each tritile is split into three tiles \(\mathbf {P}_u = I_{\mathbf {P}} \times \omega _{\mathbf {P}_u}\), \(u \in \{0,1,2\}\) (i.e. rectangles of area 1 with triadic sides), all with the same time interval \(I_{\mathbf {P}}\), and to each of these tiles is associated a Walsh wave packet \(w_{\mathbf {P}_u} :\mathbb {W} \rightarrow \mathbb {C}\), supported in \(I_{\mathbf {P}}\) with frequency support in \(\omega _{\mathbf {P}_u}\). The embedding \(\mathcal {E}[f] :3\mathbb {P}\rightarrow X^{3}\) of a function \(f :\mathbb {W}\rightarrow X\) is given by integrating f against the three wave packets corresponding to a given tritile \(\mathbf {P}\), and collecting the results in a triple

$$\begin{aligned} \mathcal {E}[f](\mathbf {P}) := \big ( \langle f; w_{\mathbf {P}_u} \rangle \big )_{u\in \{0,1,2\}}. \end{aligned}$$
(1.7)

There are two equivalent ways of looking at this embedding: either as an \(X^3\)-valued function on tritiles, or as an X-valued function on tiles, where we write

$$\begin{aligned} \mathcal {E}[f](P) = \langle f; w_P \rangle = \mathcal {E}[f](\mathbf {P})_u, \end{aligned}$$

where \(\mathbf {P}\) is the unique tritile containing the tile P, and u is the index such that \(P = \mathbf {P}_u\). Both viewpoints are handy, and we switch between them freely.

The main results of this paper are the following embedding bounds. The Banach space assumptions (UMD, r-Hilbertian) are explained in Sect. 3, and the relevant outer structures on \(3\mathbb {P}\) in Sect. 4. The sizes \(\mathbb {S}\) are also defined in Sect. 4; they depend on the Banach space X appearing in the statement of the theorem, although this is not apparent from the notation.

Theorem 1.1

Let X be a Banach space which is UMD and r-Hilbertian for some \(r \in [2,\infty )\). Then for all convex sets \(\mathbb {A}\subset 3\mathbb {P}\) of tritiles, the following embedding bounds hold.

  • For all \(p \in (r,\infty )\),

    $$\begin{aligned} \bigl \Vert \mathbb {1}_{\mathbb {A}}\,\mathcal {E}[f] \bigr \Vert _{L_\mu ^p \mathbb {S}} \lesssim \Vert f \Vert _{L^p(\mathbb {W};X)} \qquad \forall f \in \mathscr {S}(\mathbb {W};X). \end{aligned}$$
    (1.8)
  • For all \(p \in (1,\infty )\) and all \(q \in (\min (p,r)^\prime (r-1),\infty )\),

    (1.9)

The implicit constants in the above bounds do not depend on \(\mathbb {A}\).

The set of exponents for which the embedding bounds (1.9) hold is sketched in Fig. 1; in the dotted region, the iterated embedding bounds basically correspond to the non-iterated bounds. For \(p \le r\) we only have embeddings into iterated outer-\(L^p\) spaces; such behaviour ‘outside local \(L^r\)’ necessitated the introduction of iterated outer-\(L^p\) spaces by the second author [38].

Fig. 1 Exponents \((p,q)\) for which (1.9) holds

Let us now discuss Banach-valued versions of the trilinear form \({{\,\mathrm{BHF}\,}}\). Consider a triple of Banach spaces \((X_0,X_1,X_2)\) and a bounded trilinear form

$$\begin{aligned} \Pi :X_0 \times X_1 \times X_2 \rightarrow \mathbb {C}. \end{aligned}$$
(1.10)

With respect to this data we define

$$\begin{aligned} {{\,\mathrm{BHF}\,}}_\Pi (f_0,f_1,f_2) := \int _\mathbb {R}{{\,\mathrm{p.v.}\,}}\int _\mathbb {R}\Pi (f_0(x-t),f_1(x+t),f_2(x)) \, \frac{\mathrm {d}t}{t} \, \mathrm {d}x \end{aligned}$$

for \(f_u \in \mathscr {S}(\mathbb {R};X_u)\), \(u \in \{0,1,2\}\). The first \(L^p\)-bounds for \({{\,\mathrm{BHF}\,}}_\Pi \) were proven by Silva, in the case \(X_0 = \ell ^R\), \(X_1 = \ell ^\infty \), \(X_2 = \ell ^{R^\prime }\), for \(R \in (4/3,4)\), with \(\Pi \) the natural product-sum map [36, Theorem 1.7]. The set of allowed Banach spaces was extended by Benea and Muscalu using a new ‘helicoidal method’ [2, 3], and by Lorist and Nieraeth by Rubio de Francia-type extrapolation methods [32, 33] relying on weighted estimates for \({{\,\mathrm{BHF}\,}}\) as proven for example in [10, 11]. One limitation of these results is that they only hold when the spaces \(X_0,X_1,X_2\) are Banach lattices, excluding interesting examples such as the Schatten classes \(\mathcal {C}^p\) and more general non-commutative \(L^p\) spaces. It remains an open question as to whether there are any \(L^p\)-bounds for \({{\,\mathrm{BHF}\,}}_\Pi \) without this limitation.

As a corollary of Theorem 1.1, we prove \(L^p\)-bounds for the 3-Walsh model of \({{\,\mathrm{BHF}\,}}_\Pi \) without assuming any lattice structure. This model is the tritile form \(\Lambda _\Pi \), defined by

$$\begin{aligned} \begin{aligned} \Lambda _\Pi (f_0,f_1,f_2) :=&\sum _{\mathbf {P} \in 3\mathbb {P}} \Pi \Big (\langle f_0; w_{\mathbf {P}_0} \rangle , \langle f_1; w_{\mathbf {P}_1} \rangle , \langle f_2; w_{\mathbf {P}_2} \rangle \Big ) |I_{\mathbf {P}}| \\ =&\sum _{\mathbf {P} \in 3\mathbb {P}} \Pi \Big (\mathcal {E}[f_0](\mathbf {P}_0), \mathcal {E}[f_1](\mathbf {P}_1), \mathcal {E}[f_2](\mathbf {P}_2) \Big ) |I_{\mathbf {P}}| \end{aligned} \end{aligned}$$
(1.11)

for \(f_u \in \mathscr {S}(\mathbb {W};X_u)\), \(u \in \{0,1,2\}\).Footnote 4

Theorem 1.2

Let \((X_u)_{u\in \{0,1,2\}}\) be UMD Banach spaces, such that each \(X_u\) is \(r_u\)-Hilbertian for some \(r_u \in [2,\infty )\), and let \(\Pi :X_0 \times X_1 \times X_2 \rightarrow \mathbb {C}\) be a bounded trilinear form. Given any Hölder triple of exponents \((p_u)_{u \in \{0,1,2\}}\in (1,\infty )^{3}\) satisfying

$$\begin{aligned} \sum _{u=0}^{2} \frac{1}{\min (p_u,r_u)^\prime (r_u-1)} > 1, \end{aligned}$$
(1.12)

we have the bound

$$\begin{aligned} |\Lambda _\Pi (f_0,f_1,f_2)| \lesssim \prod _{u=0}^{2} \Vert f_u\Vert _{L^{p_u}(\mathbb {W};X_u)} \qquad \forall f_u \in \mathscr {S}(\mathbb {W};X_u). \end{aligned}$$

The region of exponents \((p_u)_{u=0}^2\) for which this theorem holds (more precisely, the region of their reciprocals) is characterised as the interior of a polygon in Sect. 6.1. This region is only nonempty when the Hilbertian exponents \((r_u)_{u=0}^2\) are jointly sufficiently close to 2, in the sense that

$$\begin{aligned} \sum _{u=0}^2 \frac{1}{r_u} > 1. \end{aligned}$$

\(L^p\) bounds for the Banach-valued quartile form (the 2-Walsh analogue of \(\Lambda _\Pi \)) were first established by Hytönen, Lacey, and Parissis [23]. Their assumptions on the Banach spaces \(X_u\) are very similar to ours (possibly equivalent, although this is not known), and the resulting range of exponents in their \(L^p\) bounds is the same as ours when restricted to the reflexive range (see Sect. 6.1).Footnote 5 Banach-valued time-frequency analysis was initiated by Hytönen and Lacey in their work on the Carleson operator, and continued with their work with Parissis on the Walsh model of the variational Carleson operator [21, 22, 24]. We have taken substantial inspiration from these papers.

The iterated embeddings of Theorem 1.1 imply not only \(L^p\) bounds for the tritile form, but also sparse domination. The connection between sparse domination and Carleson embeddings into iterated outer-\(L^p\) spaces was first shown by Di Plinio, Do, and the second author [12]. A collection of intervals \(\mathcal {G}\) in \(\mathbb {W}\) is sparse if

$$\begin{aligned} \Vert \mathcal {G} \Vert _{sp}:= \sup _{I\subset \mathbb {W}} \frac{1}{|I|}\sum _{\begin{array}{c} J \in \mathcal {G} \\ J \subset I \end{array}} |J| < \infty \end{aligned}$$

where the supremum is taken over all intervals \(I \subset \mathbb {W}\) (see [31, §6] or [25] for a proof that this is equivalent to the more familiar definition of a sparse collection).

Theorem 1.3

Let \((X_u)_{u \in \{0,1,2\}}\), \((r_u)_{u \in \{0,1,2\}}\), and \(\Pi \) be as in Theorem 1.2. Let \((p_{u})_{u\in \{0,1,2\}}\) be any triple of exponents satisfying (1.12). Then

where the supremum is taken over all sparse collections of intervals \(\mathcal {G}\).

The term appearing on the right of the bound of Theorem 1.3 is referred to as a sparse form. It is straightforward to show that sparse forms satisfy the bounds

for any Hölder triple of exponents \(({\overline{p}}_{u})_{u\in \{0,1,2\}}\) with \({\overline{p}}_{u}>p_{u}\). Furthermore, sparse forms satisfy various weighted bounds, which we do not pursue here; for more information see for example [30, 31, 33].

Let us return to the assumptions of Theorem 1.2: we consider three UMD Banach spaces \((X_u)_{u = 0,1,2}\), each of which is \(r_u\)-Hilbertian, linked with a bounded trilinear form \(\Pi :X_0 \times X_1 \times X_2 \rightarrow \mathbb {C}\). There are a few natural examples that one should keep in mind:

  • Let X be a UMD Banach space which is r-Hilbertian. Then the dual space \(X^*\) is also UMD and r-Hilbertian, and we can consider the ‘duality trilinear form’

    $$\begin{aligned} \Pi :X \times X^* \times \mathbb {C} \rightarrow \mathbb {C}, \qquad \Pi (x,x^*,\lambda ) := \lambda x^*(x). \end{aligned}$$

    Since \(\mathbb {C}\) is UMD and 2-Hilbertian (i.e. Hilbert), the corresponding region of exponents in Theorem 1.2 is nonempty provided

    $$\begin{aligned} \frac{2}{r} + \frac{1}{2} > 1, \end{aligned}$$

    i.e. when \(r < 4\).

  • Consider a Hölder triple of exponents \(r_0, r_1, r_2 \in (1,\infty )\), so that the Lebesgue spaces \(L^{r_u}(\mathbb {R})\) are UMD and \(\max (r_u,r_u^\prime )\)-Hilbertian, and there is no exponent \(r < \max (r_u,r_u^\prime )\) such that \(L^{r_u}(\mathbb {R})\) is r-Hilbertian. Consider the ‘integration trilinear form’

    $$\begin{aligned} \Pi :L^{r_0}(\mathbb {R}) \times L^{r_1}(\mathbb {R}) \times L^{r_2}(\mathbb {R}) \rightarrow \mathbb {C}, \qquad \Pi (f,g,h) := \int _\mathbb {R} f(x)g(x)h(x) \, \mathrm {d}x. \end{aligned}$$

    Then Theorem 1.2 would yield a nontrivial region of exponents provided that

    $$\begin{aligned} \sum _u \max (r_u,r_u^\prime )^{-1} > 1. \end{aligned}$$

    But since \(\sum _u r_u^{-1} = 1\), this occurs only if \(\max (r_u,r_u^\prime ) < r_u\) for some u, which is impossible. Thus this trilinear form never fits into our framework. This is in stark contrast with the results of Benea and Muscalu, who obtain bounds for \({{\,\mathrm{BHF}\,}}_\Pi \) for this trilinear form for any Hölder triple \(r_0,r_1,r_2\) with \(r_0,r_1 \in (1,\infty ]\) and \(r_2 \in [1,\infty )\) [2]. The reason for this discrepancy is our reliance on UMD methods.

  • On the other hand, replacing \(\mathbb {R}\) with \(\mathbb {N}\) in the preceding example, one can define the integration trilinear form on \(\ell ^{r_0} \times \ell ^{r_1} \times \ell ^{r_2}\) provided that \(\sum r_u^{-1} \ge 1\). Thus this trilinear form fits into our framework provided that \(\sum r_u^{-1} > 1\) and \(r_u \ge 2\) for each u. The same holds when each \(\ell ^{r_u}\) is replaced by the Schatten class \(\mathcal {C}^{r_u}\) and \(\Pi \) is replaced by the ‘composition trilinear form’.
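The exponent arithmetic in these examples is easy to check mechanically. The sketch below is illustrative only: the helper names `conjugate`, `q_threshold`, `is_hoelder_triple` and `satisfies_1_12` are hypothetical, and exact rational arithmetic is used so that the strict inequality in (1.12) is tested without rounding.

```python
from fractions import Fraction


def conjugate(q):
    """Hölder conjugate q' = q / (q - 1) of an exponent q > 1."""
    q = Fraction(q)
    return q / (q - 1)


def q_threshold(p, r):
    """The quantity min(p, r)' (r - 1) appearing in Theorem 1.1 and in (1.12)."""
    return conjugate(min(Fraction(p), Fraction(r))) * (Fraction(r) - 1)


def is_hoelder_triple(ps):
    """True when the reciprocals of the exponents sum to exactly 1."""
    return sum(1 / Fraction(p) for p in ps) == 1


def satisfies_1_12(ps, rs):
    """True when the exponents ps and Hilbertian exponents rs satisfy (1.12)."""
    return sum(1 / q_threshold(p, r) for p, r in zip(ps, rs)) > 1


# Duality form with X and X* both 3-Hilbertian and the scalar field 2-Hilbertian.
print(is_hoelder_triple((3, 3, 3)), satisfies_1_12((3, 3, 3), (3, 3, 2)))
# The borderline case r = 4 fails the strict inequality.
print(satisfies_1_12((4, 4, 2), (4, 4, 2)))
```

When \(p_u \ge r_u\) the quantity \(\min (p_u,r_u)^\prime (r_u-1)\) reduces to \(r_u\), which is why the sum in (1.12) can never exceed \(\sum _u r_u^{-1}\).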

Here is a brief overview of the paper. In Sect. 2 we introduce the basics of the Walsh group \(\mathbb {W}\) and the associated time-frequency analysis. In Sect. 3 we discuss various Banach space properties and their analytic consequences. In Sect. 4 we set up the framework of outer structures and outer-\(L^p\) spaces. Of particular importance are the size-Hölder inequality (Proposition 4.12) for the ‘randomised’ sizes, and the size domination theorem (Theorem 4.15), which lets us control the randomised sizes by a simpler ‘deterministic’ size. Section 5 is devoted to proving Theorem 1.1. Crucial to these arguments is a basic tile selection algorithm given in Proposition 5.1. This is a simpler version of a more familiar ‘tree selection algorithm’ often used in time-frequency analysis; the simplification is thanks to the aforementioned size domination theorem. Finally, in Sect. 6, we deduce \(L^p\) bounds and sparse domination for the tritile form. Section 7.1 is an appendix, in which we sketch an alternative method using R-bounds and the RMF property; this requires additional Banach space assumptions, but the proof is a bit more direct.

1.1 Notation

The letter \(\mathbb {W}\) will always stand for the 3-Walsh group \(\mathbb {W}_3\); we always write \(\mathbb {W}_{\mathfrak {p}}\) when we want to use a different parameter \(\mathfrak {p}\) (see Sect. 2). For a Banach space X and \(p \in [1,\infty ]\), \(L^p(\mathbb {W};X)\) denotes the Bochner space of strongly measurable functions \(f :\mathbb {W}\rightarrow X\) such that the function \(x \mapsto \Vert f(x)\Vert _X\) is in the usual Lebesgue space \(L^p(\mathbb {W})\). For technical details on Bochner spaces see [19, Chapter 1]. When \(I \subset \mathbb {W}\) is an interval and \(f \in L_{{\text {loc}}}^p(\mathbb {W};X)\), we let

(1.13)

denote the \(L^p\)-average, and

(1.14)

denote the triadic p-maximal function; the supremum is taken over all intervals \(I \subset \mathbb {W}\) containing x. For \(f \in L_{{\text {loc}}}^1(\mathbb {W};X)\) we let

$$\begin{aligned} \langle f \rangle _I := \frac{1}{|I|} \int _I f(x) \, \mathrm {d}x \in X \end{aligned}$$

denote the average of f on I. For \(f\in \mathscr {S}(\mathbb {W};X)\) and \(g\in \mathscr {S}(\mathbb {W};\mathbb {C})\) let

$$\begin{aligned} \langle f; g \rangle := \int _\mathbb {W}f(x)g(x) \, \mathrm {d}x \in X. \end{aligned}$$

We say that a triple of exponents \((p_u)_{u \in \{0,1,2\}}\) with \(p_u \in [1,\infty ]\) is a Hölder triple if \(\sum _{u=0}^2 p_u^{-1} = 1\).

Throughout the paper, we use \((\varepsilon _n)_{n \in A}\) to denote a sequence of independent Rademacher variables (i.e. random variables that take the values \(\pm 1\) with equal probability), indexed over some countable indexing set A. It never matters precisely which probability space these Rademacher variables live on. We denote the expectation over this probability space by \(\mathbb {E}\).

2 Walsh Time–Frequency Analysis

In this section we introduce the Walsh group \(\mathbb {W}\) and the extended Walsh phase plane. In particular we introduce tiles, wave packets, tritiles, trees, and strips. None of this material is new; we include it here for the convenience of the reader, and to fix notation. In Subsect. 2.3 we introduce the defect operator, which is an important technical tool in our analysis.

2.1 The Walsh Group

Fix an integer \(\mathfrak {p}\ge 2\). The Walsh group \(\mathbb {W}_{\mathfrak {p}}\) is

$$\begin{aligned} \begin{aligned} \mathbb {W}_{\mathfrak {p}} := \Bigl \{x\in \prod _{n\in \mathbb {Z}} \mathbb {Z}/\mathfrak {p}\mathbb {Z}:|x|<\infty \Bigr \}, \end{aligned} \end{aligned}$$
(2.1)

where \(|x|=\max \bigl \{\mathfrak {p}^{n} :[x]_{n}\ne 0\bigr \}\) and \([x]_{n}\) is the nth component (‘digit’) of x. The group operation \(+\) is the digit-wise addition in \(\mathbb {Z}/\mathfrak {p}\mathbb {Z}\), and the map \((x,y)\mapsto |x-y|\) is a translation invariant metric on \(\mathbb {W}_{\mathfrak {p}}\), giving \(\mathbb {W}_{\mathfrak {p}}\) the structure of a locally compact abelian group, and thus guaranteeing the existence of a Haar measure on \(\mathbb {W}_{\mathfrak {p}}\). We normalise this measure so that \(|B_{1}(0)|=1\), and it follows that

$$\begin{aligned} |B_{\mathfrak {p}^{n}}(x)|=\mathfrak {p}^{n} \qquad \forall x \in \mathbb {W}_{\mathfrak {p}}. \end{aligned}$$
(2.2)

As explained in the introduction, there is a correspondence between the Walsh group and the non-negative reals \([0,\infty )\) given by the surjective map

$$\begin{aligned} \mathbb {W}_{\mathfrak {p}}\ni x \mapsto \sum _{n \in \mathbb {Z}} [x]_n \mathfrak {p}^n \in [0,\infty ), \end{aligned}$$

which is injective up to a set of measure zero. The pullback of the Lebesgue measure by this map is the Haar measure on \(\mathbb {W}_{\mathfrak {p}}\), and intervals in \([0,\infty )\) correspond to balls in \(\mathbb {W}_{\mathfrak {p}}\). Thus we often refer to Walsh balls as intervals.

Let X be a Banach space. We say that a function \(f :\mathbb {W}_{\mathfrak {p}} \rightarrow X\) is Schwartz, denoted \(f \in \mathscr {S}(\mathbb {W}_{\mathfrak {p}};X)\), if there exists \(N>0\) such that f is supported on \(B_{\mathfrak {p}^{N}}(0)\) and constant on any interval I with \(|I|<\mathfrak {p}^{-N}\). For all \(p \in [1,\infty )\), the Schwartz functions are dense in \(L^p(\mathbb {W}_{\mathfrak {p}};X)\).

The dual group of \(\mathbb {W}_{\mathfrak {p}}\) can be identified with \(\mathbb {W}_{\mathfrak {p}}\) itself, and the characters of \(\mathbb {W}_{\mathfrak {p}}\) are the Walsh exponentials

$$\begin{aligned} \exp _\xi (x) := (e^{2\pi i / \mathfrak {p}})^{\sum _{j+k = -1}[\xi ]_j [x]_k}\qquad \forall x,\xi \in \mathbb {W}_{\mathfrak {p}}. \end{aligned}$$
(2.3)
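The definition (2.3) can be made concrete for elements with finitely many nonzero digits. The following is a small sketch under that assumption; the dictionary encoding of elements (position n mapped to digit \([x]_n\)) and the names `add`, `phase` and `walsh_exp` are hypothetical conveniences, not notation from the paper.

```python
import cmath

P = 3  # the parameter p of the Walsh group


def add(x, y):
    """Digit-wise sum in Z/PZ; x and y map positions n to digits [x]_n (missing means 0)."""
    out = {}
    for n in set(x) | set(y):
        d = (x.get(n, 0) + y.get(n, 0)) % P
        if d:
            out[n] = d
    return out


def phase(xi, x):
    """The exponent sum over j + k = -1 of [xi]_j [x]_k from (2.3), reduced mod P."""
    return sum(d * x.get(-1 - j, 0) for j, d in xi.items()) % P


def walsh_exp(xi, x):
    """The Walsh exponential exp_xi(x) as a complex number of modulus 1."""
    return cmath.exp(2j * cmath.pi * phase(xi, x) / P)


x = {0: 1, -1: 2}
y = {0: 2, 1: 1}
xi = {-1: 1, 0: 2, -2: 1}
# Character property, checked exactly at the level of exponents mod P.
print(phase(xi, add(x, y)) == (phase(xi, x) + phase(xi, y)) % P)
```

Working with the exponent modulo \(\mathfrak {p}\) rather than with floating-point complex numbers keeps the character identity \(\exp _\xi (x+y) = \exp _\xi (x)\exp _\xi (y)\) exact; the symmetry of the index set \(j+k=-1\) also gives \(\exp _\xi (x) = \exp _x(\xi )\).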

The Walsh–Fourier transform of a function \(f \in L^1(\mathbb {W}_{\mathfrak {p}};\mathbb {C})\) is thus

$$\begin{aligned} \widehat{f}(\xi ) := \int _{\mathbb {W}_{\mathfrak {p}}} f(x) \overline{\exp _{\xi }(x)} \, \mathrm {d}x \qquad \forall \xi \in \mathbb {W}_{\mathfrak {p}}, \end{aligned}$$
(2.4)

and we have the Plancherel identity

$$\begin{aligned} \int _{\mathbb {W}_{\mathfrak {p}}} \widehat{f}(\xi ) \overline{\widehat{g}(\xi )} \, \mathrm {d}\xi = \int _{\mathbb {W}_{\mathfrak {p}}} f(x) \overline{g(x)} \, \mathrm {d}x \qquad \forall f,g \in \mathscr {S}(\mathbb {W}_{\mathfrak {p}};\mathbb {C}). \end{aligned}$$

Consider the modulation, translation, and dilation operators on functions \(f :\mathbb {W}_{\mathfrak {p}} \rightarrow \mathbb {C}\), given by

$$\begin{aligned} \begin{aligned}&{{\,\mathrm{Mod}\,}}_{\eta }f(x) := \exp _\eta (x) f(x)&\qquad&\forall \eta \in \mathbb {W}_{\mathfrak {p}},\\&{{\,\mathrm{Tr}\,}}_{y}f(x) := f(x-y)&\qquad&\forall y \in \mathbb {W}_{\mathfrak {p}},\\&{{\,\mathrm{Dil}\,}}_{\mathfrak {p}^{n}} f (x):= \mathfrak {p}^{-n} f(\mathfrak {p}^{-n} x)&\qquad&\forall n\in \mathbb {Z}, \end{aligned} \end{aligned}$$
(2.5)

where \([\mathfrak {p}^{-n}x]_{j}:=[x]_{j+n}\) for all \(n,j \in \mathbb {Z}\). It follows from the definition of the Walsh–Fourier transform that

$$\begin{aligned} \begin{aligned}&\widehat{{{\,\mathrm{Mod}\,}}_{\eta }f} = {{\,\mathrm{Tr}\,}}_{\eta }\widehat{f} \\&\widehat{{{\,\mathrm{Tr}\,}}_{y}f} = {{\,\mathrm{Mod}\,}}_{-y}\widehat{f} \\&\widehat{{{\,\mathrm{Dil}\,}}_{\mathfrak {p}^{n}}f} = \mathfrak {p}^{-n} {{\,\mathrm{Dil}\,}}_{\mathfrak {p}^{-n}} \widehat{f} \end{aligned} \qquad \forall f \in \mathscr {S}(\mathbb {W}_{\mathfrak {p}};\mathbb {C}). \end{aligned}$$
(2.6)

Given two intervals \(I, I'\subset \mathbb {W}_{\mathfrak {p}}\), it holds that

$$\begin{aligned} I\cap I'\ne \emptyset \quad \implies \quad I\subseteq I' \text{ or } I'\subseteq I. \end{aligned}$$

This is a familiar property of \(\mathfrak {p}\)-adic intervals in \([0,\infty )\). Each interval has \(\mathfrak {p}\) child intervals \(\{{{\,\mathrm{ch}\,}}_{0}(I),{{\,\mathrm{ch}\,}}_{1}(I),\ldots ,{{\,\mathrm{ch}\,}}_{\mathfrak {p}-1}(I)\}={{\,\mathrm{ch}\,}}(I)\) given by

$$\begin{aligned} {{\,\mathrm{ch}\,}}_{j}(I)=\{x\in I :[x]_{(\log _{\mathfrak {p}}|I|) - 1}=j \}. \end{aligned}$$
(2.7)
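In the \([0,\infty )\) picture the child intervals of (2.7) and the nesting property above are simple to tabulate. The sketch below is an illustration only: an interval is encoded by a hypothetical pair `(n, k)` standing for \([k\mathfrak {p}^{n},(k+1)\mathfrak {p}^{n})\), and the function names are not taken from the paper.

```python
from fractions import Fraction


def endpoints(n, k, p=3):
    """Endpoints of the p-adic interval [k p^n, (k+1) p^n) encoded by (n, k)."""
    scale = Fraction(p) ** n
    return k * scale, (k + 1) * scale


def children(n, k, p=3):
    """The p children of (n, k); child j fixes the digit of index n - 1 to be j."""
    return [(n - 1, p * k + j) for j in range(p)]


def contains(big, small, p=3):
    """True when the interval encoded by small is a subset of the one encoded by big."""
    lo_b, hi_b = endpoints(*big, p)
    lo_s, hi_s = endpoints(*small, p)
    return lo_b <= lo_s and hi_s <= hi_b


def intersects(a, b, p=3):
    """True when the two half-open intervals share at least one point."""
    lo_a, hi_a = endpoints(*a, p)
    lo_b, hi_b = endpoints(*b, p)
    return max(lo_a, lo_b) < min(hi_a, hi_b)


# The interval [6, 9) splits into [6, 7), [7, 8) and [8, 9).
print(children(1, 2))
```

Exact rational endpoints are used so that intervals of negative scale (for instance `(-1, 4)`, which encodes \([4/3,5/3)\) when \(\mathfrak {p}=3\)) are compared without rounding.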

In the remainder of the paper we will work with the case \(\mathfrak {p}= 3\), and we will write \(\mathbb {W}:= \mathbb {W}_3\).

2.2 The Extended Walsh Phase Plane

Strictly speaking, the extended Walsh phase plane is \(\{(x,\xi ,3^{n}) \in \mathbb {W}\times \mathbb {W}\times \mathbb {R}^{+} :n\in \mathbb {Z}\}\) where each point \((x,\xi ,3^{n}) \in \mathbb {W}\times \mathbb {W}\times \mathbb {R}^{+}\) represents the time x, the frequency \(\xi \), and the scale \(3^n\). We can identify each point \((x,\xi ,3^{n})\) with the rectangle \(B_{3^{n}}(x)\times B_{3^{-n}}(\xi )\subset \mathbb {W}\times \mathbb {W}\); this provides for a more graphically intuitive way of thinking of time-frequency localisation. This identification is not injective, but it turns out that this failure of injectivity correctly encodes the ‘uncertainty principle’, i.e. the impossibility of determining both position (in time) and frequency to an arbitrary scale.

We thus introduce the notion of a tile.

Definition 2.1

(Tiles) A tile is a rectangle \(P = I_P \times \omega _P\) in \(\mathbb {W}\times \mathbb {W}\) of area 1, such that the sides \(I_P\) and \(\omega _P\) are intervals. We call \(I_P\) the time interval and \(\omega _P\) the frequency interval of P. For each tile P there exist unique \(x_{P},\xi _{P} \in \mathbb {W}\) and \(n\in \mathbb {Z}\) such that

$$\begin{aligned} \begin{aligned}&I_{P}=B_{|I_P|}(x_{P}),&[x_{P}]_{j}=0 \text { for } j< n, \\&\omega _{P}=B_{|I_P|^{-1}}(\xi _{P}),&[\xi _{P}]_{j}=0 \text { for } j < -n . \end{aligned} \end{aligned}$$
(2.8)

We call \(x_P\) the centre of the tile, and \(\xi _{P}\) the frequency of the tile. We denote the set of all tiles by \(\mathbb {P}\).

To each tile P we associate a wave packet \(w_P\), which is a \(\mathbb {C}\)-valued function supported in \(I_P\) with frequency support \(\omega _P\). In the time-frequency sense, the wave packet \(w_P\) is localised to P.

Definition 2.2

(Wave packets) Given a tile \(P\in \mathbb {P}\), the wave packet associated with P is the function

$$\begin{aligned} w_{P}(x)= {{\,\mathrm{Mod}\,}}_{\xi _{P}}{{\,\mathrm{Tr}\,}}_{x_{P}}{{\,\mathrm{Dil}\,}}_{|I_{P}|}\mathbb {1}_{B_{1}(0)}(x)= |I_{P}|^{-1}\exp _{\xi _P}(x)\mathbb {1}_{I_{P}}(x). \end{aligned}$$
(2.9)

This is the unique function, up to multiplication by a unimodular constant, such that

$$\begin{aligned} {{\,\mathrm{spt}\,}}w_{P} =I_{P}, \qquad {{\,\mathrm{spt}\,}}\widehat{w_{P}} = \omega _{P}, \quad \text {and} \quad \Vert w_{P} \Vert _{L^1(\mathbb {W})}=1. \end{aligned}$$
(2.10)
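Formula (2.9) can be evaluated on a finite grid, which gives a quick numerical check of the normalisation in (2.10). The sketch below is an illustration under stated assumptions, not the paper's construction: points of \(\mathbb {W}\) are replaced by rationals in \([0,3^{N})\) with spacing \(3^{-M}\), a tile is encoded by a hypothetical triple `(n, a, b)` meaning \(I_P = [a3^{n},(a+1)3^{n})\) and \(\xi _P = b3^{-n}\), and integrals are computed as sums over grid cells with a sesquilinear pairing (conjugating the second wave packet). It also evaluates a few pairings between wave packets of disjoint and of overlapping tiles.

```python
import cmath
from fractions import Fraction

P = 3
M = 2  # grid spacing 3**-M (an assumption of this sketch)
N = 2  # the grid covers the window [0, 3**N)
GRID = [Fraction(m, P ** M) for m in range(P ** (N + M))]
CELL = float(Fraction(1, P ** M))


def digit(x, k):
    """Base-3 digit [x]_k of a non-negative rational x."""
    return int(x / Fraction(P) ** k) % P


def phase(xi, x):
    """Exponent in (2.3); grid points only have digits of index -M, ..., N - 1."""
    return sum(digit(xi, j) * digit(x, -1 - j) for j in range(-N, M)) % P


def wave_packet(n, a, b, x):
    """Value at x of the wave packet (2.9) for the tile encoded by (n, a, b)."""
    length = Fraction(P) ** n
    if not (a * length <= x < (a + 1) * length):
        return 0j
    xi = b / length
    return cmath.exp(2j * cmath.pi * phase(xi, x) / P) / float(length)


def l1_norm(tile):
    """Riemann sum for the L^1 norm, exact up to rounding for grid-adapted tiles."""
    return sum(abs(wave_packet(*tile, x)) for x in GRID) * CELL


def pairing(tile_p, tile_q):
    """Riemann sum for the integral of w_P times the conjugate of w_Q."""
    return sum(wave_packet(*tile_p, x) * wave_packet(*tile_q, x).conjugate() for x in GRID) * CELL


TILE_P = (0, 1, 0)  # [1, 2) x [0, 1)
TILE_Q = (0, 1, 1)  # [1, 2) x [1, 2), disjoint from TILE_P
TILE_R = (1, 0, 1)  # [0, 3) x [1/3, 2/3), meets TILE_P but not TILE_Q
print(l1_norm(TILE_P), abs(pairing(TILE_P, TILE_Q)), abs(pairing(TILE_P, TILE_R)))
```

The grid must resolve the tiles in question: the sketch is only meaningful for tiles with \(-M \le n \le N\) whose time interval lies in the window and whose frequency is smaller than \(3^{M}\).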

Remark 2.3

It is convenient to identify wave packets with tiles, and thus to consider the translation, dilation, and modulation operators (2.5) as acting directly on tiles, so that for example

$$\begin{aligned} {{\,\mathrm{Mod}\,}}_{\xi } P = P' \iff {{\,\mathrm{Mod}\,}}_{\xi } w_{P} = c \, w_{P'} \text { for some } |c|=1. \end{aligned}$$

We could equivalently define our wave packets with an arbitrary choice of unimodular constant in front; all the statements we make about wave packets will be invariant under this transformation. In essence, what is most important is not the wave packet itself, but the subspace of \(L^2(\mathbb {W};\mathbb {C})\) that it spans.

Simple support (and Walsh–Fourier support) considerations show that two tiles are disjoint if and only if their associated wave packets are orthogonal. More refined statements can be made about the connection between tiles and wave packets. For example, a union of disjoint tiles \(\bigcup _i P_i\) corresponds to the subspace of \(L^2(\mathbb {W};\mathbb {C})\) spanned by the pairwise orthogonal wave packets \((w_{P_i})_i\), and this subspace does not depend on the specific representation of \(\bigcup _i P_i\) as a disjoint union of tiles. In particular, if a tile P is contained in such a union, then the wave packet \(w_P\) can be written as a linear combination of the wave packets \(w_{P_i}\). This is made precise in the following lemma.

Lemma 2.4

(Basis expansion of wave packets) Let \((P_{i})_{i\in \{1,\dots ,N\}}\) be a finite collection of pairwise disjoint tiles. Then for any \(P\subset \cup _{i=1}^{N} P_{i} \) it holds that

$$\begin{aligned} w_{P} = \sum _{i=1}^{N} \langle w_{P}; w_{P_{i}} \rangle w_{P_{i}} |I_{P_{i}}|. \end{aligned}$$
(2.11)

Proof

We may assume that \(P\cap P_{i}\ne \emptyset \) for all \(i\in \{1,\dots ,N\}\), for otherwise we would have \(\langle w_P; w_{P_i} \rangle = 0\) and \(P_i\) would not contribute to the right hand side of (2.11).

If \(I_{P_{i}}\subset I_{P} \subset I_{P_{j}}\) for \(i\ne j\) then \(P_{i}\cap P_{j}\ne \emptyset \), contradicting the assumption, so either \(I_{P}\supset I_{P_{i}}\) for all \(i\in \{1,\ldots ,N\}\) or \(I_{P}\subset I_{P_{i}}\) for all \(i\in \{1,\ldots ,N\}\). We consider only the first case, as the proof of the second is similar. Write

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^{N} \langle w_{P}; w_{P_{i}} \rangle w_{P_{i}} |I_{P_{i}}| = {{\,\mathrm{Mod}\,}}_{\xi _{P}} \sum _{i=1}^{N} \langle {{\,\mathrm{Mod}\,}}_{-\xi _{P}} w_{P};{{\,\mathrm{Mod}\,}}_{-\xi _{P}} w_{P_{i}} \rangle {{\,\mathrm{Mod}\,}}_{-\xi _{P}}w_{P_{i}} |I_{P_{i}}| \\&\quad ={{\,\mathrm{Mod}\,}}_{\xi _{P}} |I_{P}|^{-1} \sum _{i=1}^{N} \langle \mathbb {1}_{I_{P}};{{\,\mathrm{Mod}\,}}_{\xi _{P_{i}}-\xi _{P}} |I_{P_{i}}|^{-1}\mathbb {1}_{I_{P_{i}}} \rangle {{\,\mathrm{Mod}\,}}_{\xi _{P_{i}}-\xi _{P}}\mathbb {1}_{I_{P_{i}}} \\&\quad ={{\,\mathrm{Mod}\,}}_{\xi _{P}} |I_{P}|^{-1} \sum _{i=1}^{N} \langle \mathbb {1}_{I_{P}};\exp _{\xi _{P_{i}}-\xi _{P}}(x-x_{P_{i}}) |I_{P_{i}}|^{-1}\mathbb {1}_{I_{P_{i}}} \rangle \exp _{\xi _{P_{i}}-\xi _{P}}(x-x_{P_{i}})\mathbb {1}_{I_{P_{i}}} \\&\quad ={{\,\mathrm{Mod}\,}}_{\xi _{P}} |I_{P}|^{-1} \sum _{i=1}^{N} \langle \mathbb {1}_{I_{P}}; |I_{P_{i}}|^{-1}\mathbb {1}_{I_{P_{i}}} \rangle \mathbb {1}_{I_{P_{i}}}. \end{aligned} \end{aligned}$$

The last identity comes from the fact that \(\omega _{P}\subset \omega _{P_{i}}\) and thus \(|\xi _{P_{i}}-\xi _{P}|<|I_{P}|^{-1}\), so that by (2.3) it holds that

$$\begin{aligned} \exp _{\xi _{P_{i}}-\xi _{P}}(x-x_{P_{i}}) = 1. \end{aligned}$$

Since the intervals \((I_{P_{i}})_{i\in \{1,\ldots ,N\}}\) partition \(I_{P}\), we have

$$\begin{aligned} \sum _{i=1}^{N} \langle \mathbb {1}_{I_{P}}; |I_{P_{i}}|^{-1}\mathbb {1}_{I_{P_{i}}} \rangle \mathbb {1}_{I_{P_{i}}}=\mathbb {1}_{I_{P}}, \end{aligned}$$

completing the proof when \(I_{P_i} \subset I_P\) for all i. \(\square \)

The expression (1.11) of the tritile form involves multiplication of ‘nearby’ wave packet coefficients of three separate functions. This ‘nearness’ of tiles is encoded by grouping triples of frequency-adjacent tiles into tritiles.

Definition 2.5

(Tritiles) A tritile is a rectangle \(\mathbf {P} = I_{\mathbf {P}} \times \omega _{\mathbf {P}}\) of area 3, such that the sides \(I_{\mathbf {P}}\) and \(\omega _{\mathbf {P}}\) are intervals. As with tiles, for every tritile \(\mathbf {P}\) there are unique \(x_{\mathbf {P}},\xi _{\mathbf {P}} \in \mathbb {W}\) and \(n\in \mathbb {Z}\) such that

$$\begin{aligned} \begin{aligned}&I_{\mathbf {P}}=B_{3^n}(x_{\mathbf {P}})&[x_{\mathbf {P}}]_{j}=0 \text { for } j< n \\&\omega _{\mathbf {P}}=B_{3^{-n+1}}(\xi _{\mathbf {P}})&[\xi _{\mathbf {P}}]_{j}=0 \text { for } j< -n+1. \end{aligned} \end{aligned}$$
(2.12)

We denote the set of all tritiles by \(3\mathbb {P}\). Every tritile \(\mathbf {P}\) can be written in a unique way as a disjoint union of 3 tiles with time interval \(I_{\mathbf {P}}\); these tiles are given by

$$\begin{aligned} \mathbf {P}_{v} := I_{\mathbf {P}}\times {{\,\mathrm{ch}\,}}_{v}(\omega _{\mathbf {P}}), \qquad \forall v \in \{0,1,2\}. \end{aligned}$$
(2.13)

Conversely, for every tile P, there is a unique tritile \(\mathbf {P}\) such that \(P = \mathbf {P}_v\) for some \(v \in \{0,1,2\}\). This splitting of \(\mathbf {P}\) into tiles is the horizontal splitting; there is also a vertical splitting

$$\begin{aligned} {\mathbf {P}}^{\uparrow } := \{J \times \omega _{\mathbf {P}} : J \in {{\,\mathrm{ch}\,}}(I_{\mathbf {P}})\} \end{aligned}$$
(2.14)

that we will use less often.

The horizontal and vertical splittings are sketched in Fig. 2.

Fig. 2

A tritile, the horizontal splitting, and the vertical splitting
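The two splittings can be illustrated by a minimal Python sketch. It rests on a simplifying assumption that is not made in the text: intervals of \(\mathbb {W}\) are modelled by real triadic intervals \([k3^{n},(k+1)3^{n})\), stored as (left endpoint, length). The helper names `interval`, `children`, `horizontal_split`, `vertical_split`, and `area` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

# Assumption: an interval of W is modelled by the real triadic interval
# [k * 3**n, (k + 1) * 3**n), stored as (left endpoint, length).
# All names below are illustrative and do not come from the paper.

def interval(k, n):
    """Triadic interval with index k at scale n, as (left endpoint, length)."""
    length = Fraction(3) ** n
    return (k * length, length)

def children(iv):
    """The three children ch_0, ch_1, ch_2 of a triadic interval."""
    left, length = iv
    third = length / 3
    return [(left + v * third, third) for v in range(3)]

def horizontal_split(tritile):
    """Tiles P_v = I x ch_v(omega), all sharing the time interval I."""
    I, omega = tritile
    return [(I, w) for w in children(omega)]

def vertical_split(tritile):
    """Tiles J x omega with J a child of I, all sharing the frequency interval."""
    I, omega = tritile
    return [(J, omega) for J in children(I)]

def area(rect):
    """Product of the two side lengths of a rectangle."""
    (_, a), (_, b) = rect
    return a * b

# A tritile at scale n = 1: |I| = 3 and |omega| = 3**(-n + 1) = 1.
tritile = (interval(0, 1), interval(2, 0))
```

In this model both `horizontal_split(tritile)` and `vertical_split(tritile)` return three rectangles of area 1, in line with the statement that a tritile of area 3 splits into three tiles in either direction.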

Remark 2.6

It is occasionally useful to identify the tritile \(\mathbf {P}\) with the set of corresponding tiles \(\{\mathbf {P}_0,\mathbf {P}_1,\mathbf {P}_2\}\), and to consider these tiles as ‘subtiles’ of \(\mathbf {P}\). Furthermore, given a Banach space X and a triple-valued function on tritiles \(F :3\mathbb {P} \rightarrow X^3\), we can identify F with an X-valued function on tiles \({\widetilde{F}} :\mathbb {P} \rightarrow X\) defined by

$$\begin{aligned} {\widetilde{F}}(P) = F(\mathbf {P})_u, \end{aligned}$$

where \(\mathbf {P} \in 3\mathbb {P}\) and \(u \in \{0,1,2\}\) are uniquely determined such that \(\mathbf {P}_u = P\). We will abuse notation and write \(F = {\widetilde{F}}\).

We consider \(3\mathbb {P}\) as being the ‘correct’ representation of the extended Walsh phase plane, and for us it plays the role that \(\mathbb {R}^3_+\) plays for time-frequency analysis on the real line, as explained in the introduction.

One of Fefferman’s (many) innovations in his proof of Carleson’s theorem was the introduction of a partial order on tiles. Using this order one can define trees, which represent sets of tiles that are frequency-localised at a certain ‘top frequency’, with time restricted to a given interval. On these subsets, time-frequency analysis is essentially reduced to Calderón–Zygmund theory.Footnote 6

Definition 2.7

(Order and trees) Given two tritiles \(\mathbf {P}\) and \(\mathbf {P}^\prime \), we say that

$$\begin{aligned} \mathbf {P}^\prime \le \mathbf {P} \qquad \text {if}\qquad I_{\mathbf {P}^\prime }\subseteq I_{\mathbf {P}} \text { and } \omega _{\mathbf {P}^\prime }\supseteq \omega _{\mathbf {P}}. \end{aligned}$$
(2.15)

The tree with top \(\mathbf {P}\) is the collection of tritiles

$$\begin{aligned} T(\mathbf {P}):=\{\mathbf {Q} \in 3\mathbb {P}:\mathbf {Q} \le \mathbf {P}\}. \end{aligned}$$
(2.16)

Given a tree T we denote by \(\mathbf {P}_{T}\) the unique tritile such that \(T=T(\mathbf {P}_{T})\). We write \(I_{T}:=I_{\mathbf {P}_{T}}\), \(\omega _{T}:=\omega _{\mathbf {P}_{T}}\), \(x_{T}=x_{\mathbf {P}_{T}}\), and \(\xi _{T}=\xi _{\mathbf {P}_{T}}\). The collection of all trees is denoted by \(\mathbb {T}\). For each \(u \in \{0,1,2\}\) the u-component of T is given by

$$\begin{aligned} T^u := \{\mathbf {Q} \in T :\omega _T \cap \omega _{\mathbf {Q}_u}\ne \emptyset \}, \end{aligned}$$

so that \(\mathbf {P}_{T}\in T^{u}\) for all \(u\in \{0,1,2\}\), and the sets \(T^u \setminus \{\mathbf {P}_T\}\) partition \(T \setminus \{\mathbf {P}_T\}\).
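To make the order relation and the u-components concrete, here is a small Python sketch. As an assumption made only for illustration, intervals of \(\mathbb {W}\) are modelled by real triadic intervals given by their endpoints, and the tree is truncated at a finite depth; the names `triadic`, `leq`, `tritile`, `tree`, and `component` are hypothetical.

```python
from fractions import Fraction

# Assumption: intervals of W are modelled by real triadic intervals (lo, hi).
# All helper names are illustrative and do not come from the paper.

def triadic(k, n):
    length = Fraction(3) ** n
    return (k * length, (k + 1) * length)

def contains(outer, inner):
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def intersects(a, b):
    return max(a[0], b[0]) < min(a[1], b[1])

def leq(Q, P):
    """Q <= P iff I_Q is contained in I_P and omega_Q contains omega_P, as in (2.15)."""
    return contains(P[0], Q[0]) and contains(Q[1], P[1])

def tritile(k, l, n):
    """Tritile with I = triadic(k, n) and omega = triadic(l, -n + 1)."""
    return (triadic(k, n), triadic(l, -n + 1))

def subtile_freq(Q, u):
    """Frequency interval ch_u(omega_Q) of the subtile Q_u."""
    lo, hi = Q[1]
    third = (hi - lo) / 3
    return (lo + u * third, lo + (u + 1) * third)

def tree(top, depth):
    """All tritiles Q <= top whose time interval is at most `depth` scales finer."""
    (a, b), (c, d) = top
    out = []
    for m in range(depth + 1):
        len_I = (b - a) / 3 ** m
        len_w = (d - c) * 3 ** m
        # unique triadic frequency interval of length len_w containing omega_top
        w_lo = (c // len_w) * len_w
        for j in range(3 ** m):
            out.append(((a + j * len_I, a + (j + 1) * len_I), (w_lo, w_lo + len_w)))
    return out

def component(T, top, u):
    """u-component: tritiles Q in T with omega_top meeting omega_{Q_u}."""
    return [Q for Q in T if intersects(top[1], subtile_freq(Q, u))]

top = tritile(0, 1, 0)   # I = [0, 1), omega = [3, 6)
T = tree(top, 2)         # 1 + 3 + 9 tritiles
parts = [component(T, top, u) for u in range(3)]
```

In this example the top belongs to every component, while each other tritile of the truncated tree lies in exactly one of them, which is the partition property stated above.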

Remark 2.8

Given a tile P and a tree T, it will be useful to write \(P \in T\) to mean that \(\mathbf {P} \in T\), where \(\mathbf {P}\) is the unique tritile containing P as a subtile (in the horizontal decomposition).

Another important class of subsets are the strips, which consist of tiles with time restricted to a given interval, with no restriction on frequency. These play an important role in the construction of iterated outer-\(L^p\) quasinorms.

Definition 2.9

(Strips) Given an interval \(I \subset \mathbb {W}\), the strip \(D=D(I)\) with top I is the collection of tritiles

$$\begin{aligned} D(I) := \{\mathbf {P} \in 3\mathbb {P}: I_{\mathbf {P}} \subset I\}. \end{aligned}$$

Given a strip D we denote by \(I_{D}\) the unique interval such that \(D=D(I_{D})\). The collection of all strips is denoted by \(\mathbb {D}\).

Finally, we define the notion of convexity for sets of tritiles.

Definition 2.10

(Convex sets) A set of tritiles \(\mathbb {A}\subset 3\mathbb {P}\) is convex if \(\mathbf {P},\mathbf {P}'\in \mathbb {A}\), \(\mathbf {Q}\in 3\mathbb {P}\), and \(\mathbf {P}\le \mathbf {Q}\le \mathbf {P}'\) imply \(\mathbf {Q}\in \mathbb {A}\).

Note that trees, strips, and their complements are convex, and that the intersection of two convex sets is convex.

2.3 The Embedding and the Defect

Consider a Banach space X and a function \(f :\mathbb {W} \rightarrow X\). Recall from the introduction the embedding \(\mathcal {E}[f] :\mathbb {P} \rightarrow X\), defined by

$$\begin{aligned} \mathcal {E}[f](P) = \langle f; w_P \rangle . \end{aligned}$$

A general function \(F :\mathbb {P} \rightarrow X\) cannot be realised as an ‘embedded function’ \(F = \mathcal {E}[f]\), as the wave packet coefficients \(\langle f; w_P \rangle \) are not independent. This lack of independence is codified by the relations in Lemma 2.4. We use these relations to construct a ‘defect operator’, which measures how far a function \(F :\mathbb {P} \rightarrow X\) is from being an embedded function.

Definition 2.11

(Defect operator) Given a Banach space X and a function \(F :\mathbb {P} \rightarrow X\), the defect \(\mathfrak {d}F :\mathbb {P} \rightarrow X\) is given by

$$\begin{aligned} \mathfrak {d}F (P) =F(P) - \sum _{Q\in {\mathbf {P}}^{\uparrow }}F(Q) \bigl \langle w_{Q}; w_{P} \bigr \rangle |I_{Q}| \qquad (P \in \mathbb {P}) \end{aligned}$$
(2.17)

where \(\mathbf {P} \in 3\mathbb {P}\) is the unique tritile containing P, and \({\mathbf {P}}^{\uparrow }\) is the vertical splitting of \(\mathbf {P}\) defined in (2.14).

The defect operator satisfies

$$\begin{aligned} \Vert \mathfrak {d}F(P) \Vert _{X}\lesssim \Vert F(P) \Vert _{X}+ \sum _{Q \in {\mathbf {P}}^{\uparrow }} \Vert F(Q) \Vert _{X} \end{aligned}$$
(2.18)

and, by virtue of Lemma 2.4, if \(F=\mathcal {E}[f]\) for some \(f :\mathbb {W} \rightarrow X\) then \(\mathfrak {d}F=0\).

In the following proposition, we show how a function \(F :\mathbb {P} \rightarrow X\) can be decomposed as the sum of an embedded function and its defect.

Proposition 2.12

(Function reconstruction) Let T be a tree, and let P be a tile with \(P \in T\) (recall from Remark 2.8 that this means \(\mathbf {P} \in T\), where \(\mathbf {P}\) is the unique tritile with \(P \in \mathbf {P}\)). Then for all \(N\in \mathbb {N}\) it holds that

$$\begin{aligned} \begin{aligned} F(P) = \mathfrak {d}F(P)&+ \biggl \langle \sum _{\begin{array}{c} Q\in T \\ |I_{P}|>|I_Q|\ge 3^{-N}|I_{P}| \end{array}} \mathfrak {d}F(Q) \,w_{Q} |I_Q|\, \,;\; w_{P} \biggr \rangle \\&+ \biggl \langle \sum _{\begin{array}{c} Q\in T \\ |I_Q|=3^{-(N+1)}|I_P| \end{array}} F(Q)w_{Q}|I_{Q}| \,;\; w_{P} \biggr \rangle . \end{aligned} \end{aligned}$$
(2.19)

Proof

We induct on \(N \in \mathbb {N}\). If \(N=0\) then the result follows immediately by definition of \(\mathfrak {d}F\), as the first sum is empty and the condition in the second sum is a rewriting of the condition \(Q \in {\mathbf {P}}^{\uparrow }\).

Let us show that if (2.19) holds for N then it also holds for \(N+1\). Apply the result with \(N=0\) to each tile in the second sum to obtain

$$\begin{aligned}&\sum _{\begin{array}{c} Q\in T \\ |I_{Q}|=3^{-(N+1)}|I_{P}| \end{array}} F(Q) \langle w_{Q}|I_{Q}|;w_P \rangle \\&\quad =\sum _{\begin{array}{c} Q\in T \\ |I_Q|=3^{-(N+1)}|I_P| \end{array}} \bigg ( \mathfrak {d}F(Q) + \sum _{\begin{array}{c} Q'\in T \\ |I_{Q'}|=3^{-(N+2)}|I_P| \end{array}} F(Q') \langle w_{Q'}|I_{Q'}| ; w_{Q} \rangle \bigg ) \langle w_Q|I_Q|;w_P \rangle \\&\quad = \mathbf {I} + \biggl \langle \sum _{\begin{array}{c} Q'\in T \\ |I_{Q'}|=3^{-(N+2)}|I_P| \end{array}} F(Q') w_{Q'}|I_{Q'}| ; \sum _{\begin{array}{c} Q\in T \\ |I_Q|=3^{-(N+1)}|I_P| \end{array}} \langle w_P;w_Q \rangle w_{Q}|I_Q| \biggr \rangle \\&\quad = \mathbf {I} + \biggl \langle \sum _{\begin{array}{c} Q' \in T \\ |I_{Q'}|=3^{-(N+2)}|I_P| \end{array}} \,\, F(Q')w_{Q'}|I_{Q'}| ; w_P \biggr \rangle \end{aligned}$$

where

$$\begin{aligned} \mathbf {I} := \sum _{\begin{array}{c} Q\in T \\ |I_Q|=3^{-(N+1)}|I_P| \end{array}} \mathfrak {d}F(Q) \langle w_Q|I_Q|;w_P \rangle . \end{aligned}$$

The last identity holds by Lemma 2.4, since the tiles \(Q \in T\) with \(|I_Q| = 3^{-(N+1)} |I_P|\) are disjoint and cover P. Plugging this into (2.19) for N gives the statement for \(N+1\) as required. \(\square \)

Remark 2.13

We have remarked that \(\mathfrak {d}F=0\) if \(F=\mathcal {E}[f]\) for some \(f :\mathbb {W} \rightarrow X\). A similar result is true if F is not precisely an embedded function, but rather a ‘cut-off’ embedded function; for this result we need to think in terms of tritiles rather than tiles. If \(F=\mathcal {E}[f]\) and \(\mathbb {A}\subset 3\mathbb {P}\), then \(\mathfrak {d}(\mathbb {1}_{\mathbb {A}}F) (\mathbf {P}) \ne 0\) only if \(\mathbf {P}\) happens to be on the ‘boundary’ of the set \(\mathbb {A}\); that is, if \(\mathbf {P}\in \mathbb {A}\) and there exists \(\mathbf {Q}\le \mathbf {P}\) with \(|I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3\) such that \(\mathbf {Q}\notin \mathbb {A}\), or if \(\mathbf {P}\notin \mathbb {A}\) and there exists \(\mathbf {Q}\le \mathbf {P}\) with \(|I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3\) such that \(\mathbf {Q}\in \mathbb {A}\). A crucial observation is that if \(\mathbb {A}\) is convex, then for any fixed \(x\in \mathbb {W}\) there exist at most two tritiles \(\mathbf {P}\) on the boundary of \(\mathbb {A}\) with \(x\in I_{\mathbf {P}}\).

3 Analysis in Banach Spaces

The harmonic analysis of functions \(f :\mathbb {W} \rightarrow X\) valued in a Banach space X exhibits phenomena that are not present in the scalar case \(X = \mathbb {C}\). Generally, techniques that work for scalar-valued functions require geometric assumptions on X in order to have X-valued extensions. The most famous of these geometric assumptions is the UMD (Unconditional Martingale Differences) property, which we discuss in Sect. 3.2. We will also require the r-Hilbertian property (also referred to as the \(\theta \)-Hilbertian property in the literature), which we discuss in Sect. 3.3. Before discussing these geometric assumptions we give a short introduction to Rademacher sums, a crucial tool in Banach-valued analysis without which not much can be said.

A relatively complete introduction to Banach-valued analysis is the incomplete series [19, 20]. The reader will benefit from having a copy of these references at hand while reading this paper.

3.1 Rademacher Sums

A great deal of scalar-valued harmonic analysis is connected with square functions; that is, functions of the form

$$\begin{aligned} t \mapsto \bigg ( \sum _{n=1}^N |f_n(t)|^2 \bigg )^{1/2} \qquad \forall t \in \mathbb {R}\end{aligned}$$

where \((f_n)_{n \in \{1,\ldots ,N\}}\) is a sequence of \(\mathbb {C}\)-valued functions on \(\mathbb {R}\) (for example). If X is a Banach lattice (or in particular, a function space), then for all finite sequences \((x_n)_{n=1}^N\) in X one can make sense of the quantity

$$\begin{aligned} \bigg ( \sum _{n=1}^N |x_n|^2 \bigg )^{1/2} \in X \end{aligned}$$

as an element of X. However, for general Banach spaces X, this is not possible. The correct X-valued analogue of a square function is a Rademacher sum, which is a quantity of the form

$$\begin{aligned} \mathbb {E}\bigg \Vert \sum _{n=1}^N \varepsilon _n x_n \bigg \Vert _X := \int _\Omega \bigg \Vert \sum _{n=1}^N \varepsilon _n(\omega ) x_n \bigg \Vert _X \, \mathrm {d}\omega , \end{aligned}$$

where \((x_n)_{n=1}^N\) is a finite sequence in X, and where \((\varepsilon _n)_{n=1}^N\) is a sequence of independent Rademacher variables on some probability space \(\Omega \), i.e. random variables taking each of the values \(\pm 1\) with probability 1/2. When X is a Banach lattice with finite cotype (for example, if \(X = L^p(\Xi )\) for some \(\sigma \)-finite measure space \(\Xi \), with \(p \in [1,\infty )\)), Rademacher sums are equivalent to norms of square functions; that is,

$$\begin{aligned} \mathbb {E}\bigg \Vert \sum _{n=1}^N \varepsilon _n x_n \bigg \Vert _{X} \simeq \bigg \Vert \bigg ( \sum _{n=1}^N |x_n|^2 \bigg )^{1/2} \bigg \Vert _{X} \end{aligned}$$
(3.1)

for all finite sequences \((x_n)_{n \in \{1,\ldots ,N\}}\) in X. This is the Khintchine–Maurey theorem [20, Theorem 7.2.13]. In this paper we do not work with Banach lattices, so (other than this paragraph) we do not discuss square functions; only Rademacher sums.
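In the simplest case of real scalars, the comparison between Rademacher sums and square functions can be checked by brute force. The following Python sketch (illustrative only; the function names are hypothetical) computes Rademacher moments exactly by enumerating all \(2^N\) sign patterns and compares them with the \(\ell ^2\) norm, which plays the role of the square function here.

```python
from itertools import product
from math import sqrt

def rademacher_moment(xs, p=1.0):
    """Exact (E |sum_n eps_n x_n|^p)^(1/p) over all 2^N equally likely sign patterns."""
    total = 0.0
    for signs in product((-1, 1), repeat=len(xs)):
        total += abs(sum(e * x for e, x in zip(signs, xs))) ** p
    return (total / 2 ** len(xs)) ** (1.0 / p)

def square_function(xs):
    """(sum_n |x_n|^2)^(1/2), the scalar analogue of the square function."""
    return sqrt(sum(abs(x) ** 2 for x in xs))

xs = [3.0, -1.0, 2.0, 0.5]   # arbitrary real test sequence
first = rademacher_moment(xs, 1.0)
second = rademacher_moment(xs, 2.0)
square = square_function(xs)
```

By orthogonality of the Rademacher variables, `second` coincides with `square` up to rounding, and `first` is comparable to both; this is the scalar Khintchine inequality, of which (3.1) is the lattice-valued version.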

Here we mention two particularly important results that allow us to manipulate Rademacher sums. The first lets us replace the expectation in a Rademacher sum with an \(L^p\)-expectation for any \(p \in (0,\infty )\); the second lets us pull out bounded scalar coefficients in a Rademacher sum. We will use these results throughout the paper, often without mention. For proofs see [20, Theorems 6.2.4 and 6.1.13].

Theorem 3.1

(Kahane–Khintchine) Let X be a Banach space. For all finite sequences \((x_n)_{n=1}^N\) in X and all \(p \in (0,\infty )\), we have the equivalence

$$\begin{aligned} \mathbb {E}\bigg \Vert \sum _{n=1}^N \varepsilon _n x_n \bigg \Vert _X \simeq _{p} \bigg (\mathbb {E}\bigg \Vert \sum _{n=1}^N \varepsilon _n x_n \bigg \Vert _X^p \bigg )^{1/p} \end{aligned}$$
(3.2)

with implicit constant independent of N.

Theorem 3.2

(Kahane’s contraction principle) Let X be a Banach space. For all finite sequences \((x_n)_{n=1}^N\) in X and \((a_n)_{n=1}^N\) in \(\mathbb {C}\), we have

$$\begin{aligned} \mathbb {E}\bigg \Vert \sum _{n=1}^N \varepsilon _n a_n x_n \bigg \Vert _X \lesssim \Vert a\Vert _\infty \mathbb {E}\bigg \Vert \sum _{n=1}^N \varepsilon _n x_n \bigg \Vert _X, \end{aligned}$$
(3.3)

with implicit constant independent of N.
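Both theorems can be observed numerically in a small finite-dimensional example. The sketch below is an illustration under assumptions that are not part of the text: X is taken to be \(\mathbb {R}^3\) with the supremum norm, the coefficients \(a_n\) are real, and all names are hypothetical. Expectations are computed exactly by enumerating sign patterns.

```python
from itertools import product

def sup_norm(v):
    """Supremum norm on R^d."""
    return max(abs(c) for c in v)

def rademacher_sum(xs, norm, p=1.0, coeffs=None):
    """Exact (E || sum_n eps_n a_n x_n ||^p)^(1/p), enumerating all sign patterns."""
    N, d = len(xs), len(xs[0])
    a = coeffs if coeffs is not None else [1.0] * N
    total = 0.0
    for signs in product((-1, 1), repeat=N):
        v = [sum(e * a_n * x[i] for e, a_n, x in zip(signs, a, xs)) for i in range(d)]
        total += norm(v) ** p
    return (total / 2 ** N) ** (1.0 / p)

# Arbitrary test vectors in R^3 and real coefficients with |a_n| <= 1.
xs = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [2.0, 1.0, -1.0], [0.0, 3.0, 1.0]]
a = [0.3, -1.0, 0.7, 0.0]

m1 = rademacher_sum(xs, sup_norm, 1.0)
m2 = rademacher_sum(xs, sup_norm, 2.0)
contracted = rademacher_sum(xs, sup_norm, 1.0, a)
```

Here `m1` and `m2` are the two sides of (3.2) for \(p=2\), and `contracted` is the left-hand side of (3.3); for real coefficients bounded by 1 one expects `contracted` to be at most `m1`.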

3.2 The UMD Property

As already mentioned, the most important of our geometric assumptions is the UMD property. It is natural to assume this property when doing Banach-valued harmonic analysis, as a Banach space X is UMD if and only if the Hilbert transform extends to a bounded operator on \(L^p(\mathbb {R};X)\) for all \(p \in (1,\infty )\) [5, 6]. The classical reflexive function spaces, for example \(L^p\)-spaces, Sobolev spaces, and Triebel–Lizorkin and Besov spaces, are all UMD. However, there are also important UMD spaces that are not function spaces (or not even Banach lattices); in particular, non-commutative \(L^p\)-spaces, including the Schatten classes \(\mathcal {C}^p\) (see [34, Chapter 14] and [19, Appendix D]). For more exposition on UMD spaces see for example [7, 19, 34]. We recall one possible definition of the UMD property in terms of Haar decompositions.

For every dyadic interval \(J=[m2^{n},(m+1)2^{n}) \subset \mathbb {R}\), \(n,m\in \mathbb {Z}\), define the \(L^1\)-normalised Haar function

$$\begin{aligned} h_{J}:=|J|^{-1}( \mathbb {1}_{J_{0}}-\mathbb {1}_{J_{1}}), \end{aligned}$$
(3.4)

where \(J_0\) and \(J_1\) are the left and right halves of \(J=J_{0} \cup J_{1}\), i.e.

$$\begin{aligned} J_{0} := [m2^{n},(2m+1)2^{n-1}), \; J_{1} := [(2m+1)2^{n-1},(m+1)2^{n} ). \end{aligned}$$
(3.5)

It is straightforward to see that \(\langle h_{J};h_{J'}\rangle =0\) unless \(J=J'\), and thus

$$\begin{aligned} \bigg \Vert \sum _{J\subset [0,1)} a_{J} \langle f ; h_{J} \rangle h_{J} |J| \bigg \Vert _{L^2([0,1);\mathbb {C})} \le \Vert f \Vert _{L^{2}([0,1);\mathbb {C})} \qquad \forall f \in L^2([0,1);\mathbb {C}) \end{aligned}$$
(3.6)

for any finitely-supported sequence of signs \(a_J \in \{-1,1\}\), where the sum is over all dyadic intervals \(J \subset [0,1)\). When \(L^2\) is replaced with \(L^p\) for some \(p \in (1,\infty )\), the estimate (3.6) still holds, with a constant depending on p (although naturally the proof above, being reliant on orthogonality, does not extend to \(p \ne 2\)). This motivates the following definition.

Definition 3.3

A Banach space X has the UMD property if there exists \(p \in (1,\infty )\) such that for any \(f\in L^{p}([0,1),X)\) and any finitely-supported sequence \((a_{J})_{J \subset [0,1)}\) of signs, it holds that

$$\begin{aligned} \bigg \Vert \sum _{J\subset [0,1)} a_{J}\langle f ; h_{J}\rangle h_{J} |J| \bigg \Vert _{L^p([0,1);X)} \lesssim \Vert f \Vert _{L^{p}([0,1),X)} \end{aligned}$$
(3.7)

where the sum is over all dyadic intervals \(J\subset [0,1)\).

If (3.7) holds for one \(p \in (1,\infty )\), then it holds for all \(p \in (1,\infty )\) (with a different constant) and with [0, 1) replaced by any dyadic interval (see [19, Theorems 4.2.7 and 4.2.12]).
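The scalar \(L^2\) estimate (3.6) can be verified by hand on a coarse grid. The following Python sketch is only an illustration: functions on [0, 1) are assumed to be real and constant on dyadic cells of length \(2^{-3}\), and the names `haar`, `inner`, `sign_transform`, and `worst_ratio` are hypothetical.

```python
from itertools import product

M = 3                # functions are constant on cells of length 2**-M
CELLS = 2 ** M

def haar(j, m):
    """L^1-normalised Haar function of J = [m 2^-j, (m+1) 2^-j), sampled on the cells."""
    width = CELLS >> j                 # number of cells in J
    start = m * width
    h = [0.0] * CELLS
    for i in range(width // 2):
        h[start + i] = float(2 ** j)                # +1/|J| on the left half
        h[start + width // 2 + i] = -float(2 ** j)  # -1/|J| on the right half
    return h

def inner(f, g):
    """Integral of f * g over [0, 1) for real cell-wise constant functions."""
    return sum(a * b for a, b in zip(f, g)) / CELLS

def l2_norm(f):
    return inner(f, f) ** 0.5

# All dyadic J in [0, 1) that the grid can resolve, indexed by (level j, position m).
intervals = [(j, m) for j in range(M) for m in range(2 ** j)]

def sign_transform(f, signs):
    """sum_J a_J <f; h_J> h_J |J| for a choice of signs a_J."""
    out = [0.0] * CELLS
    for (j, m), a in signs.items():
        h = haar(j, m)
        c = a * inner(f, h) * 2.0 ** (-j)
        out = [o + c * hi for o, hi in zip(out, h)]
    return out

def worst_ratio(f):
    """Largest ratio ||T_a f||_2 / ||f||_2 over all sign patterns a."""
    best = 0.0
    for pattern in product((-1, 1), repeat=len(intervals)):
        g = sign_transform(f, dict(zip(intervals, pattern)))
        best = max(best, l2_norm(g) / l2_norm(f))
    return best

f = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]   # arbitrary test function
```

Since distinct Haar functions are orthogonal, `worst_ratio(f)` cannot exceed 1, whatever signs are chosen. For \(p \ne 2\) and Banach-valued f no such computation is available, which is where the UMD assumption enters.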

The Haar functions are in fact 2-Walsh wave packets associated to \(T^1\bigl (B_{1}(0)^{2}\bigr )\), so the bound (3.7) can be interpreted as unconditionality of a tree projection operator. In the 3-Walsh case, we use the following randomised version of (3.7); the proof is a bit harder than the 2-Walsh case because the tree projections cannot be directly related to martingale transforms. The idea is to reduce to the tree \(T(B_1(0)^2)\) by modulation, translation, and dilation, and then to reduce matters to a result of Clément et al. [9] which has already done the hard work of relating 3-Walsh–Fourier projections to martingale transforms.

Proposition 3.4

Let \(p \in (1,\infty )\) and X be a UMD Banach space. Then for all trees T and all \(f \in L^p(I_T;X)\) we have

$$\begin{aligned} \mathbb {E}\bigg \Vert \sum _{\begin{array}{c} u,v=0\\ u\ne v \end{array}}^{2}\sum _{\mathbf {P}\in T^{u}} \varepsilon _{\mathbf {P}_{v}} \langle f; w_{\mathbf {P}_v} \rangle w_{\mathbf {P}_v} |I_{\mathbf {P}}| \bigg \Vert _{L^p(I_{T};X)} \lesssim \Vert f\Vert _{L^p(I_T;X)}. \end{aligned}$$
(3.8)

Proof

First we reduce to consideration of the tree \(T_1 := T(B_1(0)^2)\). Fix an arbitrary tree T. Define the ‘lacunary tiles’ associated with T to be the set of tiles

$$\begin{aligned} T^{\mathrm {lac}}:=\bigcup _{\begin{array}{c} u,v=0\\ u\ne v \end{array}}^{2}\{\mathbf {P}_{v} : \mathbf {P}\in T^{u} \} = \bigcup _{\begin{array}{c} v=0 \end{array}}^{2}\{\mathbf {P}_{v} : \mathbf {P}\in T, \, \xi _{T}\notin \omega _{\mathbf {P}_{v}}\}. \end{aligned}$$

The lacunary tiles associated with T can be related to those associated with \(T_1\) by the relation

$$\begin{aligned} {{\,\mathrm{Dil}\,}}_{|I_{T}|^{-1}}{{\,\mathrm{Tr}\,}}_{-x_{T}}{{\,\mathrm{Mod}\,}}_{-\xi _{T}} T^{\mathrm {lac}} =T_1^{\mathrm {lac}}, \end{aligned}$$

with dilation, translation, and modulation operators acting on tiles as in Remark 2.3. Applying these operators to the wave packets appearing in (3.8) we obtain

$$\begin{aligned}&{{\,\mathrm{Dil}\,}}_{|I_{T}|^{-1}}{{\,\mathrm{Tr}\,}}_{-x_{T}}{{\,\mathrm{Mod}\,}}_{-\xi _{T}} \sum _{\begin{array}{c} u,v=0\\ u\ne v \end{array}}^{2}\sum _{\mathbf {P}\in T^{u}} \varepsilon _{\mathbf {P}_{v}} \langle f; w_{\mathbf {P}_v} \rangle w_{\mathbf {P}_v} |I_{\mathbf {P}}| \\&\quad = \sum _{P\in T^{\mathrm {lac}}} \varepsilon _{P} \langle f; w_{P} \rangle \, {{\,\mathrm{Dil}\,}}_{|I_{T}|^{-1}}{{\,\mathrm{Tr}\,}}_{-x_{T}}{{\,\mathrm{Mod}\,}}_{-\xi _{T}} w_{P} \,|I_{P}| \\&\quad = \sum _{ P\in T_1^{\mathrm {lac}} } a_{P}\varepsilon _{P} \langle {{\,\mathrm{Dil}\,}}_{|I_{T}|^{-1}}{{\,\mathrm{Tr}\,}}_{-x_{T}}{{\,\mathrm{Mod}\,}}_{-\xi _{T}}f; w_{P} \rangle \,w_{P}\, |I_{P}| \end{aligned}$$

for some unimodular constants \(a_{P}\). Supposing that (3.8) holds for \(T_1\), the contraction principle (3.3) yields that

$$\begin{aligned}&|I_{T}|^{-1/p'}\mathbb {E}\bigg \Vert \sum _{\begin{array}{c} u,v=0\\ u\ne v \end{array}}^{2}\sum _{\mathbf {P}\in T^{u}} \varepsilon _{\mathbf {P}_{v}} \langle f; w_{\mathbf {P}_v} \rangle w_{\mathbf {P}_v} |I_{\mathbf {P}}| \bigg \Vert _{L^p(I_{T};X)} \\&\quad = \mathbb {E}\bigg \Vert \sum _{\begin{array}{c} u,v=0\\ u\ne v \end{array}}^{2}\sum _{\mathbf {P}\in T_1^{u}} a_{\mathbf {P}_{v}}\varepsilon _{\mathbf {P}_{v}} \langle {{\,\mathrm{Dil}\,}}_{|I_{T}|^{-1}}{{\,\mathrm{Tr}\,}}_{-x_{T}}{{\,\mathrm{Mod}\,}}_{-\xi _{T}} f; w_{\mathbf {P}_v} \rangle w_{\mathbf {P}_v} |I_{\mathbf {P}}| \bigg \Vert _{L^p(B_{1}(0);X)} \\&\quad \lesssim \Vert {{\,\mathrm{Dil}\,}}_{|I_{T}|^{-1}}{{\,\mathrm{Tr}\,}}_{-x_{T}}{{\,\mathrm{Mod}\,}}_{-\xi _{T}} f \Vert _{L^p(B_{1}(0);X)} =|I_{T}|^{-1/p'}\Vert f \Vert _{L^p(I_{T};X)}. \end{aligned}$$

Thus it suffices to show (3.8) for the tree \(T = T_1\). For this tree, only the 0-part is nontrivial; i.e. \(T_1=T_1^{0}\). Let us show the bound restricted to the summand corresponding to \(v=1\); the bound for the \(v=2\) summand is shown in the same way, and one combines these summands using the triangle inequality. Using the Kahane–Khintchine inequality (Theorem 3.1) and Fubini one has

$$\begin{aligned}&\Biggl ( \mathbb {E}\bigg \Vert \sum _{\mathbf {P}\in T^{0}} \varepsilon _{\mathbf {P}_{1}} \langle f; w_{\mathbf {P}_1} \rangle w_{\mathbf {P}_1} |I_{\mathbf {P}}| \bigg \Vert _{L^p(B_{1}(0);X)}\Biggr )^{p} \\&\quad \simeq \mathbb {E}\bigg \Vert \sum _{\mathbf {P}\in T^{0}} \varepsilon _{\mathbf {P}_{1}} \langle f; w_{\mathbf {P}_1} \rangle w_{\mathbf {P}_1} |I_{\mathbf {P}}| \bigg \Vert _{L^p(B_{1}(0);X)}^{p} \\&\quad = \int _{B_{1}(0)} \int _\Omega \bigg \Vert \sum _{\mathbf {P}\in T^{0}} \varepsilon _{\mathbf {P}_{1}}(\omega ) \langle f; w_{\mathbf {P}_1} \rangle w_{\mathbf {P}_1}(x) |I_{\mathbf {P}}| \bigg \Vert _{X}^{p} \, \mathrm {d}\omega \, \mathrm {d}x \\&\quad = \int _{B_{1}(0)} \int _\Omega \bigg \Vert \sum _{\mathbf {P}\in T^{0}} \varepsilon _{|I_{\mathbf {P}}|}(\omega ) \langle f; w_{\mathbf {P}_1} \rangle w_{\mathbf {P}_1}(x) |I_{\mathbf {P}}| \bigg \Vert _{X}^{p} \, \mathrm {d}\omega \, \mathrm {d}x \\&\quad \simeq \mathbb {E}\bigg \Vert \sum _{n=0}^{\infty }\varepsilon _{n}\,\, \sum _{\begin{array}{c} \mathbf {P}\in T^{0}\\ |I_{\mathbf {P}}|=3^{-n} \end{array}} \langle f; w_{\mathbf {P}_1} \rangle w_{\mathbf {P}_1} |I_{\mathbf {P}}| \bigg \Vert _{L^p(B_{1}(0);X)}^{p}. \end{aligned}$$

In reindexing the Rademacher variables we used that for each \(x \in B_1(0)\) the tiles \(\mathbf {P} \in T^0\) for which \(w_{\mathbf {P}_1}(x) \ne 0\) are in bijective correspondence with the scales \(\{|I_\mathbf {P}| : \mathbf {P} \in T^0\}\), and thus the two sets of Rademacher variables

$$\begin{aligned} \{\varepsilon _{\mathbf {P}_1} : \mathbf {P} \in T^0, w_{\mathbf {P}_1}(x) \ne 0\}, \quad \{\varepsilon _{|I_{\mathbf {P}}|} : \mathbf {P} \in T^0\} \end{aligned}$$

are equally distributed.

For each \(n \in \mathbb {N}\), let \(S_{n}\) denote the Walsh–Fourier projection onto the interval \(B_{3^{-n}}(3^{-n})=\{\xi \in \mathbb {W}:\xi _{-k}=\delta _{-n}(k)\}\), so that

$$\begin{aligned} \mathbb {E}\bigg \Vert \sum _{\mathbf {P}\in T^{0}} \varepsilon _{\mathbf {P}_{1}} \langle f; w_{\mathbf {P}_1} \rangle w_{\mathbf {P}_1} |I_{\mathbf {P}}| \bigg \Vert _{L^p(B_{1}(0);X)} \simeq \mathbb {E}\bigg \Vert \sum _{n=0}^{\infty } \varepsilon _{n} S_{n}f \bigg \Vert _{L^p(B_{1}(0);X)}. \end{aligned}$$

To bound this quantity, we use a result of Clément et al. [9, Corollary 4.4]; since X is UMD, this result implies

$$\begin{aligned} \mathbb {E}\bigg \Vert \sum _{n \ge 1} \varepsilon _n S_{n} f\bigg \Vert _p \lesssim \Vert f\Vert _{L^p(B_1(0);X)} \qquad \forall f \in L^p(B_1(0);X) \end{aligned}$$
(3.9)

and completes the proof.Footnote 7\(\square \)

Remark 3.5

With additional work, one could improve the randomised estimate (3.8) to full unconditionality (i.e. replacing the Rademacher variables with an arbitrary deterministic choice of signs) by working through [9, Section 4], modifying the martingale difference sequence to take into account orthogonal wave packets at the same scale, as in the proof of unconditionality of the Haar decomposition (see [19, Theorem 4.2.13]). Since we only need the randomised estimate, we leave this to the hypothetical interested reader.

3.3 r-Hilbertian Spaces

For \(p,q \in [1,\infty ]\) and \(\theta \in [0,1]\), we let \([p,q]_\theta \in [1,\infty ]\) be the number defined by the relation

$$\begin{aligned} \frac{1}{[p,q]_\theta } = \frac{1-\theta }{p} + \frac{\theta }{q}. \end{aligned}$$

Definition 3.6

Let \(r \in [2,\infty )\). We say that a Banach space X is r-Hilbertian if there exist a Hilbert space H and a Banach space Y, such that \((H,Y)\) is an interpolation couple, and such that X is isomorphic to the complex interpolation space \([H,Y]_\theta \), with \([2,\infty ]_\theta = r\).

Remark 3.7

In [34], r-Hilbertian spaces are referred to as \(\theta \)-Hilbertian. In our computations the parameter r plays a more important role, so we prefer to use our terminology.

For an introduction to interpolation spaces, see for example [4] or [19, Appendix C]. Note that if X is r-Hilbertian, then X is s-Hilbertian for all \(s > r\). Every \(L^p\)-space with \(p \in [2,\infty )\), either classical or non-commutative, is p-Hilbertian: to see this, note that \(L^p = [L^2,L^\infty ]_\theta \) with \([2,\infty ]_\theta = p\). By the same argument, replacing \(L^\infty \) with \(L^1\), \(L^p\) is \(p^\prime \)-Hilbertian when \(p \in (1,2]\).
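The exponent bookkeeping in the preceding paragraph can be spelled out with exact rational arithmetic. The Python sketch below is an illustration only; the names `interp` and `hilbertian_exponent`, and the convention of writing `None` for the exponent \(\infty \), are assumptions made here.

```python
from fractions import Fraction

INF = None  # stands for the exponent infinity, with 1/INF = 0

def recip(p):
    """Reciprocal of an exponent, with 1/infinity = 0."""
    return Fraction(0) if p is INF else 1 / Fraction(p)

def interp(p, q, theta):
    """[p, q]_theta, defined by 1/[p, q]_theta = (1 - theta)/p + theta/q."""
    r = (1 - theta) * recip(p) + theta * recip(q)
    return INF if r == 0 else 1 / r

def hilbertian_exponent(p):
    """For 1 < p < infinity, return (theta, r) with L^p = [L^2, L^q]_theta,
    where q = infinity if p >= 2 and q = 1 otherwise, and r = [2, infinity]_theta."""
    p = Fraction(p)
    if p >= 2:
        theta = 1 - 2 / p      # solves (1 - theta)/2 = 1/p
    else:
        theta = 2 / p - 1      # solves (1 - theta)/2 + theta = 1/p
    return theta, interp(2, INF, theta)

theta_4, r_4 = hilbertian_exponent(4)                   # L^4 = [L^2, L^infinity]_theta
theta_43, r_43 = hilbertian_exponent(Fraction(4, 3))    # L^{4/3} = [L^2, L^1]_theta
```

For \(p = 4\) this gives \(\theta = 1/2\) and \(r = 4 = p\), while for \(p = 4/3\) it gives \(\theta = 1/2\) and \(r = 4 = p^\prime \), matching the two cases described above.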

r-Hilbertian spaces enjoy the following ‘r-orthogonality’ of wave packet coefficients, which should be compared to the notions of tile-type and quartile-type in [21,22,23,24]. It should be noted that this is the only consequence of the r-Hilbertian property that we actually use. Thus one could isolate this estimate as a geometric assumption, perhaps called ‘Walsh tile-type r’ (although that name is already taken). However, we do not know how to establish the property without assuming the r-Hilbertian property, so we choose not to make this definition.

Proposition 3.8

(Walsh tile-type) Let X be r-Hilbertian. Then

$$\begin{aligned} \biggl ( \sum _{P \in A} \Vert \langle f; w_P \rangle \Vert _X^r \, |I_{P}|\biggr )^{1/r} \lesssim \Vert f\Vert _{L^r(\mathbb {W};X)} \qquad \forall f \in L^r(\mathbb {W};X), \end{aligned}$$
(3.10)

for any finite collection \(A\subset \mathbb {P}\) of pairwise disjoint tiles, with implicit constant independent of A.

Proof

Suppose X is isomorphic to \([H,Y]_\theta \), where H is a Hilbert space, Y is a Banach space, and \([2,\infty ]_\theta = r\). Let \(\mathring{L}^\infty (\mathbb {W};Y)\) denote the closure of the Schwartz functions \(\mathscr {S}(\mathbb {W};Y)\) in \(L^\infty (\mathbb {W};Y)\). A straightforward estimate yields

$$\begin{aligned} \sup _{P \in A} \Vert \langle f; w_P \rangle \Vert _{Y}\le \Vert f\Vert _{L^\infty (\mathbb {W};Y)} \qquad \forall f \in \mathring{L}^\infty (\mathbb {W};Y), \end{aligned}$$

while Plancherel’s theorem yields

$$\begin{aligned} \biggl ( \sum _{P \in A} \Vert \langle f; w_P \rangle \Vert _{H}^{2}\,|I_{P}|\biggr )^{1/2} \le \Vert f\Vert _{L^2(\mathbb {W};H)} \qquad \forall f \in L^2(\mathbb {W};H). \end{aligned}$$

The desired inequality follows by complex interpolation, with all sequence spaces on A weighted by \(P \mapsto |I_P|\),

$$\begin{aligned} \ell ^r(A;X) \cong \ell ^r(A;[H,Y]_\theta ) = [\ell ^2(A;H), \ell ^\infty (A;Y)]_\theta \end{aligned}$$

(see [37, §1.18.1, Remark 2] for the equality at the end, and [37, §1.18.4, Remark 3] for interpolation between \(L^2\) and \(\mathring{L}^\infty \)). \(\square \)

Remark 3.9

It is natural to suspect that if a Banach space X is r-Hilbertian for some \(r < \infty \), then it must be UMD. This is false; a counterexample is given by Qiu’s construction (see [19, §4.3.c] and [35]).Footnote 8 For all \(r \in [2,\infty ]\), and \(k \in \mathbb {N}\), inductively define spaces

$$\begin{aligned} X_0^r := \ell ^2_2(\ell ^r_2), \qquad X_{k+1}^r := \ell _2^2(\ell _2^r(X_{k}^r)) \end{aligned}$$

(here \(\ell ^r_2(Y) := \ell ^r(\{0,1\};Y)\), where \(\{0,1\}\) is equipped with counting measure). Then set \(X^r := \oplus ^r_{k \in \mathbb {N}} X_k^r\). For all \(r \ne 2\), \(X^r\) is not UMD, while \(X^r = [X^2,X^\infty ]_\theta \) is r-Hilbertian.

4 Outer-\(L^p\) Spaces

In this section we introduce outer structures and their associated outer-\(L^p\) quasinorms. Roughly speaking, an outer structure on a topological space consists of an outer measure on the space, a Banach space X, and a size on X-valued functions on the topological space. Currently the standard references on this topic are the initial work by Do and Thiele [16], and the first Banach-valued implementation by Di Plinio and Ou [15]. However, the outer-\(L^p\) concept is still quite new, and the terminology and definitions are not fixed. Our interpretation of the theory differs slightly (but not fundamentally) from what appears in the literature. In Sects. 4.2 and 4.3 we analyse particular outer structures that are relevant to our problem.

4.1 Initial Definitions

For a topological space \(\mathbb {X}\) we let \(\mathscr {B}(\mathbb {X})\) denote the \(\sigma \)-algebra of Borel sets in \(\mathbb {X}\), and for a Banach space X we let \(\mathscr {B}(\mathbb {X};X)\) denote the set of strongly Borel measurable functions \(\mathbb {X}\rightarrow X\). Recall that a Polish space is a topological space that is homeomorphic to a complete separable metric space. This is a technical assumption that will ultimately play no role in this paper, as we only really care about the countable space \(3\mathbb {P}\) with the discrete topology.

Definition 4.1

(Outer structure) Let \(\mathbb {X}\) be a Polish space. An outer structure on \(\mathbb {X}\), or simply an outer structure, consists of the following data:

  • a collection \(\mathfrak {E}\subset \mathscr {B}(\mathbb {X})\) of generating sets,

  • a function \(\sigma :\mathfrak {E} \rightarrow [0,\infty )\), called the premeasure,

  • a Banach space X,

  • an X-size (or simply a size) S on \((\mathbb {X},\mathfrak {E})\); that is, a family of maps indexed by \(E\in \mathfrak {E}\)

    $$\begin{aligned} \mathscr {B}(\mathbb {X};X) : F \mapsto \Vert F \Vert _{S(E)} \in [0,\infty ] \qquad \forall E \in \mathfrak {E}\end{aligned}$$

    such that there exists a constant \(C \ge 1\) satisfying the following properties for all \(E \in \mathfrak {E}\) and \(F,G \in \mathscr {B}(\mathbb {X};X)\):

    unconditionality:

    \(\Vert \mathbb {1}_{A} F \Vert _{S(E)} \le C \Vert F \Vert _{S(E)}\) for all

    $$\begin{aligned} A\in \mathfrak {E}^{\cup }=\Bigl \{ A :A=\bigcup _{n\in \mathbb {N}}E_{n} \text { with } E_{n}\in \mathfrak {E}\Bigr \}; \end{aligned}$$

    homogeneity:

    \(\Vert \lambda F \Vert _{S(E)} = |\lambda | \Vert F \Vert _{S(E)}\) for all \(\lambda \in \mathbb {C}\);

    quasi-triangle inequality:

    \(\Vert F+G \Vert _{S(E)} \le C(\Vert F \Vert _{S(E)} + \Vert G \Vert _{S(E)})\);

    nondegeneracy:

    \(\Vert F \Vert _{S(E)}=0\) for all \(E\in \mathfrak {E}\) if and only if \(F=0\).

    That is, the maps \(\Vert \cdot \Vert _{S(E)}\) are (possibly infinite) quasinorms on \(\mathscr {B}(\mathbb {X};X)\), with quasinorm constant uniformly bounded in \(E \in \mathfrak {E}\), and with an additional unconditionality property.Footnote 9

Given an outer structure on \(\mathbb {X}\) as above, we define the induced outer measure \(\sigma :\mathcal {P}(\mathbb {X}) \rightarrow [0,\infty ]\) (which we denote by the same letter as the premeasure) by

$$\begin{aligned} \sigma (A) := \inf \Bigl \{\sum _{E\in \mathbf {E}\subset \mathfrak {E}} \sigma (E) : \bigcup _{E\in \mathbf {E}} E \supset A \Bigr \} \qquad \forall A \subset \mathbb {X}\end{aligned}$$

where the infimum is taken over all countable covers \(\mathbf {E}\) of A by generating sets. For all \(f \in \mathscr {B}(\mathbb {X};X)\) we define \(\Vert f \Vert _S := \sup _{E \in \mathfrak {E}} \Vert f \Vert _{S(E)}\), and for all \(\lambda > 0\) we define the outer superlevel measure

$$\begin{aligned} \sigma (\Vert f \Vert _{S} > \lambda ) := \inf \{\sigma (A) : A \subset \mathbb {X}, \Vert \mathbb {1}_{\mathbb {X}\setminus A}f \Vert _{S} \le \lambda \}. \end{aligned}$$

Different choices of sizes lead to fundamentally different outer structures, even when the outer measure and the Banach space remain fixed. Thus we consider the size (and the underlying Banach space) as a component of the outer structure.

To each outer structure is associated a family of quasinorms, defined in a way that mimics the so-called layer cake representation of the \(L^p\) norm.

Definition 4.2

(Outer-\(L^{p}\) quasinorms) Let \(\mathbb {X}\) be a Polish space, and let \((\mathfrak {E},\sigma ,X,S)\) be an outer structure on \(\mathbb {X}\). For \(p \in (0,\infty ]\) we define the outer-\(L^p\) quasinorms, and for \(p \in (0,\infty )\) the weak outer-\(L^p\) quasinorms, of a function \(F\in \mathscr {B}(\mathbb {X};X)\) by setting

$$\begin{aligned}&\Vert F\Vert _{L_{\sigma }^{p} S} := \bigg ( \int _{0}^{\infty } p\lambda ^{p-1} \sigma (\Vert F \Vert _{S}> \lambda ) \, \mathrm {d}\lambda \bigg )^{1/p}&\forall p\in (0,\infty ), \\&\Vert F\Vert _{L_{\sigma }^{p,\infty } S} := \sup _{\lambda> 0} \lambda \, \sigma (\Vert F \Vert _{S} > \lambda )^{1/p}&\forall p\in (0,\infty ), \\&\Vert F\Vert _{L_{\sigma }^{\infty } S} := \Vert F \Vert _{S}. \end{aligned}$$

It is straightforward to check that these are indeed quasinorms.
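As a sanity check, consider the following toy outer structure; it is only an illustration and is not used elsewhere in the paper. Take \(\mathbb {X}=\mathbb {N}\) with the discrete topology, let the generating sets be the singletons \(\{n\}\) with premeasure \(\sigma (\{n\}) = 1\), let \(X=\mathbb {C}\), and set \(\Vert F \Vert _{S(\{n\})} := |F(n)|\). The induced outer measure is then counting measure, the smallest admissible set in the definition of the outer superlevel measure is \(\{n : |F(n)| > \lambda \}\), and the outer-\(L^p\) quasinorm reduces to the \(\ell ^p\) norm:

$$\begin{aligned} \Vert F \Vert _{S}&= \sup _{n \in \mathbb {N}} |F(n)|, \qquad \sigma (\Vert F \Vert _{S}> \lambda ) = \# \{n \in \mathbb {N}: |F(n)| > \lambda \}, \\ \Vert F \Vert _{L^p_\sigma S}^p&= \int _0^\infty p\lambda ^{p-1} \, \# \{n : |F(n)| > \lambda \} \, \mathrm {d}\lambda = \sum _{n \in \mathbb {N}} \int _0^{|F(n)|} p\lambda ^{p-1} \, \mathrm {d}\lambda = \sum _{n \in \mathbb {N}} |F(n)|^p. \end{aligned}$$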

A Hölder-type inequality holds for outer-\(L^p\) spaces defined with respect to different sizes provided that it holds in a certain sense for the sizes themselves. The proof below is a straightforward extension of that of [16, Proposition 3.4].

Proposition 4.3

(Outer Hölder inequality) Let \(\mathbb {X}\) be a Polish space. For each \(u \in \{0,1,2\}\) let \((\mathfrak {E},\sigma ,X_u,S_u)\) be an outer structure on \(\mathbb {X}\), and let \((\mathfrak {E},\sigma ,X,S)\) be another outer structure on \(\mathbb {X}\). Note that all these outer structures have the same generating sets and premeasure. Let \(\Pi :X_0 \times X_1 \times X_2 \rightarrow X\) be a bounded trilinear map, and suppose that the size-Hölder inequality

$$\begin{aligned} \Vert \Pi (F_{0},F_{1},F_{2}) \Vert _{S} \lesssim \prod _{u=0}^{2} \Vert F_{u} \Vert _{S_{u}} \qquad \forall F_u \in \mathscr {B}(\mathbb {X};X_u) \end{aligned}$$
(4.1)

holds. Then for all \(p_u \in [1,\infty ]\) we have the outer Hölder inequality

$$\begin{aligned} \Vert \Pi (F_{0},F_{1},F_{2}) \Vert _{L_\sigma ^{p} S} \lesssim _{p_0,p_1,p_2} \prod _{u=0}^{2} \Vert F_{u} \Vert _{L_{\sigma }^{p_{u}} S_{u}} \qquad \forall F_u \in \mathscr {B}(\mathbb {X};X_u) \end{aligned}$$
(4.2)

with \(p^{-1}=\sum _{u=0}^{2}p_{u}^{-1}\).

Proof

Assume that the factors on the right hand side of (4.2) are finite and non-zero, for otherwise there is nothing to prove. By homogeneity we may assume that \(\Vert F_{u} \Vert _{L^{p_{u}}_{\sigma }S_{u}}^{p_{u}} =1\) for each u. For each \(u\in \{0,1,2\}\) and \(n\in \mathbb {Z}\) let \(A_{n}^{u}\subset \mathbb {X}\) be such that

$$\begin{aligned} \sum _{n\in \mathbb {Z}} 2^{n} \sigma (A_{n}^{u}) \lesssim 1 \quad \text {and} \quad \Vert \mathbb {1}_{\mathbb {X}\setminus A_{n}^{u}} F_{u}\Vert _{S_{u}}\lesssim 2^{n/p_{u}}. \end{aligned}$$

We may assume that \(A_{n}^{u}\subset A_{n-1}^{u}\) by considering \({\widetilde{A}}_{n}^{u}=\bigcup _{k\ge n} A_{k}^{u}\) and noticing that \({\widetilde{A}}_{n}^{u}\) satisfies the conditions above. Let \(A_{n}=\bigcup _{u=0}^{2}A_{n}^{u}\). Then it holds that

$$\begin{aligned} \sum _{n\in \mathbb {Z}} 2^{n} \sigma (A_{n}) \lesssim \sum _{n\in \mathbb {Z}} \sum _{u=0}^{2}2^{n} \sigma (A_{n}^{u})\lesssim 1, \end{aligned}$$

and from (4.1) it follows that

$$\begin{aligned} \Vert \mathbb {1}_{\mathbb {X}\setminus A_{n}} \Pi (F_{0},F_{1},F_{2})\Vert _{S}= \Vert \Pi ( \mathbb {1}_{\mathbb {X}\setminus A_{n}}F_{0}, \mathbb {1}_{\mathbb {X}\setminus A_{n}}F_{1}, \mathbb {1}_{\mathbb {X}\setminus A_{n}}F_{2})\Vert _{S}\lesssim 2^{n/p}, \end{aligned}$$

which concludes the proof. \(\square \)
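The concluding step can be spelled out as follows. This is a sketch in which we write \(G := \Pi (F_0,F_1,F_2)\), assume \(p < \infty \), and let C denote the implicit constant in the last display; these names are introduced here only for illustration. The outer superlevel measure is non-increasing in \(\lambda \), and \(\sigma (\Vert G \Vert _{S} > C2^{n/p}) \le \sigma (A_n)\) by definition, so splitting the layer cake integral at the points \(C2^{n/p}\) gives

$$\begin{aligned} \Vert G \Vert _{L^p_\sigma S}^p = \sum _{n \in \mathbb {Z}} \int _{C2^{n/p}}^{C2^{(n+1)/p}} p\lambda ^{p-1} \sigma (\Vert G \Vert _{S}> \lambda ) \, \mathrm {d}\lambda \le \sum _{n \in \mathbb {Z}} \sigma (A_n) \, C^p \bigl ( 2^{n+1} - 2^{n} \bigr ) = C^p \sum _{n \in \mathbb {Z}} 2^n \sigma (A_n) \lesssim 1, \end{aligned}$$

which is (4.2) under the normalisation \(\Vert F_{u} \Vert _{L^{p_{u}}_{\sigma }S_{u}} = 1\).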

It is possible to control classical \(L^1\) norms by outer-\(L^1\) quasinorms, by the following Radon–Nikodym-type domination principle. For the proof in the case \(X = \mathbb {C}\), which extends to general Banach spaces, see [38, Lemma 2.2] and [16, Proposition 3.6].

Proposition 4.4

(Radon–Nikodym-type domination) Let \(\mathbb {X}\) be a Polish space, and let \((\mathfrak {E},\sigma ,X,S)\) be an outer structure on \(\mathbb {X}\) such that \(\mathbb {X}= \bigcup _{i \in \mathbb {N}} E_i\) for some countable sequence of generating sets \(E_i \in \mathfrak {E}\). If \(\mathfrak {m}\) is a positive Borel measure on \(\mathbb {X}\) such that

$$\begin{aligned} \int _E \Vert F(x) \Vert _{X} \, \mathrm {d}\mathfrak {m} (x) \lesssim \Vert F \Vert _{S(E)} \sigma (E) \qquad \forall E \in \mathfrak {E}, \, \forall F \in \mathscr {B}(\mathbb {X};X) \end{aligned}$$

and

$$\begin{aligned} \sigma (A) = 0 \, \Rightarrow \, \mathfrak {m}(A) = 0 \qquad \forall A \in \mathscr {B}(\mathbb {X}), \end{aligned}$$

then

$$\begin{aligned} \int _\mathbb {X}\Vert F(x) \Vert _{X} \, \mathrm {d}\mathfrak {m}(x) \lesssim \Vert F\Vert _{L^1_\sigma S} \qquad \forall F \in \mathscr {B}(\mathbb {X};X). \end{aligned}$$

The outer-\(L^p\) spaces support a useful Marcinkiewicz-type interpolation theorem, proven in [16, Proposition 3.5] (see also [15, Proposition 7.4]). In applications we only prove bounds for outer-\(L^p\) quasinorms by establishing endpoint weak outer-\(L^p\) bounds.

Proposition 4.5

(Marcinkiewicz interpolation) Let \(\mathbb {X}\) be a Polish space, and let \((\mathfrak {E}, \sigma , X, S)\) be an outer structure on \(\mathbb {X}\). Let \(\Omega \) be a \(\sigma \)-finite measure space, and let T be a quasi-sublinear operator mapping \(L^{p_1}(\Omega ;X) + L^{p_2}(\Omega ;X)\) into \(\mathscr {B}(\mathbb {X};X)\) for some \(1 \le p_1 < p_2 \le \infty \). Suppose that

$$\begin{aligned} \begin{aligned} \Vert Tf\Vert _{L^{p_1,\infty }_\sigma S}&\lesssim \Vert f\Vert _{L^{p_1}(\Omega ;X)}, \\ \Vert Tf\Vert _{L^{p_2,\infty }_\sigma S}&\lesssim \Vert f\Vert _{L^{p_2}(\Omega ;X)} \end{aligned} \qquad \forall f \in L^{p_1}(\Omega ;X) + L^{p_2}(\Omega ;X). \end{aligned}$$

Then for all \(p \in (p_1,p_2)\),

$$\begin{aligned} \Vert Tf\Vert _{L^p_\sigma S} \lesssim \Vert f\Vert _{L^p(\Omega ;X)} \qquad \forall f \in L^p(\Omega ;X). \end{aligned}$$

4.2 Particular Outer Structures

Now we move from general outer structures to those with relevance to Walsh time-frequency analysis. Two collections of generating sets will be used: the collection \(\mathbb {T}\) of trees, and the collection \(\mathbb {D}\) of strips. We will use the premeasures \(\mu :\mathbb {T} \rightarrow [0,\infty )\) and \(\nu :\mathbb {D} \rightarrow [0,\infty )\) defined by

$$\begin{aligned} \mu (T) := |I_T|, \qquad \nu (D) := |I_D|. \end{aligned}$$

Two families of sizes on \((3\mathbb {P},\mathbb {T})\), called ‘deterministic’ and ‘randomised’, will be needed. The deterministic sizes are \(\mathbb {C}\)-sizes, while the randomised sizes are \(X^3\)-sizes, where X is a given Banach space.

Definition 4.6

(Deterministic sizes) The \(\mathbb {C}\)-sizes \(S^1\) and \(S^\infty \) on \((3\mathbb {P},\mathbb {T})\) are given by

$$\begin{aligned} \Vert F\Vert _{S^1(T)}&:= \frac{1}{|I_{T}|} \sum _{\mathbf {P} \in T}| F(\mathbf {P}) |\, |I_{\mathbf {P}}|, \\ \Vert F\Vert _{S^\infty (T)}&:= \sup _{\mathbf {P} \in T}| F(\mathbf {P}) | \end{aligned}$$

for all \(F \in \mathscr {B}(3\mathbb {P};\mathbb {C})\).Footnote 10 We also define the mixed deterministic \(\mathbb {C}\)-size \(S^{(\infty ,1)}\) by

$$\begin{aligned} \Vert F \Vert _{S^{(\infty ,1)}(T)} :=&\Bigl \Vert \sum _{v=0}^{2}\sum _{\mathbf {P} \in T^{v}} |F(\mathbf {P})|\mathbb {1}_{I_{\mathbf {P}}}(x)\Bigr \Vert _{L^{\infty }} \\ =&\sup _{x\in I_{T}}\sum _{\begin{array}{c} \mathbf {P}\in T \\ I_{\mathbf {P}}\ni x \end{array}} | F (\mathbf {P})|. \end{aligned}$$
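The mixed size dominates both of the other deterministic sizes. The following short computation is a sketch, assuming (as in the definition of trees given earlier in the paper, which is not restated in this section) that \(I_{\mathbf {P}} \subset I_T\) for every \(\mathbf {P} \in T\):

$$\begin{aligned} \Vert F \Vert _{S^1(T)}&= \frac{1}{|I_T|} \int _{I_T} \sum _{\mathbf {P} \in T} |F(\mathbf {P})| \mathbb {1}_{I_{\mathbf {P}}}(x) \, \mathrm {d}x \le \sup _{x \in I_T} \sum _{\begin{array}{c} \mathbf {P}\in T \\ I_{\mathbf {P}}\ni x \end{array}} |F(\mathbf {P})| = \Vert F \Vert _{S^{(\infty ,1)}(T)}, \\ \Vert F \Vert _{S^\infty (T)}&\le \Vert F \Vert _{S^{(\infty ,1)}(T)}, \end{aligned}$$

where the second line holds because each \(|F(\mathbf {P})|\) with \(\mathbf {P} \in T\) appears as a summand in the supremum for any \(x \in I_{\mathbf {P}}\).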

Definition 4.7

(Randomised sizes) Let X be a Banach space. The \(X^{3}\)-size \(\mathbb {S}\) is given for all \(F \in \mathscr {B}(3\mathbb {P};X^3)\) by

$$\begin{aligned} \Vert F\Vert _{\mathbb {S}(T)} := \big \Vert \Vert F \Vert _{X^{3}} \big \Vert _{S^{\infty }(T)}+ \big \Vert \Vert \mathfrak {d}F\Vert _{X^3} \big \Vert _{S^{(\infty ,1)}(T)} + \sum _{u\in \{0,1,2\}} \Vert F\Vert _{\mathbb {S}_{u}(T)}, \end{aligned}$$

where

$$\begin{aligned} \Vert F\Vert _{\mathbb {S}_{u}(T)} := \sum \limits _{v\ne u}\bigg (\frac{1}{|I_{T}|} \int _{I_T} \mathbb {E}\bigg \Vert \sum _{\mathbf {P} \in T^{u}} \varepsilon _{\mathbf {P}} F(\mathbf {P}_{v}) \mathbb {1}_{I_{\mathbf {P}}}(x) \bigg \Vert _{X}^2 \, \mathrm {d}x \bigg )^{1/2}. \end{aligned}$$
(4.3)

Remark 4.8

We do not mention the Banach space X in the notation for the randomised size \(\mathbb {S}\); this should always be clear from context. Often we will refer to three functions \(F_u \in \mathscr {B}(3\mathbb {P};X_u)\) (\(u \in \{0,1,2\}\)) valued in different Banach spaces, and discuss the three sizes \(\Vert F_u\Vert _{\mathbb {S}}\); here we have three different \(X_u^3\)-sizes \(\mathbb {S}\), but we gain no clarity from denoting these sizes differently.

It is almost clear that \(\mathbb {S}\) satisfies all the conditions of a size; the only subtlety is in showing that the component measuring \(\mathfrak {d}F\) satisfies the unconditionality property.

Proposition 4.9

Let X be a Banach space, \(F \in \mathscr {B}(3\mathbb {P};X^3)\), and suppose that \(A \in \mathbb {T}^{\cup }\) is a countable union of trees. Then

$$\begin{aligned} \big \Vert \Vert \mathfrak {d}(\mathbb {1}_{A} F) \Vert _{X^{3}} \big \Vert _{S^{(\infty ,1)}(T)}\lesssim \big \Vert \Vert F \Vert _{X^{3}} \big \Vert _{S^{\infty }(T)}+ \big \Vert \Vert \mathfrak {d}F \Vert _{X^{3}} \big \Vert _{S^{(\infty ,1)}(T)}. \end{aligned}$$
(4.4)

It follows that \(\mathbb {S}\) satisfies the unconditionality property.

Proof

Notice that for all tritiles \(\mathbf {P}\),

$$\begin{aligned} \Vert \mathfrak {d}(\mathbb {1}_{A} F)(\mathbf {P}) \Vert _{X^{3}} \le \Vert \mathfrak {d}(F)(\mathbf {P}) \Vert _{X^{3}} + \sum _{\begin{array}{c} \mathbf {Q}\le \mathbf {P}\\ |I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3 \end{array}}\,\Bigl ( \Vert F(\mathbf {P}) \Vert _{X^{3}}+ \Vert F(\mathbf {Q})\Vert _{X^{3}} \Bigr )\Bigl | \mathbb {1}_{A}(\mathbf {P}) - \mathbb {1}_{A}(\mathbf {Q})\Bigr |\nonumber \\ \end{aligned}$$
(4.5)

by (2.18) and since \(\mathfrak {d}(\mathbb {1}_{A} F)(\mathbf {P})=\mathfrak {d}F(\mathbf {P})\) unless \(\mathbf {P}\notin A\) and \(\mathbf {Q}\in A\), where \(\mathbf {Q}\) is a tritile with \(\mathbf {Q}\le \mathbf {P}\) and \(|I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3\). Since \(A\in \mathbb {T}^{\cup }\), for any \(x\in I_{T}\) there is at most one \(\mathbf {P}\in 3\mathbb {P}\) such that

$$\begin{aligned} \sum _{\begin{array}{c} \mathbf {Q}\le \mathbf {P}\\ |I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3 \end{array}} \Bigl | \mathbb {1}_{A}(\mathbf {P}) - \mathbb {1}_{A}(\mathbf {Q})\Bigr | \ne 0 \text { and } x\in I_{\mathbf {P}}; \end{aligned}$$

writing out the definition of \(S^{(\infty ,1)}\), one sees that this gives the required estimate. \(\square \)

We make use of the two premeasures \(\mu \) and \(\nu \) on \(3\mathbb {P}\) by iterating this construction to obtain ‘iterated’ outer structures.

Definition 4.10

(Iterated outer structures) Let X be a Banach space. Given an X-size S on \((3\mathbb {P},\mathbb {T})\), for all \(q \in (0,\infty )\) we define an X-size on \((3\mathbb {P},\mathbb {D})\) by

(4.6)

It is straightforward to verify that this is indeed an X-size on \((3\mathbb {P},\mathbb {D})\), and thus is an iterated outer structure on \(3\mathbb {P}\), inducing iterated outer-\(L^p\) quasinorms for all \(p \in (0,\infty ]\).

The following iterated outer Hölder inequality is a straightforward consequence of the ‘non-iterated’ outer Hölder inequality of Proposition 4.3.

Corollary 4.11

(Hölder inequality for iterated outer-\(L^{p}\) spaces) Let \(X_0, X_1, X_2,X\) be Banach spaces, and let \(\Pi ' :X_0\times X_1 \times X_2 \rightarrow X\) be a bounded trilinear form.Footnote 11 Let S be an X-size on \((3\mathbb {P},\mathbb {T})\), and for each \(u \in \{0,1,2\}\), let \(S_u\) be an \(X_u\)-size on \((3\mathbb {P},\mathbb {T})\) such that the size-Hölder inequality

$$\begin{aligned} \Vert \Pi '(F_0,F_1,F_2) \Vert _{S} \lesssim \prod _{u=0}^2 \Vert F_u \Vert _{S_{u}} \qquad \forall F_u \in \mathscr {B}(3\mathbb {P};X_u^{3}) \end{aligned}$$

holds. Then for all \(p_u, q_u \in [1,\infty ]\),

where \(p^{-1} = \sum _{u=0}^2 p_u^{-1}\) and \(q^{-1} = \sum _{u=0}^2 q_u^{-1}\).

We return to consideration of our trilinear form \(\Pi :X_0\times X_1 \times X_{2} \rightarrow \mathbb {C}\). Define an ‘extended’ trilinear form \(\Pi ^{*}:X_{0}^{3}\times X_{1}^{3}\times X_{2}^{3}\rightarrow \mathbb {C}\) by

$$\begin{aligned} \Pi ^{*}\Bigl ( (x_{0,0},x_{0,1},x_{0,2}),(x_{1,0},x_{1,1},x_{1,2}),(x_{2,0},x_{2,1},x_{2,2}) \Bigr ):= \Pi (x_{0,0},x_{1,1},x_{2,2}). \end{aligned}$$
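To illustrate the definition, consider the scalar case \(X_0=X_1=X_2=\mathbb {C}\) with \(\Pi (x_0,x_1,x_2) = x_0x_1x_2\); this choice is ours and serves only as an example. Writing \(F_u(\mathbf {P}) = (F_u(\mathbf {P}_0),F_u(\mathbf {P}_1),F_u(\mathbf {P}_2))\) for the three components of a function \(F_u \in \mathscr {B}(3\mathbb {P};X_u^3)\), the extended form selects the 'diagonal' components:

$$\begin{aligned} \Pi ^{*}\bigl (F_0(\mathbf {P}),F_1(\mathbf {P}),F_2(\mathbf {P})\bigr ) = \Pi \bigl (F_0(\mathbf {P}_0),F_1(\mathbf {P}_1),F_2(\mathbf {P}_2)\bigr ) = F_0(\mathbf {P}_0)\,F_1(\mathbf {P}_1)\,F_2(\mathbf {P}_2). \end{aligned}$$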

The most important result of this section is the following size-Hölder inequality for the randomised sizes and the deterministic size \(S^1\).

Proposition 4.12

(Size-Hölder) Let \(X_0,X_1,X_2\), \(\Pi \), and \(\Pi ^*\) be as above. Then

$$\begin{aligned} \Vert \Pi ^{*}(F_0,F_1, F_2) \Vert _{S^1(T)} \lesssim \prod _{u=0}^{2} \Vert F_u\Vert _{\mathbb {S}(T)} \qquad \forall T \in \mathbb {T}, \, \forall F_u \in \mathscr {B}(3\mathbb {P};X_u^{3}). \end{aligned}$$

Proof

First note that

$$\begin{aligned} \begin{aligned} \sum _{\mathbf {P} \in T} \Bigl |\Pi ^{*}\bigl (F_0(\mathbf {P}), F_1(\mathbf {P}), F_2(\mathbf {P})\bigr )\Bigr |\,|I_{\mathbf {P}}|&= \sum _{\mathbf {P} \in T} \Bigl |\Pi \bigl (F_0(\mathbf {P}_{0}), F_1(\mathbf {P}_{1}), F_2(\mathbf {P}_{2})\bigr )\Bigr |\,|I_{\mathbf {P}}| \\&\le \sum _{u=0}^2 \sum _{\mathbf {P} \in T^u} \Bigl |\Pi \bigl (F_0(\mathbf {P}_{0}), F_1(\mathbf {P}_{1}), F_2(\mathbf {P}_{2})\bigr )\Bigr |\, |I_{\mathbf {P}}|, \end{aligned} \end{aligned}$$

so it suffices to fix \(u \in \{0,1,2\}\) and deal with the summands in the last entry individually. We concentrate on the case \(u=0\); the other cases are analogous. We restrict the sum over \(\mathbf {P}\) to

$$\begin{aligned} 3\mathbb {P}_{N}=\Bigl \{\mathbf {P}\in 3\mathbb {P}:|I_{\mathbf {P}}|>2^{-N} \Bigr \} \end{aligned}$$

and we look for a bound independent of N, allowing us to conclude by standard limiting arguments. For ease of notation we set \(F_{0}^{N}(\mathbf {P}):=\mathbb {1}_{3\mathbb {P}_{N}}(\mathbf {P})F_{0}(\mathbf {P})\).

Fix a normalised sequence \(a \in \ell ^\infty (T^u;\mathbb {C})\) and estimate by duality

$$\begin{aligned} \begin{aligned}&\sum _{\mathbf {P} \in T^0} \Bigl |\Pi \bigl ( F_0^{N}(\mathbf {P}_{0}), F_1(\mathbf {P}_{1}), F_2(\mathbf {P}_{2})\bigr )\Bigr |\, |I_{\mathbf {P}}| \\&\quad = \sum _{\mathbf {P} \in T^0} a_{\mathbf {P}} \Pi \bigl (F_0^{N}(\mathbf {P}_{0}), F_1(\mathbf {P}_{1}), F_2(\mathbf {P}_{2})\bigr )\, |I_{\mathbf {P}}| \\&\quad \le \int _{I_T} \sum _{\mathbf {P} \in T^0} a_{\mathbf {P}} \Pi \bigl ( \mathfrak {d}F_0^{N}(\mathbf {P}_{0}) , F_1(\mathbf {P}_{1}), F_2(\mathbf {P}_{2})\bigr ) \mathbb {1}_{I_\mathbf {P}}(x) \, \mathrm {d}x \\&\qquad + \sum _{\mathbf {P} \in T^0} a_{\mathbf {P}} \Pi \Bigl ( \Bigl \langle \sum _{v=0}^{2} \sum _{\begin{array}{c} \mathbf {Q}\in T^{v} \\ |I_{\mathbf {P}}|>|I_{\mathbf {Q}}| \end{array}} \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) w_{\mathbf {Q}_{v}} |I_{\mathbf {Q}}| ;w_{\mathbf {P}_{0}}\Bigr \rangle , F_1(\mathbf {P}_{1}) , F_2(\mathbf {P}_{2})\Bigr ) \, |I_{\mathbf {P}}|. \end{aligned} \end{aligned}$$

We bound the first summand as follows:

$$\begin{aligned} \begin{aligned}&\int _{I_T} \sum _{\mathbf {P} \in T^0} a_{\mathbf {P}} \Pi \bigl ( \mathfrak {d}F_0(\mathbf {P}_{0}) , F_1(\mathbf {P}_{1}), F_2(\mathbf {P}_{2})\bigr ) \mathbb {1}_{I_\mathbf {P}}(x) \, \mathrm {d}x \\&\quad \lesssim |I_{T}|\; \bigl \Vert \Vert \mathfrak {d}F_{0} \Vert _{X_{0}^{3}}\bigr \Vert _{S^{(\infty ,1)}}\;\bigl \Vert \Vert F_{1} \Vert _{X_{1}^{3}}\bigr \Vert _{S^{\infty } }\; \bigl \Vert \Vert F_{2} \Vert _{X_{2}^{3}}\bigr \Vert _{S^{\infty }}. \end{aligned} \end{aligned}$$

As for the second summand,

$$\begin{aligned} \begin{aligned}&\sum _{\mathbf {P} \in T^0} a_{\mathbf {P}}\Pi \Bigl ( \Bigl \langle \sum _{v=0}^{2} \sum _{\begin{array}{c} \mathbf {Q}\in T^{v} \\ |I_{\mathbf {P}}|>|I_{\mathbf {Q}}| \end{array}} \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) w_{\mathbf {Q}_{v}} |I_{\mathbf {Q}}| ; w_{\mathbf {P}_{0}} \Bigr \rangle , F_1(\mathbf {P}_{1}) , F_2(\mathbf {P}_{2}) \Bigr )\, |I_{\mathbf {P}}| \\&\quad = \sum _{\mathbf {P} \in T^0} a_{\mathbf {P}} \Bigl \langle \sum _{v=0}^{2} \sum _{\begin{array}{c} \mathbf {Q}\in T^{v} \\ |I_{\mathbf {P}}|>|I_{\mathbf {Q}}| \end{array}} \Pi \Bigl ( \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) , F_1(\mathbf {P}_{1}) , F_2(\mathbf {P}_{2})\Bigr ) w_{\mathbf {Q}_{v}} |I_{\mathbf {Q}}| ;w_{\mathbf {P}_{0}}\Bigr \rangle \, |I_{\mathbf {P}}| \\&\quad = \sum _{v=0}^{2} \sum _{\mathbf {Q} \in T^v} \sum _{\begin{array}{c} \mathbf {P}\in T^{0} \\ I_{\mathbf {P}}\supsetneq I_{\mathbf {Q}} \end{array}} \Pi \Bigl ( \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) , F_1(\mathbf {P}_{1}) , F_2(\mathbf {P}_{2})\Bigr ) a_{\mathbf {P}}\, b_{\mathbf {P},\mathbf {Q}} \, |I_{\mathbf {Q}}| \end{aligned} \end{aligned}$$

where the coefficients \(b_{\mathbf {P},\mathbf {Q}} := \langle w_{\mathbf {Q}_{v}}; w_{\mathbf {P}_{0}}\rangle \, |I_{\mathbf {P}}|\) satisfy \(|b_{\mathbf {P},\mathbf {Q}}|<1\). Letting \(\varepsilon _{\mathbf {P}}\) be independent Rademacher variables, we have

$$\begin{aligned}&\sum _{v=0}^{2} \sum _{\mathbf {Q} \in T^v} \sum _{\begin{array}{c} \mathbf {P}\in T^{0} \\ I_{\mathbf {P}}\supsetneq I_{\mathbf {Q}} \end{array}} \Pi \Bigl ( \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) , F_1(\mathbf {P}_{1}) , F_2(\mathbf {P}_{2})\Bigr ) a_{\mathbf {P}}\, b_{\mathbf {P},\mathbf {Q}} \, |I_{\mathbf {Q}}| \\&\quad = \sum _{v=0}^{2} \sum _{\mathbf {Q} \in T^v}\int _{I_{T}} \sum _{\begin{array}{c} \mathbf {P}\in T^{0} \\ I_{\mathbf {P}}\supsetneq I_{\mathbf {Q}} \end{array}} \Pi \Bigl ( \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) , a_{\mathbf {P}}\, b_{\mathbf {P},\mathbf {Q}} F_1(\mathbf {P}_{1}) \mathbb {1}_{I_{\mathbf {P}}}(x), F_2(\mathbf {P}_{2}) \mathbb {1}_{I_{\mathbf {P}}}(x)\Bigr ) \mathbb {1}_{I_{\mathbf {Q}}}(x) \, \mathrm {d}x \\&\quad = \sum _{v=0}^{2} \sum _{\mathbf {Q} \in T^v}\int _{I_{T}} \mathbb {E}\; \Pi \biggl ( \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) , \sum _{\begin{array}{c} \mathbf {P}\in T^{0} \\ I_{\mathbf {P}}\supsetneq I_{\mathbf {Q}} \end{array}} \varepsilon _{\mathbf {P}} a_{\mathbf {P}}\, b_{\mathbf {P},\mathbf {Q}} F_1(\mathbf {P}_{1}) \mathbb {1}_{I_{\mathbf {P}}}(x), \\&\qquad \qquad \qquad \qquad \qquad \qquad \quad \sum _{\begin{array}{c} \mathbf {P}'\in T^{0} \\ I_{\mathbf {P}'}\supsetneq I_{\mathbf {Q}} \end{array}} \varepsilon _{\mathbf {P}'} F_2(\mathbf {P}'_{2}) \mathbb {1}_{I_{\mathbf {P}'}}(x)\biggr ) \,\mathbb {1}_{I_{\mathbf {Q}}}(x) \, \mathrm {d}x\\&\quad \lesssim \sum _{v=0}^{2} \sum _{\mathbf {Q} \in T^v} \int _{I_{T}}\Vert \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) \Vert _{X_{0}} \prod _{u=1,2} \Bigl ( \mathbb {E}\Vert \sum _{\begin{array}{c} \mathbf {P}\in T^{0} \\ I_{\mathbf {P}}\supsetneq I_{\mathbf {Q}} \end{array}} \varepsilon _{\mathbf {P}} F_u(\mathbf {P}_{u}) \mathbb {1}_{I_{\mathbf {P}}}(x) \Vert _{X_{u}}^{2} \Bigr )^{1/2} \mathbb {1}_{I_{\mathbf {Q}}}(x) \, \mathrm {d}x \\&\quad \lesssim \sum _{v=0}^{2} \int _{I_{T}}\Bigl ( \sum _{\mathbf {Q} \in T^v} \Vert \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) \Vert _{X_{0}}\mathbb {1}_{I_{\mathbf {Q}}}(x)\Bigr ) \prod _{u=1}^2 \sup _{\mathbf {Q}\in T^{v}}\Bigl ( \mathbb {E}\Bigl \Vert \sum _{\begin{array}{c} \mathbf {P}\in T^{0} \\ I_{\mathbf {P}}\supsetneq I_{\mathbf {Q}} \end{array}} \varepsilon _{\mathbf {P}} F_u(\mathbf {P}_{u}) \mathbb {1}_{I_{\mathbf {P}}}(x) \Bigr \Vert _{X_{u}}^{2} \Bigr )^{1/2} \, \mathrm {d}x. \end{aligned}$$

Finally, applying Cauchy–Schwarz to the last entry we obtain

$$\begin{aligned} \begin{aligned}&\sum _{\mathbf {P} \in T^0} a_{\mathbf {P}}\Pi \Bigl ( \Bigl \langle \sum _{v=0}^{2} \sum _{\begin{array}{c} \mathbf {Q}\in T^{v} \\ |I_{\mathbf {P}}|>|I_{\mathbf {Q}}| \end{array}} \mathfrak {d}F_{0}^{N}(\mathbf {Q}_{v}) w_{\mathbf {Q}_{v}} |I_{\mathbf {Q}}| ; w_{\mathbf {P}_{0}} \Bigr \rangle , F_1(\mathbf {P}_{1}) , F_2(\mathbf {P}_{2}) \Bigr )\, |I_{\mathbf {P}}| \\&\quad \lesssim |I_{T}|\; \big \Vert \Vert \mathfrak {d}F_{0} \Vert _{X_{0}^{3}} \big \Vert _{S^{(\infty ,1)}(T)} \Vert F_{1} \Vert _{\mathbb {S}(T)} \Vert F_{2} \Vert _{\mathbb {S}(T)} \end{aligned} \end{aligned}$$

concluding the proof. \(\square \)
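The Cauchy–Schwarz step can be sketched as follows. The abbreviations G and \(H_u\) are introduced here for illustration only: for fixed v, let G(x) denote the sum over \(\mathbf {Q} \in T^v\) in the first bracket of the last entry above, and for \(u \in \{1,2\}\) let \(H_u(x)\) denote the u-th factor of the product. Both are non-negative, so after normalising by \(|I_T|\) as in the definition of \(S^1(T)\) we have

$$\begin{aligned} \frac{1}{|I_T|} \int _{I_T} G(x) H_1(x) H_2(x) \, \mathrm {d}x \le \Vert G \Vert _{L^\infty (I_T)} \prod _{u=1,2} \biggl ( \frac{1}{|I_T|} \int _{I_T} H_u(x)^2 \, \mathrm {d}x \biggr )^{1/2}. \end{aligned}$$

The first factor is controlled by the \(S^{(\infty ,1)}(T)\) size of \(\Vert \mathfrak {d}F_0 \Vert _{X_0^3}\). Provided the truncated Rademacher sums in \(H_u\) are dominated by the full sums over \(T^0\) (for instance by the contraction principle; this justification is our reading and is not spelled out in the text), each remaining factor is controlled by one summand of \(\Vert F_u \Vert _{\mathbb {S}_0(T)} \le \Vert F_u \Vert _{\mathbb {S}(T)}\) as defined in (4.3).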

This Hölder inequality, combined with Radon–Nikodym domination, leads to the following result.

Corollary 4.13

Let \(X_0,X_1,X_2\), \(\Pi \), and \(\Pi ^*\) be as above. Let \((p_0,p_1,p_2)\) and \((q_0,q_1,q_2)\) be Hölder triples of exponents. Then

$$\begin{aligned} \sum _{\mathbf {P} \in 3\mathbb {P}} |\Pi ^*(F_0(\mathbf {P}),F_1(\mathbf {P}),F_2(\mathbf {P}))||I_{\mathbf {P}}| \lesssim \prod _{v=0}^2 \Vert F_v \Vert _{L^{p_v}_\mu \mathbb {S}} \end{aligned}$$

and

(4.7)

for all \(F_v \in \mathscr {B}(3\mathbb {P};X_v^3)\).

Proof

The first estimate follows from combining the outer Hölder inequality (Proposition 4.3) with Radon–Nikodym domination (Proposition 4.4), using Proposition 4.12. For the second, we have

$$\begin{aligned} \Vert \Pi ^*(F_0,F_1,F_2) \Vert _{L^1_\mu S^1} \lesssim \prod _{v=0}^2 \Vert F_v \Vert _{L^{q_v}_\mu \mathbb {S}_{v}} \end{aligned}$$

as a consequence of the outer Hölder inequality. By multiplying by characteristic functions of strips this implies

(4.8)

For each \(F :3\mathbb {P} \rightarrow \mathbb {C}\) and each strip \(D \in \mathbb {D}\), Radon–Nikodym domination yields

so that

Applying Radon–Nikodym domination and the iterated outer Hölder inequality (Corollary 4.11) completes the proof. \(\square \)
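For the first estimate, the main hypothesis of Proposition 4.4 can be checked by hand. The following is a sketch in which \(\mathfrak {m}\) is taken to be the measure on \(3\mathbb {P}\) with \(\mathfrak {m}(\{\mathbf {P}\}) := |I_{\mathbf {P}}|\); the covering and null-set hypotheses of Proposition 4.4 are taken for granted here. For every \(F \in \mathscr {B}(3\mathbb {P};\mathbb {C})\) and every tree T,

$$\begin{aligned} \int _T |F| \, \mathrm {d}\mathfrak {m} = \sum _{\mathbf {P} \in T} |F(\mathbf {P})| \, |I_{\mathbf {P}}| = |I_T| \, \Vert F \Vert _{S^1(T)} = \mu (T) \Vert F \Vert _{S^1(T)}, \end{aligned}$$

so that, combining Proposition 4.4 with Propositions 4.3 and 4.12,

$$\begin{aligned} \sum _{\mathbf {P} \in 3\mathbb {P}} |\Pi ^*(F_0(\mathbf {P}),F_1(\mathbf {P}),F_2(\mathbf {P}))| \, |I_{\mathbf {P}}| = \int _{3\mathbb {P}} |\Pi ^*(F_0,F_1,F_2)| \, \mathrm {d}\mathfrak {m} \lesssim \Vert \Pi ^*(F_0,F_1,F_2) \Vert _{L^1_\mu S^1} \lesssim \prod _{v=0}^2 \Vert F_v \Vert _{L^{p_v}_\mu \mathbb {S}}. \end{aligned}$$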

Remark 4.14

The tritile form associated to \(\Pi :X_0 \times X_1 \times X_2 \rightarrow \mathbb {C}\) can be written as

$$\begin{aligned} \Lambda _\Pi (f_0,f_1,f_2) = \sum _{\mathbf {P} \in 3\mathbb {P}} \Pi ^*\Big (\mathcal {E}[f_0](\mathbf {P}), \mathcal {E}[f_1](\mathbf {P}), \mathcal {E}[f_2](\mathbf {P})\Big ), \end{aligned}$$

so by Corollary 4.13 we have

Thus given a Hölder triple \((p_u)_{u=0}^2\), in order to prove the \(L^p\)-bounds

$$\begin{aligned} |\Lambda _\Pi (f_0,f_1,f_2)| \lesssim \prod _{u=0}^2 \Vert f_u\Vert _{L^{p_u}(\mathbb {W};X_u)} \qquad \forall f_u \in \mathscr {S}(\mathbb {W};X_u), \end{aligned}$$

it suffices to find a Hölder triple \((q_0,q_1,q_2)\) such that

for all \(u \in \{0,1,2\}\).

4.3 Size Domination

In this section we prove a ‘size domination’ theorem, which allows us to control the randomised size \(\mathbb {S}\) of an embedded function \(\mathcal {E}[f]\) by the deterministic size \(S^\infty \). This uses the UMD property of the Banach space under consideration; this property is not used anywhere else. Thus in proving Theorem 1.1 we may replace \(\mathbb {S}\) with \(S^\infty \), which makes life easier.

Theorem 4.15

Let X be a UMD Banach space. Then for all convex \(\mathbb {A}\subset 3\mathbb {P}\),

$$\begin{aligned} \bigl \Vert \mathbb {1}_{\mathbb {A}} \mathcal {E}[f]\bigr \Vert _{\mathbb {S}} \lesssim \bigl \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}}\bigr \Vert _{S^\infty }, \qquad \forall f \in \mathscr {S}(\mathbb {W};X) \end{aligned}$$

with implicit constant independent of \(\mathbb {A}\).

Convexity of a set of tritiles is defined in Definition 2.10. Any of the standard norms on \(X^3\) will do the job here, but we use the \(\ell ^\infty \)-norm

$$\begin{aligned} \Vert (x_0,x_1,x_2) \Vert _{X^{3}} = \sup _{u \in \{0,1,2\}} \Vert x_u \Vert _{X}. \end{aligned}$$
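Indeed, for any \(s \in [1,\infty )\) the \(\ell ^s\)-norm on \(X^3\) is comparable with the \(\ell ^\infty \)-norm with constants depending only on s, so the choice only affects implicit constants:

$$\begin{aligned} \sup _{u \in \{0,1,2\}} \Vert x_u \Vert _{X} \le \Bigl ( \sum _{u=0}^{2} \Vert x_u \Vert _{X}^{s} \Bigr )^{1/s} \le 3^{1/s} \sup _{u \in \{0,1,2\}} \Vert x_u \Vert _{X}. \end{aligned}$$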

We will prove Theorem 4.15 later in the section. First we show how it implies outer-\(L^{p}\) quasinorm bounds. The argument is standard, but we include it to show the role played by convexity.

Corollary 4.16

Let X be a UMD Banach space. Then for all convex \(\mathbb {A}\subset 3\mathbb {P}\), all \(u \in \{0,1,2\}\), and all \(p,q \in (1,\infty ]\), we have the bounds

$$\begin{aligned} \bigl \Vert \mathbb {1}_{\mathbb {A}}\,\mathcal {E}[f] \bigr \Vert _{L_\mu ^p \mathbb {S}} \lesssim \bigl \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}} \bigr \Vert _{L_\mu ^p S^{\infty }} \qquad \forall f \in \mathscr {S}(\mathbb {W};X) \end{aligned}$$
(4.9)

and

(4.10)

Proof

Let us show that (4.9) holds. Assume that the right hand side of the inequality is finite. Then for each \(n\in \mathbb {Z}\) there exists a countable union of trees \(E_{n}=\bigcup _{i\in \mathbb {N}}T_{n,i}\) such that

$$\begin{aligned} \sum _{n\in \mathbb {Z}} \mu (E_{n}) 2^{pn} \lesssim _{p} \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}}\Vert _{L_\mu ^{p} S^{\infty }}^{p} \quad \text {and} \quad \Vert \mathbb {1}_{\mathbb {A}\setminus E_{n}} \Vert \mathcal {E}[f] \Vert _{X^{3}} \Vert _{S^\infty } \le 2^{n}. \end{aligned}$$

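For each n and i the set \(3\mathbb {P}\setminus T_{n,i} \) is convex, and thus so is \(3\mathbb {P}\setminus E_{n} = \bigcap _{i\in \mathbb {N}}\bigl ( 3\mathbb {P}\setminus T_{n,i}\bigr )\). Consequently \(\mathbb {A}\setminus E_{n} = \mathbb {A}\cap (3\mathbb {P}\setminus E_{n})\) is convex, being an intersection of convex sets, and Theorem 4.15 implies that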

$$\begin{aligned} \Vert \mathbb {1}_{\mathbb {A}\setminus E_{n} } \mathcal {E}[f] \Vert _{\mathbb {S}} \lesssim 2^{n}, \end{aligned}$$

so by the definition of the outer-\(L^{p}\) quasinorms it holds that

$$\begin{aligned} \Vert \mathbb {1}_{\mathbb {A}} \mathcal {E}[f] \Vert _{L^{p}_{\mu }\mathbb {S}}^{p} \lesssim \sum _{n\in \mathbb {Z}} \mu (E_{n}) 2^{pn} \end{aligned}$$

as required. Similar reasoning yields the iterated bounds (4.10); it suffices to recall that strips and their complements are convex. \(\square \)

The proof of Theorem 4.15 relies on the following lemma.

Lemma 4.17

Let X be a Banach space and \(f \in \mathscr {S}(\mathbb {W};X)\). Let T be a tree and \(\mathbb {A}\) a finite convex set. Then there exists a function \(g \in \mathscr {S}(\mathbb {W};X)\) supported on \(I_T\) such that

$$\begin{aligned} \mathcal {E}[g](\mathbf {P})=\mathcal {E}[f](\mathbf {P}) \qquad \forall \mathbf {P} \in T\cap \mathbb {A}\end{aligned}$$
(4.11)

and

$$\begin{aligned} \Vert g\Vert _{L^{\infty }(I_{T};X)} \lesssim \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}} \Vert _{S^\infty (T)}. \end{aligned}$$
(4.12)

Proof

The set \(\mathbb {A}\cap T\) can be assumed to be non-empty, otherwise we can take \(g=0\). We first reason under the assumption that \(\mathbf {P}_{T}\in \mathbb {A}\). Let

$$\begin{aligned} \mathcal {J} := \bigcup _{\mathbf {P} \in \mathbb {A}} {{\,\mathrm{ch}\,}}(I_{\mathbf {P}}), \end{aligned}$$

where we recall that \({{\,\mathrm{ch}\,}}(J)\) denotes the set of triadic children of the interval J. By convexity of \(\mathbb {A}\), the set \(\mathcal {J}\) satisfies

$$\begin{aligned} J\in \mathcal {J},\, J\subset J'\subsetneq I_{T} \implies J'\in \mathcal {J}. \end{aligned}$$
(4.13)

Let \(\overline{\mathcal {J}}\) be the partition of \(I_{T}\) generated by \(\mathcal {J}\), i.e. the elements of \(\overline{\mathcal {J}}\) are the maximal triadic subintervals of \(I_{T}\), ordered by inclusion, that do not contain any interval of \(\mathcal {J}\) as a proper subset. The set \(\overline{\mathcal {J}}\) can also be characterised as the set of minimal elements of \(\mathcal {J}\) with respect to inclusion. It follows that for any \(J\in \overline{\mathcal {J}}\) there exists a unique \(\mathbf {P}(J)\) such that \(J \in {{\,\mathrm{ch}\,}}(I_{\mathbf {P}(J)}) \). Furthermore, for any \(\mathbf {P}\in \mathbb {A}\), the elements of \(\overline{\mathcal {J}}\) cannot contain \(I_{\mathbf {P}}\), and thus \(\{J\in \overline{\mathcal {{J}}}:J\subset I_{\mathbf {P}}\}\) partitions \(I_{\mathbf {P}}\).

For every \(J\in \overline{\mathcal {J}}\) let \(Q_{J}\) be the unique tile such that \(\xi _{T}\in \omega _{Q_{J}}\) and \(I_{Q_{J}}=J\), and set

$$\begin{aligned} g:=\sum _{J\in \overline{\mathcal {J}}} \langle f; w_{Q_{J}}\rangle w_{Q_{J}} |J|. \end{aligned}$$

Let us show that (4.11) holds. Given any \(\mathbf {P} \in \mathbb {A}\) the intervals \(\{J\in \overline{\mathcal {{J}}}:J\subset I_{\mathbf {P}}\}\) partition \(I_{\mathbf {P}}\), and since any such J does not properly contain any of the triadic children of \(I_{\mathbf {P}}\), it holds that \(|J|\le |I_{\mathbf {P}}|/3\) and thus \(|\omega _{\mathbf {P}}|=3/|I_{\mathbf {P}}|\le |\omega _{Q_{J}}|\). Since \(\xi _{T}\in \omega _{Q_{J}}\cap \omega _{\mathbf {P}}\) this implies that

$$\begin{aligned} \mathbf {P}\subset \bigcup _{J\in \overline{\mathcal {J}},\, J\subset I_{\mathbf {P}}} Q_{J}. \end{aligned}$$

By Lemma 2.4, for any \(u\in \{0,1,2\}\) it holds that

$$\begin{aligned} w_{\mathbf {P}_{u}} = \sum _{J\in \overline{\mathcal {J}},\; J\subset I_{\mathbf {P}}} \langle w_{\mathbf {P}_{u}}; w_{Q_{J}}\rangle w_{Q_{J}} |J|. \end{aligned}$$

It follows that

$$\begin{aligned} \langle f ; w_{\mathbf {P}_{u}}\rangle&= \sum _{J\in \overline{\mathcal {J}},\; J\subset I_{\mathbf {P}}} \langle f ; w_{Q_{J}}\rangle \langle w_{Q_{J}}; w_{\mathbf {P}_{u}}\rangle |J| \\&= \langle g ; w_{\mathbf {P}_{u}}\rangle - \sum _{J\in \overline{\mathcal {J}},\; J\not \subset I_{\mathbf {P}}} \langle f ; w_{Q_{J}}\rangle \langle w_{Q_{J}}; w_{\mathbf {P}_{u}}\rangle |J| \\&= \langle g ; w_{\mathbf {P}_{u}}\rangle , \end{aligned}$$

where the last equality holds by maximality of \(\overline{\mathcal {J}}\): if \(J\in \overline{\mathcal {J}}\) and \(J\not \subset I_{\mathbf {P}}\) then \(J\cap I_{\mathbf {P}}=\emptyset \).

Now we prove the bound (4.12). The wave packets \(w_{Q_{J}}\) for \(J\in \overline{\mathcal {J}}\) have disjoint time support so it suffices to show that

$$\begin{aligned} \Vert \langle f ; w_{Q_{J}}\rangle \Vert _{X} \lesssim \sup _{\begin{array}{c} \mathbf {P}\in \mathbb {A}\\ u \in \{0,1,2\} \end{array}} \bigl \Vert \mathcal {E}_{u}[f](\mathbf {P})\bigr \Vert _{X} \end{aligned}$$

for all such J. Notice that \(Q_{J}\subset \bigcup _{u=0}^{2}\mathbf {P}(J)_{u}\) with \(\mathbf {P}(J)\) as above, so using Lemma 2.4 we obtain that

$$\begin{aligned}&\Vert \langle f ; w_{Q_{J}}\rangle \Vert _{X} \le \sum _{u\in \{0,1,2\}} \Vert \langle f; w_{\mathbf {P}(J)_{u}}\rangle \Vert _{X} \, \bigl |\langle w_{Q_{J}}; w_{\mathbf {P}(J)_{u}}\rangle \bigr |\, |I_{\mathbf {P}(J)_{u}}| \\&\quad \le \sum _{u\in \{0,1,2\}} \Vert \langle f; w_{\mathbf {P}(J)_{u}}\rangle \Vert _{X} \lesssim \sup _{\begin{array}{c} \mathbf {P}\in \mathbb {A}\\ u \in \{0,1,2\} \end{array}} \bigl \Vert \mathcal {E}_{u}[f](\mathbf {P})\bigr \Vert _{X} \end{aligned}$$

as required.

Finally, suppose that \(\mathbf {P}_{T}\notin \mathbb {A}\). Let \((\mathbf {O}_i)_i\) be the maximal elements of \(T\cap \mathbb {A}\) with respect to the order \(\le \). The intervals \(I_{\mathbf {O}_{i}}\) are pairwise disjoint, and \(T\cap \mathbb {A}\) can be written as a union of disjoint sets \(\cup _{i}T(\mathbf {O}_{i})\cap \mathbb {A}\). Applying the above reasoning to each \(T(\mathbf {O}_{i})\) we obtain a set of disjointly supported functions \(g_{i}\) satisfying

$$\begin{aligned}&\mathcal {E}[g_{i}](\mathbf {P})=\mathcal {E}[f](\mathbf {P}) \qquad \forall \mathbf {P}\in T(\mathbf {O}_{i})\cap \mathbb {A}\\&\qquad \Vert g_{i}\Vert _{L^{\infty }(I_{T};X)} \lesssim \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}} \Vert _{S^\infty (T(\mathbf {O}_{i}))}. \end{aligned}$$

Setting \(g=\sum _{i}g_{i}\) completes the proof. \(\square \)

Recall that the randomised size \(\mathbb {S}\) is the sum of three types of terms,

$$\begin{aligned} \Vert \mathbb {1}_\mathbb {A}\mathcal {E}[f] \Vert _{\mathbb {S}(T)} = \big \Vert \mathbb {1}_\mathbb {A}\Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{S^\infty (T)} + \big \Vert \Vert \mathfrak {d}(\mathbb {1}_\mathbb {A}\mathcal {E}[f])\Vert _{X^3} \big \Vert _{S^{(\infty ,1)}(T)} + \sum _{u \in \{0,1,2\}} \Vert \mathbb {1}_\mathbb {A}\mathcal {E}[f]\Vert _{\mathbb {S}_u(T)}. \end{aligned}$$

The first summand need not be estimated; we handle the remaining summands separately.

Proposition 4.18

(Defect size domination) Let X be a Banach space and \(\mathbb {A}\subset 3\mathbb {P}\) a convex set. Then for all trees T and all \(f \in \mathscr {S}(\mathbb {W};X)\),

$$\begin{aligned} \big \Vert \Vert \mathfrak {d}(\mathbb {1}_{\mathbb {A}} \mathcal {E}[f]) \Vert _{X^{3}} \big \Vert _{S^{(\infty ,1)}(T)}\lesssim \big \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}} \big \Vert _{S^{\infty }(T)}. \end{aligned}$$
(4.14)

Proof

Using the estimate (4.5) and the fact that \(\mathfrak {d}\mathcal {E}[f] = 0\), for all tritiles \(\mathbf {P}\) we have

$$\begin{aligned} \Vert \mathfrak {d}(\mathbb {1}_{\mathbb {A}}\mathcal {E}[f])(\mathbf {P})\Vert _{X^{3}}\lesssim \sum _{\begin{array}{c} \mathbf {Q}\le \mathbf {P}\\ |I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3 \end{array}}\,\Bigl ( \Vert \mathbb {1}_{\mathbb {A}}\mathcal {E}[f](\mathbf {P}) \Vert _{X^{3}}+ \Vert \mathbb {1}_{\mathbb {A}}\mathcal {E}[f](\mathbf {Q})\Vert _{X^{3}} \Bigr )\Bigl | \mathbb {1}_{\mathbb {A}}(\mathbf {P}) - \mathbb {1}_{\mathbb {A}}(\mathbf {Q})\Bigr |. \end{aligned}$$

Since \(\mathbb {A}\) is convex, for each \(x\in I_{T}\) there are at most two tritiles \(\mathbf {P}\) such that \(x\in I_{\mathbf {P}}\) and

$$\begin{aligned} \sum _{\begin{array}{c} \mathbf {Q}\le \mathbf {P}\\ |I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3 \end{array}}\,\Bigl | \mathbb {1}_{\mathbb {A}}(\mathbf {P}) - \mathbb {1}_{\mathbb {A}}(\mathbf {Q})\Bigr |\ne 0. \end{aligned}$$

It follows that

$$\begin{aligned}&\big \Vert \Vert \mathfrak {d}(\mathbb {1}_{\mathbb {A}}\mathcal {E}[f])\Vert _{X^{3}} \big \Vert _{S^{(\infty ,1)}(T)} \\&\quad \lesssim \big \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}} \big \Vert _{S^{\infty }(T)} \Bigl \Vert \mathbf {P} \mapsto \, \sum _{\begin{array}{c} \mathbf {Q}\le \mathbf {P}\\ |I_{\mathbf {Q}}|=|I_{\mathbf {P}}|/3 \end{array}}\, \Bigl | \mathbb {1}_{\mathbb {A}}(\mathbf {P}) - \mathbb {1}_{\mathbb {A}}(\mathbf {Q})\Bigr |\Bigr \Vert _{S^{(\infty ,1)}(T)} \\&\quad \lesssim \big \Vert \mathbb {1}_{\mathbb {A}} \Vert \mathcal {E}[f] \Vert _{X^{3}} \big \Vert _{S^{\infty }(T)} \end{aligned}$$

as required. \(\square \)

Proposition 4.19

(Lacunary size domination) Let X be a UMD Banach space and \(\mathbb {A}\subset 3\mathbb {P}\) a convex set. Then for all trees T, all \(f \in \mathscr {S}(\mathbb {W};X)\), and all \(u \in \{0,1,2\}\),

$$\begin{aligned} \Vert \mathbb {1}_\mathbb {A}\mathcal {E}[f]\Vert _{\mathbb {S}_u(T)} \lesssim \big \Vert \mathbb {1}_\mathbb {A}\Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{S^\infty (T)}. \end{aligned}$$
(4.15)

Proof

Let \(3\mathbb {P}_{N}=\{\mathbf {P} \in 3\mathbb {P}: 3^{-N}<|I_{\mathbf {P}}|<3^{N}\}\). We show that

$$\begin{aligned} \Vert \mathbb {1}_{\mathbb {A}\cap 3\mathbb {P}_{N}} \mathcal {E}[f] \Vert _{\mathbb {S}_u(T)} \lesssim \sup _{\begin{array}{c} \mathbf {P} \in \mathbb {A}\cap T\cap 3\mathbb {P}_{N} \\ v \in \{0,1,2\} \end{array}} \Vert \langle f; w_{\mathbf {P}_v} \rangle \Vert _{X} \end{aligned}$$

for fixed N; the proposition follows by passing to the limit \(N \rightarrow \infty \). Since \(\mathbb {A}\cap T \cap 3\mathbb {P}_{N} \) is finite and convex, by Lemma 4.17 there exists a function \(g \in \mathscr {S}(\mathbb {W};X)\) supported on \(I_T\) such that

$$\begin{aligned}&\Vert g\Vert _{L^{\infty }(I_{T};X)} \lesssim \sup _{\begin{array}{c} \mathbf {P} \in T \cap \mathbb {A}\cap 3\mathbb {P}_{N} \\ v \in \{0,1,2\} \end{array}} \Vert \langle f ; w_{\mathbf {P}_v}\rangle \Vert _{X}\\&\langle g; w_{\mathbf {P}_{v}}\rangle = \langle f; w_{\mathbf {P}_{v}}\rangle \qquad \forall \mathbf {P} \in T \cap \mathbb {A}\cap 3\mathbb {P}_{N}, \;v\in \{0,1,2\}. \end{aligned}$$

Now fix \(v \in \{0,1,2\} \setminus \{u\}\). Since X is UMD we have by Proposition 3.4

$$\begin{aligned}&|I_T|^{-1/2} \mathbb {E}\biggl \Vert \sum _{\mathbf {P} \in T^u \cap \mathbb {A}\cap 3\mathbb {P}_{N} } \varepsilon _{\mathbf {P}} \langle f; w_{\mathbf {P}_v} \rangle w_{\mathbf {P}_v}\,|I_{\mathbf {P}}| \biggr \Vert _{L^2(I_T;X)} \\&\quad \lesssim |I_T|^{-1/2} \mathbb {E}\biggl \Vert \sum _{\mathbf {P} \in 3\mathbb {P}_N \cap T^u} \varepsilon _{\mathbf {P}} \langle g; w_{\mathbf {P}_v} \rangle w_{\mathbf {P}_v}\,|I_{\mathbf {P}}| \biggr \Vert _{L^2(I_T;X)} \\&\quad \lesssim |I_T|^{-1/2} \Vert g\Vert _{L^2(I_T;X)} \le \Vert g\Vert _{L^\infty (I_T;X)}. \end{aligned}$$

Summing this over \(v \ne u\) and using the \(L^\infty \)-bound on g yields (4.15). \(\square \)

Theorem 4.15 follows immediately from Propositions 4.18 and 4.19.

5 Proofs of the Embedding Bounds

In this section we prove Theorem 1.1: modulation invariant Carleson embedding bounds into iterated and non-iterated outer-\(L^p\) spaces. Before getting to the proofs themselves, we isolate a tile selection algorithm that appears multiple times in the proofs. Thanks to the size domination theorem (Theorem 4.15), we only need this simple tile selection procedure, rather than a more complicated tree selection procedure (as used for example in [23]).

Proposition 5.1

(Tile selection) Let \(F\in \mathscr {B}(3\mathbb {P};\mathbb {C})\). For any \(\lambda >0\) there exists a (possibly empty) set \(\mathbb {B}_{\lambda }\) of pairwise disjoint tritiles such that, if we set \(E_{\lambda } := \bigcup _{\mathbf {B} \in \mathbb {B}_{\lambda }} T(\mathbf {B})\),

  • for each \(\mathbf {B} \in \mathbb {B}_{\lambda }\), \(\bigl |F(\mathbf {B})\bigr | > \lambda \),

  • for all \(\mathbf {P} \in 3\mathbb {P}\setminus E_{\lambda }\), \(\bigl |F(\mathbf {P})\bigr | \le \lambda \).

Proof

Let \(\mathbb {M}_{\lambda } := \{\mathbf {P} \in 3\mathbb {P}: \bigl |F(\mathbf {P})\bigr | > \lambda \}\). If \(\mathbb {M}_{\lambda }=\emptyset \) then just set \(\mathbb {B}_{\lambda }=\emptyset \). Otherwise let \(\mathbb {B}_{\lambda }\subset \mathbb {M}_\lambda \) be the subset of tritiles in \(\mathbb {M}_\lambda \) that are maximal with respect to \(\le \). Then \(\mathbb {B}_\lambda \) satisfies the first required condition; for the second, simply note that \(\mathbb {M}_{\lambda } \subset E_{\lambda }\). To see that \(\mathbb {B}_{\lambda }\) consists of pairwise disjoint tritiles, suppose that \(\mathbf {P},\mathbf {Q} \in \mathbb {B}_{\lambda }\) with \(\mathbf {P} \cap \mathbf {Q} \ne \emptyset \). Then either \(\mathbf {P}\le \mathbf {Q}\) or \(\mathbf {Q}\le \mathbf {P}\), and by maximality of \(\mathbf {P}\) and \(\mathbf {Q}\) in \(\mathbb {M}_\lambda \) we must have that \(\mathbf {P} = \mathbf {Q}\). \(\square \)
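The selection in this proof can be illustrated on a finite collection of tritiles. The following Python sketch rests on assumptions that are not part of the text: a tritile is encoded by a hypothetical integer triple `(k, n, l)` standing for \(I=[n3^{k},(n+1)3^{k})\) and \(\omega =[l3^{1-k},(l+1)3^{1-k})\), the order \(\mathbf {P}\le \mathbf {Q}\) is taken to mean \(I_{\mathbf {P}}\subset I_{\mathbf {Q}}\) and \(\omega _{\mathbf {Q}}\subset \omega _{\mathbf {P}}\), and F is given as a finite dictionary. The function names are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption of this sketch): (k, n, l) stands for
# I = [n*3^k, (n+1)*3^k) and omega = [l*3^(1-k), (l+1)*3^(1-k)).
Tritile = Tuple[int, int, int]


def leq(p: Tritile, q: Tritile) -> bool:
    """Assumed order: p <= q iff I_p is contained in I_q and omega_q in omega_p."""
    kp, n_p, l_p = p
    kq, n_q, l_q = q
    if kp > kq:
        return False
    s = 3 ** (kq - kp)
    # Floor division maps a triadic interval to its ancestor kq - kp levels up.
    return n_p // s == n_q and l_q // s == l_p


def tile_selection(F: Dict[Tritile, complex], lam: float) -> List[Tritile]:
    """Return the maximal elements (w.r.t. leq) of {P : |F(P)| > lam}."""
    big = [P for P, value in F.items() if abs(value) > lam]
    return [P for P in big if not any(Q != P and leq(P, Q) for Q in big)]


def in_selected_trees(P: Tritile, selected: List[Tritile]) -> bool:
    """Membership in E_lam, the union of the sets T(B) = {P : P <= B}."""
    return any(leq(P, B) for B in selected)
```

On a finite input both conclusions of the proposition can be checked directly: every selected tritile has \(|F|>\lambda \), and every tritile outside the union of the selected trees has \(|F|\le \lambda \).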

We are ready to prove our modulation invariant Carleson embedding bounds. We prove these with respect to the deterministic size \(S^\infty \), under an r-Hilbertian assumption; we will obtain Theorem 1.1 as a corollary of the size domination theorem. First we consider embeddings into non-iterated outer-\(L^p\) spaces. These are easier to prove, but they only hold for \(p > r\).

Theorem 5.2

Let X be a Banach space which is r-Hilbertian for some \(r \in [2,\infty )\) (see Footnote 12). Then the bounds

$$\begin{aligned} \big \Vert \Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{L_\mu ^p S^\infty }&\lesssim \Vert f \Vert _{L^p(\mathbb {W};X)}\qquad \forall p\in (r,\infty ], \\ \big \Vert \Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{L_\mu ^{r,\infty } S^\infty }&\lesssim \Vert f \Vert _{L^r(\mathbb {W};X)} \end{aligned}$$

hold for all \(f \in \mathscr {S}(\mathbb {W};X)\).

Proof

By interpolation (i.e. by Proposition 4.5) it suffices to establish weak endpoint bounds for \(p=\infty \) and \(p=r\). The \(p = \infty \) endpoint follows immediately from the definition of \(S^\infty \):

$$\begin{aligned} \big \Vert \Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{L_\mu ^\infty S^\infty } =\sup _{\begin{array}{c} \mathbf {P}\in 3\mathbb {P}\\ u\in \{0,1,2\} \end{array}} \Vert \langle f;w_{\mathbf {P}_{u}}\rangle \Vert _{X}\le \Vert f \Vert _{L^{\infty }(\mathbb {W};X)}. \end{aligned}$$

For the weak outer-\(L^{r}\) endpoint, we need to show that for every \(\lambda > 0\) there exists a set \(E_\lambda \subset 3\mathbb {P}\) such that

$$\begin{aligned} \mu (E_\lambda ) \lesssim \lambda ^{-r} \Vert f\Vert _{r}^{r} \quad \text {and} \quad \big \Vert \mathbb {1}_{3\mathbb {P}\setminus E_\lambda } \Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{S^\infty } \lesssim \lambda . \end{aligned}$$
(5.1)

Apply the tile selection (Proposition 5.1) at level \(\lambda \) to the function \(F(\mathbf {P})=\Vert \mathcal {E}[f](\mathbf {P}) \Vert _{X^{3}}\) to get a disjoint collection of tritiles \(\mathbb {B}_{\lambda }\) such that

$$\begin{aligned} \big \Vert \mathbb {1}_{3\mathbb {P}\setminus E_\lambda } \Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{S^\infty } = \sup _{\begin{array}{c} \mathbf {P} \in 3\mathbb {P}\setminus E_\lambda \\ u \in \{0,1,2\} \end{array}} \Vert \langle f; w_{\mathbf {P}_u}\rangle \Vert _{X} \le \lambda \end{aligned}$$

with \(E_\lambda := \bigcup _{\mathbf {B} \in \mathbb {B}_{\lambda }} T(\mathbf {B})\). It remains to show the bound on \(\mu (E_{\lambda })\).

For each \(\mathbf {B} \in \mathbb {B}_{\lambda }\) there exists a tile \(P_{\mathbf {B}}\in \mathbf {B}\) of the tritile \(\mathbf {B}\) such that \(\Vert \langle f; w_{P_{\mathbf {B}}} \rangle \Vert _{X} > \lambda \). The tritiles \(\mathbf {B}\) are pairwise disjoint and thus so are the tiles \(P_{\mathbf {B}}\); therefore we have

$$\begin{aligned} \mu (E_\lambda ) \le \sum _{\mathbf {B} \in \mathbb {B}_{\lambda }} |I_{\mathbf {B}}| \le \lambda ^{-r} \sum _{\mathbf {B} \in \mathbb {B}_{\lambda }} \Vert \langle f; w_{P_{\mathbf {B}}} \rangle \Vert _X^r |I_{P_{\mathbf {B}}}| \lesssim \lambda ^{-r} \Vert f\Vert _r^r \end{aligned}$$

where the last estimate follows from Proposition 3.8 applied to all finite subsets of \(\mathbb {B}_{\lambda }\). \(\square \)

Now we prove the embeddings into iterated outer-\(L^p\) spaces, which hold for all \(p > 1\), but which are much harder to prove.

Theorem 5.3

Let X be a Banach space which is r-Hilbertian for some \(r \in [2,\infty )\) (see Footnote 13). Then for all \(p \in (1,\infty )\) and \(q \in (\min (p,r)^\prime (r-1),\infty ]\) the bound

holds for all \(f \in \mathscr {S}(\mathbb {W};X)\).

Proof

Fix \(p \in (1,\infty )\). We will establish various endpoints depending on the position of p relative to r; interpolation will then yield the estimates that we claim. In all cases, we will first fix \(\lambda > 0\) and utilise the set \(K_\lambda \subset 3\mathbb {P}\) defined (dependent on p) as follows: write

$$\begin{aligned} \big \{x \in \mathbb {W}: M_{\min (p,r)}(\Vert f\Vert _{X})(x) > \lambda \big \} = \bigcup _{n \in \mathbb {N}} I_{n,\lambda } \end{aligned}$$

as a disjoint union of (maximal) triadic intervals, and then define

$$\begin{aligned} K_\lambda := \bigcup _{n \in \mathbb {N}} D(I_{n,\lambda }), \end{aligned}$$

where \(D(I_{n,\lambda })\) is the strip generated by \(I_{n,\lambda }\). Since the \(\min (p,r)\)-maximal function \(M_{\min (p,r)}\) is of weak type \((p,p)\), we then have

$$\begin{aligned} \nu (K_\lambda ) \le \sum _{n \in \mathbb {N}} |I_{n,\lambda }| \lesssim \lambda ^{-p} \Vert f \Vert _{L^p(\mathbb {W};X)}^{p}. \end{aligned}$$
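The stopping-time decomposition of the superlevel set into maximal triadic intervals can be sketched numerically. The Python snippet below is a simplified illustration under assumptions not made in the text: the function \(\Vert f\Vert _X\) is replaced by a finite list of nonnegative values on \(3^{N}\) cells of equal length, the maximal function is taken only over triadic intervals inside this finite window, a single exponent `p` plays the role of \(\min (p,r)\), and the name `maximal_triadic_intervals` is hypothetical.

```python
from typing import List, Tuple


def maximal_triadic_intervals(values: List[float], p: float, lam: float) -> List[Tuple[int, int]]:
    """Maximal triadic intervals (start, length) on which the p-average exceeds lam.

    Assumes len(values) is a power of 3 and all values are nonnegative.
    """
    # Prefix sums of values**p give the integral of |f|^p over any cell range.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v ** p)

    selected = []
    stack = [(0, len(values))]
    while stack:
        start, length = stack.pop()
        average = (prefix[start + length] - prefix[start]) / length
        if average > lam ** p:
            # Stop here: no ancestor was selected, so this interval is maximal.
            selected.append((start, length))
        elif length > 1:
            third = length // 3
            stack.extend((start + i * third, third) for i in range(3))
    return sorted(selected)
```

Each selected interval I satisfies \(|I| < \lambda ^{-p}\int _I \Vert f\Vert _X^p\), and the selected intervals are disjoint, so summing over them reproduces the weak-type bound in this discrete model.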

In each case it remains to show for an appropriate exponent q that

(5.2)

for all \(\lambda > 0\).

Endpoint 1: \(p < \infty \), \(q = \infty \). Here we need to show that

This follows from the definition of \(K_\lambda \):

Endpoint 2: \(p \ge r\), \(q = r\). We must show that for every strip \(D \in \mathbb {D}\) and every \(\tau > 0\) there exists \(E_\tau \subset 3\mathbb {P}\) such that

$$\begin{aligned} \mu (E_\tau ) \lesssim (\lambda /\tau )^{r} |I_D| \quad \text {and} \quad \big \Vert \mathbb {1}_{(D\setminus K_\lambda )\setminus E_\tau } \Vert \mathcal {E}[f] \Vert _{X^3}\big \Vert _{S^\infty } \lesssim \tau . \end{aligned}$$
(5.3)

It suffices to assume that \(\tau < \lambda \), for otherwise we can take \(E_\tau = \emptyset \) and the result follows from Endpoint 1.

Fix a strip D. We may assume \(I_{D}\not \subset I_{n,\lambda }\) for all \(n\in \mathbb {N}\), since otherwise \(D\setminus K_{\lambda }=\emptyset \) and there is nothing to prove. It thus holds that

(5.4)

For \(\mathbf {P} \in D\) we have that \(\langle f; w_{\mathbf {P}_v}\rangle = \langle f\mathbb {1}_{I_{D}}; w_{\mathbf {P}_v}\rangle \) for all \(v\in \{0,1,2\}\). The non-iterated version of the embedding, i.e. Theorem 5.2, then guarantees that

and we are done.

‘Endpoint’ 3: \(p < r\) and \(q> p'(r-1)\). We will show that for every strip \(D \in \mathbb {D}\) and every \(\tau > 0\) there exists \(E_\tau \subset 3\mathbb {P}\) such that

$$\begin{aligned} \mu (E_\tau ) \lesssim (\lambda /\tau )^{q} |I_D| \quad \text {and} \quad \big \Vert \mathbb {1}_{(D \setminus K_{\lambda })\setminus E_\tau } \Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{S^\infty } \lesssim \tau . \end{aligned}$$
(5.5)

for any \(q>p'(r-1)\). The result of Endpoint 1 allows us to consider only q close to \(p'(r-1)\) and extend the result to all q by interpolation. Furthermore it suffices to assume that \(\tau < \lambda \), for otherwise we can take \(E_\tau = \emptyset \) and the result follows from the \(q=\infty \) bound of Endpoint 1.

Fix a strip D. As before we may assume \(I_D \not \subset I_{n,\lambda }\) for all \(n\in \mathbb {N}\), so if \(I_{n,\lambda }\) intersects \(I_D\), we must have \(I_{n,\lambda } \subsetneq I_D\). Henceforth we consider only those indices \(n\in \mathbb {N}\) for which \(I_{n,\lambda } \subsetneq I_D\), and we drop \(\lambda \) from the notation. For each \(k \in \mathbb {N}\) let \((J_{n,k,m})_{m\in \mathbb {N}}\) denote the maximal subintervals of \(I_{n}\) on which \(M_p(\Vert f\Vert _{X}) > 2^k\lambda \).

Let us decompose f by setting

$$\begin{aligned} \begin{aligned} f\mathbb {1}_{I_{D}}&=f_{-1}+ \sum _{k=0}^{\infty }f_{k}, \\ f_{-1}&=f\mathbb {1}_{I_{D}\setminus \bigcup _{n\in \mathbb {N}}I_{n}} \qquad f_{k} := \sum _{n\in \mathbb {N}}\sum _{m\in \mathbb {N}} f \mathbb {1}_{\Delta J_{n,k,m}} \quad \forall k \in \mathbb {N}, \end{aligned} \end{aligned}$$

with

$$\begin{aligned} \Delta J_{n,k,m}=J_{n,k,m} \setminus \Bigl ( \bigcup _{m' \in \mathbb {N}} \bigcup _{k' > k} J_{n,k',m'} \Bigr ). \end{aligned}$$

We have bounds

$$\begin{aligned} \begin{aligned}&\Vert f_{k} \Vert _{L^{\infty }(\mathbb {W};X)}\le 2^{k}\lambda \qquad \forall k\in \{-1\}\cup \mathbb {N}, \\&\qquad \sum _{m\in \mathbb {N}} |\Delta J_{n,k,m}| \le \sum _{m \in \mathbb {N}} |J_{n,k,m}| \lesssim 2^{-kp} |I_n| \qquad \forall n,k\in \mathbb {N}; \end{aligned} \end{aligned}$$
(5.6)

the latter follows from the weak \(L^{p}\) boundedness of \(M_{p}\).

This decomposition induces the following decomposition of the embedded function \(\mathcal {E}[f\mathbb {1}_{I_D}]\):

$$\begin{aligned} \mathcal {E}[f\mathbb {1}_{I_D}] = \sum _{k=-1}^\infty F_k, \qquad F_{k}:= \mathcal {E}[f_{k}] \quad \forall k\in \{-1\}\cup \mathbb {N}. \end{aligned}$$
(5.7)

Now fix \(\varepsilon > 0\), and for each \(k \ge -1\) apply the tile selection of Proposition 5.1 to \(\Vert F_k\Vert _{X^3}\) at level \(2^{-\varepsilon k}\tau \), yielding sets \(\mathbb {B}_k\) and \({\widetilde{E}}_k := \bigcup _{\mathbf {B} \in \mathbb {B}_k} T(\mathbf {B})\) of tritiles such that

$$\begin{aligned} \big \Vert \mathbb {1}_{(D \setminus K_\lambda )\setminus {\widetilde{E}}_k} \Vert F_k\Vert _{X^3} \big \Vert _{S^\infty } \lesssim 2^{-\varepsilon k}\tau \end{aligned}$$
(5.8)

and

$$\begin{aligned} \begin{aligned} \mu ({\widetilde{E}}_k) \le \sum _{\mathbf {B} \in \mathbb {B}_k} |I_{\mathbf {B}}|&\le (2^{-\varepsilon k}\tau )^{-r} \Vert f_k\Vert _{L^{r}(\mathbb {W};X)}^r\\&\le \tau ^{-r} 2^{kr\varepsilon } \Vert f_{k} \Vert _{L^{\infty }(\mathbb {W};X)}^{r} \bigl |{{\,\mathrm{spt}\,}}(f_{k})\bigr |\lesssim \Bigl ( \frac{\lambda }{\tau }\Bigr )^{r} 2^{k(r(1+\varepsilon )-p)} |I_D|. \end{aligned} \end{aligned}$$
(5.9)

On the other hand for any \(\mathbf {P}\in D\setminus K_{\lambda }\) one has that

$$\begin{aligned} \Vert F_{k}(\mathbf {P})\Vert _{X^{3}} = \sup _{u\in \{0,1,2\}}\bigl |\langle f_{k};w_{\mathbf {P}_{u}} \rangle \bigr | \lesssim 2^{k(1-p)} \lambda \qquad \forall k\in \{-1\}\cup \mathbb {N}. \end{aligned}$$

For \(k=-1\) this is a trivial consequence of (5.6), while for \(k\in \mathbb {N}\) notice that \(I_{n}\cap I_{\mathbf {P}}\ne \emptyset \) only if \(I_{n}\subset I_{\mathbf {P}}\) so

$$\begin{aligned} \begin{aligned} \bigl |\langle f_{k}; w_{\mathbf {P}_{u}}\rangle \bigr |&\le \Vert f_{k} \Vert _{L^{\infty }(\mathbb {W};X)} \sum _{n:I_{n}\subset I_{\mathbf {P}}} \frac{\bigl |{{\,\mathrm{spt}\,}}{f_{k}}\cap I_{n}\bigr |}{|I_{\mathbf {P}}|}\lesssim \Vert f_{k} \Vert _{L^{\infty }(\mathbb {W};X)} \sum _{n:I_{n}\subset I_{\mathbf {P}}}\sum _{m\in \mathbb {N}} \frac{|\Delta J_{n,k,m}|}{|I_{\mathbf {P}}|} \\&\lesssim 2^{k}\lambda \sum _{n:I_{n}\subset I_{\mathbf {P}}} \frac{2^{-kp}|I_{n}|}{|I_{\mathbf {P}}|} \le 2^{k(1-p)}\lambda . \end{aligned} \end{aligned}$$

It follows that \({\widetilde{E}}_k\) is empty when \(2^{k(p-1-\varepsilon )} \gtrsim \frac{\lambda }{\tau }\), i.e. when \(k \ge {\overline{k}}_{\lambda /\tau }\) with \(2^{{\overline{k}}_{\lambda /\tau }}\simeq (\lambda /\tau )^{\frac{1}{p-1-\varepsilon }}\).

We conclude by setting \(E_\tau := \bigcup _{k=-1}^{{\overline{k}}_{\lambda /\tau }} {\widetilde{E}}_{k}\). Since \(r(1+\varepsilon )-p>0\), estimate (5.9) gives that

$$\begin{aligned} \mu (E_{\tau })\le \sum _{k=-1}^{{\overline{k}}_{\lambda /\tau }}\mu ({\widetilde{E}}_{k}) \lesssim \Bigl ( \frac{\lambda }{\tau }\Bigr )^{r} 2^{{\overline{k}}_{\lambda /\tau }(r(1+\varepsilon )-p)}|I_{D}| \lesssim \Bigl ( \frac{\lambda }{\tau }\Bigr )^{(r-1)\frac{p}{p-1-\varepsilon }} |I_{D}| \lesssim \Bigl ( \frac{\lambda }{\tau }\Bigr )^{q} |I_{D}| \end{aligned}$$

where the last inequality holds since \(\varepsilon >0\) is arbitrary and \(\tau \lesssim \lambda \). On the other hand

$$\begin{aligned} \big \Vert \mathbb {1}_{(D \setminus K_\lambda )\setminus E_{\tau }} \Vert \mathcal {E}[f]\Vert _{X^3} \big \Vert _{S^\infty } \lesssim \sum _{k=-1}^{\infty } \big \Vert \mathbb {1}_{(D \setminus K_\lambda )\setminus {\widetilde{E}}_k} \Vert F_k\Vert _{X^3} \big \Vert _{S^\infty } \lesssim \sum _{k=-1}^{\infty } 2^{-\varepsilon k}\tau \lesssim \tau \end{aligned}$$

and this concludes the proof. \(\square \)
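For convenience, the exponent arithmetic behind the bound on \(\mu (E_\tau )\) in Endpoint 3 of the preceding proof can be spelled out. This only restates the computation above, using \(2^{{\overline{k}}_{\lambda /\tau }}\simeq (\lambda /\tau )^{\frac{1}{p-1-\varepsilon }}\) and assuming \(\varepsilon < p-1\):

$$\begin{aligned} \Bigl ( \frac{\lambda }{\tau }\Bigr )^{r} 2^{{\overline{k}}_{\lambda /\tau }(r(1+\varepsilon )-p)}&\simeq \Bigl ( \frac{\lambda }{\tau }\Bigr )^{r+\frac{r(1+\varepsilon )-p}{p-1-\varepsilon }}, \\ r+\frac{r(1+\varepsilon )-p}{p-1-\varepsilon }&= \frac{r(p-1-\varepsilon )+r(1+\varepsilon )-p}{p-1-\varepsilon } = \frac{(r-1)p}{p-1-\varepsilon }. \end{aligned}$$

As \(\varepsilon \rightarrow 0\) this exponent decreases to \((r-1)\frac{p}{p-1}=p'(r-1)\), so for any fixed \(q>p'(r-1)\) one may choose \(\varepsilon \) small enough that \((r-1)\frac{p}{p-1-\varepsilon }\le q\); since \(\lambda /\tau \gtrsim 1\), the final inequality in the bound on \(\mu (E_\tau )\) follows.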

Proof of Theorem 1.1

The argument is identical for the iterated and non-iterated embeddings, so we only show the iterated case. By Corollary 4.16, using that X is UMD, for any convex \(\mathbb {A}\subset 3\mathbb {P}\) it holds that

and by the iterated embeddings for \(S^\infty \) (Theorem 5.3), using that X is r-Hilbertian,

The first inequality above follows by the unconditionality property of sizes and thus of outer-\(L^{p}\) quasi-norms. This completes the proof. \(\square \)

6 Applications to the Tritile Form

Again we consider three Banach spaces \(X_0\), \(X_1\), \(X_2\) and a bounded trilinear form \(\Pi :X_0 \times X_1 \times X_2 \rightarrow \mathbb {C}\). Each \(X_u\) is assumed to be UMD and \(r_u\)-Hilbertian for some \(r_u \in [2,\infty )\). Recall that the tritile form is the trilinear form \(\Lambda _\Pi :\prod _{u=0}^2 \mathscr {S}(\mathbb {W};X_u) \rightarrow \mathbb {C}\) defined by

$$\begin{aligned} \Lambda _\Pi (f_0,f_1,f_2) := \sum _{\mathbf {P} \in 3\mathbb {P}} \Pi \Big ( \langle f_0; w_{\mathbf {P}_{0}} \rangle , \langle f_1; w_{\mathbf {P}_{1}} \rangle , \langle f_2; w_{\mathbf {P}_{2}} \rangle \Big ) |I_{\mathbf {P}}|. \end{aligned}$$

Using the embedding theorems from the previous section, we will establish \(L^p\)-bounds and sparse domination for \(\Lambda _\Pi \).

6.1 \(L^p\) Bounds

Proof of Theorem 1.2

The condition (1.12) guarantees the existence of a Hölder triple \((q_0,q_1,q_2)\) such that

$$\begin{aligned} q_u > \min (p_u,r_u)^\prime (r_u - 1) \end{aligned}$$

for all \(u \in \{0,1,2\}\), and then by Theorem 1.1 we have

(6.1)

for all u. By Remark 4.14 this suffices to prove the theorem. \(\square \)

The set of exponents \((p_u)_{u\in \{0,1,2\}}\) to which Theorem 1.2 applies (more precisely, the set of reciprocals \((1/p_u)_{u\in \{0,1,2\}}\)) can be characterised as the interior of a polygon. Let \(\beta _u := 1/p_u\) and \(\gamma _u := 1/r_u\). Say that \((p_0,p_1,p_2)\) is admissible if

$$\begin{aligned} \sum _{u=0}^2 \frac{1}{\min (p_u,r_u)^\prime (r_u-1)} > 1. \end{aligned}$$

We rewrite the left hand side of this condition as

$$\begin{aligned} \sum _{u=0}^2 \frac{1}{\min (p_u,r_u)^\prime (r_u-1)}&= \sum _{u=0}^{2} \frac{1}{\max (p_u^\prime ,r_u^\prime ) (\frac{1}{\gamma _u} - 1)} \\&= \sum _{u=0}^{2} \min (1-\beta _u, 1-\gamma _u)\frac{\gamma _u}{1-\gamma _u}. \end{aligned}$$
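Each summand on the right is at most \(\gamma _u\), because \(\min (1-\beta _u,1-\gamma _u)\le 1-\gamma _u\). Written out (a routine check, added for convenience):

$$\begin{aligned} \sum _{u=0}^{2} \min (1-\beta _u, 1-\gamma _u)\frac{\gamma _u}{1-\gamma _u} \le \sum _{u=0}^{2} (1-\gamma _u)\frac{\gamma _u}{1-\gamma _u} = \sum _{u=0}^{2} \gamma _u. \end{aligned}$$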

It follows that an admissible exponent \((p_0,p_1,p_2)\) exists only if

$$\begin{aligned} \rho := \bigg (\sum _{u=0}^{2} \gamma _u\bigg ) - 1 > 0, \end{aligned}$$
(6.2)

and we assume this condition in what follows. Consider the set of exponents

$$\begin{aligned} S := \bigg \{\beta \in (-\infty ,1]^3 : \sum _{u=0}^{2} \beta _u = 1, \, \sum _{u=0}^{2} \min (1-\beta _u, 1-\gamma _u)\frac{\gamma _u}{1-\gamma _u} > 1\bigg \}. \end{aligned}$$

This set is the interior of a polygon; the vertices of this polygon may be found by choosing \(u \in \{0,1,2\}\) and \(w \in \{0,1,2\} \setminus \{u\}\) arbitrarily, setting \(\beta _u = \gamma _u\), and making \(\beta _w > \gamma _w\) as large as possible. Let v be the single element of \(\{0,1,2\} \setminus \{u,w\}\), so that \(1 - \beta _v = \beta _u + \beta _w = \gamma _u + \beta _w\). Then the second condition in the definition of S, for \(\beta _w > 1 - \gamma _v - \gamma _u\), becomes

$$\begin{aligned} \gamma _u + (1-\beta _w)\frac{\gamma _w}{1-\gamma _w} + \gamma _v > 1. \end{aligned}$$

Rearranging this gives

$$\begin{aligned} \beta _w < \gamma _w + \rho \bigg (\frac{1}{\gamma _w} - 1\bigg ), \end{aligned}$$

so the vertices of \(\partial S\) are given by the 6 points \(\beta \) in the Hölder triangle determined by their \((u,w)\)-components

$$\begin{aligned} (\beta _u,\beta _w) = (\gamma _u, \gamma _w + \rho (\gamma _w^{-1} - 1)) \qquad (u \ne w \in \{0,1,2\}). \end{aligned}$$
(6.3)

The region of exponents \((\beta _u) = (p_u^{-1})\) to which Theorem 1.2 applies is thus the interior of the convex hull of the 6 points in (6.3), intersected with the cube \((0,1)^3\) (noting that S generally contains some exponents with nonpositive entries).
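The six vertices in (6.3) are easy to compute for concrete values of \(\gamma _u\). The following Python sketch is an illustration only: the function names `admissibility_sum` and `polygon_vertices` are hypothetical, exact rational arithmetic is an implementation choice, and the third coordinate is recovered from the Hölder condition \(\sum _u \beta _u = 1\).

```python
from fractions import Fraction
from itertools import permutations


def admissibility_sum(beta, gamma):
    """Left-hand side of the admissibility condition in the (beta, gamma) variables."""
    return sum(min(1 - b, 1 - g) * g / (1 - g) for b, g in zip(beta, gamma))


def polygon_vertices(gamma):
    """The six points of (6.3), completed to full triples via sum(beta) = 1."""
    rho = sum(gamma) - 1
    vertices = []
    for u, w in permutations(range(3), 2):
        v = 3 - u - w  # the remaining index
        beta = [None, None, None]
        beta[u] = gamma[u]
        beta[w] = gamma[w] + rho * (1 / gamma[w] - 1)
        beta[v] = 1 - beta[u] - beta[w]
        vertices.append(tuple(beta))
    return vertices


if __name__ == "__main__":
    half = Fraction(1, 2)
    for vertex in polygon_vertices((half, half, half)):
        print(vertex, admissibility_sum(vertex, (half, half, half)))
```

For \(\gamma _0=\gamma _1=\gamma _2=1/2\) (so \(\rho = 1/2\)) the vertices are the six permutations of \((1/2, 1, -1/2)\), and at each vertex the admissibility sum equals exactly 1, consistent with these points lying on \(\partial S\) rather than in S.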

Thus, comparing our result with that of Hytönen, Lacey, and Parissis [23], we see that we obtain the same \(L^p\) bounds for the tritile operator as they do for the quartile operator when restricted to the reflexive range \(p_u \in (1,\infty )\) (see Footnote 14).

6.2 Sparse Domination

Proof of Theorem 1.3

We follow the argument in [39, §1.4.3].

We will show the following abstract sparse domination result: for any Hölder triple \((q_{u})_{u\in \{0,1,2\}}\) and any triple of exponents \(p_{u}\in [1,\infty )\), we have the bound

(6.4)

for any \(F_{u}\in \mathscr {B}(3\mathbb {P};X_{u}^3)\). This result suffices to prove the theorem; to see this, let \(F_{u}=\mathcal {E}[f_{u}]\) and notice that \(F_u = \mathcal {E}[f_u \mathbb {1}_I]\) on D(I). Thus by Theorem 1.1, choosing the Hölder triple \((q_u)_{u \in \{0,1,2\}}\) such that \(q_u > \min (p_u,r_u)'(r_u - 1)\) for each u (such a choice is possible by condition (1.12)), the bound

holds. Since \(|\Lambda _{\Pi }(f_{0},f_{1},f_{2})|\le \sum _{\mathbf {P}\in 3\mathbb {P}} \Bigl | \Pi ^*\bigl (F_{0}(\mathbf {P}),F_{1}(\mathbf {P}),F_{2}(\mathbf {P})\bigr )\Bigr |\,|I_{\mathbf {P}}| \) this implies the conclusion of the theorem.

It remains to show that (6.4) holds. The definition of the iterated outer-\(L^p\) quasinorms implies that for every strip D, there exists a subset \(K_D \subset D\) such that

(6.5)
(6.6)

Without loss of generality we may assume that \(K_{D}\) is a union of strips, i.e. \(K_{D}=\bigcup _{I\in \mathcal {J}(I_{D})}D(I)\), and that these strips are pairwise disjoint. Set \(\mathcal {G}_0 = \{I_{0}\}\) for some initial interval. Proceed iteratively: having defined a collection of intervals \(\mathcal {G}_n\), define

$$\begin{aligned} \mathcal {G}_{n+1} := \bigcup _{I\in \mathcal {G}_{n}} \mathcal {J}(I ) \end{aligned}$$

where \(\mathcal {J}(I)\) is the set of intervals defined by (6.5) and (6.6) (with \(D = D(I)\)). The bound (6.6) guarantees, by induction, that

$$\begin{aligned} \max _{I\in \mathcal {G}_{n}} |I|\le 2^{-n}|I_{0}| \end{aligned}$$

and thus \(\bigcap _{n\in \mathbb {N}}\bigcup _{I\in \mathcal {G}_{n}} I=\emptyset \). Let \((q_{u})_{u\in \{0,1,2\}}\) be any Hölder triple; using the Hölder inequality for \(L_\mu ^{q_{u}} \mathbb {S}\) gives us

(6.7)

Recall that for any \(I\in \mathcal {G}\) we have set \(K_{D(I)}=\bigcup _{I'\in \mathcal {J}(I)} D(I') \) so that (6.5) holds and guarantees the last bound.

We now show that \(\mathcal {G}=\bigcup _{n\in \mathbb {N}}\mathcal {G}_{n}\) is sparse with \(\Vert \mathcal {G} \Vert _{sp}\le 1\). The intervals of \(\mathcal {G}\) are nested in the sense that if \(J\in \mathcal {G}_{n+1}\) then there exists \(J'\supset J\) with \(J'\in \mathcal {G}_{n}\). First suppose \(I\in \mathcal {G}_{n_{0}}\) for some \(n_{0}\in \mathbb {N}\); it follows by induction from (6.6) that

$$\begin{aligned} \sum _{ \begin{array}{c} J\subset I\\ J\in \mathcal {G}_{n_{0}+k} \end{array}} |J| \le |I|/2^{k} \end{aligned}$$

and thus

$$\begin{aligned} \sum _{ \begin{array}{c} J\subsetneq I\\ J\in \mathcal {G} \end{array}} |J|=\sum _{k=1}^{\infty } \sum _{ \begin{array}{c} J\subset I\\ J\in \mathcal {G}_{n_{0}+k} \end{array}} |J| \le \sum _{k=1}^{\infty }|I|/2^{k}= |I|. \end{aligned}$$

If \(I\notin \mathcal {G}\), then there exist \(n_{0}\in \mathbb {N}\) and disjoint intervals \(I_{m}\subset I\), \(I_{m}\in \mathcal {G}_{n_{0}}\), such that

$$\begin{aligned} \{J\subset I :J\in \mathcal {G} \}=\bigcup _{m\in \mathbb {N}} \{J\subset I_{m}:J\in \mathcal {G}\}. \end{aligned}$$

Thus \(\Vert \mathcal {G} \Vert _{sp}\le 1\).
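The induction behind the generation-wise packing estimate can be sketched as follows. The sketch assumes that (6.6) provides the halving bound \(\sum _{J\in \mathcal {J}(I')}|J|\le |I'|/2\) for every interval \(I'\); this specific form is an assumption made here for illustration. For \(I\in \mathcal {G}_{n_{0}}\) and \(k\ge 0\),

$$\begin{aligned} \sum _{ \begin{array}{c} J\subset I\\ J\in \mathcal {G}_{n_{0}+k+1} \end{array}} |J| = \sum _{ \begin{array}{c} I'\subset I\\ I'\in \mathcal {G}_{n_{0}+k} \end{array}} \; \sum _{J\in \mathcal {J}(I')} |J| \le \frac{1}{2} \sum _{ \begin{array}{c} I'\subset I\\ I'\in \mathcal {G}_{n_{0}+k} \end{array}} |I'| \le \frac{|I|}{2^{k+1}}, \end{aligned}$$

where the base case \(k=0\) is trivial, since I is the only element of \(\mathcal {G}_{n_{0}}\) contained in I. Summing the geometric series over \(k\ge 1\) then gives the packing bound by \(|I|\).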

For any \(F_{u}\in \mathscr {B}(3\mathbb {P};X_{u}^3)\) with \(u\in \{0,1,2\}\) we can write

$$\begin{aligned} \sum _{\mathbf {P} \in 3\mathbb {P}} \Bigl | \Pi ^*\bigl (F_{0}(\mathbf {P}),F_{1}(\mathbf {P}),F_{2}(\mathbf {P})\bigr )\Bigr ||I_{\mathbf {P}}| \le \sup _{D_{0}\in \mathbb {D}}\sum _{\mathbf {P}\in D_{0}} \Bigl | \Pi ^*\bigl (F_{0}(\mathbf {P}),F_{1}(\mathbf {P}),F_{2}(\mathbf {P})\bigr )\Bigr ||I_{\mathbf {P}}| . \end{aligned}$$

Estimating the sum over \(D_0\) via (6.7) and using that \(\Vert \mathcal {G}\Vert _{sp} \le 1\) for the collections \(\mathcal {G}\) that we constructed shows that (6.4) holds, and completes the proof. \(\square \)