1 Introduction

In this paper we study the tail of the number of times a critical branching random walk on \(\mathbb Z^d\) returns to the origin. The result is most interesting in the upper-critical dimension \(d=4\), where we find that the local time has a stretched-exponential tail.

Theorem 1.1

Let \(d \ge 1\), let \(( B_n )_{n\ge 0}\) be a branching random walk on \(\mathbb Z^d\) whose offspring distribution \(\mu \) is critical, non-trivial, and sub-exponential, started with a single particle at the origin, and let L(0) be the total number of particles that visit the origin. Then

$$\begin{aligned} \mathbb P_{\mu ,0}\left( L(0) \ge n\right) = {\left\{ \begin{array}{ll} \Theta \bigl (n^{-2/(4-d)}\bigr ) &{} \qquad d<4\\ \exp \left[ -\Theta (\sqrt{n})\right] &{} \qquad d=4\\ \exp \left[ -\Theta (n)\right] &{} \qquad d > 4 \end{array}\right. } \end{aligned}$$

for every \(n\ge 1\).

Here, we say that the offspring distribution \(\mu \) is critical if it has mean 1, non-trivial if \(\mu (1)<1\), and sub-exponential if there exist positive constants C and c such that \(\mu (n) \le C e^{-cn}\) for every \(n\ge 1\). We use both “\(f(n)=\Theta (g(n))\) for every \(n\ge 1\)” and “\(f(n) \asymp g(n)\) for every \(n\ge 1\)” to mean that there exist positive constants c and C depending only on the offspring distribution \(\mu \) and the dimension d such that \(cg(n) \le f(n) \le C g(n)\) for every \(n\ge 1\). Similar meaning applies to the symbols \(\preceq \) and \(\succeq \), so that, for example, “\(f_n(x) \preceq g_n(x)\) for every \(n\ge 1\) and \(x\in \mathbb Z^d\)” means that there exists a positive constant C depending only on the offspring distribution \(\mu \) and the dimension d such that \(f_n(x) \le C g_n(x)\) for every \(n \ge 1\) and \(x \in \mathbb Z^d\).

Our work is motivated in part by our hope to understand the analogous questions for the Abelian sandpile model [4, 18]. In this model, surveyed in [6], the total number of times the origin topples in an avalanche at equilibrium (equivalently, the total number of waves in an avalanche) is expected to behave in a roughly analogous way to the branching random walk local time at the origin, with a closer analogy expected to hold in dimensions \(d>4\). Currently, the distribution of the total number of waves in an avalanche remains poorly understood even in the high-dimensional case, where other aspects of the model are now fairly well-understood [3, 5, 7].

There is an extensive literature on critical branching random walk on \(\mathbb Z^d\), with works particularly relevant to the present paper including [2, 11,12,13,14, 21,22,23,24]. In light of this extensive literature, we were surprised to find that the tail of the local time had not previously been studied. The basic methods that we use (inductive analysis of moments via diagrammatic sums) are well-known to experts, but we have included a detailed exposition so that this paper could be used as an introduction to these techniques.

We also prove the following off-diagonal version of Theorem 1.1. We use the notation \(\langle x \rangle = 2 \vee d(0,x)\), where d(0, x) denotes the graph distance between 0 and x, to avoid dividing by zero.

Theorem 1.2

Let \(d \ge 1\), let \((B_n)_{n\ge 0}\) be a branching random walk on \(\mathbb Z^d\) whose offspring distribution is critical, non-trivial, and sub-exponential, started with a single particle at the origin, and let L(x) be the total number of particles that visit x. Then

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n) \asymp {\left\{ \begin{array}{ll} \min \left\{ n^{-2/(4-d)}, \langle x \rangle ^{-2} \right\} &{} \qquad d<4\\ \exp \bigg [ -\Theta \!\left( \min \left\{ \sqrt{n}, \frac{n}{\log \langle x \rangle }\right\} \right) \bigg ] \langle x \rangle ^{-2} \log ^{-1}\langle x \rangle &{} \qquad d=4\\ \exp \Big [ -\Theta (n)\Big ] \langle x \rangle ^{-d+2} &{} \qquad d > 4 \end{array}\right. } \end{aligned}$$

for every \(n \ge 1\) and \(x \in \mathbb Z^d\).

The proof of Theorem 1.2 relies on the asymptotics of the hitting probability

$$\begin{aligned} \mathbb P_{\mu ,0}\!\left( L(x) \ge 1\right) \asymp {\left\{ \begin{array}{ll} \langle x \rangle ^{-2} &{} d \le 3\\ \langle x \rangle ^{-2} \log ^{-1} \langle x \rangle &{} d =4\\ \langle x \rangle ^{-d+2} &{} d \ge 5 \end{array}\right. } \qquad \text { for every }x \in \mathbb Z^d. \end{aligned}$$
(1.1)

These estimates are due in the case \(d \ne 4\) to Le Gall and Lin [13, 14], who also proved the lower bound in the case \(d=4\), while the upper bound in the case \(d=4\) was proven by Zhu [23, 24]. (In fact the exact asymptotics of \(\mathbb P_{\mu ,0}\!\left( L(x) \ge 1\right) \) have also been established by the same authors, see [13, Theorem 7] and [21, 23].)

Remark 1.3

It is well-known that a critical branching random walk conditioned to survive forever visits the origin infinitely often if and only if \(d \le 4\) [2]. This is closely related to the fact that the conditional distribution of L(x) given \(L(x)>0\) is tight as \(x \rightarrow \infty \) if and only if \(d \ge 5\).

Remark 1.4

In the context of super-Brownian motion (which is a continuum analogue of critical branching random walk), Le Gall and Merle [15] studied the conditional distribution of the occupation measure \(\mathcal {Z}(B_1(x))\) of the unit ball \(B_1(x)\) for large x, given that this measure is positive. Their results are closely related to Theorem 1.2. In particular, they show that if \(d=4\) then the conditional distribution of the normalized occupation measure \(\mathcal {Z}(B_1(x))/\log |x|\) given that it is positive converges to an exponential distribution as \(x\rightarrow \infty \). It would be interesting to establish a version of their theorem in the discrete case.

Remark 1.5

It is natural to consider the distribution of L(x) for branching random walks on graphs other than \(\mathbb Z^d\). It should be straightforward to adapt the proof of Theorem 1.1 to bounded degree graphs that are d-Ahlfors regular and satisfy Gaussian heat kernel estimates. See e.g. [10, 20] for background on these notions. We restrict attention to the usual nearest-neighbour random walk on \(\mathbb Z^d\) for clarity of exposition.

2 Background

2.1 Branching random walk

Let us now very briefly define the model, referring the reader to e.g. [17, 21] for more details on branching processes and Galton-Watson trees. Given \(d\ge 1\), an offspring distribution \(\mu \) (i.e., a probability measure on \(\{0,1,\ldots \}\)), and a point \(x \in \mathbb Z^d\) we write \(\mathbb P_{\mu ,x}\) for the law of a branching random walk \((B_n)_{n\ge 0}\) on \(\mathbb Z^d\) with offspring distribution \(\mu \) started with a single particle at x. More precisely, \((B_n)_{n\ge 0}\) is a Markov chain whose state space is the set of finitely supported functions \(\mathbb Z^d \rightarrow \{0,1,\ldots \}\), where \(B_0(y) = \mathbb {1}(y=x)\) and where we think of \(B_n(y)\) as the number of particles occupying the point y at generation n. At each time step, each particle splits into a random number of offspring particles independently at random according to the offspring distribution \(\mu \), and each offspring particle immediately performs an independent simple random walk step. We define the local time \(L_n(x) = \sum _{m =0}^n B_m(x)\) to be the total number of particles that occupy the site x up to time n, and similarly define the limit \(L(x) = \sum _{m=0}^\infty B_m(x)\).

Alternatively, we may construct branching random walk by first taking a Galton-Watson tree T with offspring distribution \(\mu \), which encodes the genealogy of the particles of the branching random walk, letting \(X:V(T) \rightarrow \mathbb Z^d\) be a uniform random graph homomorphism from T into \(\mathbb Z^d\) mapping the root to x (i.e., a simple random walk on \(\mathbb Z^d\) started at x and indexed by T), and letting \(B_n(y) = \#\{v \in \partial T_n : X(v) =y \}\) for every \(n\ge 0\) and \(y\in \mathbb Z^d\). We write \(\partial T_r\) for the set of vertices of T at distance exactly r from the root. It is easily seen that if \(\mu \) is critical then \(\mathbb E_{\mu ,0}[\# \partial T_r]= 1\) for every \(r\ge 0\). Moreover, if \(\mu \) is critical, non-trivial, and has finite variance \(\sigma ^2\), then Kolmogorov’s estimate states that

$$\begin{aligned} \mathbb P_{\mu ,0}(\partial T_r \ne \emptyset ) \sim \frac{2}{\sigma ^2r} \qquad \text { as }r \rightarrow \infty . \end{aligned}$$
(2.1)

This estimate was proven by Kolmogorov under a third moment assumption [9], and in full generality by Kesten, Ney, and Spitzer [8]; see [16] for a modern proof.
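For completeness, we record the one-line induction behind the identity \(\mathbb E_{\mu ,0}[\#\partial T_r]=1\) mentioned above: conditional on \(\#\partial T_r\), each vertex of \(\partial T_r\) has a mean-one number of children, so that

$$\begin{aligned} \mathbb E_{\mu ,0}\left[ \#\partial T_{r+1} \mid \#\partial T_r \right] = \#\partial T_r, \qquad \text {and hence} \qquad \mathbb E_{\mu ,0}\left[ \#\partial T_{r+1}\right] = \mathbb E_{\mu ,0}\left[ \#\partial T_r\right] = \cdots = \mathbb E_{\mu ,0}\left[ \#\partial T_0\right] = 1. \end{aligned}$$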

2.2 Random walk estimates

We now briefly recall the relevant background concerning random walk on \(\mathbb {Z}^d\), referring the reader to e.g. [10, 20] for further background. Let \(p_n(u,v)\) denote the n-step transition probabilities of simple random walk on \(\mathbb {Z}^d\). The Gaussian heat kernel estimates state that

$$\begin{aligned} p_n(x,y) + p_{n+1}(x,y) \asymp n^{-d/2}\exp \left[ -\Theta \Bigl ( d(x,y)^2/n \Bigr ) \right] \end{aligned}$$
(2.2)

for every \(x,y \in \mathbb Z^d\) and \(n \ge 1\), where d(x, y) denotes the graph distance between x and y. (Note that the constants in the \(\Theta \) notation may differ for the lower and upper bounds.) Note that \(p_n(x,y)=0\) if n has a different parity to d(x, y). In particular, we have that

$$\begin{aligned} p_{2n}(x,x) \asymp n^{-d/2} \end{aligned}$$
(2.3)

for every \(x\in \mathbb Z^d\) and \(n\ge 1\). If \(d \ge 3\), the Gaussian heat kernel estimates can be integrated over time to obtain that the Green’s function \(\mathbf {G}(u,v)=\sum _{n\ge 0} p_n(u,v)\) satisfies

$$\begin{aligned} \mathbf {G}(u,v) \asymp d(u,v)^{-d+2} \end{aligned}$$
(2.4)

for every \(u,v \in \mathbb Z^d\).

3 Diagrammatic expansion of moments

In this section we discuss how the moments of the branching random walk local time may be expanded in terms of diagrammatic sums, and then prove a recursive inequality that may be used to bound these sums. This basic methodology is well-known to experts, see e.g. [15, eq. 6] for an application to super-Brownian motion, and [1] for related techniques in percolation.

Recall that a rooted plane tree is a locally finite tree with a distinguished root vertex and a distinguished linear ordering of the children of each vertex; an isomorphism of trees is an isomorphism of rooted plane trees if it preserves this additional data. Note that a rooted plane tree cannot have any nontrivial automorphisms. We may consider a Galton-Watson tree T to be a rooted plane tree by picking a uniform random linear ordering of the children of every vertex.

Let \(k\ge 0\). We define a k-labelled rooted plane tree to be a finite rooted plane tree S with vertex set V(S), together with a (not necessarily injective) labelling function \(\ell :\{0,1,\ldots ,k\} \rightarrow V(S)\) mapping 0 to the root of S such that every leaf of S is labelled (i.e., is in the image of \(\ell \)). Note that leaves of S may have multiple labels, and that internal vertices may also have labels. Given a k-labelled rooted plane tree S, we write \(\partial V (S) = \ell (\{0,1,\ldots ,k\})\) and \(V^\circ (S)=V(S) \setminus \partial V(S)\) to denote the sets of labelled and unlabelled vertices of S. An isomorphism of rooted plane trees is an isomorphism of labelled rooted plane trees if it preserves the labelling.

We say that a k-labelled rooted plane tree is a (labelled) k-skeleton if every unlabelled vertex has at least two children.

In particular, up to isomorphism there is only one 0-skeleton, which has one vertex labelled 0 and no edges. Similarly, there are exactly two isomorphism classes of 1-skeletons, which have one and two vertices respectively. For each \(k\ge 0\), we let \(\mathcal {S}_k\) be a set of isomorphism class representatives for the set of labelled k-skeletons and let \({\mathcal {H}}_k\) be a set of isomorphism class representatives for the set of k-labelled rooted plane trees.

We will use the modified Green’s function

$$\begin{aligned} {\tilde{\mathbf {G}}}(x,y)&= \sum _{k\ge 1} p_k(x,y) = \mathbf {G}(x,y)-\mathbb {1}(x=y), \end{aligned}$$

and

$$\begin{aligned} {\tilde{\mathbf {G}}}_n(x,y)&= \sum _{k= 1}^n p_k(x,y) = \mathbf {G}_n(x,y)-\mathbb {1}(x=y) \end{aligned}$$

for each \(x,y \in \mathbb Z^d\) and \(n\ge 1\).

For each \(k\ge 0\), each k-labelled rooted plane tree S, and each \(\mathbf {x} = (x_0,\ldots ,x_k) \in (\mathbb Z^d)^{k+1}\) we write \(\Lambda (\mathbf {x};S)=\Lambda (x_0,\ldots ,x_k;S)\) for the set of \(\mathbf {y} = (y_u)_{u \in V(S)} \in (\mathbb Z^d)^{V(S)}\) such that \(y_{\ell (i)}=x_i\) for every \(0\le i \le k\). (This set is empty if \(\ell (i)=\ell (j)\) but \(x_i\ne x_j\).) When S is a k-skeleton, we define the S-diagram to be the function \(\mathbf {D}( \;\cdot \; ; S) : (\mathbb Z^d)^{k+1} \rightarrow [0,\infty ]\) given by

$$\begin{aligned} \mathbf {D}(\mathbf {x};S) = \mathbf {D}(x_0,\ldots ,x_k;S) = \sum _{\mathbf {y} \in \Lambda (\mathbf {x};S)} \prod _{u \sim v} {\tilde{\mathbf {G}}}(y_u,y_v), \end{aligned}$$

where the product is over all unordered pairs of adjacent vertices in S. In particular, if S is the 0-skeleton then \(\mathbf {D}( \;\cdot \; ; S) \equiv 1\), while if S is the 1-skeleton with two vertices then \(\mathbf {D}( x,y ; S) \equiv {\tilde{\mathbf {G}}}(x,y)\). Similarly, for each \(k,n\ge 0\) and each k-skeleton S we define the truncated S-diagram to be the function \({\mathbf {D}}_n( \;\cdot \; ; S) : (\mathbb Z^d)^{k+1} \rightarrow [0,\infty ]\) given by

$$\begin{aligned} {\mathbf {D}}_n({\mathbf {x}};S) = \mathbf {D}_n(x_0,\ldots ,x_k;S) = \sum _{\mathbf {y} \in \Lambda (\mathbf {x};S)} \prod _{u \sim v} {\tilde{\mathbf {G}}}_n(y_u,y_v), \end{aligned}$$

where, as before, the product is over all unordered pairs of adjacent vertices in S.

Recall that \(\mathbb P_{\mu ,x}\) denotes the law of a branching random walk \((B_n)_{n\ge 0}\) with offspring distribution \(\mu \) started with a single particle at x, and that \(\mathbb E_{\mu ,x}\) denotes the associated expectation operator. Recall also that we write \(L_n(y) = \sum _{k=0}^n B_k(y)\) for the total number of particles that visit y up to time n, and write \(L(y)=\sum _{k=0}^\infty B_k(y)\) for the total number of particles that ever visit y. For each \(k\ge 0\), we define \(b_k\) to be the expectation of the binomial coefficient \(\left( {\begin{array}{c}\cdot \\ k\end{array}}\right) \) under the offspring distribution \(\mu \), that is,

$$\begin{aligned} b_k = \sum _{n= k}^\infty \left( {\begin{array}{c}n\\ k\end{array}}\right) \mu (n). \end{aligned}$$

In particular, \(b_0=b_1=1\) when \(\mu \) is critical, \(b_k<\infty \) if and only if \(\mu \) has a finite kth moment, and \(b_k>0\) if and only if \(\sum _{n \ge k} \mu (n)>0\); thus \(b_2>0\) whenever \(\mu \) is critical and non-trivial. For each vertex v in a rooted plane tree S, we write c(v) for the number of children of v.
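As a concrete example (not needed in the sequel), consider the critical geometric offspring distribution \(\mu (n) = 2^{-n-1}\) for \(n \ge 0\), which is critical, non-trivial, and sub-exponential. Using the generating function identity \(\sum _{n \ge k} \left( {\begin{array}{c}n\\ k\end{array}}\right) x^n = x^k(1-x)^{-k-1}\) with \(x=1/2\), we compute that

$$\begin{aligned} b_k = \sum _{n \ge k} \left( {\begin{array}{c}n\\ k\end{array}}\right) 2^{-n-1} = \frac{1}{2} \cdot \frac{(1/2)^k}{(1/2)^{k+1}} = 1 \qquad \text { for every }k \ge 0, \end{aligned}$$

so that in this case every \(b_k\) is finite and positive.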

Proposition 3.1

(Diagrammatic expansion of moments) Let \(\mu \) be critical and let \(d\ge 1\). We have that

$$\begin{aligned} \mathbb E_{\mu ,x_0}\left[ \prod _{i=1}^k L(x_i) \right]&= \sum _{S \in \mathcal {S}_k} \mathbf {D}(x_0,x_1,\ldots ,x_k;S) \prod _{u \in V(S)} b_{c(u)} \end{aligned}$$

and

$$\begin{aligned} \mathbb E_{\mu ,x_0}\left[ \prod _{i=1}^k L_n(x_i) \right]&\le \sum _{S \in \mathcal {S}_k} \mathbf {D}_n(x_0,x_1,\ldots ,x_k;S) \prod _{u \in V(S)} b_{c(u)} \end{aligned}$$

for every \(n,k\ge 0\) and \(x_0,\ldots ,x_k \in \mathbb Z^d\).

Proof

We first explain the appearance of the combinatorial term \(\prod b_{c(u)}\) in the proposition. Let T be the genealogical tree of B, and let X be the random embedding of T into \(\mathbb Z^d\). Let H be a k-labelled rooted plane tree. We say that a graph homomorphism \(\phi \) from H into the Galton-Watson tree T is an embedding if it is injective, maps the root of H to the root of T, and respects the plane structure of H and T in the sense that for every vertex v of H with children \(u_1,\ldots ,u_n\), the children \(\phi (u_1),\ldots ,\phi (u_n)\) of \(\phi (v)\) in T appear in the same linear order as \(u_1,\ldots ,u_n\) do in H. However, T may have additional vertices not corresponding to any vertex in H. It is easily seen by induction on the height of H that

$$\begin{aligned} \mathbb E_{\mu ,x_0}\left[ \#\!\left\{ \text {embeddings of } H\text { into }T\right\} \right] = \prod _{u \in H} b_{c(u)} \end{aligned}$$
(3.1)

for every finite rooted plane tree H. (This equality holds even if \(\mu \) is not critical.)

We begin with the first, non-truncated formula. Given a k-tuple of not necessarily distinct vertices \({\mathbf {v}}= (v_1,\dots ,v_k) \in V(T)^k\), let \(H({\mathbf {v}})\) be the k-labelled rooted plane tree spanned by the union of the geodesics between the root of T and the vertices \(v_1,\ldots ,v_k\), with labelling function defined by setting \(\ell (0)\) to be the root of T and setting \(\ell (i)=v_i\) for each \(1 \le i \le k\). We can write

$$\begin{aligned} \prod _{i=1}^k L(x_i)&= \#\Bigl \{{\mathbf {v}}\in V(T)^k : X(v_i)=x_i \; \forall 1 \le i \le k\Bigr \} \\&= \sum _{H \in {\mathcal {H}}_k} \#\Bigl \{{\mathbf {v}}\in V(T)^k : H({\mathbf {v}}) \cong H,\; X(v_i)=x_i \; \forall 1 \le i \le k\Bigr \}. \end{aligned}$$

On the other hand, by definition of the embedding X we have that

$$\begin{aligned} \mathbb E\left[ \prod _{i=1}^k L(x_i) \bigg | T \right]&= \sum _{H \in {\mathcal {H}}_k}\#\Bigl \{{\mathbf {v}}\in V(T)^k : H({\mathbf {v}}) \cong H \Bigr \} \sum _{\mathbf {y} \in \Lambda (\mathbf {x};H)}\prod _{u \sim v} p_1(y_u,y_v) \\&= \sum _{H \in {\mathcal {H}}_k} \#\{\text {embeddings of }H\text { into }T\} \sum _{\mathbf {y} \in \Lambda (\mathbf {x};H)} \prod _{u \sim v} p_1(y_u,y_v), \end{aligned}$$

where \(p_1(\cdot ,\cdot )\) denotes the one-step transition probabilities for simple random walk on \(\mathbb Z^d\). Taking expectations over T and applying (3.1), we obtain that

$$\begin{aligned} \mathbb E_{\mu ,x_0}\left[ \prod _{i=1}^k L(x_i) \right] = \sum _{H \in {\mathcal {H}}_k} \sum _{\mathbf {y} \in \Lambda (\mathbf {x};H)}\prod _{u \in H} b_{c(u)} \prod _{u \sim v} p_1(y_u,y_v). \end{aligned}$$
(3.2)

For each H in \({\mathcal {H}}_k\), let \(S(H) \in \mathcal {S}_k\) denote the k-skeleton obtained from H by replacing each path whose interior vertices are unlabelled vertices of degree two by a single edge. Thus, for each \(S\in \mathcal {S}_k\), the set of \(H \in {\mathcal {H}}_k\) with \(S(H) = S\) is equal to the set of k-labelled rooted plane trees that can be obtained from S by replacing each edge with a path of arbitrary length. Since \(\mu \) is critical and \(b_1=1\), one may readily verify that

$$\begin{aligned} \sum _{\begin{array}{c} H \in {\mathcal {H}}_k\\ S(H)=S \end{array}} \sum _{\mathbf {y} \in \Lambda (\mathbf {x};H)}\prod _{u \in H} b_{c(u)} \prod _{u \sim v} p_1(y_u,y_v) = \sum _{\mathbf {y} \in \Lambda (\mathbf {x};S)}\prod _{u \in V(S)} b_{c(u)} \prod _{u \sim v} {\tilde{\mathbf {G}}}(y_u,y_v) \end{aligned}$$

for every \(S \in \mathcal {S}_k\) and \(x_0,x_1,\ldots ,x_k \in \mathbb Z^d\). The first claim follows from this together with (3.2).
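Indeed, if an edge \(\{u,v\}\) of S is replaced by a path of length \(m \ge 1\), then each of the \(m-1\) interior vertices of the path is unlabelled with exactly one child and therefore contributes a factor of \(b_1=1\), while summing over the locations \(z_1,\ldots ,z_{m-1} \in \mathbb Z^d\) of the interior vertices and over \(m \ge 1\) yields

$$\begin{aligned} \sum _{m \ge 1} \sum _{z_1,\ldots ,z_{m-1} \in \mathbb Z^d} p_1(y_u,z_1) p_1(z_1,z_2) \cdots p_1(z_{m-1},y_v) = \sum _{m \ge 1} p_m(y_u,y_v) = {\tilde{\mathbf {G}}}(y_u,y_v) \end{aligned}$$

by the Chapman-Kolmogorov equations, and these resummations may be performed independently over the edges of S.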

The proof in the truncated case is fairly similar, and we give only a very brief outline. For each \(n \ge 0\) and \(k\ge 0\), let \({\mathcal {H}}_{n,k} \subset {\mathcal {H}}_k\) denote the set of k-labelled rooted plane trees with height at most n, and let \({\mathcal {H}}_{n,k}'\) denote the set of k-labelled rooted plane trees in which each path whose interior vertices are unlabelled vertices of degree two has length at most n. Clearly \({\mathcal {H}}_{n,k} \subset {\mathcal {H}}_{n,k}'\). We have by similar reasoning to above that

$$\begin{aligned} \mathbb E\left[ \prod _{i=1}^k L_n(x_i)\right]&= \sum _{H \in {\mathcal {H}}_{n,k}} \sum _{\mathbf {y} \in \Lambda (\mathbf {x};H)} \prod _{u \in H} b_{c(u)} \prod _{u \sim v}p_1(y_u,y_v)\\&\le \sum _{H \in {\mathcal {H}}_{n,k}'} \sum _{\mathbf {y} \in \Lambda (\mathbf {x};H)} \prod _{u \in H} b_{c(u)} \prod _{u \sim v}p_1(y_u,y_v)\\&= \sum _{S \in \mathcal {S}_k} \sum _{\mathbf {y} \in \Lambda (\mathbf {x};S)} \prod _{u \in V(S)} b_{c(u)} \prod _{u \sim v}{\tilde{\mathbf {G}}}_n (y_u,y_v) \end{aligned}$$

as claimed. \(\square \)

We next state and prove a recursive inequality that allows us to bound the diagrammatic sums arising in Proposition 3.1. For each \(k\ge 0\), let \(\mathcal {S}_k'\) be the set of k-skeletons whose labelling function is injective. We observe that for any tuple \({\mathbf {x}}\), the maximum \(\max _{S\in \mathcal {S}'_k} \mathbf {D}({\mathbf {x}};S)\) is invariant under permuting the elements of \({\mathbf {x}}\). Indeed, \(\mathbf {D}({\mathbf {x}};S)\) is invariant under applying the same permutation to both the entries of \({\mathbf {x}}\) and the labels of S. (If 0 is not a fixed point of the permutation, this requires one to change the root of S.) The symmetry of the random walk implies that such re-rooting also does not change \(\mathbf {D}\). In light of this, for each \(k\ge 1\) and \(x\in \mathbb Z^d\), we define

$$\begin{aligned} M_k(x) : = \max _{S \in \mathcal {S}_k'} \mathbf {D}(0,\ldots ,0,x;S) = \max _{S \in \mathcal {S}_k'} \mathbf {D}(x,0,\ldots ,0;S) = \max _{S \in \mathcal {S}_k'} \mathbf {D}(0,x,\ldots ,x;S), \end{aligned}$$

where the equality of these three expressions follows from the symmetry noted above together with the translation invariance and symmetry of \({\tilde{\mathbf {G}}}\). We could equivalently define \(M_k(x)\) by maximizing \(\mathbf {D}({\mathbf {x}};S)\) over all \(S \in \mathcal {S}_k'\) and all \({\mathbf {x}}\) which are a permutation of \((0,\dots ,0,x)\). Similarly, we define the truncated version

$$\begin{aligned} M_{k,n}(x) : = \max _{S \in \mathcal {S}_k'} \mathbf {D}_n(0,\ldots ,0,x;S) = \max _{S \in \mathcal {S}_k'} \mathbf {D}_n(x,0,\ldots ,0;S) = \max _{S \in \mathcal {S}_k'} \mathbf {D}_n(0,x,\ldots ,x;S) \end{aligned}$$

for each \(k\ge 0\) and \(n\ge 0\). Note that \(M_1(x)={\tilde{\mathbf {G}}}(0,x)\) and \(M_{1,n}(x)={\tilde{\mathbf {G}}}_n(0,x)\) for every \(x\in \mathbb Z^d\) and \(n\ge 0\).

Lemma 3.2

(Recursive inequality for the maximal diagram) Let \(d\ge 1\) and \(k\ge 2\). Then

$$\begin{aligned} M_k(x) \le \left[ 1 \vee {\tilde{\mathbf {G}}}(0,0)^{-1}\right] \max _{0<r<k} \left\{ \sum _{y \in \mathbb Z^d} M_{r}(y) M_{k-r}(y) {\tilde{\mathbf {G}}}(y,x) \right\} \end{aligned}$$
(3.3)

and

$$\begin{aligned} M_{k,n}(x) \le \left[ 1 \vee {\tilde{\mathbf {G}}}_n(0,0)^{-1}\right] \max _{0<r<k}\left\{ \sum _{y \in \mathbb Z^d} M_{r,n}(y) M_{k-r,n}(y) {\tilde{\mathbf {G}}}_n(y,x) \right\} . \end{aligned}$$
(3.4)

Note that the quantities \(1\vee {\tilde{\mathbf {G}}}_n(0,0)^{-1}\) and \(1 \vee {\tilde{\mathbf {G}}}(0,0)^{-1}\) are bounded above by \(p_2(0,0)^{-1}=2d\) when \(n\ge 2\). Be warned, however, that \(1\vee {\tilde{\mathbf {G}}}_n(0,0)^{-1}\) is infinite when \(n \in \{0,1\}\). Later in the paper we will be careful to avoid this case.

Proof of Lemma 3.2

We will prove (3.3), the proof of (3.4) being almost identical. It suffices to prove that

$$\begin{aligned}&M_k(x) \le \max _{0<r<k}\Bigl \{ M_{r}(x)M_{k-r}(x) \Bigr \} \,\vee \, M_{k-1}(0) {\tilde{\mathbf {G}}}(0,x) \, \vee \, \nonumber \\&\quad \max _{0<r<k}\left\{ \sum _{y \in \mathbb Z^d} M_{r}(y) M_{k-r}(y) {\tilde{\mathbf {G}}}(y,x) \right\} \end{aligned}$$
(3.5)

for every \(k\ge 2\). Indeed, the first and second terms are each clearly smaller than the third multiplied by \(M_1(0)^{-1}={\tilde{\mathbf {G}}}(0,0)^{-1}\) (consider the contributions to the sum in the third term from \(y=0\) and \(y=x\)).

Let \(k\ge 2\), let \(S\in \mathcal {S}_k'\), let \(x\in \mathbb Z^d\), and let \(\mathbf {x} =(0,\ldots ,0,x) \in (\mathbb Z^d)^{k+1}\). We consider three cases, which correspond to the three terms being maximized over in the inequality (3.5):

  1. \(\ell (k)\) is not a leaf.

  2. \(\ell (k)\) is a leaf and the parent of \(\ell (k)\) is in \(\partial V(S)\) (i.e., is labelled).

  3. \(\ell (k)\) is a leaf and the parent of \(\ell (k)\) is in \(V^\circ (S)\) (i.e., is unlabelled).

Case 1. Let \(a \ge 1\) be the number of labelled vertices that are descendants of \(\ell (k)\) in S. Since \(\ell \) is injective, \(\ell (k)\) is not the root of S and \(a<k\).

Let \(S_1\) be the a-skeleton formed by \(\ell (k)\) and its descendants in S, where we consider \(\ell (k)\) to be the root of \(S_1\) and re-index the labels if necessary so that the labelling function has domain \(\{0,\ldots ,a\}\). Similarly, let \(S_2=(T_2,\ell _2)\) be the \((k-a)\)-skeleton obtained from S by deleting all the descendants of \(\ell (k)\), and re-indexing the labels so that the labelling function \(\ell _2\) has domain \(\{0,1,\ldots ,k-a\}\) and satisfies \(\ell _2(k-a)=\ell (k)\). (In both cases, the details of relabelling are not important.) Having done this, we observe that, by the definitions,

$$\begin{aligned}&\mathbf {D}(0,\ldots ,0,x;S) = \mathbf {D}(x,0,\ldots ,0;S_1)\mathbf {D}(0,\ldots ,0,x;S_2)\nonumber \\&\quad \le M_a(x)M_{k-a}(x). \end{aligned}$$
(3.6)

We deduce that if \(S \in \mathcal {S}_k'\) is such that \(\ell (k)\) is not a leaf of S then

$$\begin{aligned} \mathbf {D}(0,\ldots ,0,x;S) \le \max \Bigl \{ M_r(x)M_{k-r}(x) : 1 \le r \le k-1 \Bigr \}, \end{aligned}$$
(3.7)

which corresponds to the first term in (3.5).

Case 2. We may define a \((k-1)\)-skeleton \(S'\) by deleting \(\ell (k)\) from S.

The definitions then ensure that

$$\begin{aligned} \mathbf {D}(0,\ldots ,0,x;S) = \mathbf {D}(0,\ldots ,0;S') {\tilde{\mathbf {G}}}(0,x) \le M_{k-1}(0){\tilde{\mathbf {G}}}(0,x), \end{aligned}$$
(3.8)

which corresponds to the second term in (3.5).

Case 3. Let v be the (unlabelled) parent of \(\ell (k)\). Let a be the number of labelled descendants of v other than \(\ell (k)\). Since v is unlabelled it has at least two children, and therefore \(a \ge 1\). Let \(S_1\) be the a-skeleton consisting of v and its descendants other than \(\ell (k)\), where we consider v to be the root of \(S_1\) and re-index the other labels as appropriate. Similarly, let \(S_2\) be the \((k-a)\)-skeleton obtained from S by deleting all the descendants of v (but not v itself), re-indexing all the remaining labelled vertices to have labels in \(\{0,\ldots ,k-a-1\}\), and giving v the label \(k-a\). (The details of how this is done are not important.) It follows from the definitions that

$$\begin{aligned} \mathbf {D}(0,\ldots ,0,x;S)= & {} \sum _{y\in \mathbb Z^d} \mathbf {D}(0,\ldots ,0,y;S_2) \mathbf {D}(y,0,\ldots ,0;S_1) {\tilde{\mathbf {G}}}(y,x) \\&\le \sum _{y\in \mathbb Z^d} M_a(y) M_{k-a}(y) {\tilde{\mathbf {G}}}(y,x). \end{aligned}$$

We deduce that if \(S\in \mathcal {S}_k'\) is such that \(\ell (k)\) is a leaf and the parent of \(\ell (k)\) is in \(V^\circ (S)\) then

$$\begin{aligned} \mathbf {D}(0,\ldots ,0,x;S) \le \max _{0<r<k} \left\{ \sum _{y\in \mathbb Z^d} M_r(y) M_{k-r}(y) {\tilde{\mathbf {G}}}(y,x) \right\} , \end{aligned}$$
(3.9)

which corresponds to the third term in (3.5).

Since one of the three cases above holds for every \(S \in \mathcal {S}_k'\), the claimed inequality (3.5) follows from (3.7), (3.8), and (3.9). \(\square \)

We now note that bounds on \(M_k\) and \(M_{k,n}\) yield bounds on all diagrams, i.e. also with non-injective labels. Indeed, suppose that \(S \in \mathcal {S}_k\) for some \(k\ge 1\) and that the labelling function of S is not injective. Let \(r=|\ell (\{0,\ldots ,k\})|-1\), let \(\sigma :\{0,\ldots ,r\} \rightarrow \{0,\ldots ,k\}\) be defined recursively by \(\sigma (0)=0\) and \(\sigma (i) = \min \{ j > \sigma (i-1) : \ell (j) \notin \ell (\{0,\ldots ,\sigma (i-1)\})\}\) for each \(1 \le i \le r\), and let \(S'\) be the r-skeleton with the same underlying rooted plane tree as S and with labelling function \(\ell '(i)=\ell (\sigma (i))\). Then it follows from the definitions that

$$\begin{aligned}&\mathbf {D}(x_0,x_1,\ldots ,x_k;S) = \mathbb {1}\Bigl (x_i=x_j \text { for every }0\le i,j \le k\text { with }\ell (i)=\ell (j)\Bigr )\\&\quad \cdot \mathbf {D}(x_0,x_{\sigma (1)},\ldots ,x_{\sigma (r)};S') \end{aligned}$$

for every \(x_0,x_1,\ldots ,x_k \in \mathbb Z^d\). In particular, it follows that

$$\begin{aligned} \max _{S \in \mathcal {S}_k} \mathbf {D}(0,\ldots ,0,x;S) \le \max _{0 \le r \le k} M_r(x) \end{aligned}$$
(3.10)

for every \(k\ge 0\) and \(x\in \mathbb Z^d\). Similar reasoning gives that

$$\begin{aligned} \max _{S \in \mathcal {S}_k} \mathbf {D}_n(0,\ldots ,0,x;S) \le \max _{0 \le r \le k} M_{r,n}(x) \end{aligned}$$
(3.11)

for every \(k\ge 0\), \(x\in \mathbb Z^d\), and \(n\ge 0\).

4 Low dimensions

In this section we prove the following proposition, which implies the case \(d<4\) of Theorems 1.1 and 1.2. We remark that in this low-dimensional case we do not require a sub-exponential tail for the offspring distribution: a moment condition is sufficient.

Proposition 4.1

Suppose either that \(d\in \{1,2\}\) and that the offspring distribution \(\mu \) is critical, non-trivial, and has finite second moment, or that \(d=3\) and the offspring distribution \(\mu \) is critical, non-trivial, and has finite third moment. Then

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n) \asymp \min \left\{ n^{-2/(4-d)}, \langle x \rangle ^{-2} \right\} \qquad \text { for every }n\ge 1\hbox { and }x\in \mathbb Z^d. \end{aligned}$$


Remark 4.2

One can also obtain from our proof that if \(d=3\) and \(\mu \) has finite second moment then

$$\begin{aligned} n^{-2} \log ^{-1} (n+1) \preceq \mathbb P_{\mu ,0}(L(0) \ge n) \preceq n^{-2} \log (n+1) \end{aligned}$$

for every \(n\ge 1\).

Our analysis is informed by the following heuristic: In low dimensions, the easiest way for the local time L(x) to be large is for the genealogical tree to be sufficiently large, without any other unusual behaviour for the tree or the associated random walks. Indeed, intuitively, if the genealogical tree survives to generation k, which occurs with probability \(\Theta (k^{-1})\), then it typically contains roughly \(k^2\) vertices, and the locations of the corresponding particles are roughly uniformly distributed on the ball of radius \(k^{1/2}\). Thus, if R denotes the survival time of the branching random walk, we should typically have that \(L(x) =0\) if \(R \ll \langle x \rangle ^2\) and that L(x) is \(\Theta (R^{(4-d)/2})\) if \(R = \Omega (\langle x \rangle ^2)\). Thus, we expect that the easiest way to have \(L(x) \ge n\) is for R to be at least \(\min \left\{ \langle x \rangle ^2, n^{2/(4-d)}\right\} \), which leads to the expression given in Proposition 4.1. One may think of this heuristic argument as yielding a hyperscaling relation for branching random walk below the critical dimension, and the proof of Proposition 4.1 as a rigorous verification of this hyperscaling relation.
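In symbols, and using Kolmogorov's estimate (2.1) in the form \(\mathbb P_{\mu ,0}(R \ge r) \asymp r^{-1}\), this heuristic predicts that

$$\begin{aligned} \mathbb P_{\mu ,0}\left( L(x) \ge n \right) \approx \mathbb P_{\mu ,0}\left( R \ge \max \left\{ \langle x \rangle ^2, n^{2/(4-d)} \right\} \right) \asymp \min \left\{ \langle x \rangle ^{-2}, n^{-2/(4-d)} \right\} , \end{aligned}$$

in agreement with the statement of Proposition 4.1.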

We now begin the rigorous proof of Proposition 4.1. We shall see that it is sufficient to look at the first three moments of the truncated local time \(L_n(x)\). (In dimensions \(d=1,2\) it suffices to consider the first and second moments, while in dimension \(d=3\) using the second moment results in an unwanted logarithmic correction.)

Lemma 4.3

Let \(\mu \) be critical, and let \(d\in \{1,2,3\}\). Then the following moment bounds hold.

  (a) If \(\mu \) has finite second moment then

    $$\begin{aligned} \mathbb E_{\mu ,0}\left[ L_n(x)^2\right] \preceq {\left\{ \begin{array}{ll} n^{3-d} &{} d \le 2\\ \log (n+1) &{} d=3 \end{array}\right. } \qquad \text { for every }x\in \mathbb Z^d\hbox { and }n\ge 1. \end{aligned}$$
    (4.1)

  (b) If \(\mu \) has finite third moment, then

    $$\begin{aligned} \mathbb E_{\mu ,0}\left[ L_n(x)^3\right] \preceq n^{(10-3d)/2} \qquad \text { for every }x\in \mathbb Z^d\hbox { and }n\ge 1. \end{aligned}$$
    (4.2)

Note that these bounds are clearly not sharp when, say, \(\langle x \rangle \gg \sqrt{n}\). This will not be a problem for us, as the estimates are sharp in the regimes in which we wish to apply them.

We will frequently use the easily proved fact that for every \(c >0\) and \(\alpha \in \mathbb R\) there exists a constant \(C=C(c,\alpha )\) such that

$$\begin{aligned} \sum _{r \ge 1} r^{\alpha } \exp \left[ -c r^2/n \right] \le C {\left\{ \begin{array}{ll} n^{(1+\alpha )/2} &{} \alpha >-1\\ \log (n+1) &{} \alpha = -1\\ 1 &{} \alpha <-1 \end{array}\right. } \qquad \text {for every }n\ge 1. \end{aligned}$$
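For instance, when \(\alpha > -1\): since the function \(t \mapsto t^{\alpha } e^{-ct^2/n}\) is unimodal on \([1,\infty )\), the sum is bounded by the corresponding integral plus the largest summand, so that substituting \(t = \sqrt{n}\, s\) gives

$$\begin{aligned} \sum _{r \ge 1} r^{\alpha } \exp \left[ -c r^2/n \right] \le \sup _{t \ge 1} t^{\alpha } e^{-ct^2/n} + \int _0^\infty t^{\alpha } e^{-ct^2/n} \, \text {d}t \preceq 1 + n^{\alpha /2} + n^{(1+\alpha )/2} \int _0^\infty s^{\alpha } e^{-cs^2} \, \text {d}s \preceq n^{(1+\alpha )/2}. \end{aligned}$$

The case \(\alpha = -1\) follows similarly by splitting the sum at \(r = \lceil \sqrt{n} \rceil \), and the case \(\alpha < -1\) by bounding the exponential by 1.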

Proof of Lemma 4.3

It suffices to consider the case \(n \ge 2\), so that \(1 \vee {\tilde{\mathbf {G}}}_n(0,0)^{-1} \le 4d^2 \preceq 1\).

(a) Second moment. Fix \(1 \le d \le 3\). No 2-skeleton has a vertex with more than two children, and there are only finitely many (in fact 10) 2-skeletons. Since \(b_0,b_1,b_2 < \infty \) by assumption, it suffices by Proposition 3.1 and (3.11) to prove that

$$\begin{aligned} M_{k,n}(x) \preceq {\left\{ \begin{array}{ll} n^{3-d} &{} d \le 2\\ \log (n+1) &{} d=3\end{array}\right. } \end{aligned}$$
(4.3)

for every \(k \in \{0,1,2\}\), \(x\in \mathbb Z^d\), and \(n\ge 1\). This bound is trivially satisfied for \(k=0\), since in this case \(M_{0,n}(x) = \mathbb {1}(x=0) \le 1\). For \(k=1\) we have that

$$\begin{aligned} M_{1,n}(x) = {\tilde{\mathbf {G}}}_n(0,x) \preceq \sum _{k=1}^n k^{-d/2} \preceq {\left\{ \begin{array}{ll} n^{1/2} &{} d=1 \\ \log (n+1) &{} d=2\\ 1 &{} d =3, \end{array}\right. } \end{aligned}$$
(4.4)

which is of lower order than the required bound. For \(k=2\), we apply Lemma 3.2 to deduce that

$$\begin{aligned} M_{2,n}(x) \preceq \sum _{y \in \mathbb Z^d} {\tilde{\mathbf {G}}}_n(0,y)^2 {\tilde{\mathbf {G}}}_n(y,x) \end{aligned}$$
(4.5)

for every \(x \in \mathbb Z^d\) and \(n\ge 2\). Applying the Gaussian heat kernel estimates (2.2), we deduce that there exists a positive constant c such that

$$\begin{aligned} M_{2,n}(x) \preceq \sum _{y \in \mathbb Z^d} \sum _{k_1=1}^n \sum _{k_2=1}^n \sum _{k_3=1}^n k_1^{-d/2}k_2^{-d/2}k_3^{-d/2}\exp \left[ -\frac{c \langle y \rangle ^2}{k_1} -\frac{c \langle y \rangle ^2}{k_2}-\frac{c \langle x-y \rangle ^2}{k_3} \right] .\nonumber \\ \end{aligned}$$
(4.6)

Using that \(\#\{y \in \mathbb Z^d: \langle y \rangle =r\} = O(r^{d-1})\) and changing variables to \(z=x-y\) if \(k_3=\min \{k_1,k_2,k_3\}\), we have that

$$\begin{aligned} \sum _{y \in \mathbb Z^d} \exp \left[ -\frac{c \langle y \rangle ^2}{k_1} -\frac{c \langle y \rangle ^2}{k_2}-\frac{c \langle x-y \rangle ^2}{k_3} \right]&\preceq \sum _{r\ge 1} r^{d-1} \exp \left[ -\frac{cr^2}{\min \{k_1,k_2,k_3\}}\right] \\&\preceq \min \{k_1,k_2,k_3\}^{d/2} \end{aligned}$$

and hence that

$$\begin{aligned} M_{2,n}(x) \preceq \sum _{k_1=1}^n \sum _{k_2=1}^n \sum _{k_3=1}^n k_1^{-d/2}k_2^{-d/2}k_3^{-d/2} \min \{k_1,k_2,k_3\}^{d/2}. \end{aligned}$$

If \(d \in \{1,2\}\), we bound \(\min \{k_1,k_2,k_3\}^{d/2} \le k_1^{d/6} k_2^{d/6} k_3^{d/6}\) and deduce that

$$\begin{aligned} M_{2,n}(x) \preceq \sum _{k_1=1}^n \sum _{k_2=1}^n \sum _{k_3=1}^n k_1^{-d/3}k_2^{-d/3}k_3^{-d/3} \preceq n^{3-d} \end{aligned}$$

for every \(n\ge 2\) as claimed. Meanwhile, if \(d=3\), the summand is symmetric in \(k_1,k_2,k_3\), so we may restrict the sum to \(k_3 = \min \{k_1,k_2,k_3\}\) at the cost of a factor of 3 and compute that

$$\begin{aligned} M_{2,n}(x) \preceq \sum _{k_3=1}^n \sum _{k_1=k_3}^n \sum _{k_2=k_3}^n k_1^{-3/2}k_2^{-3/2} \preceq \sum _{k_3=1}^n k_3^{-1} \preceq \log (n+1) \end{aligned}$$

for every \(n\ge 2\) as claimed.

(b) Third moment. Since no 3-skeleton has a vertex with more than 3 offspring, and since \(b_0,b_1,b_2,b_3 < \infty \) by assumption, it suffices by Proposition 3.1 and (3.11) to prove that

$$\begin{aligned} M_{k,n}(x) \preceq n^{(10-3d)/2} \end{aligned}$$

for every \(n\ge 2\), \(k=0,1,2,3\), and \(x\in \mathbb Z^d\). The fact that this bound is satisfied for \(k=0,1,2\) has already been established. For \(k=3\), we apply Lemma 3.2 and (4.5) to deduce that

$$\begin{aligned} M_{3,n}(x) \preceq \sum _{y \in \mathbb Z^d} \sum _{z\in \mathbb Z^d} {\tilde{\mathbf {G}}}_n(0,z)^2 {\tilde{\mathbf {G}}}_n(z,y) {\tilde{\mathbf {G}}}_n(0,y) {\tilde{\mathbf {G}}}_n(y,x). \end{aligned}$$

As before, we apply the Gaussian heat kernel estimates (2.2) to bound this sum by

$$\begin{aligned}&M_{3,n}(x) \preceq \sum _{y,z \in \mathbb Z^d} \sum _{1 \le k_1,\ldots ,k_5 \le n} k_1^{-d/2}k_2^{-d/2}k_3^{-d/2}k_4^{-d/2}k_5^{-d/2}\\&\quad \cdot \exp \left[ -\frac{c \langle z \rangle ^2}{k_1} -\frac{c \langle z \rangle ^2}{k_2} - \frac{c \langle z-y \rangle ^2}{k_3} -\frac{c \langle y \rangle ^2}{k_4} -\frac{c \langle x-y \rangle ^2}{k_5} \right] . \end{aligned}$$

By similar reasoning to above, we can bound

$$\begin{aligned}&\sum _{y,z \in \mathbb Z^d} \exp \left[ -\frac{c \langle z \rangle ^2}{k_1} -\frac{c \langle z \rangle ^2}{k_2} - \frac{c \langle z-y \rangle ^2}{k_3} -\frac{c \langle y \rangle ^2}{k_4} -\frac{c \langle x-y \rangle ^2}{k_5} \right] \\&\quad \preceq \sum _{z \in \mathbb Z^d} \exp \left[ -\frac{c \langle z \rangle ^2}{k_1} -\frac{c \langle z \rangle ^2}{k_2}\right] \min \{k_3,k_4,k_5\}^{d/2}\\&\quad \preceq \min \{k_1,k_2\}^{d/2} \min \{k_3,k_4,k_5\}^{d/2}. \end{aligned}$$

By symmetry we can also bound the left hand side by \(\min \{k_1,k_2,k_3\}^{d/2} \min \{k_4,k_5\}^{d/2}\). Now observe that, using that \(\min \{k_3,k_4,k_5\} \le k_3^{2/5} k_4^{3/10} k_5^{3/10}\)

$$\begin{aligned}&\min \left\{ \min \{k_1,k_2\}^{d/2} \min \{k_3,k_4,k_5\}^{d/2}, \min \{k_1,k_2,k_3\}^{d/2} \min \{k_4,k_5\}^{d/2} \right\} \\&\quad \le \min \left\{ k_1^{d/4}k_2^{d/4} k_3^{d/5}k_4^{3d/20}k_5^{3d/20}, k_1^{3d/20}k_2^{3d/20}k_3^{d/5}k_4^{d/4}k_5^{d/4} \right\} \le \prod _{i=1}^5 k_i^{d/5}, \end{aligned}$$

where we once again bounded the minimum by the geometric mean and used that \((1/4+3/20)/2 = 1/5\) in the final inequality. Thus, we may bound

$$\begin{aligned} M_{3,n}(x) \preceq \sum _{1 \le k_1,\ldots ,k_5 \le n} \prod _{i=1}^5 k_i^{-3d/10} \preceq \left( n^{1-3d/10}\right) ^5 = n^{(10-3d)/2} \end{aligned}$$

for every \(n\ge 2\) and \(x \in \mathbb Z^d\) as required. \(\square \)

Before applying Lemma 4.3 to prove Proposition 4.1, let us recall the Paley-Zygmund inequality and its higher-moment variants. The usual Paley-Zygmund inequality states that if X is a non-negative random variable with finite second moment then

$$\begin{aligned} \mathbb P\left( X \ge \varepsilon \mathbb E[X]\right) \ge \frac{(1-\varepsilon )^2 \mathbb E[X]^2}{\mathbb E[X^2]} \end{aligned}$$

for every \(0\le \varepsilon \le 1\). Applying this inequality to the conditional distribution of a non-negative random variable X given that \(X>0\) and doing a little algebra, we obtain that in fact

$$\begin{aligned} \mathbb P\left( X \ge \varepsilon \mathbb E\left[ X \mid X>0\right] \right) \ge \frac{(1-\varepsilon )^2 \mathbb E[X]^2}{\mathbb E[X^2]} \end{aligned}$$

for every \(0 \le \varepsilon \le 1\).

The Paley-Zygmund inequality also has the following \(L^p\) version. We include a short proof since this inequality is less standard.

Lemma 4.4

Let X be a non-negative random variable. Then

$$\begin{aligned} \mathbb P\left( X \ge \varepsilon \mathbb E\left[ X\right] \right) \ge \frac{(1-\varepsilon )^{p/(p-1)} \mathbb E[X]^{p/(p-1)}}{\mathbb E[X^p]^{1/(p-1)}} \end{aligned}$$

for every \(p >1\) and \(0 \le \varepsilon \le 1\).

Proof

Hölder’s inequality implies that

$$\begin{aligned} \mathbb E[X]&\le \varepsilon \mathbb E\left[ X \right] \mathbb P\left( X < \varepsilon \mathbb E\left[ X \right] \right) + \mathbb E\left[ X \mathbb {1}\left( X \ge \varepsilon \mathbb E\left[ X \right] \right) \right] \\&\le \varepsilon \mathbb E\left[ X\right] + \mathbb E\left[ X^p\right] ^{1/p} \mathbb P\left( X \ge \varepsilon \mathbb E\left[ X\right] \right) ^{(p-1)/p}. \end{aligned}$$

Rearranging gives the desired inequality. \(\square \)

Now suppose that X is a nonnegative random variable. Applying the above inequality to a random variable Z distributed according to the conditional distribution of X given \(X>0\) gives that

$$\begin{aligned}&\mathbb P\left( X \ge \varepsilon \mathbb E\left[ X \mid X>0\right] \right) \nonumber \\&\quad = \mathbb P(X>0)\mathbb P\left( Z \ge \varepsilon \mathbb E[Z]\right) \nonumber \\&\quad \ge \mathbb P(X>0)\frac{(1-\varepsilon )^{p/(p-1)} \mathbb E[Z]^{p/(p-1)}}{\mathbb E[Z^p]^{1/(p-1)}} = \frac{(1-\varepsilon )^{p/(p-1)} \mathbb E[X]^{p/(p-1)}}{\mathbb E[X^p]^{1/(p-1)}} \nonumber \\ \end{aligned}$$
(4.7)

for every \(p>1\) and \(0\le \varepsilon \le 1\).

Proof of Proposition 4.1

Let \(1 \le d \le 3\). We assume that \(\mu \) has finite second moment if \(d\in \{1,2\}\) and that \(\mu \) has finite third moment if \(d =3\). We begin with the upper bounds. We have by (1.1) that

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n ) \le \mathbb P_{\mu ,0}(L(x)\ge 1) \asymp \langle x \rangle ^{-2} \end{aligned}$$

for every \(n\ge 1\) and \(x\in \mathbb Z^d\). (When \(d = 3\) this estimate can be proven directly by noting that the expectation of L(x) is equal to \(\mathbf {G}(0,x)\) and applying the estimate (2.4).)

Thus it suffices to prove that

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n ) \preceq n^{-2/(4-d)} \end{aligned}$$

for every \(n\ge 1\) and \(x\in \mathbb Z^d\). Since \(\mu \) is critical, non-trivial, and has finite second moment (equivalently, finite variance), we have by Kolmogorov's estimate (2.1) that

$$\begin{aligned} \mathbb P_{\mu ,0}(\partial T_r \ne \emptyset )\asymp \frac{1}{r} \qquad \text { for every }r\ge 1, \end{aligned}$$

where T is the genealogical tree of the branching process. Thus, we can bound

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n)\le & {} \mathbb P_{\mu ,0} \left( L_r(x)\ge n \right) + \mathbb P_{\mu ,0}(\partial T_r \ne \emptyset ) \\\le & {} \frac{1}{n^d} \mathbb E_{\mu ,0} \left[ L_r(x)^d\right] + \mathbb P_{\mu ,0}(\partial T_r \ne \emptyset ) \preceq {\left\{ \begin{array}{ll} n^{-1} r^{1/2} + r^{-1} &{} d=1\\ n^{-2} r + r^{-1} &{} d=2\\ n^{-3} r^{1/2} + r^{-1} &{} d=3 \end{array}\right. } \end{aligned}$$

for every \(n,r\ge 1\). Taking \(r= n^{2/3}\) when \(d=1\), \(r=n\) when \(d=2\), and \(r=n^2\) when \(d=3\), we obtain that

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n) \preceq n^{-2/(4-d)} \end{aligned}$$

for every \(n\ge 1\) and \(x\in \mathbb Z^d\) as desired.
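These choices of r balance the two competing terms: ignoring integer rounding, we have

$$\begin{aligned} n^{-1} r^{1/2} + r^{-1} \Big |_{r = n^{2/3}} = 2 n^{-2/3}, \qquad n^{-2} r + r^{-1} \Big |_{r=n} = 2n^{-1}, \qquad n^{-3} r^{1/2} + r^{-1} \Big |_{r=n^2} = 2n^{-2}, \end{aligned}$$

which are of order \(n^{-2/(4-d)}\) in dimensions \(d=1,2,3\) respectively.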

We now turn to the lower bounds. It suffices to prove that there exists a constant c such that

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n) \succeq n^{-2/(4-d)} \end{aligned}$$

for every \(x\in \mathbb Z^d\) and every \(n \ge c \langle x \rangle ^{4-d}\): the required bound for smaller n follows since \(\mathbb P_{\mu ,0}(L(x) \ge n)\) is a decreasing function of n. For each \(r \ge 1\) we have by linearity of expectation that

$$\begin{aligned} \mathbb E_{\mu ,0}\left[ \sum _{\ell =r}^{2r} B_\ell (x) \bigg |\sum _{\ell =r}^{2r} B_\ell (x) >0 \right]\ge & {} \mathbb E_{\mu ,0}\left[ \sum _{\ell =r}^{2r} B_\ell (x) \bigg | \partial T_r \ne \emptyset \right] \\= & {} \mathbb E_{\mu ,0}\left[ \sum _{\ell =r}^{2r} |\partial T_\ell | p_\ell (0,x) \right] \mathbb P(\partial T_r \ne \emptyset )^{-1} \asymp r \sum _{\ell =r}^{2r}p_\ell (0,x) \end{aligned}$$

for every \(x\in \mathbb Z^d\) and \(r\ge 1\). If \(r \succeq \langle x \rangle ^2\) and \(\ell \) has the right parity then \(p_\ell (0,x) \asymp r^{-d/2}\). It follows that

$$\begin{aligned} \mathbb E_{\mu ,0}\left[ \sum _{\ell =r}^{2r} B_\ell (x) \bigg | \sum _{\ell =r}^{2r} B_\ell (x) >0 \right] \succeq r^{(4-d)/2} \end{aligned}$$
(4.8)

for every \(x\in \mathbb Z^d\) and \(r\ge \langle x \rangle ^2\).

Suppose that \(d \in \{1,2\}\). We deduce from (4.8), (4.1) and the Paley-Zygmund inequality that there exists a constant \(c>0\) such that if \(r\ge \langle x \rangle ^2\) then

$$\begin{aligned} \mathbb P_{\mu ,0}\left[ L(x) \ge c r^{(4-d)/2} \right] \ge \mathbb P_{\mu ,0}\left[ \sum _{\ell =r}^{2r} B_\ell (x) \ge c r^{(4-d)/2} \right] \succeq \frac{r^{2-d}}{r^{3-d}}=\frac{1}{r}, \end{aligned}$$

and the desired lower bound follows by taking \(r=\lceil (n/c)^{2/(4-d)}\rceil \). Now suppose that \(d=3\). Applying (4.7) with \(p=3\) we obtain that there exists a constant c such that

$$\begin{aligned} \mathbb P_{\mu ,0}\left[ L(x) \ge c r^{(4-d)/2} \right] \ge \mathbb P_{\mu ,0}\left[ \sum _{\ell =r}^{2r} B_\ell (x) \ge c r^{(4-d)/2} \right] \succeq r^{-1}, \end{aligned}$$

and we conclude as before. \(\square \)

5 High dimensions

In this section we treat the case \(d \ge 5\).

Proposition 5.1

Let \(d\ge 5\) and suppose that the offspring distribution \(\mu \) is critical, non-trivial, and sub-exponential. Then

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x)\ge n) = \exp \left[ -\Theta (n)\right] \langle x \rangle ^{-d+2} \end{aligned}$$

for every \(n\ge 1\) and \(x\in \mathbb Z^d\).

The lower bound is simple, and most of our work will go into proving the upper bound. By a standard computation, which we reproduce below, it suffices to prove that there exists a constant \(C=C(\mu ,d)\) such that \(\mathbb E_{\mu ,0}[L(x)^k] \le C^k k! \langle x \rangle ^{-d+2}\) for every \(k\ge 1\). Thus, applying Proposition 3.1, it suffices to prove the following two lemmas. Recall that c(u) is the number of offspring of a vertex u in a skeleton.

Lemma 5.2

(The skeleton partition function) If \(\mu \) is critical and sub-exponential then there exists a constant \(\kappa =\kappa (\mu )\) such that \(\sum _{S \in \mathcal {S}_k} \prod _{u\in S} b_{c(u)} \preceq \kappa ^k k!\) for every \(k\ge 1\).

Lemma 5.3

(Contribution of a single skeleton) Let \(d\ge 5\). There exists a constant \(\lambda =\lambda (d)\) such that

$$\begin{aligned} \mathbf {D}(0,x,\ldots ,x;S) \le \lambda ^k \langle x \rangle ^{-d+2} \end{aligned}$$

for every \(k \ge 0\), \(S \in \mathcal {S}_k\) and \(x\in \mathbb Z^d\).

We begin with Lemma 5.2.

Proof of Lemma 5.2

Since \(\mu \) is sub-exponential, it satisfies a bound of the form \(\mu (n) \le C \lambda ^n\) for some \(C<\infty \) and \(\lambda <1\). Thus, we have by a standard generating function calculation [19, Eq. 1.31] that

$$\begin{aligned} b_k \le C \sum _{n =k}^\infty \left( {\begin{array}{c}n\\ k\end{array}}\right) \lambda ^n = \frac{C}{1-\lambda } \left( \frac{\lambda }{1-\lambda }\right) ^k \end{aligned}$$

for every \(k\ge 0\). Since \(\sum _{u\in S} c(u) = |V(S)|-1\) for every skeleton S, it follows that

$$\begin{aligned} \prod _{u\in S} b_{c(u)} \le \left( \frac{C}{1-\lambda }\right) ^{|V(S)|}\left( \frac{\lambda }{1-\lambda }\right) ^{|V(S)|-1} \end{aligned}$$

for every skeleton S.

Let \(\mathcal {S}_{n,k} \subseteq \mathcal {S}_k\) be the set of isomorphism classes of k-skeletons with exactly n vertices, and let \({\mathcal {T}}_n\) denote the set of isomorphism classes of rooted plane trees with exactly n vertices. It is well known [19, Example 2.16] that \(|{\mathcal {T}}_n|\) is given by the Catalan number

$$\begin{aligned} |{\mathcal {T}}_n| = \frac{1}{n} \left( {\begin{array}{c}2n-2\\ n-1\end{array}}\right) \le 4^n. \end{aligned}$$
(5.1)

For each rooted plane tree \(T \in {\mathcal {T}}_n\) there are at most \(n^k\) isomorphism classes of k-skeletons with underlying rooted tree T, so that \(|\mathcal {S}_{n,k}| \le 4^n n^k\) for every \(n\ge 1\) and \(k\ge 0\). On the other hand, if \(S \in \mathcal {S}_k\) then every vertex of \(V^\circ (S)\) has degree at least three, so that

$$\begin{aligned} 3|V^\circ (S)| + |\partial V(S)| \le \sum _{u \in V(S)} \deg (u) = 2|V(S)|-2 = 2|V^\circ (S)| + 2 |\partial V(S)| -2\nonumber \\ \end{aligned}$$
(5.2)

and hence that \(|V^\circ (S)| \le |\partial V(S)|-2\), so that \(|V(S)| \le 2|\partial V(S)|-2 \le 2k\). Putting these observations together, we obtain that

$$\begin{aligned} \sum _{S \in \mathcal {S}_k} \prod _{u\in S} b_{c(u)}&\le \sum _{S \in \mathcal {S}_k} \left( \frac{C}{1-\lambda }\right) ^{|V(S)|}\left( \frac{\lambda }{1-\lambda }\right) ^{|V(S)|-1} \\&= \sum _{n=1}^{2k} |\mathcal {S}_{n,k}| \left( \frac{C}{1-\lambda }\right) ^{n}\left( \frac{\lambda }{1-\lambda }\right) ^{n-1} \le \sum _{n=1}^{2k} 4^n n^k \left( \frac{C}{1-\lambda }\right) ^{n}\left( \frac{\lambda }{1-\lambda }\right) ^{n-1}, \end{aligned}$$

for every \(k\ge 0\), from which the claim follows easily. \(\square \)
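For the reader's convenience, we spell out one way to perform this final step: setting \(A = \max \{1, 4C/(1-\lambda )\} \cdot \max \{1, \lambda /(1-\lambda )\}\), each summand in the final sum is at most \(A^{2k}(2k)^k\), so that for \(k\ge 1\)

$$\begin{aligned} \sum _{n=1}^{2k} 4^n n^k \left( \frac{C}{1-\lambda }\right) ^{n}\left( \frac{\lambda }{1-\lambda }\right) ^{n-1} \le 2k \cdot A^{2k} (2k)^k = 2k \cdot 2^k A^{2k} k^k \le (4eA^2)^k k!, \end{aligned}$$

where we used the elementary bounds \(k^k \le e^k k!\) and \(2k \le 2^k\) in the last inequality. This gives the claim with \(\kappa = 4eA^2\).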

Lemma 5.3 will be proven using the recursive inequality Lemma 3.2 together with the following simple fact, which is related to the fact that the simple random walk bubble diagram converges when \(d\ge 5\).

Lemma 5.4

Let \(d\ge 5\). Then there exists a constant \(C=C(d) \ge 1\) such that

$$\begin{aligned} C^{-1}\langle x \rangle ^{-d+2} \le \sum _{y \in \mathbb Z^d} \langle y \rangle ^{-2d+4}\langle x-y\rangle ^{-d+2} \le C \langle x \rangle ^{-d+2} \end{aligned}$$

for every \(x \in \mathbb Z^d\).

Proof of Lemma 5.4

The lower bound is trivial from the contribution of \(y=0\). For the upper bound, consider the set \(A = \{y \in \mathbb Z^d : d(x,y) \ge d(0,x)/2\}\). We will control the contribution to the sum from A and \(A^c\) separately. If \(y\in A\) then \(\langle x-y \rangle \succeq \langle x \rangle \), so that

$$\begin{aligned} \sum _{y \in A} \langle y \rangle ^{-2d+4}\langle x-y\rangle ^{-d+2} \preceq \sum _{y \in A} \langle y \rangle ^{-2d+4} \langle x \rangle ^{-d+2} \preceq \langle x \rangle ^{-d+2}, \end{aligned}$$
(5.3)

where we used that \(\sum _{y \in \mathbb Z^d}\langle y \rangle ^{-2d+4}\) is finite when \(d\ge 5\). On the other hand, if \(y \in A^c\) then \(d(0,x)/2 \le d(0,y) \le 3d(0,x)/2\) and we have that

$$\begin{aligned} \sum _{y \in A^c} \langle y \rangle ^{-2d+4} \langle x-y \rangle ^{-d+2} \preceq \langle x \rangle ^{-2d+4} \sum _{y \in A^c} \langle x -y \rangle ^{-d+2}. \end{aligned}$$

Since there are \(O(r^{d-1})\) points y with \(\langle x - y \rangle = r\) for each \(r\ge 1\), we deduce that

$$\begin{aligned} \sum _{y \in A^c} \langle y \rangle ^{-2d+4} \langle x -y \rangle ^{-d+2} \preceq \langle x \rangle ^{-2d+4} \sum _{r=1}^{3\langle x \rangle /2} r \preceq \langle x \rangle ^{-2d+6} \preceq \langle x \rangle ^{-d+2} \end{aligned}$$
(5.4)

where we used that \(d \ge 4\) in the last inequality. Combining (5.3) and (5.4) completes the proof. \(\square \)

Proof of Lemma 5.3

Let \(C_1 \ge 1\) be such that \({\tilde{\mathbf {G}}}(x,y) \le C_1 \langle x-y \rangle ^{-d+2}\) for every \(x,y \in \mathbb Z^d\), let \(C_2 \ge 1\) be the constant from Lemma 5.4, and let \(\lambda = C_1^2 C_2[1 \vee {\tilde{\mathbf {G}}}(0,0)^{-1}]\). We will prove by induction on k that

$$\begin{aligned} M_k(x) \le C_1 \lambda ^{k-1} \langle x \rangle ^{-d+2} \end{aligned}$$
(5.5)

for every \(k\ge 1\). The base case \(k=1\) is immediate, since \(\mathcal {S}_1'\) has only one element and this element S has \(\mathbf {D}(0,x;S)={\tilde{\mathbf {G}}}(0,x) \le C_1 \langle x \rangle ^{-d+2}\). Now suppose that \(k\ge 2\) and that the induction hypothesis (5.5) holds for all \(1 \le r \le k-1\). Applying Lemma 3.2 and Lemma 5.4 we obtain that

$$\begin{aligned} M_k(x)&\le \left[ 1\vee {\tilde{\mathbf {G}}}(0,0)^{-1}\right] \max \left\{ C_1^3 \lambda ^{k-r-1} \lambda ^{r-1} \sum _{y \in \mathbb Z^d} \langle y \rangle ^{-2d+4} \langle x -y \rangle ^{-d+2} : 1 \le r \le k-1\right\} \\&\le \left[ 1 \vee {\tilde{\mathbf {G}}}(0,0)^{-1}\right] C_1^3 C_2 \lambda ^{k-2} \langle x \rangle ^{-d+2} \le C_1 \lambda ^{k-1} \langle x \rangle ^{-d+2} \end{aligned}$$

for every \(x \in \mathbb Z^d\). This completes the induction.

The claim follows from (5.5) and (3.10). \(\square \)

Proof of Proposition 5.1

We begin with the upper bound. Proposition 3.1 and Lemmas 5.2 and 5.3 imply that there exists a constant \(\alpha \) such that \(\mathbb E_{\mu ,0}[L(x)^k] \le \alpha ^k k! \langle x \rangle ^{-d+2}\) for every \(k \ge 1\) and \(x\in \mathbb Z^d\). We deduce that

$$\begin{aligned}&\mathbb E_{\mu ,0}\left[ e^{L(x)/2\alpha } \mathbb {1}(L(x)>0)\right] \le \frac{e^{1/2\alpha }}{e^{1/2\alpha }-1}\mathbb E_{\mu ,0}\left[ e^{L(x)/2\alpha } -1\right] \nonumber \\&\quad = \frac{e^{1/2\alpha }}{e^{1/2\alpha }-1}\sum _{k \ge 1} \frac{1}{2^k \alpha ^k k!} \mathbb E_{\mu ,0}\left[ L(x)^k\right] \le \frac{e^{1/2\alpha }}{e^{1/2\alpha }-1} \langle x \rangle ^{-d+2} \end{aligned}$$
(5.6)

for every \(x\in \mathbb Z^d\), and hence by Markov’s inequality that

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x)\ge n) \le \frac{e^{-(n-1)/2\alpha } \langle x \rangle ^{-d+2}}{e^{1/2\alpha }-1} \end{aligned}$$

for every \(x\in \mathbb Z^d\) and \(n\ge 1\) as claimed.

We finish with the lower bound. First suppose that \(x=0\). The probability q that the initial particle has at least one grandchild is positive, and any given grandchild is located at the origin with probability \(p_2(0,0)=1/(2d)\). By the Markov property, the probability that there are at least n visits to 0 is therefore at least \((q/(2d))^n = e^{-\Theta (n)}\) for every \(n\ge 1\). If \(x \ne 0\), then we claim that

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x)\ge n)&\ge \mathbb P_{\mu ,0}(L(x)>0)\mathbb P_{\mu ,x}(L(x) \ge n) \\&= \mathbb P_{\mu ,0}(L(x)>0)\mathbb P_{\mu ,0}(L(0) \ge n) \succeq \langle x \rangle ^{2-d} e^{-\Theta (n)} \end{aligned}$$

as required, where the final inequality follows from (1.1). Indeed, for the first inequality, note that if we explore the genealogical tree T in a breadth-first manner until x is visited for the first time, the part of the branching process that is descended from this first visit to x has conditional law \(\mathbb P_{\mu ,x}\). This completes the proof. \(\square \)

6 The critical dimension

In this section we deal with the case of the upper critical dimension \(d=4\), which is the most technical. We rely on the machinery developed in the previous sections, in particular Lemmas 3.2 and 5.2. The following is the \(d=4\) case of Theorem 1.2.

Proposition 6.1

Let \(d=4\) and suppose that the offspring distribution \(\mu \) is critical, non-trivial, and sub-exponential. Then

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x)\ge n) \asymp \exp \bigg [ -\Theta \!\left( \min \left\{ \sqrt{n}, \tfrac{n}{\log \langle x \rangle }\right\} \right) \bigg ] \langle x \rangle ^{-2} \log ^{-1}\langle x \rangle \end{aligned}$$

for every \(n\ge 1\) and \(x\in \mathbb Z^d\).

Remark 6.2

Proposition 6.1 shows that in four dimensions, unlike in low dimensions, the easiest way for L(0) to be large is not for the genealogical tree to be “large in a typical way”. Indeed, L(0) is typically logarithmic in the size of the tree, so for L(0) to be of order n we would need the tree to survive to generation \(e^{\Omega (n)}\). This occurs with probability \(e^{-\Omega (n)}\), which is much smaller than the probability that \(L(0) \ge n\).
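To see where the logarithm comes from, note that, heuristically, conditioning on survival to generation r boosts the generation sizes \(\#\partial T_k\) to order k for \(1 \le k \le r\) (in line with Kesten's construction of the critical tree conditioned to survive forever), so that in four dimensions one expects

$$\begin{aligned} \mathbb E_{\mu ,0}\left[ L_r(0) \mid \partial T_r \ne \emptyset \right] \approx \sum _{k=1}^r k \, p_k(0,0) \asymp \sum _{k=1}^r k^{-1} \asymp \log (r+1), \end{aligned}$$

where we used that \(p_k(0,0) \asymp k^{-2}\) for even k. Since the total progeny is of order \(r^2\) on the survival event, this is consistent with L(0) being logarithmic in the size of the tree.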

The proof of this proposition relies on the results of Zhu [23, 24] (i.e., the \(d=4\) case of the hitting probability estimate (1.1)) in the case \(x \ne 0\), but is self-contained in the case \(x=0\). Indeed, the proposition will follow from Zhu’s results together with the following proposition.

Proposition 6.3

Let \(d=4\) and suppose that the offspring distribution \(\mu \) is critical, non-trivial, and sub-exponential. Then there exist positive constants c and C such that

$$\begin{aligned} c^k k! [k+\log \langle x \rangle ]^{k-1}\langle x \rangle ^{-2} \le \mathbb E_{\mu ,0}[L(x)^k] \le C^k k! [k+\log \langle x \rangle ]^{k-1} \langle x \rangle ^{-2} \end{aligned}$$

for every \(x\in \mathbb Z^d\) and \(k\ge 1\).

We begin with the following lemma, which is the four-dimensional analogue of Lemma 5.4.

Lemma 6.4

Let \(d=4\). Then there exists a positive constant C such that

$$\begin{aligned} \sum _{y \in \mathbb Z^d} \langle x-y\rangle ^{-2} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k \le \frac{C\langle x \rangle ^{-2}}{k+1} [k+1+ \log \langle x \rangle ]^{k+1} \end{aligned}$$

for every \(x \in \mathbb Z^d\) and \(k\ge 0\).

Proof of Lemma 6.4

Partition \(\mathbb Z^4\) into three sets A, B, and C according to the distances to 0 and to x:

$$\begin{aligned} A&= \{ y \in \mathbb Z^d : d(0,y) \le 2 d(0,x) \text { and } d(x,y) \ge d(0,x)/2 \}, \\ B&= \{ y \in \mathbb Z^d : d(x,y) < d(0,x)/2\}, \\ C&= \{y \in \mathbb Z^d : d(0,y) > 2 d(0,x)\}. \end{aligned}$$

We will control the contribution to the sum of each of these three sets separately. If \(y\in A\) then \( \langle x \rangle /2 \le \langle x -y \rangle \le 3\langle x \rangle \), so that

$$\begin{aligned} \sum _{y \in A} \langle x-y\rangle ^{-2} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k&\asymp \langle x \rangle ^{-2} \sum _{y \in A} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k \asymp \langle x \rangle ^{-2} \sum _{r=1}^{2 \langle x \rangle } r^{-1} [k+\log r]^k. \end{aligned}$$

The sum on the right hand side can be bounded with a little calculus: We have the integral identity

$$\begin{aligned} \int _1^s t^{-1}(k+ \log t)^k \text {d}t = \frac{(k+\log s)^{k+1}}{k+1} - \frac{k^{k+1}}{k+1} \end{aligned}$$

for every \(s \ge 1\), and since the function \(t^{-1}(k+\log t)^k\) is decreasing when \(t\ge 1\) (as can be seen by computing the derivative to be \(-t^{-2} (k+\log t)^{k-1} \log t\)), we have that

$$\begin{aligned} \sum _{r=1}^{2\langle x \rangle } r^{-1}[k+\log r]^k \le k^k + \int _1^{2\langle x \rangle } t^{-1}[k+\log t]^k \text {d}t = \frac{[k+\log 2\langle x \rangle ]^{k+1}}{k+1} + \frac{k^k}{k+1}\\ \le \frac{2}{k+1} [k+1+\log \langle x \rangle ]^{k+1} \end{aligned}$$

and hence that

$$\begin{aligned} \sum _{y \in A} \langle x-y\rangle ^{-2} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k \preceq \frac{\langle x \rangle ^{-2}}{k+1}[k+1+\log \langle x \rangle ]^{k+1} \end{aligned}$$
(6.1)

as required.
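As a quick numerical sanity check on the integral identity used above (not part of the argument; it assumes SciPy is available), one can compare the closed form against numerical quadrature for a few values of k and s:

```python
import math
from scipy.integrate import quad

def closed_form(k, s):
    # Right-hand side of the integral identity in the proof of Lemma 6.4.
    return ((k + math.log(s)) ** (k + 1) - k ** (k + 1)) / (k + 1)

for k in (1, 3, 7):
    for s in (2.0, 50.0, 1e4):
        numeric, _ = quad(lambda t: (k + math.log(t)) ** k / t, 1, s)
        assert math.isclose(numeric, closed_form(k, s), rel_tol=1e-6)
print("integral identity confirmed on sample (k, s) pairs")
```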

It remains to upper bound the contributions from B and C. If \(y \in B\) then \(d(0,x)/2 \le d(0,y) \le 2d(0,x)\) and we have that

$$\begin{aligned}&\sum _{y \in B} \langle x -y \rangle ^{-2} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k \asymp \langle x \rangle ^{-4} [k+\log 2\langle x \rangle ]^k \sum _{y \in B} \langle x -y \rangle ^{-2} \nonumber \\&\quad \asymp \langle x \rangle ^{-4} [k+1+\log \langle x \rangle ]^k \sum _{r=1}^{2\langle x \rangle } r \nonumber \\&\quad \asymp \langle x \rangle ^{-2}[k+1+\log \langle x \rangle ]^k \le \frac{\langle x \rangle ^{-2}}{k+1}[k+1+\log \langle x \rangle ]^{k+1} \end{aligned}$$
(6.2)

as required.

Finally, if \(y \in C\) then \(d(x,y) \ge d(0,y)-d(0,x) \ge d(0,y)/2\) and \(d(0,y) > d(0,x)\), so that

$$\begin{aligned} \sum _{y \in C} \langle x-y \rangle ^{-2} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k \asymp \sum _{y \in C} \langle y \rangle ^{-6} [k+\log \langle y \rangle ]^k. \end{aligned}$$
(6.3)

Up to constants, there are \(2^{4n}\) choices for y with \(2^n \le \langle y \rangle < 2^{n+1}\). For each such y we have \(\langle y \rangle ^{-6} [k+\log \langle y \rangle ]^k \asymp 2^{-6n} (k+n\log 2)^k\), so the total contribution from all such y’s is (up to constants) \(2^{-2n} (k+n\log 2)^k\). Thus

$$\begin{aligned} \sum _{y \in C} \langle x-y \rangle ^{-2} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k \asymp \sum _{2^n > \langle x \rangle } 2^{-2n} (k+n\log 2)^k. \end{aligned}$$
(6.4)

The ratio of consecutive terms in this sum is

$$\begin{aligned} \frac{2^{-2(n+1)} (k+(n+1)\log 2)^k}{2^{-2n} (k+n\log 2)^k} \le \frac{1}{4} \left( 1+\frac{1}{k+n\log 2}\right) ^k \le \frac{e}{4}. \end{aligned}$$

Since \(e/4 < 1\), it follows that the sum on the right of (6.4) is of the same order as its first term, and we deduce that

$$\begin{aligned} \sum _{y \in C} \langle x-y \rangle ^{-2} \langle y \rangle ^{-4} [k+\log \langle y \rangle ]^k \asymp \langle x \rangle ^{-2} [k+ \log 2\langle x \rangle ]^k \preceq \frac{\langle x \rangle ^{-2}}{k+1} [k+1+\log \langle x \rangle ]^{k+1} \end{aligned}$$
(6.5)

This is also of the required order, completing the proof. \(\square \)
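The geometric domination used in the last step can also be checked numerically (a sanity check only, using nothing beyond the standard library): since the ratio of consecutive terms is at most \(e/4<1\), the whole sum is at most \((1-e/4)^{-1}\) times its first term.

```python
import math

def tail_sum(k, n0, terms=400):
    return sum(2 ** (-2 * n) * (k + n * math.log(2)) ** k
               for n in range(n0, n0 + terms))

for k in (1, 4, 10):
    for n0 in (1, 5, 20):
        first = 2 ** (-2 * n0) * (k + n0 * math.log(2)) ** k
        # ratio of consecutive terms is at most e/4 < 1, so the whole
        # sum is at most first / (1 - e/4)
        assert first <= tail_sum(k, n0) <= first / (1 - math.e / 4)
print("shell sum is comparable to its first term on all samples")
```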

Proof of Proposition 6.3

We begin with the upper bound. Let \(C_1\ge 1\) be a constant such that \({\tilde{\mathbf {G}}}(0,x) \le C_1 \langle x \rangle ^{-2}\) for every \(x\in \mathbb Z^4\), let \(C_2 \ge 1\) be the constant from Lemma 6.4, and let \(\lambda = C_1^2 C_2 [1 \vee {\tilde{\mathbf {G}}}(0,0)^{-1}]\). We prove by induction on k that

$$\begin{aligned} M_k(x) \le C_1 \lambda ^{k-1} \langle x \rangle ^{-2} [k-1+\log \langle x \rangle ]^{k-1} \end{aligned}$$
(6.6)

for every \(k\ge 1\) and \(x\in \mathbb Z^4\). The base case \(k=1\) is trivial. For \(k\ge 2\), we may apply Lemma 3.2 and the induction hypothesis to obtain that

$$\begin{aligned}&M_k(x) \le [1 \vee {\tilde{\mathbf {G}}}(0,0)^{-1}]C_1^3 \lambda ^{k-2} \\&\quad \cdot \max \Biggl \{ \sum _{y \in \mathbb Z^4} \langle y \rangle ^{-4}\langle x -y \rangle ^{-2}[k-r-1+\log \langle y \rangle ]^{k-r-1}[r-1+\log \langle y \rangle ]^{r-1} : 1 \le r \le k-1 \Biggr \} \end{aligned}$$

and hence, using that \([k-r-1+\log \langle y \rangle ]^{k-r-1}[r-1+\log \langle y \rangle ]^{r-1} \le [k-2+\log \langle y \rangle ]^{k-2}\) for every \(1 \le r \le k-1\), that

$$\begin{aligned} M_k(x)\le & {} [1 \vee {\tilde{\mathbf {G}}}(0,0)^{-1}]C_1^3 \lambda ^{k-2} \sum _{y \in \mathbb Z^4} \langle y \rangle ^{-4}\langle x -y \rangle ^{-2}[k-2+\log \langle y \rangle ]^{k-2} \\&\le C_1 \lambda ^{k-1} \langle x \rangle ^{-2}[k-1+\log \langle x \rangle ]^{k-1} \end{aligned}$$

as desired, where we applied Lemma 6.4 in the second line. As in the proof of Proposition 5.1, it follows from (6.6) together with Lemmas 3.1 and 5.2 that there exists a constant \(C_3\) such that

$$\begin{aligned} \mathbb E_{\mu ,0}[L(x)^k] \le C_3^k k! [k-1+\log \langle x \rangle ]^{k-1} \langle x \rangle ^{-2} \end{aligned}$$
(6.7)

for every \(x\in \mathbb Z^4\) and \(k\ge 1\) as claimed.

We now turn to the lower bound. We first prove the bound for k of the form \(2^\ell \) for some natural number \(\ell \ge 1\). For each \(\ell \ge 1\), let \(k=2^\ell \) and let \(T=T_\ell \) be the rooted plane tree with boundary in which the root has degree 1, the root’s child together with its descendants forms a complete binary tree of height \(\ell \), and \(\partial V(T_\ell )\) is equal to the set of leaves of \(T\). Let \(\rho \) be the root of \(T_\ell \), let \(v_0\) be the child of the root, and for each vertex v of \(T_\ell \) other than \(\rho \), let \(\sigma (v)\) denote the parent of v in T. There are k! ways to label the non-root leaves of T with the labels \(\{1,\ldots ,k\}\), and each such labelling yields a distinct k-skeleton. Let \(S=S_\ell \) be one such labelled k-skeleton. Applying Lemma 3.1, we have by symmetry that

$$\begin{aligned} \mathbb E_{\mu ,0}[L(x)^k] = \mathbb E_{\mu ,x}[L(0)^k]\ge k! (b_2)^{k-1} \mathbf {D}(x,0,\ldots ,0;S). \end{aligned}$$
(6.8)

(Recall that \(b_2\) is the second descending moment of the offspring distribution, which is positive since \(\mu \) is critical and non-trivial.)

Consider the set \(\Phi \) of functions \(\phi : V^\circ (T) \rightarrow \{0,1,\ldots ,k \vee \lceil \log _2 \langle x \rangle \rceil \}\) that are decreasing along each branch of the tree, i.e. such that \(\phi (\sigma (v))\ge \phi (v)\) for each \(v\in V^\circ (T) \setminus \{v_0\}\). For each \(\phi \in \Phi \), we define \(\Lambda (\phi )\) to be the set of functions \(f: V^\circ (T) \rightarrow \mathbb Z^4\) such that \(2^{\phi (v)} \le d(0, f(v)) < 2^{\phi (v)+1}\) for every \(v\in V^\circ (T)\). Note that the sets \(\Lambda (\phi )\) and \(\Lambda (\psi )\) are disjoint whenever \(\phi ,\psi \in \Phi \) are distinct. Moreover, if \(\phi \in \Phi \) and \(f\in \Lambda (\phi )\) then \(d(f(v),f(\sigma (v))) \le 2 \max \{d(0,f(v)),d(0,f(\sigma (v)))\}\), so that \(\langle f(v)-f(\sigma (v)) \rangle \preceq \langle f(\sigma (v)) \rangle \asymp 2^{\phi (\sigma (v))}\) for every \(v\in V^\circ (T) \setminus \{v_0\}\). Similarly, we necessarily have that \(\langle x - f(v_0) \rangle \preceq 2^{k \vee \log _2 \langle x \rangle } \le 2^k \langle x \rangle \). Thus, we obtain from the definitions that there exists a positive constant \(c_1\) such that

$$\begin{aligned} \mathbf {D}(x,0,\ldots ,0;S)&\ge c_1^k \langle x \rangle ^{-2} \sum _{\phi \in \Phi } |\Lambda (\phi )| \prod _{v\in V^\circ (T)}2^{-4\phi (v)}. \end{aligned}$$

Next observe that there exists a positive constant \(c_2\) such that

$$\begin{aligned} |\Lambda (\phi )| = \prod _{v\in V^\circ (T)} |\{y \in \mathbb Z^4 : 2^{\phi (v)} \le d(0,y) < 2^{\phi (v)+1}\}| \ge c_2^k \prod _{v\in V^\circ (T)} 2^{4 \phi (v)}, \end{aligned}$$

so that there exists a positive constant \(c_3\) such that

$$\begin{aligned} \mathbf {D}(x,0,\ldots ,0;S) \ge c_3^k \langle x \rangle ^{-2} |\Phi |. \end{aligned}$$

It remains to estimate \(|\Phi |\). Let \(E_i\) be the set of edges of T connecting vertices at distance i from the root to the children of these vertices, so that \(|E_i|=2^i\) for \(0 \le i \le \ell \), and let \(E' = \bigcup _{i=0}^{\ell -1} E_i\). Let \(\Psi \) be the set of functions \(\psi : E' \rightarrow \{0,\ldots ,k \vee \lceil \log _2 \langle x \rangle \rceil \}\) such that if \(e\in E_i\) then \(\psi (e) \le 2^{i-\ell }(k \vee \lceil \log _2 \langle x \rangle \rceil )\). We clearly have that

$$\begin{aligned}&|\Psi | = \prod _{m=1}^{\ell } \left\lfloor \frac{k \vee \lceil \log _2 \langle x \rangle \rceil }{2^m}+1\right\rfloor ^{2^{\ell -m}} \ge \prod _{m=1}^{\ell } \left[ \frac{k \vee \log _2 \langle x \rangle }{2^m}\right] ^{2^{\ell -m}} \\&\qquad = [k \vee \log _2 \langle x \rangle ]^{2^{\ell }-1} 2^{-\sum _{m=1}^{\ell } m2^{\ell -m}}. \end{aligned}$$

We claim that there is an injection \(\Psi \rightarrow \Phi \). Given \(\psi \in \Psi \), let \(\phi \) be defined recursively by \(\phi (v_0)=k \vee \lceil \log _2 \langle x \rangle \rceil - \psi (\{\rho ,v_0\})\) and \(\phi (v) = \phi (\sigma (v))-\psi (\{v,\sigma (v)\})\) for every \(v\in V^\circ (T) \setminus \{v_0\}\). The function \(\phi \) is indeed an element of \(\Phi \): it is decreasing along each branch by construction, and \(\phi (v) \ge (k \vee \lceil \log _2 \langle x \rangle \rceil ) - \sum _{i=0}^{\ell -1} 2^{i-\ell } (k \vee \lceil \log _2 \langle x \rangle \rceil ) = 2^{-\ell }(k \vee \lceil \log _2 \langle x \rangle \rceil ) \ge 0\) for every \(v\in V^\circ (T)\). Moreover, distinct elements of \(\Psi \) lead to distinct elements of \(\Phi \) under this assignment, since \(\psi \) can be recovered from \(\phi \) by taking differences along the edges of \(E'\), as claimed. We deduce that

$$\begin{aligned} |\Phi | \ge |\Psi | \ge [k \vee \log _2 \langle x \rangle ]^{k-1} 2^{-\sum _{m=1}^{\ell } m2^{\ell -m}} \ge c_4^k [k \vee \log _2 \langle x \rangle ]^{k-1} \end{aligned}$$

where \(c_4=2^{-\sum _{m=1}^\infty m 2^{-m}}>0\). It follows that there exists a constant \(c_5>0\) such that

$$\begin{aligned} \mathbf {D}(x,0,\ldots ,0;S_\ell ) \ge c_5^k [k + \log _2 \langle x \rangle ]^{k-1} \langle x \rangle ^{-2} \end{aligned}$$
(6.9)

for every \(k=2^\ell \ge 2\) and \(x \in \mathbb Z^4\). Putting together (6.8) and (6.9), we obtain that there exists a constant \(c_6>0\) such that

$$\begin{aligned} \mathbb E_{\mu ,0}[L(x)^k] \ge c_6^k k! \left[ k + \log \langle x \rangle \right] ^{k-1} \langle x \rangle ^{-2} \end{aligned}$$
(6.10)

for every \(x\in \mathbb Z^4\) and every \(k=2^{\ell }\) with \(\ell \ge 1\).
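The counting at the heart of this argument can be checked by brute force for small trees. The sketch below (standard library only; a sanity check, not a proof) enumerates the decreasing vertex-labellings counted by \(\Phi \) and the edge-labellings counted by \(\Psi \) for the height-\(\ell \) binary skeleton and verifies \(|\Phi | \ge |\Psi |\), as the injection guarantees:

```python
import itertools

def count_phi(ell, K):
    """|Phi| for the height-ell binary skeleton: labellings of the internal
    vertices (v_0 and its non-leaf descendants, in heap order) by values in
    {0, ..., K} that are decreasing along each branch."""
    n_internal = 2 ** ell - 1
    total = 0
    for phi in itertools.product(range(K + 1), repeat=n_internal):
        # vertex v has parent (v - 1) // 2 in heap order
        if all(phi[(v - 1) // 2] >= phi[v] for v in range(1, n_internal)):
            total += 1
    return total

def count_psi(ell, K):
    """|Psi|: labellings of the non-leaf edges, where an edge at distance i
    from the root may take any value in {0, ..., floor(2^{i-ell} K)}."""
    total = 1
    for i in range(ell):  # E_0, ..., E_{ell-1}; |E_i| = 2^i
        total *= (int(2 ** (i - ell) * K) + 1) ** (2 ** i)
    return total

for ell in (1, 2):
    for K in (4, 8, 16):
        assert count_psi(ell, K) <= count_phi(ell, K)
print("|Psi| <= |Phi| confirmed for ell in {1, 2} and K in {4, 8, 16}")
```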

To obtain the lower bound for values of k that are not powers of 2, we interpolate using log-convexity. By Cauchy-Schwarz, for any random variable \(X\ge 0\) and any \(a\ge i\ge 0\) we have \(\left( \mathbb EX^a\right) ^2 \le \left( \mathbb EX^{a-i}\right) \left( \mathbb EX^{a+i}\right) \), so that the moments \(\mathbb EX^n\) form a log-convex sequence. Since we have the claimed upper bound for every k and the lower bound for powers of 2, the lower bound follows for all k. More precisely, let \(a\in [k,2k]\) be a power of 2 (such a power always exists, namely \(2^{\lceil \log _2 k \rceil }\)), and let \(b=2a-k\). Log-convexity gives

$$\begin{aligned} \mathbb E_{\mu ,0}\left[ L(x)^k\right] \ge \frac{\mathbb E_{\mu ,0}[L(x)^a]^2}{\mathbb E_{\mu ,0}[L(x)^{b}]}. \end{aligned}$$

Applying (6.10) to control the numerator and (6.7) to control the denominator yields the lower bound for arbitrary k. \(\square \)
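The log-convexity step is a direct instance of Cauchy-Schwarz and can be illustrated numerically (a sanity check assuming NumPy; any non-negative sample would do). Note that the inequality holds exactly for the empirical measure, not merely in the large-sample limit, since Cauchy-Schwarz applies to any measure.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(size=10**6)  # any non-negative sample works here
for a in (2, 3, 5):
    for i in range(a + 1):
        lhs = np.mean(x ** a) ** 2
        rhs = np.mean(x ** (a - i)) * np.mean(x ** (a + i))
        # Cauchy-Schwarz holds exactly for the empirical measure
        assert lhs <= rhs * (1 + 1e-12)
print("empirical moment sequence is log-convex, as used in the interpolation")
```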

Proof of Proposition 6.1

By Zhu’s Theorem, it suffices to prove that

$$\begin{aligned} \mathbb P_{\mu ,0}(L(x) \ge n \mid L(x)>0) = \exp \left[ -\Theta \left( \min \left\{ \sqrt{n}, \frac{n}{\log \langle x \rangle }\right\} \right) \right] \end{aligned}$$

for every \(x\in \mathbb Z^4\) and \(n \ge 1\). Moreover, Zhu’s Theorem and Proposition 6.3 imply that there exist positive constants \(c_1\) and \(C_1\) such that

$$\begin{aligned} c_1^k e^{k \log k} [k \vee \log \langle x \rangle ]^{k-1} \log \langle x \rangle \le \mathbb E_{\mu ,0}\left[ L(x)^k \big | L(x)>0 \right] \le C_1^k e^{k\log k} [k \vee \log \langle x \rangle ]^{k-1} \log \langle x \rangle \end{aligned}$$

for every \(x\in \mathbb Z^4\) and \(k\ge 1\), and hence that there exist positive constants \(c_2\) and \(C_2\ge 1\) such that

$$\begin{aligned} c_2^k e^{k \log k} [k \vee \log \langle x \rangle ]^{k} \le \mathbb E_{\mu ,0}\left[ L(x)^k \mid L(x)>0 \right] \le C_2^k e^{k\log k} [k \vee \log \langle x \rangle ]^{k} \end{aligned}$$
(6.11)

for every \(x\in \mathbb Z^4\) and \(k\ge 1\).
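To spell out where the first of these displays comes from (a routine verification): by Zhu’s hitting estimate \(\mathbb P_{\mu ,0}(L(x)>0) \asymp \langle x \rangle ^{-2} \log ^{-1} \langle x \rangle \) in four dimensions, dividing the two-sided bound of Proposition 6.3 by this probability gives

$$\begin{aligned} \mathbb E_{\mu ,0}\left[ L(x)^k \mid L(x)>0 \right] = \frac{\mathbb E_{\mu ,0}\left[ L(x)^k \right] }{\mathbb P_{\mu ,0}\left( L(x)>0 \right) } = \Theta (1)^k k! \left[ k+\log \langle x \rangle \right] ^{k-1} \log \langle x \rangle , \end{aligned}$$

where \(\Theta (1)^k\) denotes a factor between \(c^k\) and \(C^k\); one then writes \(k! = e^{k \log k - k + O(\log k)}\) and \([k+\log \langle x \rangle ]^{k-1} = e^{O(k)} [k \vee \log \langle x \rangle ]^{k-1}\), absorbing all \(e^{O(k)}\) factors into the constants \(c_1^k\) and \(C_1^k\).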

For the upper bound, we apply (6.11), Jensen’s inequality in the form \(\mathbb E_{\mu ,0}[L(x)^{k/2} \mid L(x)>0] \le \mathbb E_{\mu ,0}[L(x)^{k} \mid L(x)>0]^{1/2}\), and Stirling’s approximation to obtain that there exists a constant \(C_3\) such that

$$\begin{aligned}&\mathbb E_{\mu ,0}\left[ \exp \left( \frac{1}{2eC_2} \min \left\{ \frac{L(x)}{\log \langle x \rangle }, \sqrt{L(x)} \right\} \right) \mid L(x)>0\right] \\&\quad = \sum _{k=0}^\infty \frac{(2eC_2)^{-k}}{k!} \mathbb E_{\mu ,0} \left[ \min \left\{ \frac{L(x)}{\log \langle x \rangle }, \sqrt{L(x)} \right\} ^k\mid L(x)>0\right] \\&\quad \le \sum _{k=0}^\infty \frac{(2eC_2)^{-k}}{k!} \min \left\{ \mathbb E_{\mu ,0} \left[ \frac{L(x)^k}{\log ^k \langle x \rangle } \mid L(x)>0 \right] , \mathbb E_{\mu ,0}\left[ L(x)^{k/2} \mid L(x)>0\right] \right\} \\&\quad \le \sum _{k=0}^{\lfloor \log \langle x \rangle \rfloor } \frac{(2e)^{-k}}{k!} e^{k\log k} + \sum _{k=1+\lfloor \log \langle x \rangle \rfloor }^\infty \frac{e^{-k}(2C_2)^{-k/2}}{k!} e^{k \log k} \le C_3. \end{aligned}$$

Thus, it follows by Markov’s inequality that

$$\begin{aligned} \mathbb P\left( L(x)\ge n \mid L(x)>0\right) \le C_3\exp \left( -\frac{1}{2eC_2} \min \left\{ \frac{n}{\log \langle x \rangle }, \sqrt{n} \right\} \right) \end{aligned}$$

for every \(x\in \mathbb Z^4\) and \(n\ge 1\) as required.
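The convergence of both series above rests on the elementary Stirling bound \(k! \ge (k/e)^k\), equivalently \(e^{k\log k}/k! \le e^k\), which makes each summand geometrically small; a two-line check (standard library only):

```python
import math

# Stirling lower bound k! >= (k/e)^k, i.e. e^{k log k}/k! <= e^k: this is
# what makes both series in the exponential-moment bound geometrically small.
for k in range(1, 60):
    assert k ** k / math.factorial(k) <= math.e ** k
print("e^{k log k}/k! <= e^k verified for k = 1, ..., 59")
```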

For the lower bound, we apply the Paley-Zygmund inequality to the conditional law of L(x) given \(L(x)>0\) to obtain that there exists a positive constant \(c_3\) such that

$$\begin{aligned}&\mathbb P_{\mu ,0}\left( L(x)^k \ge \frac{1}{2}c_2^k e^{k \log k}[k \vee \log \langle x \rangle ]^k \;\Big |\; L(x)>0 \right) \\&\quad \ge \mathbb P_{\mu ,0}\left( L(x)^k \ge \frac{1}{2}\mathbb E_{\mu ,0}\left[ L(x)^k \mid L(x)>0\right] \;\Big |\; L(x)>0 \right) \\&\quad \ge \frac{1}{4} \mathbb E_{\mu ,0}\left[ L(x)^{2k} \mid L(x)>0\right] ^{-1}\mathbb E_{\mu ,0}\left[ L(x)^k \mid L(x)>0\right] ^2 \\&\quad \ge \frac{1}{4} \frac{c_2^{2k} e^{2 k \log k}[k \vee \log \langle x \rangle ]^{2k}}{C_2^{2k} e^{2k \log 2k}[2 k \vee \log \langle x \rangle ]^{2k}} \ge c_3^k \end{aligned}$$

for every \(k \ge 1\) and \(x\in \mathbb Z^4\), where the final inequality uses \(e^{2k \log 2k} = 4^k e^{2k \log k}\) and \([2k \vee \log \langle x \rangle ]^{2k} \le 4^k [k \vee \log \langle x \rangle ]^{2k}\). Hence there exists a positive constant \(c_4\) such that

$$\begin{aligned} \mathbb P_{\mu ,0}\left( L(x) \ge c_4 k [k \vee \log \langle x \rangle ] \;\Big |\; L(x)>0 \right) \ge c_3^k \end{aligned}$$

for every \(k\ge 1\) and \(x\in \mathbb Z^4\). The claimed lower bound follows by taking \(k = \min \left\{ \left\lceil \sqrt{n/c_4} \right\rceil , \left\lceil n/(c_4 \log \langle x \rangle )\right\rceil \right\} \): if the first term attains the minimum then \(c_4 k [k \vee \log \langle x \rangle ] \ge c_4 k^2 \ge n\), and if the second does then \(c_4 k [k \vee \log \langle x \rangle ] \ge c_4 k \log \langle x \rangle \ge n\), so that \(\mathbb P_{\mu ,0}(L(x) \ge n \mid L(x)>0) \ge c_3^k = \exp \left[ -O\left( \min \left\{ \sqrt{n}, n/\log \langle x \rangle \right\} + 1\right) \right] \) as required. \(\square \)