1 Introduction

Many processes in physics and the natural sciences can be described with the help of random walks and their limit processes, the so-called diffusion processes. A possible philosophical explanation of this experimentally observed phenomenon is that the limit of random walks reflects the microscopic nature of the situation: even fully deterministic microscopic systems can give rise to erratic, seemingly random motions, practically indistinguishable from those produced by a stochastic process.

Let us recall one of the first constructions of a random walk which is due to K. Pearson in 1905 [33]. A more physically motivated approach is in the paper [13] of Einstein from the same year.

We start from a point \(p\in {\mathbb {R}}^2\), choose a random direction in the tangent space at p, move a distance 1 along the straight line in this direction, and then repeat the procedure iteratively. We obtain a stochastic process whose trajectories are piecewise linear curves, see Fig. 1.

Fig. 1: 50 steps of a Pearson random walk

It is natural to renormalize this process as follows: we assume that the steps have length \(1/\sqrt{N}\) instead of 1, and we take N steps in one unit of time. If the procedure of choosing the random direction is invariant with respect to the isometry group of the flat \({\mathbb {R}}^2\), which was the case in [13, 33], then by the Functional Central Limit Theorem, the limit of this sequence of processes as \(N\rightarrow \infty \) exists and is the (flat) Brownian motion, see [8, Chapter 2].
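The renormalization can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes the direction is uniform on the unit circle, and the function names are hypothetical. Since the steps are independent with mean zero, the mean square displacement at time 1 equals \(N\cdot (1/\sqrt{N})^2=1\) for every N, which the Monte Carlo estimate should reproduce approximately.

```python
import math
import random


def pearson_endpoint(n_steps, step_length, rng):
    """Endpoint of a planar Pearson walk started at the origin."""
    x = y = 0.0
    for _ in range(n_steps):
        angle = rng.uniform(0.0, 2.0 * math.pi)  # isotropic random direction
        x += step_length * math.cos(angle)
        y += step_length * math.sin(angle)
    return x, y


def mean_square_displacement(N, n_samples=2000, seed=0):
    """Monte Carlo estimate of E|X|^2 at time 1: N steps of length 1/sqrt(N)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x, y = pearson_endpoint(N, 1.0 / math.sqrt(N), rng)
        total += x * x + y * y
    return total / n_samples


if __name__ == "__main__":
    for N in (1, 10, 100):
        print(N, mean_square_displacement(N))
```

The estimate stays close to 1 as N grows, which is the scaling under which a non-trivial limit process can be expected.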

We see that in order to define such a random walk, one needs two ingredients: the rule of choosing a random direction at a current position p (i.e. a probability distribution \(\nu _p\) on the space of tangent vectors at the point p) and an analogue of the notion of a straight line, which describes the motion of a small particle with no external forces acting upon it.

In many systems in physics and natural sciences, small particles with no external forces acting upon them move along geodesics of a Finsler metric. We give necessary definitions in Sect. 2.1. Recall that geodesics are smooth curves and, similar to the straight lines, the initial point and the initial velocity vector determine the geodesic. The above definition of the random walk is immediately generalized to this case. Indeed, starting from a point p of a Finsler manifold \((M,F)\) such that every tangent space \(T_pM\) is equipped with a probability measure \(\nu _p\), choose a random vector v in the tangent space, go the distance \(F(v)/\sqrt{N}\) along the geodesic starting at p with the initial velocity v and then repeat the procedure (if \(\nu _p\) is not centered we rescale it as in Sect. 2.2). We obtain a stochastic process whose trajectories are piecewise geodesic curves (i.e. they are glued together from geodesic segments).

The present paper studies such geodesic random walks on Finsler manifolds and their limit diffusion processes, and concentrates on the fundamental question of the existence and uniqueness of the limit process. Our main result is that under assumptions natural from the viewpoint of Finsler geometry, the limit process exists and is unique. Moreover, it is a diffusion process whose generator is a non-degenerate elliptic second-order partial differential operator for which we give a precise formula.

The Riemannian version of our result (recall that Riemannian metrics are Finsler metrics) was obtained, e.g. by Jørgensen [22].

Geodesic random walks on Finsler manifolds and their limit processes are of course natural topics from the viewpoint of both differential geometry and the theory of stochastic processes. They may also be of applied interest since Finsler manifolds are used to model different physical situations with anisotropies at an infinitesimal level, see e.g. [1, 2, 10, 12, 17, 20, 28, 34, 45], and may also be used for certain models in information geometry, see, e.g. [40].

Although Brownian motions and diffusion processes on Finsler manifolds were discussed in the literature (see, e.g. the books [4, 44]), the very basic question of the existence and uniqueness of the limit process for Finsler geodesic random walks has not been rigorously treated.

More precisely, the work [44] on Finsler Brownian motions goes in the other direction: It starts from a stochastic differential equation which is constructed from a Finsler metric F, a volume form \(\mu \), and extra data \(u_0\in H^1_0(M)\) on M. It is easy to see that for a generic Finsler metric, solutions of this stochastic differential equation do not correspond to a limit process of a sequence of geodesic random walks.

This general approach, in which one starts with an elliptic differential operator (or a Dirichlet form) in order to construct a diffusion process, is a very popular and powerful approach to diffusion processes on metric spaces. In particular, it allows one to treat the case of non-smooth background metric structures, see e.g. [18, 27, 42]. However, this approach does not ensure that the resulting stochastic process is the limit process of a sequence of random walks. If the background is almost Riemannian (say, Alexandrov with bounded curvature, as in [18] and [27]), the best one can do is to relate random walks on the Riemannian spaces approximating our metric space to the diffusion process on our metric space. These results cannot be applied in the Finslerian situation, since Finsler metrics cannot be approximated by Riemannian metrics. Our results may make it possible to extend this group of methods to the Finslerian situation, and we plan to do this in future work.

Let us now discuss the corresponding results of the book [4], where many different approaches to constructing different non-equivalent diffusion processes (on the manifold or on the tangent bundle to the manifold) from a Finsler metric are suggested. One of these approaches (see [4, §A2]) is seemingly close to ours and considers the limit processes of Finsler geodesic random walks (in their case, the distribution \(\nu _p\) is quite special and is canonically constructed from the Finsler metric). Unfortunately, no rigorous proof of convergence is given: it is merely claimed that the limit process exists and is unique, and the reader is referred to [35, 36] for methods and technical details.

The references [35, 36] are mostly survey papers about geodesic random walks on Riemannian manifolds. The methods discussed there assume and rely on the special form of the probability measure \(\nu _p\) on tangent spaces. Moreover, it is assumed that the Riemannian manifold is stochastically complete. The property of stochastic completeness is a nontrivial property, and examples show that not all complete manifolds are stochastically complete. In the Riemannian case, there is a number of criteria of stochastic completeness, see e.g. [19, 46]. In particular, if the Ricci curvature of a complete Riemannian manifold is bounded from below, the manifold is stochastically complete. In the Finslerian situation, we did not find any relevant works on stochastic completeness and the claim of [4, Sect. A2] that the methods of [35, 36] can easily be applied in the Finslerian situation looks overoptimistic.

Note that as a by-product, we prove that every complete Finsler manifold of bounded geometry (see Definition 2.1) is stochastically complete; that is, the limit process of Finsler geodesic random walks is stochastically complete in the sense of [21, Sect. 4.2]. It is interesting to try to relax the assumption of bounded geometry in this statement, and we plan to do this in future works.

A very successful approach to geodesic random walks and diffusion processes on Riemannian manifolds, which allows essential freedom in the choice of the probability measures \(\nu _p\), is in [22]. Many arguments in [22] are based on the following property which holds in the Riemannian but not in the Finslerian case: Consider an arc-length parametrized geodesic segment \(\gamma :[0,\varepsilon ]\rightarrow M\) of a (smooth) Riemannian metric. Take a vector \(v\in T_{\gamma (0)}M\) of length one and its parallel transport \(v_\varepsilon \in T_{\gamma (\varepsilon )}M\) along the geodesic segment. Next, consider the arc-length parametrized geodesics \(\gamma _{v}\) and \(\gamma _{v_\varepsilon }\) which start from \(\gamma (0)\) and \(\gamma (\varepsilon )\) with the initial vectors v and \(v_\varepsilon \), respectively. Then the distance between \(\gamma _{v}(t)\) and \(\gamma _{v_\varepsilon }(t)\) behaves, for \(\varepsilon \rightarrow 0\) and \(t\rightarrow 0\), as \(\varepsilon (1+C t^2)\). In the Euclidean case, the distance does not depend on t at all and is equal to \(\varepsilon \). In the Finslerian situation, this property does not hold for a generic metric and a straightforward generalization of [22] is not possible.

In this paper, we prove that under the assumptions natural from the viewpoint of Finsler geometry (everything is smooth, the manifold is complete and has bounded geometry), the sequence of geodesic random walks converges to a unique diffusion process, see Theorem 2.1. Moreover, we show that the generator of this diffusion process is an elliptic operator, and give an integral formula for its coefficients.

As explained above, the generator of the limit diffusion process is a non-degenerate elliptic operator. If the probability measure \(\nu _p\) on each \(T_pM\) is constructed from \(F_{|T_pM}\) (we give examples in Sect. 2.3.2), then this elliptic operator is a natural candidate for a Beltrami–Laplace operator of the Finsler metric. Note that, unlike in the Riemannian case, there exist many different Finslerian analogues of the Beltrami–Laplace operator. We refer to [3], where many different constructions of the Riemannian Beltrami–Laplace operator are mimicked in the Finslerian setting. In the Riemannian case, they all give the same Beltrami–Laplace operator. In the Finslerian case, one obtains different operators. Most operators in [3] are linear but there also exist nonlinear versions of the Finslerian Beltrami–Laplace operators, see, e.g. [32, 37].

An interesting by-product of our result is that the generator of the limit diffusion process corresponding to Finsler geodesic random walks coincides, up to first-order terms (the so-called “drift”), with that of a Riemannian Brownian motion. This result explains why it is hard or even impossible to experimentally distinguish a diffusion process coming from a Riemannian metric from one coming from a Finsler metric. See Sect. 2.3.1 for more details.

Naturally, the topic of this paper, and therefore also the methods of the proof, belong both to differential geometry and to the theory of stochastic processes. The group of methods coming from stochastic processes is actually standard for this type of problem (though nontrivial) and was understood at least in the 1970s–1980s. The novelty which made it possible to solve this natural and actively attacked problem came from Finsler geometry, and the key lemma is Lemma 4.5, whose proof uses a nontrivial and not widely known result of [38, Sect. 15].

2 Setting and Results

2.1 Finsler Manifolds

First, we recall the basic definitions in Finsler geometry. Let \(M{:}{=}M^m\) be an m-dimensional manifold, \(m\ge 1\). Suppose that \((x^1,\dots ,x^m)\) is a local coordinate system at some \(p\in M\). Then \(y^i=\mathrm {d}x^i\) induces a local coordinate system \((x^1,\dots ,x^m,y^1,\dots ,y^m)\) on TM. For simplicity, for a function \(H:TM\rightarrow {\mathbb {R}}\), we use the notations \(H_{x^i}=\partial _{x^i}H\) and \(H_{y^i}=\partial _{y^i}H\).

A smooth Finsler manifold \((M,F)\) is a smooth manifold M together with a continuous function \(F:TM\rightarrow {\mathbb {R}}_{\ge 0}\) called the Finsler metric (Finsler function) satisfying the following conditions:

  • Regularity: The function F is smooth on \(TM\setminus \lbrace 0\rbrace \).

  • Positive Homogeneity: For any \((x,y)\in T_xM\) and \(\lambda \ge 0\), we have \(F(x,\lambda y)=\lambda F(x,y)\).

  • Strong Convexity: For \(0\ne (x,y)\in T_xM\), the fundamental tensor defined by

    $$\begin{aligned}{}[g_{(x,y)}]_{ij}{:}{=}\left( \dfrac{1}{2} F^2 \right) _{y^iy^j} \end{aligned}$$
    (2.1)

is positive definite.
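The three conditions can be checked numerically on a single tangent space. The sketch below is only an illustration under stated assumptions: it uses a Randers-type norm \(F(y)=|y|+b(y)\) on \({\mathbb {R}}^2\) with hypothetical coefficients b of Euclidean norm less than 1 (an example not taken from the text above), and approximates the fundamental tensor (2.1) by central finite differences.

```python
import math

B = (0.3, 0.0)  # hypothetical 1-form coefficients; Euclidean norm must be < 1


def F(y):
    """Randers-type norm on one tangent space: Euclidean norm plus a linear term."""
    return math.hypot(y[0], y[1]) + B[0] * y[0] + B[1] * y[1]


def half_F_squared(y):
    return 0.5 * F(y) ** 2


def fundamental_tensor(y, h=1e-4):
    """Approximate g_ij = (F^2 / 2)_{y^i y^j} at y != 0 by central differences."""
    def shifted(i, si, j, sj):
        z = list(y)
        z[i] += si * h
        z[j] += sj * h
        return half_F_squared(z)

    g = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            g[i][j] = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                       - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h * h)
    return g


def is_positive_definite(g):
    """Sylvester criterion for a symmetric 2x2 matrix."""
    return g[0][0] > 0 and g[0][0] * g[1][1] - g[0][1] * g[1][0] > 0
```

For this norm the tensor at \(y=(1,0)\) and at \(y=(-1,0)\) differs, so \(g_{ij}\) genuinely depends on the direction y; by Euler's theorem for homogeneous functions, \(g_{ij}(y)y^iy^j=F^2(y)\) should hold up to discretization error.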

The indicatrix bundle of \((M,F)\) is defined by

$$\begin{aligned} IM=\lbrace Y\in TM:F(Y)=1\rbrace . \end{aligned}$$

For any \(p\in M\), the fibre \(I_pM\) of IM is a convex hypersurface in \(T_pM\) diffeomorphic to \({\mathbb {S}}^{m-1}\).

If \((M,{\mathbf {g}})\) is a Riemannian manifold, one can naturally endow it with a Finsler metric by setting \(F(Y):=\sqrt{{\mathbf {g}}(Y,Y)}\), \(Y\in TM\). Conversely, a Finsler function corresponds to some Riemannian metric \({\mathbf {g}}\) if and only if its fundamental tensor \(g_{ij}\) defined in (2.1) depends only on the \(x^i\)-variables.

The definitions of geodesics and exponential maps can be naturally generalized to the Finslerian situation. A smooth curve \(\gamma :[a,b]\rightarrow M\) is a geodesic, if it is a stationary point of the energy functional

$$\begin{aligned} E[\gamma ]{:}{=}\dfrac{1}{2}\int _a^b F^2(\gamma (t),{\dot{\gamma }}(t))\, \mathrm {d}t \end{aligned}$$
(2.2)

among all piecewise smooth curves starting at \(\gamma (a)\) and ending at \(\gamma (b)\). It is known that for any \(p\in M\) and for any \(Y\in T_pM\), there exists a unique geodesic \(\gamma _Y=\gamma _Y(t)\) such that \(\gamma _Y(0)=p\) and \({\dot{\gamma }}_Y(0)=Y\). We define the exponential map at p to be

$$\begin{aligned} \exp _p:T_pM\ni Y\mapsto \gamma _Y(1)\in M \end{aligned}$$
(2.3)

for all \(Y\in T_pM\) such that \(\gamma _Y(t)\) is defined for \(t\in [0,1]\). We say \((M,F)\) is forward complete if for any \(p\in M\), the exponential map \(\exp _p\) is defined for all \(Y\in T_pM\). The manifold \((M,F)\) is geodesically complete, if each geodesic \(\gamma \) can be extended to a geodesic defined for all \(t\in (-\infty ,\infty )\).

For a piecewise smooth curve \(\gamma :[a,b]\rightarrow M\), its length is defined by

$$\begin{aligned} \mathbf {Length}(\gamma )=\int _a^b F(\gamma (t), {{\dot{\gamma }}}(t))\,\mathrm {d}t. \end{aligned}$$
(2.4)

The Finsler function F defines the following asymmetric and symmetrized distances on M:

$$\begin{aligned} d_a(p,q)&{:}{=}\inf \Big \lbrace \mathbf{Length} (\gamma ):\gamma \text { is a piecewise smooth curve from }p \text { to }q\Big \rbrace , \nonumber \\ d(p,q)&{:}{=}\max \lbrace d_a(p,q),d_a(q,p)\rbrace . \end{aligned}$$
(2.5)

By the Hopf–Rinow theorem for Finsler manifolds (see, e.g. [5, Sect. 6.6]), if \((M,F)\) is forward complete, the metric space \((M,d)\) is complete. For a forward complete \((M,F)\), every closed ball of \((M,d)\) is compact. The manifold M can be naturally endowed with the Borel sigma-algebra that makes it a measure space.

As in the Riemannian case, geodesics of Finsler metrics are locally distance-minimizing (with respect to \(d_a\)) curves. The formula (2.2) ensures that they are parametrized proportionally to the arc-length parameter. Note that, as F is in general not reversible, i.e. \(F(x,y)\not \equiv F(x,-y)\), the distance function \(d_a\) and geodesics are not reversible either.
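The asymmetry of \(d_a\) is easy to see in the simplest flat special case, which is an assumption of this sketch and not the general setting of the paper: if F does not depend on the base point (a Minkowski norm on \({\mathbb {R}}^2\)), straight lines are geodesics and \(d_a(p,q)=F(q-p)\). The Randers-type norm and its coefficients below are hypothetical.

```python
import math

B = (0.3, 0.0)  # hypothetical coefficients of the linear part, norm < 1


def F(y):
    """Non-reversible Minkowski norm: F(y) != F(-y) in general."""
    return math.hypot(y[0], y[1]) + B[0] * y[0] + B[1] * y[1]


def d_a(p, q):
    """Asymmetric distance in the flat case: length of the segment from p to q."""
    return F((q[0] - p[0], q[1] - p[1]))


def d(p, q):
    """Symmetrized distance as in (2.5)."""
    return max(d_a(p, q), d_a(q, p))


if __name__ == "__main__":
    p, q = (0.0, 0.0), (1.0, 0.0)
    print(d_a(p, q), d_a(q, p), d(p, q))
```

Travelling from p to q along the direction favoured by the linear term costs more than travelling back, while d is symmetric by construction.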

We will assume below that the flag and T-curvatures (the definitions are in e.g. [38]) of our Finsler manifold are uniformly bounded. The flag curvature K can be thought of as a generalization of the Riemannian sectional curvature. The definition of T-curvature (see [38, Sect. 10.1]) is essentially Finslerian since it vanishes for Riemannian manifolds.

Within the whole paper, we assume the following set of hypotheses.

H\(_{c}\): The manifold \((M,F)\) is connected and forward complete.

H\(_{b}\): The manifold \((M,F)\) has bounded geometry in the following sense:

Definition 2.1

We say a Finsler manifold \((M,F)\) has bounded geometry if the following hold:

  1. Uniform ellipticity: There is some constant \(C>1\) such that for any \(p\in M\) and any non-zero \(u,v\in T_pM\), we have

    $$\begin{aligned} \frac{1}{C^2}F^2(v)=\dfrac{1}{C^2}g_v(v,v)\le g_u(v,v)\le C^2g_v(v,v)=C^2F^2(v). \end{aligned}$$
    (2.6)

  2. The flag curvature K is bounded uniformly and absolutely by some constant \(\lambda >0\), namely \(\Vert K\Vert \le \lambda \).

  3. The T-curvature is also bounded uniformly and absolutely in the following sense. For any \(p\in M\), any \(u,v\in T_pM\) with \(F(v)=1\), the T-curvature satisfies

    $$\begin{aligned} |T_v(u)|\le \lambda \lbrace g_v(u,u)-[g_v(u,v)]^2\rbrace \end{aligned}$$
    (2.7)

Note that all objects used in the definition of “bounded geometry” are microlocal: These objects are functions in both \(x^i\) and \(y^i\) coordinates. For an explicitly given Finsler metric, it is possible to check whether it has bounded geometry. Moreover, if M is compact, then every smooth Finsler metric on it has bounded geometry.

In this paper, we use the following notations.

We say a function \(f:M\rightarrow {\mathbb {R}}\) vanishes at infinity, if for every \(\varepsilon >0\), there exists some compact set \(K_{\varepsilon }\subset M\) such that \(|f(q)|\le \varepsilon \) for all q outside \(K_{\varepsilon }\).

We denote the unit discs on TM by

$$\begin{aligned} D_pM=\lbrace Y\in T_pM :F(Y)\le 1\rbrace . \end{aligned}$$

Let d be the symmetrized distance defined by (2.5). We denote the open balls by

$$\begin{aligned} B_p(\varepsilon )=\lbrace q\in M:d(p,q)<\varepsilon \rbrace ,\quad \varepsilon >0. \end{aligned}$$

In this paper, \({\mathcal {B}}\) is the space of Borel measurable real-valued functions on M, \({\mathcal {C}}_0\) is the space of continuous real-valued functions vanishing at infinity, \({\mathcal {C}}^\infty \) is the space of smooth functions, \({\mathcal {C}}^\infty _K\) is the space of smooth functions with compact support. Furthermore, \(D([0,\infty ),M)\) is the collection of right continuous functions \(\gamma :[0,\infty )\rightarrow M\) with left limits, and \(C([0,\infty ),M)\) is the space of continuous functions \(\gamma :[0,\infty )\rightarrow M\).

2.2 Rescaled Geodesic Random Walks

Consider a Finsler manifold \((M,F)\) such that each \(T_pM\) is equipped with a probability measure \(\nu _p\). By the mean of \(\nu _p\), we understand the vector \(\mu _p:= \int _{T_{p}M} Y\, \nu _p(\mathrm {d}Y) \in T_pM\).

We are going to modify the sequence of random walks defined in the introduction. This change is trivial if \(\mu _p=0\) for every p: the step size is scaled by \(1/\sqrt{N}\). If \(\mu _p\ne 0\), we in addition appropriately shift the probability measure. The following example demonstrates that without such a modification, the sequence of random walks does not converge, for \(N\rightarrow \infty \), to a continuous stochastic process.

We consider \({\mathbb {R}}^2(x,y) \) with the standard flat Riemannian metric, so geodesics are straight lines. Let \(\nu _p\) be the rotationally invariant measure on the circle of radius 1 with centre at the point (r, 0), \(r\in (0,1)\), properly normalized. Clearly, \(\mu _p= r \tfrac{\partial }{\partial x}.\) In particular, the x-coordinate is more likely to increase than to decrease after one jump. By the law of large numbers, the random walk “drifts” deterministically over the distance \(r\sqrt{N}\) (which goes to \(\infty \) for \(N\rightarrow \infty \)) during one time unit and diffuses over a distance of order 1 due to the central limit theorem. We clearly see that the sequence of random walks does not converge for \(N\rightarrow \infty \) to a continuous stochastic process, see Fig. 2, where we plotted trajectories of the random walk for 10 units of time for \(N= 100\) and \(N=10000\) and \(r=0.05\).

Fig. 2: The sample paths with \(N=100\) and \(N=10000\) and \(r=0.05\); the path with \(N=10000\) is close to the deterministic motion with velocity \(r\sqrt{N}\)

Although the example is two dimensional and flat, the same phenomenon appears in all dimensions and in the Finslerian case.
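The drift in this example can be reproduced with a short simulation. This is a minimal sketch under the assumptions of the example (flat plane, unit circle centred at (r, 0), steps scaled by \(1/\sqrt{N}\) without recentring); the function names are hypothetical. The mean position at time 1 should be close to \((r\sqrt{N},0)\).

```python
import math
import random


def naive_endpoint(N, r, rng):
    """Position at time 1: N steps Y / sqrt(N), Y uniform on the unit circle
    centred at (r, 0). The mean of Y is NOT removed."""
    scale = 1.0 / math.sqrt(N)
    x = y = 0.0
    for _ in range(N):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x += scale * (r + math.cos(angle))
        y += scale * math.sin(angle)
    return x, y


def mean_endpoint(N, r, n_samples=500, seed=0):
    """Average endpoint over independent sample paths."""
    rng = random.Random(seed)
    sx = sy = 0.0
    for _ in range(n_samples):
        x, y = naive_endpoint(N, r, rng)
        sx += x
        sy += y
    return sx / n_samples, sy / n_samples


if __name__ == "__main__":
    for N in (100, 400):
        print(N, mean_endpoint(N, 0.05), "expected drift", 0.05 * math.sqrt(N))
```

Increasing N moves the average endpoint further along the x-axis while the spread stays of order 1, in line with the discussion above.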

Because of this, we introduce below the family of rescaled random walks (we will formalize this definition in Sect. 3.2). The \(N^\text {th}\) random walk is constructed using the measure \(\nu _p^N\), which is obtained from \(\nu _p\) and N via the formula \( \nu ^N_p(A ):=\nu _p(A\sqrt{N}+\mu _p-\mu _p/\sqrt{N})\). That is, we obtain \( \nu ^N_p\) by shifting the measure \(\nu _p\) in \(T_pM\) by the vector \(-\mu _p + \mu _p/\sqrt{N}\) and scaling it by \(1/\sqrt{N}\). Such a rescaling guarantees that the drift and the diffusive components of the random walk have the same order and forbids the instant escape to infinity in the limit. We will call this operation the rescaling of measure. This operation is a straightforward generalization of the one used in the Riemannian situation by E. Jørgensen in [22]. His motivation, which is also valid in our situation, was that for \(N=1\), the increments of the random walk should be distributed according to the “basic” measure \(\nu _p\). Indeed, for \(N=1\), the total shift after scaling, \(-\mu _p/\sqrt{N} + \mu _p/N\), vanishes.
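In terms of samples, the rescaling of measure sends a \(\nu _p\)-distributed vector Y to \((Y-\mu _p)/\sqrt{N}+\mu _p/N\), which is \(\nu ^N_p\)-distributed. The sketch below applies this map to the flat example from the beginning of this section (unit circle centred at (r, 0)); the function names and the value of r are hypothetical. With the rescaled measure, the mean position at time 1 should stay near \(\mu _p=(r,0)\) for every N.

```python
import math
import random


def sample_basic(rng, r):
    """Sample from nu_p: uniform on the unit circle centred at (r, 0)."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return (r + math.cos(angle), math.sin(angle))


def rescale(Y, mu, N):
    """Map a sample Y of nu_p to a sample of nu_p^N: (Y - mu)/sqrt(N) + mu/N."""
    s = math.sqrt(N)
    return tuple((y - m) / s + m / N for y, m in zip(Y, mu))


def rescaled_mean_endpoint(N, r, n_samples=500, seed=0):
    """Average position at time 1 of the N-th rescaled walk (N steps)."""
    rng = random.Random(seed)
    mu = (r, 0.0)
    sx = sy = 0.0
    for _path in range(n_samples):
        x = y = 0.0
        for _step in range(N):
            dx, dy = rescale(sample_basic(rng, r), mu, N)
            x += dx
            y += dy
        sx += x
        sy += y
    return sx / n_samples, sy / n_samples


if __name__ == "__main__":
    print(rescale((1.5, 0.0), (0.5, 0.0), 1))  # N = 1 leaves the sample unchanged
    print(rescaled_mean_endpoint(400, 0.5))
```

For \(N=1\) the map is the identity, matching the motivation attributed to Jørgensen, and the accumulated drift per unit time no longer grows with N.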

This is not the only possible choice of rescaling. A simpler rescaling operation is as follows: At every point p, we define \({\tilde{\nu }}_p\) by centring the basic measure \(\nu _p\). That is, we shift the measure \(\nu _p\) by the vector \(-\mu _p\). This makes the mean of the new measure equal to 0. Note that some papers on random walks on Finsler manifolds, for example [4], assume that both the Finsler metric and the measure \(\nu _p\) are centrally symmetric on every \(T_pM\) (the so-called reversible situation), so the measure is automatically centred. Our choice is justified by the observation that most examples of Finsler metrics appearing in applications are not reversible. Moreover, if the Finsler geodesic random walk is used to describe a physical model, the mean of \(\nu _p\) need not be neglected. Indeed, microscopic particles cannot make too long jumps because of friction and collisions, even if the probability of the particle to go in one preferable direction is higher. Therefore, the particle does not escape to infinity in short time, contrary to what is suggested by the random walk described in the beginning of this section.

One can easily generalize the rescaling above by considering shifts \(-\mu _p + \alpha \mu _p/\sqrt{N}\) for some \(\alpha \) depending on p. To see this, observe that if we modify \(\nu _p\) by shifting it by \(\beta \mu _p\), then the rescaled measure will be shifted by \((1+\beta )\mu _p/\sqrt{N}\). Thus, results of our paper can be applied for any \(\alpha \).

2.3 The Main Result

We will assume that the Finsler manifold \((M,F)\) is connected and forward complete (Hypothesis H\(_{c}\)) and has bounded geometry (Hypothesis H\(_b\)). In addition, we make the following assumption on the family of measures \(\{\nu _p\}\).

H\(_\nu \): We assume that \(\nu = \{\nu _p\}\) is a smooth family of probability measures inside \(DM:=\{ Y \in TM \mid F(Y)\le 1\}\) in TM or on the F-indicatrices.

In the first case, we require that \(\nu \) is a smooth m-form on DM such that for every p, the restriction \(\nu _p:=\nu |_{T_pM}\) is a form on the disc \(D_pM:= \{ Y \in T_pM \mid F(Y)\le 1\}\) inducing a probability measure. Similarly, in the second case, \(\nu \) is a smooth \((m-1)\)-form on IM such that for each \(p\in M\), the restriction \(\nu _p:=\nu |_{I_pM}\) is a probability measure.

This hypothesis is very natural from the viewpoint of Finsler geometry and covers many choices that have their natural counterparts in the Riemannian setting; we will give a few examples in Sect. 2.3.2.

Our main result is the following theorem:

Theorem 2.1

Let Hypotheses \({\mathbf {H}}_{c}\), \({\mathbf {H}}_b\), and \({\mathbf {H}}_\nu \) be satisfied. Consider the sequence of geodesic random walks starting at \(p_0\) constructed from \((M,F)\) and \(\lbrace \nu _p\rbrace _{p\in M}\). Then, this sequence has a unique weak limit \(\xi \). The process \(\xi \) is a diffusion whose generator is a non-degenerate elliptic differential operator A with smooth coefficients given by

$$\begin{aligned} Af(p)=\mathrm {d}f(\mu _p)+\dfrac{1}{2}\int _{T_pM} \dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=0}f\circ \gamma _{Y-\mu _p}(t)\nu _p (\mathrm {d}Y),\quad f\in {\mathcal {C}}^{\infty }_K. \end{aligned}$$
(2.8)

Here, \(\gamma _{Y-\mu _p}\) is the geodesic with initial vector \(Y-\mu _p\). In the local coordinates, it has the following form:

$$\begin{aligned} \begin{aligned} Af(p)&= f_k\left( \mu _p^k-\dfrac{1}{2}\int _{T_pM}\Gamma ^k_{ij}\left( p,y-\mu _p \right) (y^i-\mu _p^i)(y^j-\mu _p^j)\, \nu _p(\mathrm {d}y)\right) \\&\quad +\dfrac{1}{2} f_{ij}\int _{T_pM}(y^i-\mu _p^i)(y^j-\mu _p^j)\, \nu _p (\mathrm {d}y), \end{aligned} \end{aligned}$$
(2.9)

where \(\Gamma ^k_{ij}\) are the formal Christoffel symbols of the second kind given by

$$\begin{aligned} \Gamma ^k_{ij}(x,y)=\dfrac{1}{2}g^{ks}\left( \dfrac{\partial g_{is}}{\partial x^j}+ \dfrac{\partial g_{js}}{\partial x^i}-\dfrac{\partial g_{ij}}{\partial x^s} \right) (x,y),\quad y\ne 0, \end{aligned}$$

\(f_k= \partial _{x^k} f\) and \(f_{ij} = \partial ^2 _{x^ix^j} f\).

Moreover, \(\xi \) is stochastically complete.
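The passage from (2.8) to (2.9) can be sketched as follows, assuming the geodesic equation in the local form \(\ddot{\gamma }^k+\Gamma ^k_{ij}(\gamma ,{\dot{\gamma }}){\dot{\gamma }}^i{\dot{\gamma }}^j=0\) and that \(Y\ne \mu _p\) for \(\nu _p\)-almost every Y, so that \(\Gamma ^k_{ij}(p,Y-\mu _p)\) is defined. For a vector \(V\in T_pM\setminus \lbrace 0\rbrace \), the chain rule gives

$$\begin{aligned} \dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=0}f\circ \gamma _{V}(t)&= f_{ij}(p)V^iV^j+f_k(p)\,\ddot{\gamma }_V^k(0) \\&= f_{ij}(p)V^iV^j-f_k(p)\,\Gamma ^k_{ij}(p,V)V^iV^j. \end{aligned}$$

Substituting \(V=Y-\mu _p\), integrating against \(\nu _p\), multiplying by \(1/2\), and using \(\mathrm {d}f(\mu _p)=f_k\mu _p^k\) yields (2.9).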

2.3.1 Limit Diffusion as a Riemannian Brownian Motion with Drift

Recall that the Riemannian Brownian motion is a diffusion process which is the limit of geodesic random walks with identically distributed steps. Here, identically distributed should be understood as follows: the probability measure \(\nu _p\) is invariant with respect to the parallel transport along any curve and is invariant with respect to the standard action of \(SO({\mathbf {g}})\) on \(T_pM\). Actually, for a generic metric, invariance with respect to the parallel transport implies \(SO({\mathbf {g}})\)-invariance.

It is known that the generator of a Riemannian Brownian motion is proportional to the Beltrami–Laplace operator of the metric, so its symbol is proportional with a constant coefficient to the inverse of the Riemannian metric.

By Theorem 2.1, in the Finslerian case, the generator A of the limit process of the geodesic random walk is a second-order non-degenerate elliptic differential operator on M. Hence, the symbol \(\sigma (A)\) of A is dual to a Riemannian metric on M which we denote \({\mathbf {g}}_A\). Then the Beltrami–Laplace operator \(\Delta ^A\) of \({\mathbf {g}}_A\) and A have the same symbol. Hence, \(A-\Delta ^A\) is just a vector field on M. We call this vector field the drift of A.

In particular, though Finsler metrics are much more complicated than Riemannian metrics, one almost does not see the difference on the level of diffusion processes (only first-order terms of generators may be different). This should be the reason why Finslerian effects related to diffusion were not observed experimentally in physical or natural science systems, even in those where the free motion of particles corresponds to geodesics of a certain Finsler metric. See e.g. [15] where in a highly anisotropic situation (diffusion weighted magnetic resonance imaging of brain), the measurement returned a Finsler metric which is very close to a Riemannian metric.

This observation may provide additional mathematical tools for natural science and physics. Indeed, in most cases the probability distributions \(\nu _p\) can be “read” from the description of the model (in fact in many cases, they are generated by the volume form of the standard flat metric). Empirical observations of diffusions may provide tools for testing mathematical models of the system in question or determining their parameters.

2.3.2 Canonical Constructions of Riemannian Metrics

In the Riemannian situation, there is essentially only one canonical (i.e. coordinate invariant) construction of a probability measure on \(T_pM\). Indeed, coordinate invariance of the construction implies that the measure is invariant under the group \(SO({\mathbf {g}})\), which implies that in the orthogonal coordinates \((y^1,\dots ,y^m)\) on \(T_pM\), it is given by \(\phi ((y^1)^2+\cdots +(y^m)^2 )\, \mathrm {d}y^1\wedge \cdots \wedge \mathrm {d}y^m\). The function \(\phi \) is the same for all points p and has the property that it is nonnegative and that the integral \(\int _{{\mathbb {R}}^m} \phi ((y^1)^2+\cdots +(y^m)^2 )\, \mathrm {d}y^1\wedge \cdots \wedge \mathrm {d}y^m =1\).

In the Finslerian situations, there are many natural non-equivalent constructions of a measure on \(T_pM\). Let us recall the following three.

  • Measure coming from the Lebesgue measure: For any \(p\in M\), let \(\omega '_p=\mathrm {d}y^1\wedge \cdots \wedge \mathrm {d}y^m\) be a Lebesgue measure on \(T_pM\). It is known that it is unique up to a positive coefficient. We restrict it to the ball \(D_pM\) (that is, the measure of an open set \(U\subset T_pM\) is the Lebesgue measure of the intersection \(D_pM \cap U\)), and normalize it such that it becomes a probability measure.

  • Measure coming from the fundamental tensor: For any \(p\in M\), the fundamental tensor \(g_{ij}\) defines a Riemannian metric on the compact manifold \(I_pM\). Normalizing the volume on \(I_pM\) induced by \(g_{ij}\), we obtain the probability measure

    $$\begin{aligned} \nu _p{:}{=}\dfrac{\mathrm {vol}_g}{\mathrm {vol}_g(I_pM)}. \end{aligned}$$
    (2.10)

    This probability measure is close to the one used in [4, Sect. A2].

  • Measure coming from the Hilbert form: Denote by \({\mathbb {P}}^+(M)\) the positively projectivized tangent bundle. The Hilbert 1-form \({\hat{\omega }}=F_{y^i}\,\mathrm {d}x^i\) defined on \(TM\setminus \lbrace 0\rbrace \) is actually a pull-back of some 1-form \(\omega \) on \({\mathbb {P}}^+(M)\) by the standard projection. It is well known that \(\omega \wedge (\mathrm {d}\omega )^{m-1}\) defines a volume form on \({\mathbb {P}}^+(M)\simeq IM\). Let \(\mathrm {i}_p:I_pM\rightarrow IM\) be the standard inclusion and \(\pi :IM\rightarrow M\) be the canonical projection. It is known (see, e.g. [7]) that there exists an \((m-1)\)-form \(\alpha ^F\) on IM and a volume form \(\omega ^F\) on M such that \(\alpha ^F\vert _{I_pM}\) is a unique volume form on \(I_pM\) for each \(p\in M\) with

    $$\begin{aligned} \begin{aligned}&\mathrm {vol}_{\alpha ^F}(I_pM)=1,\\&\alpha ^F\wedge \pi ^{*}\omega ^F=\omega \wedge (\mathrm {d}\omega )^{m-1}. \end{aligned}\end{aligned}$$
    (2.11)

    Hence, we can take \(\nu _p{:}{=}\mathrm {vol}_{\alpha ^F}\) on \(I_pM\).

Each of these measures satisfies the Hypothesis \({\mathbf {H}}_\nu \) and is coordinate independently constructed from F. In the case the Finsler metric is reversible, the dual of the symbol of the generator corresponding to the first measure gives the Binet–Legendre metric (see, e.g. [11, 31]). The second choice of the measure gives the averaged metric used in [29, 30] (a small modification of the construction leads to the metric from [43]), and the third choice of measure generates the Finsler Laplacian from [7]. Note that the Binet–Legendre metric, the averaged metric, and the Finsler Laplacian from [7] appear to be effective tools for solving different problems in Finsler geometry; we expect that other natural choices of the measure \(\nu _p\) may also be useful in Finsler geometry.

2.4 Example: Limit Diffusion for a Katok Finsler Metric

Let \(({\mathbb {S}}^2,{\mathbf {g}})\) be the unit sphere endowed with the standard Riemannian metric. The Katok metric is constructed as follows. Let X be the vector field of rotation around the axis connecting the north and south poles of the sphere such that \({\mathbf {g}}(X,X)<1\). Consider the following spherical coordinates on \({\mathbb {S}}^2\):

$$\begin{aligned} (\psi ,\theta )\mapsto (\cos (\psi )\cos (\theta ),\sin (\psi )\cos (\theta ),\sin (\theta )). \end{aligned}$$
(2.12)

Then there is some constant r with \(|r|<1\) such that

$$\begin{aligned} X=r\partial _{\psi }. \end{aligned}$$

Now for any \(p\in M\), the indicatrix \(I_pM\) of the constructed Finsler function F is obtained by shifting the unit sphere \(S_p{\mathbb {S}}^2\) of \({\mathbf {g}}\) (which is the indicatrix with respect to \({\mathbf {g}}\)) by X. That is,

$$\begin{aligned} I_pM{:}{=}\lbrace v+X_p:v\in T_p{\mathbb {S}}^2,\ {\mathbf {g}}(v,v)=1\rbrace \end{aligned}$$

This yields a well-defined Finsler metric since \({\mathbf {g}}(X,X)<1\). This family of metrics depending on the parameter r was constructed by A. Katok in [24]. It is widely used in Finsler geometry and in the theory of dynamical systems as a source of examples and counterexamples. It has constant flag curvature by [6, 16, 39], and by [9], any metric of constant flag curvature on the 2-sphere has geodesic flow conjugate to that of a Katok metric.

As the measure \(\nu _p\) we consider the Lebesgue measure as described in Sect. 2.3.2; let us calculate the generator of the corresponding diffusion process \(\xi \).

By Theorem 2.1, the diffusion process \(\xi \) generated by \(\lbrace \nu _p\rbrace _{p\in M}\) has generator A such that

$$\begin{aligned} \begin{aligned} Af(p)&= \mathrm {d}f(X)(p)+\dfrac{1}{2} \biggr \lbrace f_{ij}\int _{D_pM}(Y^i-X^i)(Y^j-X^j)\, \nu _p(\mathrm {d}Y) \\&\quad - f_k\int _{D_pM}\Gamma ^k_{ij}\left( p,Y-X \right) (Y^i-X^i)(Y^j-X^j)\, \nu _p(\mathrm {d}Y) \biggr \rbrace \\ \end{aligned} \end{aligned}$$
(2.13)

where \(\Gamma ^k_{ij}\) are the formal Christoffel symbols of the second kind of (MF) and \(f\in {\mathcal {C}}^{\infty }\).

As \(\nu _p\) is induced by a Lebesgue measure on \(T_pM\simeq {\mathbb {R}}^m\), we also denote this Lebesgue measure by \(\nu _p\) for simplicity. For any \(p\in M\), the set

$$\begin{aligned} {\hat{D}}_pM{:}{=}\lbrace Y\in T_pM\ \vert \ Y+X(p)\in D_pM\rbrace \end{aligned}$$

is just the closed unit ball on \(T_p{\mathbb {S}}^2\) with respect to \({\mathbf {g}}\). Since \(\nu _p\) is translation invariant, Equation (2.13) becomes

$$\begin{aligned} Af(p)&= \mathrm {d}f(X)(p)+\dfrac{1}{2} \biggr \lbrace f_{ij}\int _{{\hat{D}}_pM} Y^i Y^j \nu _p(\mathrm {d}Y) - f_k\int _{{\hat{D}}_pM}\Gamma ^k_{ij}\left( p,Y \right) Y^iY^j\, \nu _p(\mathrm {d}Y) \biggr \rbrace \end{aligned}$$
(2.14)

Note that the integrand in the equation above is second-order homogeneous in Y. By Fubini's theorem, for any \(p\in M\), there is a finite measure \(\eta _p\) on \(S_p{\mathbb {S}}^2\) such that for any integrable second-order homogeneous function h on \(T_pM\), we have

$$\begin{aligned} \int _{{\hat{D}}_pM} h(Y)\nu _p (\mathrm {d}Y)=\int _{S_p{\mathbb {S}}^2} h(Y) \eta _p (\mathrm {d}Y) \end{aligned}$$
(2.15)

Because \(\nu _p\) is invariant under any orthogonal transformation on \(T_p{\mathbb {S}}^2\) with respect to \({\mathbf {g}}\), it is clear that \(\eta _p\) is a multiple of the canonical angular measure \(m_p\) on \(S_p{\mathbb {S}}^2\) with respect to \({\mathbf {g}}\). A straightforward computation shows \(\eta _p=\dfrac{1}{4\pi }m_p\). Hence from (2.14), we get

$$\begin{aligned} Af(p)=\mathrm {d}f(X)(p)+\dfrac{1}{8\pi } \biggr \lbrace f_{ij}\int _{S_p{\mathbb {S}}^2} Y^i Y^j m_p(\mathrm {d}Y) - f_k\int _{S_p{\mathbb {S}}^2}\Gamma ^k_{ij}\left( p,Y \right) Y^iY^j\, m_p(\mathrm {d}Y) \biggr \rbrace \end{aligned}$$
(2.16)
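The constant \(\dfrac{1}{4\pi }\) can be checked numerically. The following is a minimal sketch, assuming that \(\nu _p\) is the Lebesgue measure on the unit disk normalized to total mass 1 and that \(m_p\) has total mass \(2\pi \); the function names and the test function h are hypothetical and chosen only for illustration.

```python
import numpy as np

def disk_integral(h, n_r=400, n_phi=400):
    """Integral of h over the unit disk w.r.t. the normalized Lebesgue measure
    (midpoint rule in polar coordinates)."""
    rho = (np.arange(n_r) + 0.5) / n_r
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    R, P = np.meshgrid(rho, phi, indexing="ij")
    values = h(R * np.cos(P), R * np.sin(P)) * R      # Jacobian of polar coordinates
    cell = (1.0 / n_r) * (2.0 * np.pi / n_phi)
    return values.sum() * cell / np.pi                # divide by the area of the disk

def circle_integral(h, n_phi=400):
    """Integral of h over the unit circle w.r.t. the angular measure (mass 2*pi)."""
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    return h(np.cos(phi), np.sin(phi)).sum() * 2.0 * np.pi / n_phi

def h(y1, y2):
    """An arbitrary second-order homogeneous test function."""
    return 3.0 * y1**2 - y1 * y2 + 0.5 * y2**2

# The ratio of the two integrals should be close to 1 / (4 * pi).
ratio = disk_integral(h) / circle_integral(h)
```

The factor arises from the radial integral \(\int _0^1\rho ^2\cdot \rho \,\mathrm {d}\rho =\frac{1}{4}\) together with the normalization \(\frac{1}{\pi }\) of the Lebesgue measure on the disk.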

Let \(\Delta \) be the Beltrami–Laplace operator of \({\mathbf {g}}\), and let \({\hat{\Gamma }}^k_{ij}\) be the Christoffel symbols of \({\mathbf {g}}\). A straightforward computation yields

$$\begin{aligned} \dfrac{1}{8}\Delta f(p)=\dfrac{1}{8\pi }\biggr \lbrace f_{ij}\int _{S_p{\mathbb {S}}^2}Y^i Y^j \, m_p(\mathrm {d}Y) - f_k\int _{S_p{\mathbb {S}}^2}{\hat{\Gamma }}^k_{ij}(p) Y^i Y^j\, m_p(\mathrm {d}Y)\biggr \rbrace . \nonumber \\ \end{aligned}$$
(2.17)

This implies that \(\dfrac{1}{8}\Delta \) and A have the same symbol. To compute the drift of A, we assume without loss of generality that \(0\le r<1\). First, we have

$$\begin{aligned} \begin{aligned} \left( A-\dfrac{1}{8}\Delta \right) f(p)&= \mathrm {d}f(X)(p) +\dfrac{1}{8\pi }f_k \int _{S_p{\mathbb {S}}^2}{\hat{\Gamma }}^k_{ij}(p) Y^i Y^j\, m_p(\mathrm {d}Y)\\&\quad -\dfrac{1}{8\pi } f_k \int _{S_p{\mathbb {S}}^2}\Gamma ^k_{ij}\left( p,Y \right) Y^i Y^j \,m_p(\mathrm {d}Y). \end{aligned} \end{aligned}$$
(2.18)

Let \(\Phi _t\) be the flow generated by X. We know from [16, Theorem 1] that if \(\gamma (t)\) is a geodesic of \(({\mathbb {S}}^2,{\mathbf {g}})\) with \(\sqrt{{\mathbf {g}}({\dot{\gamma }},{\dot{\gamma }})}=c\), then \({\hat{\gamma }}(t)=\Phi _{ct}\circ \gamma (t)\) is a geodesic of F with initial vector \({\hat{\gamma }}'(0)={\dot{\gamma }}(0)+cX(\gamma (0))\). But in the spherical coordinates given by (2.12), the flow \(\Phi _t\) simply has the form:

$$\begin{aligned} \Phi _t(\psi ,\theta )=(\psi +rt,\theta ). \end{aligned}$$

Then in this coordinate, we have

$$\begin{aligned} \dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=0} {\hat{\gamma }}(t)= \dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=0} \gamma (t). \end{aligned}$$

By the geodesic equation, this is equivalent to

$$\begin{aligned} \Gamma ^k_{ij}(p,{\hat{\gamma }}'(0))({\hat{\gamma }}'(0))^i ({\hat{\gamma }}'(0))^j={\hat{\Gamma }}^k_{ij}(p)(\gamma '(0))^i(\gamma '(0))^j. \end{aligned}$$

Using this and (2.18), a straightforward computation gives

$$\begin{aligned} \Big (A-\frac{1}{8}\Delta \Big )=X+\dfrac{1}{4}r^2\cos (\theta )\sin (\theta )\cdot \dfrac{r^2\cos ^2(\theta )-2}{\left( 1-r^2\cos ^2(\theta )\right) ^2}\cdot \partial _{\theta }. \end{aligned}$$
(2.19)

This is the drift of the generator A.
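To get a feeling for the size and the sign of the correction term, one can evaluate the coefficient of \(\partial _{\theta }\) in (2.19) directly. The sketch below only transcribes that coefficient; the function name is hypothetical.

```python
import math

def katok_drift_theta(r, theta):
    """Coefficient of d/d(theta) in (2.19) for the Katok parameter r (|r| < 1)
    at latitude theta."""
    c = math.cos(theta)
    s = math.sin(theta)
    q = r * r * c * c          # equals g(X, X) at latitude theta
    return 0.25 * r * r * c * s * (q - 2.0) / (1.0 - q) ** 2

# Example: the coefficient at latitude 0.7 for r = 1/2.
value = katok_drift_theta(0.5, 0.7)
```

Since \(r^2\cos ^2(\theta )-2<0\), the coefficient is negative for \(0<\theta <\pi /2\) and positive for \(-\pi /2<\theta <0\) whenever \(0<r<1\), so this part of the drift points towards the equator; it vanishes on the equator and for \(r=0\).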

Fig. 3 (l.) A sample path of the Brownian motion on the standard sphere. (r.) A sample path of the diffusion with the generator A given by (2.13) with \(r=1/2\)

In Fig. 3, one clearly sees the difference between the behaviour of the Brownian motion of the initial round metric on \({\mathbb {S}}^2\) and that of the diffusion process corresponding to the Katok metric with \(r=1/2\), which is due to the drift given in (2.19). Of course, the pictures are just pictures of the corresponding geodesic random walks with a sufficiently large N. Note that the same random seed was used in both pictures.

3 Preliminaries

In this section, we give a short review of the tools in Finsler geometry which will be used in our proof in later sections and formalize definitions of random geodesic walks which will allow us to apply the machinery from the theory of stochastic processes.

3.1 Finsler Geodesics and Properties of Bounded Geometry

It is well known that stationary points of the energy functional (2.2) are solutions of the Euler-Lagrange equation which in our situation is equivalent to the following system of ODEs:

$$\begin{aligned}&\frac{\mathrm {d}x^i}{\mathrm {d}t}=y^i, \end{aligned}$$
(3.1)
$$\begin{aligned}&\frac{\mathrm {d}y^k}{\mathrm {d}t}+\Gamma ^k_{ij}(x,y)y^iy^j=0. \end{aligned}$$
(3.2)

Here, \(\Gamma ^k_{ij}\) are the formal Christoffel symbols of the 2nd kind:

$$\begin{aligned} \Gamma ^k_{ij}(x,y)=\dfrac{1}{2} g^{ks}\left( \dfrac{\partial g_{is}}{\partial x^j}+\dfrac{\partial g_{js}}{\partial x^i}-\dfrac{\partial g_{ij}}{\partial x^s}\right) (x,y),\quad y\ne 0 \end{aligned}$$
(3.3)

It is immediate from (3.3) that for any \(\lambda >0\) and \(y\ne 0\), we have

$$\begin{aligned} \Gamma ^k_{ij}(x,y)=\Gamma ^k_{ij}(x,\lambda y). \end{aligned}$$
(3.4)
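As an illustration of the system (3.1)–(3.2), the following sketch integrates the geodesic equations numerically in the simplest special case, the round metric on \({\mathbb {S}}^2\) in the coordinates (2.12), where \({\mathbf {g}}=\cos ^2(\theta )\,\mathrm {d}\psi ^2+\mathrm {d}\theta ^2\) and the only non-zero Christoffel symbols are \(\Gamma ^{\psi }_{\psi \theta }=\Gamma ^{\psi }_{\theta \psi }=-\tan (\theta )\) and \(\Gamma ^{\theta }_{\psi \psi }=\sin (\theta )\cos (\theta )\). The integrator (classical Runge–Kutta) and all function names are choices made for this example only; in the Riemannian case the symbols do not depend on y.

```python
import numpy as np

def geodesic_rhs(state):
    """Right-hand side of (3.1)-(3.2) for the round sphere, state = (psi, theta, y_psi, y_theta)."""
    psi, theta, y_psi, y_theta = state
    dy_psi = 2.0 * np.tan(theta) * y_psi * y_theta          # -2 * Gamma^psi_{psi theta} y^psi y^theta
    dy_theta = -np.sin(theta) * np.cos(theta) * y_psi**2    # -Gamma^theta_{psi psi} (y^psi)^2
    return np.array([y_psi, y_theta, dy_psi, dy_theta])

def integrate_geodesic(state0, t_end, n_steps=1000):
    """Classical fourth-order Runge-Kutta integration up to time t_end."""
    state = np.array(state0, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(state + 0.5 * dt * k1)
        k3 = geodesic_rhs(state + 0.5 * dt * k2)
        k4 = geodesic_rhs(state + dt * k3)
        state = state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return state

def speed(state):
    """Norm of the velocity with respect to the round metric."""
    _, theta, y_psi, y_theta = state
    return np.sqrt(np.cos(theta)**2 * y_psi**2 + y_theta**2)

def embed(psi, theta):
    """The parametrization (2.12) of the unit sphere."""
    return np.array([np.cos(psi) * np.cos(theta),
                     np.sin(psi) * np.cos(theta),
                     np.sin(theta)])

# Example: a geodesic started at (psi, theta) = (0.2, 0.3) with velocity (0.8, 0.5).
final_state = integrate_geodesic([0.2, 0.3, 0.8, 0.5], t_end=1.0)
```

Along the computed curve the speed stays constant up to discretization error, in agreement with the fact that geodesics are parametrized proportionally to arc length. The chart is singular at the poles, so the sketch should only be used for geodesic segments that stay away from them.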

Denote the class of k-times continuously differentiable real-valued functions on M with compact support by \({\mathcal {C}}^k_K\). Suppose \(f\in {\mathcal {C}}^{k_0}_K\) with compact support \(K_f\). For any \(Y\in TM\), define \(f_Y(t)=f\circ \gamma _Y(t)\), where \(\gamma _Y(t)\) is the geodesic with initial vector Y as before.

Lemma 3.1

Suppose that \(f\in {\mathcal {C}}^{k_0}_K\). There exists some constant c such that

$$\begin{aligned} \begin{aligned} \biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_Y\right) (t)\biggr |\le c F^k(Y),\quad \forall k\le k_0, \end{aligned}\end{aligned}$$
(3.5)

wherever it is well defined.

Proof

First, we show that there is some constant c such that \(\biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_Y\right) (0)\biggr |\le c F^k(Y)\). Let \({\tilde{X}}\) be the geodesic spray on TM, and denote \(\pi :TM\rightarrow M\) the canonical projection. The function \(\pi ^*f\) is \({\mathcal {C}}^{k_0}\), so \(({\mathcal {L}})^k_{{\tilde{X}}}(\pi ^*f)\) is continuous on \(TM\setminus \lbrace 0\rbrace \) for any \(k\le k_0\). Hence, for some constant c, we have

$$\begin{aligned} |({\mathcal {L}})^k_{{\tilde{X}}}(\pi ^*f)({\hat{p}})|\le c ,\ \forall {\hat{p}}\in IK_f,\ \forall k\le k_0. \end{aligned}$$

For any \(p\in M\) and \(Y\in T_pM\), we have

$$\begin{aligned} \left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_Y\right) (0)=({\mathcal {L}})^k_{{\tilde{X}}}(\pi ^*f)(p,Y). \end{aligned}$$

In addition, for \(Y\ne 0\), let \(Y'=\dfrac{Y}{F(Y)}\). Using

$$\begin{aligned} f_Y(t)=f\circ \gamma _{F(Y)Y'}(t)=f_{Y'}(F(Y)t), \end{aligned}$$

we get

$$\begin{aligned} \biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_Y\right) (0)\biggr |=\biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_{Y'}\right) (0)\biggr |F^k(Y). \end{aligned}$$

Then for \(p\in K_f\) and \(Y\ne 0\), we have

$$\begin{aligned} \biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_Y\right) (0)\biggr |=\biggr |({\mathcal {L}})^k_{{\tilde{X}}}(\pi ^*f)(p,Y')\biggr |\cdot F^k(Y)\le c F^k(Y). \end{aligned}$$

For \(p\notin K_f\), the function f vanishes identically on some neighbourhood of p, so the function \(t\mapsto f_Y(t)\) is constant near \(t=0\). For \(Y=0\), the function \(f_Y(t)\) is also constant. It follows that \(\biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_Y\right) (0)\biggr |\le c F^k(Y)\) in all cases.

Next, given any geodesic \(\gamma _Y\), let Y(t) be its velocity field. We have

$$\begin{aligned}&F(Y(t))=F(Y(0))=F(Y),\\&\quad \biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_Y\right) (t)\biggr |=\biggr |\left( \dfrac{\mathrm {d}^k}{\mathrm {d}t^k}f_{Y(t)}\right) (0)\biggr |\le cF^k(Y(t))=cF^k(Y). \end{aligned}$$

This completes the proof. \(\square \)

The injectivity radius at p is defined by

$$\begin{aligned} \mathrm {inj}_M(p)&{:}{=}\sup \lbrace r>0:\exp _p\vert _{D_p(r)}\ \mathrm {is\ injective} \rbrace \\ \mathrm {inj}_M&{:}{=}\inf \lbrace \mathrm {inj}_M(p):p\in M \rbrace \end{aligned}$$

The conjugate radius is defined similarly by

$$\begin{aligned} \mathrm {con}_M(p)&{:}{=}\sup \lbrace r>0:\exp _p\vert _{D_p(r)}\ \mathrm {is\ an\ immersion}\rbrace \\ \mathrm {con}_M&{:}{=}\inf \lbrace \mathrm {con}_M(p):p\in M\rbrace . \end{aligned}$$

We always have \(\mathrm {inj}_M(p)\le \mathrm {con}_M(p)\) for any \(p\in M\), see [5, Proposition 8.2.1]. The conjugate radius and flag curvature are related by the following well-known result, [5, Proposition 9.5.2].

Proposition 3.2

Suppose (MF) is a Finsler manifold such that its flag curvature \(\Vert K\Vert \le \lambda \). Then the conjugate radius is bounded from below by \(\mathrm {con}_M\ge \dfrac{\pi }{\sqrt{\lambda }}\), hence strictly positive.

3.2 Formal Definition of Rescaled Geodesic Random Walks

We begin this section with a brief review of the basic definitions from the theory of Markov processes used in this paper. Roughly speaking, a stochastic process is said to be Markovian if its future states depend only upon the present state, regardless of its past states.

Definition 3.1

Let \((\Omega ,{\mathcal {F}},{\mathbf {P}})\) be a probability space. An M-valued process \(\xi :\Omega \times [0,\infty )\rightarrow M\) is Markovian if for each Borel subset B of M, for all \(n\ge 1\), \(0\le s_1<\cdots<s_n<s<t\), we have

$$\begin{aligned} \begin{aligned} {\mathbf {P}}(\xi _t\in B\vert \xi _{s_1},\dots , \xi _{s_n},\xi _s)={\mathbf {P}}(\xi _t\in B\vert \xi _s). \end{aligned}\end{aligned}$$
(3.6)

The transition probability function for a Markov process is defined by

$$\begin{aligned} P(p,s,t,B)={\mathbf {P}}(\xi _t\in B\vert \xi _s=p),\ \forall p\in M,\ \forall 0\le s\le t. \end{aligned}$$

We say \((\xi _t)_{t\ge 0}\) is time homogeneous if the following holds.

$$\begin{aligned} P(p,s,t,B)=P(p,0,t-s,B),\ \forall p\in M,\ \forall 0\le s\le t. \end{aligned}$$

All Markov processes considered in this paper are time homogeneous. A time homogeneous Markov process \(\xi \) defines a semigroup \(T=(T_t)_{t\ge 0}\) of linear operators on the measurable functions \({\mathcal {B}}\) on M by

$$\begin{aligned} T_t(f)(p)={\mathbf {E}}_p[f(\xi _t)],\ p\in M,\ t\ge 0. \end{aligned}$$

We say a Markov process \(\xi \) is Feller if the semigroup T is a strongly continuous semigroup of positive contractions on the Banach space \({\mathcal {C}}_0\).

In the introduction, we gave a slightly informal definition of (rescaled) geodesic random walks. We now give a formal definition.

Let (MF) be a geodesically complete Finsler manifold. Let \(\{ \nu _p\}_{p\in M}\) be a family of measures such that each \(\nu _p\) is a probability measure on \(T_pM\). Denote by \(\mu _p\) the mean of \(\nu _p\)

$$\begin{aligned} \mu _p{:}{=}\int _{T_pM} Y\, \nu _p (\mathrm {d}Y). \end{aligned}$$

In our setting (Hypothesis \({\mathbf {H}}_\nu \)), the probability measures are compactly supported so \(\mu _p\) exists and is finite.

Definition 3.2

Let \(N\ge 1\) and let \(p_0\in M\) be fixed. A random process \((\zeta _k^N, Y_{k+1}^N)_{k\ge 0}\) is called a (rescaled) discrete time geodesic random walk on M with initial point \(p_0\) and with increments \(\{Y_{k+1}^N\}_{k\ge 0}\) compatible with the family \(\{\nu _p\}_{p\in M}\) if

  1. the process \(\zeta _k^N\) is M valued, and \(Y_{k+1}^N\) is \(T_{\zeta _k^N}M\) valued,

  2. \(\zeta _0^N=p_0\),

  3. for each \(k\ge 0\), \(\text {Law}(Y_{k+1}^N)=\nu _{\zeta _k^N}^N\) where

    $$\begin{aligned} \begin{aligned} \nu _p^N(B)&=\int _{T_pM} {\mathbb {I}}_B\Big ( \frac{Y-\mu _p}{\sqrt{N}}+\frac{\mu _p}{N} \Big )\,\nu _p(\mathrm {d}Y) \end{aligned}\end{aligned}$$
    (3.7)

    for any measurable \(B\subseteq T_pM\),

  4. \(\zeta _{k+1}^N=\exp _{\zeta _k^N}(Y_{k+1}^N)\), \(k\ge 0\).

Hence, the processes \(\zeta ^N\) are defined by the family of measures \(\{\nu _p\}_{p\in M}\) and the geometry of the exponential mapping \(\exp _p\). The random walks are time homogeneous since the family \(\{\nu _p\}_{p\in M}\) does not depend on k.
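The definition can be made concrete in the simplest special case. The sketch below simulates the chain \(\zeta ^N\) on the round unit sphere \({\mathbb {S}}^2\subset {\mathbb {R}}^3\), assuming that \(\nu _p\) is the normalized Lebesgue measure on the unit disk of \(T_p{\mathbb {S}}^2\), so that \(\mu _p=0\) and the rescaled increment is simply \(Y/\sqrt{N}\). It uses the closed form of the Riemannian exponential map of the sphere; all function names are hypothetical, and the sketch is not the code used to produce the figures.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map of the round unit sphere at p applied to a tangent vector v."""
    norm = np.linalg.norm(v)
    if norm < 1e-15:
        return p.copy()
    return np.cos(norm) * p + np.sin(norm) * (v / norm)

def sample_unit_disk_tangent(p, rng):
    """Draw Y uniformly from the unit disk of the tangent plane at p."""
    # Build an orthonormal basis (e1, e2) of the tangent plane at p.
    a = np.array([1.0, 0.0, 0.0]) if abs(p[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = a - np.dot(a, p) * p
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(p, e1)
    radius = np.sqrt(rng.uniform())          # sqrt gives a uniform law on the disk
    angle = rng.uniform(0.0, 2.0 * np.pi)
    return radius * (np.cos(angle) * e1 + np.sin(angle) * e2)

def geodesic_random_walk(p0, N, n_steps, seed=0):
    """Return the points zeta_0, ..., zeta_{n_steps} of the rescaled walk."""
    rng = np.random.default_rng(seed)
    path = np.empty((n_steps + 1, 3))
    path[0] = p0
    for k in range(n_steps):
        Y = sample_unit_disk_tangent(path[k], rng)     # mean zero, hence mu_p = 0
        nxt = sphere_exp(path[k], Y / np.sqrt(N))      # conditions 3 and 4 of the definition
        path[k + 1] = nxt / np.linalg.norm(nxt)        # guard against round-off drift
    return path

# Example: 200 steps with N = 100, started at the north pole.
path = geodesic_random_walk(np.array([0.0, 0.0, 1.0]), N=100, n_steps=200, seed=1)
```

Each step moves the point along a great circle by a distance of at most \(1/\sqrt{N}\). For a non-centered family \(\{\nu _p\}\) the increment would have to be replaced by \((Y-\mu _p)/\sqrt{N}+\mu _p/N\), and for a genuinely Finslerian metric the Riemannian exponential map by the Finslerian one.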

In the classical (Euclidean) setting, random walks are processes with independent increments. In our setting, the independence is understood in the conditional sense, i.e. the increments \(Y_{k+1}^N\) depend only on the current position \(\zeta ^N_k\) and not on the previous positions and increments. More precisely, we introduce the natural filtration

$$\begin{aligned} {\mathcal {F}}_k^N:=\sigma \{(\zeta _0^N, Y^N_{1}),\dots , (\zeta ^N_{k-1}, Y_k^N)\},\quad k\ge 1, \end{aligned}$$
(3.8)

and say that the increments of \(\zeta ^N\) are independent if for each \(f\in {\mathcal {C}}_b(\oplus _{i=0}^{k+1} M,{\mathbb {R}})\)

$$\begin{aligned} {\mathbf {E}} \Big [f(\zeta _0^N,\dots ,\zeta _{k}^N,\zeta _{k+1}^N)\Big |{\mathcal {F}}_{k}^N\Big ] = \int _{T_{\zeta _{k}^N}M} f(\zeta _0^N,\dots ,\zeta _{k}^N, \exp _{\zeta _{k}^N}(Y))\,\nu _{\zeta _{k}^N}^N(\mathrm {d}Y). \end{aligned}$$
(3.9)

It is clear that \(\zeta ^N\) is a homogeneous discrete time M-valued Markov chain with the one-step transition operator

$$\begin{aligned} P^N f(p) = {\mathbf {E}}_p f(\zeta ^N_1) =\int _{T_{p}M} f\Big (\exp _p\Big ( \frac{Y-\mu _p}{\sqrt{N}}+\frac{\mu _p}{N}\Big )\Big )\,\nu _{p}(\mathrm {d}Y),\quad f\in {\mathcal {C}}_b(M,{\mathbb {R}}).\quad \quad \end{aligned}$$
(3.10)

Since we work in a continuous time setting, it is convenient to transform the discrete time Markov chain \(\zeta ^N\) into a continuous time Markov process. This can be done by a standard subordination procedure.

Let \(Q=(Q_t)_{t\ge 0}\) be a standard Poisson process independent of \(\{\zeta ^N\}\). Define a pseudo-Poisson process

$$\begin{aligned} \xi ^N_t= \zeta ^N_{Q_{Nt}} ,\quad t\ge 0. \end{aligned}$$

Note that the sample paths of \(\xi ^N\) belong to \(D([0,\infty ),M)\). Hence, the Markov processes \(\xi ^N\) induce probability distributions \({\mathbf {P}}^N\) on the path space \(D([0,\infty ),M)\). It is easy to see that the transition semigroup \(T^N=(T^N_t)_{t\ge 0}\) of \(\xi ^N_t\) has the form:

$$\begin{aligned} T^N_t(f)(p)={\mathbf {E}}_p[f(\xi ^N_t)]=\mathrm {e}^{-Nt}\sum _{k=0}^{\infty }\dfrac{(Nt)^k}{k!}(P^N)^k(f)(p),\ f\in {\mathcal {B}}. \end{aligned}$$
(3.11)
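Formula (3.11) is easy to test on a toy example. The sketch below replaces M by a finite state space, so that the one-step operator becomes a stochastic matrix P, and evaluates the truncated series; the function name and the truncation length are assumptions of this example.

```python
import numpy as np

def pseudo_poisson_semigroup(P, N, t, n_terms=200):
    """Truncated series e^{-Nt} * sum_k (Nt)^k / k! * P^k from (3.11)."""
    dim = P.shape[0]
    term = np.exp(-N * t) * np.eye(dim)      # the k = 0 term
    total = term.copy()
    for k in range(1, n_terms):
        term = (term @ P) * (N * t / k)      # pass from the (k-1)-th to the k-th term
        total += term
    return total

# Example: a three-state chain subordinated to a Poisson process of rate N = 5.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
T_half = pseudo_poisson_semigroup(P, N=5, t=0.5)
```

In this finite-dimensional setting the series equals the matrix exponential of \(tN(P-I)\), so the semigroup property holds and the difference quotient \((T_h-I)/h\) is close to \(N(P-I)\) for small h.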

Finally, we introduce a family of continuous M-valued processes defined by

$$\begin{aligned} {{\hat{\xi }}}_t^N&=\exp _{\zeta ^N_{k}}\Big (N\Big (t-\frac{k}{N}\Big )Y^N_{k+1} \Big ) ,\quad t\in \Big [\frac{k}{N},\frac{k+1}{N}\Big ],\quad k\ge 0. \end{aligned}$$

Since the manifold is geodesically complete, the processes \(\zeta ^N\), \(\xi ^N\), and \({{\hat{\xi }}}^N\) are well defined for each \(N\ge 1\). By construction, the processes \({{\hat{\xi }}}^N\) have piecewise smooth sample paths consisting of geodesic segments and induce probability distributions \(\hat{{\mathbf {P}}}^N\) on the path space \(C([0,\infty ),M)\) of continuous M-valued functions. These are the geodesic random walks introduced and discussed in the Introduction, see Fig. 1 there, and in Sect. 2.4, see Fig. 3. Although the processes \({{\hat{\xi }}}^N\) are not Markovian, the convergence of the continuous time processes \((\zeta ^N_{[Nt]})_{t\ge 0}\), \((\xi ^N_t)_{t\ge 0}\) and \(({{\hat{\xi }}}^N_t)_{t\ge 0}\) is equivalent, see, e.g. [23, Theorem 17.28]. In the next sections, we will mainly work with the Markov processes \(\xi ^N\).

4 Proof of the Main Theorem

In this section, we prove the convergence of the geodesic random walks \(\lbrace \xi ^N\rbrace \). We will assume that the Finsler manifold (MF) is forward complete and connected (Hypothesis H\(_{c}\)) and has bounded geometry (Hypothesis H\(_b\)) and that the measures \(\lbrace \nu _p\rbrace _{p\in M}\) satisfy the condition H\(_{\nu }\) from Sect. 2.3.

4.1 Generators of Geodesic Random Walks

In this section, we show that the N-scaled geodesic random walks on a complete Finsler manifold (MF) with bounded geometry are Feller.

Lemma 4.1

Let \({\mathbf {H}}_\nu \) hold true and \(k\ge 0\). For any \({\mathcal {C}}^k\)-smooth function \(f:TM\rightarrow {\mathbb {R}}\), the mapping

$$\begin{aligned} \begin{aligned} p\rightarrow \int _{T_pM} f(Y)\,\nu _p(\mathrm {d}Y) \end{aligned}\end{aligned}$$
(4.1)

is also \({\mathcal {C}}^k\)-smooth.

This lemma is obvious since each \(\nu _p\) is only supported on \(D_pM\). Indeed, the integral over a compact set of a function smoothly depending on parameters smoothly depends on the parameters.

Now, we are ready to show the semigroups \(\lbrace T^N\rbrace \) are Feller and give the formula of the generators.

Proposition 4.2

Suppose (MF) is complete and uniformly elliptic. In addition, assume the measures \(\lbrace \nu _p\rbrace \) satisfy the hypothesis \({\mathbf {H}}_\nu \). Then for each \(N\ge 1\), the family of operators \(T^N=( T^N_t)_{t\ge 0}\) is a conservative Feller semigroup with the generator

$$\begin{aligned} A_N f =N\Big ( P^N f -f\Big ),\quad f\in {\mathcal {C}}_0. \end{aligned}$$
(4.2)

Proof

Let \(N\ge 1\) be fixed. Since by construction, \(T^N\) is a strongly continuous semigroup of a pseudo-Poisson process, its generator has the form (4.2) by Theorem 19.2 from [23]. It is conservative due to assumption \({\mathbf {H}}_{c}\). Let us show that \(T^N_t\) maps \({\mathcal {C}}_0\) into itself for \(t\ge 0\). Since we have

$$\begin{aligned} \Vert P^N f \Vert \le \Vert f\Vert , \end{aligned}$$
(4.3)

the series in (3.11) converges uniformly. It suffices to show \(P^N\) maps \({\mathcal {C}}_0\) into itself.

By Lemma 4.1, the mean value \(\mu _p\) is a \({\mathcal {C}}^\infty \) vector field. Since the exponential map of a Finsler manifold is at least \({\mathcal {C}}^1\), the operator \(P^N\) maps continuous functions into continuous functions.

For any \(\varepsilon >0\), choose some compact \(K\subseteq M\) such that \(\vert f(x)\vert <\dfrac{\varepsilon }{2}\) for \(x\notin K\). Fix any \(p_0\in K\) and define the closed forward balls at \(p_0\) for \(R\ge 0\) by

$$\begin{aligned} B^+_{p_0}(R){:}{=}\lbrace q\in M:d_a(p_0,q)\le R\rbrace . \end{aligned}$$

Because K is compact, there exists some \(R_0>0\) such that \(K\subset B^+_{p_0}(R)\) for all \(R\ge R_0\). By the Hopf–Rinow theorem (see Theorem 6.6.1 of [5]), the forward closed balls \(B^+_{p_0}(R)\) are also compact.

For any \(0\ne Y\in T_pM\) and \(p\in M\), the uniform ellipticity condition in Definition 2.1 gives

$$\begin{aligned} F^2(-Y)=g_{-Y}(Y,Y)\le C^2 g_Y(Y,Y)=C^2F^2(Y). \end{aligned}$$
(4.4)

It follows that

$$\begin{aligned} d_a(q,p_0)&\le C d_a(p_0,q)\le CR_0,\ \forall q\in K; \end{aligned}$$
(4.5)
$$\begin{aligned} d_a(p,p_0)&\ge \dfrac{1}{C} d_a(p_0,p)\ge \dfrac{R}{C},\ \forall p\in (B^+_{p_0}(R))^c,\ \forall R\ge 0. \end{aligned}$$
(4.6)

Let \(R_1{:}{=}C(C+2+CR_0)\), then \(\forall p\in (B^+_{p_0}(R_1))^c\) and \(\forall q\in K\), we have

$$\begin{aligned} d_a(p,q)\ge d_a(p,p_0)-d_a(q,p_0)\ge \dfrac{R_1}{C}-CR_0>C+1 \end{aligned}$$
(4.7)

On the other hand, for \(p\in M\) and \(Y\in D_pM\), we have

$$\begin{aligned} d_a(p,\mathbf{e }_p^N(Y))\le F\left( \dfrac{1}{\sqrt{N}}(Y-(1-1/\sqrt{N})\mu _p)\right) \le F(Y)+F(-\mu _p)\le C+1. \end{aligned}$$
(4.8)

Hence, \(\forall Y\in D_pM\) and \(p\in (B^+_{p_0}(R_1))^c\), we have \(\mathbf{e }_p^N(Y)\notin K\). It follows that \(\forall p\in (B^+_{p_0}(R_1))^c\):

$$\begin{aligned} \vert P^N f (p)\vert =\biggr |\int _{T_pM} f\circ \mathbf{e }_p^N(Y)\, \nu _p (\mathrm {d}Y) \biggr |\le \dfrac{\varepsilon }{2}. \end{aligned}$$
(4.9)

That is to say, \(\vert P^N f \vert \le \dfrac{\varepsilon }{2}\) outside the compact set \(B^+_{p_0}(R_1)\). We conclude that \(P^N f \in {\mathcal {C}}_0\). This completes the proof. \(\square \)

4.2 Convergence of the Generators of Geodesic Random Walks

In this section, we prove the generators \(A_N\) converge on the space \({\mathcal {C}}^{\infty }_K\) to some second-order elliptic operator with smooth coefficients.

Denote

$$\begin{aligned} \begin{aligned} f_{Y-\mu _p}(t)=f\circ \gamma _{Y-\mu _p}(t) \end{aligned}\end{aligned}$$
(4.10)

Proposition 4.3

Let A be the differential operator defined by

$$\begin{aligned} Af(p){:}{=}\mathrm {d}f(\mu _p)+\dfrac{1}{2}\int _{T_pM} \dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=0} f_{Y-\mu _p}(t)\, \nu _p (\mathrm {d}Y),\ f\in {\mathcal {C}}^{2}. \end{aligned}$$
(4.11)

Then A is a second-order positive definite elliptic operator with smooth coefficients and for each \(f\in {\mathcal {C}}^{\infty }_K\)

$$\begin{aligned} \lim _{N \rightarrow \infty }\Vert A_Nf-Af\Vert =0. \end{aligned}$$
(4.12)

Proof

The proof follows the steps from [22] in the Riemannian case. By computing the Taylor expansion of \(A_N f\), we show the convergence of the first- and second-order terms and vanishing of other higher-order terms as \(N\rightarrow \infty \).

Take any \(f\in {\mathcal {C}}^{\infty }_K\). We have

$$\begin{aligned} \begin{aligned} A_N(f)(p)&=N\Big ( P^N(f)(p)-f(p)\Big )\\&= N\ \int _{T_pM} \Big [ f\circ \gamma _{Y-(1-1/\sqrt{N}) \mu _p}\Big (\dfrac{1}{\sqrt{N}}\Big )-f(p)\Big ]\, \nu _p (\mathrm {d}Y) \end{aligned} \end{aligned}$$
(4.13)

Then for any \(p\in M\) and \(Y\in D_pM\), the Taylor expansion of

$$\begin{aligned} \begin{aligned} f_{Y-(1-1/\sqrt{N}) \mu _p }(t)=f\circ \gamma _{Y-(1-1/\sqrt{N}) \mu _p}(t) \end{aligned}\end{aligned}$$
(4.14)

gives

$$\begin{aligned} f_{Y-(1-1/\sqrt{N}) \mu _p}\left( \dfrac{1}{\sqrt{N}}\right)&= f(p)+ \dfrac{1}{\sqrt{N}} \mathrm {d}f_p(Y-(1-1/\sqrt{N}) \mu _p)\\&\quad + \dfrac{1}{2N}\dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=0}\left( f_{Y-(1-1/\sqrt{N}) \mu _p}(t)\right) +R_N(p,Y). \end{aligned}$$

Thus, we have

$$\begin{aligned} \begin{aligned} A_Nf (p)&= N \int _{T_pM}\biggr \lbrace \dfrac{1}{\sqrt{N}} \mathrm {d}f_p(Y-(1-1/\sqrt{N}) \mu _p) \\&\quad + \dfrac{1}{2N}\dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=0}(f_{Y-(1-1/\sqrt{N}) \mu _p}(t)) +R_N(p,Y)\biggr \rbrace \, \nu _p (\mathrm {d}Y)\\&= \mathrm {d}f(\mu _p)+\int _{T_pM} \dfrac{1}{2}\dfrac{\mathrm {d}^2}{\mathrm {d}t^2} \biggr \vert _{t=0}(f_{Y-(1-1/\sqrt{N}) \mu _p}(t))\,\nu _p (\mathrm {d}Y) \\&\quad +\int _{T_pM}N R_N(p,Y)\, \nu _p (\mathrm {d}Y). \end{aligned} \end{aligned}$$
(4.15)

Using Lemma 3.1 and Eq. (4.4), for any \(Y\in D_pM\) and \(p\in M\), there is some constant \(c_f>0\) such that

$$\begin{aligned} \begin{aligned} | R_N(p,Y)|&\le \dfrac{1}{N\sqrt{N}}\ \sup _{t \in [0,1/\sqrt{N}]}\ \biggr |\dfrac{\mathrm {d}^3}{\mathrm {d}t^3} f_{Y-(1-1/\sqrt{N}) \mu _p}(t) \biggr |\\&\le \dfrac{c_f}{N\sqrt{N}} F^3(Y-(1-1/\sqrt{N}) \mu _p)\\&\le \dfrac{c_f}{N\sqrt{N}} (F(Y)+F(-\mu _p))^3 \\&\le \dfrac{c_f}{N\sqrt{N}}(C+1)^3. \end{aligned}\end{aligned}$$
(4.16)

Clearly, we have from (4.16)

$$\begin{aligned} \lim _{N\rightarrow \infty } \sup _{p\in M, Y\in D_pM}| N R_N(p,Y)|=0. \end{aligned}$$
(4.17)

The last term in (4.15) tends to zero, since \(\nu _p\) is only supported on \(D_pM\).

For the second-order term, in a canonical coordinate of TM, we have

$$\begin{aligned} \begin{aligned}&\dfrac{\mathrm {d}^2}{\mathrm {d}t^2} \biggr \vert _{t=0}(f_{Y-(1-1/\sqrt{N}) \mu _p}(t))\\&\quad = f_{ij}\cdot y^i\left( Y-(1-\dfrac{1}{\sqrt{N}} )\mu _p\right) y^j\left( Y-(1-\dfrac{1}{\sqrt{N}} )\mu _p\right) \\&\qquad - f_k\cdot \Gamma ^k_{ij}\left( p,Y-(1-\dfrac{1}{\sqrt{N}})\mu _p \right) y^i\left( Y-(1-\dfrac{1}{\sqrt{N}} )\mu _p\right) y^j\\&\qquad \left( Y-(1-\dfrac{1}{\sqrt{N}} )\mu _p\right) . \end{aligned}\end{aligned}$$
(4.18)

Since the formal Christoffel symbols are bounded on each compact local coordinate, the right-hand side of (4.18) converges to

$$\begin{aligned} f_{ij}\cdot y^i(Y-\mu _p)y^j(Y-\mu _p) - f_k \cdot \Gamma ^k_{ij}(p,Y-\mu _p)y^i(Y-\mu _p)y^j(Y-\mu _p), \end{aligned}$$

as \(N\rightarrow \infty \) uniformly on DK for each compact chart \(K\subseteq M\), where \(DK=\lbrace Y\in D_pM:p\in K\rbrace \).

Choose a smooth coordinate on some open \(U\subseteq M\). The chain rule implies A has the following form in this coordinate.

$$\begin{aligned} Af(p)&= \mathrm {d}f(\mu _p)+ \dfrac{1}{2} \biggr ( f_{ij}\int _{T_pM} y^i(Y-\mu _p)y^j(Y-\mu _p)\,\nu _p (\mathrm {d}Y)\\&\quad - f_k\int _{T_pM} \Gamma ^k_{ij}(p,Y-\mu _p)y^i(Y-\mu _p)y^j(Y-\mu _p)\, \nu _p (\mathrm {d}Y) \biggr ). \end{aligned}$$

Because f has compact support, we have

$$\begin{aligned} \lim _{N\rightarrow \infty }\Vert A_N f-A f \Vert =0,\quad f\in {\mathcal {C}}^{\infty }_K. \end{aligned}$$
(4.19)

It follows that the symbol of A is

$$\begin{aligned} \sigma (A)(p)=\dfrac{1}{2}\int _{T_pM} \otimes ^2 (Y-\mu _p)\, \nu _p (\mathrm {d}Y). \end{aligned}$$
(4.20)

For each \(p\in M\), the measure \(\nu _p\) is induced by either a smooth non-zero \((m-1)\)-form on \(I_pM\) or an m-form on \(T_pM\) (condition \({\mathbf {H}}_{\nu }\)). Then \(\sigma (A)\) is positive definite, and hence, A is a strictly elliptic operator.

In each compact local coordinate, the functions \(\lbrace \Gamma ^k_{ij}\rbrace \) are bounded and smooth on \(TM\setminus \lbrace 0\rbrace \). It follows that A has smooth coefficients. This completes the proof.

\(\square \)
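The convergence (4.12) can be observed numerically in a flat one-dimensional toy model. The sketch below assumes \(M={\mathbb {R}}\) with a non-reversible Minkowski norm whose unit ball is \(D_pM=[-1/2,3/2]\) and takes \(\nu _p\) to be the normalized Lebesgue measure on it, so that \(\mu _p=1/2\), the variance is 1/3, \(\exp _p(v)=p+v\), and (4.11) reduces to \(Af=\frac{1}{2}f'+\frac{1}{6}f''\). These choices, the function names, and the use of Gauss–Legendre quadrature are assumptions of this example.

```python
import numpy as np

LEFT, RIGHT = -0.5, 1.5                      # the unit ball D_pM of the toy Minkowski norm
MU = 0.5 * (LEFT + RIGHT)                    # mean of the uniform measure on D_pM
VAR = (RIGHT - LEFT) ** 2 / 12.0             # its variance

def generator_AN(f, p, N, n_nodes=64):
    """A_N f(p) = N * (P^N f(p) - f(p)), with P^N computed as in (3.10)."""
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    Y = 0.5 * (RIGHT - LEFT) * nodes + MU    # map quadrature nodes from [-1, 1] to D_pM
    w = 0.5 * weights                        # weights of the normalized measure
    step = (Y - MU) / np.sqrt(N) + MU / N    # rescaled increment
    return N * (np.sum(w * f(p + step)) - f(p))

def generator_A(df, d2f, p):
    """Limit operator (4.11) in the flat case: mu * f' + (1/2) * variance * f''."""
    return MU * df(p) + 0.5 * VAR * d2f(p)

# Example: f = sin at p = 0.3, for increasing N.
limit = generator_A(np.cos, lambda x: -np.sin(x), 0.3)
errors = [abs(generator_AN(np.sin, 0.3, N) - limit) for N in (10, 100, 1000, 10000)]
```

In this symmetric example the third central moment of \(\nu _p\) vanishes, so the error decays like 1/N, which is faster than the rate \(1/\sqrt{N}\) guaranteed by the estimate (4.16).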

4.3 Tightness of the Family \(\lbrace \xi ^N\rbrace \)

In this section, we prove the family of random walks \(\lbrace \xi ^N\rbrace \) is tight in \(D([0,\infty ),M)\). Recall that the symmetrized distance d makes (Md) a complete separable metric space, as (MF) is forward complete and has bounded geometry.

Proposition 4.4

Let the Finsler manifold (MF) and the family \(\{ \nu _p\}_{p\in M}\) satisfy Assumptions H\(_{c}\), H\(_b\), and \({\mathbf {H}}_\nu \). Then the family of random walks \(\lbrace \xi ^N\rbrace _{N\ge 1}\) is tight in \(D([0,\infty ),M)\).

Proof

The statement follows from the Aldous criterion (Lemma 4.8) and the compact containment condition (Lemma 4.9) that will be proven in this section. \(\square \)

Our goal consists of obtaining uniform estimates for the oscillation of the random walks \(\xi ^N\), see Eq. (4.53). Because M is in general non-compact, the lower bound \({\text {inj}}_M\) of the injectivity radius can be zero. Since the non-symmetrized distance function \(d_a(p,\cdot )\) is smooth only within the injectivity radius, we work on the tangent bundle TM to bypass this technical problem. To prepare the proofs of Lemma 4.8 and Lemma 4.9, for \(R\ge 0\), define

$$\begin{aligned} D_p(R){:}{=}\lbrace Y\in T_pM:F(Y)\le R\rbrace . \end{aligned}$$

We make the following construction.

The condition \(K\le \lambda \) implies that there exists some \(0<\delta _c<1\) so that the conjugate radius \(\mathrm {con}_M>\delta _c\), see Proposition 3.2. For each \(p\in M\), the exponential map \(\exp _p\) is a smooth immersion on \(D_p(\delta _c)\) except possibly at 0. Then we can construct a geodesically complete smooth Finsler function \(F_p\) on \(T_pM\) such that \(F_p=(\exp _p)^*(F)\) on \(D_p(\frac{\delta _c}{2})\), while \(F_p\) is the standard Minkowski metric on \(T_pM\setminus D_p(1)\), under any standard identification \(T_pM\simeq {\mathbb {R}}^m\). To distinguish them from the distance functions on (MF), we denote the asymmetric and symmetric distances on \((T_pM,F_p)\) by \(d_a^p\) and \(d^p\), respectively. Note that the injectivity radius of \(F_p\) at \(0\in T_pM\) is at least \(\frac{\delta _c}{4}\).

Now for each \(p\in M\), we construct the measures \(\lbrace \tilde{\nu }_q\rbrace _{q\in T_pM}\), so that on \(D_p(\delta _c/2)\), the measures \(\lbrace \tilde{\nu }_q\rbrace _{q\in D_p(\delta _c/2)}\) are the lifts of \(\lbrace \nu _o\rbrace _{o\in M}\) by the exponential map \(\exp _p\). In addition, we require that the measures \(\lbrace \tilde{\nu }_q\rbrace _{q\in T_pM}\) satisfy the condition \({\mathbf {H}}_{\nu }\).

Then for each \(p\in M\) and \(N\ge 1\), we construct an N-scaled geodesic random walk \(\xi ^{N,p}\) on the Finsler manifold \((T_pM,F_p)\) starting at \(0\in T_pM\), using the prescribed measures \(\lbrace \tilde{\nu }_q\rbrace _{q\in T_pM}\) as in Sect. 3.2. Note that, since \((T_pM,F_p)\) satisfies \({\mathbf {H}}_{b}\) and \({\mathbf {H}}_{c}\), all results we proved earlier are true for the random walks \(\xi ^{N,p}\).

Lemma 4.5

There exists some \(\delta _0>0\) so that for each \(\delta \in (0,\delta _0)\), there exists a family of functions \(\lbrace f_p^{\delta }\rbrace _{p\in M}\) such that

  1. 1.

    Each \(f_p^{\delta }\) is a function on \(T_pM\) with \(0\le f_p^{\delta }\le 1\) such that \(f_p^{\delta }(0)=1\) and \(f^{\delta }_p(q)=0\) if \(q\notin D_p(\delta )\).

  2. 2.

    Denote \(A_{N,p}\) the generator associated to \(\xi ^{N,p}\), then there exists a constant \({\tilde{C}}(\delta )>0\) such that

    $$\begin{aligned} \sup _{N\ge 1}\sup _{p\in M} \sup _{q\in T_pM}\Big | A_{N,p}f_p^{\delta }(q)\Big | \le {\tilde{C}}(\delta ). \end{aligned}$$
    (4.21)

Proof

The general scheme to prove this lemma is as follows. We construct the family of functions \(\lbrace f^{\delta }_p\rbrace \) using the distance functions \(d_a^p(0,\cdot )\) on \((T_pM,F_p)\). The Hessian comparison theorem from [38, Sect. 15.1] applied to Finsler manifolds with bounded flag and T-curvature suggests that the distance functions have uniformly bounded Hessians. This fact applied to the Taylor expansion of \(f^{\delta }_p\) implies that the family of functions we constructed satisfies the conditions listed in the Lemma.

For \(\delta _c\in (0,1)\) chosen above, \(C>0\) and \(\lambda >0\) from Definition 2.1, let \(0<\delta _0<\min \{\frac{\delta _c}{4(C+1)},\frac{\pi }{2\sqrt{\lambda }}\}\). Fix any \(p\in M\), and denote by \(d_a^p(0,\cdot )\) the distance function for \((T_pM,F_p)\) from \(0\in T_pM\). Because the injectivity radius of \(F_p\) at 0 is at least \(\frac{\delta _c}{4}\), the distance function \(d_a^p(0,\cdot )\) is smooth on the open set \(D_p(\delta )\setminus \lbrace 0\rbrace \) for each \(0<\delta <\delta _0\).

Fix any \(\delta \in (0,\delta _0)\). Let \(\psi ^{\delta }:{\mathbb {R}}\rightarrow {\mathbb {R}}\) be a smooth function with compact support contained in \([-\frac{\delta }{2},\frac{\delta }{2}]\). Further suppose \(0\le \psi ^{\delta }\le 1\) and \(\psi ^{\delta }\equiv 1\) on \(I_1=[-\frac{\delta }{4},\frac{\delta }{4}]\). For each \(p\in M\), the function

$$\begin{aligned} f_p^{\delta }(q):=\psi ^{\delta } \circ d_a^p(0,q),\quad q\in T_pM, \end{aligned}$$
(4.22)

is smooth on \(T_pM\) and satisfies condition 1.

To prove condition 2, for any \(q\in (T_pM,F_p)\), let \(\tilde{\mu }_q\) be the mean of \(\tilde{\nu }_q\). An argument similar to Proposition 4.2 shows that for any \(q\in T_pM\)

$$\begin{aligned} A_{N,p}f_p^{\delta } (q)=N\Big [ \int _{T_q(T_pM)} f_p^{\delta }\circ \gamma _{Y-(1-1/\sqrt{N})\tilde{\mu }_q} \Big (\frac{1}{\sqrt{N}}\Big )\, \tilde{\nu }_q (\mathrm {d}Y) -f_p^{\delta }(q) \Big ]. \end{aligned}$$
(4.23)

Note \(\lbrace \tilde{\nu }_q\rbrace _{q\in T_pM}\) satisfies \({\mathbf {H}}_{\nu }\), so we only need to integrate over \(Y\in T_q(T_pM)\) with \(F_p(Y)\le 1\).

To simplify the notations, define for \(Y\in T_q(T_pM)\)

$$\begin{aligned} h_N(Y)(t)&:=d_a^p\Big (0,\gamma _{Y-(1-1/\sqrt{N})\tilde{\mu }_q}(t)\Big ), \end{aligned}$$
(4.24)
$$\begin{aligned} h_N^{\delta }(Y)(t)&:=\psi ^{\delta }\circ h_N(Y)(t)=f_p^{\delta }\circ \gamma _{Y-(1-1/\sqrt{N})\tilde{\mu }_q}(t), \quad t\ge 0. \end{aligned}$$
(4.25)

By Taylor's theorem, there exist functions \(\lbrace t_N\rbrace \) with

$$\begin{aligned} \begin{aligned} t_N:T_q (T_pM)\rightarrow \Big (0,\frac{1}{\sqrt{N}}\Big ) \end{aligned}\end{aligned}$$
(4.26)

such that

$$\begin{aligned} h_N^{\delta }(Y)\Big (\frac{1}{\sqrt{N}}\Big ) =f_p^{\delta }(q)+\mathrm {d}f_p^{\delta }(q)\Big (\frac{Y-\tilde{\mu }_q}{\sqrt{N}}+\frac{1}{N}\tilde{\mu }_q\Big ) +\frac{1}{2N}\frac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=t_N(Y)}\left( h_N^{\delta }(Y)(t)\right) . \end{aligned}$$
(4.27)

Using Eqs. (4.23) and (4.27), we get

$$\begin{aligned} A_{N,p} f_p^{\delta } (q)&=N \int _{T_q(T_pM)}\Big [ \mathrm {d}f_p^{\delta }(q)\left( \dfrac{Y-\tilde{\mu }_q}{\sqrt{N}}+\dfrac{1}{N}\tilde{\mu }_q\right) \nonumber \\&\quad + \dfrac{1}{2N}\dfrac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=t_N(Y)}\left( h_N^{\delta }(Y)(t)\right) \,\Big ] \tilde{\nu }_q (\mathrm {d}Y) \nonumber \\&=\mathrm {d}f^{\delta }_p(q)(\tilde{\mu }_q) + \frac{1}{2}\int _{T_q(T_pM)} \frac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=t_N(Y)}\left( h_N^{\delta }(Y)(t)\right) \tilde{\nu }_q (\mathrm {d}Y). \end{aligned}$$
(4.28)

We need to show that the right-hand side of (4.28) is uniformly bounded for all \(p\in M\), \(q\in T_pM\) and \(N\ge 1\).
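
As a side remark, the second equality in (4.28) can be unpacked as follows. The sketch uses only that \(\mathrm {d}f_p^{\delta }(q)\) is linear and that \(\tilde{\mu }_q\) is the mean of the probability measure \(\tilde{\nu }_q\), so that the centered integral vanishes:

$$\begin{aligned} N\int _{T_q(T_pM)} \mathrm {d}f_p^{\delta }(q)\Big (\frac{Y-\tilde{\mu }_q}{\sqrt{N}}+\frac{1}{N}\tilde{\mu }_q\Big )\,\tilde{\nu }_q(\mathrm {d}Y)&=\sqrt{N}\,\mathrm {d}f_p^{\delta }(q)\Big (\int _{T_q(T_pM)}(Y-\tilde{\mu }_q)\,\tilde{\nu }_q(\mathrm {d}Y)\Big )+\mathrm {d}f_p^{\delta }(q)(\tilde{\mu }_q)\\&=\mathrm {d}f_p^{\delta }(q)(\tilde{\mu }_q). \end{aligned}$$

The zeroth-order term \(f_p^{\delta }(q)\) of the Taylor expansion cancels against the term \(-f_p^{\delta }(q)\) in (4.23).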

First, we show for each \(0<\delta <\delta _0\),

$$\begin{aligned} \sup _{p\in M}\sup _{q\in T_pM}|\mathrm {d}f^{\delta }_p(q)(\tilde{\mu }_q)|<\infty . \end{aligned}$$
(4.29)

Clearly, if \(d_a^p(0,q)\le \frac{\delta }{4}\) or \(d_a^p(0,q)\ge \frac{\delta }{2}\), we have

$$\begin{aligned} \mathrm {d}f^{\delta }_p(q)({\tilde{\mu }}_q)=0. \end{aligned}$$

For any \(q\in T_pM\) such that \(\frac{\delta }{4}\le d_a^p(0,q)\le \frac{\delta }{2}\), the Finsler metric \(F_p|_{T_q(T_pM)}\) and the measure \(\tilde{\nu }_q\) are the pullbacks of F and \(\lbrace \nu _o\rbrace _{o\in M}\) by \(\exp _p\), respectively. Thus, \(F_p(\tilde{\mu }_q)<1\) and \(F_p(-\tilde{\mu }_q)\le C\). It follows that

$$\begin{aligned} | \tilde{\mu }_q(d^p_a(0,\cdot )) | \le C\quad \mathrm {if }\ \frac{\delta }{4}\le d_a^p(0,q)\le \frac{\delta }{2}. \end{aligned}$$

The function \(d_a^p(0,\cdot )\) is smooth at q for \(q\in T_pM\) with \(\frac{\delta }{4}\le d_a^p(0,q)\le \frac{\delta }{2}\). Hence, Eq. (4.29) holds by the chain rule.

Now it suffices to prove that the integrand in (4.28) is uniformly bounded for all \(Y\in T_q(T_pM)\), \(q\in T_pM\), \(p\in M\) and \(N\ge 1\). By the construction of \(\psi ^{\delta }\), for the case

$$\begin{aligned} \begin{aligned} d_a^p \Big (0,\gamma _{Y-(1-1/\sqrt{N}){\tilde{\mu }}_q}(t_N(Y))\Big )\ge \frac{\delta }{2}\quad \mathrm {or}\quad d_a^p\Big (0,\gamma _{Y- (1-1/\sqrt{N}){\tilde{\mu }}_q}(t_N(Y))\Big )\le \frac{\delta }{4}, \end{aligned}\nonumber \\ \end{aligned}$$
(4.30)

we have

$$\begin{aligned} \frac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=t_N(Y)} h_N^{\delta }(Y)(t) =0. \end{aligned}$$
(4.31)

For the case

$$\begin{aligned} \begin{aligned} \frac{\delta }{4}\le d_a^p\big (0,\gamma _{Y-(1-1/\sqrt{N}){\tilde{\mu }}_q}(t_N(Y))\big )\le \frac{\delta }{2}, \end{aligned}\end{aligned}$$
(4.32)

the function \(h_N(Y)(t)\) is smooth on some interval containing \(t=t_N(Y)\), because \(d_a^p(0,\cdot )\) is smooth on \(D_p(\delta )\setminus \lbrace 0 \rbrace \). Since \(\psi ^{\delta }\) is in \({\mathcal {C}}^{\infty }_K\), it is sufficient to show that the first and second derivatives of \(h_N(Y)(t)\) with respect to t are uniformly bounded.
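
The reduction in the last sentence is an instance of the chain rule, spelled out here as a sketch. Since \(h_N^{\delta }(Y)=\psi ^{\delta }\circ h_N(Y)\), we have

$$\begin{aligned} \frac{\mathrm {d}^2}{\mathrm {d}t^2}h_N^{\delta }(Y)(t)=(\psi ^{\delta })''\big (h_N(Y)(t)\big )\Big (\frac{\mathrm {d}}{\mathrm {d}t}h_N(Y)(t)\Big )^2+(\psi ^{\delta })'\big (h_N(Y)(t)\big )\,\frac{\mathrm {d}^2}{\mathrm {d}t^2}h_N(Y)(t). \end{aligned}$$

The factors \((\psi ^{\delta })'\) and \((\psi ^{\delta })''\) are bounded by constants depending only on \(\delta \), because \(\psi ^{\delta }\) is smooth with compact support. Hence uniform bounds on the first and second derivatives of \(h_N(Y)(t)\) give a uniform bound on the left-hand side.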

To simplify the notations, we denote by \(\nabla \rho :=\nabla d_a^p(0,\cdot )\) the Finsler gradient, see, e.g. [38, Eq. (3.14) in Sect. 3.2]. Following [38, Sect. 15.1], let us define

$$\begin{aligned}&\hat{{\mathbf {g}}}{:}{=}{\tilde{g}}_{\nabla \rho }, \quad r_N(Y){:}{=}d_a^p\Big (0,\gamma _{Y-(1-1/\sqrt{N}){\tilde{\mu }}_q}(t_N(Y))\Big ), \\&{\dot{\gamma }}_N{:}{=}\frac{\mathrm {d}}{\mathrm {d}t}\Big |_{t=t_N(Y)}\gamma _{Y-(1-1/\sqrt{N}){\tilde{\mu }}_q}(t),\quad Y^{\perp }{:}{=}{\dot{\gamma }}_N-\hat{{\mathbf {g}}}({\dot{\gamma }}_N,\nabla \rho )\nabla \rho . \end{aligned}$$

Here \({\tilde{g}}\) is the fundamental tensor of \(F_p\).

Because \(\exp _p\) is an isometric immersion on \(D_p(\delta )\), on \((T_pM,F_p)\), we also have the uniform elliptic conditions

$$\begin{aligned} \dfrac{1}{C^2}{\tilde{g}}_v(v,v)\le {\tilde{g}}_u(v,v)\le C^2 {\tilde{g}}_v(v,v), \end{aligned}$$
(4.33)

for any \(0\ne u,v\in T_q(T_pM)\) with \(q\in D_p(\delta )\). It follows that for all \(Y\in T_q(T_pM)\) with \(F_p(Y)\le 1\), \(q\in D_p(\delta )\), we have

$$\begin{aligned}&F_p( {\dot{\gamma }}_N )=F_p\Big (Y-(1-1/\sqrt{N})\tilde{\mu }_q\Big )\le C+1,\\&F_p(-{\dot{\gamma }}_N)\le C+1. \end{aligned}$$

This implies that for all \(Y\in T_q(T_pM)\) with \(F_p(Y)\le 1\),

$$\begin{aligned} \Big | \frac{\mathrm {d}}{\mathrm {d}t}\Big |_{t=t_N(Y)} h_N(Y)(t)\Big |\le C+1, \end{aligned}$$

if (4.32) holds.

The second derivative of \(h_N(Y)(t)\) can be estimated by the Hessian comparison theorem (see Sect. 15.1 of [38]), using the bounded curvature conditions in Definition 2.1. Note that \((D_p(\delta ),F_p)\) also has flag curvature and T-curvature bounded by \(\vert K\vert \le \lambda \) and \(\vert T\vert \le \lambda \), because \(\exp _p\) restricted to \((D_p(\delta ),F_p)\) is an isometric immersion. Since the injectivity radius of \(F_p\) at 0 is at least \(\frac{\delta _c}{4}>\delta \), the Hessian comparison theorem implies

$$\begin{aligned}&\left( \sqrt{\lambda }\cdot \cot (\sqrt{\lambda }\cdot r_N(Y))-\lambda \right) \hat{{\mathbf {g}}}(Y^{\perp },Y^{\perp }) \le \frac{\mathrm {d}^2}{\mathrm {d}t^2}\biggr \vert _{t=t_N(Y)}h_N(Y)(t), \end{aligned}$$
(4.34)
$$\begin{aligned}&\frac{\mathrm {d}^2}{\mathrm {d}t^2}\Big |_{t=t_N(Y)} h_N(Y)(t)\le \left( \sqrt{\lambda }\cdot \coth (\sqrt{\lambda }\cdot r_N(Y))+\lambda \right) \hat{{\mathbf {g}}}(Y^{\perp },Y^{\perp }). \end{aligned}$$
(4.35)

Using the fact \(\hat{{\mathbf {g}}}(\nabla \rho ,\nabla \rho )=F^2_p(\nabla \rho )=1\) on \(D_p(\delta )\setminus \lbrace 0\rbrace \), we get

$$\begin{aligned} \hat{{\mathbf {g}}}(Y^{\perp },Y^{\perp })&\ =\left|\hat{{\mathbf {g}}}({\dot{\gamma }}_N,{\dot{\gamma }}_N)-\hat{{\mathbf {g}}}^2({\dot{\gamma }}_N,\nabla \rho )\right|. \end{aligned}$$
(4.36)

If \(x\in D_p(\delta )\), for any tangent vectors \(Y_1,Y_2\in T_x(T_pM)\) with \(Y_1\ne 0\), the fundamental inequality in Finsler geometry (see 1.2.16 of [5]) and the inequality \(F_p(Y_2)\le C F_p(-Y_2)\) give

$$\begin{aligned} \begin{aligned} \vert {\tilde{g}}_{Y_1}(Y_1,Y_2)\vert \le C F_p(Y_1)F_p(Y_2). \end{aligned}\end{aligned}$$
(4.37)

Substituting this into (4.36) and using uniform ellipticity, for Y such that \(F_p(Y)\le 1\) and (4.32) holds, we obtain

$$\begin{aligned} \hat{{\mathbf {g}}}(Y^{\perp },Y^{\perp })&\le C {\tilde{g}}_{{\dot{\gamma }}_N}({\dot{\gamma }}_N,{\dot{\gamma }}_N)+C^2F^2_p({\dot{\gamma }}_N)\\&\le 2C^2F_p^2\Big (Y-\Big (1-\frac{1}{\sqrt{N}}\Big ){\tilde{\mu }}_q\Big )\\&\le 2C^2(C+1)^2. \end{aligned}$$

Since \(\frac{\delta }{4}\le r_N(Y)\le \frac{\delta }{2}\) and \(\delta <\frac{\pi }{2\sqrt{\lambda }}\), we have for all \(N\ge 1\):

$$\begin{aligned} 0\le \cot (\sqrt{\lambda }\cdot r_N(Y))\le \coth (\sqrt{\lambda }\cdot r_N(Y)). \end{aligned}$$
(4.38)

Then for all \(N\ge 1\) and Y with \(F_p(Y)\le 1\), we have the estimate:

$$\begin{aligned} \left|\frac{\mathrm {d}^2}{\mathrm {d}t^2}\Big |_{t=t_N(Y)} h_N(Y)(t)\right|&\le \Big (\sqrt{\lambda }\cdot \coth (\frac{\delta \sqrt{\lambda }}{4}) +\lambda \Big )\hat{{\mathbf {g}}}(Y^{\perp },Y^{\perp }), \end{aligned}$$
(4.39)
$$\begin{aligned}&\le 2C^2(C+1)^2\Big (\sqrt{\lambda }\cdot \coth (\frac{\delta \sqrt{\lambda }}{4}) +\lambda \Big ). \end{aligned}$$
(4.40)

This shows the second derivative of \(h_N(Y)(t)\) evaluated at \(t=t_N(Y)\) is also uniformly and absolutely bounded, if (4.32) holds true. Then there exists some \({{\tilde{C}}}(\delta )>0\) such that condition 2 holds. This completes the proof. \(\square \)
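
To get a feeling for the size of the constant in (4.40), the following minimal Python sketch evaluates \(2C^2(C+1)^2\big (\sqrt{\lambda }\coth (\delta \sqrt{\lambda }/4)+\lambda \big )\) numerically. The function name `hessian_bound` and the sample values of \(C\), \(\lambda \) and \(\delta \) are illustrative assumptions and do not come from the paper.

```python
import math


def hessian_bound(C: float, lam: float, delta: float) -> float:
    """Evaluate 2 C^2 (C+1)^2 (sqrt(lam) * coth(delta * sqrt(lam) / 4) + lam).

    This is the right-hand side of (4.40); the name and the argument
    order are hypothetical choices made for this illustration.
    """
    if C <= 0 or lam <= 0 or delta <= 0:
        raise ValueError("C, lam and delta must be positive")
    x = delta * math.sqrt(lam) / 4.0
    coth = 1.0 / math.tanh(x)  # coth(x) = 1 / tanh(x) for x > 0
    return 2.0 * C**2 * (C + 1.0) ** 2 * (math.sqrt(lam) * coth + lam)


if __name__ == "__main__":
    # Sample values only: the bound grows as the radius delta shrinks.
    for delta in (0.4, 0.2, 0.1):
        print(delta, hessian_bound(C=1.0, lam=1.0, delta=delta))
```

Since \(\coth (x)\approx 1/x\) for small \(x\), the bound behaves like \(8C^2(C+1)^2/\delta \) as \(\delta \rightarrow 0\). This is consistent with the constant \({\tilde{C}}(\delta )\) in condition 2 being allowed to depend on \(\delta \).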

The family of functions \(\{f^\delta _p\}\) will be used now to estimate the first exit time from a \(\delta \)-ball of the geodesic random walk \(\xi ^N\).

For each \(p\in M\), \(N\ge 1\), and \(\delta >0\), define the following stopping times for the random walk \(\xi ^N\) on M and \(\xi ^{N,p}\) on \(T_pM\):

$$\begin{aligned} \tau ^{N,\delta }&{:}{=}\inf \lbrace t>0:d(\xi ^N_t,p)>\delta \rbrace , \end{aligned}$$
(4.41)
$$\begin{aligned} \tau ^{N,\delta }_p&{:}{=}\inf \lbrace t>0 :d^p(\xi ^{N,p}_t,0)>\delta \rbrace ,\quad 0\in T_pM. \end{aligned}$$
(4.42)

Now we compare the exit time probabilities of the \(\delta \)-balls for \(\xi ^N_t\) and \(\xi ^{N,p}_t\) for sufficiently large N.

Lemma 4.6

For any \(p\in M\), \(N\ge 1\), and \(\delta \) such that \(0<\delta <\delta _0\) and \(\frac{2(C+1)}{\sqrt{N}}<\frac{\delta _c}{4}\), we have

$$\begin{aligned} {\mathbf {P}}_p(\tau ^{N,\delta }\le t)\le {\mathbf {P}}_0(\tau ^{N,\delta }_p\le t), \quad \forall t\ge 0. \end{aligned}$$
(4.43)

Proof

The geodesic random walks \(\xi ^{N,p}\) and \(\xi ^N\) are constructed by randomizing the time of the discrete Markov processes \(\zeta ^{N,p}\) and \(\zeta ^N\), respectively, using a Poisson process. Hence, it suffices to show that for each pair \((N,\delta )\) satisfying the condition in the lemma, the following holds:

$$\begin{aligned} {\mathbf {P}}_0\left( \max _{j\le k}d^p\left( 0,\zeta _j^{N,p} \right) \le \delta \right) \le {\mathbf {P}}_p\left( \max _{j\le k} d(p,\zeta ^N_j )\le \delta \right) ,\ \forall p\in M,\ \forall \delta <\delta _0,\ k\ge 0. \end{aligned}$$
(4.44)
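
To see why (4.44) suffices, here is a sketch under the assumption (our notation, suggested by the Poisson time change just recalled) that \(\xi ^N_t=\zeta ^N_{Q(Nt)}\) and \(\xi ^{N,p}_t=\zeta ^{N,p}_{Q(Nt)}\) for a rate-one Poisson process \(Q\) independent of the chains. Conditioning on the number of jumps up to time t gives

$$\begin{aligned} {\mathbf {P}}_p(\tau ^{N,\delta }>t)&=\sum _{k\ge 0}{\mathbf {P}}(Q(Nt)=k)\,{\mathbf {P}}_p\Big (\max _{j\le k}d(p,\zeta ^N_j)\le \delta \Big ),\\ {\mathbf {P}}_0(\tau ^{N,\delta }_p>t)&=\sum _{k\ge 0}{\mathbf {P}}(Q(Nt)=k)\,{\mathbf {P}}_0\Big (\max _{j\le k}d^p(0,\zeta ^{N,p}_j)\le \delta \Big ). \end{aligned}$$

Comparing the two sums term by term with (4.44) yields \({\mathbf {P}}_0(\tau ^{N,\delta }_p>t)\le {\mathbf {P}}_p(\tau ^{N,\delta }>t)\), which is (4.43) after taking complements.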

For \(r>0\), define the closed balls of radius r for the symmetrized distances on \(T_pM\) and M, respectively:

$$\begin{aligned} B_0^p(r)&= \lbrace y\in T_pM:d^p(0,y)\le r\rbrace ,\\ B_p(r)&=\lbrace q\in M:d(p,q)\le r \rbrace . \end{aligned}$$

Because \(\delta<\delta _0<\frac{\delta _c}{4(C+1)}\), we have \(B^p_0(\delta )\subset D_p(\delta _c/2)\). Hence, \(\exp _p\) maps \((B^p_0(\delta ),F_p)\) inside \((B_p(\delta ),F)\) by an isometric immersion. Now for each \(k\ge 0\), define the following Borel sub-probability measures on \(B^p_0(\delta )\) and \(B_p(\delta )\), respectively.

$$\begin{aligned} \theta ^0_k({\hat{E}})&={\mathbf {P}}_0\left( \zeta ^{N,p}_{j}\in B^p_0(\delta ),\ 1\le j\le k-1,\ \zeta ^{N,p}_k\in {\hat{E}} \right) ,\ \forall {\hat{E}}\in {\mathcal {B}}(B^p_0(\delta ));\\ \theta _k(E)&={\mathbf {P}}_p\left( \zeta ^N_{j}\in B_p(\delta ),\ 1\le j\le k-1,\ \zeta ^N_k\in E \right) ,\ \forall E\in {\mathcal {B}}(B_p(\delta )). \end{aligned}$$

Let \(\theta ^{p}_k{:}{=}\left( \exp _p\vert _{B^p_0(\delta )}\right) _*\theta ^0_k\). For integers N such that \(\frac{2(C+1)}{\sqrt{N}}<\frac{\delta _c}{4}\), we claim \(\theta ^p_k\le \theta _k\) for all \(k\ge 0\).

We prove the claim by induction. For \(k=0\), we have \(\theta ^0_0({\hat{E}})={\mathbf {1}}_{{\hat{E}}}(0)\) and \(\theta _0(E)={\mathbf {1}}_E(p)\). Since \(\exp _p(0)=p\), the claim holds for \(k=0\). For simplicity, we denote

$$\begin{aligned} {\mathbf {e}}_p^N(Y){:}{=}\exp _p\Big (\frac{Y-\mu _p}{\sqrt{N}}+\frac{1}{N}\mu _p\Big ). \end{aligned}$$
(4.45)

For any \(q\in B^p_0(\delta )\) and \(Y\in T_q(T_pM)\) with \(F_p(Y)\le 1\), the condition \(\frac{2(C+1)}{\sqrt{N}}<\frac{\delta _c}{4}\) implies the N-scaled geodesic segment \(\gamma (t)=\exp _q(t{\mathbf {e}}^N_q(Y))\) for \(t\in [0,1]\) of \(F_p\) is mapped by \(\exp _p\) to a geodesic segment on (MF). Also note for \(q\in B^p_0(\delta )\), the mean is preserved under \(\exp _p\) by

$$\begin{aligned} (\mathrm {d}\exp _p(q))_*(\tilde{\mu }_q)=\mu _{\exp _p(q)}. \end{aligned}$$

Hence, for any \(q\in B^p_0(\delta )\) and \(Y\in T_q(T_pM)\) with \(F_p(Y)\le 1\), we get

$$\begin{aligned} \exp _p \left( {\mathbf {e}}^N_q(Y)\right) ={\mathbf {e}}^N_{\exp _p(q)}\left( (\mathrm {d}\exp _p(q))_*(Y) \right) . \end{aligned}$$
(4.46)

For each \(N\ge 1\), let \(P^o(x,\cdot )\) and \(P(y,\cdot )\) be the one-step transition probabilities of \(\zeta ^{N,p}\) and \(\zeta ^N\), respectively. For \(q\in B^p_0(\delta )\), set \(q_1=\exp _p(q)\). Since \(\tilde{\nu }_q\) is the pullback of \(\nu _{q_1}\) by \((d\exp _p(q))\), (4.46) implies that for any Borel set \({\hat{E}}\subset B^p_0(\delta )\), we have

$$\begin{aligned} P^o(q,{\hat{E}})=\tilde{\nu }_q\left( \left( {\mathbf {e}}^N_q\right) ^{-1}({\hat{E}})\right) \le \nu _{q_1}\left( \left( {\mathbf {e}}^N_{q_1}\right) ^{-1}(\exp _p({\hat{E}}))\right) =P(q_1,\exp _p({\hat{E}})). \end{aligned}$$
(4.47)

The inequality in the previous formula appears because the exponential map \(\exp _p\) restricted to \(B^p_0(\delta )\) is not necessarily injective.

In particular, for any Borel \(E\subset B_p(\delta )\), we get

$$\begin{aligned} P(q_1,E)\ge P^o(q,(\exp _p)^{-1}(E)\cap B^p_0(\delta )). \end{aligned}$$
(4.48)

By the Markov property, we have for all \(k\ge 1\)

$$\begin{aligned} \theta ^o_k({{\hat{E}}})&=\int _{B^p_0(\delta )} P^o(y,{\hat{E}})\theta ^o_{k-1}(\mathrm {d}y),\ \forall {\hat{E}}\in {\mathcal {B}}(B^p_0(\delta ));\\ \theta _k(E)&=\int _{B_p(\delta )} P(x,E)\theta _{k-1}(\mathrm {d}x),\ \forall E\in {\mathcal {B}}(B_p(\delta )). \end{aligned}$$

Hence, we have the following chain of inequalities:

$$\begin{aligned} \theta _k(E)&=\int _{B_p(\delta )} P(x,E)\theta _{k-1}(\mathrm {d}x)\\&\ge \int _{B_p(\delta )} P(x,E)\theta ^p_{k-1} (\mathrm {d}x)\\&= \int _{B^p_0(\delta )} P(\exp _p(y),E)\theta ^o_{k-1}(\mathrm {d}y)\\&\ge \int _{B^p_0(\delta )} P^o(y,(\exp _p)^{-1}(E)\cap B^p_0(\delta ))\theta ^o_{k-1}(\mathrm {d}y)\\&=\theta ^p_k(E). \end{aligned}$$

Here the first inequality comes from the induction assumption, and the second inequality is due to (4.48).

In particular, for all \(k\ge 0\), the inequality

$$\begin{aligned} \theta ^o_k(B^p_0(\delta ))= \theta ^p_k(B_p(\delta ))\le \theta _k(B_p(\delta )) \end{aligned}$$

implies (4.44) holds. This completes the proof. \(\square \)

Next, we prove the following estimate on the first exit times of \(\xi ^N\) from \(\delta \)-balls on M.

Lemma 4.7

For each \(\delta >0\), there is \(C(\delta )>0\) such that for all \(t\ge 0\)

$$\begin{aligned} \sup _{p\in M}\sup _{N\ge 1}{\mathbf {P}}_p(\tau ^{N,\delta }\le t)\le C(\delta )t. \end{aligned}$$
(4.49)

In particular,

$$\begin{aligned} \sup _{p\in M}\sup _{N\ge 1}{\mathbf {E}}_p\mathrm {e}^{-\tau ^{N,\delta }} <1. \end{aligned}$$
(4.50)

Proof

We adapt ideas from [26] by Kunita. The proof is divided into two steps. First, we consider the exit times for the lifted random walks \(\xi ^{N,p}\), and we show

$$\begin{aligned} \sup _{p\in M}\sup _{N\ge 1}{\mathbf {P}}_0(\tau ^{N,\delta }_p\le t)\le {\tilde{C}}(\delta )t, \end{aligned}$$
(4.51)

where \(0\in T_pM\) and \({\tilde{C}}(\delta )>0\) is the constant defined in Lemma 4.5. Next we use Lemma 4.6 and (4.51) to prove (4.49), and (4.50) directly follows from (4.49).

It suffices to show (4.51) for sufficiently small \(\delta \). For any \(\delta \in (0,\delta _0)\) and \(p\in M\), let \(\lbrace f^{\delta }_p\rbrace \) be the family of functions constructed in Lemma 4.5. Using \(0\le f^{\delta }_p\le 1\) and \(f^{\delta }_p(0)=1\), we have

$$\begin{aligned} {\mathbf {P}}_0(\tau ^{N,\delta }_p\le t)&=1-{\mathbf {P}}_0(\tau ^{N,\delta }_p>t)\\&\le 1-{\mathbf {E}}_0\Big [ {\mathbb {I}}(\tau ^{N,\delta }_p>t)f^{\delta }_p(\xi ^{N,p}(\tau ^{N,\delta }_p\wedge t))\Big ], \\&=1- {\mathbf {E}}_0\Big [\Big (1- {\mathbb {I}}(\tau ^{N,\delta }_p\le t)\Big )f^{\delta }_p\Big (\xi ^{N,p}(\tau ^{N,\delta }_p\wedge t)\Big )\Big ]\\&= f^{\delta }_p(0)-{\mathbf {E}}_0 f^{\delta }_p\Big (\xi ^{N,p}(\tau ^{N,\delta }_p\wedge t)\Big )+ {\mathbf {E}}_0\Big [ {\mathbb {I}}(\tau ^{N,\delta }_p\le t)f^{\delta }_p\Big (\xi ^{N,p}(\tau ^{N,\delta }_p\wedge t)\Big )\Big ]\\&= f^{\delta }_p(0)-{\mathbf {E}}_0 f^{\delta }_p\Big (\xi ^{N,p}(\tau ^{N,\delta }_p\wedge t)\Big )+ {\mathbf {E}}_0\Big [ {\mathbb {I}}(\tau ^{N,\delta }_p\le t) f^{\delta }_p\Big (\xi ^{N,p}(\tau ^{N,\delta }_p)\Big )\Big ]. \end{aligned}$$

Here, the notation \(a\wedge b:=\min \{a,b\}\) is standard in the theory of stochastic processes. Taking into account that \(f^{\delta }_p\left( \xi ^{N,p}\left( \tau ^{N,\delta }_p\right) \right) =0\) and applying the Dynkin formula to the first two terms of the last line above, we obtain

$$\begin{aligned} {\mathbf {P}}_0(\tau ^{N,\delta }_p\le t) \le -{\mathbf {E}}_0\int ^{\tau ^{N,\delta }_p\wedge t}_0 A_{N,p} f^{\delta }_p(\xi ^{N,p}_s)\,\mathrm {d}s \le \sup _{N\ge 1}\Vert A_{N,p} f^{\delta }_p\Vert \cdot t\le {\tilde{C}}(\delta )t,\ \forall p\in M. \end{aligned}$$

Let \(N_0\) be the smallest positive integer such that \(\dfrac{2(C+1)}{\sqrt{N_0}}< \dfrac{\delta _c}{4}\). For \(N\le N_0\), we always have

$$\begin{aligned} {\mathbf {P}}_p(\tau ^{N,\delta }\le t)\le {\mathbf {P}}(Q(Nt)>0)\le Nt\le N_0t. \end{aligned}$$
(4.52)

This together with Lemma 4.6 proves the inequality (4.49) by setting \(C(\delta )=\max \{{{\tilde{C}}}(\delta ),N_0\}\).

Furthermore, for \(t_*=\dfrac{1}{2C(\delta )}>0\)

$$\begin{aligned} {\mathbf {E}}_p\mathrm {e}^{-\tau ^{N,\delta }}&={\mathbf {E}}_p{\mathbb {I}}(\tau ^{N,\delta }\le t_*)\mathrm {e}^{-\tau ^{N,\delta }}+{\mathbf {E}}_p{\mathbb {I}}(\tau ^{N,\delta }> t_*)\mathrm {e}^{-\tau ^{N,\delta }}\\&\le {\mathbf {P}}_p(\tau ^{N,\delta }\le t_*)+\mathrm {e}^{-t_*}\left( 1-{\mathbf {P}}_p(\tau ^{N,\delta }\le t_*) \right) \\&= \mathrm {e}^{-t_*}+(1-\mathrm {e}^{-t_*}){\mathbf {P}}_p(\tau ^{N,\delta }\le t_*). \end{aligned}$$

Note that due to (4.49), we have \({\mathbf {P}}_p(\tau ^{N,\delta }\le t_*)\le C(\delta )t_*\le \dfrac{1}{2}\). Hence we obtain

$$\begin{aligned} {\mathbf {E}}_p\mathrm {e}^{-\tau ^{N,\delta }}\le \mathrm {e}^{-t_*}+\dfrac{1-\mathrm {e}^{-t_*}}{2}=\dfrac{1+\mathrm {e}^{-t_*}}{2}<1. \end{aligned}$$

This proves the second inequality of the lemma. \(\square \)
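
The last step of the proof is a one-line computation that is easy to check numerically. The sketch below, with a hypothetical function name `laplace_bound` and made-up sample values for \(C(\delta )\), evaluates the bound \((1+\mathrm {e}^{-t_*})/2\) with \(t_*=1/(2C(\delta ))\) and confirms that it stays strictly below 1.

```python
import math


def laplace_bound(C_delta: float) -> float:
    """Return (1 + exp(-t_star)) / 2 with t_star = 1 / (2 * C_delta).

    C_delta plays the role of the constant C(delta) in (4.49); the
    function name is an assumption made for this illustration.
    """
    if C_delta <= 0:
        raise ValueError("C_delta must be positive")
    t_star = 1.0 / (2.0 * C_delta)
    return (1.0 + math.exp(-t_star)) / 2.0


if __name__ == "__main__":
    # Sample constants only: a larger C(delta) pushes the bound towards 1.
    for C_delta in (0.5, 1.0, 10.0, 100.0):
        print(C_delta, laplace_bound(C_delta))
```

The bound depends on \(p\) and \(N\) only through \(C(\delta )\), which is why the supremum in (4.50) is strictly smaller than 1.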

Now we are ready to show the Aldous criteria hold in our situation.

Lemma 4.8

(Aldous criteria) For any initial point \(p\in M\), any \(T>0\), \(\delta >0\), and any \(({\mathcal {F}}^N)\)-stopping times \(0\le \tau \le T\), we have

$$\begin{aligned} \lim _{s\rightarrow 0}\limsup _{N\rightarrow \infty }\sup _{\tau }\sup _{h\in [0,s]}{\mathbf {P}}_p\left( d(\xi ^N_{\tau },\xi ^N_{\tau +h})>\delta \right) =0 \end{aligned}$$
(4.53)

Proof

Let \(\delta ,s>0\) be fixed. For each \(N\ge 1\), \(p\in M\), \(h\in [0,s]\) and a stopping time \(\tau \), we have

$$\begin{aligned} {\mathbf {P}}_p\left( d(\xi ^N_{\tau },\xi ^N_{\tau +h})>\delta \right)&={\mathbf {E}}_p\biggr [ {\mathbb {I}}\left( d(\xi ^N_{\tau },\xi ^N_{\tau +h})>\delta \right) \biggr ]\\&={\mathbf {E}}_p\biggr [{\mathbf {E}}\biggr [{\mathbb {I}}\left( d(\xi ^N_{\tau },\xi ^N_{\tau +h})>\delta \right) \vert {\mathcal {F}}^N_{\tau }\biggr ]\biggr ]\\&={\mathbf {E}}_p\biggr [ {\mathbf {P}}\left( d(\xi ^N_{\tau },\xi ^N_{\tau +h}) >\delta \vert {\mathcal {F}}^N_{\tau }\right) \biggr ]. \end{aligned}$$

The strong Markov property of \(\xi ^N\) yields

$$\begin{aligned} {\mathbf {E}}_p\biggr [ {\mathbf {P}}\left( d(\xi ^N_{\tau },\xi ^N_{\tau +h})>\delta \vert {\mathcal {F}}^N_{\tau }\right) \biggr ] ={\mathbf {E}}_p{\mathbf {P}}_{\xi ^N_{\tau }}\biggr (d(\xi ^N_0,\xi ^N_h)>\delta \biggr ) \le {\mathbf {E}}_p\Big [\sup _{q\in M} {\mathbf {P}}_q(\tau ^{N,\delta }\le h)\Big ]. \end{aligned}$$

Thus by Lemma 4.7, we have

$$\begin{aligned} {\mathbf {P}}_p\left( d(\xi ^N_{\tau },\xi ^N_{\tau +h})>\delta \right) \le C(\delta )h\le C(\delta )s. \end{aligned}$$
(4.54)

Taking suprema and letting \(s\rightarrow 0\), we obtain the limit in Eq. (4.53). \(\square \)
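
The estimate \({\mathbf {P}}_q(\tau ^{N,\delta }\le h)\le C(\delta )h\) used above can be illustrated by a small Monte Carlo experiment. The sketch below treats only the flat Pearson walk on \({\mathbb {R}}^2\) from the introduction (isotropic unit directions, steps of length \(1/\sqrt{N}\), jumps at the times of a Poisson process of rate N), not the general Finsler setting. The function names and parameter values are assumptions made for this illustration.

```python
import math
import random


def exits_ball(N: int, delta: float, t: float, rng: random.Random) -> bool:
    """Simulate one flat Pearson walk started at the origin and report
    whether it leaves the closed delta-ball before time t."""
    clock, x, y = 0.0, 0.0, 0.0
    step = 1.0 / math.sqrt(N)
    while True:
        clock += rng.expovariate(N)  # waiting time of a rate-N Poisson clock
        if clock > t:
            return False
        theta = rng.uniform(0.0, 2.0 * math.pi)  # isotropic direction
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        if math.hypot(x, y) > delta:
            return True


def exit_probability(N: int, delta: float, t: float,
                     trials: int = 500, seed: int = 0) -> float:
    """Monte Carlo estimate of P(exit time from the delta-ball <= t)."""
    rng = random.Random(seed)
    hits = sum(exits_ball(N, delta, t, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for N in (25, 100, 400):
        print(N, [exit_probability(N, 1.0, t) for t in (0.05, 0.5, 2.0)])
```

In such runs one expects the estimated exit probabilities for small t to be small for every N, in line with the uniformity in N that the Aldous criteria require. The simulation is an illustration only and plays no role in the proof.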

Next we show that the family of processes \(\lbrace \xi ^N\rbrace \) satisfies the compact containment condition.

Lemma 4.9

(compact containment condition) For any \(\varepsilon >0\), \(T\ge 0\) and \(p\in M\) there is a compact neighbourhood \(K_p(\varepsilon )\subseteq M\) of p such that

$$\begin{aligned} \inf _N{\mathbf {P}}_p\Big (\xi ^N_t\in K_p(\varepsilon ),\ t\in [0,T]\Big )\ge 1-\varepsilon . \end{aligned}$$
(4.55)

Proof

Let us define the following sequence of exit times

$$\begin{aligned} \tau ^N_0&{:}{=}0,\\ \tau ^N_k&{:}{=}\inf \left\{ s>\tau ^N_{k-1}:d\left( \xi ^N_s,\xi ^N_{\tau ^N_{k-1}}\right) > 1\right\} ,\ k\ge 1, \end{aligned}$$

(as usual, we set \(\inf \emptyset =+\infty \)).

By Lemma 4.7, there exists some constant \(c_1\in (0,1)\) such that

$$\begin{aligned} \sup _{p\in M}\sup _N{\mathbf {E}}_p \mathrm {e}^{-\tau ^N_1}=\sup _{p\in M}\sup _N{\mathbf {E}}_p \mathrm {e}^{-\tau ^{N,\delta _1}}\le c_1<1 \end{aligned}$$

Then for \(k\ge 1\) and \(\forall N\ge 1\), the strong Markov property yields

$$\begin{aligned} {\mathbf {E}}_p\mathrm {e}^{-\tau ^N_k}&= {\mathbf {E}}_p \big [\mathrm {e}^{-\tau ^N_{k-1}}\cdot \mathrm {e}^{\tau ^N_{k-1}-\tau ^N_k}\big ],\\&= {\mathbf {E}}_p \biggr [\mathrm {e}^{-\tau ^N_{k-1}}\cdot {\mathbf {E}}\big [\mathrm {e}^{\tau ^N_{k-1}-\tau ^N_k}\vert {\mathcal {F}}^N_{\tau ^N_{k-1}}\big ]\biggr ],\\&= {\mathbf {E}}_p \biggr [\mathrm {e}^{-\tau ^N_{k-1}} \cdot {\mathbf {E}}_{\xi ^N_{\tau ^N_{k-1}}} \mathrm {e}^{-\tau ^N_1}\biggr ],\\&\le c_1\cdot {\mathbf {E}}_p\mathrm {e}^{-\tau ^N_{k-1}}\le c_1^k. \end{aligned}$$

For any \(\varepsilon >0\) and \(T\ge 0\), define

$$\begin{aligned} k_{\varepsilon }{:}{=}\left\lceil \dfrac{\ln {\varepsilon }-T}{\ln {c_1}}\right\rceil . \end{aligned}$$
(4.56)

Then the exponential Markov inequality gives \(\forall p\in M\), \(\forall N\ge 1\):

$$\begin{aligned} {\mathbf {P}}_p\left( \tau ^N_{k_{\varepsilon }}\le T\right) ={\mathbf {P}}_p \left( \mathrm {e}^{-\tau ^N_{k_{\varepsilon }}}\ge \mathrm {e}^{-T} \right) \le \mathrm {e}^{T}{\mathbf {E}}_p\mathrm {e}^{-\tau ^N_{k_{\varepsilon }}}\le \mathrm {e}^{T} c_1^{k_{\varepsilon }}\le \varepsilon . \end{aligned}$$
(4.57)

By construction and the triangle inequality, we have that for each \(k\ge 1\), each \(N\ge 1\)

$$\begin{aligned}&d\left( \xi ^N_{\tau ^N_{k}},\xi ^N_{\tau ^N_{k-1}}\right) \le 1+ \sup _N\sup _p d(p,\zeta ^N_1). \end{aligned}$$
(4.58)

We estimate the last term:

$$\begin{aligned} d(p,\zeta ^N_1)&\le \sup _{Y\in D_pM}d\Big (p,\exp _p\Big (\frac{Y-\mu _p}{\sqrt{N}}+\frac{\mu _p}{N}\Big )\Big ) \end{aligned}$$
(4.59)
$$\begin{aligned}&\le \sup _{Y\in D_pM}\dfrac{C}{\sqrt{N}}{}\cdot F\Big (Y-\Big (1-\frac{1}{\sqrt{N}}\Big )\mu _{p}\Big ) \end{aligned}$$
(4.60)
$$\begin{aligned}&\le C(C+1). \end{aligned}$$
(4.61)

Thus, we have for \(k\ge 1\)

$$\begin{aligned} d\left( \xi ^N_0,\xi ^N_{\tau ^N_k}\right) \le k\Big (1+C(C+1)\Big ). \end{aligned}$$
(4.62)

Now for \(p\in M\), \(\varepsilon >0\), and \(k_{\varepsilon }\) defined as in Eq. (4.56), consider the closed ball

$$\begin{aligned} K_p(\varepsilon ){:}{=}\lbrace q\in M :d(p,q)\le R(\varepsilon ,T)\rbrace , \end{aligned}$$
(4.63)

with radius

$$\begin{aligned} \begin{aligned} R(\varepsilon ,T)=k_{\varepsilon }(1+C(C+1))+1. \end{aligned}\end{aligned}$$
(4.64)

Then \(K_p(\varepsilon )\) is closed and forward bounded; hence, it is compact by the Hopf–Rinow theorem.

Finally, we get that \(\forall p\in M\) and \(N\ge 1\):

$$\begin{aligned} \begin{aligned}&{\mathbf {P}}_p\left( \xi ^N_t\notin K_p(\varepsilon )\ \mathrm {for\ some\ } t\le T \right) \\&\quad \le {\mathbf {P}}_p\left( \tau ^N_{k_{\varepsilon }}\le T \right) + {\mathbf {P}}_p\left( \xi ^N_t\notin K_p(\varepsilon )\ \mathrm {for\ some\ } t\le T,\ \tau ^N_{k_{\varepsilon }}> T \right) \\&\quad \le \varepsilon , \end{aligned} \end{aligned}$$
(4.65)

since the last summand is equal to zero by construction of the set \(K_p(\varepsilon )\). This finishes the proof of the compact containment condition. \(\square \)
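
The constants in this proof are explicit, so they can be computed directly. The following sketch evaluates \(k_{\varepsilon }\) from (4.56) and the radius \(R(\varepsilon ,T)\) from (4.64). The function name `containment_radius` and the sample values of \(\varepsilon \), T, \(c_1\) and C are assumptions chosen only for illustration.

```python
import math


def containment_radius(eps: float, T: float, c1: float, C: float):
    """Return (k_eps, R) with k_eps as in (4.56) and R as in (4.64).

    eps is the allowed failure probability, T the time horizon,
    c1 in (0, 1) the constant obtained from Lemma 4.7, and C the
    uniform ellipticity constant. All names are hypothetical.
    """
    if not (0.0 < eps < 1.0 and 0.0 < c1 < 1.0 and T >= 0.0 and C > 0.0):
        raise ValueError("need 0 < eps < 1, 0 < c1 < 1, T >= 0 and C > 0")
    # log(c1) < 0, so the quotient is positive and k_eps >= 1.
    k_eps = math.ceil((math.log(eps) - T) / math.log(c1))
    R = k_eps * (1.0 + C * (C + 1.0)) + 1.0
    return k_eps, R


if __name__ == "__main__":
    k_eps, R = containment_radius(eps=0.01, T=1.0, c1=0.5, C=1.0)
    # The exponential Markov inequality (4.57) needs e^T * c1^k_eps <= eps.
    print(k_eps, R, math.exp(1.0) * 0.5**k_eps <= 0.01)
```

Because \(\ln c_1<0\), the choice of \(k_{\varepsilon }\) is exactly what makes \(\mathrm {e}^{T}c_1^{k_{\varepsilon }}\le \varepsilon \) in (4.57). The radius then grows linearly in \(k_{\varepsilon }\), hence linearly in \(T\) and logarithmically in \(1/\varepsilon \).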

So far, we have proved that the sequence \(\lbrace \xi ^N\rbrace \) satisfies both the Aldous criteria and the compact containment condition. Thus, this sequence is tight. It is well known that tightness implies relative compactness. Thus, any subsequence of \(\lbrace \xi ^N\rbrace \) has a further subsequence converging weakly to some process \(\xi \) on M.

We close this section by showing any limit process of \(\lbrace \xi ^N\rbrace \) has continuous paths almost surely.

Proposition 4.10

Any limit point \(\xi \) of geodesic random walks \(\lbrace \xi ^N\rbrace \) is a.s. continuous.

Proof

The uniform elliptic condition implies that the jump sizes of the geodesic random walks \(\xi ^N\) converge to zero uniformly as \(N\rightarrow \infty \), since

$$\begin{aligned} \begin{aligned} d(\xi ^N_{t-},\xi ^N_t)\le \frac{C+1}{\sqrt{N}},\ \forall t\in [0,\infty ). \end{aligned}\end{aligned}$$
(4.66)

Hence, the statement follows immediately from Theorem 3.10.2 of [14]. \(\square \)
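
The shrinking of the jumps is easy to observe in the simplest example. The sketch below simulates the flat Pearson walk on \({\mathbb {R}}^2\) with a Poisson clock of rate N and records the largest jump along the path. In this flat, centered, isotropic case every jump has length exactly \(1/\sqrt{N}\). The function names `pearson_walk` and `max_jump` are hypothetical, and the example does not cover the general Finsler situation.

```python
import math
import random


def pearson_walk(N: int, T: float, seed: int = 0):
    """Return the jump times and positions (t, x, y) of a flat Pearson
    walk on [0, T], with steps of length 1/sqrt(N) at rate-N Poisson times."""
    rng = random.Random(seed)
    t, x, y = 0.0, 0.0, 0.0
    path = [(t, x, y)]
    step = 1.0 / math.sqrt(N)
    while True:
        t += rng.expovariate(N)  # exponential waiting time with rate N
        if t > T:
            break
        theta = rng.uniform(0.0, 2.0 * math.pi)  # isotropic direction
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        path.append((t, x, y))
    return path


def max_jump(path) -> float:
    """Largest Euclidean distance between consecutive positions."""
    return max(
        (math.hypot(b[1] - a[1], b[2] - a[2]) for a, b in zip(path, path[1:])),
        default=0.0,
    )


if __name__ == "__main__":
    for N in (4, 100, 2500):
        print(N, max_jump(pearson_walk(N, T=1.0, seed=1)), 1.0 / math.sqrt(N))
```

As N grows, the piecewise constant paths have more and more jumps of smaller and smaller size, which is the mechanism behind the continuity of the limit points.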

4.4 Convergence of Geodesic Random Walks

In this section, we give the proof of Theorem 2.1. We already know the sequence \(\lbrace \xi ^N\rbrace \) is relatively compact. To show the weak convergence, it remains to prove all limit points of \(\lbrace \xi ^N\rbrace \) have the same law. This is achieved by showing that any limit point of this sequence is a solution to a well-posed martingale problem.

We first need the following lemma. Recall that A defined in (2.8) is the limit of the generators \(A_N\).

Lemma 4.11

For any \(p\in M\), any limit point \(\xi \) of \(\lbrace \xi ^N\rbrace \) and any \(f\in {\mathcal {C}}^{\infty }_K\), we have

$$\begin{aligned} \begin{aligned} f(\xi _t)-f(p)-\int _0^t Af(\xi _s)\,\mathrm {d}s,\quad \forall t\ge 0, \end{aligned}\end{aligned}$$
(4.67)

is a martingale.

Proof

It suffices to show that for any \(l\ge 1\), any \(h_1,\dots ,h_l\in {\mathcal {C}}_b(M)\), any \(0\le s\le t\), \(s_1,\dots ,s_l\in [s,t]\), and any \(f\in {\mathcal {C}}^{\infty }_K\), the following holds:

$$\begin{aligned} {\mathbf {E}}\Big [ \Big ( f(\xi _t)-f(\xi _s)-\int _s^t Af(\xi _r)\,\mathrm {d}r \Big ) \prod _{j=1}^l h_j(\xi _{s_j}) \Big ]=0. \end{aligned}$$
(4.68)

Since \(\xi ^N\) is a Markov process for all \(N\ge 1\), it follows for all \(0\le s\le t\) that

$$\begin{aligned}\begin{aligned} f(\xi ^N_t)-f(\xi ^N_s)-\int ^t_s A_N f(\xi ^N_r)\,\mathrm {d}r \end{aligned}\end{aligned}$$

is a martingale. Hence, for each \(N\ge 1\)

$$\begin{aligned} {\mathbf {E}}\Big [ \Big ( f(\xi ^N_t)-f(\xi ^N_s)-\int _s^t A_Nf(\xi ^N_r)\,\mathrm {d}r \Big ) \prod _{j=1}^l h_j(\xi ^N_{s_j}) \Big ]=0. \end{aligned}$$
(4.69)

Separate the formula (4.68) into two terms, and let \(\lbrace \xi ^{N_k}\rbrace \) be a subsequence converging weakly to \(\xi \). Since \(\xi \) has continuous paths almost surely, the finite dimensional distributions of \(\xi ^{N_k}\) always converge weakly to those of \(\xi \) (Theorem 3.7.8 of [14]). Thus, we have

$$\begin{aligned} \begin{aligned} {\mathbf {E}}\Big [ \Big ( f(\xi _t)-f(\xi _s)\Big ) \prod _{j=1}^l h_j(\xi _{s_j}) \Big ] =\lim _{N_k\rightarrow \infty }{\mathbf {E}}\Big [ \Big ( f(\xi ^{N_k}_t)-f(\xi ^{N_k}_s)\Big ) \prod _{j=1}^l h_j(\xi ^{N_k}_{s_j}) \Big ]. \end{aligned}\nonumber \\ \end{aligned}$$
(4.70)

Furthermore,

$$\begin{aligned} \begin{aligned} {\mathbf {E}}\Big [\int _s^t A_{N_k}f(\xi ^{N_k}_r)\,\mathrm {d}r \cdot \prod _{j=1}^l h_j\left( \xi ^{N_k}_{s_j}\right) \Big ]&={\mathbf {E}}\Big [\int _s^t A f(\xi ^{N_k}_r)\,\mathrm {d}r \cdot \prod _{j=1}^l h_j\left( \xi ^{N_k}_{s_j}\right) \Big ]\\&\quad +{\mathbf {E}}\Big [\int _s^t (A_{N_k}-A)f(\xi ^{N_k}_r)\,\mathrm {d}r \cdot \prod _{j=1}^l h_j\left( \xi ^{N_k}_{s_j}\right) \Big ] \end{aligned}\nonumber \\ \end{aligned}$$
(4.71)

and the latter summand vanishes as \(N_k\rightarrow \infty \) because the functions \(h_j\) are bounded and by Proposition 4.3

$$\begin{aligned} \lim _{N_k\rightarrow \infty }\Vert (A_{N_k}-A)f\Vert =0,\quad \forall f\in {\mathcal {C}}^{\infty }_K. \end{aligned}$$
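
The vanishing of the latter summand can be made explicit by the following elementary bound, where \(\Vert \cdot \Vert \) is read as the supremum norm (our reading of the norm appearing in the limit above):

$$\begin{aligned} \Big |{\mathbf {E}}\Big [\int _s^t (A_{N_k}-A)f(\xi ^{N_k}_r)\,\mathrm {d}r \cdot \prod _{j=1}^l h_j\left( \xi ^{N_k}_{s_j}\right) \Big ]\Big |\le (t-s)\,\Vert (A_{N_k}-A)f\Vert \,\prod _{j=1}^l \Vert h_j\Vert . \end{aligned}$$

The right-hand side tends to zero as \(N_k\rightarrow \infty \).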

To treat the first term, since \(x\mapsto Af(x)\) is continuous and bounded, we have for each \(r\in [s,t]\)

$$\begin{aligned} \begin{aligned} \lim _{N_k\rightarrow \infty } {\mathbf {E}}\Big [ A f(\xi ^{N_k}_r) \cdot \prod _{j=1}^l h_j\left( \xi ^{N_k}_{s_j}\right) \Big ]= {\mathbf {E}}\Big [ A f(\xi _r) \cdot \prod _{j=1}^l h_j(\xi _{s_j}) \Big ]. \end{aligned}\end{aligned}$$
(4.72)

Thus, by Fubini’s theorem and Lebesgue’s dominated convergence theorem, we get

$$\begin{aligned} \begin{aligned} \lim _{N_k\rightarrow \infty } {\mathbf {E}}\Big [\int _s^t A f(\xi ^{N_k}_r)\,\mathrm {d}r \cdot \prod _{j=1}^l h_j(\xi ^{N_k}_{s_j}) \Big ]&=\int _s^t \lim _{N_k\rightarrow \infty } {\mathbf {E}}\Big [ A f(\xi ^{N_k}_r) \cdot \prod _{j=1}^l h_j(\xi ^{N_k}_{s_j}) \Big ]\,\mathrm {d}r\\&= {\mathbf {E}}\Big [\int _s^t A f(\xi _r)\,\mathrm {d}r \cdot \prod _{j=1}^l h_j(\xi _{s_j}) \Big ], \end{aligned}\end{aligned}$$
(4.73)

and (4.68) is established. \(\square \)

Proposition 4.12

The martingale problem (4.67) has a unique solution which is stochastically complete. Hence, the sequence \(\lbrace \xi ^N\rbrace \) converges weakly.

Proof

The well-posedness follows from the well-posedness of the martingale problem in \({\mathbb {R}}^m\). Indeed, in any chart U, the generator A is a second-order strongly elliptic operator with smooth coefficients. We can extend the generator to \(U^c\) such that its coefficients are uniformly Lipschitz. Then the martingale problem is well posed, e.g. by Theorem 5.1.4 in [41]. By Theorem 4.6.1 of [14], the stopped martingale problem is also well posed for any initial distribution. The localized solutions in countably many charts can be glued together by Lemma 4.6.5 and Theorem 4.6.6 in [14], see also Sect. 4.11 in [25]. \(\square \)

In summary, we have shown the sequence \(\lbrace \xi ^N\rbrace \) converges weakly to some process \(\xi \) on M which is a solution to a well-posed martingale problem. This completes the proof of Theorem 2.1.

Finally, let us also prove that the limit process is Feller, since this is an important and useful property.

Proposition 4.13

The limit process \(\xi \) is Feller, i.e. its semigroup preserves \({\mathcal {C}}_0(M)\).

Proof

In any chart, \(\xi \) is a non-degenerate diffusion with smooth coefficients; hence, its semigroup maps \({\mathcal {C}}_0(M)\) to \({\mathcal {C}}(M)\).

Denote by \((T_t)_{t\ge 0}\) the semigroup of \(\xi \) as usual. Let \(f\in {\mathcal {C}}_0(M)\), \(t>0\) and \(\varepsilon >0\) be fixed. Choose a compact set \(C_\varepsilon \) such that \(|f(x)|\le \varepsilon \) for \(x\notin C_\varepsilon \). Define \(R(\varepsilon ,t)\) as in (4.64) in Lemma 4.9. By this lemma, for any \(p\in M\) such that \(d(p,C_\varepsilon )>R(\varepsilon ,t)\), we have

$$\begin{aligned} \begin{aligned} |{\mathbf {E}}_p f(\xi _t)|&\le {\mathbf {E}}_p |f(\xi _t)|{\mathbb {I}}(\tau ^R\le t)+{\mathbf {E}}_p |f(\xi _t)|{\mathbb {I}}(\tau ^R> t)\\&\le \Vert f\Vert \cdot {\mathbf {P}}_p(\tau ^R\le t) + \varepsilon \le (\Vert f\Vert +1) \varepsilon . \end{aligned}\end{aligned}$$
(4.74)

Thus, \((T_t)(f)\) vanishes at infinity for \(f\in {\mathcal {C}}_0\). As \(\xi \) is a limit point of \(\lbrace \xi ^N\rbrace \), Lemma 4.7 implies

$$\begin{aligned} \sup _{p\in M} {\mathbf {P}}_p(d(p,\xi _t)>\delta )\le C(\delta )t,\ \forall \delta >0,\ \forall t\ge 0. \end{aligned}$$

The strong continuity of the semigroup \((T_t)\) follows. \(\square \)
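
One way to fill in this last step is sketched here, under the assumption that \(f\in {\mathcal {C}}_0(M)\) is uniformly continuous with respect to d. The modulus of continuity \(\omega _f(\delta ):=\sup \lbrace |f(x)-f(y)|:d(x,y)\le \delta \rbrace \) is a symbol introduced only for this remark. For any \(p\in M\), \(\delta >0\) and \(t\ge 0\),

$$\begin{aligned} |T_tf(p)-f(p)|&\le {\mathbf {E}}_p\Big [|f(\xi _t)-f(p)|\,{\mathbb {I}}(d(p,\xi _t)\le \delta )\Big ]+{\mathbf {E}}_p\Big [|f(\xi _t)-f(p)|\,{\mathbb {I}}(d(p,\xi _t)>\delta )\Big ]\\&\le \omega _f(\delta )+2\Vert f\Vert \,C(\delta )\,t. \end{aligned}$$

Taking the supremum over p, letting \(t\rightarrow 0\) and then \(\delta \rightarrow 0\) gives \(\Vert T_tf-f\Vert \rightarrow 0\).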