1 Introduction

Let \((M,g)\) be a compact d-dimensional Riemannian manifold with strictly convex boundary (\(d\ge 2\)) and \(\Phi :M\rightarrow {\mathbb {C}}^{m\times m}\) (\(m\ge 1\)) a continuous matrix potential. Suppose \(\gamma : [0,\tau ]\rightarrow M\) is a unit-speed geodesic with endpoints on \(\partial M\) and consider the linear matrix differential equation

$$\begin{aligned} \dot{U}(t) + \Phi (\gamma (t))U(t) = 0,\quad U(\tau ) = \mathrm {id}. \end{aligned}$$
(1.1)

This has a unique continuous solution \(U:[0,\tau ]\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})=\{A\in {\mathbb {C}}^{m\times m}:\det A \ne 0\}\), and we write \(C_\Phi (\gamma )=U(0)\in {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) for its value at the boundary. The matrix \(C_\Phi (\gamma )\) is called the scattering data or non-abelian X-ray transform of \(\Phi \) (along \(\gamma \)). For \(m=1\), we have \(\log C_\Phi (\gamma )= \int _0^\tau \Phi (\gamma (t)) \mathrm {d}t\), which is the standard X-ray transform; for \(m\ge 2\), this relation breaks down, as \({{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) ceases to be abelian.
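As a numerical illustration (not part of the original argument), the following sketch approximates \(C_\Phi (\gamma )\) along a single geodesic by integrating (1.1) backwards from \(t=\tau \) with a classical Runge–Kutta scheme. The function names, the sample potentials and the choice \(\tau =1\) are assumptions made purely for illustration. For \(m=1\) it compares \(\log C_\Phi (\gamma )\) with the line integral, and for \(m=2\) it measures how far \(C_\Phi (\gamma )\) is from the matrix exponential of \(\int _0^\tau \Phi (\gamma (t))\mathrm {d}t\).

```python
import numpy as np

def scattering_data(phi, tau, steps=2000):
    """Approximate C_Phi = U(0), where U' + phi(t) U = 0 and U(tau) = id.

    phi: callable t -> (m, m) array, the potential sampled along the geodesic.
    Integrates backwards from t = tau to t = 0 with classical RK4.
    """
    m = phi(0.0).shape[0]
    U = np.eye(m, dtype=complex)
    h, t = -tau / steps, tau          # negative step: march from tau down to 0
    rhs = lambda s, V: -phi(s) @ V
    for _ in range(steps):
        k1 = rhs(t, U)
        k2 = rhs(t + h / 2, U + h / 2 * k1)
        k3 = rhs(t + h / 2, U + h / 2 * k2)
        k4 = rhs(t + h, U + h * k3)
        U = U + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return U

def expm_series(M, terms=40):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    result = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for n in range(1, terms + 1):
        term = term @ M / n
        result = result + term
    return result

tau = 1.0

# m = 1: log C_Phi equals the line integral of the (hypothetical) potential 1 + t.
C1 = scattering_data(lambda t: np.array([[1.0 + t]]), tau)
line_integral = tau + tau**2 / 2

# m = 2: hypothetical potential A + t B whose values at different times
# do not commute, so the exponential formula is expected to fail.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
C2 = scattering_data(lambda t: A + t * B, tau)
integral_matrix = tau * A + tau**2 / 2 * B
gap = np.linalg.norm(C2 - expm_series(integral_matrix))

print("m = 1, |log C - integral|:", abs(np.log(C1[0, 0]) - line_integral))
print("m = 2, |C - exp(integral)|:", gap)
```

In this sketch the first quantity should vanish up to discretisation error, while the second should stay visibly away from zero, in line with the remark on the loss of commutativity.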

We are concerned with an inverse problem for the non-abelian X-ray transform with access to partial data: Can one recover \(\Phi \) in an open set \(O\subset M\) from measuring \(C_\Phi (\gamma )\) for geodesics \(\gamma \) that do not leave O? For \(d\ge 3\) and \(O\subset M\) satisfying the so-called foliation condition (see Definition 1.1 below), it is known that locally, smooth potentials are determined uniquely by their scattering data. Precisely, [20] establishes injectivity of the map

$$\begin{aligned} C^\infty (O,{\mathbb {C}}^{m\times m})\ni \Phi \mapsto (C_\Phi (\gamma ): \gamma \in \Gamma _O), \end{aligned}$$
(1.2)

where \(\Gamma _O\) is the set of unit-speed geodesics \(\gamma :[0,\tau ]\rightarrow M\) with \(\gamma ([0,\tau ]) \subset O\) and both endpoints on \(\partial M\), so-called O-local geodesics. In this article, injectivity is refined to a Hölder-type stability estimate; this estimate is our main result, precisely formulated in Theorem 1.3 below.

Non-abelian X-ray tomography provides the mathematical basis for the novel imaging technology of polarimetric neutron tomography [10, 21], which seeks to determine a magnetic field within a medium by probing it with neutron beams and measuring the spin change that results from traversing the magnetic field. In this setting, \(\Phi \) takes values in \(\mathfrak {so}(3)=\{A\in {\mathbb {R}}^{3\times 3}:A^\mathrm{{T}}=-A\}\) and encodes the magnetic field and \(C_\Phi (\gamma ) \in SO(3)\) describes the resulting rotation of the spin vector for a neutron travelling along \(\gamma \). For a survey on further applications of non-abelian X-ray tomography, we refer to [15].

Even in the simplest example, when M is a Euclidean ball (thus geodesics are straight lines) and we have access to full data (\(O=M\)), the inverse problem described above is very challenging. It is non-linear, and for \(m\ge 2\), no explicit inversion formula is known or expected to exist.

At the same time, real-life applications demand a computational approach to ‘solve’ the inverse problem, typically in the presence of statistical noise on the measurements. One attractive and widely used approach is Stuart’s framework of Bayesian inverse problems [4], in which \(\Phi \) is estimated from draws of a ‘posterior probability measure’, which can be computed from a finite number of observations \(C_\Phi (\gamma _1),\dots , C_\Phi (\gamma _n)\).

From a theoretical point of view, this shifts the focus to a rigorous study of the performance of Bayesian algorithms. For the non-abelian X-ray transform, this was initiated in [14], where the authors prove a statistical consistency result in dimension \(d=2\). This roughly asserts that potentials \(\Phi \) can be recovered from \(C_\Phi \) by a Bayesian algorithm with arbitrary accuracy, as the number of measurements \(n \rightarrow \infty \). One of the key ingredients in the statistical analysis of non-linear inverse problems is a quantitative stability estimate with good control on the involved constants. This principle has emerged in a series of recent papers, including [14] for the two-dimensional non-abelian X-ray transform, as well as [1] and [9], which analyse the Calderón problem and an inverse problem for the Schrödinger equation, respectively. In our case, establishing consistency in \(L^2\)-norm requires a stability estimate of the form

$$\begin{aligned} \Vert \Phi - \Psi \Vert _{L^2} \le C(\Phi , \Psi ) \cdot d(C_\Phi , C_\Psi ), \end{aligned}$$
(1.3)

where \(C(\Phi ,\Psi )>0\) is bounded over large classes of potentials \(\Phi ,\Psi \) and \(d(\cdot , \cdot )\) is an appropriate (semi-)metric. In [14], the authors prove such an estimate in the two-dimensional case for \(d(\cdot ,\cdot )\) given by the distance in an \(H^1\)-Sobolev space. Using an interpolation argument, they derive further stability estimates with \(d(\cdot ,\cdot )=\Vert \cdot - \cdot \Vert ^\mu _{L^2}\) and \(\mu \in (0,1)\). Our main theorem contains a version of this Hölder-type stability estimate for \(d\ge 3\) and implies essentially the same consistency result as in two dimensions, however, with the caveat of requiring priors of higher regularity and obtaining a slower rate of convergence.

In a Euclidean setting, the two-dimensional results from [14] are relevant also in higher dimensions, as one can reduce to \(d=2\) by recovering \(\Phi \) slice by slice; nevertheless, there are several reasons to study the case \(d\ge 3\) intrinsically. Besides the applicability to a wider class of geometries, partial data results become available, which for \(d=2\) are less well understood and not available in general [2]. This is of direct relevance to real-life applications, where one might have access only to localised measurement data. Further, the methods for proving injectivity are very different in \(d=2\) versus \(d\ge 3\) and the quest for new stability estimates requires refining the methods to make them more quantitative, which might in turn prove useful in other problems. This is especially true in \(d \ge 3\), where injectivity is proved by means of a novel and extremely versatile technique, as explained in the next paragraph.

The workhorse behind many partial data results in \(d\ge 3\), for non-abelian X-ray tomography as well as boundary rigidity and some other geometric inverse problems, is a ground-breaking technique of Uhlmann and Vasy [27]. Their method automatically provides a local stability estimate for the linearised problem; however, there are two less welcome features: the necessity of smooth data and the need to globalise. Let us elaborate on these points to explain the main technical contributions of this article.

With microlocal analysis at the core of the method, smoothness of the underlying data (in our case, the potential \(\Phi \)) is not easily relaxed to lower regularity; in particular, the constants in the local stability estimate a priori depend continuously on \(\Phi \) only in the \(C^\infty \)-topology. However, statistical consistency demands better control and one of our main contributions is to show uniformity on arbitrarily large \(C^k\)-balls (for \(k\ge 0\) sufficiently large).

By ‘globalisation’, we mean the extension of injectivity from small neighbourhoods of boundary points to larger domains, or all of M, via a layer-stripping argument. As the initial domain of injectivity depends on the potentials \(\Phi \), the layer-stripping argument becomes more delicate and another contribution of this paper is to carefully combine the arguments from [20] and [24] to globalise stability estimates for the non-abelian X-ray transform.

1.1 Notation and Background

We denote by \(SM=\{(x,v)\in TM:\vert v\vert =1\}\) the unit-sphere bundle of M and write \(\pi :SM\rightarrow M\) for the projection onto the base variable. SM is itself a manifold with boundary and, writing \(\nu \) for the inward-pointing unit-normal to \(\partial M\), we can decompose \(\partial SM\) into

$$\begin{aligned} \partial _\pm SM=\{(x,v) \in SM: x\in \partial M, \pm \langle \nu (x), v\rangle \ge 0\}. \end{aligned}$$

Let X be the geodesic vector field on SM and \(\varphi _t\) the geodesic flow. We then write \(\gamma _{x,v}(t)=\pi (\varphi _t(x,v))\) for the geodesic adapted to \((x,v)\in SM\) and \(\tau (x,v)\in [0,\infty ]\) for the first time that \(\gamma _{x,v}\) exits M. If \(\tau (x,v)<\infty \) for all \((x,v)\in SM\), then M is called non-trapping. Further, we say that \(\partial M\) is strictly convex if its second fundamental form is positive definite everywhere.
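For orientation, the simplest example can be made completely explicit. The sketch below assumes that M is the closed Euclidean unit ball (so geodesics are straight lines and \(\nu (x)=-x\)); the helper names `exit_time` and `is_inward` are hypothetical. It computes \(\tau (x,v)\) as the largest root of \(\vert x+tv\vert ^2=1\) and tests membership in \(\partial _+SM\).

```python
import numpy as np

def exit_time(x, v):
    """Exit time tau(x, v) of the line t -> x + t v from the closed unit ball.

    Assumes |x| <= 1 and |v| = 1 (Euclidean metric, geodesics are lines).
    """
    x, v = np.asarray(x, dtype=float), np.asarray(v, dtype=float)
    b = np.dot(x, v)
    # Largest root of |x + t v|^2 = 1, i.e. t^2 + 2 b t + |x|^2 - 1 = 0.
    return -b + np.sqrt(b * b + 1.0 - np.dot(x, x))

def is_inward(x, v):
    """Check <nu(x), v> >= 0 for a boundary point x, with inward normal nu(x) = -x."""
    return np.dot(-np.asarray(x, dtype=float), np.asarray(v, dtype=float)) >= 0.0

# A boundary point with an inward-pointing unit direction.
x0 = np.array([1.0, 0.0, 0.0])
v0 = np.array([-0.6, 0.8, 0.0])
print(is_inward(x0, v0), exit_time(x0, v0))
```

For boundary points with inward-pointing direction, the formula reduces to \(\tau (x,v)=-2\langle x,v\rangle \); in particular \(\tau \le 2\) everywhere, so the ball is non-trapping.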

If M is non-trapping and has strictly convex boundary, then \(\partial _+SM\) naturally parametrises all geodesics with endpoints on \(\partial M\) and the non-abelian X-ray transform can be recast as a map

$$\begin{aligned} C(M,{\mathbb {C}}^{m\times m})\rightarrow C(\partial _+SM,{{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})),\quad \Phi \mapsto C_\Phi . \end{aligned}$$
(1.4)

Precisely, we set \(C_\Phi = U_\Phi \vert _{\partial _+SM}\), where \(U_\Phi :SM\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) denotes the unique continuous solution (differentiable along the geodesic flow) of

$$\begin{aligned} (X+\Phi )U_\Phi = 0 \text { on } SM\quad \text { and }\quad U_\Phi = \mathrm {id}\text { on } \partial _-SM. \end{aligned}$$
(1.5)

For \(O\subset M\) open, we write \({\mathcal {M}}_O\subset \partial _+SM\) for the open set of all \((x,v)\) for which \(\gamma _{x,v}(t)\in O\) for \(0\le t\le \tau (x,v)\). The set \({\mathcal {M}}_O\) parametrises the collection \(\Gamma _O\) of O-local geodesics. The following condition, introduced in this form in [20], ensures that O is scanned by sufficiently many geodesics emerging from \({\mathcal {M}}_O\) and allows one to prove an injectivity result as stated below.

Definition 1.1

An open subset \(O\subset M\) satisfies the foliation condition, if there is a smooth, strictly convex function \(\rho : O \rightarrow {\mathbb {R}}\) which is exhausting in the sense that \(O_{\ge c}=\{x \in O: \rho (x) \ge c\}\subset M\) is compact for all \(c>\inf _O \rho \).

Theorem 1.2

(Paternain et al. [20])  Let \(d\ge 3\), assume that \(\partial M\) is strictly convex and O satisfies the foliation condition. Then for smooth potentials \(\Phi ,\Psi :M\rightarrow {\mathbb {C}}^{m\times m}\), we have that

$$\begin{aligned} C_\Phi = C_\Psi \text { on } {\mathcal {M}}_O \quad \Longrightarrow \quad \Phi = \Psi \text { on } O. \end{aligned}$$
(1.6)

In fact, the authors of [20] consider a more general situation, where scattering data are defined with respect to attenuations \({\mathcal {A}}(x,v) = \Phi (x) + A_x(v)\) that may depend on the direction v up to first order. That is, \(\Phi \) is a matrix potential as above and \(A\in \Omega ^1(O,{\mathbb {C}}^{m\times m})\) is a matrix-valued one form. In that case a similar result holds true, but one only has injectivity up to the gauge \({\mathcal {A}}\mapsto u^{-1} \mathrm {d}u + u^{-1} {\mathcal {A}}u\) (\(u: O\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) smooth).

For the full-data problem \((O=M)\), the foliation condition reduces to the existence of a strictly convex function on M and is set into relation with other geometric properties of M in Section 2 of [20]: For example, if M (with \(\partial M\) strictly convex) supports a strictly convex function, it is automatically non-trapping and contractible. Conversely, if M has non-negative sectional curvatures (or non-positive sectional curvatures, and it is simply connected), then it admits a strictly convex function.

Let us conclude with a brief overview of the history of the problem. Assuming a flat background geometry and access to full data, the problem was first studied by Vertgeim [29], with further pioneering work by Novikov [16] and Eskin [5], who established injectivity in dimension \(d\ge 3\) and \(d=2\), respectively (up to gauge in the general problem mentioned above).

In the geometric setting and for \(d=2\), the full-data problem is typically studied on compact surfaces \((M,g)\) that are simple in the sense that \(\partial M\) is strictly convex and M is assumed to be non-trapping and free of conjugate points. There, injectivity of the map

$$\begin{aligned} C^\infty (M,{\mathfrak {g}})\rightarrow C^\infty (\partial _+SM,G), \Phi \mapsto C_\Phi \end{aligned}$$
(1.7)

(where \(G\subset {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) is a matrix Lie group with Lie algebra \({\mathfrak {g}}\)) was first proved for \(G = U(m)\) (the unitary group) in [19]. In the case \(G={{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\), injectivity was established under a negative curvature assumption in [17] and, very recently, for general simple surfaces in [18]. Partial data results, on the other hand (even for \(m=1\)), are less well understood in \(d=2\) [2], and there is no analogue of (1.2) for smooth (non-analytic) potentials.

1.2 Main Result

Our main analytical result is the following stability estimate for the non-abelian X-ray transform on a compact manifold \((M,g)\), assumed to be non-trapping and to have a strictly convex boundary.

Theorem 1.3

Suppose \(d\ge 3\) and \(K\subset O\subset M\) are such that K is compact and O is open and satisfies the foliation condition. Then for smooth potentials \(\Phi ,\Psi :M\rightarrow {\mathbb {C}}^{m\times m}\), we have

$$\begin{aligned} \Vert \Phi - \Psi \Vert _{L^2(K)} \le C(\Phi ,\Psi ) \cdot \Vert C_\Phi - C_\Psi \Vert _{L^2({\mathcal {M}}_O)}^{\mu (\Phi ,\Psi )}, \end{aligned}$$
(1.8)

where \(C>0\) and \(\mu \in (0,1)\) obey an estimate of the form

$$\begin{aligned} C(\Phi ,\Psi )\vee \mu (\Phi ,\Psi )^{-1} \le \omega (\Vert \Phi \Vert _{C^k(M)} \vee \Vert \Psi \Vert _{C^k(M)}) \end{aligned}$$
(1.9)

for some non-decreasing function \(\omega :[0,\infty )\rightarrow [0,\infty )\) and an integer \(k\ge 0\).

Here, the Lebesgue spaces \(L^2(K)\) and \(L^2({\mathcal {M}}_O)\) (with codomain \({\mathbb {C}}^{m\times m}\) suppressed from the notation) are defined with respect to the natural Riemannian volume forms on the ambient manifolds M and \(\partial _+SM\). The space \(C^k(M)\) consists of functions which are k-times continuously differentiable up to \(\partial M\) and a choice of continuous norm \(\Vert \cdot \Vert _{C^k(M)}\) (defined with respect to some atlas) is assumed to be fixed throughout the discussion. Further, the notation \(a\vee b\) is used for the maximum of two quantities \(a,b>0\).

Finally, we mention that in the formulation of Theorem 1.3 as well as below, we assume smoothness of the involved potentials \(\Phi ,\Psi \) only for convenience. In all cases, one can derive results for potentials of finite regularity \(C^k\) (for \(k\ge 0\) determined by (1.9) or a similar bound) by means of an approximation argument: If \(\Phi \in C^k(M,{\mathbb {C}}^{m\times m})\) is approximated by a sequence of smooth potentials \((\Phi _n:n\ge 0)\) in \(C^k\)-norm, then \(C_{\Phi _n}\rightarrow C_\Phi \) in \(H^k(\partial _+SM)\) by Corollary 2.5 below. As the \(C^k\)-norms are bounded along the sequence, a bound as in (1.9) prevents the constants from blowing up, such that the stability estimate persists in the limit.

In view of the local stability estimates for the linearised problem in [20, Theorem 1.3] and the situation in \(d=2\) [14, Corollary 1.4], one might expect a stronger result, with the right-hand side of (1.8) being replaced by a Lipschitz-type bound \(\le C(\Phi ,\Psi ) \cdot \Vert C_\Phi - C_\Psi \Vert _{F({\mathcal {M}}_O)}\) in terms of a suitable function space F, say of Sobolev regularity \(H^1\) or even \(H^{1/2}\). However, our result is both in line with the available estimates for the related conformal boundary rigidity problem, and it is sufficient to prove statistical consistency. Let us elaborate on these points:

The conformal boundary rigidity problem (determining a Riemannian metric g on M within a fixed conformal class from its boundary distance function) shares many features with the problem at hand: It is a gauge-free non-linear problem, which in dimension \(d\ge 3\) is solved with Uhlmann–Vasy’s method, also requiring a layer-stripping argument to propagate injectivity into the interior of M. It is, thus, natural to compare the available stability estimates [24, Theorem 1.4] and indeed, equation (3) there is of a similar form to (1.8) here. Both here and there, the passage to the weaker Hölder-type estimate is an artefact of the globalisation procedure, which employs interpolation at every step of the layer-stripping argument.

To understand the statistical consequences of Theorem 1.3, we draw the comparison with the stability estimates in [14]. Their result concerns the full-data problem on a simple surface \((M,g)\) and states that

$$\begin{aligned} \Vert \Phi - \Psi \Vert _{L^2(M)} \le C(\Phi ,\Psi ) \cdot \Vert C_\Phi - C_\Psi \Vert _{H^1(\partial _+SM)} \end{aligned}$$
(1.10)

for all smooth potentials \(\Phi ,\Psi :M\rightarrow {\mathfrak {u}}(m)\) and some \(C(\Phi ,\Psi )>0\) which is bounded, as long as the \(C^1\)-norms of \(\Phi \) and \(\Psi \) are bounded. This stability estimate is derived by means of a Pestov-type energy estimate which does not extend to higher dimensions and necessitates the restriction to \({\mathfrak {u}}(m)\)-valued potentials (although, using the techniques from a recent injectivity result [18], one might be able to extend it to general matrix potentials). By means of the forward estimates in [14] and an interpolation argument (as described in the proof of Theorem 5.16 there), estimate (1.10) can be brought into the following Hölder-type form, again valid for smooth \(\Phi ,\Psi :M\rightarrow {\mathfrak {u}}(m)\)

$$\begin{aligned} \Vert \Phi - \Psi \Vert _{L^2(M)} \le C_k(\Phi , \Psi ) \cdot \Vert C_\Phi - C_\Psi \Vert _{L^2(\partial _+SM)}^{(k-1)/k}, \quad k \ge 2 \end{aligned}$$
(1.11)

where \( C_k(\Phi ,\Psi ) = c_{1,k} \exp \left( c_{2,k} \left( \Vert \Phi \Vert _{C^k(M)} \vee \Vert \Psi \Vert _{C^k(M)}\right) \right) \) for constants \(c_{1,k},c_{2,k}>0\) only depending on \((M,g)\) and m. This resembles the estimates given in Theorem 1.3 and indeed, in Sect. 6, we show that the statistical analysis of [14] carries over to the full-data case (\(O=M\)) in \(d\ge 3\): In a Bayesian framework, and under a suitable choice of priors \(\Pi _n\) on \(C(M,{\mathbb {R}}^{m\times m})\), the following consistency result holds true:

Theorem 1.4

(Consistency) Let \(\Phi _0\in C^\infty (M,{\mathbb {R}}^{m\times m})\) and suppose we observe \((X_i,V_i)\) and \(Y_i =C_{\Phi _0}(X_i,V_i) + \epsilon _i\) (\(i=1,\dots ,n\)), where the directions \((X_i,V_i)\in \partial _+SM\) are drawn uniformly at random and \(\epsilon _i\in {\mathbb {R}}^{m\times m}\) is independent Gaussian noise. Then, as the sample size \(n\rightarrow \infty \), the potential \(\Phi _0\) can be recovered as \(L^2\)-limit (in probability) of the posterior means \({\mathbb {E}}_{\Pi _n}[\Phi \vert (X_i,V_i,Y_i)_{i=1}^n] \in C(M,{\mathbb {R}}^{m\times m})\).

As the statistical analysis is conceptually independent of the remaining paper, a more detailed discussion of the underlying priors and a comparison with [14] is postponed to Sect. 6. The theorem above is restated in Theorem 6.2 and the remarks thereafter.

Continuing our discussion of Theorem 1.3, we remark that estimate (1.9) is a way of saying that for smooth potentials \(\Phi \) and \(\Psi \) lying inside of a fixed ball \(\{{\Vert \cdot \Vert _{C^k(M)} }\,\le A\}\) (\(A>0\)), one may choose the constants C and \(\mu \) uniformly. This is a stronger result than the uniformity in [24, Theorem 1.4] (conformal boundary rigidity), which only holds over sufficiently small balls. However, as in the result just cited, the regularity k required for (1.9) to hold is unknown. This is in stark contrast with the two-dimensional situation in (1.11), where one can freely choose \(k\ge 2\). To the knowledge of the author, the available techniques to reduce the required regularity to some smaller \(k'\ll k\) (cf. [6, Theorem 2b], where \(k'=2\)) only yield uniformity for generic elements of \(C^{k'}\), which is not sufficient for the statistical application mentioned above.

1.3 Key Ideas and Structure

Analysis of the non-abelian X-ray transform starts with a pseudo-linearisation identity that we will now describe. Given a potential \(\Phi \in C^\infty (M,{\mathbb {C}}^{m\times m})\) we call any (smooth) solution \(R:SM\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) to \((X+\Phi )R=0\) on SM an integrating factor for \(\Phi \). Smooth integrating factors always exist in our setting (M compact, non-trapping & with strictly convex boundary) and can be used to express the non-abelian X-ray transform in terms of the linear, weighted X-ray transform

$$\begin{aligned} I_W f(x,v) = \int _0^{\tau (x,v)} Wf(\varphi _t(x,v)) \mathrm {d}t,\quad (x,v)\in \partial _+SM, \end{aligned}$$
(1.12)

defined for \(W:SM\rightarrow {\mathbb {C}}^{m\times m}\) and \(f:M\rightarrow {\mathbb {C}}^{m}\). Precisely, we have

Lemma 1.5

Let \(\Phi ,\Psi \in C^\infty (M,{\mathbb {C}}^{m\times m})\) and suppose that \(R_\Phi \) and \(R_\Psi \) are smooth integrating factors for \(\Phi \) and \(\Psi \), respectively. Then we have

$$\begin{aligned} C_\Phi - C_\Psi = R_\Phi \cdot I_{{\mathcal {W}}_{\Phi ,\Psi }} (\Phi - \Psi ) \cdot \alpha ^*R^{-1}_\Psi \quad \text { on } \partial _+SM, \end{aligned}$$
(1.13)

where \(\alpha (x,v)=\varphi _{\tau (x,v)}(x,v)\) is the scattering relation of \((M,g)\) and the weight \({\mathcal {W}}_{\Phi ,\Psi }:SM\rightarrow \mathrm {End}({\mathbb {C}}^{m\times m})\) is defined pointwise by \({\mathcal {W}}_{\Phi ,\Psi }A = R_\Phi ^{-1} A R_\Psi \) for \(A\in {\mathbb {C}}^{m\times m}\).

Note that the weighted X-ray transform in (1.13) is to be understood ‘one level higher’, identifying \({\mathbb {C}}^{m\times m} \cong {\mathbb {C}}^{m'}\) and \({{\,\mathrm{End}\,}}({\mathbb {C}}^{m\times m}) \cong {\mathbb {C}}^{m'\times m'}\) for \(m'=m^2\).

Proof

Let \(F_\Phi \) be a first integral for \(R_\Phi ^{-1}\vert _{\partial _-SM}\), that is, \(F_\Phi : SM\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) solves \(XF_\Phi =0\) on SM and \(F_\Phi = R_\Phi ^{-1}\) on \(\partial _-SM\). Then \(U_\Phi = R_\Phi F_\Phi \) satisfies (1.5) and \(C_\Phi = U_\Phi \vert _{\partial _+SM}\). Using the corresponding notation for \(\Psi \), we have

$$\begin{aligned} U_\Phi - U_\Psi = R_\Phi \cdot (F_\Phi F^{-1}_\Psi - R^{-1}_\Phi R_\Psi ) \cdot F_\Psi , \end{aligned}$$
(1.14)

which, when restricted to \(\partial _+SM\), yields (1.13). To see this, note that \(G = F_\Phi F^{-1}_\Psi - R^{-1}_\Phi R_\Psi \) satisfies \(XG = - {\mathcal {W}}_{\Phi ,\Psi }(\Phi -\Psi )\) on SM and \(G = 0\) on \(\partial _-SM\). The fundamental theorem of calculus now implies that \(G\vert _{\partial _+SM} = I_{{\mathcal {W}}_{\Phi ,\Psi }}(\Phi - \Psi )\) and since further \(F_\Psi \vert _{\partial _+SM} = \alpha ^* R_\Psi ^{-1}\), the proof is complete. \(\square \)
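The identity (1.13) can be checked numerically along a single geodesic. The following sketch is an illustration under stated assumptions rather than part of the proof: the potentials `phi` and `psi`, the initial values of the integrating factors and the helper `rk4` are all hypothetical choices. Along the geodesic, the integrating factors solve \(\dot{R} + \Phi R = 0\) with arbitrary invertible initial values, and the weighted X-ray transform becomes the integral of \(R_\Phi ^{-1}(\Phi -\Psi )R_\Psi \) over \([0,\tau ]\), which is accumulated as an additional component of the ODE system.

```python
import numpy as np

def rk4(f, y0, t0, t1, steps=2000):
    """Classical RK4 for y' = f(t, y) from t0 to t1 (t1 < t0 is allowed)."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

# Hypothetical potentials restricted to one geodesic, parametrised by t in [0, tau].
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
phi = lambda t: A + t * B
psi = lambda t: np.cos(t) * B - 0.5 * A
tau, m = 1.0, 2

# Scattering data: solve U' + phi U = 0 backwards from U(tau) = id.
C_phi = rk4(lambda t, U: -phi(t) @ U, np.eye(m), tau, 0.0)
C_psi = rk4(lambda t, U: -psi(t) @ U, np.eye(m), tau, 0.0)

# Integrating factors with arbitrary invertible values at t = 0, together with
# the running integral J of R_phi^{-1} (phi - psi) R_psi, as one stacked system.
R_phi0 = np.array([[2.0, 1.0], [0.0, 1.0]])
R_psi0 = np.array([[1.0, 0.0], [3.0, 1.0]])

def system(t, y):
    R_p, R_s, _ = y
    return np.stack([-phi(t) @ R_p,
                     -psi(t) @ R_s,
                     np.linalg.solve(R_p, (phi(t) - psi(t)) @ R_s)])

y0 = np.stack([R_phi0, R_psi0, np.zeros((m, m))])
R_phi_tau, R_psi_tau, J = rk4(system, y0, 0.0, tau)

# Right-hand side of (1.13): R_Phi * I_W(Phi - Psi) * alpha^* R_Psi^{-1}.
rhs = R_phi0 @ J @ np.linalg.inv(R_psi_tau)
residual = np.linalg.norm((C_phi - C_psi) - rhs)
print("residual of the pseudo-linearisation identity:", residual)
```

The residual should be at the level of the discretisation error, independently of the chosen initial values of the integrating factors, which reflects the freedom in the choice of \(R_\Phi \) and \(R_\Psi \) in Lemma 1.5.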

We can now summarise the content of the subsequent sections and lay out the general strategy to prove the main results of this article.

In Sect. 2, we prove a forward estimate for the map \(\Phi \mapsto R_\Phi \), which allows us to translate stability estimates for the weighted X-ray transform into estimates for the non-abelian one. Further consequences are forward estimates for \(\Phi \mapsto C_\Phi \), which are of interest in statistical applications. The techniques in this section are similar to the ones in [14], suitably adjusted to deal with dimension \(d\ge 3\) and integrating factors taking values in the non-compact group \({{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\).

Section 3 prepares the further analysis by proving a quantitative version of the microlocal technique (local inversion of scattering operators near elliptic points), introduced in the context of X-ray transforms by Uhlmann and Vasy [27]. We give a self-contained proof, emphasising quantitative bounds on the involved constants.

In Sect. 4, we start the stability analysis by considering the weighted X-ray transform \(f\mapsto I_W f\). By [20, Theorem 1.3], if \(W:SM\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) is a smooth invertible weight, then every convex boundary point \(p\in \partial M\) has a neighbourhood O such that for \(K\subset O\) compact, we have

$$\begin{aligned} \Vert f \Vert _{L^2(K)} \lesssim _K C \cdot \Vert I_W f \Vert _{H^1({\mathcal {M}}_O)}. \end{aligned}$$
(1.15)

Here, both \(C>0\) and the maximal size of O (say, measured by the largest radius \(h>0\) for which the ball \(B(p,h)\subset O\)) depend on W, and we will be concerned with understanding their behaviour as W varies. Standard techniques imply that C(W) and h(W) depend continuously on W in the \(C^\infty \) topology, but this is not sufficient for our purposes. Using the quantitative analysis from Sect. 3, we can upgrade this to uniformity as long as \(\Vert W \Vert _{C^k(SM)} \vee \Vert W^{-1} \Vert _{L^\infty (SM)}\) (for some \(k\gg 1\)) remains bounded.

In Sect. 5, we use the local stability result from the previous section to successively derive further stability estimates. First, using a layer-stripping argument similar to the one in [20], we extend stability to arbitrary sets satisfying the foliation condition. Next, we use the pseudo-linearisation formula to translate this into a stability estimate for the non-abelian X-ray transform and finish the proof of our main theorem.

Finally, in Sect. 6, we illustrate the strength of Theorem 1.3 by proving a statistical consistency result similar to the one in [14]. This section is mostly expository, as the extension to higher dimensions and general \({\mathbb {R}}^{m\times m}\)-valued potentials is fairly straightforward.

2 Forward Estimates

In this section, \((M,g)\) is a compact, non-trapping Riemannian manifold with strictly convex boundary \(\partial M\) and dimension \(d\ge 2\). Further, as it requires no additional effort, we work in a slightly more general setting and replace matrix potentials \(\Phi :M\rightarrow {\mathbb {C}}^{m\times m}\) by attenuations \({\mathcal {A}}:SM\rightarrow {\mathbb {C}}^{m\times m}\).

Recall that an integrating factor for \({\mathcal {A}}\) is a solution to the transport equation \((X+{\mathcal {A}})R= 0\) on SM. The main result of this section then reads as follows:

Theorem 2.1

For every \({\mathcal {A}}\in C^\infty (SM,{\mathbb {C}}^{m\times m})\), there exists an integrating factor \(R_{\mathcal {A}}\in C^\infty (SM,{{\,\mathrm{Gl}\,}}(m,{\mathbb {C}}))\) with

$$\begin{aligned} \Vert R_{\mathcal {A}}^{\pm 1} \Vert _{C^k(SM)} \le c_{1,k} \exp (c_{2,k} \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}) \cdot (1+ \Vert {\mathcal {A}}\Vert _{C^k(SM)})^k, \quad k\ge 0 \end{aligned}$$

for constants \(c_{1,k},c_{2,k}>0\) only depending on M and m. If \({\mathcal {A}}\) takes values in \({\mathfrak {u}}(m)\), the exponential factors can be dropped.

In order to define \(R_{\mathcal {A}}\), we use a standard trick to avoid differentiability issues at the glancing region \(S\partial M\): We embed M into the interior of a slightly larger manifold \(M_1\) and extend \({\mathcal {A}}\) smoothly to an attenuation \({\mathcal {A}}_1:SM_1\rightarrow {\mathbb {C}}^{m\times m}\) with compact support in \(SM_1^{\mathrm {int}}\). Then

$$\begin{aligned} (X+{\mathcal {A}}_1) U = 0 \text { on } SM_1\quad \text { and } \quad U = \mathrm {id}\text { on } \partial _-SM_1 \end{aligned}$$

has a unique solution \(U_{{\mathcal {A}}_1}:SM_1\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\), which is constant \(\equiv \mathrm {id}\) near \(S\partial M_1\) and, thus, smooth on all of \(SM_1\). Setting \(R_{\mathcal {A}}= U_{{\mathcal {A}}_1} \vert _{SM}\) gives the desired integrating factor and the forward estimate above is a consequence of the following result, applied to the larger manifold \(M_1\).

Proposition 2.2

Let \({\mathcal {A}}\in C^k(SM,{\mathbb {C}}^{m\times m})\) (\(k\ge 0\)) and suppose \(U_{\mathcal {A}}\in C^k(SM,{{\,\mathrm{Gl}\,}}(m,{\mathbb {C}}))\) solves \((X+{\mathcal {A}})U = 0\) on SM and \(U = \mathrm {id}\) on \(\partial _-SM\).

  (i) Writing \(\tau _\infty = \sup _{SM} \tau \), we have \(\Vert U_{\mathcal {A}}\Vert _{L^\infty (SM)} \le m^{1/2} \exp (\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}).\)

  (ii) If \({{\,\mathrm{supp}\,}}{\mathcal {A}}\subset K\) for a compact set \(K\subset SM^{\mathrm {int}}\), then

    $$\begin{aligned} \Vert U_{\mathcal {A}}\Vert _{C^k(SM)} \le c \mathrm{{e}}^{(2k+1)\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}} (1+ \Vert {\mathcal {A}}\Vert _{C^k(SM)})^k , \end{aligned}$$

    for a constant \(c=c(k,m,K,M)>0\).

  (iii) The assertions remain true if \(U_{\mathcal {A}}\) is replaced by its inverse \(U_{\mathcal {A}}^{-1}\). Further, if \({\mathcal {A}}\) takes values in \({\mathfrak {u}}(m)\), the exponential factors can be dropped.
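Part (i) and the unitary case of part (iii) can be illustrated numerically along a single geodesic. The sketch below is only a sanity check under assumptions: the attenuations `a_generic` and `a_unitary`, the value of `tau` and the helper name are hypothetical, and the matrix norm is taken to be the Frobenius norm, as in the proof of part (i) below.

```python
import numpy as np

def max_frobenius_norm(a, tau, steps=2000):
    """Solve U' + a(t) U = 0 with U(tau) = id backwards in time (RK4) and
    return the largest Frobenius norm |U(t)|_F seen on the time grid."""
    m = a(0.0).shape[0]
    U = np.eye(m, dtype=complex)
    h, t = -tau / steps, tau
    rhs = lambda s, V: -a(s) @ V
    worst = np.linalg.norm(U)
    for _ in range(steps):
        k1 = rhs(t, U)
        k2 = rhs(t + h / 2, U + h / 2 * k1)
        k3 = rhs(t + h / 2, U + h / 2 * k2)
        k4 = rhs(t + h, U + h * k3)
        U = U + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
        worst = max(worst, np.linalg.norm(U))
    return worst

tau, m = 1.5, 2

# A generic (hypothetical) attenuation restricted to one geodesic.
a_generic = lambda t: np.array([[np.sin(t), 2.0], [t - 1.0, 0.5j]])
sup_a = max(np.linalg.norm(a_generic(s)) for s in np.linspace(0.0, tau, 1001))
bound = np.sqrt(m) * np.exp(tau * sup_a)       # right-hand side of part (i)
worst_generic = max_frobenius_norm(a_generic, tau)

# A skew-Hermitian, i.e. u(m)-valued, attenuation: U stays unitary.
a_unitary = lambda t: np.array([[1j * t, np.cos(t)], [-np.cos(t), -1j]])
worst_unitary = max_frobenius_norm(a_unitary, tau)

print("generic:", worst_generic, "<=", bound)
print("unitary:", worst_unitary, "vs sqrt(m) =", np.sqrt(m))
```

In the generic case the Gronwall-type bound is typically far from sharp, whereas in the skew-Hermitian case the norm stays at \(m^{1/2}\) up to discretisation error, consistent with the exponential factor being unnecessary there.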

Proof of Theorem 2.1

Following the construction outlined above, Proposition 2.2 yields an estimate of \(R_{\mathcal {A}}\) in terms of the norms \(\Vert {\mathcal {A}}_1 \Vert _{C^k(SM_1)}\) and it remains to replace this by \( \Vert {\mathcal {A}}\Vert _{C^k(SM)}\). Formally, this can be achieved by using Seeley’s extension operator \(E:C^\infty (SM,{\mathbb {C}}^{m\times m}) \rightarrow C^\infty (SM_1,{\mathbb {C}}^{m\times m})\) (Lemma 7.2). One can arrange (by multiplying with a fixed cut-off) that \({{\,\mathrm{supp}\,}}E{\mathcal {A}}\subset K\) for all \({\mathcal {A}}\in C^\infty (SM,{\mathbb {C}}^{m\times m})\) and a fixed \(K\subset SM_1^{\mathrm {int}}\). Then, as E is continuous between the respective \(C^k\)-spaces, setting \({\mathcal {A}}_1= E {\mathcal {A}}\) allows us to estimate \(\Vert {\mathcal {A}}_1 \Vert _{C^k(SM_1)} \lesssim \Vert {\mathcal {A}}\Vert _{C^k(SM)}\) as desired. \(\square \)

2.1 Proof of Proposition 2.2

We start by constructing suitable commuting frames, adapting [14, Lemma 5.1] to arbitrary dimensions \(d\ge 2\).

Lemma 2.3

Suppose \(\Sigma \subset \partial _+SM\backslash S\partial M\) is open and \(\{ P_1,\dots , P_{2d-2}\}\) is a commuting frame of \(T\Sigma \). Then these vector fields can be extended smoothly to the open set \(W_\Sigma =\{\varphi _t(x,v):(x,v)\in \Sigma , 0\le t\le \tau (x,v)\}\subset SM\) to yield a commuting frame \(\{X,P_1,\dots ,P_{2d-2}\}\) of \(TW_\Sigma \).

Remark 1

The lemma can be strengthened to allow \(\Sigma \subset \partial _+SM\) with \(\Sigma \cap S\partial M \ne \emptyset \). In that case, the extended vector fields are continuous on \(W_\Sigma \) and smooth on \(W_\Sigma \backslash S\partial M\). (One can show that the map \(\Phi \) below is a homeomorphism on \(\Sigma \times {\mathbb {R}}\) and an immersion in \(\Sigma \backslash S\partial M\times {\mathbb {R}}\). Since we do not use the stronger result, we omit the details.)

Proof

Let \((N,g)\) be a no-return extension of M (cf. Lemma 7.1) and denote the geodesic flow on N also by \(\varphi _t\). We claim that the map

$$\begin{aligned} \Phi : \Sigma \times {\mathbb {R}}\rightarrow SN,\quad (x,v,t)\mapsto \varphi _t(x,v) \end{aligned}$$

is a diffeomorphism onto its image. Injectivity follows immediately from the no-return property: If \(\Phi (x,v,t)=\Phi (y,w,s)\), then \(\gamma _{x,v}\) enters M both at times 0 and \(t-s\), which is impossible unless \((x,v,t)=(y,w,s)\). It remains to prove that \(\Phi \) is an immersion, so let us compute its derivative at \((x,v,t)\in \Sigma \times {\mathbb {R}}\): For a tangent vector \(\xi \oplus a \partial _t \in T_{(x,v)}\Sigma \oplus T_t{\mathbb {R}}\), we have

$$\begin{aligned} \Phi _*(\xi \oplus a \partial _t) = \mathrm {d}\varphi _t(x,v)(\xi ) + a X(\varphi _t(x,v))\in T_{\varphi _t(x,v)}SN. \end{aligned}$$

If \(\Phi _*(\xi \oplus a \partial _t) =0\), then the previous display implies \(\xi + a X(x,v) = 0\) and as X is transversal to \(\Sigma \), we must have \(a=0\) and \(\xi =0\). Hence, \(\Phi \) is an immersion and the claim follows from invariance of domain.

Now extend the vector fields \( P_1,\dots , P_{2d-2}\) to t-independent smooth vector fields \({\tilde{P}}_1, \dots ,\tilde{P}_{2d-2}\) on \(\Sigma \times {\mathbb {R}}\). Then \(\{\partial _t,\tilde{P}_1,\dots ,{\tilde{P}}_{2d-2}\}\) is a commuting frame on \(\Sigma \times {\mathbb {R}}\) which pushes forward along \(\Phi \) to a commuting frame \(\{X,P_1,\dots ,P_{2d-2}\}\) on \(\Phi (\Sigma \times {\mathbb {R}})\subset SN\). Restricting to \(W_\Sigma =\Phi (\Sigma \times {\mathbb {R}})\cap S M\) finishes the proof. \(\square \)
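
The two properties of the push-forward used in the last step can be recorded as follows (a routine check; the letters Y, Z are introduced here only for illustration and denote any two of the fields \(\partial _t,\tilde{P}_1,\dots ,\tilde{P}_{2d-2}\)). The derivative formula above with \(\xi =0\) and \(a=1\) gives the first identity, and naturality of the Lie bracket under the diffeomorphism \(\Phi \) gives the second:

$$\begin{aligned} \Phi _*\partial _t = X,\qquad [\Phi _*Y,\Phi _*Z] = \Phi _*[Y,Z] = 0. \end{aligned}$$

Moreover, since \(\Phi (\cdot ,0)\) is the inclusion \(\Sigma \hookrightarrow SN\), the push-forward of \(\tilde{P}_i\) restricts to the original field \(P_i\) on \(\Sigma \).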

Proof of Proposition 2.2

Let us first remark why (iii) holds true. The inverse \(U_{\mathcal {A}}^{-1}\) satisfies the equation \(XU_{\mathcal {A}}^{-1} - U_{\mathcal {A}}^{-1}{\mathcal {A}}=0 \) and forward estimates can be derived with the same arguments as for \(U_{\mathcal {A}}\). Further, if \({\mathcal {A}}\) is \({\mathfrak {u}}(m)\) valued, then \(U_{\mathcal {A}}\in U(m)\), which is compact. In particular \(\Vert U_{\mathcal {A}}\Vert _{L^\infty (SM)}\) can be bounded by an absolute constant, and no exponentials arise below.

To prove part (i), fix \((x,v)\in SM\) and note that \(U(t)=U_{\mathcal {A}}(\varphi _t(x,v))\) solves

$$\begin{aligned} {\dot{U}} + {\mathcal {A}}(\varphi _t(x,v)) U = 0 \text { for } 0\le t \le \tau (x,v)\quad \text { and } \quad U(\tau (x,v))=\mathrm {id}. \end{aligned}$$

Let \(v(t)= \vert U(t) \vert _F^2\) (with \(\vert \cdot \vert _F\) the Frobenius norm), then \( \vert \dot{v}(t) \vert = 2 \vert \mathrm {Re}\langle \dot{U}(t),U(t)\rangle _F \vert = 2 \vert \mathrm {Re} \langle {\mathcal {A}}(\varphi _t(x,v)) U(t),U(t)\rangle _F \vert \le 2 \vert {\mathcal {A}}(\varphi _t(x,v)) \vert _F \cdot \vert U(t) \vert ^2_F, \) where we used the Cauchy–Schwarz inequality and the sub-multiplicativity of the Frobenius norm. In particular, \(\dot{v}(t) \ge -2 \vert {\mathcal {A}}(\varphi _t(x,v)) \vert _F\, v(t)\). Thus, by Gronwall’s inequality (with reversed time), we have

$$\begin{aligned} v(t) \le v(\tau (x,v)) \exp \left( \int _t^{\tau (x,v)} 2 \vert {\mathcal {A}}(\varphi _s(x,v)) \vert _F \mathrm {d}s \right) ,\quad 0\le t \le \tau (x,v). \end{aligned}$$

Choose \(t=0\), so that the left-hand side becomes \(\vert U_{\mathcal {A}}(x,v)\vert _F^2\). Note that \(v(\tau (x,v))=\vert \mathrm {id}\vert _F^2=m\) and crudely bound the integral in the exponential by \(2 \tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}\). This concludes the proof of (i).
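
Written out, the bound obtained in this way reads as follows (a restatement of the last step; we assume here that \(\Vert {\mathcal {A}}\Vert _{L^\infty (SM)}\) denotes the supremum of \(\vert {\mathcal {A}}\vert _F\) over SM):

$$\begin{aligned} \vert U_{\mathcal {A}}(x,v)\vert _F^2 = v(0) \le m \exp \left( 2\int _0^{\tau (x,v)} \vert {\mathcal {A}}(\varphi _s(x,v))\vert _F\, \mathrm {d}s\right) \le m\, \mathrm{{e}}^{2\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}}, \end{aligned}$$

so that \(\vert U_{\mathcal {A}}(x,v)\vert _F \le \sqrt{m}\, \mathrm{{e}}^{\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}}\) for all \((x,v)\in SM\).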

In order to show (ii), we use the following inequality, which (in the unitary version) appears as part of Lemma 5.2 in [14]: If \({\mathcal {A}},F:SM\rightarrow {\mathbb {C}}^{m\times m}\) are continuous and \(G\in C(SM,{\mathbb {C}}^{m\times m})\) is the unique solution to \((X+{\mathcal {A}})G=-F\) on SM and \(G=0\) on \(\partial _-SM\), then

$$\begin{aligned} \Vert G \Vert _{L^\infty (SM)} \le m\tau _\infty \exp (2\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)})\cdot \Vert F\Vert _{L^ \infty (SM)}. \end{aligned}$$
(2.1)

We repeat its proof: One readily checks that

$$\begin{aligned} G(x,v) = U_{\mathcal {A}}(x,v) \int _0^{\tau (x,v)} (U_{\mathcal {A}}^{-1}F)(\varphi _t(x,v)) \mathrm {d}t ,\quad (x,v)\in SM \end{aligned}$$

and thus, \(\Vert G\Vert _{L^\infty (SM)}\le \tau _\infty \Vert U_{\mathcal {A}}\Vert _{L^\infty (SM)} \Vert U_{\mathcal {A}}^{-1} \Vert _{L^\infty (SM)} \Vert F\Vert _{L^\infty (SM)}\). The norms of \(U_{\mathcal {A}}^{\pm 1}\) can be bounded with (i) and thus, (2.1) follows.
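
For the reader's convenience, the constant in (2.1) can be traced as follows (a sketch, assuming that (i) and its analogue for the inverse are applied in the form \(\Vert U_{\mathcal {A}}^{\pm 1}\Vert _{L^\infty (SM)} \le \sqrt{m}\,\mathrm{{e}}^{\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}}\), which is what the Gronwall argument yields):

$$\begin{aligned} \tau _\infty \Vert U_{\mathcal {A}}\Vert _{L^\infty (SM)} \Vert U_{\mathcal {A}}^{-1}\Vert _{L^\infty (SM)} \le \tau _\infty \left( \sqrt{m}\,\mathrm{{e}}^{\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}}\right) ^2 = m\tau _\infty \,\mathrm{{e}}^{2\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}}. \end{aligned}$$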

To proceed, let \(\Sigma \subset \partial _+SM\backslash S\partial M\) be a small open subset (such that it admits a commuting frame). Let \(P_1,\dots ,P_{2d-2}\) be the vector fields on \(W_\Sigma \) provided by Lemma 2.3, and write \(P^\alpha = P_1^{\alpha _1}\cdots P^{\alpha _{2d-2}}_{2d-2}\) for a multi-index \(\alpha \in {\mathbb {Z}}_{\ge 0}^{2d-2}\). We claim that

$$\begin{aligned} \begin{aligned} \Vert U_{\mathcal {A}}\Vert _{k,\Sigma }&\overset{\mathrm {def}}{=} \sup _{j+\vert \alpha \vert = k}\Vert X^j P^\alpha U_{\mathcal {A}}\Vert _{L^\infty (W_\Sigma )} \lesssim _{k,\Sigma } \mathrm{{e}}^{(2k+1)\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}} \Vert {\mathcal {A}}\Vert _{C^k(SM)}^k \end{aligned}\nonumber \\ \end{aligned}$$
(2.2)

for all \(k\in {\mathbb {Z}}_{\ge 0}\). Since finitely many such sets \(\Sigma _1,\dots ,\Sigma _n\) suffice to ensure \(K\subset \bigcup _i W_{\Sigma _i}\), we have \( \Vert U_{\mathcal {A}}\Vert _{C^k(SM)}\le \sum _i\sum _{\ell \le k} \Vert U_{\mathcal {A}}\Vert _{\ell ,\Sigma _i} \lesssim _k \mathrm{{e}}^{2(k+1)\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}} (1+ \Vert {\mathcal {A}}\Vert _{C^k(SM)})^k \) and (ii) follows.

We prove (2.2) by induction over \(k\in {\mathbb {Z}}_{\ge 0}\). The case \(k=0\) follows from part (i), so let \(k\ge 1\) and assume the result is true for \(k-1\). Consider \(G = X^jP^\alpha U_{\mathcal {A}}\) for an integer \(j\ge 0\) and multi-index \(\alpha \) such that \(j+\vert \alpha \vert = k\). We have

$$\begin{aligned} (X+{\mathcal {A}})G = [{\mathcal {A}},X^jP^\alpha ] U_{\mathcal {A}}\text { on } SM\quad \text { and }\quad G = 0 \text { on } \partial _-SM, \end{aligned}$$

where \([\cdot ,\cdot ]\) denotes the commutator and the zero boundary values follow from \({\mathcal {A}}\) having compact support and thus \(U_{\mathcal {A}}\) being constant near \(\partial _-SM\). By (2.1) we conclude that \( \Vert X^jP^\alpha U_{\mathcal {A}}\Vert _{L^\infty (W_\Sigma )} \lesssim \mathrm{{e}}^{2\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}} \cdot \Vert [{\mathcal {A}},X^jP^\alpha ] U_{\mathcal {A}}\Vert _{L^\infty (W_\Sigma )} \) and since \( [{\mathcal {A}},X^jP^\alpha ]\) is a differential operator on SM of order \(k-1\) and with continuous coefficients \(\lesssim _{k} \Vert {\mathcal {A}}\Vert _{C^k(SM)}\), we have

$$\begin{aligned} \Vert X^jP^\alpha U_{\mathcal {A}}\Vert _{L^\infty (W_\Sigma )} \lesssim _{k} \mathrm{{e}}^{2\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}} \cdot \Vert {\mathcal {A}}\Vert _{C^k(SM)} \cdot \Vert U_{\mathcal {A}}\Vert _{k-1,\Sigma }. \end{aligned}$$
(2.3)
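
The equation for G used above follows from \((X+{\mathcal {A}})U_{\mathcal {A}}=0\) and from X commuting with \(P_1,\dots ,P_{2d-2}\). We record the short computation, together with the simplest instance \(k=1\) (for illustration only):

$$\begin{aligned} (X+{\mathcal {A}})X^jP^\alpha U_{\mathcal {A}}&= X^jP^\alpha (XU_{\mathcal {A}}) + {\mathcal {A}}X^jP^\alpha U_{\mathcal {A}} = -X^jP^\alpha ({\mathcal {A}}U_{\mathcal {A}}) + {\mathcal {A}}X^jP^\alpha U_{\mathcal {A}} = [{\mathcal {A}},X^jP^\alpha ]U_{\mathcal {A}},\\ [{\mathcal {A}},X]U_{\mathcal {A}}&= -(X{\mathcal {A}})U_{\mathcal {A}},\qquad [{\mathcal {A}},P_i]U_{\mathcal {A}} = -(P_i{\mathcal {A}})U_{\mathcal {A}}. \end{aligned}$$

In particular, for \(k=1\) the commutator is a multiplication operator, that is, of order \(0=k-1\), with coefficients controlled by \(\Vert {\mathcal {A}}\Vert _{C^1(SM)}\).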

Inequality (2.2) follows from the induction hypothesis, and we are done. \(\square \)
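
Explicitly, taking (2.3) at face value and iterating it down to \(k=0\) gives the following bookkeeping (a sketch, using \(\Vert {\mathcal {A}}\Vert _{C^\ell (SM)}\le \Vert {\mathcal {A}}\Vert _{C^k(SM)}\) for \(\ell \le k\) and the case \(k=0\) in the form \(\Vert U_{\mathcal {A}}\Vert _{0,\Sigma }\lesssim \mathrm{{e}}^{\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}}\)):

$$\begin{aligned} \Vert U_{\mathcal {A}}\Vert _{k,\Sigma } \lesssim _{k} \left( \mathrm{{e}}^{2\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}} \Vert {\mathcal {A}}\Vert _{C^k(SM)}\right) ^k \Vert U_{\mathcal {A}}\Vert _{0,\Sigma } \lesssim \mathrm{{e}}^{(2k+1)\tau _\infty \Vert {\mathcal {A}}\Vert _{L^\infty (SM)}} \Vert {\mathcal {A}}\Vert _{C^k(SM)}^k. \end{aligned}$$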

2.2 Consequences and Further Forward Estimates

We first recall that the standard linear X-ray transform

$$\begin{aligned} {\mathcal {I}}: C^\infty (SM) \rightarrow C^\infty (\partial _+SM ),\quad {\mathcal {I}}F(x,v) = \int _0^{\tau (x,v)} F(\varphi _t(x,v)) \mathrm {d}t, \end{aligned}$$

is continuous as a map \(H^k(SM)\rightarrow H^k(\partial _+SM)\) for all \(k\ge 0\) [22, Theorem 4.2.1].\(^{1}\) Independently of Theorem 2.1, this yields the following:

Corollary 2.4

Let \(f\in C^\infty (M,{\mathbb {C}}^m)\) and \( W\in C^\infty (SM,{\mathbb {C}}^{m\times m})\). Then

$$\begin{aligned} \Vert I_W f \Vert _{H^k(\partial _+SM)} \lesssim _k \Vert W \Vert _{C^k(SM)} \cdot \Vert f \Vert _{H^k(M)},\quad k\ge 0. \end{aligned}$$
(2.4)

Proof

As \(I_Wf = {\mathcal {I}}(Wf)\), this follows immediately from the \(H^k\)-continuity of \({\mathcal {I}}\) (in its straightforward extension to vector-valued functions) and the fact that pull-back by \(\pi :SM\rightarrow M\) yields a bounded linear map \(\pi ^*:H^k(M)\rightarrow H^k(SM)\). \(\square \)
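
The chain of inequalities behind this argument can be written out as follows (a sketch; here \(Wf\) is understood as \(W\cdot \pi ^*f\) on SM, and the middle step uses the Leibniz rule):

$$\begin{aligned} \Vert I_Wf\Vert _{H^k(\partial _+SM)} = \Vert {\mathcal {I}}(W\pi ^*f)\Vert _{H^k(\partial _+SM)} \lesssim _k \Vert W\pi ^*f\Vert _{H^k(SM)} \lesssim _k \Vert W\Vert _{C^k(SM)} \Vert \pi ^*f\Vert _{H^k(SM)} \lesssim _k \Vert W\Vert _{C^k(SM)} \Vert f\Vert _{H^k(M)}. \end{aligned}$$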

Further, using Lemma 1.5 (pseudo-linearisation) and Theorem 2.1, we obtain the following forward estimates for the non-abelian X-ray transform:

Corollary 2.5

Let \(\Phi ,\Psi \in C^k(M,{\mathbb {C}}^{m\times m})\), then

$$\begin{aligned} \Vert C_\Phi - C_\Psi \Vert _{H^k(\partial _+SM)} \le c_k(\Phi ,\Psi ) \cdot \Vert \Phi - \Psi \Vert _{H^k(M)},\quad k\ge 0, \end{aligned}$$

where

$$\begin{aligned} c_k(\Phi ,\Psi ) = c_{1,k} \exp \big (c_{2,k} ( \Vert \Phi \Vert _{L^\infty (M)} + \Vert \Psi \Vert _{L^\infty (M)} ) \big ) \cdot (1+ \Vert \Phi \Vert _{C^k(M)}+ \Vert \Psi \Vert _{C^k(M)})^{2k} \end{aligned}$$

for constants \(c_{1,k},c_{2,k}\) only depending on (M, g) and m. Further, if \(\Phi ,\Psi \) take values in \({\mathfrak {u}}(m)\), the exponential factors can be dropped.

Proof

We use the pseudo-linearisation identity \(C_\Phi - C_\Psi = R_\Phi \cdot I_{{\mathcal {W}}_{\Phi ,\Psi }} (\Phi - \Psi )\cdot \alpha ^*R_\Psi ^{-1}\) from Lemma 1.5 for the integrating factors provided by Theorem 2.1. The integrating factor \(R_\Phi \), acting via multiplication on \(H^k(\partial _+SM)\), has operator norm \(\le \Vert R_\Phi \Vert _{C^k(\partial _+SM)} \le \Vert R_\Phi \Vert _{C^k(SM)}\). A similar bound holds for \(\alpha ^*R_\Psi ^{-1}\), as \(\alpha \) is a diffeomorphism. Thus, by Corollary 2.4, we obtain

$$\begin{aligned} \Vert C_\Phi - C_\Psi \Vert _{H^k(\partial _+SM)} \lesssim _k \Vert R_\Phi \Vert _{C^k(SM)} \Vert {\mathcal {W}}_{\Phi ,\Psi } \Vert _{C^k(SM)} \Vert \Phi - \Psi \Vert _{H^k(M)} \Vert R_\Psi ^{-1} \Vert _{C^k(SM)}. \end{aligned}$$

As \( \Vert {\mathcal {W}}_{\Phi ,\Psi } \Vert _{C^k(SM)} \le \Vert R_\Psi \Vert _{C^k(SM)} \Vert R_\Psi \Vert _{C^k(SM)}\), the proof is finished by applying the estimates from Theorem 2.1. \(\square \)
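
As a sanity check on the form of \(c_k(\Phi ,\Psi )\), consider the abelian case \(m=1\) with \(k=0\) (an illustration only, not part of the proof). There \(C_\Phi =\exp ({\mathcal {I}}\Phi )\) on \(\partial _+SM\), by the relation \(\log C_\Phi (\gamma )=\int _0^\tau \Phi (\gamma (t))\mathrm {d}t\) from the introduction and with \({\mathcal {I}}\Phi \) short for \({\mathcal {I}}(\pi ^*\Phi )\), and one can argue directly:

$$\begin{aligned} C_\Phi - C_\Psi = {\mathcal {I}}(\Phi -\Psi ) \int _0^1 \exp \big ({\mathcal {I}}\Psi + s\,{\mathcal {I}}(\Phi -\Psi )\big )\, \mathrm {d}s, \qquad \vert C_\Phi - C_\Psi \vert \le \mathrm{{e}}^{\tau _\infty \max (\Vert \Phi \Vert _{L^\infty (M)},\Vert \Psi \Vert _{L^\infty (M)})}\, \vert {\mathcal {I}}(\Phi -\Psi )\vert , \end{aligned}$$

pointwise on \(\partial _+SM\). If \(\Phi ,\Psi \) are \({\mathfrak {u}}(1)\)-valued, that is, purely imaginary, the integrand has modulus one and the exponential factor disappears, in line with the last statement of the corollary.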

3 Local Inversion of Scattering Operators

This section prepares the local stability estimate from Sect. 4 by proving a quantitative version of the microlocal argument that underlies Uhlmann and Vasy’s method from [27].

Their argument relies on the following phenomenon: in the context of Melrose’s scattering calculus, ellipticity of an operator near a boundary point yields local injectivity. More precisely, if X is a manifold with boundary and \(A: C_c^\infty (X^\mathrm {int}) \rightarrow C^\infty (X^\mathrm {int})\) is a (classical) scattering pseudodifferential operator (\(\psi \)do), then the leading order behaviour at \(\partial X\) is captured by its scattering principal symbol, which is a smooth function \(\sigma _\mathrm {sc}: {}^\mathrm {sc}T_{\partial X}^*X\rightarrow {\mathbb {C}}\), defined on the total space of the scattering cotangent bundle over \(\partial X\). Ellipticity at \(p\in \partial X\) then means that

$$\begin{aligned} \inf _{\zeta \in {}^\mathrm {sc}T_{p}^*X }\vert \sigma _{\mathrm {sc}}(p,\zeta ) \vert > 0 \end{aligned}$$
(3.1)

and implies the existence of a neighbourhood \(O\subset X\) of p for which

$$\begin{aligned} \ker A \cap \{u \in L^2(X): {{\,\mathrm{supp}\,}}(u) \subset O\} = 0. \end{aligned}$$
(3.2)

Together with the Fredholm property between appropriate function spaces, this can be upgraded to a stability estimate for functions supported in O. The purpose of this section is to show that the size of O as well as constants in a stability estimate can be controlled in terms of a lower bound on the scattering principal symbol and an upper bound on a fixed semi-norm of A.

To formulate the theorem, let X be a compact manifold with boundary, fix a boundary defining function \(\rho :X\rightarrow [0,\infty )\) and write \(B(\partial X, h) = \{x \in X: \rho (x) < h\}\). Then in terms of the locally convex spaces

  • \(\Psi ^{m,\ell }_\mathrm {sc}(X) = \) Fréchet space of classical scattering \(\psi \)do’s of order \((m,\ell )\)

  • \(H^{s,r}_\mathrm {sc}(X) =\) Hilbert space of Sobolev-functions of regularity \((s,r)\),

discussed in Sect. 3.1 below, our result reads as follows:

Theorem 3.1

(Local inversion of scattering operators) Let \( V \subset \partial X\) be open and \(K\subset X\) compact with \(K\cap \partial X \subset V\). Suppose \(A\in \Psi ^{m,\ell }_\mathrm {sc}(X)\) satisfies

$$\begin{aligned} \lambda (A) = \inf \{ \vert \sigma _\mathrm {sc}(A) (z,\zeta )\vert : z\in V, \zeta \in {}^\mathrm {sc}T_z^*X \}>0. \end{aligned}$$
(3.3)
  1. (i)

    There are \(h, C>0\) such that all functions \(u \in L^2(X) \) with support contained in \(K\cap B(\partial X, h)\) obey the estimate

    $$\begin{aligned} \Vert u \Vert _{L^2(X) } \le C \Vert A u \Vert _{H_\mathrm {sc}^{-m,-(d+1 + 2\ell )/2}(X)}. \end{aligned}$$
    (3.4)
  2. (ii)

    As A varies, the constants h(A) and C(A) satisfy

    $$\begin{aligned} C(A) \vee h(A)^{-1} \le \omega (\Vert A \Vert \vee \lambda (A)^{-1} ) \end{aligned}$$
    (3.5)

    for a non-decreasing function \(\omega :[0,\infty )\rightarrow [0,\infty )\) (of polynomial growth) and a continuous \(\Psi _\mathrm {sc}^{m,\ell }\)-semi-norm \(\Vert \cdot \Vert \).

The proof of Theorem 3.1 can be sketched as follows: After localising to an h-neighbourhood of \(V\cap K\) (where A is elliptic), one constructs a parametrix \(A^+\) for which the residuals \(R_A=\mathrm {id}- A^+ A\) have \(L^2\)-operator norms of order O(h), such that for \(h\ll 1\), a local inverse of A can be obtained by a Neumann series. In order to derive a quantitative bound as in (ii) one then needs to find how certain operator norms of \(A^+\) and \(R_A\) depend on A.
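
In its simplest form, and ignoring the localisation (so this is only a schematic version of the argument, not the proof given below), the Neumann series step reads as follows: if \(\Vert R_A\Vert _{L^2\rightarrow L^2}\le 1/2\), then

$$\begin{aligned} (\mathrm {id}-R_A)^{-1} = \sum _{j\ge 0} R_A^j,\qquad \Vert (\mathrm {id}-R_A)^{-1}\Vert _{L^2\rightarrow L^2} \le \sum _{j\ge 0} 2^{-j} = 2,\qquad u = (\mathrm {id}-R_A)^{-1}A^+Au, \end{aligned}$$

so that \(\Vert u\Vert _{L^2(X)}\le 2\Vert A^+Au\Vert _{L^2(X)}\), and the right-hand side is controlled by a suitable norm of Au through the mapping properties of \(A^+\).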

From the usual construction of parametrices, it is clear that the maps \(A\mapsto A^+\) and \(A\mapsto R_A\) will be continuous in the appropriate Fréchet topologies, but as the maps are non-linear, a bound as in (3.5) is not immediate. However, using finite-order parametrices, one can make microlocal constructions more economical, such that all quantities depend only on \(\lambda (A)\) and a fixed semi-norm of A (corresponding to a fixed number of derivatives of its full symbol).

This reasoning seems to be part of the microlocal analysis folklore; yet the author is not aware of any reference for it, let alone in the setting of scattering pseudodifferential operators on manifolds. The novelty and usefulness of Uhlmann–Vasy’s argument thus warrant a careful analysis.

Finally, we remark that making the semi-norm \(\Vert \cdot \Vert \) from (3.5) more explicit is possible, but requires opening up the microlocal analysis machinery further, at the cost of obscuring the main argument. At the same time, the added benefit is minimal, for in later applications, A will be constructed in terms of a certain weight function W and the map \(W\mapsto A\) is both costly (in the sense that many derivatives of W need to be bounded in order to obtain control of \(\Vert A \Vert \)) and difficult to analyse quantitatively.

3.1 The Scattering Calculus

We summarise some aspects of Melrose’s scattering calculus [13] with the purpose of fixing notation and gathering the most relevant results in one place. See also [13, 28], [27, §2] and [25, §3.2].

First some general notation. Denote by \({{\bar{{\mathbb {R}}}}}^d\) the radial compactification of \({\mathbb {R}}^d\), obtained by glueing \({\mathbb {R}}^d\) and \([0,\infty )\times S^{d-1}\) along the identification \(x \mapsto (\vert x\vert ^{-1}, \vert x\vert ^{-1} x)\). More generally, given a vector bundle \(E\rightarrow X\), one can radially compactify the fibres to obtain a bundle \({\bar{E}}\rightarrow X\) [13, §1]. Further, we let \(\dot{C}^\infty (X) = \bigcap _k \rho ^k C^\infty (X)\) denote the space of functions which vanish to infinite order at \(\partial X\) (similarly defined over \(X\times X\)) and note that the natural inclusion \({\mathbb {R}}^d\subset {{\bar{{\mathbb {R}}}}}^d\) induces an isomorphism \({\mathcal {S}}({\mathbb {R}}^d)\cong \dot{C}^\infty ({{\bar{{\mathbb {R}}}}}^d)\).

We can now recall the definition of \(\Psi ^{m,\ell }_\mathrm {sc}(X)\), the space of classical scattering pseudodifferential operators on X.

Definition 3.2

A linear operator \(A:\dot{C}^\infty (X)\rightarrow \dot{C}^\infty (X)\) is in \(\Psi ^{m,\ell }_\mathrm {sc}(X)\), if the following two conditions are satisfied:

  1. (i)

    The Schwartz kernel of A is smooth away from the diagonal of \(X\times X\) and vanishes to infinite order at the boundary.

  2. (ii)

    In local coordinates \((x,y)=(x,y_1,\dots ,y_{d-1})\) with \(x\vert _{\partial X} = 0\), we have

    $$\begin{aligned} Au(x,y) = \int \mathrm{{e}}^{i\xi \frac{x-x'}{x^2} + i\eta \cdot \frac{y-y'}{x}} a(x,y,\xi ,\eta ) u(x',y') \mathrm {d}\xi \mathrm {d}\eta \frac{\mathrm {d}x' \mathrm {d}y'}{(x')^{d+1}}, \end{aligned}$$
    (3.6)

    for all \(u\in C^\infty (X)\) with compact support within the chart domain, where \(a:(0,\infty )_x\times {\mathbb {R}}^{d-1}_y \times {\mathbb {R}}_\xi \times {\mathbb {R}}^{d-1}_\eta \rightarrow {\mathbb {C}}\) is smooth and satisfies

    $$\begin{aligned} x^{\ell } \langle (\xi ,\eta ) \rangle ^{-m} a(x,y,\xi ,\eta ) \in C^\infty ([0,\infty )_x\times {\mathbb {R}}^{d-1}_y \times \bar{\mathbb {R}}^d_{(\xi ,\eta )}). \end{aligned}$$
    (3.7)

Note that we use the order convention from [27], that is, \(\Psi _\mathrm {sc}^{m,\ell }(X) \) increases as m and \(\ell \) increase. The definition above differs from the (equivalent) one given in [27] in that it describes A in terms of the local model \([0,\infty )_x\times {\mathbb {R}}^{d-1}_y\) for X rather than in terms of \({{\bar{{\mathbb {R}}}}}^d\). The formulation here is for example used in [25, Proof of Prop. 4.2] and has the advantage that \((\xi ,\eta )\) provide natural coordinates for the scattering cotangent bundle introduced below.

For the sake of completeness, we mention here that (3.7) could be replaced by the condition

$$\begin{aligned} \vert (x\partial _x)^k \partial _y^\alpha \partial _{(\xi ,\eta )}^\beta a(x,y,\xi ,\eta ) \vert \lesssim _{k,\alpha ,\beta } x^{-\ell } \langle (\xi ,\eta )\rangle ^{m-\vert \beta \vert } ,\quad (k,\alpha ),\beta \in {\mathbb {Z}}^{d}_{\ge 0} \end{aligned}$$
(3.8)

to obtain the larger class \(\Psi ^{m,\ell }_{\mathrm {scc}}(X)\) of (not necessarily classical) scattering \(\psi \)do’s. The advantage of using classical operators is that their principal symbols can be realised as functions, rather than as elements in a quotient space. In particular, there is a natural way to measure their magnitude (in the sense of size of semi-norms), which is crucial for the quantitative aspect of Theorem 3.1.

Finally, we remark that \(\Psi ^{m,\ell }_\mathrm {sc}(X)\) has a natural Fréchet space structure in which a sequence of operators \(A_n\) converges to 0, iff the (weighted) symbols \(a_n\) in (3.7) converge to 0 in the \(C^\infty \)-topology. In this topology, \(\Psi ^{m,\ell }_\mathrm {sc}(X)\subset \Psi _\mathrm {sc}^{m',\ell '}(X)\) is a closed subspace whenever \(m\le m'\) and \(\ell \le \ell '\).\(^{2}\)

Let us briefly discuss some key aspects of the scattering calculus: The leading order behaviour of an operator \(A\in \Psi ^{m,\ell }_\mathrm {sc}(X)\) at \(\partial X\) can be described in coordinates, where A takes the form (3.6), by

$$\begin{aligned} \sigma _\mathrm {sc}(A)(y,\xi ,\eta ) = x^{\ell } \langle (\xi ,\eta ) \rangle ^{-m} a(x,y,\xi ,\eta )\vert _{x=0}, \end{aligned}$$
(3.9)

which makes sense in view of the stated regularity in (3.7). In order to understand \(\sigma _\mathrm {sc}\) invariantly, one defines the so-called scattering cotangent bundle\(^{3}\) \({}^\mathrm {sc}T^*X\rightarrow X\) with fibres having the following coordinate-description:

$$\begin{aligned} {}^\mathrm {sc}T_p^*X =\left\{ \xi \frac{\mathrm {d}x}{x^2} + \eta \cdot \frac{\mathrm {d}y}{x} \big \vert _p\right\} \equiv {\mathbb {R}}^d_{(\xi ,\eta )}. \end{aligned}$$
(3.10)
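
As an illustration (a standard example, not needed in the sequel), consider \(X={{\bar{{\mathbb {R}}}}}^d\) with polar coordinates \(z=r\theta \) near infinity, boundary defining function \(x=r^{-1}\) and local coordinates y on \(S^{d-1}\). Then

$$\begin{aligned} \frac{\mathrm {d}x}{x^2} = -\mathrm {d}r,\qquad \frac{\mathrm {d}y_j}{x} = r\,\mathrm {d}y_j, \end{aligned}$$

so that scattering covectors are spanned by \(\mathrm {d}r\) and \(r\,\mathrm {d}y_j\), which are exactly the covectors of bounded length with respect to the Euclidean metric \(\mathrm {d}r^2 + r^2 g_{S^{d-1}}\). Equivalently, the coordinate differentials \(\mathrm {d}z_1,\dots ,\mathrm {d}z_d\) extend to a frame of \({}^\mathrm {sc}T^*{{\bar{{\mathbb {R}}}}}^d\) up to the boundary.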

Let \({}^\mathrm {sc}{\bar{T}}^*X \rightarrow X\) be the ball bundle obtained by radially compactifying the fibres of \({}^\mathrm {sc}T^*X\). Then under the identification indicated in (3.10), definition (3.9) yields a smooth map \(\sigma _\mathrm {sc}(A):{}^\mathrm {sc}\bar{T}^*_{\partial X} X\rightarrow {\mathbb {C}}\), defined on the total space of the pull-back of \({}^\mathrm {sc}{\bar{T}}^*X\) to \(\partial X\). We call \(\sigma _\mathrm {sc}(A)\) the scattering principal symbol of the operator A. The principal symbol map \(A\mapsto \sigma _\mathrm {sc}(A)\) fits into a split exact sequence of Fréchet spaces:

$$\begin{aligned} 0 \rightarrow \Psi _\mathrm {sc}^{m,\ell -1}(X) \hookrightarrow \Psi _\mathrm {sc}^{m,\ell }(X) \xrightarrow {\sigma _\mathrm {sc}} C^\infty ({}^\mathrm {sc}\bar{T}_{\partial X}^*X) \rightarrow 0 \end{aligned}$$
(3.11)

By this, we mean that it is a split exact sequence of vector spaces, with all involved maps being continuous; in particular, there is a continuous right inverse \(r: C^\infty ({}^\mathrm {sc}\bar{T}_{\partial X}^*X) \rightarrow \Psi ^{m,\ell }_\mathrm {sc}(X)\) to \(\sigma _\mathrm {sc}\).

The scattering principal symbol is also called ‘principal symbol at finite points’ and can be complemented by \(\sigma _p\), the ‘principal symbol at fibre infinity’. While the joint symbol \((\sigma _p, \sigma _\mathrm {sc})\) is needed, e.g. for regularity questions, for our purposes, it suffices to keep track of the boundary behaviour.

Exactness of (3.11) is stated in [13, Prop. 20] (where the scattering principal symbol is called ‘normal operator’ and denoted \(N_\mathrm {sc}\)), while a continuous linear right split (also called quantisation map) is discussed below equation (5.30) in the same notes. Finally, we remark here that our definition of \(\sigma _\mathrm {sc}\) differs from the one in [27], where the authors do not incorporate the pre-factor \(\langle (\xi ,\eta )\rangle ^{-m}\) in (3.9), which implies that ellipticity is witnessed by a lower bound \(\vert \sigma _\mathrm {sc}\vert \gtrsim \langle (\xi ,\eta )\rangle ^{m}\) rather than \(\vert \sigma _\mathrm {sc}\vert \gtrsim 1\) as in (3.1).

Next, we note that the product of two scattering \(\psi \)do’s is again a scattering \(\psi \)do. In fact, multiplication of operators yields a bilinear continuous map

$$\begin{aligned} \Psi ^{m,\ell }_\mathrm {sc}(X)\times \Psi ^{m',\ell '}_\mathrm {sc}(X)\rightarrow \Psi _\mathrm {sc}^{m+m',\ell +\ell '}(X) \end{aligned}$$
(3.12)

and the principal symbol behaves multiplicatively, that is

$$\begin{aligned} \sigma _{\mathrm {sc}}(AB) = \sigma _\mathrm {sc}(A) \cdot \sigma _\mathrm {sc}(B). \end{aligned}$$
(3.13)

The continuity claim can be verified by keeping track of semi-norms, when proving that scattering operators are closed under multiplication and is a direct consequence of [28, Prop. 3.5]; for (3.13) see also [13, Eqs. (5.1) and (5.14)].

Finally, a natural scale of Hilbert spaces that scattering operators act on is provided by \(H_\mathrm {sc}^{s,r}(X)\). On \({{\bar{{\mathbb {R}}}}}^d\) (the radial compactification of \({\mathbb {R}}^d\)), these spaces can be defined in terms of the standard Sobolev space on \({\mathbb {R}}^d\) as

$$\begin{aligned} H_\mathrm {sc}^{s,r}({{\bar{{\mathbb {R}}}}}^d) = \langle z\rangle ^{-r} H^s({\mathbb {R}}^d_z). \end{aligned}$$
(3.14)

In general, \(H_\mathrm {sc}^{s,r}(X)\) is defined by locally identifying X with open subsets of \({{\bar{{\mathbb {R}}}}}^d\). For \(s\ge 0\), they are related to the standard Sobolev spaces \(H^s(X)\) as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} H^s(X) \subset H^{s,r}_\mathrm {sc}(X) &{} \text { for } r\le -\frac{d+1}{2} \\ H^{s,r}_\mathrm {sc}(X)\subset H^s(X) &{} \text { for } r\ge -\frac{d+1}{2} + 2 s \end{array}\right. } \end{aligned}$$
(3.15)
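
To see where the threshold \(-\frac{d+1}{2}\) comes from, consider the model case \(X={{\bar{{\mathbb {R}}}}}^d\) with \(s=0\) (a sketch, assuming that \(L^2(X)\) is defined with respect to a smooth positive density, which near the boundary is comparable to \(\mathrm {d}x\,\mathrm {d}\theta \) for \(x=r^{-1}\) and \(\theta \in S^{d-1}\)). Since \(\mathrm {d}z = r^{d-1}\mathrm {d}r\,\mathrm {d}\theta = x^{-(d+1)}\mathrm {d}x\,\mathrm {d}\theta \) and \(\langle z\rangle \sim x^{-1}\) near infinity, we have

$$\begin{aligned} \int \vert u\vert ^2\, \mathrm {d}x\,\mathrm {d}\theta = \int \vert x^{(d+1)/2}u\vert ^2\, \mathrm {d}z, \qquad \text {that is,}\qquad L^2({{\bar{{\mathbb {R}}}}}^d) = \langle z\rangle ^{(d+1)/2}L^2({\mathbb {R}}^d) = H_\mathrm {sc}^{0,-(d+1)/2}({{\bar{{\mathbb {R}}}}}^d). \end{aligned}$$

The same computation identifies the measure \((x')^{-(d+1)}\mathrm {d}x'\mathrm {d}y'\) in (3.6) with the Euclidean one.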

An operator \(A\in \Psi ^{m,\ell }_\mathrm {sc}(X)\) then is continuous as map \(A:H^{s,r}_\mathrm {sc}(X)\rightarrow H^{s-m,r-\ell }_\mathrm {sc}(X)\) and indeed the inclusion

$$\begin{aligned} \Psi _\mathrm {sc}^{m,\ell }(X) \hookrightarrow {\mathcal {B}}(H^{s,r}_\mathrm {sc}(X), H^{s-m,r-\ell }_\mathrm {sc}(X)) \end{aligned}$$
(3.16)

into the space of bounded linear operators is continuous. The statements above are proved in [27, Section 2] and [28, Section 3.8], modulo continuity of (3.16), which follows from the open mapping theorem.

3.2 Proof of Theorem 3.1

As outlined above, we want to construct a local, finite-order parametrix for the scattering operator A. On the level of principal symbols, this corresponds to composition with the map \(z\mapsto 1/z\), suitably cut-off near zero. We, thus, start with a lemma that provides norm bounds for this composition map. To this end let \(\varphi :{\mathbb {C}}\rightarrow [0,1]\) be a smooth function, vanishing near zero and constant \(\equiv 1\) for \(\vert z \vert \ge 1\). Write \(\varphi _t(z) = \varphi (z/t)\) for \(t>0\) and define

$$\begin{aligned} \mathrm {inv}_t :C^\infty (M)\rightarrow C^\infty (M), \quad u \mapsto \left( x\mapsto \frac{\varphi _t(u(x))}{u(x)}\right) \end{aligned}$$
(3.17)

on an arbitrary compact manifold M (with or without boundary), which will later be taken equal to the total space of \({}^\mathrm {sc}\bar{T}_{\partial X}^*X\). Then

Lemma 3.3

The map \(\mathrm {inv}_t:C^\infty (M)\rightarrow C^\infty (M)\) is continuous with respect to the \(C^\infty \)-topology; further, for every \(k\in {\mathbb {Z}}_{\ge 0}\), there exists \(C=C(k)>0\) with

$$\begin{aligned} \Vert \mathrm {inv}_t(u) \Vert _{C^k(M)} \le C \left( 1 + 1/t\right) ^{k+1} \left( 1 + {\Vert u \Vert _{C^k(M)}}\right) ^k,\quad u\in C^\infty (M),t>0. \end{aligned}$$

Proof

We prove more generally that composition with \(\chi \in C_b^\infty ({\mathbb {C}})\) (that is, \(\chi :{\mathbb {C}}\rightarrow {\mathbb {C}}\) is smooth and all derivatives are bounded) is continuous as map \(C^\infty (M)\rightarrow C^\infty (M)\), and we have

$$\begin{aligned} \Vert \chi \circ u \Vert _{C^k(M)} \lesssim _k \Vert \chi \Vert _{C^k_b({\mathbb {C}})} \cdot \left( 1 + \Vert u \Vert _{C^k(M)} \right) ^k \end{aligned}$$
(3.18)

such that the result follows from setting \(\chi (z) = \varphi _t(z)/z\). For simplicity, we only consider the case that u is real valued and \(\chi \in C_b^\infty ({\mathbb {R}},{\mathbb {R}})\) (the complex case only requires notational changes) and assume that M has empty boundary, noting that the general case can then be obtained by means of Seeley’s extension theorem (Lemma 7.2).

Choose local coordinates \(x^1,\dots ,x^d\) and note that \(\partial ^\alpha (\chi \circ u)\) (for \(\alpha \in {\mathbb {Z}}_{\ge 0}^d\) with \(\vert \alpha \vert =k\)) may be written as a finite linear combination of terms of the form:

$$\begin{aligned} P_{m,\beta }u\overset{\mathrm {def}}{=} \left( \chi ^{(m)}\circ u\right) \cdot \prod _{i=1}^m \partial ^{\beta _i} u, \quad m \le k, \beta _i \in {\mathbb {Z}}_{\ge 0}^{d} \text { with } \sum _{i=1}^m \vert \beta _i \vert = k. \end{aligned}$$
(3.19)

Let \(K\subset M\) be a compact set inside the chart that supports \(x^1,\dots ,x^d\). Then

$$\begin{aligned} \Vert P_{m,\beta } u \Vert _{L^\infty (K)} \lesssim _{m,\beta } \Vert \chi \Vert _{C_b^k({\mathbb {C}})}\cdot \Vert u \Vert _{C^k(M)}^m \quad \text { for all } u \in C^\infty (M) \end{aligned}$$
(3.20)

and

$$\begin{aligned} \Vert P_{m,\beta } u_n - P_{m,\beta } u \Vert _{L^\infty (K)} \rightarrow 0 \quad \text { when } u_n \rightarrow u \text { in } C^\infty (M). \end{aligned}$$
(3.21)

Since \(C^k(M)\) can be normed by the sum of \(\Vert \partial ^\alpha \cdot \Vert _{L^\infty (K)}\), where \(\alpha \) runs through multi-indices in \({\mathbb {Z}}_{\ge 0}^d\) with \(\vert \alpha \vert \le k\) and K through compacts inside of chart domains, the previous two displays establish the desired result. \(\square \)
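
To illustrate (3.19) and the dependence on t (a sketch in the real-valued setting of the proof; the auxiliary function \(\psi (w)=\varphi (w)/w\) is introduced here only for this purpose), note that for \(k=1,2\) the chain rule gives

$$\begin{aligned} \partial _i(\chi \circ u)&= (\chi '\circ u)\,\partial _iu,\\ \partial _i\partial _j(\chi \circ u)&= (\chi ''\circ u)\,\partial _iu\,\partial _ju + (\chi '\circ u)\,\partial _i\partial _ju, \end{aligned}$$

while for \(\chi (z)=\varphi _t(z)/z = t^{-1}\psi (z/t)\) we have

$$\begin{aligned} \chi ^{(j)}(z) = t^{-(j+1)}\psi ^{(j)}(z/t),\qquad \text {hence}\qquad \Vert \chi \Vert _{C_b^k} \le (1+1/t)^{k+1}\, \Vert \psi \Vert _{C_b^k}. \end{aligned}$$

The function \(\psi \) has bounded derivatives of all orders, since it vanishes near zero and equals \(1/w\) for \(\vert w\vert \ge 1\). Together with (3.18), this accounts for the factor \((1+1/t)^{k+1}\) in the lemma.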

Next, we construct a local, finite-order parametrix for an operator \(A\in \Psi _\mathrm {sc}^{m,\ell }(X)\) which is elliptic in an open set \(U\subset {}^\mathrm {sc}{\bar{T}}^*_{\partial X}X\). By this, we mean that

$$\begin{aligned} \lambda (A) = \inf _U \vert \sigma _\mathrm {sc}(A) \vert >0, \end{aligned}$$
(3.22)

which encompasses the definition in Theorem 3.1, where we have \(U=\{(z,\zeta ):z\in V,\zeta \in {}^\mathrm {sc}{\bar{T}}^*_zX\}\) for some open set \(V\subset \partial X\). Then

Lemma 3.4

There exist a local parametrix \(A^+\in \Psi _\mathrm {sc}^{-m,-\ell }(X)\) and a residual operator \(R_A\in \Psi ^{0,0}_\mathrm {sc}(X)\) with the following properties.

  1. (i)

    We have \(A^+A= \mathrm {id}- R_A\) and \({{\,\mathrm{supp}\,}}\sigma _\mathrm {sc}(R_A) \cap U = \emptyset \)

  2. (ii)

    Given continuous semi-norms \(\Vert \cdot \Vert \) and \(\Vert \cdot \Vert '\) on \(\Psi _\mathrm {sc}^{-m,-\ell }(X)\) and \(\Psi ^{0,0}_\mathrm {sc}(X)\) respectively, there exists a continuous semi-norm \(\Vert \cdot \Vert ''\) on \(\Psi ^{m,\ell }_\mathrm {sc}(X)\) and an integer \(k\ge 0\) such that, as A varies, we have

    $$\begin{aligned} \Vert A^+ \Vert \vee \Vert R_A\Vert ' \lesssim _{m,\ell } \left( 1 + \lambda (A)^{-1} \right) ^{k} \cdot (1 + \Vert A \Vert '' )^k. \end{aligned}$$
    (3.23)

Proof

Let \(r:C^{\infty }({}^\mathrm {sc}{\bar{T}}^*_{\partial X}X)\rightarrow \Psi _\mathrm {sc}^{-m,-\ell }(X)\) be a continuous right split for the short exact symbol sequence in (3.11) and define

$$\begin{aligned} A^+ =r\left( \mathrm {inv}_{\lambda (A)}(\sigma _\mathrm {sc}(A))\right) \in \Psi _\mathrm {sc}^{-m,-\ell }(X). \end{aligned}$$
(3.24)

Then, as \(U\subset \{(z,\zeta )\in {}^\mathrm {sc}{\bar{T}}^*_{\partial X}X: \vert \sigma _\mathrm {sc}(A)(z,\zeta )\vert \ge \lambda (A) \}\), we have

$$\begin{aligned} \sigma _\mathrm {sc}(A^+) = \mathrm {inv}_{\lambda (A)}(\sigma _\mathrm {sc}(A)) = \sigma _\mathrm {sc}(A)^{-1} \quad \text { on } U. \end{aligned}$$
(3.25)

In particular, defining \(R_A = \mathrm {id}- A^+ A\), we see from the multiplicativity of principal symbols that \(\sigma _\mathrm {sc}(R_A) = 0\) on U, such that (i) holds true.
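
In terms of symbols, this step amounts to the following computation (a short verification, using \(\sigma _\mathrm {sc}(\mathrm {id})=1\), the multiplicativity (3.13) and the identity \(\mathrm {inv}_t(u)\cdot u=\varphi _t\circ u\), which is immediate from (3.17)):

$$\begin{aligned} \sigma _\mathrm {sc}(R_A) = 1 - \sigma _\mathrm {sc}(A^+)\,\sigma _\mathrm {sc}(A) = 1 - \varphi _{\lambda (A)}\big (\sigma _\mathrm {sc}(A)\big ). \end{aligned}$$

The right-hand side vanishes wherever \(\vert \sigma _\mathrm {sc}(A)\vert \ge \lambda (A)\), in particular on the open set U, so the support of \(\sigma _\mathrm {sc}(R_A)\) does not meet U.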

Next, given a continuous semi-norm \(\Vert \cdot \Vert \) on \(\Psi _\mathrm {sc}^{-m,-\ell }(X)\), as r is a continuous linear map between Fréchet spaces, there exists an integer \(k\ge 0\) with

$$\begin{aligned} \Vert A^+ \Vert \lesssim \Vert \mathrm {inv}_{\lambda (A)}(\sigma _\mathrm {sc}(A)) \Vert _{C^k({}^\mathrm {sc}{\bar{T}}^*_{\partial X}X)} \lesssim (1+\lambda (A)^{-1})^{1+k}\cdot \left( 1 + \Vert \sigma _{\mathrm {sc}}(A) \Vert _{C^k({}^\mathrm {sc}{\bar{T}}^*_{\partial X}X)}\right) ^k, \end{aligned}$$

with implicit constants uniform in \(A\in \Psi _\mathrm {sc}^{m,\ell }(X)\) with \(\lambda (A)>0\) and where the second inequality follows from the preceding lemma. Finally, \(\sigma _{\mathrm {sc}}\) itself is a continuous linear map and thus \(\Vert \sigma _{\mathrm {sc}}(A) \Vert _{C^k({}^\mathrm {sc}{\bar{T}}^*_{\partial X}X)}\lesssim \Vert A \Vert ''\) for an appropriate semi-norm \(\Vert \cdot \Vert ''\). This completes the bounds on \(A^+\).

In order to bound \(\Vert R_A \Vert '\) (for a given semi-norm \(\Vert \cdot \Vert '\) on \(\Psi _\mathrm {sc}^{0,0}(X)\)), we use that multiplication of scattering operators gives a continuous bilinear map, such that

$$\begin{aligned} \Vert R_A\Vert ' \lesssim 1 + \Vert A \Vert '''\cdot \Vert A^+ \Vert '''' \end{aligned}$$
(3.26)

for an appropriate choice of semi-norms on the right-hand side. Combining this with the bounds on \(A^+\), the proof is complete. \(\square \)

We are now in a position to prove a slightly more general version of Theorem 3.1, which does not require ellipticity in all fibre directions. For this, recall that we have fixed a boundary defining function \(\rho :X\rightarrow [0,\infty )\).

Proposition 3.5

(Microlocal version of Theorem 3.1) Let \(\Gamma \subset U \subset {}^\mathrm {sc}{\bar{T}}_{\partial X}^*X\) be subsets such that \(\Gamma \) is compact and U is open.

  1. (i)

    Let \(A\in \Psi _\mathrm {sc}^{m,\ell }(X)\) with \(\lambda (A)>0\), as defined in (3.22). Then there exist constants \(h,C>0\) as well as a continuous semi-norm \(\Vert \cdot \Vert _0 \) on \(\Psi _\mathrm {sc}^{0,0}(X)\) with the following property: If \(Q=q_1Q_2\in \Psi _\mathrm {sc}^{0,0}(X)\) is the product of a function \(q_1\in C^\infty (X)\) and an operator \(Q_2\in \Psi _\mathrm {sc}^{0,0}(X)\) such that

    $$\begin{aligned} \Vert \rho q_1 \Vert _{L^\infty (X)} \Vert Q_2\Vert _0 < h \quad \text { and } \quad {{\,\mathrm{supp}\,}}\sigma _\mathrm {sc}(Q_2) \subset \Gamma , \end{aligned}$$
    (3.27)

    then for all \(u\in L^2(X)\) we have

    $$\begin{aligned} \Vert u \Vert _{L^2(X)} \le C \Vert Q \Vert _0 \cdot \Vert A u \Vert _{H_\mathrm {sc}^{-m,-(d+1+2\ell )/2}(X)} + 2 \Vert (\mathrm {id}- Q) u \Vert _{L^2(X)}. \end{aligned}$$
    (3.28)
  2. (ii)

    As A varies in the open set of operators with \(\lambda (A)>0\), the constants h(A) and C(A) obey an estimate of the form:

    $$\begin{aligned} C(A) \vee h(A)^{-1} \le \left( 1 + \lambda (A)^{-1} \right) ^{k} \cdot (1 + \Vert A \Vert )^k \end{aligned}$$
    (3.29)

    for a continuous semi-norm \(\Vert \cdot \Vert \) on \(\Psi ^{m,\ell }_\mathrm {sc}(X)\) and an integer \(k\ge 0\).

Let us first demonstrate how Theorem 3.1 follows from this result:

Proof of Theorem 3.1

We apply Proposition 3.5 with \(U=\pi ^{-1}(V)\) and \(\Gamma =\pi ^{-1}(K')\), where \(K'\subset V\) is a compact set that contains \(K\cap \partial X\) in its interior and \(\pi :{}^\mathrm {sc}\bar{T}^*_{\partial X}X\rightarrow \partial X\) is the natural projection; we denote with \(h'\) and \(C'\) the constants from (i). Let \(Q=q_1q_2\in \Psi ^{0,0}(X)\) be the product of two functions \(q_1,q_2\in C^\infty (X)\) with

$$\begin{aligned} 1_{B(\partial X, h)} \le q_1 \le 1_{B(\partial X, 2h)} \quad \text { and }\quad 1_K\le q_2 \le 1_{V'}, \end{aligned}$$
(3.30)

where h remains to be chosen and \(V'\) is a neighbourhood of K with \(V'\cap \partial X \subset K'\). Now let \(h>0\) be such that

$$\begin{aligned} \Vert \rho q_1\Vert _{L^\infty (X)} \Vert q_2\Vert \le 2h \Vert q_2\Vert = h', \end{aligned}$$
(3.31)

then (3.27) is satisfied and we obtain (3.28). Since \(\{u: {{\,\mathrm{supp}\,}}(u) \subset K \cap B(\partial X, h) \} \subset \ker (\mathrm {id}- Q)\), this concludes the proof. \(\square \)

Proof of Proposition 3.5

Let \(A^+\) and \(R_A\) be as in Lemma 3.4. We first estimate the operator norm of \(QR_A\), acting on \(L^2(X) = H_\mathrm {sc}^{0,-(d+1)/2}(X)\). To this end, we write \(QR_A = (\rho q_1)\cdot (\rho ^{-1} Q_2 R_A)\) and treat the two factors separately. To estimate the second factor, consider the bilinear continuous map

$$\begin{aligned} \Psi ^{0,0}_{\mathrm {sc},\Gamma }(X) \times \Psi ^{0,0}_{\mathrm {sc},\Lambda }(X) \rightarrow \Psi _\mathrm {sc}^{0,-1}(X) \xrightarrow {\times \rho ^{-1}} \Psi _\mathrm {sc}^{0,0}(X)\subset {\mathcal {B}}(L^2(X)), \end{aligned}$$
(3.32)

where the involved spaces and maps are defined as follows: For \(L \subset {}^\mathrm {sc}{\bar{T}}^*_{\partial X}X\) compact we write \(\Psi ^{0,0}_{\mathrm {sc},L}(X)\) for the closed subspace of operators \(P\in \Psi _\mathrm {sc}^{0,0}(X)\) with \({{\,\mathrm{supp}\,}}\sigma _\mathrm {sc}(P) \subset L\); we let \(\Lambda = {}^\mathrm {sc}{\bar{T}}^*_{\partial X} X\backslash U\), such that \(R_A\in \Psi _{\mathrm {sc},\Lambda }^{0,0}(X)\). Then the first map in (3.32) is multiplication, which takes values in \(\Psi _\mathrm {sc}^{0,-1}(X)\) as \(\Lambda \cap \Gamma = \emptyset \). Now \(\rho ^{-1} Q_2 R_A \in {\mathcal {B}}(L^2(X))\) is the image of \((Q_2,R_A)\) under the map (3.32), and hence, its operator norm is bounded by \(\Vert Q_2 \Vert _0 \cdot \Vert R_A \Vert _0\) for a continuous semi-norm \(\Vert \cdot \Vert _0\) on \(\Psi ^{0,0}(X)\). Further, multiplication by \(\rho q_1\) has operator norm \(\le \Vert \rho q_1 \Vert _{L^\infty (X)}\). Overall, we get

$$\begin{aligned} \Vert QR_A \Vert _{L^2(X)\rightarrow L^2(X)} \le \Vert \rho q_1 \Vert _{L^\infty (X)} \cdot \Vert Q_2 \Vert _0 \cdot \Vert R_A \Vert _0. \end{aligned}$$
(3.33)

Put \(h=h (A)= \Vert R_A \Vert _0^{-1}/2\), then if Q obeys (3.27), the operator norm of \(QR_A\) is bounded by 1/2, which means that \(\mathrm {id}- QR_A\) is invertible in \({\mathcal {B}}(L^2(X))\). Write \(N=(\mathrm {id}- QR_A)^{-1} \in {\mathcal {B}}(L^2(X))\) for the inverse, then

$$\begin{aligned} u = NQA^+Au + N(\mathrm {id}- Q)u\quad \text { for all } u \in L^2(X). \end{aligned}$$
(3.34)

Now \(\Vert N \Vert _{L^2(X)\rightarrow L^2(X)}\le 2\) and thus, assuming without loss of generality that \(\Vert \cdot \Vert _0\) dominates the \(L^2(X)\)-operator norm, we obtain (3.28) with C(A) being twice the \(H_\mathrm {sc}^{-m,-\ell -\frac{d+1}{2}}(X)\rightarrow H_\mathrm {sc}^{0,-\frac{d+1}{2}}(X)\) operator norm of \(A^+\). Finally, the bound in (ii) follows from the one in Lemma 3.4 and we are done. \(\square \)
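
For orientation, the algebra behind (3.34) and (3.28) can be sketched as follows. We assume here, as the form of (3.34) suggests, that Lemma 3.4 provides the parametrix identity \(A^+A = \mathrm {id}- R_A\); the sign convention should be checked against the lemma. Applying Q to this identity and rearranging, and estimating the inverse by a Neumann series, gives

$$\begin{aligned} (\mathrm {id}- QR_A) u = QA^+Au + (\mathrm {id}- Q)u, \qquad \Vert N \Vert _{L^2(X)\rightarrow L^2(X)} \le \sum _{j\ge 0} \Vert QR_A \Vert _{L^2(X)\rightarrow L^2(X)}^j \le \sum _{j\ge 0} 2^{-j} = 2. \end{aligned}$$

Applying N to the first identity yields (3.34), and (3.28) then follows from the triangle inequality together with the bound on N.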

3.3 Vector-valued Case

Theorem 3.1 works equally well for operators that act between sections of vector bundles. In this section, we discuss the necessary changes in the case of trivial bundles (which is all we need in the sequel).

Let us write \(A\in \Psi ^{m,\ell }(X;{\mathbb {C}}^k)\) for \(k\times k\)-matrices of operators in \(\Psi ^{m,\ell }(X)\), understood to act between vector-valued functions in the obvious way. The scattering principal symbol is then a matrix-valued map \( \sigma _\mathrm {sc}(A): {}^\mathrm {sc}{\bar{T}}^*_{\partial X}X \rightarrow {\mathbb {C}}^{k\times k} \) and, using the notation \(\vert M \vert = (M^*M)^{1/2} \in {\mathbb {C}}^{k\times k}\) for matrices \(M \in {\mathbb {C}}^{k\times k}\), ellipticity of A is witnessed by an inequality of the form:

$$\begin{aligned} \vert \sigma _\mathrm {sc}(A)\vert> \lambda \quad \Leftrightarrow \quad \forall t \in {\mathbb {C}}^k: \langle \sigma _{\mathrm {sc}}(A) t , t \rangle > \lambda \vert t \vert ^2. \end{aligned}$$
(3.35)

Using this notation, Theorem 3.1 holds true for \(u \in L^2(X,{\mathbb {C}}^k)\) and is proved in the same way as the scalar case.

4 Local Stability of the Weighted X-ray Transform

This section is devoted to the proof of the following theorem, which is a more quantitative version of Theorem 1.3 in [20].

Theorem 4.1

Let \((M,g)\) be a compact Riemannian manifold of dimension \(d \ge 3\) and suppose \(p\in \partial M\) is a point of strict convexity. Then there exists a smooth function \( {\tilde{x}}: M\rightarrow {\mathbb {R}}\), strictly convex near p and satisfying

$$\begin{aligned} {\tilde{x}} \le 0 = {\tilde{x}} (p) \quad \text {and} \quad \vert \mathrm {d}\tilde{x}\vert _g \le 1, \end{aligned}$$

such that for all smooth, invertible matrix weights \(W:SM\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) the following holds true:

  1. (i)

    There exist \(C,h>0\) with the following property: For \(0<c<h\) let \(B=B(p,c/2)\) and \(O=\{{\tilde{x}} >-c\}\), then

    $$\begin{aligned} \Vert f \Vert _{L^2( B)} \le C \Vert I_W f \Vert _{H^1({\mathcal {M}}_O)} \quad \text {for } f\in L^2(M). \end{aligned}$$
    (4.1)
  2. (ii)

    As W varies, the maps \(W\mapsto C(W)\) and \(W\mapsto h(W)\) obey

    $$\begin{aligned} h(W)^{-1}\vee C(W)\le \omega (\Vert W\Vert _{C^k(SM)} \vee \Vert W^{-1} \Vert _{L^\infty (SM)}) \end{aligned}$$
    (4.2)

    for some non-decreasing \(\omega :[0,\infty )\rightarrow [0,\infty )\) and an integer \(k\ge 1\).

  3. (iii)

    Under small perturbations of M and p, in a sense made precise below, one can choose \(\omega \) and k to be constant.

Let us remark on a few aspects of the theorem: The bound \(\vert \mathrm {d}{\tilde{x}}\vert _g \le 1 \) can always be achieved by scaling \({\tilde{x}}\) and is included as it ensures that the set \(O=\{{\tilde{x}}>-c\}\) \((c>0)\) contains the geodesic ball \(B(p,c/2)\).
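
A short way to see the last claim is the following sketch, in which \(d_g\) denotes the Riemannian distance on M (our notation, assumed to be the distance defining the geodesic ball). The bound \(\vert \mathrm {d}{\tilde{x}}\vert _g \le 1\) makes \({\tilde{x}}\) 1-Lipschitz with respect to \(d_g\), so that for every q in the ball of radius c/2 around p we get

$$\begin{aligned} {\tilde{x}}(q) \ge {\tilde{x}}(p) - d_g(p,q) > 0 - \frac{c}{2} > -c, \end{aligned}$$

which places q in the set \(\{{\tilde{x}}>-c\}\).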

In order to make the perturbation result from part (iii) precise, assume that \(M=\{\vartheta \le 0\}\) for a smooth function \(\vartheta :M\rightarrow {\mathbb {R}}\) that is strictly convex in a neighbourhood U of p. Then, for small \(t>0\), the boundary of the manifold \(M_t=\{\vartheta \le -t\}\subset M\) is also strictly convex in U, and the theorem applies to the weighted X-ray transform of \(M_t\) (defined via integrals over the shorter geodesics with endpoints on \(\partial M_t\)). Then (iii) means that estimate (4.2) can be made uniform for \(t>0\) sufficiently small and \(q \in \partial M_t \cap U\) close to p.

Remark 2

The compactness condition is non-essential and has only been included to simplify bound (4.2). For non-compact M and h(W) replaced by \(h(W)\wedge h^*\) for a fixed upper bound \(h^*>0\), the relevant sets from part (i) lie within a compact subset \(L\subset M\) and (4.2) remains true after replacing the right-hand side by \(\omega (\Vert W \Vert _{C_L^k(SM)} \vee \Vert W^{-1} \Vert _{L^\infty (SM\vert _L)})\). Here the semi-norm \(\Vert \cdot \Vert _{C^k_L(SM)}\) is defined in local coordinates by taking the supremum over L of derivatives up to order k. In particular, if M can be embedded into a compact manifold \(M'\), then \(\Vert \cdot \Vert _{C_L^k(SM)} \lesssim \Vert \cdot \Vert _{C^k(SM')}\).

Proof of Theorem 4.1

The proof essentially consists of a careful inspection of the Uhlmann–Vasy method, which is composed of the following steps:

  1. (1)

    In a neighbourhood of p, the normal operator \(I_W^*I_W\) is modified to a ‘localised normal operator’ \(A^\chi _W\), defined over an auxiliary manifold X with \(p\in O = M\cap X\).

  2. (2)

    The operator \(A_W^\chi \) is shown to lie in the class \(\Psi ^{-1,0}_\mathrm {sc}(X)\) (Definition 3.2), elliptic near the ‘artificial boundary’ \(\partial X\). By Theorem 3.1, it is, thus, locally invertible in a neighbourhood of \(\partial X\).

  3. (3)

    A posteriori, the auxiliary manifold X is chosen such that the domain of injectivity includes \(O=M\cap X\). Stability estimates for \(A^\chi _W\) can then be translated into ones for \(I_W\).

Using Theorem 3.1(ii), the constants in the resulting stability estimate are then uniform under some control on \(\sigma _\mathrm {sc}(A^\chi _W)\) and \(\Vert A^\chi _W \Vert \) (for a semi-norm \(\Vert \cdot \Vert \) on \(\Psi ^{-1,0}_\mathrm {sc}(X)\)). As \(A^\chi _W\) depends homogeneously and (in the \(C^\infty \)-topology) continuously on W, this easily translates to uniformity in terms of W and eventually yields (ii).

We will now discuss the three steps above in more detail. However, as the method has been used in several previous articles (e.g. [20, 24, 25, 27]), the exposition below will be brief and focus on the application of our quantitative result from the previous section.

Step (1) We embed M into a closed manifold \((N,g)\) of the same dimension and extend the weight smoothly to \(W:SN\rightarrow {\mathbb {C}}^{m\times m}\). As \(p\in \partial M\) is a point of strict convexity, it admits a neighbourhood \(U\subset N\) and coordinates \((\tilde{x},y):U\xrightarrow {\sim } {\mathbb {R}}\times {\mathbb {R}}^{d-1}\) for which

$$\begin{aligned} \{{\tilde{x}} \ge 0\}\cap M = \{p\} \quad \text {and} \quad {\tilde{x}} \text { is strictly convex near } p \end{aligned}$$
(4.3)

(cf. [20, Section 3] for a construction). The following constructions are carried out with respect to a small parameter \(0<c<c_0\) (and \(c_0\) chosen later), noting dependencies when necessary. Change coordinates to \((x,y)=({\tilde{x}} + c,y)\), such that \(\{x \ge 0 \}\) is the intersection of U with a compact manifold \(X\subset N\) with strictly concave boundary near p. Consider the parametrisation

$$\begin{aligned} {\mathbb {R}}_x\times {\mathbb {R}}^{d-1}_y \times {\mathbb {R}}_\lambda \times S^{d-2}_\omega \rightarrow SU ,\quad (x,y,\lambda ,\omega ) \mapsto \frac{\lambda \partial _x+ \omega \partial _y}{\vert \lambda \partial _x+ \omega \partial _y \vert _g}, \end{aligned}$$
(4.4)

with vectors parallel to \(\partial _x\) missing in the image (they are negligible, as eventually we are interested in geodesics that are ‘nearly tangent’ to \(\partial X\)). Pulling back the geodesic flow via (4.4) yields integral curves

$$\begin{aligned} \gamma _{x,y,\lambda ,\omega }(t) = \left( \gamma _{x,y,\lambda ,\omega }^{(1)}(t), \gamma _{x,y,\lambda ,\omega }^{(2)}(t) \right) \in {\mathbb {R}}\times {\mathbb {R}}^{d-1} \end{aligned}$$
(4.5)

and one may consider the following ‘localised normal operators’, acting on smooth functions \(f:[0,\infty )_x\times {\mathbb {R}}^{d-1}_y \rightarrow {\mathbb {C}}^m\) with suitable decay at \(x=0\):

$$\begin{aligned} \begin{aligned} A^\chi _Wf(x,y) = x^{-2} \mathrm{{e}}^{-1/x} \iiint&W^*(x,y,\lambda ,\omega ) (Wf) \left( \gamma _{x,y,\lambda ,\omega }(t), {{\dot{\gamma }}}_{x,y,\lambda ,\omega }(t) \right) \\&\times \,\mathrm{{e}}^{1/\gamma ^{(1)}_{x,y,\lambda ,\omega }(t)} \chi (x,y,\omega ,\lambda /x) ~ \mathrm {d}t \mathrm {d}\lambda \mathrm {d}\omega . \end{aligned} \end{aligned}$$
(4.6)

This corresponds to equation (4.1) in [20]. Let us discuss the ingredients of (4.6) in detail: Without loss of generality, we may assume that the interior of the box \(B= [0,2c]_x \times [-1,1]^{d-1}_y\) contains the portion of M within \(U\cap X\). Further, the ‘localising function’ \(\chi \) is assumed to satisfy (cf. Footnote 4)

$$\begin{aligned} {{\,\mathrm{supp}\,}}\chi \subset B\times S^{d-2} \times [-C_0,C_0] \end{aligned}$$
(4.7)

for some \(C_0>0\) and will later be chosen such that \(A^\chi _W\) is elliptic in an appropriate sense. The domain of integration in (4.6) is \([-\delta _0,\delta _0]_t \times {\mathbb {R}}_\lambda \times S^{d-2}_\omega \), where \(\delta _0>0\) is chosen small enough to satisfy the following criteria: First, we ask that the curves (4.5), starting from B, do not leave the coordinate chart for \(\vert t \vert \le \delta _0\). Second, and after decreasing \(c_0\) if necessary, we ask that

$$\begin{aligned} \gamma _{x,y,\lambda ,\omega }^{(1)}(t) \ge \frac{C_1}{2} \left( t + \frac{\lambda }{C_1}\right) ^2 + \left( x - \frac{\lambda ^2}{2C_1} \right) ,\quad (x,y)\in B, \vert t\vert , \vert \lambda \vert < \delta _0, \end{aligned}$$
(4.8)

for some \(C_1>0\). (See equation (3.2) in [27], where this inequality is derived for \(C_1\) essentially being a lower bound of the Hessian of \({\tilde{x}}\) near p).
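
To read (4.8) more easily, it may help to expand the square on its right-hand side; this is a purely algebraic identity and uses nothing beyond the symbols in (4.8):

$$\begin{aligned} \frac{C_1}{2} \left( t + \frac{\lambda }{C_1}\right) ^2 + \left( x - \frac{\lambda ^2}{2C_1} \right) = x + \lambda t + \frac{C_1}{2} t^2. \end{aligned}$$

Thus (4.8) states that the x-component of the curve stays above the parabola with initial height x, initial slope \(\lambda \) and second derivative \(C_1\). The minimum of this parabola over \(t\in {\mathbb {R}}\) equals \(x-\lambda ^2/(2C_1)\) and is attained at \(t=-\lambda /C_1\).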

Step (2) Note that \(A_W^\chi \) may be viewed as an operator \(C_c^\infty (X^\mathrm {int},{\mathbb {C}}^m) \rightarrow C^\infty (X^\mathrm {int},{\mathbb {C}}^m)\) with Schwartz kernel compactly contained in \((U\cap X)^2\). The crux is now that \(A_W^\chi \) fits into Melrose’s scattering calculus in the sense that \(A_W^\chi \in \Psi ^{-1,0}_\mathrm {sc}(X)\) and, upon a judicious choice of localiser \(\chi \), is elliptic near \(\partial X\cap M\). In particular, Theorem 3.1 (local inversion of scattering operators) can be used.

In order to give a precise statement, we recall that the constructions above depend on a parameter \(c>0\), and there is a whole family of operators \(A^\chi _W(c)\), defined over sub-manifolds \(X_c\subset N\) (with \(X_c\cap U=\{{\tilde{x}} + c \ge 0\}\)). We may assume that there is a flow \(\psi _c\) on N, defined for small \(c>0\), for which \(X_c=\psi _c(X_0)\).

Theorem 4.2

Upon choosing \(c_0,\lambda _0>0\) sufficiently small, we have

  1. (i)

    For all smooth localisers \(\chi \) with (4.7), the operator \(A^{\chi }_W(c)\in \Psi ^{-1,0}_\mathrm {sc}(X_c)\). Further, allowing \(\chi \) to depend continuously on c, the map

    $$\begin{aligned} \begin{aligned} {[}0,c_0) \times C^\infty (SN,{\mathbb {C}}^{m\times m})&\rightarrow \Psi ^{-1,0}_\mathrm {sc}(X_0), \quad (c,W) \mapsto \psi _c^* A^{\chi _c}_W(c) \end{aligned} \end{aligned}$$
    (4.9)

    is continuous with respect to the natural Fréchet topologies. Moreover, for any continuous semi-norm \(\Vert \cdot \Vert \) of \(\Psi ^{-1,0}_\mathrm {sc}(X_0)\) there is an integer \(k\ge 0\) such that

    $$\begin{aligned} \Vert \psi _c^* A^{\chi _c}_W(c) \Vert \lesssim \Vert W \Vert _{C^k(SN)}^2\quad \text { for all } 0\le c < c_0. \end{aligned}$$
    (4.10)
  2. (ii)

    There exists a localiser \(\chi \), smooth, satisfying (4.7) and depending continuously on c, such that for all \((c,W)\) in (4.9) we have

    $$\begin{aligned} \vert \sigma _\mathrm {sc}(A^{\chi _c}_W(c))(z,\zeta ) \vert \ge \lambda _0 \Vert W^{-1} \Vert _{L^\infty (SN)}^{-2},\quad z\in \partial X_c\cap M, \zeta \in {}^\mathrm {sc}T^*_zX_c. \end{aligned}$$
    (4.11)

Sketch of Proof

The proof is essentially carried out in Sect. 4 of [20]. We sketch the main aspects, highlighting dependencies on the weight.

Either by first computing the Schwartz kernel [20, Lemma 4.1] or directly (akin to [25]), one verifies that \(A_W^\chi \) has an oscillatory integral expression of the form (3.6), and the pseudodifferential property as well as the continuous dependence can be checked directly. We note here that continuous dependence on c is already implicitly used in [14] and continuous dependence on W is akin to continuous dependence on the metric as stated in e.g. [25, Prop. 4.2].

Further, (4.10) can be derived from (4.9) and the homogeneity of \(A_W^{\chi _c}(c)\) in W. This is most easily seen in a general functional analytic setting, where we are given two Fréchet spaces E and F and a continuous map

$$\begin{aligned} \varphi : [0,\infty )\times E\rightarrow F,\quad \text {with } \varphi (c,t \cdot )=t^2 \varphi (c,\cdot ) \quad (t,c\ge 0). \end{aligned}$$
(4.12)

Then the collection of sets \(\{(c,w):0\le c< \epsilon , \Vert w \Vert ' <1 \}\), where \(\epsilon >0\) and \(\Vert \cdot \Vert '\) runs through continuous semi-norms of E, constitutes a basis for the neighbourhoods of \((0,0)\in [0,\infty )\times E\). Given a continuous semi-norm \(\Vert \cdot \Vert \) on F, the set \(\{(c,w):\Vert \varphi (c,w) \Vert < 1 \}\) is an open neighbourhood of (0, 0) and thus we can find \(c_0>0\) and \(\Vert \cdot \Vert '\) with

$$\begin{aligned} \{(c,w): 0\le c< c_0, \Vert w \Vert '< 1\} \subset \{(c,w):\Vert \varphi (c,w) \Vert < 1 \}. \end{aligned}$$
(4.13)

Now take \(0\le c <c_0\) and \(w\in E\), then \((c,w/(2\Vert w \Vert '))\) lies in the left set and thus

$$\begin{aligned} \Vert \varphi (c,w) \Vert = 4 (\Vert w \Vert ')^2 \cdot \Vert \varphi (c,w/(2\Vert w \Vert ')) \Vert \le 4 (\Vert w \Vert ')^2, \end{aligned}$$
(4.14)

as desired.
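
The computation (4.14) implicitly assumes \(\Vert w \Vert '\ne 0\). The remaining case can be treated with the same homogeneity, as the following sketch indicates: if \(\Vert w \Vert ' = 0\), then \((c,tw)\) lies in the left set of (4.13) for every \(t>0\), and hence

$$\begin{aligned} \Vert \varphi (c,w) \Vert = t^{-2} \Vert \varphi (c,tw) \Vert < t^{-2} \quad \text { for all } t>0. \end{aligned}$$

Letting \(t\rightarrow \infty \) gives \(\Vert \varphi (c,w) \Vert = 0\), so the bound \(\Vert \varphi (c,w) \Vert \le 4 (\Vert w \Vert ')^2\) holds for all \(w\in E\).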

To prove (ii), we first fix c. Then the symbol in said oscillatory integral expression, restricted to \(x=0\), takes the form:

$$\begin{aligned} \begin{aligned} a(0,y,\xi ,\eta ) = \iiint&\mathrm{{e}}^{i\xi ({{\hat{\lambda }}} {\hat{t}} +\alpha (0,y,0,\omega ){\hat{t}}^2) + i\eta \cdot \omega {\hat{t}} } \cdot \mathrm{{e}}^{-{{{\hat{\lambda }}} {\hat{t}} - \alpha (0,y,0, \omega ) {\hat{t}}^2}} \\&\times W^*W(0,y,0,\omega ) \chi (0,y,{{\hat{\lambda }}}, \omega ) \mathrm {d}\hat{\lambda }\mathrm {d}{\hat{t}} \mathrm {d}\omega , \end{aligned} \end{aligned}$$
(4.15)

where \(\alpha (x,y,\lambda ,\omega )=(\mathrm {d}/\mathrm {d}t)^2 \gamma _{x,y,\lambda ,\omega }^{(1)}(t) > 0\) (say, for \((x,y)\in B\)) and the integral domain is \({\mathbb {R}}_{{{\hat{\lambda }}}}\times {\mathbb {R}}_{{\hat{t}}}\times S^{d-2}_\omega \). For the particular choice \(\chi (x,y,{{\hat{\lambda }}}, \omega ) = \exp (-{{\hat{\lambda }}}^2/(2\alpha (x,y,\lambda ,\omega )))\) (multiplied with a cut-off in \((x,y)\) to ensure that it is supported in B), the integral in the last display can further be evaluated to obtain a non-zero multiple of

$$\begin{aligned} \langle \xi \rangle ^{-1} \int _{S^{d-2}} (W^*W)(0,y,0,\omega ) \mathrm{{e}}^{-\vert \eta \cdot \omega / \langle \xi \rangle \vert ^2/ 2 \alpha (0,y,0,\omega )} \mathrm {d}\omega , \end{aligned}$$
(4.16)

which corresponds to the second display below equation (4.10) in [20]. Following the reasoning of [20, proof of Prop. 4.3] below said expression yields

$$\begin{aligned} \langle (\xi ,\eta ) \rangle a(0,y,\xi ,\eta ) \ge 2\lambda _0\cdot \Vert W^{-1} \Vert _{L^\infty (SN)}^{-2} \end{aligned}$$
(4.17)

for a constant \(\lambda _0\) only depending on the local geometry near p. Here, it was used that \(W^*W(0,y,0,\omega )\) is bounded from below by the square of the smallest singular value of W, which is in turn lower bounded by \(\Vert W^{-1} \Vert _{L^\infty (SN)}^{-2}\).
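
The last step is elementary linear algebra and can be spelled out as follows, under the assumption that \(\Vert W^{-1} \Vert _{L^\infty (SN)}\) is taken with respect to the operator norm \(\vert \cdot \vert _{\mathrm {op}}\) on matrices (or a norm dominating it). At a fixed point of SN and for \(t\in {\mathbb {C}}^m\), we have \(\vert t \vert = \vert W^{-1}Wt\vert \le \vert W^{-1} \vert _{\mathrm {op}} \vert Wt \vert \), and therefore

$$\begin{aligned} \langle W^*W t, t\rangle = \vert W t \vert ^2 \ge \vert W^{-1} \vert _{\mathrm {op}}^{-2} \vert t \vert ^2 \ge \Vert W^{-1} \Vert _{L^\infty (SN)}^{-2} \vert t \vert ^2. \end{aligned}$$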

The localiser \(\chi \) above has full support in \({{\hat{\lambda }}}\) and therefore fails to satisfy (4.7). In the proof of [20, Prop. 4.3], \(\chi \) is for this reason approximated by localisers with compact \({{\hat{\lambda }}}\)-support, which obey (4.7) for some \(C_0>0\). From (4.15), it follows that the approximation is uniform in W, at least under an a priori bound \(\Vert W \Vert _{L^\infty (SN)} \le 1\). This proves part (ii) for all W with \(\Vert W \Vert _{L^\infty (SN)} \le 1\), and the general case follows from a scaling argument, noting that both sides of (4.11) are homogeneous in W of degree 2.
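
The scaling argument can be made explicit as follows; the sketch assumes, as in the construction above, that the localiser \(\chi _c\) is chosen independently of W, and the rescaled weight \({\tilde{W}}\) is our notation. For a general weight W put \({\tilde{W}} = W/\Vert W \Vert _{L^\infty (SN)}\), so that \(\Vert {\tilde{W}} \Vert _{L^\infty (SN)} = 1\). Since \(A^{\chi _c}_W(c)\) is homogeneous of degree 2 in W, we have

$$\begin{aligned} \sigma _\mathrm {sc}(A^{\chi _c}_{{\tilde{W}}}(c)) = \Vert W \Vert _{L^\infty (SN)}^{-2}\, \sigma _\mathrm {sc}(A^{\chi _c}_{W}(c)) \quad \text { and } \quad \Vert {\tilde{W}}^{-1} \Vert _{L^\infty (SN)}^{-2} = \Vert W \Vert _{L^\infty (SN)}^{-2}\, \Vert W^{-1} \Vert _{L^\infty (SN)}^{-2}. \end{aligned}$$

Multiplying the bound (4.11) for \({\tilde{W}}\) by \(\Vert W \Vert _{L^\infty (SN)}^{2}\) then gives (4.11) for W, with the same constant \(\lambda _0\).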

Finally, we comment on the c-dependency: note that \(\alpha (x,y,\lambda ,\omega )\) (and, thus, \(\chi \)) implicitly depends on c through the choice of \(x={\tilde{x}} + c\). However, the dependence is clearly continuous, and \(\alpha \) can be bounded in terms of the geometry near p. In particular, the bound (4.11) is uniform in c. \(\square \)

By Theorem 4.2, for an invertible weight W, the operator \(A^{\chi }_W(c)\) is locally elliptic for suitably chosen \(\chi \) and sufficiently small \(c>0\). In particular, Theorem 3.1 can be applied and, for constants \(C,h>0\) (depending on W and c), we obtain

$$\begin{aligned} \Vert f \Vert _{L^2(X_c)} \le C \Vert A^\chi _W(c) f\Vert _{H_\mathrm {sc}^{1,-(d+1)/2}(X_c)},\quad \text { if } {{\,\mathrm{supp}\,}}f \subset M\cap B(\partial X_c, h). \end{aligned}$$
(4.18)

Due to (4.10) and (4.11), the uniformity statement of Theorem 3.1(ii) gives

$$\begin{aligned} C(W,c) \vee h(W,c)^{-1} \le \omega (\Vert W \Vert _{C^k(SN)} \vee \Vert W^{-1} \Vert _{L^\infty (SN)} ), \end{aligned}$$
(4.19)

valid for sufficiently small \(c>0\) and all smooth weights \(W:SN\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\). Here, \(\omega :[0,\infty )\rightarrow [0,\infty )\) is a non-decreasing function and \(k\ge 1\).

Step (3) From now on, we argue with a fixed weight \(W:SN\rightarrow {{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) and keep track of how the arising constants depend on

$$\begin{aligned} A = \Vert W \Vert _{C^k(SN)} \vee \Vert W^{-1} \Vert _{L^\infty (SN)}>0. \end{aligned}$$
(4.20)

Fix \(0<c<h\), then \(O_c=M\cap X_c\) lies in \(M\cap B(\partial X_c,h)\) and consequently (dropping the c-subscripts from now on)

$$\begin{aligned} \Vert f \Vert _{L^2(O)} \lesssim _A \Vert A^\chi _W f\Vert _{H_\mathrm {sc}^{1,-(d+1)/2}(X)},\quad f\in L^2(M), \end{aligned}$$
(4.21)

where it is understood that f is extended by zero outside of M. In order to translate this into a stability estimate for \(I_W\), we factor the operator \(A_W^\chi \) as

$$\begin{aligned} A_W^\chi f = x^{-2} \mathrm{{e}}^{-1/x} L_W^\mu {\tilde{I}}_W (\mathrm{{e}}^{1/x}f),\quad f\in L^2(M), \end{aligned}$$
(4.22)

where the operators \(L^\mu _W\) and \({\tilde{I}}_W\) are defined as follows: We may assume that \({\bar{U}}\) (viewed as a manifold with boundary) is simple and denote with \(\tau _U:S{\bar{U}}\rightarrow [0,\infty )\) its exit time. Note that we can write \(\chi (x,y,\lambda /x,\omega ) \mathrm {d}\lambda \mathrm {d}\omega = \mu (z,v) \mathrm {d}v\) on SU for a smooth function \(\mu :SU\rightarrow {\mathbb {R}}\) with compact support. Then

$$\begin{aligned} L^\mu _W: C(\partial _+S{\bar{U}})\rightarrow C(U), \quad L^\mu _W u(z) =\int _{S_zN} W^*(z,v) u^\sharp (z,v) {\mu }(z,v) \mathrm {d}v, \end{aligned}$$
(4.23)

where \(u^\sharp \) extends u constant along the geodesic flow. Further, \({\tilde{I}}_W\) is the weighted X-ray transform, defined with respect to the manifold \({\bar{U}}\) and (4.22) is evident, as \(f\vert _U\) is supported in \(M\cap U\), and thus, no additional mass is collected by integrating along complete geodesics in \({\bar{U}}\).

Define \({{\tilde{{\mathcal {M}}}}} \subset \partial _+S{\bar{U}}\) to consist of initial conditions \((z,v)\) for which \(z\in X\), the geodesic \(\gamma _{z,v}(t)\) enters B for some \(0\le t \le \tau _U(z,v)\) but does not hit \(\partial X\cap M\). After decreasing h if necessary, we can assume that

$$\begin{aligned} \{\varphi _t(z,v): 0\le t\le \tau _U(z,v)\}\cap {{\,\mathrm{supp}\,}}\mu = \emptyset \quad \text { for } (z,v) \in \partial _+S{\bar{U}} \backslash {{\tilde{{\mathcal {M}}}}}. \end{aligned}$$
(4.24)

Indeed, assume that \(h<C_1/C_0^2\wedge \delta _0/(2C_0)\), where \(\delta _0,C_0,C_1\) are the constants from (4.7) and (4.8). Then if the integral curve starting at \((z,v)\in \partial _+S{\bar{U}}\) enters the support of \(\mu \) at, say, \((x,y,\lambda ,\omega )\), we must have \(0<x<2c<2h\) and \(\vert \lambda /x \vert < C_0\), which implies that \(\vert \lambda \vert <\delta _0\) and \(x-\lambda ^2/(2C_1)>x(1-x C_0^2/(2C_1))>x(1-hC_0^2/C_1)>0\). In particular, the right-hand side of (4.8) is positive, and the curve cannot intersect \(M\cap \partial X\).
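
For the reader's convenience, the two chains of inequalities used here can be written out with the hypothesis entering at each step. From \(\vert \lambda \vert < C_0 x\), \(x<2h\) and the two upper bounds on h, one obtains

$$\begin{aligned} \vert \lambda \vert< C_0 x< 2C_0 h< \delta _0 \quad \text { and } \quad x - \frac{\lambda ^2}{2C_1}> x - \frac{C_0^2x^2}{2C_1} = x\left( 1 - \frac{xC_0^2}{2C_1}\right)> x\left( 1 - \frac{hC_0^2}{C_1}\right) > 0, \end{aligned}$$

where the first chain uses \(h<\delta _0/(2C_0)\) and the last inequality of the second chain uses \(h<C_1/C_0^2\).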

To proceed, take \(K\subset O\) compact (such as the geodesic ball \(B(p,c/2)\), when \({\tilde{x}}\) is scaled to satisfy \(\vert \nabla {\tilde{x}}\vert \le 1\)). We then have for all \(f\in L^2(M)\)

$$\begin{aligned} \Vert f \Vert _{L^2(K)} \lesssim _K \Vert \mathrm{{e}}^{-1/x} f \Vert _{L^2(O)} \lesssim _A \Vert x^{-\frac{d-1}{2} }\mathrm{{e}}^{-1/x} L^\mu _W {\tilde{I}}_W f \Vert _{H^1(X)}, \end{aligned}$$
(4.25)

where the first estimate follows from the fact that \(\mathrm{{e}}^{1/x}\) and all its derivatives are bounded on K and the second estimate follows from equation (4.21) and the inclusion \(H_\mathrm {sc}^{1,\ell }(X) \subset x^{-\ell } H^{1,0}_\mathrm {sc}\subset x^{-\ell } H^1(X) \) for \(\ell = -(d+1)/2\).

Note that the function on the right-hand side in (4.25) is compactly supported in U (due to the support condition on \(\mu \)) and that \(x^{-(d-1)/2}\mathrm{{e}}^{-1/x}\) and all of its derivatives extend by zero to a bounded function on U. Thus,

$$\begin{aligned} \Vert f \Vert _{L^2(K)}\lesssim _{K,A} \Vert L^\mu _W {\tilde{I}}_W f\Vert _{H^1(U)},\quad \text { for all } f \in L^2(M) \end{aligned}$$
(4.26)

and it remains to bound the operator norm of \(L^\mu _W\) and relate \({\tilde{I}}_W\) to the transform \(I_W\) we are actually interested in.

Lemma 4.3

For all \(k\ge 0\), the operator \(L^\mu _W : H^k({{\tilde{{\mathcal {M}}}}})\rightarrow H^k(U)\) is bounded with operator norm \(\lesssim \Vert W \Vert _{C^k(S{\bar{U}})}\).

Proof of Lemma 4.3

We prove the lemma in a slightly more general setting, when \({{\tilde{{\mathcal {M}}}}}\subset \partial _+S{\bar{U}}\) is any open subset with closure not intersecting \(S\partial {\bar{U}}\) and \(\mu :SU\rightarrow {\mathbb {R}}\) is a smooth function with compact support satisfying (4.24).

The lemma then follows from factorising \(L^\mu _W\) as

$$\begin{aligned} H^k({{\tilde{{\mathcal {M}}}}}) \xrightarrow {E} H^k_c(\partial _+S{\bar{U}} ^\mathrm {int}) \xrightarrow {\sharp } H^k(SU) \xrightarrow {\times \mu W^*} H^k(SU) \xrightarrow {\pi _*} H^k(U) \end{aligned}$$
(4.27)

with the following factors: E is an extension operator (cf. Lemma 7.2), which may be chosen to map to compactly supported functions, as \({{\tilde{{\mathcal {M}}}}}\) is assumed to have compact closure in \(\partial _+S{\bar{U}}^\mathrm {int}= \partial _+S{\bar{U}}\backslash S\partial {\bar{U}}\). Due to condition (4.24), the precise choice of E is irrelevant. Next, the map \(\sharp \), defined below (4.23), is continuous, as under the isomorphism

$$\begin{aligned} \{(z,v,t)\in \partial _+S{\bar{U}}^\mathrm {int}\times {\mathbb {R}}: 0< t <\tau _U(z,v)\} \cong SU,\quad (z,v,t)\mapsto \varphi _t(z,v) \end{aligned}$$
(4.28)

it corresponds to pull back by the projection \(\mathrm {pr}_1:\partial _+S{\bar{U}}^\mathrm {int}\times {\mathbb {R}}\rightarrow \partial _+S{\bar{U}}^\mathrm {int}\). Multiplication by \(\mu W^*\) is clearly bounded with operator norm \(\lesssim _\mu \Vert W \Vert _{C^k(S\bar{U})}\). Finally, \(\pi _*\) is the push forward along the base projection, which is well known (and easily checked in coordinates) to be \(H^k\)-continuous. \(\square \)

Lemma 4.4

\(\Vert {\tilde{I}}_W f \Vert _{H^k({{\tilde{{\mathcal {M}}}}})} \lesssim _k \Vert I_W f \Vert _{H^k({\mathcal {M}}_O)}\) (\(k\ge 0\)) for all \(f\in L^2(M)\).

Proof of Lemma 4.4

Define \(\beta : {\mathcal {M}}_O\rightarrow {{\tilde{{\mathcal {M}}}}}\) by \(\beta (z,v) = \varphi _{-\tau _U(z,-v)}(z,v)\), then \(\beta ^* ({\tilde{I}}_W f) = I_W f\) on \({\mathcal {M}}_O\). Now the pull-back \(\beta ^*:H^k(\beta ({\mathcal {M}}_O)) \rightarrow H^k({\mathcal {M}}_O)\) (\(k\ge 0\)) is an isomorphism, as \(\beta \) extends across the closure of \({\mathcal {M}}_O\) to a diffeomorphism onto its image. Thus, \(\Vert {\tilde{I}}_W f \Vert _{H^k({{\tilde{{\mathcal {M}}}}})} \lesssim \Vert {\tilde{I}}_W f \Vert _{H^k(\beta ({\mathcal {M}}_O))} \lesssim \Vert I_W f \Vert _{H^k({\mathcal {M}}_O)}\), where the first inequality follows from the fact that f is supported in M, and thus, \({{\,\mathrm{supp}\,}}{\tilde{I}}_W f \subset \overline{\beta ({\mathcal {M}}_O)}\). \(\square \)

We can now finish the proof of Theorem 4.1. Using (4.26) together with the previous two lemmas yields \( \Vert f \Vert _{L^2(K)} \lesssim _{K,A} \Vert I_Wf \Vert _{H^1({\mathcal {M}}_O)}\) for all \(f\in L^2(M) \) and, taking K to be the geodesic ball \(B=B(p,c/2)\) and making the W-dependency explicit again,

$$\begin{aligned} \Vert f \Vert _{L^2(B)} \le C'(W) \Vert I_W f \Vert _{H^1({\mathcal {M}}_O)},\quad f\in L^2(M), \end{aligned}$$
(4.29)

where \(C'(W) \le \omega '(\Vert W \Vert _{C^k(SN)} \vee \Vert W^{-1} \Vert _{L^\infty (SN)})\) for \(\omega ':[0,\infty )\rightarrow [0,\infty )\) non-decreasing. One can further replace the norms on SN by their counterparts on SM, as \(\Vert I_W f \Vert _{H^1({\mathcal {M}}_O)}\) only depends on \(W\vert _{SM}\). Thus, (i) and (ii) are proved.
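
Schematically, the three ingredients combine as follows, taking \(k=1\) in Lemmas 4.3 and 4.4 and suppressing constants that do not depend on W:

$$\begin{aligned} \Vert f \Vert _{L^2(K)} \lesssim _{K,A} \Vert L^\mu _W {\tilde{I}}_W f\Vert _{H^1(U)} \lesssim \Vert W \Vert _{C^1(S{\bar{U}})} \Vert {\tilde{I}}_W f \Vert _{H^1({{\tilde{{\mathcal {M}}}}})} \lesssim \Vert W \Vert _{C^1(S{\bar{U}})} \Vert I_W f \Vert _{H^1({\mathcal {M}}_O)}. \end{aligned}$$

Here the first inequality is (4.26), the second is Lemma 4.3 and the third is Lemma 4.4. As \(k\ge 1\) in (4.20), the factor \(\Vert W \Vert _{C^1(S{\bar{U}})}\) is bounded by the quantity A and can be absorbed into the constant.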

Finally, part (iii) is clear from the above: When p is slightly perturbed to some \(p'\in \partial M\), the ball \(B(p',c/2)\) remains within O and K may be chosen accordingly. Small perturbations of M correspond to an affine change of variables in \({\tilde{x}}\) and are, thus, inconsequential. This concludes the proof. \(\square \)

5 Proof of the Stability Estimate

Let \((M,g)\) be compact, non-trapping and with strictly convex boundary \(\partial M\). We complete the proof of Theorem 1.3.

5.1 Layer-stripping Argument

We first derive a (global) stability estimate for the linearised problem.

Theorem 5.1

Let \(d\ge 3\) and suppose that \(K\subset O\subset M\), such that K is compact and O is open and satisfies the foliation condition. Then for \(f\in C^\infty (M,{\mathbb {C}}^m)\) and \(W\in C^\infty (SM,{{\,\mathrm{Gl}\,}}(m,{\mathbb {C}}))\), we have

$$\begin{aligned} \Vert f \Vert _{L^2(K)}\le C(W)\cdot \Vert f \Vert _{C^2(M)}^{1-\mu (W)} \cdot \Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}^{\mu (W)}, \end{aligned}$$
(5.1)

where \(C>0\) and \(\mu \in (0,1)\) obey an estimate

$$\begin{aligned} C(W)\vee \mu (W)^{-1} \le \omega (\Vert W\Vert _{C^k(SM)}\vee \Vert W ^{-1} \Vert _{L^\infty (SM)}) \end{aligned}$$
(5.2)

for some non-decreasing \(\omega :[0,\infty )\rightarrow [0,\infty )\) and an integer \(k\ge 2\).

Let us outline the strategy of proof for Theorem 5.1. Using the strictly convex exhaustion function on O, we can stratify K into finitely many layers, where the number of layers depends on the weight W. As each layer has a strictly convex boundary, one can use the local stability result in Theorem 4.1 and propagate the stability estimate into the interior of O layer by layer via an induction argument. More concretely, Theorem 4.1 allows one to bound the norm of f within a certain layer in terms of the weighted X-ray transform, defined with respect to geodesics confined to that layer. As we are actually interested in the transform along complete geodesics in M, an error occurs. By virtue of our forward estimates, this error can be bounded in terms of the magnitude of f in the previous layers, which is controlled by the induction hypothesis.

Remark 3

The Hölder exponent \(\mu \) in the theorem is of order \(2^{-N}\), where N is the number of layers needed to stratify K. This in turn is of order \(N=O(h^{-1})\), where h is the ‘depth’ from Theorem 4.1. The integer k that appears in the theorem is essentially the same as in the local stability estimates (Theorem 4.1), in particular a hypothetical universal bound \(k\le c_d\) in Theorem 4.1 would remain true in Theorem 5.1.
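
To illustrate how these orders of magnitude fit with (5.2), consider the following heuristic, in which \(c_K>0\) is a hypothetical constant (not from the text) depending only on K and the exhaustion function, such that \(N\le c_K h^{-1}\) layers suffice. Combining \(\mu \approx 2^{-N}\) with the bound on \(h(W)^{-1}\) from (4.2) would give

$$\begin{aligned} \mu (W)^{-1} \approx 2^{N} \le 2^{c_K h(W)^{-1}} \le 2^{c_K\, \omega (\Vert W\Vert _{C^k(SM)} \vee \Vert W^{-1} \Vert _{L^\infty (SM)})}, \end{aligned}$$

where \(\omega \) is the function from (4.2). The right-hand side is again a non-decreasing function of \(\Vert W\Vert _{C^k(SM)} \vee \Vert W^{-1} \Vert _{L^\infty (SM)}\), in line with the form of (5.2).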

Remark 4

For a fixed weight W, the result can be improved to allow control on the Hölder exponent \(\mu \) at the cost of needing bounds on higher derivatives of f. Precisely, for any \(\mu \in (0,1)\), we have \(\Vert f \Vert _{L^2(K)} \le \omega (\Vert f \Vert _{C^\ell (M)}) \Vert I_W f \Vert _{L^2({\mathcal {M}}_O)}^\mu \) for \(\omega :(0,\infty )\rightarrow (0,\infty )\) non-decreasing (and dependent on the fixed weight W) and \(\ell \gg 1\) sufficiently large. To see this, one needs to amend Lemma 5.2 below by using different interpolation spaces.

We first discuss some notation and auxiliary results that are used in the proof of Theorem 5.1. In the following, we fix a strictly convex function \(\rho :O\rightarrow {\mathbb {R}}\) with compact super-level sets \(O_{\ge c} = \{x\in O: \rho (x) \ge c\}\) for \(c>\inf _O \rho \). Then

$$\begin{aligned} M_c =\{x\in M^\mathrm {int}: \rho (x) \le c \} \end{aligned}$$
(5.3)

is a (possibly non-compact) manifold with strictly convex boundary and geodesics in \(M_c\) with endpoints on the level set \(\{\rho = c\}\) can be parametrised by the set

$$\begin{aligned} {\mathcal {M}}_c = \{(x,v)\in SM^ \mathrm {int}: \rho (x)=c, \mathrm {d}\rho (v) \le 0, \gamma _{x,v}(\tau (x,v)) \in O\}. \end{aligned}$$
(5.4)

We denote by \(I^c_W f :{\mathcal {M}}_c\rightarrow {\mathbb {C}}^m\) the weighted X-ray transform on \(M_c\), defined via integrals along the portion of geodesics within \(M_c\). The following lemma compares this with the full X-ray transform on M and provides the key estimate that drives the layer-stripping argument.

Lemma 5.2

(Error-bound) Let \(f\in C^\infty (M, {\mathbb {C}}^m)\) and \(W\in C^\infty (SM,{\mathbb {C}}^{m\times m})\), then for all \(0<\mu \le 1\), we have

$$\begin{aligned} \Vert I_W^cf \Vert _{H^1({\mathcal {M}}_c)}^2 \lesssim _\mu C(W) \left[ 1+\left( \frac{\Vert f \Vert _{L^2(O_{\ge c})}}{\Vert I_W f\Vert _{L^2({\mathcal {M}}_O)}}\right) ^\mu \right] \cdot \Vert f \Vert _{H^2(M)}^{2-\mu } \Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}^\mu ,\nonumber \\ \end{aligned}$$
(5.5)

where \(C(W)>0\) is bounded when \(\Vert W \Vert _{C^2(SM)}\) is bounded.

Proof

Each geodesic in \(M_c\) with endpoints on the level set \(\rho = c\) can be extended to a complete O-local geodesic in M and we denote the corresponding map between initial conditions by

$$\begin{aligned} \beta _c : {\mathcal {M}}_c\rightarrow {\mathcal {M}}_O\subset \partial _+SM ,\quad (x,v) \mapsto \varphi _{-\tau (x,-v)}(x,v). \end{aligned}$$
(5.6)

The weighted X-ray transform on \(M_c\) can then be written as follows:

$$\begin{aligned} I^c_Wf(x,v) = I_W(1_{M_{c}} f)(\beta _c(x,v)), \end{aligned}$$
(5.7)

where \(1_{M_c}\) is the indicator function of \(M_c\). As \(\beta _c\) extends smoothly to the closure of \({\mathcal {M}}_c\), pull-back by \(\beta _c\) defines a bounded map \(H^s(\beta _c({\mathcal {M}}_c))\rightarrow H^s({\mathcal {M}}_c)\) and for all \(s\in {\mathbb {R}}\), we have

$$\begin{aligned} \begin{aligned} \Vert I^c_W f\Vert _{H^s({\mathcal {M}}_c)}&\lesssim _{c,s} \Vert I_W(1_{M_c}f) \Vert _{H^s({\mathcal {M}}_O)}\\&\le \Vert I_W f\Vert _{H^s({\mathcal {M}}_O)} + \Vert I_W(1_{O_{\ge c}} f)\Vert _{H^s({\mathcal {M}}_O)}. \end{aligned} \end{aligned}$$
(5.8)

The last term accounts for the error that is made by integrating along complete geodesics in M rather than the portion within \(M_c\). We can bound this error by a forward estimate (Cor. 2.4), as long as the truncated function \(1_{O_{\ge c}}f\) is of regularity \(H^s\). This restricts the choice of s to \(\vert s \vert <1/2\), for which we obtain

$$\begin{aligned} \Vert I^c_W f\Vert _{H^s({\mathcal {M}}_c)} \lesssim _{c,s} \Vert I_W f\Vert _{H^s({\mathcal {M}}_O)} + \Vert W \Vert _{C^1(SM)} \Vert f\Vert _{H^s(O_{\ge c})}. \end{aligned}$$
(5.9)

In order to estimate the \(H^1\)-norm of \(I^c_Wf\), we employ the interpolation inequality \(\Vert \cdot \Vert _{H^1}^2\le \Vert \cdot \Vert _{L^2}\Vert \cdot \Vert _{H^2}\) on \({\mathcal {M}}_c\) and estimate the \(H^2\)-term via the forward estimate (see Footnote 5)

$$\begin{aligned} \Vert I^c_W f\Vert _{H^2({\mathcal {M}}_c)} \lesssim \Vert W\Vert _{C^2(SM)} \Vert f\Vert _{H^2(O)}. \end{aligned}$$
(5.10)
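For orientation, the interpolation inequality used here can be checked by hand in a model situation. The following sketch assumes, purely for illustration, that the Sobolev norms are expressed through coefficients \(u_k\) with respect to an orthonormal basis and weights \(\lambda _k\ge 1\), that is, \(\Vert u\Vert _{H^s}^2=\sum _k \lambda _k^{2s}\vert u_k\vert ^2\); this normalisation is an assumption and is not taken from the text. The Cauchy-Schwarz inequality then gives

$$\begin{aligned} \Vert u\Vert _{H^1}^2 = \sum _k \vert u_k\vert \cdot \lambda _k^{2}\vert u_k\vert \le \Big (\sum _k \vert u_k\vert ^2\Big )^{1/2}\Big (\sum _k \lambda _k^{4}\vert u_k\vert ^2\Big )^{1/2} = \Vert u\Vert _{L^2}\Vert u\Vert _{H^2}. \end{aligned}$$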

Combining the estimates in the preceding displays (for \(s=0\)) and bounding the first factor in \(\Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}=\Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}^{1-\mu }\Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}^\mu \) via another forward estimate, we get

$$\begin{aligned} \begin{aligned} \Vert I_W^cf \Vert _{H^1({\mathcal {M}}_c)}^2 \lesssim _{c,s}&~\left( \Vert f \Vert ^{1-\mu }_{L^2(M)}\Vert W \Vert _{L^\infty (SM)}^{1-\mu } \cdot \Vert I_W f \Vert ^\mu _{L^2({\mathcal {M}}_O)} + \Vert W \Vert _{C^1(SM)} \Vert f \Vert _{L^2(O_{\ge c})}\right) \\&\times \Vert W \Vert _{C^2(SM)} \cdot \Vert f \Vert _{H^2(M)} \\ \le&~ (1 + \Vert W \Vert _{C^2(SM)} )^2 \cdot \left( 1 + \Vert f \Vert _{L^2(O_{\ge c})}^\mu / \Vert I_W f \Vert _{L^2({\mathcal {M}}_O)}^\mu \right) \\&\times \Vert f \Vert _{H^2(M)}^{2-\mu } \Vert I_W f\Vert _{L^2({\mathcal {M}}_O)}^\mu , \end{aligned} \end{aligned}$$
(5.11)

as desired.\(\square \)
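The second inequality in (5.11) is elementary bookkeeping, which may be sketched as follows. The shorthand \(a=\Vert f\Vert _{H^2(M)}\), \(b=\Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}>0\), \(e=\Vert f\Vert _{L^2(O_{\ge c})}\) and \(w=1+\Vert W\Vert _{C^2(SM)}\) is introduced only for this computation. Using \(\Vert W\Vert _{L^\infty (SM)}\le \Vert W\Vert _{C^1(SM)}\le \Vert W\Vert _{C^2(SM)}\) and \(e\le \Vert f\Vert _{L^2(M)}\le a\), the two summands are bounded by

$$\begin{aligned} \Vert f\Vert _{L^2(M)}^{1-\mu }\Vert W\Vert _{L^\infty (SM)}^{1-\mu }\, b^\mu \cdot \Vert W\Vert _{C^2(SM)}\, a&\le w^{2-\mu } a^{2-\mu } b^\mu \le w^2 a^{2-\mu } b^\mu ,\\ \Vert W\Vert _{C^1(SM)}\, e \cdot \Vert W\Vert _{C^2(SM)}\, a&\le w^2 e^{\mu } e^{1-\mu } a \le w^2 \left( \frac{e}{b}\right) ^{\mu } a^{2-\mu } b^\mu , \end{aligned}$$

and adding both lines gives the right-hand side of (5.11).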

The next result is a technical lemma that provides a convenient stratification of K into layers. The parameter \(h>0\) below will later be the ‘initial penetration depth’ from Theorem 4.1.

Lemma 5.3

Suppose \(K\subset O\) and \(\vert \nabla \rho \vert \ge 1\) on K.

  1. (i)

    For every \(h>0\), there exists a radius \(0 < r(h) \le h\) (non-decreasing in h) such that for \(p\in K\cap \partial M_c\) with \({{\,\mathrm{dist}\,}}(p,\partial M)>h/2\) we have

    $$\begin{aligned} B(p,r(h))\cap M_c \subset \bigcup _{(x,v)\in \beta ({\mathcal {M}}_c)} \gamma _{x,v}([0,\tau (x,v)]). \end{aligned}$$
    (5.12)
  2. (ii)

    For all \(h>0\), there are finitely many numbers

    $$\begin{aligned} \quad \sup _K\rho =c_0> c_1> \dots> c_N > c_{N+1}=\inf _K\rho \quad (N =O(h^{-1}) ) \end{aligned}$$

    as well as points \(p_{ij}\in K\) (\(i=0,\dots ,N, j=1,\dots ,J_i\)) with the following properties: We have \(p_{0j}\in \partial M\), \(p_{ij}\in \{\rho =c_i\}\) (\(i=1,\dots , N\)) and

    $$\begin{aligned} \{x\in K : c_{i} \ge \rho (x) \ge c_{i+1}\} \subset \bigcup _{j=1}^{J_0}B(p_{0j},h)\cup \bigcup _{j=1}^{J_i} B(p_{ij},r) \end{aligned}$$
    (5.13)

    for \(i=0,\dots , N\) (where the second union is redundant for \(i=0\)).

Proof

Let us denote the set on the right-hand side of (5.12) by \(V_c\). It is straightforward to see that \(\partial M_c\subset V_c\) and that \(V_c\) is open. In particular, \(V_c\) contains an open ball around each point of \(\partial M_c\). As the set of points on \(\partial M_c\) with \({{\,\mathrm{dist}\,}}(\cdot , \partial M)\ge h/2\) is compact, the radius of the balls can be chosen uniformly (depending on h), which is equivalent to the statement of Lemma 5.3(i).

Fig. 1 The layers from Lemma 5.3

For part (ii), we let \(N(h)=2\lceil {(\sup _K \rho - \inf _K \rho )/r(h)}\rceil \) and put \(c_i=c_{i-1}-r/2\) for \(i=1,\dots ,N\), where \(c_0 = \sup _K \rho \). The boundary points \(p_{01}, \dots , p_{0J_0}\) are then chosen such that the h-balls around them cover the compact set \(\partial M\cap K\). Now let \(x\in K\) be such that \(c_i \ge \rho (x) \ge c_{i+1}\) for some \(i=0,\dots , N\). If \(i=0\) or \({{\,\mathrm{dist}\,}}(x, \partial M) < h/2\), then \(x \in B(p_{0j},h)\) for some \(j=1,\dots , J_0\). If \(i\ge 1\) and \({{\,\mathrm{dist}\,}}(x, \partial M) \ge h/2\), we claim that \(d(x,p)<r\) for some \(p\in \partial {M_{c_i}}\). Due to the compactness of \(\partial M_{c_i} \cap \{{{\,\mathrm{dist}\,}}(\cdot ,\partial M) \ge h/2\}\), finitely many such points \(p_{i1},\dots , p_{iJ_i}\in \partial M_{c_i}\) suffice to establish (5.13), so it remains to verify the claim.

Indeed, if we let \(t\mapsto c_x(t)\) be the unit-speed curve with \(c_x(0)=x\) and \(\mathrm {d}\rho (\dot{c}_x(t)) = \vert \nabla \rho (c_x(t)) \vert \), then \(\rho \) increases along \(c_x\) and by [20, Lem. 2.5] the curve stays in O until it hits the boundary of M. Let \(\ell \ge 0\) be the first time for which \(p=c_x(\ell ) \in \partial M \cup \partial M_{c_i}\). Then

$$\begin{aligned} d(x,p) \le \ell \le \int _0^\ell \mathrm {d}\rho (\dot{c}_x(t) ) \mathrm {d}t = \rho (p) - \rho (x) \le c_i - c_{i+1} \le r/2 \end{aligned}$$
(5.14)

and we must have \(p \in \partial M_{c_i}\) and \(x\in B(p,r)\), as desired. \(\square \)
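The choice of levels in part (ii) can be illustrated numerically. The following is a minimal sketch and not the construction of the proof verbatim: the function names are hypothetical, and as a convention adopted only for this illustration, the levels are spaced by \(r/2\) starting from \(\sup _K\rho \) and the last level is set to \(\inf _K\rho \), so that the final gap may be shorter than \(r/2\). The number of interior levels produced this way never exceeds the bound \(N(h)=2\lceil (\sup _K\rho -\inf _K\rho )/r(h)\rceil \) from the proof.

```python
import math


def layer_levels(rho_sup, rho_inf, r):
    """Return levels c_0 > c_1 > ... > c_{N+1} with c_0 = rho_sup, c_{N+1} = rho_inf.

    Interior levels are spaced by r/2, as in the proof of Lemma 5.3(ii).
    Assumption of this sketch: the last gap may be shorter than r/2.
    """
    if not (rho_sup > rho_inf) or r <= 0:
        raise ValueError("need rho_sup > rho_inf and r > 0")
    step = r / 2.0
    levels = [rho_sup]
    i = 1
    # Add interior levels as long as they stay strictly above inf_K rho.
    while rho_sup - i * step > rho_inf:
        levels.append(rho_sup - i * step)
        i += 1
    levels.append(rho_inf)
    return levels


def layer_count_bound(rho_sup, rho_inf, r):
    """The bound N(h) = 2 * ceil((sup rho - inf rho) / r(h)) from the text."""
    return 2 * math.ceil((rho_sup - rho_inf) / r)


# Example with sup rho = 1, inf rho = 0 and r = 1/2 (values chosen arbitrarily).
levels = layer_levels(1.0, 0.0, 0.5)
```

Since consecutive levels differ by at most \(r/2\), the estimate (5.14) applies to each layer.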

The next lemma is of importance for the full-data problem (\(O=M\)) and allows one to perturb convex foliations in a way that shifts the point of degeneracy.

Lemma 5.4

Suppose \(\rho :M\rightarrow {\mathbb {R}}\) is smooth and strictly convex. Then there exists another \({{\tilde{\rho }}}:M\rightarrow {\mathbb {R}}\), smooth and strictly convex, such that \(\rho \) and \({{\tilde{\rho }}}\) achieve their global minima at different points.

Proof

Suppose \(\rho \) achieves its minimum at the point \(x^*\in M\) and let V be a smooth vector field on M which is tangent to \(\partial M\) and non-vanishing at \(x^*\). Denote the flow of V by \((\psi _t:t\ge 0)\), then \(\psi _t^*\rho \in C^\infty (M,{\mathbb {R}})\) is strictly convex for t sufficiently small and achieves its (unique) minimum at \(x^*_t=\psi _{-t}(x^*)\). Since \(V(x^*)\ne 0\), we have \(x_t^* \ne x^*\) for \(t>0\) sufficiently small, and thus, \({{\tilde{\rho }}} = \psi _t^*\rho \) and \(\rho \) achieve their minimum at different points. \(\square \)

Proof of Theorem 5.1

Let \(\rho :O\rightarrow {\mathbb {R}}\) be a strictly convex exhausting function and denote \(\rho _*=\inf _O\rho \). We first reduce to the situation that

$$\begin{aligned} \vert \nabla \rho (x) \vert \ge 1 \quad \text { and } \quad \rho (x)>\rho _*\quad \text {for } x\in K. \end{aligned}$$
(5.15)

Indeed, after scaling \(\rho \) if necessary, (5.15) can only fail when \(O=M\) [20, Lemma 2.5], and in this case, we argue as follows: Take \({{\tilde{\rho }}}\) as in Lemma 5.4, then we may choose \(\epsilon >0\) such that \(M=\{\rho \ge \rho _*+\epsilon \}\cup \{{{\tilde{\rho }}} \ge \inf _M\tilde{\rho }+\epsilon \}=K\cup {\tilde{K}}\). Then K and \({\tilde{K}}\) satisfy (5.15) for \(\rho \) and \({{\tilde{\rho }}}\), respectively, and the corresponding stability estimates (5.1) can be combined to bound \( \Vert f\Vert _{L^2(M)}\).

In the remaining proof, we argue with fixed \(f\in C^\infty (M,{\mathbb {C}}^m)\) and \(W\in C^\infty (SM,{{\,\mathrm{Gl}\,}}(m,{\mathbb {C}}))\) and keep track of the dependency of our constructions on

$$\begin{aligned} A = \Vert W\Vert _{C^k(SM)} \vee \Vert W^{-1} \Vert _{L^\infty (SM)} \end{aligned}$$
(5.16)

for an integer \(k\ge 2\) to be specified. Let us first summarise the consequences of Theorem 4.1: Each \(p\in K\) is a strictly convex boundary point of either M itself or of the manifold \(M_c\), defined in (5.3). We can, thus, apply Theorem 4.1, either with respect to the local X-ray transform \(I_W\) on M or the one on \(M_c\), which we denote by \(I^c_W\). Thus, for all \(f\in L^2(M,{\mathbb {C}}^m)\) we have

$$\begin{aligned}&\Vert f \Vert _{L^2(B(p,h))} \le C \Vert I_Wf\Vert _{H^1({\mathcal {M}}_O)},\quad p\in K\cap \partial M, \end{aligned}$$
(5.17)
$$\begin{aligned}&\Vert f \Vert _{L^2(B(p,r)\cap M_{c})} \le C \Vert I^c_W f \Vert _{H^1({\mathcal {M}}_c)},\quad p\in K\cap \partial M_c\backslash B(\partial M,h/2), \end{aligned}$$
(5.18)

where \(C,h>0\) depend on W and \(r=r(h)\) is as in Lemma 5.3(i). By part (iii) of the theorem, the choice of regularity k that appears in (4.2) can be made uniform over the compactum K, and will be fixed from now on (assuming \(k\ge 2\) without loss of generality). We then have \( C\vee h^{-1} \lesssim _A 1. \)

We proceed by stratifying K into layers \(\{x\in O:c_i\ge \rho (x) > c_{i+1}\}\) (\(i=0,\dots , N\)) for \(c_0,\dots ,c_{N+1}\) as in Lemma 5.3(ii) with \(N\lesssim h^{-1} \lesssim _A 1\). We will prove inductively that

$$\begin{aligned} \Vert f \Vert _{L^2(O_{\ge c_i})} \lesssim _A \Vert f \Vert _{C^2(M)}^{1-2^{-i}}\Vert I_W f\Vert _{L^2({\mathcal {M}}_O)}^{2^{-i}},\quad i=1,\dots , N+1, \end{aligned}$$
(5.19)

which implies (5.1). For \(i=1\), this is a straightforward consequence of (5.17). Indeed, for every \(p\in \partial M\), we can use the interpolation inequality \(\Vert \cdot \Vert _{H^1}^2\le \Vert \cdot \Vert _{L^2}\Vert \cdot \Vert _{H^2}\) on \({\mathcal {M}}_O\) and a forward estimate (Thm. 2.4) to obtain

$$\begin{aligned} \Vert f \Vert _{L^2(B(p_{0j},h))} \lesssim _A \Vert f \Vert _{C^2(M)}^{1/2} \Vert I_Wf\Vert ^{1/2}_{L^2({\mathcal {M}}_O)},\quad j=1,\dots , J_0, \end{aligned}$$
(5.20)

where the points \(p_{01},\dots , p_{0J_0}\) are as in Lemma 5.3(ii). As the corresponding h-balls cover \(O_{\ge c_1}\), this implies (5.19) for \(i=1\).

Next assume the estimate has been established for some \(1\le i \le N\) and consider the points \(p_{i1},\dots ,p_{iJ_i}\) from the lemma. Then (5.18) and Lemma 5.2, combined with the induction hypothesis, which allows one to bound the bracketed term in (5.5), yield

$$\begin{aligned} \begin{aligned} \Vert f\Vert _{L^2(B(p_{ij},r)\cap M_{c_i})} \lesssim _A \Vert f \Vert _{C^2(M)}^{1-2^{-(i+1)}} \Vert I_W f\Vert _{L^2( {\mathcal {M}}_O)}^{2^{-(i+1)}}. \end{aligned} \end{aligned}$$
(5.21)
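The exponent bookkeeping behind (5.21) can be made explicit. The following is a sketch under our reading of the argument: (5.5) is applied with \(\mu =1\) and \(c=c_i\), and we use \(\Vert f\Vert _{H^2(M)}\lesssim \Vert f\Vert _{C^2(M)}\) as well as the forward estimate \(\Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}\lesssim _A \Vert f\Vert _{C^2(M)}\). The shorthand \(F=\Vert f\Vert _{C^2(M)}\), \(b=\Vert I_Wf\Vert _{L^2({\mathcal {M}}_O)}\) and \(e=\Vert f\Vert _{L^2(O_{\ge c_i})}\) is introduced only for this computation. With the induction hypothesis \(e\lesssim _A F^{1-2^{-i}}b^{2^{-i}}\), we get

$$\begin{aligned} \Vert I^{c_i}_W f\Vert ^2_{H^1({\mathcal {M}}_{c_i})} \lesssim _A \left( 1+\frac{e}{b}\right) F b = F\, b^{1-2^{-i}} b^{2^{-i}} + F e \lesssim _A F^{2-2^{-i}}\, b^{2^{-i}}, \end{aligned}$$

and taking square roots in combination with (5.18) gives the exponents \(1-2^{-(i+1)}\) and \(2^{-(i+1)}\) in (5.21).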

A similar bound can be achieved on the balls \(B(p_{0j},h)\) (decreasing the Hölder exponent as in the proof of Lemma 5.2) and together with the induction hypothesis we conclude (5.19) for \(i+1\). This finishes the proof. \(\square \)

5.2 Proof of Theorem 1.3

We conclude the main stability theorem by combining the linear estimates from the previous section with the pseudo-linearisation formula and the bounds on integrating factors from Theorem 2.1.

Proof of Theorem 1.3

Let \(\Phi ,\Psi \in C^\infty (M,{\mathbb {C}}^{m\times m}) \) and recall from Lemma 1.5 that \(C_\Phi - C_\Psi =R_\Phi I_{{\mathcal {W}}_{\Phi ,\Psi } }(\Phi - \Psi ) \alpha ^* R_\Psi ^{-1}\), where \({\mathcal {W}}_{\Phi ,\Psi }A = R_{\Phi }^{-1} A R_\Psi \) and we may choose smooth integrating factors \(R_\Phi \) and \(R_\Psi \) as in Theorem 2.1.

Now for \(K\subset O \subset M\) as in the theorem, we can apply Theorem 5.1 to obtain

$$\begin{aligned} \Vert \Phi - \Psi \Vert _{L^2(K)} \le C({\mathcal {W}}_{\Phi ,\Psi }) \cdot \Vert \Phi -\Psi \Vert _{C^2(M)}^{1-\mu ({\mathcal {W}}_{\Phi ,\Psi })}\cdot \Vert C_\Phi - C_\Psi \Vert _{L^2({\mathcal {M}}_O)}^{\mu ({\mathcal {W}}_{\Phi ,\Psi })} \end{aligned}$$
(5.22)

with \(C({\mathcal {W}}_{\Phi ,\Psi }) \vee \mu ({\mathcal {W}}_{\Phi ,\Psi })^{-1}\) bounded above by

$$\begin{aligned} \omega (\Vert {\mathcal {W}}_{\Phi ,\Psi } \Vert _{C^k(SM)} \vee \Vert {\mathcal {W}}_{\Phi ,\Psi }^{-1} \Vert _{L^\infty (SM)}) \end{aligned}$$
(5.23)

for a non-decreasing function \(\omega :[0,\infty )\rightarrow [0,\infty )\). It remains to bound the norms in the previous display in terms of \(\Vert \Phi \Vert _{C^k(M)} \vee \Vert \Psi \Vert _{C^k(M)}\). Note that \(\Vert {\mathcal {W}}^{\pm 1}_{\Phi ,\Psi } \Vert _{C^k(SM)} \lesssim \Vert R_\Phi ^{\mp 1} \Vert _{C^k(M)} \cdot \Vert R_\Psi ^{\pm 1}\Vert _{C^k(M)}\); hence, the proof is finished by using the bounds from Theorem 2.1. \(\square \)

6 Statistical Application

In this section, we demonstrate the scope of our stability estimate (Theorem 1.3) by showing how it can be used to establish a statistical consistency result. We will focus on the full-data problem (\(O=M\)) and discuss the two-dimensional results from [14] alongside the case \(d\ge 3\). Let us, therefore, assume that \((M,g)\) is a compact Riemannian manifold with strictly convex boundary and that we are in either of the following cases:

  1. (A)

    \(d=2\) and M is simple

  2. (B)

    \(d\ge 3\) and M admits a strictly convex function

In both cases, we assume for simplicity (see Footnote 6) that (as a smooth manifold) M is the closed unit ball in \({\mathbb {R}}^d\). We further assume that the potentials \(\Phi \) take values in either \(\mathfrak {so}(m)=\{A\in {\mathbb {R}}^{m\times m}: A^\mathrm{{T}} = - A\}\) or \(\mathfrak {gl}_m({\mathbb {R}})={\mathbb {R}}^{m\times m}\) and write \({\mathfrak {g}}\) to denote either choice.

The plan for this section is as follows: We first record all necessary estimates in one place, then give a brief overview of the Bayesian approach to inverse problems and recall the main statistical theorem from [14], including a sketch of its proof. Finally, in the last subsection, we explain how the proof can be amended to obtain a consistency result in case (B).

In order to keep the overlap with [14] at a minimum, the discussion below is brief and heavily relies on [14]. For more background on the statistical framework, we refer to the books [7] and [8].

6.1 Available Estimates

In both cases, the available forward and stability estimates take the following form:

$$\begin{aligned} \Vert C_\Phi - C_\Psi \Vert _{L^2(\partial _+SM)}\le & {} c_1(\Phi ,\Psi ) \cdot \Vert \Phi - \Psi \Vert _{L^2(M)}, \end{aligned}$$
(6.1)
$$\begin{aligned} \Vert C_\Phi \Vert _{L^\infty (\partial _+SM)}\le & {} c_2(\Phi ), \end{aligned}$$
(6.2)
$$\begin{aligned} \Vert \Phi - \Psi \Vert _{L^2(M)}\le & {} C(\Phi ,\Psi ) \cdot \Vert C_\Phi - C_\Psi \Vert _{L^2(\partial _+SM)}^{\mu (\Phi ,\Psi )}. \end{aligned}$$
(6.3)

Here, \(c_1(\Phi ,\Psi ),c_2(\Phi ),C(\Phi ,\Psi )>0\) and \(\mu (\Phi ,\Psi )\in (0,1)\) may depend on the potentials. The validity of the estimates and the uniformity properties of the constants can be summarised as follows:

  • The forward estimates (6.1) and (6.2) are the same in case (A) and (B) and hold true for smooth potentials \(\Phi ,\Psi :M\rightarrow {\mathfrak {g}}\). If \({\mathfrak {g}}= \mathfrak {so}(m)\), then \(c_1\) and \(c_2\) are constant, due to the compactness of SO(m). If \({\mathfrak {g}}= \mathfrak {gl}_m({\mathbb {R}})\), then \(c_1(\Phi ,\Psi )\) and \(c_2(\Phi )\) are uniform on \(L^\infty \)-balls.

  • In case (A) and for \({\mathfrak {g}}= \mathfrak {so}(m)\), we can choose any integer \(k\ge 2\). Then (6.3) holds true for smooth \(\Phi ,\Psi :M\rightarrow \mathfrak {so}(m)\) with \(\mu (\Phi ,\Psi ) = (k-1)/k\) and \(C(\Phi ,\Psi )\) uniform on \(C^k\)-balls.

  • In case (B) and for \({\mathfrak {g}}= \mathfrak {gl}_m({\mathbb {R}})\), there exists an integer \(k\gg 1\) such that (6.3) holds true for \(\Phi ,\Psi :M\rightarrow \mathfrak {gl}_m({\mathbb {R}})\) with both \(C(\Phi ,\Psi )\) and \(\mu (\Phi ,\Psi )\) uniform on \(C^{k}\)-balls.

Here we say that a quantity is ‘uniform on F-balls’ (for \(F=L^\infty (M,{\mathfrak {g}})\) or \(F=C^k(M,{\mathfrak {g}})\)) if its supremum (resp. infimum) over \(\{\Phi ,\Psi : M\rightarrow {\mathfrak {g}}\text { smooth} : \Vert \Phi \Vert _F \vee \Vert \Psi \Vert _F \le A\}\) is finite (resp. \(>0\)) for all \(A>0\).

The forward estimates are proved in Corollary 2.5 for a general non-trapping manifold (with strictly convex boundary). The stability estimate for case (B) is the content of our main theorem (Thm. 1.3), and the version for case (A) is discussed below the main theorem.

Remark 5

An important difference between case (A) and (B) lies in the role of ‘regularity parameter’ k and Hölder exponent \(\mu \), which – in the statistical analysis below – determine the choice of prior and the rate of contraction, respectively. In case (A), one can effectively choose the Hölder exponent arbitrarily close to 1 (by sending \(k\rightarrow \infty \)), while in case (B), our method of proof yields an unknown k and there is no control over the Hölder exponent. See also Remark 8.

6.2 Statistical Background

The statistical question we are concerned with arises in the following experimental setup: Suppose for \(\Phi \in C(M,{\mathfrak {g}})\) we observe the data \((X_i,V_i,Y_i)_{i=1}^n\), where

$$\begin{aligned} Y_i = C_\Phi (X_i,V_i) + \epsilon _i,\quad i=1,\dots ,n, \end{aligned}$$
(6.4)

with directions \((X_i,V_i)\) (\(i=1,\dots ,n\)) drawn independently and uniformly (see Footnote 7) from \( \partial _+SM\) and independent additive noise given by

$$\begin{aligned} \epsilon _i= (\epsilon _{ij}:1\le j\le \dim {\mathfrak {g}})\in {\mathbb {R}}^{\dim {\mathfrak {g}}}\equiv {\mathfrak {g}}\quad \text { for } \epsilon _{ij}{\sim } N(0,1) \text { i.i.d}. \end{aligned}$$
(6.5)

We write \(P^n_\Phi ={\mathcal {L}}(D_n\vert \Phi )\) for the law of \(D_n=(X_i,V_i,Y_i:1\le i\le n)\), arising from (6.4) with potential \(\Phi \). The statistical experiment just described is then encoded in the collection of probability measures \((P_\Phi ^n:\Phi \in C(M,{\mathfrak {g}}))\) on the sample space \({\mathscr {D}}^n=(\partial _+SM\times {\mathfrak {g}})^n\).

The Bayesian approach to estimate \(\Phi \) from a sample \(D^n=((X_i,V_i,Y_i): 1\le i \le n)\in {\mathscr {D}}^n\) is to choose a prior \(\Pi _n\) on \(C(M,{\mathfrak {g}})\) and compute the posterior probability under the sample \(D^n\) of a (Borel-measurable) set \(B\subset C(M,{\mathfrak {g}})\) according to the formula:

$$\begin{aligned} \Pi _n(\Phi \in B\vert D^n) = \frac{\int _B p^{n}_\Phi (D^n) \Pi _n(\mathrm {d}\Phi ) }{\int p^n_\Phi (D^n) \Pi _n(\mathrm {d}\Phi )}, \end{aligned}$$
(6.6)

where \(p_\Phi ^n(D^n)\) is the likelihood of \(D^n\) being generated from \(\Phi \). Precisely, \(p_\Phi ^n=p_\Phi ^1 \otimes \dots \otimes p_\Phi ^1\) (n-times), where \( \log p^1_\Phi (x,v,y) = -\frac{1}{2}\vert C_\Phi (x,v) - y \vert _F^2 -\frac{\dim {\mathfrak {g}}}{2} \log (2\pi ) \) (for \((x,v,y)\in {\mathscr {D}}^1\)) and \(\vert \cdot \vert _F\) is the Frobenius norm.
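The log-likelihood just described is straightforward to evaluate once the scattering data are available. The following minimal sketch assumes that the values \(C_\Phi (X_i,V_i)\) have already been computed by some means not shown here and that they, like the observations \(Y_i\), are given as matrices (lists of rows); the function names and the parameter `dim_g` (standing for \(\dim {\mathfrak {g}}\)) are hypothetical and chosen for illustration only.

```python
import math


def frobenius_sq(a, b):
    """Squared Frobenius norm |a - b|_F^2 for matrices given as lists of rows."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def log_p1(c, y, dim_g):
    """log p^1_Phi(x, v, y), where c stands for the precomputed C_Phi(x, v)."""
    return -0.5 * frobenius_sq(c, y) - 0.5 * dim_g * math.log(2.0 * math.pi)


def log_pn(cs, ys, dim_g):
    """log p^n_Phi(D^n): the product structure turns into a sum of logs."""
    if len(cs) != len(ys):
        raise ValueError("need one scattering matrix per observation")
    return sum(log_p1(c, y, dim_g) for c, y in zip(cs, ys))


# Toy data (arbitrary values): one exact and one perturbed observation.
identity = [[1.0, 0.0], [0.0, 1.0]]
perturbed = [[1.0, 0.0], [0.0, 2.0]]
total = log_pn([identity, identity], [identity, perturbed], dim_g=1)
```

In a posterior computation according to (6.6), only differences of such log-likelihoods between potentials matter, as the additive constant cancels in the ratio.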

Given the posterior one can estimate \(\Phi \), for example, by the posterior mean which in our setting exists as Bochner integral in \(C(M,{\mathfrak {g}})\). From a frequentist perspective, one then asks how well \(\Phi \) is estimated, when the data are generated from (6.4) with a ‘true’ potential \(\Phi _0\) and a first such quality assessment is given by the posterior consistency results below.

6.3 Posterior Consistency in Case (A)

In order to state the posterior consistency result of [14], we first review the construction of priors (in arbitrary dimension \(d\ge 2\)), focusing on their key example based on Matérn-Whittle processes.

For a given choice of regularity parameter \(\alpha > d/2\), define a base prior \({\underline{\Pi }}={\underline{\Pi }}(\alpha )\) on \(C(M,{\mathbb {R}})\) as law of a centred Gaussian process \((f(x):x\in M)\) with covariance \( {\mathbb {E}}f(x) f(y) = \int _{{\mathbb {R}}^d} \mathrm{{e}}^{i(x-y)\xi } \langle \xi \rangle ^{-2\alpha }\mathrm {d}\xi , \) where it is understood that \(M\subset {\mathbb {R}}^d\). This so-called Matérn-Whittle process of regularity \(\alpha \) is a standard prior choice in non-parametric Bayesian statistics (Example 11.8 in [7]) and satisfies

$$\begin{aligned} \mathrm {RKHS}({\underline{\Pi }}) = H^\alpha (M,{\mathbb {R}}),\quad {\underline{\Pi }}(C^k(M,{\mathbb {R}})) = 1\text { for } k\in {\mathbb {Z}}\cap [0,\alpha -d/2) \end{aligned}$$
(6.7)

where \(\mathrm {RKHS}(\cdot )\) stands for the ‘reproducing kernel Hilbert-space’. The prior \(\Pi _1\) on \(C(M,{\mathfrak {g}})\) is then obtained by drawing each component (in an identification \({\mathfrak {g}}\equiv {\mathbb {R}}^{\dim {\mathfrak {g}}}\)) independently from \({\underline{\Pi }}\). For \(n\ge 2\) the prior \(\Pi _n\) is defined by scaling \(\Pi _1\), precisely

$$\begin{aligned} \Pi _n = {\mathcal {L}}\left( n^{-\frac{d}{4\alpha + 2d}} \Phi \right) ,\quad \text { for } \Phi \sim \Pi _1, \end{aligned}$$
(6.8)

where \({\mathcal {L}}(\cdot )\) denotes the law of a random variable.

Then in case (A) (M is a simple surface), the following result holds true:

Theorem 6.1

(Thm. 3.2 in [14]) Suppose we are in case (A) above, \({\mathfrak {g}}= \mathfrak {so}(m)\) and \(\alpha >3\). Then for every \(\Phi _0 \in C^\infty (M,\mathfrak {so}(m))\), there is a \(\gamma >0\) such that

$$\begin{aligned} \Pi _n(\Phi : \Vert \Phi - \Phi _0 \Vert _{L^2(M)} \ge n^{-\gamma }\vert D^n ) \rightarrow 0\quad \text { as } n\rightarrow \infty \end{aligned}$$
(6.9)

in \(P^n_{\Phi _0}\)-probability. Here \(\Pi _n(\cdot \vert D^n)\) are the posteriors, defined in (6.6), with respect to the scaled Matérn-Whittle priors in (6.8) of regularity \(\alpha \).

Remark 6

(Generalisations) The theorem remains true for a larger class of base priors (specified in [14, Condition 3.1]). Further, the regularity of \(\Phi _0\) can be relaxed and, by varying \(\alpha \), one has control over the rate of contraction \(\gamma \) (Remark 3.3 in [14]).

Remark 7

The scaling rate \(\nu ={d}/(4\alpha + 2d)\) in (6.8) is chosen such that, writing \(t_* = 2t/(2+t)\) for \(t>0\), we have

$$\begin{aligned} (4\nu / (1-4\nu ) )_* = d/\alpha , \end{aligned}$$
(6.10)

which arises as exponent in a classical \(L^2\)-entropy bound for the unit ball \({\mathbb {B}}^\alpha \subset H^\alpha (M,{\mathfrak {g}})= \mathrm {RKHS} (\Pi _1)\) (cf. Lemma 7.5).
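The identity (6.10) can be verified by direct computation; the intermediate quantity \(t\) is notation introduced only for this check. With \(\nu = d/(4\alpha +2d)\) and \(\alpha >d/2\), we have \(4\nu = 2d/(2\alpha +d)\) and \(1-4\nu = (2\alpha -d)/(2\alpha +d)\), so that

$$\begin{aligned} t := \frac{4\nu }{1-4\nu } = \frac{2d}{2\alpha -d},\qquad t_* = \frac{2t}{2+t} = \frac{4d/(2\alpha -d)}{4\alpha /(2\alpha -d)} = \frac{d}{\alpha }. \end{aligned}$$

The same substitution shows \(\nu - 1/2 = -\alpha /(2\alpha +d)\).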

Sketch of proof

Let \(\delta _n = n^{-\alpha /(2\alpha + d)}\) (\(=n^{\nu -1/2}\)). Using (6.10) and a theorem of Li-Linde [12, Thm. 1.2], one computes the small ball probability

$$\begin{aligned} - \log \Pi _n(\Vert \Phi \Vert _{L^2(M)} \le \delta _n ) \lesssim n\delta _n^2. \end{aligned}$$
(6.11)

The event in the last probability can be changed to \(\Vert \Phi - \Phi _0\Vert _{L^2(M)} \le \delta _n\) by a standard argument (Anderson’s Lemma, cf. [8, Cor. 2.6.18]) and expressed in terms of the likelihoods \(p^n_\Phi ,p^n_{\Phi _0}\) by using the forward estimates. A general contraction theorem ([14, Thm. 5.13]) then implies that, for some sufficiently large \(m'>0\), we have

$$\begin{aligned} \Pi _n(\Phi : h(p^n_\Phi ,p^n_{\Phi _0})\le m' \delta _n \vert D^n) \xrightarrow {{P^n_{\Phi _0}}} 1,\quad \text {as } n\rightarrow \infty . \end{aligned}$$
(6.12)

Here \(h(p^n_\Phi ,p^n_{\Phi _0})\) denotes the Hellinger distance, which is \(\approx \Vert C_\Phi - C_{\Phi _0} \Vert _{L^2(\partial _+SM)}\), as the scattering data are SO(m)-valued ([14, Lem. 5.14]).

By (6.7), it follows for \(0\le k < \alpha -d/2\) that the events \({{\,\mathrm{{\mathcal {F}}}\,}}'(A)=\{\Vert \Phi \Vert _{C^k(M)} \le A \}\) (\(A>0\)) have \(\Pi _n\)-mass approaching 1 as \(n\rightarrow \infty \) (Fernique’s theorem, cf. [8, Thm. 2.1.20]), which suggests that one can intersect the event in (6.12) with \({{\,\mathrm{{\mathcal {F}}}\,}}'(A)\) without destroying the limit. To make this precise, one shows, using Borell’s isoperimetric inequality [3], that the slightly smaller events \( {{\,\mathrm{{\mathcal {F}}}\,}}_n(A) = \{\Phi _1 + \Phi _2: \Vert \Phi _1\Vert _{L^2(M)} \le \delta _n, \Vert \Phi _2 \Vert _{H^\alpha (M)}\le A\}\cap {{\,\mathrm{{\mathcal {F}}}\,}}'(A) \) obey

$$\begin{aligned} -\log \Pi _n({{\,\mathrm{{\mathcal {F}}}\,}}_n(A)^c) \ge \omega (A) n\delta _n^2\quad \text { and }\quad \log {\mathcal {N}}({{\,\mathrm{{\mathcal {F}}}\,}}_n(A),h, \delta _n)\lesssim _A n\delta _n^2 \end{aligned}$$
(6.13)

with \(\omega (A)\) unbounded and non-decreasing in A [14, Lemma 5.17] and where \(\log {\mathcal {N}}\) is the metric entropy, defined above Lemma 7.5. Then, for \(A>0\) sufficiently large, [14, Theorem 5.13] indeed implies that

$$\begin{aligned} \Pi _n(\Phi : \Vert C_\Phi - C_{\Phi _0} \Vert _{L^2(\partial _+SM)} \le A \delta _n, \Vert \Phi \Vert _{C^k(M)} \le A\vert D_n)\xrightarrow {{P^n_{\Phi _0}}} 1, \end{aligned}$$
(6.14)

as \(n\rightarrow \infty \) [14, Thm. 5.19]. If \(\alpha > 3\), we may choose \(k \in {\mathbb {Z}}\cap [2,\alpha - d/2)\) and apply the stability estimate (6.3) with Hölder exponent \((k-1)/k\). Thus, on the event in the previous display, we have

$$\begin{aligned} \Vert \Phi - \Phi _0 \Vert _{L^2(M)} \le (A' \delta _n)^{(k-1)/k} \end{aligned}$$
(6.15)

for some \(A'>0\) which incorporates the constant from the stability estimate. Choosing a slightly slower rate \(0<\eta <(k-1)/k\), the constant \(A'\) can be absorbed in the limit \(n\rightarrow \infty \) and thus

$$\begin{aligned} \Pi _n(\Phi :\Vert \Phi - \Phi _0 \Vert _{L^2(M)} \le \delta _n^{\eta }, \Vert \Phi \Vert _{C^k(M)} \le A' \vert D_n) \rightarrow 1 \end{aligned}$$
(6.16)

in \(P_{\Phi _0}^n\)-probability. Dropping the constraint \(\Vert \Phi \Vert _{C^k(M)} \le A'\) yields (6.9) and finishes the proof. \(\square \)
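The exponents appearing in this sketch of proof can be tabulated with a few lines of code. Reading off (6.15) and (6.16), the posterior concentrates on \(\Vert \Phi - \Phi _0\Vert _{L^2(M)}\le \delta _n^\eta = n^{-\eta \alpha /(2\alpha +d)}\), so (6.9) holds with \(\gamma = \eta \alpha /(2\alpha + d)\) for any \(0<\eta <(k-1)/k\); this formula for \(\gamma \) is our reading of the argument and the function names below are hypothetical.

```python
def prior_scaling_exponent(alpha, d):
    """The exponent nu = d / (4 alpha + 2 d) of the prior scaling in (6.8)."""
    return d / (4.0 * alpha + 2.0 * d)


def delta_n(n, alpha, d):
    """The rate delta_n = n^(-alpha / (2 alpha + d)) from the sketch of proof."""
    return n ** (-alpha / (2.0 * alpha + d))


def contraction_exponent(alpha, d, k, eta):
    """gamma = eta * alpha / (2 alpha + d), valid for 0 < eta < (k - 1) / k.

    Assumption of this sketch: gamma is read off from delta_n ** eta in (6.16).
    """
    if not (0.0 < eta < (k - 1.0) / k):
        raise ValueError("need 0 < eta < (k - 1) / k")
    return eta * alpha / (2.0 * alpha + d)


# Example: a simple surface (d = 2) with alpha = 4, k = 2 and eta = 0.4.
nu = prior_scaling_exponent(4.0, 2)
gamma = contraction_exponent(4.0, 2, k=2, eta=0.4)
```

For fixed \(d\), both \(\alpha /(2\alpha +d)\) and the admissible range of \(\eta \) increase with \(\alpha \) and k, in line with Remark 6.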

6.4 Posterior Consistency in Case (B)

The proof above can be adapted to case (B) (M of dimension \(d \ge 3\), supporting a strictly convex function) and \({\mathfrak {g}}= \mathfrak {gl}_m({\mathbb {R}})\) to obtain the following result:

Theorem 6.2

Suppose we are in case (B) above and \({\mathfrak {g}}=\mathfrak {gl}_m({\mathbb {R}})\). Then there exist \(\alpha >0\) and \(\gamma >0\), such that for all \(\Phi _0\in C^\infty (M,{\mathfrak {g}})\) we have

$$\begin{aligned} \Pi _n(\Phi : \Vert \Phi - \Phi _0 \Vert _{L^2(M)} \ge n^{-\gamma } \vert D_n) \rightarrow 0\quad \text { as } n\rightarrow \infty \end{aligned}$$
(6.17)

in \(P^n_{\Phi _0}\)-probability. Here, \(\Pi _n(\cdot \vert D_n)\) is again the posterior defined in (6.6) with respect to the scaled Matérn-Whittle priors in (6.8) of regularity \(\alpha \). \(\square \)

Under the hypotheses of the theorem and essentially with the same arguments as in [14], one can use the theorem above to derive a consistency result for the posterior mean. This is defined as \({{\bar{\Phi }}}_n (D_n) = E_{\Pi _n}[\Phi \vert D_n]\) and exists as Bochner integral in \(C(M,{\mathfrak {g}})\). Using the precise exponential convergence rate in (6.17) (withheld above for simplicity), one then shows that

$$\begin{aligned} P_{\Phi _0}\left( \Vert \bar{\Phi }_n(D_n) - \Phi _0 \Vert _{L^2(M)} > n^{-\gamma }\right) \rightarrow 0,\quad \text { as } n\rightarrow \infty , \end{aligned}$$
(6.18)

which gives precisely Theorem 1.4 as stated in the introduction.

Remark 8

In comparison with Theorem 6.1, the theorem has two shortcomings: First, the rate of contraction, while being polynomial, is unknown. Second, and more importantly, the required regularity of the prior (the choice of \(\alpha \)) is unknown as well, and thus, the theorem does not provide a precise guideline for the choice of prior in applications.

Possibly the latter issue can be alleviated by choosing a prior with \(C^\infty \)-smooth sample paths, such as a squared exponential prior. However, as our ignorance of \(\alpha \) rather seems to be an artefact of the proof of the underlying stability estimate than an intrinsic feature of the inverse problem, it is questionable whether such a prior choice is advisable.

Sketch of proof of Theorem 6.2

Let us first discuss the case \({\mathfrak {g}}= \mathfrak {so}(m)\). Then, as we have identical forward estimates as in case (A) and the general contraction theory is independent of the dimension, the proof of Theorem 6.1 extends verbatim to case (B) up to equation (6.14). That is, for \(A>0\) large enough (and \(0\le k < \alpha -d/2\)) we have, as \(n\rightarrow \infty \),

$$\begin{aligned} \Pi _n(\Phi : \Vert C_\Phi - C_{\Phi _0} \Vert _{L^2(\partial _+SM)} \le A \delta _n, \Vert \Phi \Vert _{C^k(M)} \le A\vert D_n)\xrightarrow {{P^n_{\Phi _0}}} 1. \end{aligned}$$
(6.19)

To proceed, one chooses \(\alpha >0\) so large that \(\alpha - d/2\) exceeds the regularity parameter k from Theorem 1.3. Then the stability estimate (6.3) implies that on the event in (6.19), we have

$$\begin{aligned} \Vert \Phi - \Phi _0 \Vert _{L^2(M)} \le (A'\delta _n)^{\mu }, \end{aligned}$$
(6.20)

where \(A'\) incorporates the constant from the stability estimate and (in the notation of (6.3)) \(\mu = \inf \mu (\Phi ,\Phi _0)>0\), where the infimum is taken over \(\{\Phi : \Vert \Phi \Vert _{C^k(M)} \le A\}\). The proof is then finished as in case (A).

For \({\mathfrak {g}}= \mathfrak {gl}_m({\mathbb {R}})\), (6.19) remains true, but one has to take some care in its derivation, as the scattering data now assume values in the non-compact group \({{\,\mathrm{Gl}\,}}(m,{\mathbb {C}})\) and the forward estimates are only uniform on \(L^\infty \)-balls. We will explain the necessary changes in the following:

As for the small ball probabilities, (6.11) has to be replaced by

$$\begin{aligned} -\log \Pi _n(\Vert \Phi \Vert _{L^2(M)} \le \delta _n, \Vert \Phi \Vert _{L^\infty } \le A ) \lesssim _A n\delta _n^2, \end{aligned}$$
(6.21)

which follows from (6.11) and the Gaussian correlation inequality [11]

$$\begin{aligned} \begin{aligned} \Pi _n(\Vert \Phi \Vert _{L^2(M)} \le \delta _n, \Vert \Phi \Vert _{L^\infty } \le A )&\ge \Pi _n(\Vert \Phi \Vert _{L^2(M)} \le \delta _n) \Pi _n( \Vert \Phi \Vert _{L^\infty (M)} \le A ), \end{aligned} \end{aligned}$$

noting that \(-\log \Pi _n(\Vert \Phi \Vert _{L^\infty (M)} \le A) = o(1)\) as \(n\rightarrow \infty \) due to Fernique’s theorem. Mutatis mutandis, the same arguments as in case (A) imply (6.12).
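For the record, the passage from the correlation inequality to (6.21) can be spelled out as follows. Taking negative logarithms and using (6.11) gives

$$\begin{aligned} -\log \Pi _n(\Vert \Phi \Vert _{L^2(M)} \le \delta _n, \Vert \Phi \Vert _{L^\infty (M)} \le A ) \le -\log \Pi _n(\Vert \Phi \Vert _{L^2(M)} \le \delta _n) -\log \Pi _n( \Vert \Phi \Vert _{L^\infty (M)} \le A ) \lesssim n\delta _n^2 + o(1), \end{aligned}$$

and the \(o(1)\)-term can be absorbed into the first one, as \(n\delta _n^2 = n^{d/(2\alpha +d)}\rightarrow \infty \).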

Next, the comparison between Hellinger- and \(L^2\)-distance in the general case (and with essentially the same proof) takes the form:

$$\begin{aligned} \omega (\Vert \Phi \Vert _{L^\infty (M)}) ^{-1} \Vert C_\Phi - C_{\Phi _0} \Vert _{L^2(\partial _+SM)}\lesssim h(p_{\Phi }^n, p_{\Phi _0}^n) \lesssim \Vert C_\Phi - C_{\Phi _0} \Vert _{L^2(\partial _+SM)} \end{aligned}$$
(6.22)

for a non-decreasing function \(\omega :[0,\infty )\rightarrow [0,\infty )\) coming from (6.2). As we use the lower bound only on the event \({{\,\mathrm{{\mathcal {F}}}\,}}'(A) = \{\Vert \Phi \Vert _{C^k(M)}\le A\}\), this adjustment is unproblematic, as \(\omega \) can be controlled.

Finally, we note that the proof of (6.13) is completely independent of the forward estimates and only uses the upper bound in (6.22). In particular, [14, Thm. 5.13] can again be used to conclude (6.19), as desired. \(\square \)