1 Introduction

Image inpainting refers to the filling in of a region in an image where information is missing or needs to be replaced (due to, for example, an occluding object), in such a way that the result looks plausible to the human eye. The region to be filled is subsequently referred to as the inpainting domain. Since the seminal work of Bertalmio et al. [6], image inpainting has become increasingly important, with applications ranging from removing an undesirable or occluding object from a photograph, to painting out a wire in an action sequence, to 3D conversion of film  [27], as well as 3D TV and novel viewpoint construction [12, 17, 29] in more recent years. See [21] for a recent survey of the field.

Image inpainting methods can loosely be categorized as exemplar-based and geometric. Exemplar-based methods generally operate by copying pieces of the undamaged portion of the image into the inpainting domain, in such a way as to make the result appear seamless. Examples include [3, 16, 43]. Exemplar-based methods also operate behind the scenes in Photoshop’s famous Content-Aware Fill tool. These methods excel at interpolating texture across large gaps, but may produce artifacts in structure-dominated images with strong edges, or be needlessly expensive if the inpainting domain is thin.

Fig. 1

Shell-based inpainting: here we illustrate the shell-based inpainting of an image including an undesirable human to be removed. In a we see the original image, including a human that is gradually eroded in b–d as we fill more shells. In this case the inpainting method is Guidefill [27], and the application is disocclusion for 3D conversion, which means that the human does not need to be removed entirely. See [27] or [34, Chapter 9] for more details on this application

Geometric inpainting methods attempt to smoothly extend image structure into the inpainting domain, typically using partial differential equations (PDEs) or variational principles—see [34] for a comprehensive survey. Examples based on PDEs include the seminal work of Bertalmio et al. [6] and methods based on anisotropic diffusion such as [40], while examples based on variational principles include TV, TV-\(\hbox {H}^{-1}\), Mumford–Shah, and Cahn–Hilliard inpainting [11, 14], Euler’s Elastica [13, 33], as well as the joint interpolation of image values and a guiding vector field in Ballester et al. [4]. These approaches are typically iterative and convergence is often slow (due to either the high order of the underlying PDE model or the difficult nature of the variational problem). On the other hand, Telea’s Algorithm [38], coherence transport [7, 31], and our previous work Guidefill [27] are based on the simple idea of filling the inpainting domain in successive shells from the boundary inward, setting the color of pixels on the current boundary equal to a weighted average of their already filled neighbors—Fig. 1 illustrates this process. In general, “neighbors” may include “ghost pixels” lying between pixel centers and defined using bilinear interpolation—this concept was introduced in [27]. By choosing the weights, fill order, and averaging neighborhood carefully, image isophotes may be extrapolated into the inpainting domain. Compared to their iterative counterparts, these methods have the advantage of being very fast. However, at the same time all of them create, to a greater or lesser degree, potentially disturbing visual artifacts. For example, extrapolated isophotes may “kink” (change direction), blur, or end abruptly. Although these problems have generally been reduced over time as newer and better algorithms have been proposed, they are still present to some degree even in the state of the art.

The present paper has two goals. Firstly, to analyze these algorithms as a class in order to understand the origins of these artifacts and what, if anything, can be done about them. Secondly, to propose a simple extension in which the pixels in a given shell are solved simultaneously as a linear system, and to analyze its benefits. We will refer to the original methods as “direct” and the extended methods as “semi-implicit”—see Fig. 2 for an illustration of the difference between them. To our knowledge, this extension has never before been proposed in the literature. We also propose a simple iterative method for solving the linear system arising in the semi-implicit extension, show that it is equivalent to damped Jacobi or successive over-relaxation (SOR), and analyze its convergence rates when applied to the linear system arising in semi-implicit Guidefill. Successive over-relaxation is shown to converge extremely quickly, provided the unknowns are ordered appropriately.

Fig. 2

The direct method and its semi-implicit extension: in this illustration, the filled portion of the image is highlighted in pale yellow, and the current inpainting domain is highlighted in both grey and blue, with the former denoting pixels in the interior of the inpainting domain and the latter pixels on its current boundary. The boundary is the “shell” that is currently being filled—note that the filled (known) portion of the image consists not only of the original undamaged region, but also of previously filled shells. At this stage, the grey and blue pixels are both unknown. In the direct method (a), the color of a given pixel (highlighted in red) is computed directly as a weighted average of pixels (in green) within a given neighborhood (outlined in white) that are already known. In the semi-implicit extension, the sum also includes pixels on the current boundary of the inpainting domain, but not pixels in its interior. This results in a linear system that needs to be solved, but this can be done relatively cheaply (see Sect. 6.2) and has benefits in terms of artifact reduction (see Sect. 7.1)

Remark 1

The original motivation for this paper comes from one of the authors’ experiences working in industry for the 3D conversion company Gener8 (see [27] or [34, Chapter 9] for a description of 3D conversion). The programmers there had implemented a series of shell-based inpainting methods of the form discussed above (all essentially variants of Telea’s algorithm [38], of which they were unaware) in order to solve an inpainting problem arising in their 3D conversion pipeline. This type of algorithm was attractive to them because it is fast and simple, while generally adequate for their application (where inpainting domains are typically very thin). However, they were puzzled by some of the artifacts they were seeing, most notably kinking artifacts. After a literature review it became clear that the existing theory [7, 31, 32] was enough to explain some, but not all, of what they observed. This paper is an attempt to fill the gap.

1.1 Organization

The remainder of this paper is organized as follows. First, in Sect. 2, we present the general form of the class of inpainting methods under consideration, including pseudocode for both the direct method and an implementation of our proposed semi-implicit extension (see Algorithm 1). We also cover in detail known artifacts created by the direct form of the method, how they have been reduced over time, related work, and our contributions. Next, in Sect. 3 we review the main variants of Algorithm 1 present in the literature. Section 4 describes an equivalence principle between weighted sums of ghost pixels and a sum over real pixels with different weights. This principle will be applied repeatedly throughout our theoretical analysis. Section 5 describes our semi-implicit extension of Algorithm 1, including a description of an implementation of two alternatives for solving the resulting linear system, namely damped Jacobi and successive over-relaxation (SOR). Section 6 contains our core analysis, and is divided into a number of subsections. The first of these, Sect. 6.1, describes the simplifying assumptions that we make throughout. Next, Sect. 6.2 presents a theoretical analysis of the convergence rates of damped Jacobi and SOR for solving the linear system arising in semi-implicit Guidefill, and compares with real experiments. In Sect. 6.3 we define the continuum and asymptotic limits of Algorithm 1. Section 6.4 describes the connection between Algorithm 1 and stopped random walks, while Sects. 6.5 and 6.7 use this connection to prove convergence to the asymptotic and continuum limits previously defined in Sect. 6.3. Next, in Sect. 7, we apply our analysis to gain a theoretical understanding of kinking and blur artifacts. In particular, Sect. 7.1 utilizes our continuum limit both to explain kinking artifacts and to show how to overcome them. Section 7.3 utilizes the asymptotic limit to make theoretical predictions quantifying the extent to which a given method introduces blur artifacts. Sections 7.1 and 7.3 also contain numerous numerical experiments demonstrating the accuracy of the predictions made by these limits. In Sect. 8 we show that our results—although derived in the context of image inpainting—are abstract results that can be applied to other problems. To demonstrate this, we apply our asymptotic limit to a problem in numerical linear algebra. Finally, in Sect. 9 we draw some conclusions and sketch some directions for potential future research. We also have a number of appendices giving technical details of certain proofs.

1.2 Notation

  • \(h\) denotes the width of one pixel.

  • \(\mathbb {Z}^2_h := \{ (nh,mh) : (n,m) \in \mathbb {Z}^2 \}.\)

  • Given \(\mathbf{v} \in \mathbb {R}^2\), we denote by \(L_\mathbf{v}:=\{ \lambda \mathbf{v} : \lambda \in \mathbb {R} \}\) the line through the origin parallel to \(\mathbf{v}\).

  • Given \(\mathbf{x} \in \mathbb {R}^2\), we denote by \(\theta (\mathbf{x}) \in [0,\pi )\) the counter-clockwise angle \(L_\mathbf{x}\) makes with the x-axis. \(\theta (\mathbf{x})\) can also be thought of as the counterclockwise angle \(\mathbf{x}\) makes with the x-axis, modulo \(\pi \).

  • Given \(\mathbf{v} \in \mathbb {R}^2\), we denote by \(\mathbf{v}^{\perp }\) the counterclockwise rotation of \(\mathbf{v}\) by \(90^{\circ }\).

  • \(\varOmega = [a,b] \times [c,d]\) and \(\varOmega _h = \varOmega \cap \mathbb {Z}^2_h\) are the continuous and discrete image domains.

  • \(D_h = D^{(0)}_h \subset \varOmega _h\) is the (initial) discrete inpainting domain.

  • \(D^{(k)}_h \subseteq D^{(0)}_h\) is the discrete inpainting domain on step k of the algorithm.

  • \(D := \{ \mathbf{x} \in \varOmega : \exists \mathbf{y} \in D_h \text{ s.t. } \Vert \mathbf{y}-\mathbf{x}\Vert _{\infty } < h \} \subset \varOmega \) is the continuous inpainting domain.

  • \(D^{(k)}\) is the continuous inpainting domain on step k of the algorithm, defined in the same way as D.

  • \(u_0 : \varOmega _h \backslash D_h \rightarrow \mathbb {R}^d\) is the given (discrete) image. In an abuse of notation, we also use \(u_0\) to denote an assumed underlying continuous image \(u_0 : \varOmega \backslash D \rightarrow \mathbb {R}^d\).

  • \(u_h : \varOmega _h \rightarrow \mathbb {R}^d\) is the inpainted completion of \(u_0\).

  • \(\mathbf{g}(\mathbf{x})\) is the guidance vector field used by coherence transport and Guidefill.

  • \(B_{\epsilon }(\mathbf{x})\) is the solid ball of radius \(\epsilon \) centered at \(\mathbf{x}\).

  • \(A_{\epsilon ,h}(\mathbf{x}) \subset B_{\epsilon }(\mathbf{x})\) denotes a generic discrete (but not necessarily lattice aligned) neighborhood of radius \(\epsilon \) surrounding the pixel \(\mathbf{x}\) and used for inpainting.

  • \(\text{ Supp }(A_{\epsilon ,h}(\mathbf{x})) \subset \mathbb {Z}_h^2\) denotes the set of real pixels needed to define \(A_{\epsilon ,h}(\mathbf{x})\) based on bilinear interpolation.

  • \(B_{\epsilon ,h}(\mathbf{x}) = \{ \mathbf{y} \in \varOmega _h : \Vert \mathbf{x}-\mathbf{y}\Vert \le \epsilon \}\), the discrete ball of radius \(\epsilon \) centered at \(\mathbf{x}\) and the choice of \(A_{\epsilon ,h}(\mathbf{x})\) used by coherence transport.

  • \(r = \epsilon /h\) is the radius of \(B_{\epsilon ,h}(\mathbf{x})\) measured in pixels.

  • \({\tilde{B}}_{\epsilon ,h}(\mathbf{x}) = R(B_{\epsilon ,h}(\mathbf{x}))\), where R is the rotation matrix taking (0, 1) to \(\mathbf{g}(\mathbf{x})\), the choice of \(A_{\epsilon ,h}(\mathbf{x})\) used by Guidefill.

  • \({\mathcal {N}}(\mathbf{x})=\{ \mathbf{x}+\mathbf{y} : \mathbf{y} \in \{-h,0,h\} \times \{-h,0,h\}\}\) is the 9-point neighborhood of \(\mathbf{x}\).

  • Given \(A_h \subset \mathbb {Z}^2_h\), we define the dilation of \(A_h\) by

    $$\begin{aligned} D_h(A_h) = \cup _{\mathbf{x} \in A_h} {\mathcal {N}}(\mathbf{x}). \end{aligned}$$

    If \(h=1\) we write D instead of \(D_1\).

  • Given \(A_h \subset \mathbb {Z}^2_h\), we define the discrete (inner) boundary of \(A_h\) by

    $$\begin{aligned} \partial A_h := \{ \mathbf{x} \in A_h : {\mathcal {N}}(\mathbf{x}) \cap \mathbb {Z}^2_h \backslash A_h \ne \emptyset \}. \end{aligned}$$

    For convenience we typically drop the word “inner” and refer to \(\partial A_h\) as just the boundary of \(A_h\).

  • O denotes the zero matrix.

  • Given \(c \in \mathbb {R}\), we define \(\{ y \le c \} := \{ (x,y) \in \mathbb {R}^2 : y \le c \}.\)

  • Given \( L > 0\), and \(0 \le x,y < L\) we define the circular distance between x and y by

    $$\begin{aligned} d^{{\text {circle}}}_L(x,y) = \min (x-y \mod L, y-x \mod L). \end{aligned}$$
  • Given \(\sigma , \mu \in \mathbb {R}\) we denote by \(G_{\sigma ,\mu }\) and \(g_{\sigma ,\mu }\) the Gaussians

    $$\begin{aligned} G_{\sigma ,\mu }(x) := \frac{e^{-\frac{(x-\mu )^2}{2\sigma ^2}}}{\sqrt{2\pi }\sigma } \quad g_{\sigma ,\mu }(x):=\frac{e^{-\frac{(x-\mu )^2}{2\sigma ^2}}}{\sum _{n \in \mathbb {Z}} e^{-\frac{(n-\mu )^2}{2\sigma ^2}}} \end{aligned}$$

    Similarly, given \(L > 0\) and \(N \in \mathbb {N}\) we denote by \(G^*_{\sigma ,\mu ,L}\) and \(g^*_{\sigma ,\mu ,N}\) the periodic Gaussians

    $$\begin{aligned} G^*_{\sigma ,\mu ,L}(x) := \frac{e^{-\frac{d^{{\text {circle}}}_L(x,\mu )^2}{2\sigma ^2}}}{\sqrt{2\pi }\sigma }, \qquad g^*_{\sigma ,\mu ,N}(x) := \frac{e^{-\frac{d^{{\text {circle}}}_N(x,\mu )^2}{2\sigma ^2}}}{\sum _{n =0}^{N-1} e^{-\frac{d^{{\text {circle}}}_N(n,\mu )^2}{2\sigma ^2}}} \end{aligned}$$

    Note that \(G_{\sigma ,\mu }\) and \(g_{\sigma ,\mu }\) are normalized with respect to integration over \(\mathbb {R}\) and summation over \(\mathbb {Z}\) respectively, while \(\frac{1}{\int _0^L G^*_{\sigma ,\mu ,L}(x)dx }\cdot G^*_{\sigma ,\mu ,L}\) and \(g^*_{\sigma ,\mu ,N}\) are normalized with respect to integration over [0, L) and summation over \(\{0,1,\ldots ,N-1\}\) respectively.
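
The circular distance and the discrete periodic Gaussian above are easy to sanity-check numerically. The following sketch (Python with numpy; the helper names d_circle and g_star are ours) implements \(d^{{\text {circle}}}_L\) and \(g^*_{\sigma ,\mu ,N}\) and verifies the claimed normalization over \(\{0,1,\ldots ,N-1\}\):

```python
import numpy as np

def d_circle(x, y, L):
    """Circular distance d^circle_L(x, y) between points of [0, L)."""
    return np.minimum((x - y) % L, (y - x) % L)

def g_star(x, sigma, mu, N):
    """Discrete periodic Gaussian g*_{sigma,mu,N}, normalized over {0,...,N-1}."""
    n = np.arange(N)
    Z = np.sum(np.exp(-d_circle(n, mu, N) ** 2 / (2 * sigma ** 2)))
    return np.exp(-d_circle(x, mu, N) ** 2 / (2 * sigma ** 2)) / Z

print(d_circle(1.0, 9.0, 10.0))                 # 2.0: wraps around the circle
print(g_star(np.arange(8), 1.5, 2.0, 8).sum())  # 1.0 up to rounding error
```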

2 Shell-Based Geometric Inpainting

Algorithm 1 (pseudocode)
Fig. 3

Illustration of a generic set \(A_{\epsilon ,h}(\mathbf{x})\) containing ghost pixels: in this illustration the overlaid grid is the lattice \(\mathbb {Z}^2_h\) with pixel centers at its vertices. The elements of a generic set \(A_{\epsilon ,h}(\mathbf{x})\) are represented as red dots—they do not need to occupy pixel centers, but they must all lie within distance \(\epsilon \) of \(\mathbf{x}\). Ghost pixels are defined based on bilinear interpolation of their real pixel neighbors. Here we have highlighted in green the squares whose vertices are the real pixels needed to define \(A_{\epsilon ,h}(\mathbf{x})\). We call this set of real pixels \(\text{ Supp }(A_{\epsilon ,h}(\mathbf{x}))\). Note that while \(A_{\epsilon ,h}(\mathbf{x}) \subset B_{\epsilon }(\mathbf{x})\), this inclusion is not in general true of \(\text{ Supp }(A_{\epsilon ,h}(\mathbf{x}))\)

In this paper we are interested in a simple class of shell-based geometric inpainting methods and their proposed semi-implicit extension. These methods fill the inpainting domain in successive shells from the boundary inwards, as illustrated in Fig. 1. In the direct form of the method, the color of a given pixel \(\mathbf{x}\) due to be filled is computed as a weighted average of its already filled neighbors within a discrete neighborhood \(A_{\epsilon ,h}(\mathbf{x}) \subset B_{\epsilon }(\mathbf{x})\). In the semi-implicit extension, this sum also includes unknown pixels within the current shell, resulting in a linear system (Fig. 2). We will cover the resulting linear system in detail in Sect. 5, where we also propose an iterative method for its solution. The direct method as well as this proposed iterative solution to the semi-implicit extension are illustrated in Algorithm 1 with pseudocode (the blue code indicates the parts relevant to the semi-implicit extension). The neighborhood \(A_{\epsilon ,h}(\mathbf{x})\) need not be axis aligned and may contain “ghost pixels” lying between pixel centers—see Fig. 3 for an illustration. Ghost pixels were introduced in [27] (where they were shown to be beneficial for reducing “kinking artifacts”—see Fig. 6), and the color of a given ghost pixel is defined as the bilinear interpolation of its four real pixel neighbors, but is undefined if one or more of them has not yet been assigned a color. We denote by \(\text{ Supp }(A_{\epsilon ,h}(\mathbf{x})) \subset \mathbb {Z}_h^2\) the set of real pixels needed to define \(A_{\epsilon ,h}(\mathbf{x})\) in this way. Here h and \(\epsilon \) denote respectively the width of one pixel and the radius of the bounding disk \(B_{\epsilon }(\mathbf{x}) \supset A_{\epsilon ,h}(\mathbf{x})\). The averaging weights \(w_{\epsilon }\) are non-negative and are allowed to depend on \(\mathbf{x}\), but must scale proportionally with the size of the neighborhood \(A_{\epsilon ,h}(\mathbf{x})\), like so:

$$\begin{aligned} w_{\epsilon }(\mathbf{x},\mathbf{y})={\hat{w}}\left( \mathbf{x}, \frac{\mathbf{y}-\mathbf{x}}{\epsilon }\right) \end{aligned}$$
(2)

for some function \({\hat{w}}(\cdot ,\cdot ) : \varOmega \times B_1(\mathbf{0}) \rightarrow [0,\infty ]\). Note that we will sometimes write \(w_r\) or \(w_1\) in place of \(w_{\epsilon }\)—in this case we mean (2) with \(\epsilon \) replaced by r or 1 in the denominator on the right hand side. As the algorithm proceeds, the inpainting domain shrinks, generating a sequence of inpainting domains

$$\begin{aligned} D_h = D^{(0)}_h \supset D^{(1)}_h \supset \ldots \supset D^{(K)}_h = \emptyset . \end{aligned}$$

We will assume the non-degeneracy condition

$$\begin{aligned} \sum _{\mathbf{y} \in A_{\epsilon ,h}(\mathbf{x}) \cap (\varOmega \backslash D^{(k)})}w_{\epsilon }(\mathbf{x},\mathbf{y}) > 0 \end{aligned}$$
(3)

holds at all times; this ensures that the average (1) in Algorithm 1 is always well defined. One trivial way of ensuring this is by having strictly positive weights \({\hat{w}}\), which all the methods considered do (see Sect. 3). At iteration k, only pixels belonging to the current boundary \(\partial D^{(k)}_h\) are filled, but moreover we fill only a subset \(\partial _{\text{ ready }} D^{(k)}_h \subseteq \partial D^{(k)}_h\) of pixels deemed to be “ready” (in Sect. 3 we will review the main methods in the literature and give their “ready” functions). The main inpainting methods in the literature of the general form given by Algorithm 1 are (in chronological order) Telea’s Algorithm [38], coherence transport [7], coherence transport with adapted distance functions [31], and our previous work Guidefill [27]. As we will see, these methods essentially differ only in the choice of weights (2), the choice of fill order as dictated by the “ready” function, and the choice of neighborhood \(A_{\epsilon ,h}(\mathbf{x})\). We will review these methods in Sect. 3.
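
To fix ideas, here is a minimal sketch of the direct method (Python with numpy). It is not the pseudocode of Algorithm 1: the array conventions and the generic weights callable are our own simplifying assumptions, the weights are taken independent of \(\mathbf{x}\), the neighborhood is the lattice-aligned ball \(B_{\epsilon ,h}(\mathbf{x})\) with no ghost pixels, and the fill order is the default onion shell ordering (\(\text{ready} \equiv \text{true}\)):

```python
import numpy as np

def direct_inpaint(u, mask, weights, r=3):
    """Direct shell-based fill. u: float image of shape (H, W, d); mask: bool
    array, True on the inpainting domain D_h; weights(dy, dx): non-negative
    weight for the offset (dy, dx); r = eps / h: neighborhood radius in pixels."""
    u, mask = u.copy(), mask.copy()
    H, W = mask.shape
    offsets = [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
               if 0 < dy * dy + dx * dx <= r * r]  # B_{eps,h}(x) minus x itself
    while mask.any():
        # inner boundary of D_h^(k): masked pixels with a known 8-neighbor
        pad = np.pad(mask, 1, constant_values=True)
        known_nbr = np.zeros_like(mask)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                known_nbr |= ~pad[1 + dy:H + 1 + dy, 1 + dx:W + 1 + dx]
        shell = np.argwhere(mask & known_nbr)
        for y, x in shell:        # each pixel uses only data known at the start
            num, den = 0.0, 0.0   # of the shell, so this loop is parallelizable
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W and not mask[yy, xx]:
                    w = weights(dy, dx)
                    num, den = num + w * u[yy, xx], den + w
            u[y, x] = num / den   # well defined by the non-degeneracy condition (3)
        mask[shell[:, 0], shell[:, 1]] = False  # shrink to D_h^(k+1)
    return u
```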

Remark 2

It is worth mentioning that this class of algorithms is essentially the same as the “generic single-pass algorithm” first systematically studied by Bornemann and März in [7]. The two main differences are

  1.

    They assume \(A_{\epsilon ,h}(\mathbf{x})=B_{\epsilon ,h}(\mathbf{x})\), while we allow for greater flexibility.

  2.

    They consider only the direct form of Algorithm 1, not the semi-implicit extension.

Beyond this, they also phrase things in terms of a pre-determined fill order, rather than a “ready” function, but the former may easily be seen, mathematically at least, to be a special case of the latter (Sect. 3).

2.1 Advantages of Algorithm 1

The main appeal of Algorithm 1 is its simplicity and parallelizability, which enable it to run very fast. A second advantage, first noted by Bornemann and März [7], is the stability property

$$\begin{aligned} \min _{\mathbf{y} \in B}u_0(\mathbf{y}) \le u_h(\mathbf{x}) \le \max _{\mathbf{y} \in B} u_0(\mathbf{y}) \qquad \text{ for } \text{ all } {} \mathbf{x} \in D_h, \end{aligned}$$
(4)

(which holds channelwise) where \(u_0 : \varOmega _h \backslash D_h \rightarrow \mathbb {R}^d\) is the given image and B is the band of width \(\epsilon \) pixels surrounding \(\partial D_h\). This property holds because we have chosen non-negative weights summing to one.

Remark 3

Although we have presented Telea’s algorithm [38] as an example of Algorithm 1, this is not strictly true as its update formula (5) (see Sect. 3) contains a gradient term that, after it has been approximated with finite differences, effectively violates the rule of non-negative weights summing to one. This means that Telea’s algorithm does not satisfy the stability property (4). See Fig. 7.

2.2 Disadvantages and Artifacts

The main disadvantage of Algorithm 1 is obviously loss of texture, and in cases where texture is important, these methods should not be used. However, beyond loss of texture, inpainting methods of the general form given in Algorithm 1 can also introduce a host of other artifacts, which we list below.

  • “kinking” of isophotes where extrapolated isophotes change direction at the boundary of the inpainting domain—see Figs. 6 and 8.

  • “blurring” of isophotes where edges that are sharp in the undamaged region may blur when extended into the inpainting domain—see Figs. 6 and 8.

  • “cutting off” of isophotes where isophotes extrapolated into the inpainting domain end abruptly—see Fig. 4.

  • formation of shocks where sharp discontinuities may form in the inpainting domain—see Fig. 4.

  • bright or dark spots, which occur only if the stability condition (4) is violated, as it is for Telea’s algorithm. See Figs. 7 and 8.

Fig. 4

Cut off isophotes and shocks: because Algorithm 1 fills the inpainting domain from many directions at once, “cut off isophotes” or shocks can sometimes be formed. In a this is due to the (superimposed) fill order, which is the default onion shell ordering and a bad choice in this case. In b we have chosen a new fill order better adapted to the image, and the problem is solved in this case. However, the shock in c is due to incompatible boundary conditions and it is unlikely any special fill order could solve the problem

Fig. 5

Special directions: for a given guidance direction \(\mathbf{g}=(\cos \theta ,\sin \theta )\), coherence transport [7, 31] can successfully extrapolate isophotes parallel to \(\mathbf{g}\) only if \(\mathbf{g}=\lambda \mathbf{v}\), for some \(\mathbf{v} \in B_{\epsilon ,h}(\mathbf{0})\). This is illustrated in b, where we have solved the inpainting problem posed in a multiple times using coherence transport with \(\epsilon = 3\)px with a sequence of guidance directions \(\mathbf{g}_k =(\cos \theta _k,\sin \theta _k)\) (\(\theta _k \in \{k^{\circ }\}_{k=1}^{179}\)) and superimposed the results together with \(B_{\epsilon ,h}(\mathbf{0})\) (the parameter \(\mu \) in (6) is \(\mu = 100\)). Instead of a smoothly varying line sweeping through the upper half plane and filling it with red, we see a superposition of finitely many lines, each passing through some \(\mathbf{v} \in B_{\epsilon ,h}(\mathbf{0})\). When we repeat the experiment in c using Guidefill [27], we see that it is not free of problems either. In this case Guidefill can extrapolate along \(\mathbf{g}=(\cos \theta ,\sin \theta )\) so long as \(0<\theta _c \le \theta \le \pi -\theta _c < \pi \), where \(\theta _c\) is a critical angle, and we get a red cone bounded on either side by \(\theta _c\). Here we have superimposed the dilated ball \(D_h(B_{\epsilon ,h}(\mathbf{0}))\), and it is evident that \(\theta _c\) is in some way related to this dilation—this will be explained in Sect. 7.1.1

Fig. 6

A tale of two inpainting problems: in a–d a line making an angle of \(\theta = 63^{\circ }\) with the horizontal is inpainted using each of Telea’s algorithm [38], coherence transport [7, 31], and Guidefill [27] (the inpainting domain is shown in yellow). In this case the radius of \(A_{\epsilon ,h}(\mathbf{x})\) is \(\epsilon = 3\)px, and since \(63^{\circ } \approx \arctan (2) \approx 63.44^{\circ }\) is close to one of the “special directions” in which coherence transport can extend isophotes successfully for this value of \(\epsilon \) (see Fig. 5), both coherence transport and Guidefill make a successful connection. In e–h we change the angle of the line slightly to \(\theta = 73^{\circ }\). This is not one of coherence transport’s admissible directions for \(\epsilon = 3\)px, so it fails to make the connection, while Guidefill continues to have no problems, at the expense of some blurring. Telea’s algorithm, on the other hand, propagates in the direction of the normal to the inpainting domain boundary regardless of the undamaged image content, and thus fails to make the connection in both cases while also introducing significant blur. In i, j, we examine horizontal cross sections (of the red channel) of all three methods at the midpoint of the inpainting domain. Here, a disadvantage of Guidefill in terms of blur becomes more apparent—coherence transport by contrast produces a much sharper result. The reasons for this are explored in Sect. 7.3

2.3 Related Work (Artifact Reduction)

Broadly speaking, there has been incremental progress as follows: Telea’s algorithm [38], the earliest variant to appear in the literature, suffers from strong artifacts of every type. In particular, the weights make no attempt to take into account the orientation of undamaged isophotes in \(\varOmega _h \backslash D_h\), and the result shows strong kinking artifacts (see Fig. 6). Bornemann and März identified and sought to address this problem with coherence transport [7], which proposed carefully chosen weights that are proven (in a high resolution and vanishing viscosity limit) to extend isophotes in any desired guidance direction \(\mathbf{g}\) not parallel to the inpainting domain boundary. This was combined with a method aimed at robustly measuring the orientation of isophotes at the boundary, so that a suitable \(\mathbf{g}\) allowing for a seamless transition could be found. With the problem of “kinking” ostensibly resolved, in a follow-up work März proposed coherence transport with adapted distance functions [31], designed to minimize the problem of “cut off” isophotes and shocks. This was accomplished by recognizing that artifacts such as the incomplete line in Fig. 4a are often the byproduct of a suboptimal fill order such as the one superimposed (in this case the default onion shell ordering). The situation can often be corrected, as in Fig. 4b, by using an ordering better adapted to the image such as the one illustrated there. Rather than filling pixels in an order proportional to their distance from the boundary, i.e. having the \(\text{ ready }\) function in Algorithm 1 always return “true”, März proposed a number of ways of generating improved orderings based on non-Euclidean distance-from-boundary maps. At the same time, recognizing that the presence of shocks was related to the “stopping set” [31] of the distance map, März was able to exert some measure of control over those as well, if not prevent them entirely. Guidefill [27] brought the focus back to the reduction of kinking artifacts, by noting that coherence transport is actually only able to propagate along a given guidance vector \(\mathbf{g}\) if it points in one of a finite set of special directions—see Fig. 5b. Whereas previous improvements to Algorithm 1 had focused first on improving the choice of weights, then the fill order (equivalently the choice of \(\text{ ready }\) function), Guidefill proposed for the first time to change the averaging neighborhood \(A_{\epsilon ,h}(\mathbf{x})\), which until then had always been the discrete ball \(B_{\epsilon ,h}(\mathbf{x})\) (Fig. 10a). Specifically, it proposed to replace \(A_{\epsilon ,h}(\mathbf{x})=B_{\epsilon ,h}(\mathbf{x})\) with \(A_{\epsilon ,h}(\mathbf{x})={\tilde{B}}_{\epsilon ,h}(\mathbf{x})\), where \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\) is the rotated discrete ball shown in Fig. 10b, aligned with the guidance direction \(\mathbf{g}\). Since \(A_{\epsilon ,h}(\mathbf{x})\) is in this case no longer axis aligned, it contains what the authors called “ghost pixels” lying between pixel centers, which they defined based on bilinear interpolation. This small change enabled Guidefill to propagate along most guidance directions, but it too has problems when the angle between \(\mathbf{g}\) and the boundary of the inpainting domain is too shallow—see Fig. 5c. However, Guidefill pays a price for its reduction in kinking artifacts in the form of an increase in blur artifacts. See Fig. 6, where coherence transport produces a sharp extension of image isophotes, albeit possibly in the wrong direction, whereas Guidefill extrapolates in the right direction, but the extrapolation suffers from blur. Guidefill also proposed its own “smart order” computed on the fly as an alternative to März’s adapted distance functions, but this does not have any serious advantage in terms of the quality of the results. Either approach will do for preventing “cut off” isophotes.

2.4 Related Theoretical Work

The direct form of Algorithm 1 has been studied from a theoretical point of view via two distinct continuum limits. The first of these is the high-resolution vanishing viscosity limit proposed by Bornemann and März, in which \(h \rightarrow 0 \) and then \(\epsilon \rightarrow 0\) [7]. The second is a fixed-ratio continuum limit proposed in our previous work [27] in which \((h,\epsilon ) \rightarrow (0,0)\) along the line \(\epsilon = r h\) (with \(r \in \mathbb {N}\) fixed). The non-negative integer r is simply the radius of \(A_{\epsilon ,h}(\mathbf{x})\) measured in pixels. Although both are perfectly valid mathematically, numerical experiments indicate that the high resolution viscosity limit gives a good approximation of the behaviour of Algorithm 1 in practice only when \(r \gg 1\), whereas our fixed-ratio limit gives a good approximation even when r is a small integer, as it typically is in practice (see Remark 4.3 in our previous work [27] for an explanation of why this is). There has also been significant work in studying the well-posedness of the high resolution and vanishing viscosity limit of Algorithm 1, both in [7] and especially in [32]. See Fig. 9 for an illustration of these two separate limits.

Motivation for ghost pixels In Sect. 4 we will prove that any weighted sum over a set \(A_{\epsilon ,h}(\mathbf{x})\) of ghost pixels is equivalent to a sum over the real pixels in \(\text{ Supp }(A_{\epsilon ,h}(\mathbf{x}))\) with equivalent weights. While this makes ghost pixels in some sense redundant, they are a useful concept. Specifically, in Theorem 6 we will prove that the fixed-ratio continuum limit described above and illustrated in Fig. 9 is a transport equation whose transport direction is a function of the weights \(w_{\epsilon }\) and the position vectors of the elements of \(A_{\epsilon ,h}(\mathbf{x})\). Avoiding “kinking” artifacts amounts to working backwards from a desired transport direction \(\mathbf{g}\) to the weights \(w_{\epsilon }\) and neighborhood \(A_{\epsilon ,h}(\mathbf{x})\) that yield \(\mathbf{g}\). This is easier to do when the elements of \(A_{\epsilon ,h}(\mathbf{x})\) can move continuously in \(\mathbb {R}^2\), rather than being constrained to the lattice \(\mathbb {Z}_h^2\).

2.5 Contributions

Our contributions are both theoretical and practical, aimed at a deeper understanding of the mathematical properties of both the direct and semi-implicit versions of Algorithm 1, and through that understanding, a better grasp of the underlying causes of the artifacts described above and what, if anything, can be done about them. Our main targets are “kinking” and “blurring” artifacts, the others having already been thoroughly analyzed in [7, 32] and being well understood. Broadly speaking, we have three main contributions:

  • We propose a semi-implicit variant of Algorithm 1, propose an efficient method for solving the linear system that arises in it, and derive analytical rates of convergence.

  • We propose two novel limits of Algorithm 1: a fixed-ratio continuum limit where the pixel width \(h \rightarrow 0\) with \(r =\epsilon /h\) fixed, and an asymptotic limit where \(r=\epsilon /h\) is again fixed but rather than taking h to zero, we explore the asymptotic dependence of the solution on h when it is very small.

  • We use the above two limits to explain the origins of kinking and blur artifacts in Algorithm 1, and how to rectify them. Among other things, we prove that any variant of the direct form of Algorithm 1 must exhibit kinking artifacts, whereas this is not the case for our proposed semi-implicit extension (Sect. 7.1.1).

While our present work focuses on inpainting, our results—in particular the asymptotic limit—are abstract results that can be applied more generally. An example of a situation where this comes up is the use of damped Jacobi iteration for solving a linear system

$$\begin{aligned} A\mathbf{x} = \mathbf{b} \end{aligned}$$

when A is an M-matrix (that is, has a positive diagonal and negative or zero off-diagonal elements). Section 8 demonstrates this by applying our asymptotic limit to predict the evolution of the error in damped Jacobi applied to a 1-d convection-diffusion problem with Dirichlet boundary conditions.
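
For concreteness, the sketch below builds the standard upwind finite-difference matrix for such a 1-d convection-diffusion operator and checks the M-matrix sign pattern just described (the grid size and coefficients are illustrative choices of ours, not values from Sect. 8):

```python
import numpy as np

# Upwind finite differences for -nu u'' + b u' on n interior points (b > 0).
# n, nu, b are illustrative values, not parameters taken from Sect. 8.
n, nu, b = 50, 1.0, 5.0
h = 1.0 / (n + 1)
A = (np.diag((2 * nu / h ** 2 + b / h) * np.ones(n))
     + np.diag((-nu / h ** 2 - b / h) * np.ones(n - 1), -1)
     + np.diag((-nu / h ** 2) * np.ones(n - 1), 1))
# M-matrix sign pattern: positive diagonal, non-positive off-diagonal entries
assert (np.diag(A) > 0).all()
assert (A - np.diag(np.diag(A)) <= 0).all()
```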

3 Review of Main Methods

Here we briefly review the main inpainting methods of the general form sketched in Algorithm 1.

Telea’s algorithm The earliest algorithm (to our knowledge) appearing in the literature and of the form sketched in Algorithm 1, Telea’s algorithm [38] is also the only such algorithm to use a different formula for \(u_h(\mathbf{x})\) than the expression (1) appearing in Algorithm 1 (see Remark 3). Instead of computing \(u_h(\mathbf{x})\) as a weighted average of \(u_h(\mathbf{y})\) evaluated at nearby already filled pixels \(\mathbf{y}\), it takes a weighted average of the predictions that each of these pixels makes, based on linear extrapolation, for \(u_h(\mathbf{x})\). That is,

$$\begin{aligned} u_h(\mathbf{x}) = \frac{\sum _{\mathbf{y} \in B_{\epsilon ,h}(\mathbf{x}) \cap (\varOmega _h \backslash D^{(k)}_h )} w_{\epsilon }(\mathbf{x},\mathbf{y}) u_{\text{ predicted }}(\mathbf{x})}{\sum _{\mathbf{y} \in B_{\epsilon ,h}(\mathbf{x})\cap (\varOmega _h \backslash D^{(k)}_h )} w_{\epsilon }(\mathbf{x},\mathbf{y})}, \end{aligned}$$
(5)

where

$$\begin{aligned} u_{\text{ predicted }}(\mathbf{x}) := u_h(\mathbf{y})+\nabla _h u_h(\mathbf{y}) \cdot (\mathbf{x}-\mathbf{y}) \end{aligned}$$

and \(\nabla _h u_h(\mathbf{y})\) denotes the centered difference approximation to the gradient of \(u_h\) at \(\mathbf{y}\), that is

$$\begin{aligned} \nabla _h u_h(\mathbf{y}) := \frac{1}{2}\big (u_h(\mathbf{y}+e_1)-u_h(\mathbf{y}-e_1),\; u_h(\mathbf{y}+e_2)-u_h(\mathbf{y}-e_2)\big ). \end{aligned}$$

As we have already noted in Remark 3, this approach has a disadvantage in that it results in the loss of the stability property (4). Moreover, the predictions based on linear extrapolation can become highly inaccurate when \(\mathbf{y}\) is on an edge in \(\varOmega _h \backslash D^{(k)}_h\), leading to significant over- or undershooting, visible as bright or dark spots as in Figs. 7 and 8. Perhaps in recognition of this, the gradient term was dropped from (5) in all subsequent algorithms. The weights in this case are

$$\begin{aligned} w_{\epsilon }(\mathbf{x},\mathbf{y}) = \text{ dir }(\mathbf{x},\mathbf{y}) \cdot \text{ dst }(\mathbf{x},\mathbf{y}) \cdot \text{ lev }(\mathbf{x},\mathbf{y}), \end{aligned}$$

where

$$\begin{aligned} \text{ dir }(\mathbf{x},\mathbf{y}) := \frac{\mathbf{x}-\mathbf{y}}{\Vert \mathbf{x}-\mathbf{y}\Vert } \cdot \mathbf{N}(\mathbf{x}), \quad \text{ dst }(\mathbf{x},\mathbf{y}) := \frac{d_0^2}{\Vert \mathbf{x}-\mathbf{y}\Vert ^2}, \quad \text{ lev }(\mathbf{x},\mathbf{y}) := \frac{T_0}{1+|T(\mathbf{y})-T(\mathbf{x})|}, \end{aligned}$$

and \(T(\mathbf{x})\) denotes the Euclidean distance from \(\mathbf{x}\) to the (original) boundary of the inpainting domain, and \(\mathbf{N}(\mathbf{x}) = \nabla _h T(\mathbf{x})\) (estimated based on central differences). T is precomputed using the fast marching method. Telea’s algorithm uses the default onion shell ordering, that is “\(\text{ ready }(\mathbf{x}) \equiv \text{ true }\)” (Fig. 9).
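
In code, the weight factors and the linear prediction might look as follows (a sketch under our own conventions: scalar-valued \(u\), integer pixel coordinates, the reference constants \(d_0\) and \(T_0\) set to 1, and the maps T and grad_u assumed precomputed):

```python
import numpy as np

def telea_weight(x, y, N_x, T, d0=1.0, T0=1.0):
    """w_eps(x, y) = dir * dst * lev for integer pixel coordinates x, y."""
    d = (np.asarray(x) - np.asarray(y)).astype(float)
    dir_ = np.dot(d / np.linalg.norm(d), N_x)           # alignment with the normal N(x)
    dst_ = d0 ** 2 / np.dot(d, d)                       # inverse-square distance falloff
    lev_ = T0 / (1.0 + abs(T[tuple(y)] - T[tuple(x)]))  # closeness in level sets of T
    return dir_ * dst_ * lev_

def telea_prediction(u, grad_u, x, y):
    """Linear extrapolation u(y) + grad_h u(y) . (x - y); this gradient term
    is what breaks the stability property (4)."""
    d = (np.asarray(x) - np.asarray(y)).astype(float)
    return u[tuple(y)] + grad_u[tuple(y)] @ d
```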

Fig. 7

Bright spots in Telea’s algorithm: in this example we consider the inpainting problem shown in a, consisting of a line separating a region of dark blue from a region of dark red. We inpaint using both Telea’s algorithm (b) and coherence transport (c). Coherence transport obeys the stability property (4) and hence the brightness of the inpainted solution remains bounded above by the brightness on the exterior of the inpainting domain. This is not true of Telea’s algorithm, which exhibits bright spots outside the original color range. These were not visible in Fig. 6, because the brightness of each color channel was already saturated, and Telea’s algorithm uses clamping to prevent the solution from going outside the admissible color range. This is further illustrated in d, e, where we plot horizontal cross sections of the red channel of each inpainted solution

Fig. 8

Blurring, kinking, and bright spots with Telea’s algorithm: even for problems such as a where the inpainting domain is very thin, Telea’s algorithm b–d still creates strong blurring artifacts and fails to connect isophotes effectively. Also, due to the presence of the gradient term in (5), Telea’s algorithm violates the stability condition (4) and as a result can “overshoot” when filling pixels close to edges in the filled area, where the (numerical) gradient changes rapidly. This leads to the bright spots near the reconstructed edges in c, d. In this case coherence transport e, f is a much better choice

Fig. 9

Two distinct continuum limits: Algorithm 1 has two distinct continuum limits, illustrated here for \(A_{\epsilon ,h}(\mathbf{x})=B_{\epsilon ,h}(\mathbf{x})\). The first, illustrated in a–f, is the high-resolution vanishing viscosity double limit proposed by Bornemann and März [7], in which \(h \rightarrow 0 \) (a)–(c) and then \(\epsilon \rightarrow 0\) (d)–(f). The second is the fixed-ratio single limit \((\epsilon ,h) \rightarrow (0,0)\) with \(r=\frac{\epsilon }{h}\) fixed proposed in our previous work [27], illustrated in g–i for \(r=4\). While they are both valid limits of Algorithm 1, they predict very different behaviour. In particular, while the high-resolution vanishing viscosity limit of Bornemann and März [7] is able to predict the “kinking” behaviour of Telea’s Algorithm [38], it fails to predict the kinking artifacts of their own method, coherence transport. Our fixed-ratio continuum limit, on the other hand, predicts both. See Theorem 6, as well as Sect. 7.1.2

Coherence transport Coherence transport  [7] improves upon Telea’s algorithm by adapting the weights in order to encourage extrapolation of isophotes in the direction of their tangent. This is done by calculating a “local coherence direction” \(\mathbf{g}(\mathbf{x})\) in terms of a modified structure tensor. Coherence transport calculates the color of a given pixel to be filled using the formula (1) in Algorithm 1 with weights

$$\begin{aligned} w_{\epsilon }(\mathbf{x},\mathbf{y}) = \frac{1}{\Vert \mathbf{y}-\mathbf{x}\Vert }\exp \left( -\frac{\mu ^2}{2\epsilon ^2}(\mathbf{g}^{\perp }(\mathbf{x}) \cdot (\mathbf{y}-\mathbf{x}))^2\right) , \end{aligned}$$
(6)

and with \(A_{\epsilon ,h}(\mathbf{x}) = B_{\epsilon ,h}(\mathbf{x})\)—see Fig. 10a and c. Like Telea’s algorithm, coherence transport uses the default onion shell ordering, that is “\(\text{ ready }(\mathbf{x}) \equiv \text{ true }\)”.
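
A direct transcription of the weight (6), assuming \(\mathbf{g}\) is given as a unit vector (the function name is ours):

```python
import numpy as np

def ct_weight(x, y, g, mu, eps):
    """Coherence transport weight (6): largest for y on the line through x
    parallel to g, decaying rapidly (for large mu) as y moves off that line."""
    d = np.asarray(y, float) - np.asarray(x, float)
    g_perp = np.array([-g[1], g[0]])  # counterclockwise rotation of g by 90 degrees
    return np.exp(-(mu ** 2) / (2 * eps ** 2) * np.dot(g_perp, d) ** 2) / np.linalg.norm(d)
```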

Coherence transport with adapted distance functions In a subsequent work [31], März made improvements to coherence transport by replacing the default onion shell ordering with one based on a variety of non-Euclidean distance functions. One such distance function defines an “active boundary” \(\Gamma _h \subseteq \partial D_h\) defined by

$$\begin{aligned} \Gamma _h := \{ \mathbf{x} \in \partial D_h : \langle \mathbf{g}(\mathbf{x}), \mathbf{N}(\mathbf{x}) \rangle ^2 > \gamma \} \end{aligned}$$

where \(\gamma > 0\) is a small constant. The non-Euclidean distance to boundary \(T^*_h\) is then computed as the Euclidean distance to the active boundary. The algorithm is modified so that at any given iteration, only a subset of boundary pixels are filled—namely those minimizing \(T^*_h\). That is

$$\begin{aligned} \text{ ready }(\mathbf{x}) = \text{ true } \Leftrightarrow \mathbf{x} \in {\text {argmin}}_{\mathbf{y} \in \partial D_h} T^*_h(\mathbf{y}). \end{aligned}$$

This adaptation leads to improvements in the long range extrapolation of isophotes, as in Fig. 4.

Fig. 10

Neighborhoods and weights for coherence transport and Guidefill: here we illustrate the neighborhoods \(A_{\epsilon ,h}(\mathbf{x})\) and weights (6) used by coherence transport and Guidefill. In each case \(\epsilon = 3\)px and \(\mathbf{g}(\mathbf{x})=(\cos 73^{\circ },\sin 73^{\circ })\). Coherence transport a uses the lattice-aligned discrete ball \(A_{\epsilon ,h}(\mathbf{x})=B_{\epsilon ,h}(\mathbf{x})\), while Guidefill b uses the rotated discrete ball \(A_{\epsilon ,h}(\mathbf{x})={\tilde{B}}_{\epsilon ,h}(\mathbf{x})\). The ball \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\) is rotated so that it is aligned with the line L (shown in red) passing through \(\mathbf{x}\) parallel to \(\mathbf{g}(\mathbf{x})\). In general \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\) contains “ghost pixels” lying between pixel centers, which are defined using bilinear interpolation of their “real” pixel neighbors. Both use the same weights (6) illustrated in c. The parameter \(\mu \) controls the extent to which the weights are biased in favor of points lying on or close to the line L

Guidefill Guidefill [27] is a recent inpainting algorithm designed to address, among other things, the kinking issues in Figs. 5b and 6. While coherence transport is able to extrapolate along guidance direction \(\mathbf{g}(\mathbf{x})\) only if \(\mathbf{g}(\mathbf{x}) = \lambda (\mathbf{v}-\mathbf{x})\) for some \(\mathbf{v} \in B_{\epsilon ,h}(\mathbf{x})\) (see Fig. 5b), Guidefill replaces the lattice aligned discrete ball \(B_{\epsilon ,h}(\mathbf{x})\) with the rotated discrete ball \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\) aligned with the local transport direction \(\mathbf{g}(\mathbf{x})\), so that \(\mathbf{g}(\mathbf{x}) = \lambda (\mathbf{v}-\mathbf{x})\) for some \(\mathbf{v} \in {\tilde{B}}_{\epsilon ,h}(\mathbf{x})\) is always true. The rotated ball \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\) contains “ghost pixels” lying between pixel centers which are defined using bilinear interpolation. See Sect. 4 for a deeper discussion of ghost pixels, as well as Fig. 10a–b for an illustration of \(B_{\epsilon ,h}(\mathbf{x})\) and \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\).

Guidefill uses the same weights (6) as coherence transport (illustrated in Fig. 10c) and, similarly to the latter’s extension [31], it has a way of automatically determining a good fill order. Unlike coherence transport, which computes \(\mathbf{g}(\mathbf{x})\) concurrently with inpainting, Guidefill computes a guide field \(\mathbf{g} : D_h \rightarrow \mathbb {R}^2\) prior to inpainting. The guide field is computed based on splines which the user may adjust in order to influence the results. It is used to automatically compute a good fill order by computing for each \(\mathbf{x} \in \partial D_h\) a confidence \(C(\mathbf{x}) \in [0,1]\) inspired by Criminisi et al. [16] and given by

$$\begin{aligned} C(\mathbf{x})=\frac{\sum _{\mathbf{y} \in {\tilde{B}}_{\epsilon ,h}(\mathbf{x}) \cap (\varOmega \backslash D^{(k)} )}w_{\epsilon }(\mathbf{x},\mathbf{y})}{\sum _{\mathbf{y} \in {\tilde{B}}_{\epsilon ,h}(\mathbf{x}) }w_{\epsilon }(\mathbf{x},\mathbf{y})}, \end{aligned}$$
(7)

and then only filling those pixels for which \(C(\mathbf{x}) > c\), where \( c \in (0,1)\) is a small constant. That is

$$\begin{aligned} \text{ ready }(\mathbf{x}) = 1( C(\mathbf{x}) > c) \end{aligned}$$
(8)

Guidefill was designed for use as part of a 3D conversion pipeline, and as such makes use of a set \(B_h\) of “bystander pixels” which are neither inpainted nor may be used for inpainting. However, this is not relevant to our current investigation and we will assume \(B_h = \emptyset \) throughout. As shown in Fig. 5c, Guidefill is able to largely, but not completely, eliminate kinking artifacts. It was in the hope of overcoming this that we designed the semi-implicit version of Algorithm 1 discussed in Sect. 5.
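
The two ingredients that distinguish Guidefill, the rotated ball \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\) and the confidence-based ready function (7)–(8), are sketched below (Python; names and calling conventions are ours, and is_known stands for membership of a ghost pixel in the already-filled region):

```python
import numpy as np

def rotated_ball(g, r):
    """Ghost-pixel offsets of the rotated ball B~_{eps,h}(0), in units of h:
    the lattice ball of radius r rotated so that (0, 1) maps to the unit
    vector g. The center is excluded, matching B~ \ {x} as used in the text."""
    g = np.asarray(g, float) / np.linalg.norm(g)
    R = np.array([[g[1], g[0]],
                  [-g[0], g[1]]])  # rotation taking (0, 1) to (g[0], g[1])
    pts = [(i, j) for i in range(-r, r + 1) for j in range(-r, r + 1)
           if 0 < i * i + j * j <= r * r]
    return np.array(pts, float) @ R.T

def confidence(x, offsets, weight, is_known):
    """Confidence C(x) from (7): the fraction of the ball's total weight mass
    lying over already-known ghost pixels; ready(x) from (8) is C(x) > c."""
    w = np.array([weight(x, x + v) for v in offsets])
    k = np.array([float(is_known(x + v)) for v in offsets])
    return float(np.sum(w * k) / np.sum(w))
```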

Remark 4

Note that we have deliberately excluded the point \(\mathbf{x}\) from the update formula (1) in Algorithm 1, even if the set \(A_{\epsilon ,h}(\mathbf{x})\) contains \(\mathbf{x}\). This is not done in any of the methods  [7, 27, 31, 38] we have just discussed, but it makes no difference to them or any other variant of the direct form of Algorithm 1, because the subroutine FillRow only involves sums taken over \(A_{\epsilon ,h}(\mathbf{x}) \cap (\varOmega \backslash D^{(k)})\), which never contains \(\mathbf{x}\). However, the semi-implicit extension of Algorithm 1 expresses \(u_h(\mathbf{x})\) as a sum of \(u_h(\mathbf{y})\) over a set of points that might include \(\mathbf{x}\). This creates problems with weights such as (6) for which \(w_{\epsilon }(\mathbf{x},\mathbf{x})=\infty \). See “Appendix A” for further details.

4 Ghost Pixels and Equivalent Weights

Because ghost pixels are defined using bilinear interpolation, any sum over a finite set of ghost pixels \(A(\mathbf{x})\) can be converted into a sum over an equivalent set of real pixels with equivalent weights, that is

$$\begin{aligned} \sum _{\mathbf{y} \in A(\mathbf{x})} w(\mathbf{x},\mathbf{y}) u_h(\mathbf{y}) = \sum _{\mathbf{y} \in \text{ Supp }(A(\mathbf{x}))} {\tilde{w}}(\mathbf{x},\mathbf{y}) u_h(\mathbf{y}) \end{aligned}$$

where \(\text{ Supp }(A(\mathbf{x}))\) denotes the set of real pixels needed to define \(u_h(\mathbf{y})\) for each \(\mathbf{y} \in A(\mathbf{x})\) and \({\tilde{w}}\) denotes a set of equivalent weights. This works because each \(u_h(\mathbf{y})\) is itself a weighted sum of the form

$$\begin{aligned} u_h(\mathbf{y}) = \sum _{\mathbf{z} \in \mathbb {Z}_h^2}\varLambda _{\mathbf{z},h}(\mathbf{y})u_h(\mathbf{z}), \end{aligned}$$

where \(\{ \varLambda _{\mathbf{z},h} \}_{\mathbf{z} \in \mathbb {Z}^2_h}\) denote the basis functions of bilinear interpolation associated with the lattice \(\mathbb {Z}^2_h\). This is illustrated in Fig. 11a, b, where we show a heat map of the weights (6) over the set \({\tilde{B}}_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x}\}\) for \(\mu = 100\) and \(\epsilon =3\)px, as well as a similar heat map of the modified weights over \(\text{ Supp }({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\backslash \{ \mathbf{x}\}) \subseteq D_h(B_{\epsilon ,h}(\mathbf{x}))\). Note that even though \({\tilde{B}}_{\epsilon ,h}(\mathbf{x})\backslash \{ \mathbf{x}\}\) does not contain the point \(\mathbf{x}\), the support of this set does. This will be important in Sect. 5. Here we briefly list some properties of equivalent weights, including an explicit formula. Proofs are sketched, but details are deferred to “Appendix B”.

Fig. 11

Ghost pixels and equivalent weights: because ghost pixels are defined using bilinear interpolation, any weighted sum over a set of ghost pixels \(A_{\epsilon ,h}(\mathbf{x})\) is equivalent to a sum with equivalent weights over real pixels in the set \(\text{ Supp }(A_{\epsilon ,h}(\mathbf{x}))\), defined as the set of real pixels needed to define each ghost pixel \(\mathbf{y}\) in \(A_{\epsilon ,h}(\mathbf{x})\). We illustrate this in a–c using Guidefill with \(\epsilon = 3\)px, \(\mathbf{g}=(\cos 77^{\circ },\sin 77^{\circ })\), and \(\mu = 100\). In a, the (normalized) weights (6) are visualized as a heat map over \({\tilde{B}}_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x} \}\). In b we show the equivalent weights over \(\text{ Supp }({\tilde{B}}_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x} \} ) \subseteq D_h(B_{\epsilon ,h}(\mathbf{x}))\) (this containment comes from (14) in Remark 6). Note that even though \({\tilde{B}}_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x} \}\) does not contain the point \(\mathbf{x}\), the support of this set does. In c we visualize the equivalent weights over the set \(D_h(B_{\epsilon ,h}(\mathbf{x}))\), which is strictly larger than \(\text{ Supp }({\tilde{B}}_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x} \} )\). For reference, we include the line parallel to \(\mathbf{g}\) in green

Properties of equivalent weights Properties 1–3 deal with a general finite set \(A(\mathbf{x})\) and general weights \(w(\mathbf{x},\mathbf{y})\), while properties 4–6 deal with the specific set \(A_{\epsilon ,h}(\mathbf{x}) \subset B_{\epsilon }(\mathbf{x})\) and the specific weights \(w_{\epsilon }(\mathbf{x},\mathbf{y})\) obeying (3).

  1.

    Explicit formula:

    $$\begin{aligned} {\tilde{w}}(\mathbf{x},\mathbf{z}) = \sum _{\mathbf{y} \in A(\mathbf{x})} \varLambda _{\mathbf{y},h}(\mathbf{z})w(\mathbf{x},\mathbf{y}). \end{aligned}$$
    (9)
  2.

    Preservation of total mass:

    $$\begin{aligned} \sum _{\mathbf{y} \in A(\mathbf{x})} w(\mathbf{x},\mathbf{y}) = \sum _{\mathbf{y} \in \text{ Supp }(A(\mathbf{x}))} {\tilde{w}}(\mathbf{x},\mathbf{y}). \end{aligned}$$
    (10)
  3.

    Preservation of center of mass (or first moment):

    $$\begin{aligned} \sum _{\mathbf{y} \in A(\mathbf{x})} w(\mathbf{x},\mathbf{y})\mathbf{y} = \sum _{\mathbf{y} \in \text{ Supp }(A(\mathbf{x}))} {\tilde{w}}(\mathbf{x},\mathbf{y})\mathbf{y}. \end{aligned}$$
    (11)
  4.

    Inheritance of non-negativity:

    $$\begin{aligned} {\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{z}) \ge 0 \quad \text{ for } \text{ all } \mathbf{z} \in \text{ Supp }(A_{\epsilon ,h}(\mathbf{x})). \end{aligned}$$
    (12)
  5.

    Inheritance of non-degeneracy condition (3):

    $$\begin{aligned} \sum _{\mathbf{y} \in \text{ Supp }(A_{\epsilon ,h}(\mathbf{x}) \cap (\varOmega \backslash D^{(k)}))}{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y}) > 0. \end{aligned}$$
    (13)
  6.

    Universal support: For any \(n \in \mathbb {Z}\), we have

    $$\begin{aligned} \text{ Supp }(A_{\epsilon ,h}(\mathbf{x}) \cap \{ y \le nh\} ) \subseteq D_h(B_{\epsilon ,h}(\mathbf{x})) \cap \{ y \le nh\} \subseteq B_{\epsilon +2h,h}(\mathbf{x}) \cap \{ y \le nh\}, \end{aligned}$$

    where \(\{y \le nh \} := \{ (x,y) \in \mathbb {R}^2 : y \le nh\}\), and where \(D_h\) is the dilation operator defined in our section on notation.

Proof

Most of these properties are either obvious or are derived based on a simple exercise in changing the order of nested finite sums. Properties (10) and (11) are slightly more interesting—they follow from the fact that the bilinear interpolant of a polynomial of degree at most one is just the polynomial again. Note that an analogous formula for preservation of the second moment does not hold, because a quadratic function and its bilinear interpolant are not the same thing. The last identity is based on an explicit formula for the support of a point. Details are provided in “Appendix B”. \(\square \)

Remark 5

Although we have explicit formula (9) for the equivalent weights which will occasionally be useful, most of the time it is more fruitful to think about them in the following way: To compute \({\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y})\) for some real pixel \(\mathbf{y}\), loop over the ghost pixels \(\mathbf{z}\) such that \(\mathbf{y} \in \text{ Supp }(\mathbf{z})\). Then, each such \(\mathbf{z}\) redistributes to \({\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y})\) a fraction of its weight \(w_{\epsilon }(\mathbf{x},\mathbf{z})\) equal to the proportion of \(u_h(\mathbf{y})\) that went into \(u_h(\mathbf{z})\).
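
The redistribution described in Remark 5 takes only a few lines of code. The sketch below (cell and naming conventions are ours) converts ghost-pixel weights \(w(\mathbf{x},\mathbf{z})\) into equivalent real-pixel weights via formula (9); since the four bilinear coefficients of each ghost pixel sum to one, total mass (10) is preserved by construction:

```python
import numpy as np
from collections import defaultdict

def equivalent_weights(x, ghost_pts, w, h=1.0):
    """Equivalent weights (9): each ghost pixel z redistributes its weight
    w(x, z) onto the four lattice pixels defining it, in proportion to its
    bilinear coefficients. Returns a dict {real pixel: w~(x, pixel)}."""
    w_tilde = defaultdict(float)
    for z in ghost_pts:
        i, j = np.floor(np.asarray(z, float) / h).astype(int)
        s, t = z[0] / h - i, z[1] / h - j  # local coordinates within the cell
        for (di, dj), lam in [((0, 0), (1 - s) * (1 - t)), ((1, 0), s * (1 - t)),
                              ((0, 1), (1 - s) * t), ((1, 1), s * t)]:
            w_tilde[((i + di) * h, (j + dj) * h)] += lam * w(x, z)
    return dict(w_tilde)  # total mass is preserved since the lam's sum to 1
```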

Remark 6

An obvious corollary of the universal support property (14) is that we also have the containment

$$\begin{aligned} \text{ Supp }(A_{\epsilon ,h}(\mathbf{x}) ) \subseteq D_h(B_{\epsilon ,h}(\mathbf{x})) \subseteq B_{\epsilon +2h,h}(\mathbf{x}). \end{aligned}$$
(14)

Figure 11 illustrates an example where this containment is strict, and in fact it is not hard to show that this holds in general. However, (14) is tight enough for our purposes in this paper.

Remark 7

The main results of this paper, in particular Theorems 2 and 6, remain applicable if ghost pixels are defined using a different form of interpolation (rather than bilinear interpolation as above) so long as that interpolation scheme can also be expressed in terms of non-negative basis functions with compact support and also preserves polynomials of degree one. Moreover, the fixed-ratio continuum limit (Theorem 6) is identical for all such interpolation schemes. The asymptotic limit (Theorem 2), however, is not. One example of an alternative interpolation scheme obeying these properties is barycentric triangle interpolation [10]. In the future, it would be interesting to explore alternative interpolation schemes for ghost pixels; however, this is beyond the scope of the current work.

5 Semi-Implicit Extension of Algorithm 1

Here we present the semi-implicit form of Algorithm 1, to our knowledge not previously proposed in the literature, in which instead of computing \(u_h(\mathbf{x})\) for each \(\mathbf{x} \in \partial _{\text{ ready }} D^{(k)}_h\) independently, we solve for them simultaneously by solving a linear system. We call our method semi-implicit in order to distinguish it from fully implicit methods in which the entire inpainting domain \(\{u_h(\mathbf{x}) : \mathbf{x} \in D_h\}\) is solved simultaneously, as is typically the case for most inpainting methods based on PDEs or variational principles, e.g.  [6, 11, 13]. Specifically, we solve

$$\begin{aligned} {\mathcal {L}}\mathbf{u} = \mathbf{f} \quad \text{ where } \mathbf{u} = \{ u_h(\mathbf{x}) : \mathbf{x} \in \partial _{\text{ ready }} D^{(k)}_h \}, \end{aligned}$$
(15)

and \(\mathbf{f}\) is a vector of length \(|\partial _{\text{ ready }} D^{(k)}_h|\). The explicit entries of \({\mathcal {L}}\) are written in terms of the equivalent weights \({\tilde{w}}_{\epsilon }\) introduced in Sect. 4. Defining

$$\begin{aligned} S^{(k)}_{\epsilon ,h}(\mathbf{x}) := \text{ Supp }(A_{\epsilon ,h}(\mathbf{x})) \cap \partial _{\text{ ready }} D^{(k)}_h, \end{aligned}$$

it follows that \({\mathcal {L}}\) couples each \(\mathbf{x} \in \partial _{\text{ ready }} D^{(k)}_h\) to its immediate neighbors in \(S^{(k)}_{\epsilon ,h}(\mathbf{x})\). In particular, we have

$$\begin{aligned} ({\mathcal {L}}\mathbf{u})(\mathbf{x}) = \left( 1-\frac{{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{x})}{W}\right) u_h(\mathbf{x}) -\sum _{\mathbf{y} \in S^{(k)}_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x} \}}\frac{{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y})}{W}u_h(\mathbf{y}), \end{aligned}$$

where W is the total mass and can be computed in one of two ways, using either the original weights \(w_{\epsilon }\) or the equivalent weights \({\tilde{w}}_{\epsilon }\), exploiting preservation of mass (10):

$$\begin{aligned} W := \sum _{\mathbf{y} \in S^{(k)}_{\epsilon ,h}(\mathbf{x}) \cup \left( \text{ Supp }(A_{\epsilon ,h}(\mathbf{x})) \cap (\varOmega _h \backslash D^{(k)}_h)\right) }{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y}) \end{aligned}$$
(16)
$$\begin{aligned} = \sum _{\mathbf{y} \in (A_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x}\}) \cap (\varOmega \backslash D^{(k)}) } w_{\epsilon }(\mathbf{x},\mathbf{y}). \end{aligned}$$
(17)

Generally, (17) is more convenient to work with than (16), but (16) combined with the inherited non-degeneracy condition (13) tells us that

$$\begin{aligned} \sum _{\mathbf{y} \in S^{(k)}_{\epsilon ,h}(\mathbf{x})} {\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y}) < W, \end{aligned}$$

because the non-degeneracy condition (13) implies that a non-zero proportion of the total weight goes into the known pixels in \(\text{ Supp }(A_{\epsilon ,h}(\mathbf{x})) \cap (\varOmega \backslash D^{(k)}_h)\). From this it immediately follows that \({\mathcal {L}}\) is strictly diagonally dominant, as

$$\begin{aligned}&|{\mathcal {L}}_{\mathbf{x},\mathbf{x}}| - \sum _{\mathbf{y} \in \partial _{\text{ ready }} D^{(k)}_h \backslash \{ \mathbf{x} \}}|\mathcal L_{\mathbf{x},\mathbf{y}}| \\&\quad = \left( 1-\frac{{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{x})}{W}\right) -\sum _{\mathbf{y} \in S^{(k)}_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x} \}}\frac{{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y})}{W} \\&\quad = 1-\sum _{\mathbf{y} \in S^{(k)}_{\epsilon ,h}(\mathbf{x})}\frac{{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{y})}{W}> 0. \end{aligned}$$

To compute \(\mathbf{f}\), we do not need the concept of equivalent weights. We have

$$\begin{aligned} \mathbf{f}(\mathbf{x}) = \sum _{\mathbf{y} \in (A_{\epsilon ,h}(\mathbf{x}) \backslash \{ \mathbf{x} \}) \cap ( \varOmega \backslash D^{(k)})}\frac{w_{\epsilon }(\mathbf{x},\mathbf{y})}{W}u_h(\mathbf{y}). \end{aligned}$$
(18)

5.1 Solving the Linear System

Designing maximally efficient methods for solving (15) is beyond the scope of this paper—our main purpose lies in understanding the effect of this extension on the continuum limit that we will derive later. Therefore, in this paper we consider only two very simple methods: damped Jacobi and SOR (successive over-relaxation). These are natural choices for a number of reasons. First, since \({\mathcal {L}}\) is strictly diagonally dominant, these methods are both guaranteed to converge  [42, Theorem 3.10, p. 79], at least in the case \(\omega =1\), where they reduce to Jacobi and Gauss–Seidel. Second, at least for the semi-implicit extension of Guidefill, the performance of SOR is already satisfactory (see Sect. 6.2, Proposition 1). Third, both methods can be implemented with minimal changes to the direct form of Algorithm 1. In fact, changing the variable “semiImplicit” from “false” to “true” in Algorithm 1 and executing the “FillBoundary” subroutine in parallel is equivalent to solving (15) using damped Jacobi, with underrelaxation parameter

$$\begin{aligned} \omega ^* = \left( 1 - \frac{{\tilde{w}}_{\epsilon }(\mathbf{x},\mathbf{x})}{W}\right) \le 1. \end{aligned}$$
(19)

Similarly, executing the “FillBoundary” subroutine sequentially results in SOR with the same underrelaxation parameter—see Proposition 3 in “Appendix C” for a proof. Note that \(\omega ^*\) is typically very close to 1, so these methods are very similar to plain Jacobi and Gauss–Seidel.
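To make the preceding discussion concrete, here is a minimal sketch (ours, not the actual implementation of Algorithm 1) of the two stationary iterative methods applied to a generic strictly diagonally dominant system; setting \(\omega = 1\) recovers plain Jacobi and Gauss–Seidel, while \(\omega = \omega ^*\) of (19) mimics the behavior of Algorithm 1 with “semiImplicit” set to “true”.

```python
# A minimal sketch (not the paper's implementation) of damped Jacobi and
# SOR for a generic strictly diagonally dominant system L u = f.
import numpy as np

def damped_jacobi(L, f, omega, n_iter=100):
    """Damped Jacobi: u <- u + omega * D^{-1} (f - L u), all entries at once."""
    u = np.zeros_like(f)
    d = np.diag(L)                       # the diagonal part D of L
    for _ in range(n_iter):
        u = u + omega * (f - L @ u) / d
    return u

def sor(L, f, omega, n_iter=100):
    """SOR: sweep the unknowns sequentially, relaxing one equation at a time."""
    u = np.zeros_like(f)
    for _ in range(n_iter):
        for i in range(len(f)):          # one forward sweep
            u[i] += omega * (f[i] - L[i] @ u) / L[i, i]
    return u

# toy strictly diagonally dominant system standing in for (15)
rng = np.random.default_rng(0)
L = 0.01 * rng.random((50, 50))
np.fill_diagonal(L, 1.0)
f = rng.random(50)
u_exact = np.linalg.solve(L, f)
print(np.abs(damped_jacobi(L, f, 1.0, 30) - u_exact).max(),
      np.abs(sor(L, f, 1.0, 10) - u_exact).max())
```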

Remark 8

The reason Algorithm 1 with “semiImplicit” set to “true” results in damped Jacobi/SOR, rather than plain Jacobi/Gauss–Seidel, is that even though the update formula (1) for the nth iterate \(u_h^{(n)}(\mathbf{x})\) is expressed as a sum of the \((n-1)\)st iterate evaluated at ghost pixels that do not include \(\mathbf{x}\), some of those ghost pixels may indirectly depend on \(u^{(n-1)}_h(\mathbf{x})\) because they are defined using bilinear interpolation. The result is that the nth iterate \(u_h^{(n)}(\mathbf{x})\) depends on \(u_h^{(n-1)}(\mathbf{x})\), which is true of damped Jacobi/SOR but not of plain Jacobi/Gauss–Seidel.

Fig. 12
figure 12

Semi-implicit Guidefill and shallow inpainting directions: in a a line makes a very shallow angle of just \(2^{\circ }\) with the inpainting domain, shown in yellow. This line is then inpainted using first Guidefill (b) and then the semi-implicit extension thereof (c). The latter uses 5 iterations of SOR per shell to solve the linear system (15) (in this case, the original image is \(2000\times 2000\)px, so there are 1000 shells). Visually identical results can be obtained using damped Jacobi, but more than 100 iterations per shell are required—see Proposition 1. Both methods use relaxation parameter \(\omega = \omega ^*\) (19) and are given \(\mathbf{g}=(\cos 2^{\circ },\sin 2^{\circ })\), \(\mu = 100\), \(\epsilon =3\)px, and use the default onion shell ordering (smart-order is turned off). Guidefill kinks while its extension does not. In d we see the result of failing to solve the linear system (15) to sufficient accuracy by applying too few iterations of damped Jacobi: only 50 iterations per shell are used, and the extrapolated line gradually fades away

6 Analysis

This section contains our core analysis: first of the convergence properties of semi-implicit Guidefill, and then of the convergence of both Algorithm 1 and its semi-implicit extension to an asymptotic limit, in which \(r=\epsilon /h\) is fixed and \(0 < h \ll 1\), as well as to a continuum limit, in which \((h,\epsilon ) \rightarrow (0,0)\) along the ray \(\epsilon = r h\) (with \(r \in \mathbb {N}\) fixed). Next, in Sect. 7, we will apply our results to explain some of the artifacts discussed in Sect. 2. We begin with a discussion of some symmetry assumptions that will hold from now on (Fig. 12).

6.1 Symmetry Assumptions

We will assume throughout that the inpainting domain D is the unit square \([0,1) \times (0,1]\) while the image domain is \(\varOmega =[0,1) \times (-\delta , 1]\) equipped with Dirichlet or periodic boundary conditions at \(x=0\) and \(x=1\), and no condition at \(y=1\). We denote the undamaged portion of the image by

$$\begin{aligned} {\mathcal {U}} := [0,1) \times (-\delta ,0] \quad \text{ and } \quad {\mathcal {U}}_h := {\mathcal {U}} \cap \mathbb {Z}^2_h. \end{aligned}$$
(20)

We discretize \(D = [0,1) \times (0,1]\) as an \(N \times N\) array of pixels \(D_h=D \cap (h \cdot \mathbb {Z}^2)\) with pixel width \(h:=1/N\). Specifically, we have

$$\begin{aligned} D_h = \{(ih,jh) : 0 \le i < N, 1 \le j \le N\}. \end{aligned}$$

In order to ensure that the update formula (1) is well defined, we need \(\epsilon +2h < \delta \), which we achieve by assuming \(h < \frac{\delta }{r+2}\) [this follows from the inclusion (14)]. We assume that the default onion shell ordering is used, so that

$$\begin{aligned} \partial D^{(k)}_h = \{ (jh,kh) \}_{j=0}^{N-1}. \end{aligned}$$

We also assume that the sets \(A_{\epsilon ,h}(\mathbf{x})\) are translations of one another, and the weights \(w_{\epsilon }(\mathbf{x},\mathbf{y})\) depend only on \(\frac{\mathbf{y}-\mathbf{x}}{\epsilon }\), that is

$$\begin{aligned} w_{\epsilon }(\mathbf{x},\mathbf{y})={\hat{w}}\left( \frac{\mathbf{y}-\mathbf{x}}{\epsilon }\right) . \end{aligned}$$

For coherence transport and Guidefill this means that the guidance direction \(\mathbf{g}\) is a constant.

Remark 9

We make the above assumptions not because we believe they are necessary, but because they enable us to make our analysis as simple as possible while still capturing the phenomena we would like to capture. In particular, the above two assumptions on the weights \(w_{\epsilon }\) and neighborhood \(A_{\epsilon ,h}(\mathbf{x})\) ensure that the matrix \({\mathcal {L}}\) given by (15) is either Toeplitz or circulant (depending on the boundary conditions), and also ensures that the random walk we connect Algorithm 1 to in Sect. 6.3 has i.i.d. (independent identically distributed) increments. Without these simplifications, our already lengthy analysis would become even more technical. Numerical experiments (omitted for brevity) suggest that these assumptions can be weakened, but proving this is beyond the scope of the present work.

These assumptions give us a high level of symmetry with which we may rewrite the update formula (1) of Algorithm 1 in the generic form

$$\begin{aligned} u_h(\mathbf{x}) = \frac{\sum _{\mathbf{y} \in a^*_r} w_r(\mathbf{0},\mathbf{y}) u_h(\mathbf{x}+h\mathbf{y})}{\sum _{\mathbf{y} \in a^*_r} w_r(\mathbf{0},\mathbf{y})} \end{aligned}$$
(21)

where

$$\begin{aligned} a^*_r = \left( \frac{1}{h}A_{\epsilon ,h}(\mathbf{0}) \backslash \{ \mathbf{0} \}\right) \cap \{ y \le \delta \}, \end{aligned}$$

and \(\delta = -1\) for the direct method, while \(\delta = 0\) for the semi-implicit extension. In particular, for coherence transport we have \(a^*_r=b^-_r\) for the direct method and \(a^*_r = b^0_r\) for the extension, where

$$\begin{aligned} b^0_r:= & {} \{ (n,m) \in \mathbb {Z}^2 : 0<n^2+m^2 \le r^2, m \le 0 \}. \end{aligned}$$
(22)
$$\begin{aligned} b^-_r:= & {} \{ (n,m) \in \mathbb {Z}^2 : n^2+m^2 \le r^2, m \le -1 \}. \end{aligned}$$
(23)

Similarly, for Guidefill, we have \(a^*_r={\tilde{b}}^-_r\) for the direct method and \(a^*_r={\tilde{b}}^0_r\) for the semi-implicit extension, where

$$\begin{aligned} {\tilde{b}}^0_r:= & {} \{ n\hat{\mathbf{g}}+m\hat{\mathbf{g}}^{\perp } : (n,m) \in \mathbb {Z}^2, 0<n^2+m^2 \le r^2, \\&n\hat{\mathbf{g}}\cdot e_2+m\hat{\mathbf{g}}^{\perp } \cdot e_2 \le 0 \} \\ {\tilde{b}}^-_r:= & {} \{ n\hat{\mathbf{g}}+m\hat{\mathbf{g}}^{\perp } : (n,m) \in \mathbb {Z}^2, n^2+m^2 \le r^2, \\&n\hat{\mathbf{g}}\cdot e_2+m\hat{\mathbf{g}}^{\perp } \cdot e_2 \le -1 \}, \end{aligned}$$

and \(\hat{\mathbf{g}} := \mathbf{g}/\Vert \mathbf{g}\Vert \) (if \(\mathbf{g} = \mathbf{0}\) we set \({\tilde{b}}^-_r = b^-_r\)). The sets \(b^-_r\), \(b^0_r\), \({\tilde{b}}^-_r\), \({\tilde{b}}^0_r\) may be visualized by looking at the portion of Fig. 10a, b on or below the lines \(y=0\) and \(y=-1\) respectively. Also important are the dilated sets \({\bar{b}}^0_r=D(b^0_r) \cap \{ y \le 0\}\) and \({\bar{b}}^-_r=D(b^-_r) \cap \{ y \le -1\}\). These sets are given explicitly by

$$\begin{aligned} {\bar{b}}^0_r:= & {} \{ (n+\varDelta n,m+\varDelta m) : (n,m)\in b^0_r, (\varDelta n,\varDelta m) \nonumber \\\in & {} \{-1,0,1\} \times \{-1,0,1\}, m+\varDelta m \le 0\} \end{aligned}$$
(24)
$$\begin{aligned} {\bar{b}}^-_r:= & {} \{ (n+\varDelta n,m+\varDelta m) : (n,m) \in b^-_r, (\varDelta n,\varDelta m) \nonumber \\\in & {} \{-1,0,1\} \times \{-1,0,1\}, m+\varDelta m \le -1\}. \end{aligned}$$
(25)

This is because the universal support property (14) gives us the inclusion

$$\begin{aligned} \text{ Supp }(a^*_r) \subseteq {\left\{ \begin{array}{ll} {\bar{b}}^-_r \subseteq b^-_{r+2} &{} \text{ (direct form of Algorithm 1) } \\ {\bar{b}}^0_r \subseteq b^0_{r+2} &{} \text{ (semi-implicit form), } \end{array}\right. } \end{aligned}$$
(26)

which will be critical later. The sets \({\bar{b}}^-_r\) and \({\bar{b}}^0_r\) are illustrated in Fig. 13.
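The stencil sets above are simple to construct programmatically. The sketch below (ours; names and conventions are not from the paper's code) builds the sets (22)–(25) directly from their definitions and verifies the direct-form inclusion in (26) for \(r=3\); the rotated Guidefill stencil \({\tilde{b}}^0_r\) is included for completeness.

```python
import numpy as np

def b0(r):
    """b^0_r of (22): lattice points with 0 < n^2+m^2 <= r^2 and m <= 0."""
    return {(n, m) for n in range(-r, r + 1) for m in range(-r, 1)
            if 0 < n * n + m * m <= r * r}

def bminus(r):
    """b^-_r of (23): lattice points with n^2+m^2 <= r^2 and m <= -1."""
    return {(n, m) for n in range(-r, r + 1) for m in range(-r, 0)
            if n * n + m * m <= r * r}

def dilate_clip(S, ymax):
    """D(S) clipped to {y <= ymax}, as in (24)-(25)."""
    return {(n + dn, m + dm) for (n, m) in S
            for dn in (-1, 0, 1) for dm in (-1, 0, 1) if m + dm <= ymax}

def guidefill_b0(r, theta):
    """tilde b^0_r: ghost pixels on the lattice rotated to align with g."""
    g = np.array([np.cos(theta), np.sin(theta)])
    gp = np.array([-np.sin(theta), np.cos(theta)])   # g rotated by 90 degrees
    return [tuple(n * g + m * gp) for n in range(-r, r + 1)
            for m in range(-r, r + 1)
            if 0 < n * n + m * m <= r * r and (n * g + m * gp)[1] <= 0]

r = 3
print(dilate_clip(bminus(r), -1) <= bminus(r + 2))   # inclusion (26): True
print(len(b0(r)), len(bminus(r)), len(guidefill_b0(r, np.radians(73))))
```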

Fig. 13
figure 13

Visualization of \({\bar{b}}^-_r\) and \({\bar{b}}^0_r\): here we illustrate the sets \({\bar{b}}^-_r:=D(b^-_r) \cap \{ y \le -1\}\) and \({\bar{b}}^0_r:=D(b^0_r) \cap \{ y \le 0\}\) in the case \(r=3\). These sets are defined explicitly in (24) and (25), and D is the dilation operator defined in the notation section. These sets are important because, depending on whether we use the direct form of Algorithm 1 or its semi-implicit extension, \(\text{ Supp }(a^*_r)\) is always contained in one or the other. See (26) in the text

Definition 1

We call the set \(a^*_r\) and the weights \(\{ w_r(\mathbf{0},\mathbf{y}) : \mathbf{y} \in a^*_r\}\) the stencil and stencil weights of a method. The center of mass of \(a^*_r\) is defined in the following two equivalent ways:

$$\begin{aligned} \text{ C.M. }=\frac{\sum _{\mathbf{y} \in a^*_r} w_r(0,\mathbf{y})\mathbf{y}}{\sum _{\mathbf{y} \in a^*_r} w_r(0,\mathbf{y})}=\frac{\sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} {\tilde{w}}_r(0,\mathbf{y})\mathbf{y}}{\sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} {\tilde{w}}_r(0,\mathbf{y})}. \end{aligned}$$

Here \({\tilde{w}}_r\) denote the equivalent weights from Sect. 4, and the above identity follows from (10) and (11). The center of mass of \(a^*_r\) will play a critical role both in the continuum limit of Algorithm 1 (where it is the transport direction of the resulting transport PDE) and in the connection to random walks (where, after multiplication by h, it is the mean of the increments of the walk).

Under these assumptions, the matrix \({\mathcal {L}}\) from (15) is independent of k (that is, we solve the same linear system for every shell), and moreover \({\mathcal {L}}\) becomes a Toeplitz matrix (Dirichlet boundary conditions) or circulant matrix (periodic boundary conditions). For a given pixel \(\mathbf{x}\) at least \(r+2\) pixels away from the boundary at \(x=0\) and \(x=1\), that is \(\mathbf{x} \in \{ (jh,kh) : r+2 \le j \le N-r-2\}\), it takes on the form

$$\begin{aligned} ({\mathcal {L}}{} \mathbf{u})(\mathbf{x}) =\left( 1-\frac{{\tilde{w}}_r(\mathbf{0},\mathbf{0})}{W}\right) u_h(\mathbf{x}) - \sum _{\mathbf{y} \in s_r \backslash \{ \mathbf{0} \}}\frac{{\tilde{w}}_{r}(\mathbf{0},\mathbf{y})}{W}u_h(\mathbf{x}+h\mathbf{y}), \end{aligned}$$

where by (26) we have

$$\begin{aligned} s_r= & {} \text{ Supp }(a^*_r) \cap ( \mathbb {Z} \times \{ 0 \} ) \\\subseteq & {} \{-(r+2)e_1,-(r+1)e_1,\ldots ,(r+1)e_1,(r+2)e_1\}. \end{aligned}$$

If \(\mathbf{x}\) is not at least \(r+2\) pixels away from the boundaries, then the formula changes in the usual way for Toeplitz and circulant matrices—we assume the reader is familiar with this, so no further discussion is needed. Under the same assumptions, the vector \(\mathbf{f}\) becomes

$$\begin{aligned} \mathbf{f}(\mathbf{x}) = \sum _{\mathbf{y} \in \text{ Supp }(a^*_r) \backslash ( \mathbb {Z} \times \{ 0\} ) }\frac{{\tilde{w}}_{r}(\mathbf{0},\mathbf{y})}{W}u_h(\mathbf{x}+h\mathbf{y}) \end{aligned}$$

where \(\text{ Supp }(a^*_r) \backslash ( \mathbb {Z} \times \{ 0\} ) \subseteq b^-_{r+2}\). We also define

$$\begin{aligned} {\tilde{W}} := \sum _{j=-r-2}^{r+2} {\tilde{w}}_r(\mathbf{0},je_1) \quad \text{ and } \quad {\tilde{w}}_{0,0}:={\tilde{w}}_r(\mathbf{0},\mathbf{0}). \end{aligned}$$
(27)

For a given point \(\mathbf{x} \in \partial D^{(k)}_h\), (again, assuming \(\mathbf{x}\) is far enough from the boundary) the ratio \(\frac{{\tilde{W}}}{W}\) gives the fraction of the mass of the stencil (Definition 1) centered at \(\mathbf{x}\) that gets “leaked” to the unknown pixels in \(\partial D^{(k)}_h\), while \(\frac{{\tilde{w}}_{0,0}}{W}\) gives the fraction that gets leaked to \(\mathbf{x}\). Together, these give a measure of the diagonal dominance of \({\mathcal {L}}\), as we have

$$\begin{aligned} \frac{\sum _{j \ne i} |{\mathcal {L}}_{ij}|}{|{\mathcal {L}}_{ii}|} = \frac{{\tilde{W}}-{\tilde{w}}_{0,0}}{W-{\tilde{w}}_{0,0}}. \end{aligned}$$

The smaller this ratio is, the stronger the diagonal dominance of \({\mathcal {L}}\), and the faster damped Jacobi and SOR can be expected to converge—see Proposition 1 for explicit formulas.
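As a concrete illustration of this structure, the sketch below (ours; the within-row equivalent weights and the total mass are placeholder values, not derived from any particular method) assembles the interior rows of the Toeplitz matrix \({\mathcal {L}}\) and reports the diagonal-dominance ratio just described.

```python
# A sketch (with made-up weights) of the Toeplitz system for one shell:
# each unknown is coupled to unknowns at offsets j = -(r+2), ..., r+2
# within the same row, as in the formula above.
import numpy as np

N, r = 16, 3
# assumed within-row equivalent weights {w_tilde(0, j*e1)}; illustrative only
w_row = {j: 0.05 / (1 + abs(j)) for j in range(-(r + 2), r + 3)}
W = 2.0                                  # assumed total mass; > sum(w_row)
w00 = w_row[0]
L = np.zeros((N, N))
for i in range(N):
    L[i, i] = 1 - w00 / W
    for j, w in w_row.items():
        if j != 0 and 0 <= i + j < N:    # Toeplitz band, clipped at the ends
            L[i, i + j] = -w / W
W_tilde = sum(w_row.values())
print((W_tilde - w00) / (W - w00))       # < 1: strict diagonal dominance
```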

For semi-implicit Guidefill with guidance direction \(\mathbf{g}=(\cos \theta ,\sin \theta )\), it can be shown that \({\mathcal {L}}\) becomes a lower triangular matrix in the limit \(\mu \rightarrow \infty \), provided we order unknowns left to right if \(\cos \theta > 0\) and right to left otherwise (see “Appendix D”). This gives us a hint that Gauss–Seidel and SOR might be very effective for the solution of (15) in this case, and indeed Proposition 1 confirms this.

6.2 Convergence Rates of Damped Jacobi and SOR for Semi-Implicit Guidefill

Here we derive tight bounds on the convergence rates of damped Jacobi and SOR for solving (15) in the semi-implicit extension of Guidefill described in Sect. 5, under the symmetry assumptions discussed above, and in the limit \(\mu \rightarrow \infty \) (recall that \(\mu \) is the parameter from the weights (6) controlling the extent to which weights are biased in favor of pixels in the directions \(\pm \mathbf{g}\)). We will prove that in each case the parameter value \(\omega = 1\) is optimal, but also pay special attention to the case \(\omega = \omega ^*\) given by (19), since this is the value of \(\omega \) that our proposed implementation of the semi-implicit extension in Algorithm 1 uses. We consider general \(\omega \) mainly in order to demonstrate that the choice \(\omega = \omega ^*\), while not optimal, is close enough to optimal not to matter in practice.

We will assume that \(D=[0,1) \times (0,1]\) with Dirichlet boundary conditions, as this simplifies our analysis of SOR—for damped Jacobi, we could just as easily have assumed periodic boundary conditions. We will measure convergence rates with respect to the induced infinity norm, which obeys the identity

$$\begin{aligned} \Vert A\Vert _{\infty } = \max _{i=1}^N \sum _{j=1}^N |a_{ij}| \end{aligned}$$
(28)

for any \(N \times N\) matrix A. Note that the iterates of the error \(\mathbf{e}^{(0)}\), \(\mathbf{e}^{(1)}\), \(\ldots \) associated with any stationary iterative method with iteration matrix M obey the bounds

$$\begin{aligned} \Vert \mathbf{e}^{(n)}\Vert \le \Vert M\Vert ^n\Vert \mathbf{e}^{(0)}\Vert \text{ and } R(\mathbf{e}) := \root n \of {\frac{\Vert \mathbf{e}^{(n)}\Vert }{\Vert \mathbf{e}^{(0)}\Vert }} \le \Vert M\Vert \end{aligned}$$
(29)

for any vector norm \(\Vert \cdot \Vert \) and induced matrix norm. We will be interested in these identities in the particular case that the vector norm is \(\Vert \cdot \Vert _{\infty }\), and the stationary iterative method is damped Jacobi or SOR. Here

$$\begin{aligned} \mathbf{e}^{(n)} := u_h - u^{(n)}_h \end{aligned}$$

denotes the difference between the exact solution to (15), found by solving (15) to machine precision, and the approximate solution at iteration n.

Proposition 1

Suppose semi-implicit Guidefill with guidance direction \(\mathbf{g}=(\cos \theta ,\sin \theta )\) is used to inpaint the square \(D=[0,1) \times (0,1]\) under the assumptions above, using either damped Jacobi or SOR to solve (15). Suppose that in the case of SOR, \(\partial D^{(k)}_h= \{ (ih,kh)\}_{i=0}^{N-1}\) is ordered from left to right if \(\cos \theta \ge 0\) and from right to left otherwise. Let \({\mathcal {L}}\) be as in (15) and define \({\mathcal {L}} = D-L-U\), where D, \(-L\), and \(-U\) are the diagonal, strictly lower triangular, and strictly upper triangular parts of \({\mathcal {L}}\) respectively. Let \(J_{\omega }\) and \(G_{\omega }\) denote the iteration matrices of damped Jacobi and SOR respectively with relaxation parameter \(\omega \), that is

$$\begin{aligned} J_{\omega }= & {} I - \omega D^{-1} {\mathcal {L}} \\ G_{\omega }= & {} (I-\omega D^{-1}L)^{-1}((1-\omega )I+\omega D^{-1}U). \end{aligned}$$

Let \(r=\epsilon /h\) and define \(\theta _c = \arcsin (1/r)\). Define, for \(\theta _c \le \theta \le \pi - \theta _c\), \(j^* = \lfloor \frac{1}{\sin \theta } \rfloor \le r\). Let W be the total weight (17), let \({\tilde{W}}\) and \({\tilde{w}}_{0,0}\) be as in (27). Then, in the limit as \(\mu \rightarrow \infty \), we have

$$\begin{aligned} W= & {} \sum _{j=1}^r \frac{1}{j}, \qquad {\tilde{w}}_{0,0}=(1-\sin \theta )(1-|\cos \theta |)\\ {\tilde{W}}= & {} {\left\{ \begin{array}{ll} \sum _{j=1}^r \frac{1}{j} - r \sin \theta &{} \text{ if } \theta \in (0,\theta _c] \cup [\pi -\theta _c,\pi ) \\ \sum _{j=1}^{j^*} \frac{1}{j} - j^* \sin \theta &{} \text{ if } \theta \in (\theta _c , \pi - \theta _c) \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \Vert J_{\omega }\Vert _{\infty }= & {} |1-\omega |+\omega \left( \frac{{\tilde{W}}-{\tilde{w}}_{0,0}}{W-{\tilde{w}}_{0,0}} \right) \quad \text{ for } \omega \in (0,2) \\ \Vert G_{\omega }\Vert _{\infty }= & {} \frac{|1-\omega |}{1-\omega \frac{{\tilde{W}}-{\tilde{w}}_{0,0}}{W-{\tilde{w}}_{0,0}}} \quad \text{ for } \omega \in (0,1], \\ \end{aligned}$$

where \(\Vert \cdot \Vert _{\infty }\) is the induced infinity matrix norm (28). The optimal \(\omega \in (0,2)\) is in both cases independent of \(\theta \) and equal to \(\omega = 1\), where we obtain

$$\begin{aligned} \Vert J_1\Vert _{\infty } = 1 - \frac{r\sin \theta }{\sum _{j=1}^r \frac{1}{j}-(1-\sin \theta )(1-|\cos \theta |)} \end{aligned}$$

if \(\theta \in (0, \theta _c] \cup [\pi -\theta _c,\pi ),\)

$$\begin{aligned} \Vert J_1\Vert _{\infty } =1 - \frac{\sum _{j=j^*+1}^r \frac{1}{j} + j^*\sin \theta }{\sum _{j=1}^r \frac{1}{j}-(1-\sin \theta )(1-|\cos \theta |)} \end{aligned}$$

if \(\theta \in (\theta _c, \pi - \theta _c)\), and

$$\begin{aligned} \Vert G_1\Vert _{\infty } = 0. \end{aligned}$$

Proof

“Appendix D”. \(\square \)

Corollary 1

For the special case of

$$\begin{aligned} \omega ^* = \left( 1 - \frac{{\tilde{w}}_{0,0}}{W}\right) = \left( 1-\frac{(1-\sin \theta )(1-|\cos \theta |)}{ \sum _{j=1}^r \frac{1}{j}}\right) \end{aligned}$$

that is, the parameter value equivalent to running Algorithm 1 with “semiImplicit” set to true, we obtain

$$\begin{aligned} \Vert G_{\omega ^*}\Vert _{\infty }= & {} \frac{{\tilde{w}}_{0,0}}{W-{\tilde{W}}+{\tilde{w}}_{0,0}}\\ \Vert J_{\omega ^*}\Vert _{\infty }= & {} \frac{{\tilde{W}}}{W} \end{aligned}$$

or

$$\begin{aligned} \Vert G_{\omega ^*}\Vert _{\infty }= & {} \frac{(1-\sin \theta )(1-|\cos \theta |)}{r\sin \theta +(1-\sin \theta )(1-|\cos \theta |)}\\ \Vert J_{\omega ^*}\Vert _{\infty }= & {} 1 - \frac{r\sin \theta }{\sum _{j=1}^r \frac{1}{j}} \end{aligned}$$

if \(\theta \in (0, \theta _c] \cup [\pi -\theta _c,\pi )\) and

$$\begin{aligned} \Vert G_{\omega ^*}\Vert _{\infty }= & {} \frac{(1-\sin \theta )(1-|\cos \theta |)}{\sum _{j=j^*+1}^r \frac{1}{j} + j^*\sin \theta +(1-\sin \theta )(1-|\cos \theta |)}\\ \Vert J_{\omega ^*}\Vert _{\infty }= & {} 1 - \frac{\sum _{j=j^*+1}^r \frac{1}{j} + j^*\sin \theta }{\sum _{j=1}^r \frac{1}{j}} \end{aligned}$$

if \(\theta \in (\theta _c, \pi - \theta _c)\). See Fig. 14 for a plot of \(\Vert G_{\omega ^*}\Vert _{\infty }\) and \(\Vert J_{\omega ^*}\Vert _{\infty }\) as a function of \(\theta \) for \(r=3\).

Proof

This follows from direct substitution of \(\omega ^*\) given by (19) into Proposition 1. \(\square \)
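The bounds of Corollary 1 are straightforward to evaluate numerically; the sketch below (our transcription of the formulas above, with r and theta as the only inputs) computes \(\Vert J_{\omega ^*}\Vert _{\infty }\) and \(\Vert G_{\omega ^*}\Vert _{\infty }\) for \(r=3\), yielding curves of the kind plotted in Fig. 14.

```python
# Evaluate the closed-form expressions of Corollary 1 (mu -> infinity limit).
import numpy as np

def corollary1_norms(theta, r=3):
    """Return (||J_{omega*}||_inf, ||G_{omega*}||_inf) for theta in (0, pi)."""
    s, c = np.sin(theta), abs(np.cos(theta))
    W = sum(1.0 / j for j in range(1, r + 1))
    w00 = (1 - s) * (1 - c)
    theta_c = np.arcsin(1.0 / r)
    if theta <= theta_c or theta >= np.pi - theta_c:
        leak = r * s                     # W - tilde(W) on the outer range
    else:
        jstar = int(np.floor(1.0 / s))   # j* = floor(1/sin(theta)) <= r
        leak = sum(1.0 / j for j in range(jstar + 1, r + 1)) + jstar * s
    return 1 - leak / W, w00 / (leak + w00)

for deg in (1, 2, 10, 45, 90, 170):
    J, G = corollary1_norms(np.radians(deg))
    print(f"theta = {deg:3d} deg:  ||J|| = {J:.4f}   ||G|| = {G:.4f}")
```

Evaluating over the full range of \(\theta \) shows \(\Vert G_{\omega ^*}\Vert _{\infty }\) staying uniformly small for \(r=3\), consistent with the bound \(\Vert G_{\omega ^*}\Vert _{\infty } \le 0.06\) cited below.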

Fig. 14
figure 14

Convergence rates of damped Jacobi and SOR for semi-implicit Guidefill: here we compare the experimentally measured convergence rates of the implementation of semi-implicit Guidefill outlined in the blue text of Algorithm 1 (\(r=3\), \(\mathbf{g}=(\cos \theta ,\sin \theta )\) and \(\mu = 100\)) with the theoretical bounds on \(\Vert J_{\omega ^*}\Vert _{\infty }\) and \(\Vert G_{\omega ^*}\Vert _{\infty }\) from Corollary 1. Note the excellent performance of SOR in comparison with damped Jacobi

As noted in Sect. 5, the implementation of semi-implicit Guidefill outlined in Algorithm 1 (blue text) is equivalent to solving the linear system (15) iteratively using damped Jacobi (parallel implementation) or SOR (sequential implementation), with relaxation parameter \(\omega ^*\) given by (19). Figure 14 compares the experimentally measured convergence rates of this implementation (\(r=3\), \(\mathbf{g}=(\cos \theta ,\sin \theta )\) and \(\mu = 100\)) with the theoretical bounds on \(\Vert J_{\omega ^*}\Vert _{\infty }\) and \(\Vert G_{\omega ^*}\Vert _{\infty }\) from Corollary 1. Specifically, (a) and (c) confirm experimentally the first bound in (29) in the cases \(M=J_{\omega ^*}\) and \(M=G_{\omega ^*}\), that is, damped Jacobi and SOR with relaxation parameter \(\omega ^*\), for the case \(\theta = 2^{\circ }\). The inpainting problem in this case is the same as in Fig. 12a, and all the parameters of semi-implicit Guidefill are the same. The “exact” solution \(u_h\) was found by first solving (15) to machine precision. In each case, we measured convergence rates only within the first “shell” of the inpainting problem. Next, (b) and (d) confirm experimentally the second bound in (29), as a function of \(\theta \). The inpainting problem is the same as the one in Fig. 5a, and all parameters are the same. In this case we vary \(\theta \) from \(1^{\circ }\) up to \(179^{\circ }\) in increments of one degree, in each case iteratively solving (15) (again, only for the first shell), computing \(R(\mathbf{e})\), and comparing with \(\Vert J_{\omega ^*}\Vert _{\infty }\) and \(\Vert G_{\omega ^*}\Vert _{\infty }\). Note the excellent performance of SOR in comparison with damped Jacobi.

Although our choice of \(\omega ^*\) is non-optimal, it is convenient to implement and the difference in performance is negligible. In particular, for \(r=3\) we have \(R(\mathbf{e}) \le \Vert G_{\omega ^*}\Vert _{\infty } \le 0.06\) independent of \(\theta \). Moreover, it follows from Corollary 1 that \(\Vert G_{\omega ^*}\Vert _{\infty }\) is a decreasing function of r for each fixed \(\theta \), so this bound holds for all \(r \ge 3\) as well.

6.3 Convergence of Algorithm 1 to Continuum and Asymptotic Limits

In this section we prove the convergence of the direct and semi-implicit forms of Algorithm 1 to an “asymptotic limit” and a “fixed-ratio continuum limit”. Then, in Sect. 7, we use these limits to explain the kinking and blur artifacts of Algorithm 1 observed in practice as outlined in Sect. 2. We begin by defining what these limits are:

Definition of the continuum limit We wish to prove convergence of the direct and semi-implicit forms of Algorithm 1 to a continuum limit u given by the transport equation

$$\begin{aligned} \nabla u \cdot \mathbf{g}_r^* = 0, \qquad u \Big |_{y = 0} = u_0 \Big |_{y = 0} \qquad u\Big |_{x = 0}=u\Big |_{x = 1} \end{aligned}$$
(30)

Specifically, we wish to prove convergence when we take \((h,\epsilon ) \rightarrow (0,0)\) along the path \(\epsilon = rh\) (with \(r \in \mathbb {N}\) fixed). Because of the assumptions we have made in Sect. 6.1, \(\mathbf{g}_r^*\) will turn out to be a constant equal to the center of mass of the stencil \(a^*_r\) with respect to the stencil weights \(\{ w_r(\mathbf{0},\mathbf{y}) : \mathbf{y} \in a^*_r \}\) (Definition 1), that is

$$\begin{aligned} \mathbf{g}_r^* = \frac{\sum _{\mathbf{y} \in a^*_r} w_r(0,\mathbf{y})\mathbf{y}}{\sum _{\mathbf{y} \in a^*_r} w_r(0,\mathbf{y})}. \end{aligned}$$
(31)

As we will allow discontinuous boundary data \(u_0\), the solution to (30) must be defined in a weak sense. However, since \(\mathbf{g}_r^*\) is a constant, this is simple. So long as \(\mathbf{g}_r^* \cdot e_2 \ne 0\), we simply define the solution to the transport problem (30) to be

$$\begin{aligned} u(\mathbf{x}) = u_0(\varPi _{\theta _r^*}(\mathbf{x})) \end{aligned}$$
(32)

where

$$\begin{aligned} \varPi _{\theta _r^*}(x,y)=(x-\cot (\theta _r^*)y \mod 1,0). \end{aligned}$$
(33)

We call \(\varPi _{\theta _r^*}: D \rightarrow \partial D\) the transport operator associated with (32). The \(\text{ mod }\) 1 is due to our assumed periodic boundary conditions and

$$\begin{aligned} \theta _r^* = \theta (\mathbf{g}_r^*) \in (0,\pi ) \end{aligned}$$

is the counterclockwise angle between the x-axis and the line \(L_{\mathbf{g}_r^*} := \{ \lambda \mathbf{g}_r^* : \lambda \in \mathbb {R} \}\).
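As a small worked example (ours; uniform weights over \(b^-_r\) are an arbitrary choice made for illustration), the transport direction (31) and the transport operator (33) can be computed directly:

```python
# A sketch of the continuum limit (30)-(33): the transport direction g*_r
# is the center of mass (31) of the stencil; boundary data is then
# extended into D along that direction by the operator Pi of (33).
import numpy as np

def transport_direction(stencil, weights):
    w = np.asarray(weights, float)
    y = np.asarray(stencil, float)
    return (w[:, None] * y).sum(axis=0) / w.sum()    # eq. (31)

def Pi(x, y, g):
    """x-coordinate of the transport operator (33); g[0]/g[1] = cot(theta*)."""
    return (x - (g[0] / g[1]) * y) % 1.0

r = 3
bm = [(n, m) for n in range(-r, r + 1) for m in range(-r, 0)
      if n * n + m * m <= r * r]                     # b^-_r, direct form
g_star = transport_direction(bm, np.ones(len(bm)))   # here (0, -18/11)
u0 = lambda x: 0.0 if (x % 1.0) < 0.5 else 1.0       # step boundary data
print(g_star, u0(Pi(0.25, 0.5, g_star)), u0(Pi(0.75, 0.5, g_star)))
```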

Remark 10

When \(u_0\) is smooth, convergence is straightforward to prove—we did so in [27, Theorem 1] for the direct form of Algorithm 1 using a simple argument based on Taylor series. However, in this paper we are interested in a more general setting where \(u_0\) may not be smooth and Taylor series may not be available. Instead of Taylor series, our main tool will be a connection to stopped random walks.

Definition of the asymptotic limit Unlike the continuum limit, which finds the limiting value of \(u_h\) when \((h,\epsilon ) \rightarrow (0,0)\) along the path \(\epsilon = rh\) with \(r \in \mathbb {N}\) fixed, the asymptotic limit gives the asymptotic dependence of \(u_h\) on h when \(r = \epsilon /h\) is fixed and \(h \ll 1\). For a fixed \(\mathbf{x }=(x,y) \in D_h\) the asymptotic limit is computed in the following two steps:

  1.

    Given discrete boundary data \(u_0 : {\mathcal {U}}_h \rightarrow \mathbb {R}\), generate new boundary data \({\tilde{u}}_0 : \partial {\mathcal {U}}_h \rightarrow \mathbb {R} \) from \(u_0\) by convolving it with a discrete convolution kernel

    $$\begin{aligned} {\tilde{g}}_{\sigma (y,h)}(x_1,x_2) = g_{\sigma (y,h)}(x_1)\varDelta (x_2), \end{aligned}$$

    where \(g_{\sigma (y,h)}\) is a one-dimensional Gaussian with variance dependent on both h and y, and \(\varDelta \) is a function that “mixes” the top r rows of \(u_0\) by taking a convex combination with weights independent of both h and y.

  2.

    \(u_h\) is now given by the solution to the same transport equation (30) as in the fixed-ratio continuum limit, but with the boundary data \(u_0\) replaced by \({\tilde{u}}_0 = {\tilde{g}}_{\sigma (y,h)} * u_0\). That is,

    $$\begin{aligned} u_h(x,y) = ({\tilde{g}}_{\sigma (y,h)} * u_0)(\varPi _{\theta _r^*}(x,y))+o(1). \end{aligned}$$

Remark 11

By \(({\tilde{g}}_{\sigma (y,h)} * u_0)(\mathbf{x})\) we mean the discrete circular convolution

$$\begin{aligned} ({\tilde{g}}_{\sigma (y,h)} * u_0)(\mathbf{x}):= & {} \sum _{\mathbf{y} \in {\mathcal {U}}_h} u_0(\mathbf{y}){\tilde{g}}_{\sigma (y,h)}(\mathbf{x}-\mathbf{y}), \end{aligned}$$

where the x-coordinate of \(\mathbf{x}-\mathbf{y}\) is defined modulo 1. Strictly speaking, for this definition to coincide with the usual definition of discrete convolution given in, for example, [35, Ch. 6], we must have \(\mathbf{x}-\mathbf{y} \in {\mathcal {U}}_h\). In our case \(\mathbf{x}\) may lie between pixel centers, so we do not make this restriction.
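To illustrate the smoothing step, here is a sketch (ours; \(\gamma ^2\), \(\mu _y\), and the boundary data are placeholder values, and the overshoot mixing \(\varDelta \) is omitted) of the one-dimensional Gaussian part of the convolution, using the variance scaling (45) converted to pixel units:

```python
# Circular convolution of one row of boundary data with the normalized
# discrete periodic Gaussian (48); the mollification width scales like
# sqrt(h), so the limit h -> 0 recovers sharp transport.
import numpy as np

N = 200; h = 1.0 / N
u0 = (np.arange(N) * h > 0.5).astype(float)   # one row of boundary data

def gauss_kernel(N, sigma):                   # normalized as in (48), mu = 0
    i = np.arange(N)
    d = np.minimum(i, N - i)                  # circular distance (46), pixels
    g = np.exp(-d ** 2 / (2 * sigma ** 2))
    return g / g.sum()

y, gamma2, mu_y = 0.5, 0.5, -1.0              # placeholder parameters
sigma_px = np.sqrt(gamma2 * y * h / abs(mu_y) ** 3) / h
u_tilde = np.real(np.fft.ifft(np.fft.fft(u0) * np.fft.fft(gauss_kernel(N, sigma_px))))
print(sigma_px, u_tilde[N // 2 - 3 : N // 2 + 3])   # the jump is now blurred
```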

6.4 Connection to Stopped Random Walks

The convergence of Algorithm 1 to both limits above is established based on a connection between Algorithm 1 and stopped random walks. The purpose of this section is to explain this connection.

To begin, note that the update formula (21) gives a relationship between \(u_h(\mathbf{x})\) and its immediate neighbors in \(a^*_r\), which for now we assume obeys \(a^*_r \subseteq b^-_r\) or \(a^*_r \subseteq b^0_r\) (if this is not the case, we can apply the method of equivalent weights from Sect. 4). Now suppose we modify (21) iteratively by repeated application of the following rule: for each \(\mathbf{y} \in a^*_r\), if \(\mathbf{x}+h\mathbf{y} \in D_h\), replace \(u_h(\mathbf{x}+h\mathbf{y})\) in the RHS of (21) with the RHS of a version of (21) where the LHS is evaluated at \(\mathbf{x}+h\mathbf{y}\) (in other words, we substitute (21) into itself). Otherwise, if \(\mathbf{x}+h\mathbf{y} \in {\mathcal {U}}_h\), we are already in the undamaged portion of the image, and we may replace \(u_h(\mathbf{x}+h\mathbf{y})\) with \(u_0(\mathbf{x}+h\mathbf{y})\). Repeat this procedure until \(u_h(\mathbf{x})\) is entirely expressed as a weighted sum of \(u_0 \Big |_{{\mathcal {U}}_h}\), that is

$$\begin{aligned} u_h(\mathbf{x}) = \sum _{\mathbf{y} \in {\mathcal {U}}_h} \rho (\mathbf{y}) u_0(\mathbf{y}), \end{aligned}$$
(34)

for some as yet unknown weights \(\rho \). Writing \(\mathbf{x}=(nh,mh)\), for the direct form of Algorithm 1 this procedure terminates after m steps, as in this case (21) expresses \(u_h(\mathbf{x})\) in terms of neighbors at least h units below it. For the semi-implicit extension, on the other hand, (34) has to be interpreted as a limit.

Fig. 15
figure 15

Connection to stopped random walks: here we illustrate the connection between the elimination procedure described in the text and stopped random walks. In a the pixel \(\mathbf{x}\) (colored black) is expressed as a weighted average (1) of its neighbors in \(b^-_r\) (colored in red). In this case we use the direct form of Algorithm 1 with \(r=3\) and uniform weights (which is why each neighbor in \(b^-_r\) is the same shade of red). In b–d we have applied \(k=4\), \(k=17\), and \(k=40\) steps of our elimination procedure (which essentially consists of substituting (1) into itself repeatedly—details in the text), and \(u_h(\mathbf{x})\) is now expressed as a weighted sum of \(u_h(\mathbf{y})\) for whichever pixels \(\mathbf{y}\) are colored red, with darker shades of red indicating greater weight. The purpose of this procedure is to express the color of a given pixel \(\mathbf{x}\) deep inside the inpainting domain entirely in terms of the colors of known pixels below the line \(y=0\) (shown in blue)

This elimination procedure has a natural interpretation in terms of stopped random walks. Since the weights \(\{ \frac{w(\mathbf{0},\mathbf{y})}{W} \}_{\mathbf{y} \in a^*_r}\) are non-negative and sum to 1, we can interpret them as the density of a two dimensional random vector \(\mathbf{Z} := (U,V)\) taking values in \(b^-_r\) or \(b^0_r\) with density

$$\begin{aligned} P(\mathbf{Z}_i =\mathbf{y})=\frac{w_r(\mathbf{0},\mathbf{y})}{W}. \end{aligned}$$
(35)

Moreover, defining the random walk

$$\begin{aligned} \mathbf{X}_j :=(X_j,Y_j) = (nh,mh) + h \sum _{i=1}^j \mathbf{Z}_i \end{aligned}$$
(36)

with \(\{ \mathbf{Z}_i \}\) i.i.d. and equal to \(\mathbf{Z}\) in distribution, and defining

$$\begin{aligned} \tau = \inf \{ j : \mathbf{X}_j \in {\mathcal {U}}_h \}=\inf \{ j : Y_j \le 0 \}, \end{aligned}$$
(37)

then after k steps of elimination, we have

$$\begin{aligned} u_h(\mathbf{x}) = \sum _{\mathbf{y} \in \varOmega _h} \rho _{\mathbf{X}_{k \wedge \tau }}(\mathbf{y}) u_h(\mathbf{y}), \end{aligned}$$

where \(\rho _{\mathbf{X}_{k \wedge \tau }}\) denotes the density of \(\mathbf{X}_{k \wedge \tau }\). Denoting the mean of \(\mathbf{Z}\) by \((\mu _x,\mu _y)\), note that by (31) we have the equivalence

$$\begin{aligned} (\mu _x, \mu _y) = \mathbf{g}^*_r. \end{aligned}$$

In other words, the mean of \(\mathbf{Z}\) is precisely the transport direction of our limiting Eq. (30). The condition \(\mathbf{g}^*_r \cdot e_2 \ne 0\), which we needed for (30) to be defined, implies \(\mu _y < 0\). In the nomenclature of random walks, this means that \(\mathbf{X}_k\) has negative drift in the y direction, while \(\tau \) is the first passage time through \(y=0\). Fortunately, this type of random walk and this type of first passage time have been studied and are well understood. See for example [24, Chapter 4], [22, 23, 25]. The book [24] also provides a good overview of stopped random walks in general. In particular, we know immediately that \(\tau \) is a stopping time, \(P(\tau = \infty )=0\), and \(\tau \) has finite moments of all orders [24, Chapter 3, Theorems 1.1 and 3.1]. It follows that

$$\begin{aligned} u_h(\mathbf{x}) = \sum _{\mathbf{y} \in {\mathcal {U}}_h} \rho _{\mathbf{X}_{\tau }}(\mathbf{y}) u_0(\mathbf{y}) = E[u_0(\mathbf{X}_{\tau })]. \end{aligned}$$
(38)

See Fig. 15 for an illustration of these ideas.
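The representation (38) can also be checked directly by Monte Carlo simulation; the sketch below (ours, with uniform weights on \(b^-_r\) standing in for a concrete method) estimates \(u_h(\mathbf{x}) = E[u_0(\mathbf{X}_{\tau })]\) for step boundary data.

```python
# Monte Carlo realization of (35)-(38): simulate the stopped walk from a
# pixel (x, y) and average the boundary data at the exit point.
import numpy as np

rng = np.random.default_rng(1)
r, h = 3, 1.0 / 200
bm = np.array([(n, m) for n in range(-r, r + 1) for m in range(-r, 0)
               if n * n + m * m <= r * r], dtype=float)
p = np.ones(len(bm)) / len(bm)           # increment density (35), uniform

def u0(x):                               # known image data on y <= 0
    return float((x % 1.0) > 0.5)

def sample(x, y):
    while y > 0:                         # stopping time tau of (37)
        n, m = bm[rng.choice(len(bm), p=p)]
        x, y = x + h * n, y + h * m      # one step of the walk (36)
    return u0(x)

est = np.mean([sample(0.6, 0.3) for _ in range(2000)])
print(est)    # approximates u_h(0.6, 0.3); transport here is vertical
```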

Lattice coordinates and absolute coordinates

Our derivation of the asymptotic limit leverages known results for stopped random walks. However, the results most relevant to us concern random walks on the integer lattice \(\mathbb {Z}^2\), whereas our random walk is on the scaled lattice \(h \cdot \mathbb {Z}^2\). In order to facilitate conversion between our setting and the setting of the results we quote, we make the following definition:

Definition 2

Given \(\mathbf{x}=(x,y) = (ih,jh) \in D_h \subset h \cdot \mathbb {Z}^2\), we refer to \((i,j) \in \mathbb {Z}^2\) as the lattice coordinates of \(\mathbf{x}\), while we refer to \((x,y)=(ih,jh)\) as the absolute coordinates of \(\mathbf{x}\).

Overshoot and the ladder process associated with \(\mathbf{X}_{\tau }\)

In general, we expect that the stopped random walk \(\mathbf{X}_{\tau }\) will overshoot the line \(y=0\). The (asymptotic) nature of the overshoot probability density is well understood and is described briefly here (we will need it in Sect. 6.5). However, in order to describe it, we first need to define the ladder process associated with \(\mathbf{X}_{\tau }\). The strict ladder process \(\{n_i, S_{n_i} \}_{i \ge 1}\) associated with the y-coordinate of \(\mathbf{X}_{\tau }\) is given by

$$\begin{aligned} S_n= & {} \sum _{i=1}^n V_i, \end{aligned}$$
(39)
$$\begin{aligned} n_1= & {} \inf \{n: V_n < 0\}\end{aligned}$$
(40)
$$\begin{aligned} n_{i+1}= & {} \inf \{n: S_n < S_{n_{i}}\}. \end{aligned}$$
(41)

Following the notation in [36], we denote by H the law of \(V_{n_1}\), and write \(\nu _y = E[V_{n_1}]\). The (asymptotic) overshoot function \(\varDelta \) is then given by the following identity  [36, Equation (1.4)]:

$$\begin{aligned} \varDelta (kh) = H((-\infty ,k))/\nu _y \end{aligned}$$
(42)

in absolute coordinates, or

$$\begin{aligned} \varDelta (k) = H((-\infty ,k))/\nu _y \end{aligned}$$
(43)

in lattice coordinates (Definition 2). Note that H here is a probability measure, so that \(H((-\infty ,k))\) refers to the measure of the interval \((-\infty ,k)\).

6.5 Derivation of the Asymptotic Limit

The key idea of our asymptotic limit is to combine (38) with known results on the asymptotic distribution of \(\rho _{\mathbf{X}_{\tau }}\) started from \((x,y)=(nh,mh) \in D_h\) when m is very large (for fixed y, this is equivalent to h very small). This problem has been considered by a number of authors, including [24, Chapter 4],  [22, 23, 25], where it is shown that \(\rho _{\mathbf{X}_{\tau }}\) obeys a central limit theorem, asymptotically approaching a normal distribution. However, for our purposes it is not enough to know the asymptotic distribution of the stopped walk for \(m \gg 1\), we must also know the rate of convergence as a function of m. While many authors have considered the former, to our knowledge only [30, 36] have considered the latter. Our results here are based on [36], specifically the corollary immediately following Theorem 3.3. The following theorem restates this result, along with a modified version adapted to our periodic boundary conditions.

Theorem 1

Let \(\mathbf{X}_{\tau }:=(X_{\tau },Y_{\tau }) \in \mathbb {Z}^2\) be a stopped random walk started from \((n,m) \in \mathbb {Z}^2\) and with i.i.d. increments \(\varDelta \mathbf{X}_i = \mathbf{Z}_i\). Assume that \(\mathbf{X}_{\tau }\) has negative drift, that is \((\mu _x,\mu _y) =E[\mathbf{Z}_i]\) obeys \(\mu _y < 0\) and define the stopping time \(\tau \) by

$$\begin{aligned} \tau := \inf \{ n : Y_n < 0 \}. \end{aligned}$$

Define the characteristic function \(\varphi _{\mathbf{Z}_i}\) of \(\mathbf{Z}_i\) by

$$\begin{aligned} \varphi _{\mathbf{Z}_i}(\mathbf{u}) = E[e^{\sqrt{-1}{} \mathbf{{Z_i}}\cdot \mathbf{u}}] \end{aligned}$$

and assume that \(|\varphi _{\mathbf{Z}_i}|<1\) unless each component of \(\mathbf{u}\) is an integer multiple of \(2\pi \). Then

$$\begin{aligned} \lim _{m \rightarrow \infty } \sqrt{m}|\rho _{\mathbf{X}_{\tau }}(i,j)-{\tilde{G}}_{\sigma (m),\mu }(i,j)| =0 \end{aligned}$$
(44)

uniformly in \((i,j) \in \mathbb {Z}^2\), where

$$\begin{aligned} {\tilde{G}}_{\sigma (m),\mu }(i,j)= & {} G_{\sigma (m),\mu }(i)\varDelta (j) \\ G_{\sigma (m),\mu }(i)= & {} \frac{e^{-\frac{(i-\mu )^2}{2\sigma (m)^2}}}{\sqrt{2\pi } \sigma (m)} \end{aligned}$$

with

$$\begin{aligned} \sigma (m)^2 = \frac{\gamma ^2 m}{|\mu _y|^3} \quad \text{ where } \gamma ^2 = {\text {Var}}(\mu _x V_1 - \mu _y U_1). \end{aligned}$$
(45)

and

$$\begin{aligned} \mu = n - \frac{\mu _x}{\mu _y}m \end{aligned}$$

and \(\varDelta (j)\) is given by (43). Moreover, if the domain of \(\mathbf{X}_{\tau }\) is changed to

$$\begin{aligned} \mathbf{X}_{\tau } \in \{0,1,\ldots ,N-1\} \times \{ -(r+2),-(r+1),\ldots ,N\} \end{aligned}$$

with periodic boundary condition \(0 \sim N\) in the x-direction, and if \(m \le N\) while \(\mathbf{Z}_i := (U_i,V_i)\) obeys \(V_i \le 0\), then the same result holds but with \(G_{\sigma (m),\mu }(i)\) replaced by

$$\begin{aligned} G^*_{\sigma (m),\mu ,N}(i) = \frac{e^{-\frac{d^{{\text {circ}}}_{N}(i,\mu ^*)^2}{2\sigma (m)^2}}}{\sqrt{2\pi } \sigma (m)} \end{aligned}$$

where

$$\begin{aligned} \mu ^* = n - \frac{\mu _x}{\mu _y}m \mod N. \end{aligned}$$

and \(d^{{\text {circ}}}_{N}\) denotes the circular distance defined for \(x,y \in \mathbb {R}\) by

$$\begin{aligned} d^{{\text {circ}}}_{N}(x,y) = \min (x-y \mod N,y-x \mod N) \end{aligned}$$
(46)

If \(\mu = 0\), we write \(G_{\sigma }\) and \(G^*_{\sigma ,N}\) in place of \(G_{\sigma ,0}\) and \(G^*_{\sigma ,0,N}\).

Proof

The first statement follows from the (un-numbered) Corollary immediately following Theorem 3.3 in [36]. The second statement is a simple corollary, the details of which may be found in “Appendix F”. \(\square \)

The asymptotic limit follows straightforwardly from Theorem 1 and (38) once we have completed the following tasks:

  1.

    The second statement of Theorem 1 gives the asymptotic probability that \(X_{\tau } = i\) for every \(i=0,\ldots ,N-1\) for our random walk with periodic boundary conditions. However, the Gaussian \(G^*_{\sigma (m),\mu ,N}\) that \(X_{\tau }\) converges to is not normalized with respect to summation over \(\{i\}_{i=0}^{N-1}\). Lemma 1 addresses this problem by allowing us to convert statements about \(G^*_{\sigma (m),N}\) into statements about \(g^*_{\sigma (m),N}\), a discrete Gaussian that is normalized with respect to summation over the set \(\{i\}_{i=0}^{N-1}\).

  2.

    The assumption in Theorem 1 on the characteristic function of \(\mathbf{Z}_i\) can be inconvenient to check. Definition 3 gives a geometric condition (which can typically be verified by inspection) that Lemma 2 proves to be equivalent.

  3.

    Theorem 1 is expressed with respect to the integer lattice \(\mathbb {Z}^2\), whereas our random walk is on the scaled lattice \(h \cdot \mathbb {Z}^2\). Lemma 3 uses Theorem 1 combined with Lemmas 1 and 2 to prove a version of Theorem 1 applicable to our setting.

  4.

    We need to derive an expression for \(\varDelta (x_2)\). This is done in Lemma 4.

  5.

    We need to derive a bound on the probability of \(\mathbf{X}_{\tau }\) deviating significantly from \(\varPi _{\theta _r^*}(\mathbf{x})\). This is done in Lemma 5.

Once the above items are taken care of, we prove convergence to the asymptotic limit in Theorem 2.

We begin with task one. For now, we work in lattice coordinates (Definition 2). Let us denote by \(G_{\sigma ,\mu }\) and \(G^*_{\sigma ,\mu ,N}\) the standard and periodic Gaussians with mean \(\mu \) and variance \(\sigma ^2\) defined for \(x \in \mathbb {R}\) and \(x \in [0,N)\) respectively by

$$\begin{aligned} G_{\sigma ,\mu }(x) = \frac{e^{-\frac{(x-\mu )^2}{2\sigma ^2}}}{\sqrt{2\pi }\sigma } \qquad G^*_{\sigma ,\mu ,N}(x) = \frac{e^{-\frac{d^{{\text {circ}}}_{N}(x,\mu )^2}{2\sigma ^2}}}{\sqrt{2\pi }\sigma } . \end{aligned}$$
(47)

These are the functions that \(X_{\tau }\) converges to in the two forms of Theorem 1 (infinite domain and periodic domain). However, neither of them is a probability density. This is because while

$$\begin{aligned} \int _{\mathbb {R}} G_{\sigma ,\mu }(x)dx=1 \end{aligned}$$

is true, the domain of \(X_{\tau }\) in the infinite case is not \(\mathbb {R}\), but rather \(\mathbb {Z}\), and

$$\begin{aligned} \sum _{i \in \mathbb {Z}} G_{\sigma ,\mu }(i) \ne 1. \end{aligned}$$

For \(G^*_{\sigma ,\mu ,N}\), the situation is worse: we have

$$\begin{aligned} \int _{0}^{N} G^*_{\sigma ,\mu ,N}(x)dx \ne 1 \quad \text{ and } \quad \sum _{i=0}^{N-1} G^*_{\sigma ,\mu ,N}(i) \ne 1. \end{aligned}$$

We therefore consider it more natural to work with the discrete Gaussians

$$\begin{aligned} g_{\sigma ,\mu }(i)= & {} \frac{e^{-\frac{(i-\mu )^2}{2\sigma ^2}}}{\sum _{k \in \mathbb {Z}} e^{-\frac{(k-\mu )^2}{2\sigma ^2}}} \nonumber \\ g^*_{\sigma ,\mu ,N}(i)= & {} \frac{e^{-\frac{d^{{\text {circ}}}_{N}(i,\mu )^2}{2\sigma ^2}}}{\sum _{k=0}^{N-1} e^{-\frac{d^{{\text {circ}}}_{N}(k,\mu )^2}{2\sigma ^2}}} \end{aligned}$$
(48)

which are normalized by construction (similarly to \(G_{\sigma ,\mu }\) and \(G^*_{\sigma ,\mu ,N}\), when \(\mu = 0\) we simply write \(g_{\sigma }\) and \(g^*_{\sigma ,N}\)). The following lemma allows us to convert statements about \(G^*_{\sigma ,\mu ,N}\) into statements about \(g^*_{\sigma ,\mu ,N}\). A nearly identical argument would allow us to do the same for \(G_{\sigma ,\mu }\) and \(g_{\sigma ,\mu }\), but we do not pursue this here. Lemma 1 also contains a result regarding discrete sums of Gaussians which is, to the best of our knowledge, novel and of interest in its own right.

Lemma 1

  i.

    There is a constant \(C>0\) such that for \(|\sigma | \ge 1\) we have

    $$\begin{aligned} \left| \sum _{n \in \mathbb {Z}} e^{-\frac{(n-\mu )^2}{2\sigma ^2}} - \sqrt{2\pi }\sigma \right| \le C \sigma e^{-2 \pi ^2 \sigma ^2}\end{aligned}$$
    (49)
  ii.

    Let \(G^*_{\sigma ,N}\) and \(g^*_{\sigma ,N}\) be defined as above, with common mean \(\mu \) and variance \(\sigma ^2\). Further suppose that \(\sigma ^2\) is proportional to \(m \in \mathbb {N}\), that is \(\sigma ^2(m) = {\tilde{\sigma }}^2 m\) for some constant \({\tilde{\sigma }}^2> 0\). Suppose further that \(m \le N\) (recall that \(N=1/h\)). Then there exist constants \(A > 0\) and \(b >0 \) such that

    $$\begin{aligned} |G^*_{\sigma (m),N}(x)-g^*_{\sigma (m),N}(x)|<Am^{\frac{1}{2}}e^{-bm} \end{aligned}$$

    uniformly in \(x \in \mathbb {R}\) for m sufficiently large.

Proof

We prove only the first statement. The second statement is then derived from the first together with routine tail estimates on Gaussians; the details are provided in “Appendix G”.

First we note that our sum is closely related to Jacobi theta functions of the third type, defined by [5]

$$\begin{aligned} \vartheta _3(q) = \sum _{n \in \mathbb {Z}} q^{n^2}. \end{aligned}$$

This function is known explicitly for a few values of q, for example

$$\begin{aligned} \vartheta _3(e^{-\pi })=\frac{\pi ^{\frac{1}{4}}}{\Gamma (\frac{3}{4})}. \end{aligned}$$
(50)

We will use this identity later on.

Our argument is based on the Poisson summation formula [2, pp. 322–333], which states that if \(f : \mathbb {R} \rightarrow \mathbb {R}\) is a Schwartz function (that is, f and all of its derivatives decay faster than \(|x|^{-N}\) for every \(N \in \mathbb {N}\) as \(|x| \rightarrow \infty \)), then

$$\begin{aligned} \sum _{n \in \mathbb {Z}} f(n) = \sum _{n \in \mathbb {Z}} \int ^{\infty }_{-\infty } f(x) e^{-i2\pi n x}dx. \end{aligned}$$

Choosing \(f(x) = e^{-\frac{(x-\mu )^2}{2\sigma ^2}}\) gives

$$\begin{aligned} \sum _{n \in \mathbb {Z}} e^{-\frac{(n-\mu )^2}{2\sigma ^2}}= & {} \sum _{n \in \mathbb {Z}} \int ^{\infty }_{-\infty } e^{-\frac{(x-\mu )^2}{2\sigma ^2}} e^{-i2\pi n x}dx \\= & {} \sum _{n \in \mathbb {Z}} \int ^{\infty }_{-\infty } e^{-\frac{(x-\mu )^2}{2\sigma ^2}} \cos (2\pi n x)dx, \end{aligned}$$

where on line two, noting that the LHS is real, we have eliminated the imaginary part of the RHS as well. Changing variables in the integral to \(y=x-\mu \) gives

$$\begin{aligned} \sum _{n \in \mathbb {Z}} e^{-\frac{(n-\mu )^2}{2\sigma ^2}}= & {} \sum _{n \in \mathbb {Z}} \int ^{\infty }_{-\infty } e^{-\frac{y^2}{2\sigma ^2}} \cos (2\pi n y+2\pi n \mu )dy \\= & {} \sum _{n \in \mathbb {Z}} \cos (2\pi n \mu ) \int ^{\infty }_{-\infty } e^{-\frac{y^2}{2\sigma ^2}} \cos (2\pi n y)dy \end{aligned}$$

Since the \(\sin \) terms in the expansion of \(\cos (2\pi n y+2\pi n \mu )\) are odd functions, they integrate to 0, and we have eliminated them in line two. But it is well known that

$$\begin{aligned} \int ^{\infty }_{-\infty } e^{-\frac{y^2}{2\sigma ^2}} \cos (2\pi n y)dy = \sqrt{2\pi }\sigma e^{-2\pi ^2 \sigma ^2 n^2} \end{aligned}$$

(this follows from the identity for the Fourier Transform of a Gaussian [8]). Hence

$$\begin{aligned} \sum _{n \in \mathbb {Z}} e^{-\frac{(n-\mu )^2}{2\sigma ^2}}=\sqrt{2\pi }\sigma \sum _{n \in \mathbb {Z}} e^{-2\pi ^2 \sigma ^2 n^2}\cos (2\pi n \mu ) \end{aligned}$$

and therefore (using the fact that \(|\sigma | \ge 1\) on line four):

$$\begin{aligned} \left| \sum _{n \in \mathbb {Z}} e^{-\frac{(n-\mu )^2}{2\sigma ^2}}-\sqrt{2\pi }\sigma \right|\le & {} 2\sqrt{2\pi }\sigma \sum ^{\infty }_{n=1} e^{-2\pi ^2 \sigma ^2 n^2}|\cos (2\pi n \mu )| \\\le & {} 2\sqrt{2\pi }\sigma \sum ^{\infty }_{n=1} e^{-2\pi ^2 \sigma ^2 n^2} \\= & {} 2\sqrt{2\pi }\sigma e^{-2\pi ^2\sigma ^2} \sum _{n=1}^{\infty } e^{-2\pi ^2\sigma ^2(n^2-1)} \\\le & {} 2\sqrt{2\pi }\sigma e^{-2\pi ^2\sigma ^2} \sum _{n=1}^{\infty } e^{-\pi (n^2-1)} \\= & {} \sqrt{2\pi }e^{\pi }\left( \vartheta _3(e^{-\pi })-1\right) \sigma e^{-2\pi ^2\sigma ^2}. \end{aligned}$$

Using (50), claim i. then follows with

$$\begin{aligned} C= & {} \sqrt{2\pi }e^{\pi }\left( \vartheta _3(e^{-\pi })-1\right) \\= & {} \sqrt{2\pi }e^{\pi }\left( \frac{\pi ^{\frac{1}{4}}}{\Gamma (3/4)}-1\right) \approx 5.014. \end{aligned}$$

\(\square \)

Remark 12

Although statement i. of Lemma 1 is stated asymptotically, in practice the convergence is so rapid that we have

$$\begin{aligned} \sum _{n \in \mathbb {Z}} e^{-\frac{n^2}{2\sigma ^2}} = \sqrt{2\pi }\sigma \end{aligned}$$

to machine precision for \(|\sigma |>2\). This has been noted empirically by other authors, e.g. [20, p. 288], but as far as we know (49) is the first explicit bound. Note also that our bound (49) is extremely tight. For \(\sigma =1\), the first value of \(\sigma \) for which our bound is valid, the left hand side and right hand side of (49) differ by \(O(10^{-12})\). For \(\sigma > 1\), the bound rapidly becomes even tighter.
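Statement i. is also easy to verify numerically; the script below (our own check, truncating the lattice sum at \(|n| \le 200\)) confirms (49) for values of \(\sigma \) where the difference is still resolvable in double precision.

```python
# Numerical check of the bound (49); beyond sigma ~ 1.3 both sides fall
# below double-precision rounding error, in line with the remark above.
import numpy as np
from math import gamma, pi, exp, sqrt

C = sqrt(2 * pi) * exp(pi) * (pi ** 0.25 / gamma(0.75) - 1)   # ~ 5.014
n = np.arange(-200, 201)
mu = 0.3
for sigma in (1.0, 1.1, 1.2):
    lhs = abs(np.exp(-(n - mu) ** 2 / (2 * sigma ** 2)).sum()
              - sqrt(2 * pi) * sigma)
    rhs = C * sigma * exp(-2 * pi ** 2 * sigma ** 2)
    print(sigma, lhs <= rhs, lhs, rhs)
```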

Having established a connection between discrete and continuous Gaussians, we now move on to task number two and derive a necessary and sufficient condition for the characteristic function \(\varphi _{\mathbf{Z}_i}\) of \(\mathbf{Z}_i\) to satisfy the conditions of Theorem 1.

Definition 3

We say that a stencil \(a^*_r\) is “confined to a sublattice of \(\mathbb {Z}^2\)” if there exist \((n,m)\in \mathbb {Z}^2\) and \(k_1,k_2 \in \mathbb {Z}\), with at least one of \(k_1,k_2\) greater than one in magnitude, such that the density of \(\text{ Supp }(a^*_r)\) is zero outside of \((n,m)+k_1\mathbb {Z}\times k_2\mathbb {Z}\). We say that \(a^*_r\) is not confined to any sublattice of \(\mathbb {Z}^2\) if no such \((n,m)\) and \(k_1,k_2\) exist.

Lemma 2

The characteristic function of \(\mathbf{Z}_i\), defined by

$$\begin{aligned} \varphi _{\mathbf{Z}_i}(\mathbf{u}) = E[e^{\sqrt{-1}{} \mathbf{{Z_i}}\cdot \mathbf{u}}] \end{aligned}$$

obeys \(|\varphi _{\mathbf{Z}_i}|<1\) unless each component of \(\mathbf{u}\) is an integer multiple of \(2\pi \) if and only if the associated stencil \(a^*_r\) is not confined to any sublattice of \(\mathbb {Z}^2\) (in the sense of Definition 3).

Proof

First assume that \(a^*_r\) is not confined to any sublattice of \(\mathbb {Z}^2\). For convenience, let us write \({\tilde{w}}_\mathbf{y} := {\tilde{w}}_r(\mathbf{0},\mathbf{y})\). By the definition of expectation we have

$$\begin{aligned} \varphi _{\mathbf{Z}_i}(\mathbf{u}) =E[e^{\sqrt{-1}{} \mathbf{{Z_i}}\cdot \mathbf{u}}]=\frac{\sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} {\tilde{w}}_\mathbf{y}e^{\sqrt{-1}{} \mathbf{y} \cdot \mathbf{u}}}{W}. \end{aligned}$$

Since \(W \equiv \sum _{\mathbf{y} \in \text{ Supp }(a^*_r)}{\tilde{w}}_\mathbf{y}\) we have \(|\varphi _{\mathbf{Z}_i}(\mathbf{u})| \le 1\), with equality if and only if \(e^{\sqrt{-1}{} \mathbf{{y}}\cdot \mathbf{u}}\) have the same phase for all \(\mathbf{y} \in \text{ Supp }(a^*_r)\) such that \({\tilde{w}}_\mathbf{y} >0\). Since \(a^*_r\) is not confined to any sublattice of \(\mathbb {Z}^2\), there must exist a \(\mathbf{y}_1 \in \text{ Supp }(a^*_r)\) such that \({\tilde{w}}_{\mathbf{y}_1}>0\) and \({\tilde{w}}_{\mathbf{y}_1+e_1}>0\). Thus

$$\begin{aligned} |\varphi _{\mathbf{Z}_i}(\mathbf{u})| = 1\Rightarrow & {} e^{\sqrt{-1}{} \mathbf{{y}_1}\cdot \mathbf{u}}=e^{\sqrt{-1}{(\mathbf {y}_1}+e_1)\cdot \mathbf{u}} \\\Rightarrow & {} e^{\sqrt{-1}{} \mathbf{u} \cdot e_1}=1 \end{aligned}$$

after canceling a common factor of \(e^{\sqrt{-1}{} \mathbf{y}_1 \cdot \mathbf{u}}\). But then \(\mathbf{u} \cdot e_1\) must be an integer multiple of \(2\pi \), and a similar argument with \(e_2\) in place of \(e_1\) proves that the same must be true of \(\mathbf{u} \cdot e_2\).

Now assume that \(a^*_r\) is confined to some sublattice \((n,m)+k_1\mathbb {Z} \times k_2\mathbb {Z}\) of \(\mathbb {Z}^2\). Then defining \(\mathbf{u} := (u_1,u_2)\) and \(\mathbf{y}=(y_1,y_2)\) we can write

$$\begin{aligned} |\varphi _{\mathbf{Z}_i}(\mathbf{u})|= & {} \big |\sum _{\mathbf{y} \in \mathbb {Z}^2} \frac{{\tilde{w}}_\mathbf{y}}{W} e^{\sqrt{-1}(n+y_1k_1,m+y_2k_2)\cdot \mathbf{u}}\big | \\= & {} \big |e^{\sqrt{-1}(n,m) \cdot \mathbf{u}}\sum _{\mathbf{y} \in \mathbb {Z}^2}\frac{{\tilde{w}}_\mathbf{y}}{W} e^{\sqrt{-1}(y_1k_1,y_2k_2)\cdot \mathbf{u}}\big | \\= & {} \big |\sum _{\mathbf{y}\in \mathbb {Z}^2}\frac{{\tilde{w}}_\mathbf{y}}{W} e^{\sqrt{-1}(y_1k_1,y_2k_2)\cdot \mathbf{u}}\big | \end{aligned}$$

If \(u_1 = 2\pi /k_1\) and \(u_2 = 2\pi /k_2\), then we have

$$\begin{aligned} |\varphi _{\mathbf{Z}_i}(\mathbf{u})| = \sum _{\mathbf{y} \in \mathbb {Z}^2} \frac{{\tilde{w}}_\mathbf{y}}{W} = 1. \end{aligned}$$

Since at least one of \(k_1, k_2\) is greater than one in magnitude, at least one of \(u_1\), \(u_2\) is not an integer multiple of \(2\pi \). \(\square \)
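The dichotomy of Lemma 2 is easy to observe numerically; in the sketch below (ours; the two stencils and the test point \(\mathbf{u}=(\pi ,\pi )\) are arbitrary illustrative choices), a stencil supported on the even sublattice \(2\mathbb {Z} \times 2\mathbb {Z}\) attains \(|\varphi _{\mathbf{Z}_i}(\mathbf{u})| = 1\) off the \(2\pi \) lattice, while an unconfined one does not.

```python
# Evaluate |phi_Z(u)| for a stencil not confined to a sublattice and for
# one confined to 2Z x 2Z; only the latter attains 1 at u = (pi, pi).
import numpy as np

def abs_char_fn(points, u):
    p = np.full(len(points), 1.0 / len(points))     # uniform density (35)
    return abs(np.sum(p * np.exp(1j * (np.asarray(points, float) @ u))))

unconfined = [(0, -1), (1, -1), (-1, -1), (1, 0)]   # adjacent support points
confined = [(0, -2), (2, -2), (-2, 0)]              # inside 2Z x 2Z
u = np.array([np.pi, np.pi])
print(abs_char_fn(unconfined, u))   # 0.0  (< 1)
print(abs_char_fn(confined, u))     # 1.0
```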

Next, we use the above lemma together with Lemma 1 to prove a version of Theorem 1 expressed in terms of a discrete Gaussian \(g^*_{\sigma (y,h),\mu ^*,1}\) rather than \(G^*_{\sigma (m),\mu ^*,N}\).

Lemma 3

Let \(a^*_r\) be a stencil not confined to any sublattice of \(\mathbb {Z}^2\) in the sense of Definition 3; equivalently, assume that the characteristic function of \(\mathbf{Z}_i\), that is

$$\begin{aligned} \varphi _{\mathbf{Z}_i}(\mathbf{u}) = \sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} \frac{w_r(\mathbf{0},\mathbf{y}) e^{\sqrt{-1}{} \mathbf{y} \cdot \mathbf{u}}}{W}, \end{aligned}$$

obeys \(|\varphi _{\mathbf{Z}_i}(\mathbf{u})| < 1\) unless both components of \(\mathbf{u}\) are integer multiples of \(2\pi \). Let \(\mathbf{x}=(x,y) \in D_h\), and let \(\mathbf{X}_{\tau }\) denote the stopped random walk started from \(\mathbf{x}\) described in Sect. 6.4. Then for y fixed, we have

$$\begin{aligned} \rho _{X_{\tau }}(ih,jh)={\tilde{g}}_{\sigma (y,h),{\hat{x}}}(ih,jh)+o(\sqrt{h}) \end{aligned}$$

where

$$\begin{aligned} {\tilde{g}}_{\sigma (y,h),{\hat{x}}}(ih,jh)=g^*_{\sigma (y,h),{\hat{x}},1}(ih)\varDelta (jh) \end{aligned}$$

with \(\varDelta (jh)\) given by (42) and where \(g^*_{\sigma (y,h),{\hat{x}},1}\) is the one dimensional discrete periodic gaussian (48) with mean

$$\begin{aligned} {\hat{x}}=x - \frac{\mu _x}{\mu _y}y \mod 1 \equiv \varPi _{\theta _r^*}(x,y), \end{aligned}$$

where \(\varPi _{\theta _r^*}\) is the transport operator (33), and with variance

$$\begin{aligned} \sigma (y,h)^2 = \frac{\gamma ^2 y h}{|\mu _y|^3} \quad \text{ where } \gamma ^2 = {\text {Var}}(\mu _x V_1 - \mu _y U_1). \end{aligned}$$

Proof

Let \(\mathbf{x}=(x,y) = (nh,mh)\). By Theorem 1 we have

$$\begin{aligned} \rho _{X_{\tau }}(ih,jh)=G^*_{\sigma (m),\mu ^*,N}(i)\varDelta (jh)+o\left( \frac{1}{\sqrt{m}}\right) , \end{aligned}$$

where \(G^*_{\sigma (m),\mu ^*,N}\) is defined by (47), has mean \(\mu ^* = {\hat{n}} := {\hat{x}}/h\) and variance \(\sigma (m)^2 = \frac{\gamma ^2 m}{|\mu _y|^3}:={\tilde{\sigma }}^2m\). By Lemma 1 statement ii, we have

$$\begin{aligned} G^*_{\sigma (m),{\hat{n}},N}(i) = g^*_{\sigma (m),{\hat{n}},N}(i)+O(m^{\frac{1}{2}}e^{-cm}), \end{aligned}$$

for some constant \(c>0\), and hence

$$\begin{aligned} \rho _{X_{\tau }}(ih,jh)=g^*_{\sigma (m),{\hat{n}},N}(i)\varDelta (jh)+o\left( \frac{1}{\sqrt{m}}\right) , \end{aligned}$$

since the asymptotic decay of \(m^{\frac{1}{2}}e^{-cm}\) is strictly faster than that of \(m^{-\frac{1}{2}}\) for all \(c > 0\). But \(h = y/m\) and y is a constant, so \(o(\frac{1}{\sqrt{m}}) = o(\sqrt{h})\). Finally, since

$$\begin{aligned} g^*_{\sigma (m),{\hat{n}},N}(i)= & {} \frac{e^{-\frac{d^{{\text {circ}}}_{N}(i,{\hat{n}})^2}{2{\tilde{\sigma }}^2m}}}{\sum _{j=0}^{N-1} e^{-\frac{d^{{\text {circ}}}_{N}(j,{\hat{n}})^2}{2{\tilde{\sigma }}^2m}} }\\= & {} \frac{e^{-\frac{d^{{\text {circ}}}_{1}(ih,{\hat{x}})^2}{2{\tilde{\sigma }}^2yh}}}{\sum _{j=0}^{N-1} e^{-\frac{d^{{\text {circ}}}_{1}(jh,{\hat{x}})^2}{2{\tilde{\sigma }}^2yh}} }=g^*_{\sigma (y,h),{\hat{x}},1}(ih), \end{aligned}$$

we conclude

$$\begin{aligned} \rho _{X_{\tau }}(ih,jh)=g^*_{\sigma (y,h),{\hat{x}},1}(ih)\varDelta (jh)+o\left( \sqrt{h}\right) . \end{aligned}$$

\(\square \)

Now we move on to task four and derive an expression for \(\varDelta (x_2)\) for the random walk \(\mathbf{X}_{\tau }\) under consideration:

Lemma 4

Let H denote the law of \(V_{n_1}\), where \(n_1 = \inf \{ k : V_k < 0 \}\), and let \(\nu _y = E[V_{n_1}]\). Let \(\varDelta \) denote the asymptotic law of the overshoot \(Y_{\tau }\). Then, for the direct method we have

$$\begin{aligned} \varDelta (kh) = \frac{1}{{\mu _y}}\frac{\sum _{\mathbf{j} \in {\bar{b}}^-_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j}) 1(\mathbf{j} \cdot e_2 < k)}{W}. \end{aligned}$$

For the semi-implicit method we have

$$\begin{aligned} \varDelta (kh) = \frac{1}{{\nu _y}}\frac{\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j}) 1(\mathbf{j} \cdot e_2< k \text{ and } \mathbf{j} \cdot e_2 \ne 0 )}{\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 \ne 0 )}, \end{aligned}$$

where \(\nu _y = (1-p_s)^{-1}\mu _y\) and \(p_s\) is the probability that the random walk in the implicit case does not move downwards in a given step:

$$\begin{aligned} p_s = \frac{1}{W}\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 = 0 ). \end{aligned}$$

Proof

First, note that since the increments \(\{V_i\}\) obey \(-r-1\le V_i \le 0\) for both the direct and semi-implicit methods (so that \(\nu _y < 0\), whence the absolute value, which makes \(\varDelta \) a genuine probability law), we have

$$\begin{aligned} \varDelta (kh) = H((-\infty ,k))/|\nu _y| = H([-r-1,k))/|\nu _y| . \end{aligned}$$

For the direct method, the ladder process above is particularly simple, since \(\mathbf{X}_{\tau }\) is guaranteed to move at least one pixel down in the y direction every iteration. In this case we simply have \(n_i = i\) for all i. Hence \(\nu _y = \mu _y\) and H is equal to the law of \(V_1\). It follows that

$$\begin{aligned} \varDelta (kh)= & {} \frac{1}{|\mu _y|}H([-r-1,k))\\= & {} \frac{1}{|\mu _y|}\frac{\sum ^{k-1}_{\ell = -r-1} \sum _{\mathbf{j} \in {\bar{b}}^-_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j}\cdot e_2 = \ell )}{W} \\= & {} \frac{1}{|\mu _y|} \frac{\sum _{\mathbf{j} \in {\bar{b}}^-_r}\sum ^{k-1}_{\ell = -r-1} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j}\cdot e_2 = \ell )}{W}\\= & {} \frac{1}{{|\mu _y|}}\frac{\sum _{\mathbf{j} \in {\bar{b}}^-_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j}) 1(\mathbf{j} \cdot e_2 < k)}{W}. \end{aligned}$$

The semi-implicit case is more complex, since although \(X_{\tau }\) can never move in the positive y direction, it can “pause” at a constant value of y for an arbitrarily long duration before resuming its downward march. Let \(p_s\) denote the probability that a given increment of the walk does not move in the y-direction, that is \(p_s = P(V_i=0)\). Then we clearly have

$$\begin{aligned} p_s = \frac{1}{W}\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 = 0 ). \end{aligned}$$

By Wald’s Identity we have \(\nu _y = E[V_{n_1}]=\mu _y E[n_1]\) since \(n_1\) is a stopping time. We have

$$\begin{aligned} E[n_1]= & {} \sum _{n=1}^{\infty } p_s^{n-1}(1-p_s)n \\= & {} \frac{1-p_s}{p_s}\sum _{n=1}^{\infty } np_s^n\\= & {} \frac{1-p_s}{p_s}\cdot \frac{p_s}{(1-p_s)^2}=\frac{1}{1-p_s}, \end{aligned}$$

and hence

$$\begin{aligned} \nu _y = \frac{\mu _y}{1-p_s}. \end{aligned}$$

The law of \(V_{n_1}\) (and hence H) is given by:

$$\begin{aligned} p_{V_{n_1}}(\mathbf{j}) = \frac{{\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 \ne 0 )}{\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 \ne 0 )}. \end{aligned}$$

It follows that

$$\begin{aligned} \varDelta (kh)= & {} \frac{1}{|\nu _y|}H([-r-1,k)) \\= & {} \frac{1}{|\nu _y|}\frac{\sum ^{k-1}_{\ell = -r-1} \sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j}\cdot e_2 = \ell )}{\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 \ne 0 )} \\= & {} \frac{1}{|\nu _y|} \frac{\sum _{\mathbf{j} \in {\bar{b}}^0_r}\sum ^{k-1}_{\ell = -r-1} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j}\cdot e_2 = \ell )}{\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 \ne 0 )}\\= & {} \frac{1}{{|\nu _y|}}\frac{\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j}) 1(\mathbf{j} \cdot e_2 < k )}{\sum _{\mathbf{j} \in {\bar{b}}^0_r} {\tilde{w}}_{r}(\mathbf{0},\mathbf{j})1(\mathbf{j} \cdot e_2 \ne 0 )}. \end{aligned}$$

\(\square \)
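To make Lemma 4 concrete, here is a small Python helper (our own; the name overshoot_law and the toy stencil are illustrative, not from the paper) that evaluates \(\varDelta (kh)\), together with \(p_s\) and \(\nu _y\) in the semi-implicit case, directly from the formulas just proved. Note the absolute value \(|\nu _y|\), which makes \(\varDelta \) sum to one.

```python
import numpy as np

def overshoot_law(pts, w, semi_implicit=False):
    """Evaluate the asymptotic overshoot law Delta(kh) of Lemma 4.

    pts: (M, 2) integer stencil offsets (rows j.e2 <= -1 for the direct
    method, j.e2 <= 0 for the semi-implicit one); w: non-negative weights
    w_r(0, j).  Helper name and conventions are ours, for illustration only.
    """
    pts = np.asarray(pts)
    w = np.asarray(w, dtype=float)
    W = w.sum()
    mu_y = (w @ pts[:, 1]) / W
    if semi_implicit:
        p_s = w[pts[:, 1] == 0].sum() / W   # probability of a "pause"
        nu_y = mu_y / (1.0 - p_s)           # Wald: nu_y = mu_y / (1 - p_s)
        mask = pts[:, 1] != 0
        denom = w[mask].sum()
    else:                                   # ladder process is trivial here
        nu_y = mu_y
        mask = np.ones(len(pts), dtype=bool)
        denom = W
    ks = np.arange(pts[:, 1].min(), 1)      # support of the overshoot Y_tau
    Delta = np.array([w[mask & (pts[:, 1] < k)].sum() for k in ks])
    return ks, Delta / (abs(nu_y) * denom)  # |nu_y| makes Delta a pmf

# toy semi-implicit stencil: uniform weights on the half ball of radius 2
# with the j.e2 = 0 row included (origin excluded)
pts = [(i, j) for i in range(-2, 3) for j in range(-2, 1)
       if i * i + j * j <= 4 and (i, j) != (0, 0)]
ks, Delta = overshoot_law(pts, np.ones(len(pts)), semi_implicit=True)
print(ks, Delta, Delta.sum())               # Delta.sum() == 1
```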

Our final task before proving our main result is to bound the probability of \(\mathbf{X}_{\tau }\) deviating significantly from \(\varPi _{\theta _r^*}(\mathbf{x})\). The following lemma accomplishes this.

Lemma 5

Let \(\mathbf{x} \in D_h\) and \(\hat{\mathbf{x}}:=({\hat{x}},0) := \varPi _{\theta ^*_r}(\mathbf{x})\). Let \(d^{{\text {circ}}}_{1}\) denote the circular distance (46). Then

$$\begin{aligned} P(d^{{\text {circ}}}_{1}({ X}_{\tau },{\hat{x}})> \delta )\le & {} 2\exp \left( -\frac{(\delta -rh)^2\sin ^2\theta ^*_r}{4\frac{r}{|\mu _y|}rh}\right) \\&\quad +\exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) . \end{aligned}$$

Proof

Our random walk on the periodic domain \([0,1) \times (-\delta ,1]\) is equivalent to the same walk on the infinite domain \(\mathbb {R} \times (-\delta ,1]\), modulo the equivalence relation \((x,y) \sim (x+k,y)\) for all \(k \in \mathbb {Z}\). Let us denote (within this lemma only) our periodic walk by \(\mathbf{X}^*_{\tau }\) while the walk on \(\mathbb {R} \times (-\delta ,1]\) is denoted by \(\mathbf{X}_{\tau }\). Then we have

$$\begin{aligned} \{ d^{{\text {circ}}}_1(X^*_{\tau },{\hat{x}})> \delta \}= & {} \bigcap _{k \in \mathbb {Z}} \{ |X_{\tau }-({\hat{x}}+k)|> \delta \} \\\subseteq & {} \{ |X_{\tau }-{\hat{x}}| > \delta \}. \end{aligned}$$

We therefore have

$$\begin{aligned} P( d^{{\text {circ}}}_1(X^*_{\tau },{\hat{x}})> \delta ) \le P( |X_{\tau }-{\hat{x}}| > \delta ), \end{aligned}$$

and hence it suffices to prove

$$\begin{aligned} P( |X_{\tau }-{\hat{x}}| > \delta )\le & {} 2\exp \left( -\frac{(\delta -rh)^2\sin ^2\theta ^*_r}{4\frac{r}{|\mu _y|}rh}\right) \\&\quad +\exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) . \end{aligned}$$

Step 1. \(P\left( \tau > \lceil \frac{2}{|\mu _y|h} \rceil \right) \le \exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) .\) We define

$$\begin{aligned} M_k := Y_{k \wedge \tau } - (\tau \wedge k) \mu _yh -mh \end{aligned}$$

which the reader may verify is a zero-mean martingale with bounded increments

$$\begin{aligned} |M_{k+1}-M_k| \le rh. \end{aligned}$$

Next we note that for any k the following events are equal:

$$\begin{aligned} \{ \tau \ge k \}= & {} \{Y_{\tau \wedge k} \ge 0\} = \{ M_k \ge -(k \wedge \tau ) \mu _y h - mh \} \\= & {} \{ M_k \ge -k\mu _y h - mh \}. \end{aligned}$$

Therefore

$$\begin{aligned} \left\{ \tau \ge \left\lceil \frac{2}{|\mu _y|h} \right\rceil \right\}= & {} \left\{ M_{ \left\lceil \frac{2}{|\mu _y|h} \right\rceil } \ge - \left\lceil \frac{2}{|\mu _y|h} \right\rceil \mu _y h - mh \right\} \\\subseteq & {} \left\{ M_{ \left\lceil \frac{2}{|\mu _y|h} \right\rceil } \ge 1 \right\} \end{aligned}$$

where we have used the inequality (valid since \(mh = y \le 1\))

$$\begin{aligned} -\left\lceil \frac{2}{|\mu _y|h} \right\rceil \mu _y h - mh = \left\lceil \frac{2}{|\mu _y|h} \right\rceil |\mu _y| h - mh \ge 2-mh \ge 1. \end{aligned}$$

Noting that \(M_0 = 0\) we apply Azuma’s inequality to find

$$\begin{aligned} P\left( \tau > \left\lceil \frac{2}{|\mu _y|h} \right\rceil \right)\le & {} P \left( M_{ \left\lceil \frac{2}{|\mu _y|h} \right\rceil } \ge 1 \right) \\\le & {} \exp \left( - \frac{1}{2\left\lceil \frac{2}{|\mu _y|h} \right\rceil r^2h^2} \right) \\\le & {} \exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) . \end{aligned}$$

Step 2. Let \(d(\mathbf{X}_{\tau },L_{\mathbf{x},\hat{\mathbf{x}}})\) denote the orthogonal distance from \(\mathbf{X}_{\tau }\) to the line passing through \(\mathbf{x}\) and \(\hat{\mathbf{x}}\). This time we define

$$\begin{aligned} M_k := (\mathbf{X}_{\tau \wedge k} - \mathbf{x}) \cdot (-\sin \theta ^*_r,\cos \theta ^*_r). \end{aligned}$$

Once again, \(M_k\) is a zero-mean martingale with bounded increments \(|M_{k+1}-M_k| \le rh\) obeying \(M_0 = 0\). Therefore, applying Azuma’s inequality again gives, for any \(k \in \mathbb {N}\),

$$\begin{aligned} P(|M_k| > \delta ) \le 2\exp \left( -\frac{\delta ^2}{2kr^2h^2}\right) . \end{aligned}$$

Moreover, if \(\tau \le k\), then we have the equality

$$\begin{aligned} |M_k|=d(\mathbf{X}_{\tau },L_{\mathbf{x},\hat{\mathbf{x}}}) \end{aligned}$$

For any \(k \in \mathbb {N}\) we clearly have

$$\begin{aligned} P(d(\mathbf{X}_{\tau },L_{\mathbf{x},\hat{\mathbf{x}}})> \delta )\le & {} P( \{ d(\mathbf{X}_{\tau },L_{\mathbf{x},\hat{\mathbf{x}}})> \delta \} \cap \{ \tau \le k \})\\&+P( \tau> k )\\\le & {} P(|M_k|> \delta ) + P(\tau > k). \end{aligned}$$

Taking \(k = \lceil \frac{2}{|\mu _y|h} \rceil \) and substituting the above bound together with the one from Step 1 gives

$$\begin{aligned} P(d(\mathbf{X}_{\tau },L_{\mathbf{x},\hat{\mathbf{x}}}) > \delta )\le & {} 2\exp \left( -\frac{\delta ^2}{4\frac{r}{|\mu _y|}rh}\right) \\&+\exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) . \end{aligned}$$

Finally, by the triangle inequality and simple trigonometry we have

$$\begin{aligned} | X_{\tau }-{\hat{x}}| \le \frac{d(\mathbf{X}_{\tau },L_{\mathbf{x},\hat{\mathbf{x}}})}{\sin \theta ^*_r}+rh. \end{aligned}$$

Hence

$$\begin{aligned} P(|X_{\tau }-{\hat{x}} |> \delta ) \le P(d(\mathbf{X}_{\tau },L_{\mathbf{x},\hat{\mathbf{x}}}) > (\delta -rh)\sin \theta ^*_r), \end{aligned}$$

from which the claimed result follows trivially. \(\square \)

Having accomplished all of our tasks, we are now ready to prove our main result.

Theorem 2

Let the inpainting domain D and undamaged area \({\mathcal {U}}\), as well as their discrete counterparts \(D_h\), \({\mathcal {U}}_h\) be as described in Sect. 6.1. Let \(u_0 : {\mathcal {U}} \rightarrow \mathbb {R}^d\) denote as usual the undamaged portion of the image, and assume \(u_0\) is bounded in norm, that is, there is an \(M > 0\) such that \(\Vert u_0\Vert \le M\) on \({\mathcal {U}}\). Suppose we inpaint \(D_h\) using Algorithm 1 or its semi-implicit extension, and denote the result by \(u_h : D_h \rightarrow \mathbb {R}^d\). Assume the assumptions of Sect. 6.1 hold and let \(a^*_r\) denote the stencil of our inpainting method. Let \(\mathbf{Z}_i = (U_i,V_i)\), taking values in \(\text{ Supp }(a^*_r)\) with mean \((\mu _x,\mu _y)\), denote the increments of the random walk described above, with probability density given by (35). Assume that \(a^*_r\) is not confined to any sublattice of \(\mathbb {Z}^2\) in the sense of Definition 3, or equivalently that the characteristic function \(\varphi _{\mathbf{Z}_i}\) of \(\mathbf{Z}_i\), that is

$$\begin{aligned} \varphi _{\mathbf{Z}_i}(\mathbf{u}) = \sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} \frac{w_r(\mathbf{0},\mathbf{y}) e^{\sqrt{-1}{} \mathbf{y} \cdot \mathbf{u}}}{W}, \end{aligned}$$

obeys \(|\varphi _{\mathbf{Z}_i}(\mathbf{u})| < 1\) unless both components of \(\mathbf{u}\) are integer multiples of \(2\pi \). Let \(\varPi _{\theta ^*_r} : D \rightarrow \partial D\) denote the transport operator defined in (32) (recall that the transport direction \(\mathbf{g}^*_r\) obeys \(\mathbf{g}^*_r=(\mu _x,\mu _y)\)), and let \(\mathbf{x}=(x,y)=(nh,mh) \in D_h\), and assume y is fixed. Let

$$\begin{aligned} {\tilde{g}}_{\sigma (y,h)}(ih,jh)=g^*_{\sigma (y,h),1}(ih){\bar{\varDelta }}(jh) \end{aligned}$$

where \(g^*_{\sigma (y,h),1}\) is the one-dimensional discrete Gaussian

$$\begin{aligned} g^*_{\sigma (y,h),1}(ih)=\frac{e^{-\frac{d^{{\text {circ}}}_{1}(ih,0)^2}{2\sigma (y,h)^2}}}{\sum _{k=0}^{N-1} e^{-\frac{d^{{\text {circ}}}_{1}(kh,0)^2}{2\sigma (y,h)^2}}} \end{aligned}$$

with variance

$$\begin{aligned} \sigma (y,h)^2 = \frac{\gamma ^2 y h}{|\mu _y|^3} \quad \text{ where } \gamma ^2 = {\text {Var}}(\mu _x V_1 - \mu _y U_1) \end{aligned}$$

and where \({\bar{\varDelta }}\) denotes the even extension of \(\varDelta \) defined by

$$\begin{aligned} {\bar{\varDelta }}(jh)=\varDelta (|j|h) \end{aligned}$$

where \(\varDelta (jh)\) is given by Lemma 4. Then

$$\begin{aligned} u_h(x,y) = (u_0 * {\tilde{g}}_{\sigma (y,h)})(\varPi _{\theta ^*_r}(x,y)) +o(1) \end{aligned}$$
(51)

Proof

Our goal is to show that for any \(\epsilon > 0\), taking h sufficiently small gives

$$\begin{aligned} \Vert u_h(x,y) - (u_0 * {\tilde{g}}_{\sigma (y,h)})(\varPi _{\theta ^*_r}(x,y))\Vert < \epsilon . \end{aligned}$$

Let us define \(\hat{\mathbf{x}} :=({\hat{x}},0) := \varPi _{\theta ^*_r}(x,y)\). Then this is equivalent to proving

$$\begin{aligned} \left\| \sum _{\mathbf{z} \in {\mathcal {U}}_h}u_0(\mathbf{z})\left( \rho _{\mathbf{X}_{\tau }}(\mathbf{z} ) - {\tilde{g}}_{\sigma (y,h)}(\hat{\mathbf{x}}-\mathbf{z})\right) \right\| < \epsilon , \end{aligned}$$

where \(\mathbf{z}\) is a dummy summation variable (distinct from the fixed point \(\mathbf{x}\)) and the x-coordinate of \(\hat{\mathbf{x}}-\mathbf{z}\) is understood to be taken modulo 1 due to our periodic boundary conditions. Since \(\Vert u_0\Vert \le M\) it suffices to prove

$$\begin{aligned} S\equiv & {} \sum _{(ih,jh) \in {\mathcal {U}}_h} \left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| \\< & {} \frac{\epsilon }{M}. \end{aligned}$$

We will prove shortly that the per-term error

$$\begin{aligned} \left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| \end{aligned}$$

is of size \(o(1/\sqrt{N})\). However, this condition alone is not enough to prove the desired result, as S contains O(N) terms. To get around this, we introduce the sets \(I = \{0,1,\ldots ,N-1\}\) and

$$\begin{aligned} I_k=\{ i \mod N\}_{i=\lceil \frac{{\hat{x}}}{h}-k\sqrt{m}\rceil }^{\lfloor \frac{{\hat{x}}}{h}+k\sqrt{m} \rfloor }, \end{aligned}$$

where \(k \in \mathbb {N}\) is a constant to be determined. Note that \(I_k\) may equivalently be characterized as

$$\begin{aligned} I_k = \left\{ i \in I : d^{{\text {circ}}}_{N}\left( i,\frac{{\hat{x}}}{h}\right) \le k\sqrt{m}\right\} , \end{aligned}$$

which in turn means that

$$\begin{aligned} I \backslash I_k = \left\{ i \in I : d^{{\text {circ}}}_{N}\left( i,\frac{{\hat{x}}}{h}\right) > k\sqrt{m}\right\} , \end{aligned}$$
(52)

which will be critical later.

We break the sum into two pieces \(S = S_1 + S_2\) defined as

$$\begin{aligned} S_1\equiv & {} \sum _{j=-r-2}^0 \sum _{i \in I_k} \left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| \\ S_2\equiv & {} \sum _{j=-r-2}^0 \sum _{i \in I \backslash I_k} \left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| . \end{aligned}$$

Since \(m = yN\) and y is fixed, m is proportional to N, and hence \(S_1\) contains \(O(\sqrt{m})=O(\sqrt{N})\) terms concentrated around \({\hat{x}}\). We will control the size of \(S_1\) by using the \(o(\sqrt{h})=o(1/\sqrt{N})\) error estimate from Lemma 3. The second sum \(S_2\) will contain all remaining terms in S. We will bound \(S_2\) by making separate tail estimates for \(\rho _{\mathbf{X}_{\tau }}\) and \(g^*_{\sigma (y,h),1}({\hat{x}}-ih)\), both based on (52).

Claim 1

For \(k \ge \max \left( \frac{2r}{\sqrt{y}},4\right) \) there exist constants \(C_1,C_2,C_3,C_4 > 0\) (dependent on r and y) such that

$$\begin{aligned} S_2 < C_1e^{-C_2k^2}+C_3N^{\frac{3}{2}}e^{-C_4N}. \end{aligned}$$

We defer the proof of Claim 1 until the end. For now, note that it gives us the estimate

$$\begin{aligned} S \le S_1 + C_1e^{-C_2k^2}+C_3N^{\frac{3}{2}}e^{-C_4N}. \end{aligned}$$

Fixing k so that \(C_1e^{-C_2k^2} < \frac{\epsilon }{4M}\) and \(k \ge \frac{2r}{\sqrt{y}}\) are both true, and taking N large enough that

\(C_3N^{\frac{3}{2}}e^{-C_4N}< \frac{\epsilon }{4M}\), it remains to prove that \(S_1 < \frac{\epsilon }{2M}\). Defining \(C :=(r+3)3k\sqrt{y}\) we have

$$\begin{aligned} S_1\equiv & {} \sum _{j=-r-2}^0 \sum _{i \in I_k} \left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| \\\le & {} C\sqrt{N}\sup _{i,j}\left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| . \end{aligned}$$

By Lemma 3, we have

$$\begin{aligned}&\sqrt{N}\sup _{i,j}\left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| \\&\rightarrow 0 \quad \text{ as } N \rightarrow \infty . \end{aligned}$$

Making N large enough (equivalently, h small enough) that

$$\begin{aligned} \sqrt{N}\sup _{i,j}\left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| < \frac{1}{C}\frac{\epsilon }{2M}, \end{aligned}$$

we are done. \(\square \)

Proof of Claim 1

We break \(S_2\) into two pieces:

$$\begin{aligned} S_2\equiv & {} \sum _{j=-r-2}^0 \sum _{i \in I \backslash I_k} \left| \rho _{\mathbf{X}_{\tau }}(ih,jh) - \varDelta (jh)g^*_{\sigma (y,h),1}({\hat{x}}-ih)\right| \\\le & {} \sum _{j=-r-2}^0 \sum _{i \in I \backslash I_k} \rho _{\mathbf{X}_{\tau }}(ih,jh)\\&\quad + \sum _{j=-r-2}^0 \varDelta (jh) \sum _{i \in I \backslash I_k} g^*_{\sigma (y,h),1}({\hat{x}}-ih) \end{aligned}$$

But by (52), we have

$$\begin{aligned} S_2\le & {} P(d^{{\text {circ}}}_1(X_{\tau },{\hat{x}})> k\sqrt{y}\sqrt{N}h) \\&+ \sum _{i \in I \backslash I_k} g^*_{\sigma (y,h),1}({\hat{x}}-ih).\\ \end{aligned}$$

We already know how to bound the first of these. By Lemma 5, we have

$$\begin{aligned}&P(d^{{\text {circ}}}_1(X_{\tau },{\hat{x}})> k\sqrt{y}\sqrt{N}h) \\&\le 2\exp \left( -\frac{(k\sqrt{y}\sqrt{N}h-rh)^2\sin ^2\theta ^*_r}{4r^2/|\mu _y|h}\right) \\&\quad +\exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) \end{aligned}$$

Since \(k \ge \frac{2r}{\sqrt{y}}\) (so that \(k\sqrt{y}-r \ge \frac{1}{2}k\sqrt{y}\)) and \(h \le 1\), we have

$$\begin{aligned} \frac{(k\sqrt{y}\sqrt{N}h-rh)^2}{h} = (k\sqrt{y}-r\sqrt{h})^2 \ge (k\sqrt{y}-r)^2 \ge \frac{1}{4}yk^2, \end{aligned}$$

it follows that

$$\begin{aligned}&P(d^{{\text {circ}}}_1(X_{\tau },{\hat{x}})> k\sqrt{y}\sqrt{N}h) \nonumber \\&\quad \le 2\exp \left( -y\left\{ \frac{\sin ^2\theta ^*_r}{16r^2/|\mu _y|}\right\} k^2\right) \nonumber \\&\qquad +\exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) . \end{aligned}$$
(53)

For the second, we find it convenient to rewrite the sum in terms of \(G^*_{\sigma (m),N}(i-{\hat{x}}/h)\). To that end, we define \({\tilde{\sigma }}^2=\sigma ^2(y,h)/(yh) = \frac{\gamma ^2}{|\mu _y|^3}\) where \(\gamma ^2 > 0\) is a constant given by (45). Then

$$\begin{aligned} \sum _{i \in I \backslash I_k} g^*_{\sigma (y,h),1}(ih-{\hat{x}})= & {} \sum _{i \in I \backslash I_k} g_{\sigma (m),N}(i-{\hat{x}}/h) \nonumber \\\le & {} \sum _{i \in I \backslash I_k} G^*_{\sigma (m),N}(i-{\hat{x}}/h)\nonumber \\&\quad +\frac{A}{y}m^{\frac{3}{2}}e^{-bm}, \end{aligned}$$
(54)

where we have used Lemma 1 together with \(|I \backslash I_k| \le N=m/y\) in the rightmost inequality. Next, define

$$\begin{aligned} \varXi = \left\{ d^{{\text {circ}}}_{N}\left( i,\frac{{\hat{x}}}{h}\right) : i \in I \backslash I_k \right\} . \end{aligned}$$

Since \(I \backslash I_k \subseteq \mathbb {Z}\), every member of \(\varXi \) must be of the form

$$\begin{aligned} \xi + n \qquad \text{ where } \xi = \frac{{\hat{x}}}{h}-\left\lfloor \frac{{\hat{x}}}{h} \right\rfloor \in [0,1) \end{aligned}$$

for some \(n \in \mathbb {Z}\). But by (52), we must have \(n \ge \lfloor k \sqrt{m} \rfloor \). Finally, since the map \(i \in I \backslash I_k \rightarrow d^{{\text {circ}}}_{N}(i,\frac{{\hat{x}}}{h})\) is at most two-to-one, we have

$$\begin{aligned} \sum _{i \in I \backslash I_k} G^*_{\sigma (m),N}(i-{\hat{x}}/h)= & {} \sum _{i \in I \backslash I_k} G_{\sigma (m)}(d^{{\text {circ}}}_{N}(i,{\hat{x}}/h)) \\\le & {} 2\sum _{x \in \varXi } G_{\sigma (m)}(x) \\\le & {} 2\sum _{n = \lfloor k \sqrt{m} \rfloor }^{\infty } G_{\sigma (m)}(\xi + n) \end{aligned}$$

The following routine tail estimate gives us a bound on our sum.

$$\begin{aligned} \sum _{i \in I \backslash I_k} G^*_{\sigma (m),N}(i-{\hat{x}}/h)\le & {} 2\sum _{n = \lfloor k \sqrt{m} \rfloor }^{\infty } G_{\sigma (m)}(\xi + n) \\= & {} 2\sum _{n = \lfloor k \sqrt{m} \rfloor }^{\infty } \frac{e^{-\frac{(\xi + n)^2}{2\sigma (m)^2}}}{\sqrt{2\pi }\sigma (m)} \\\le & {} 2\sum _{n = \lfloor k \sqrt{m} \rfloor }^{\infty } \frac{e^{-\frac{n^2}{2\sigma (m)^2}}}{\sqrt{2\pi }\sigma (m)} \\\le & {} 2 \int ^{\infty }_{\lfloor k \sqrt{m} \rfloor -1} \frac{e^{-\frac{x^2}{2\sigma (m)^2}}}{\sqrt{2\pi }\sigma (m)}dx. \end{aligned}$$

Since \(k \ge 4\), we can find a \(c \in (0,1)\) independent of \(1 \le m \le N\) and k (for example \(c = 0.5\) will work) such that \(ck\sqrt{m} \le \lfloor k \sqrt{m} \rfloor -1\). At the same time, we have \(\sigma (m)^2 = {\tilde{\sigma }}^2m\) where \({\tilde{\sigma }}^2 = \gamma ^2/|\mu _y|^3\) with \(\gamma ^2 = {\text {Var}}(\mu _x V_1 - \mu _y U_1)\). Therefore

$$\begin{aligned} \sum _{i \in I \backslash I_k} G^*_{\sigma (m),N}(i-{\hat{x}}/h)\le & {} 2 \int ^{\infty }_{c k \sqrt{m}} \frac{e^{-\frac{x^2}{2{\tilde{\sigma }}^2m}}}{\sqrt{2\pi }{\tilde{\sigma }}\sqrt{m}}dx \\\le & {} 2 \int ^{\infty }_{c k \sqrt{m}} \frac{x}{ck\sqrt{m}} \frac{e^{-\frac{x^2}{2{\tilde{\sigma }}^2m}}}{\sqrt{2\pi }{\tilde{\sigma }}\sqrt{m}}dx\\= & {} \sqrt{\frac{2}{\pi }}\frac{{\tilde{\sigma }}}{ck}e^{-\frac{c^2k^2}{2{\tilde{\sigma }}^2}} \\\le & {} \sqrt{\frac{2}{\pi }}\frac{{\tilde{\sigma }}}{c}e^{-\frac{c^2k^2}{2{\tilde{\sigma }}^2}} \\ \end{aligned}$$

Putting this together with (54) and (53) gives

$$\begin{aligned} S_2\le & {} 2\exp \left( -y\left\{ \frac{\sin ^2\theta ^*_r}{16r^2/|\mu _y|}\right\} k^2\right) +\exp \left( -\frac{1}{4\frac{r}{|\mu _y|}rh}\right) \\&+\sqrt{\frac{2}{\pi }}\frac{{\tilde{\sigma }}}{c}e^{-\frac{c^2k^2}{2{\tilde{\sigma }}^2}}+\frac{A}{y}m^{\frac{3}{2}}e^{-bm} \\\le & {} \left( 2+\sqrt{\frac{2}{\pi }}\frac{{\tilde{\sigma }}}{c}\right) e^{-\alpha k^2}+(1+A)y^{\frac{1}{2}}N^{\frac{3}{2}}e^{-\beta y N} \end{aligned}$$

where \(\alpha \equiv \min (y\frac{\sin ^2\theta ^*_r}{16r^2/|\mu _y|},\frac{c^2}{2{\tilde{\sigma }}^2})\), \(\beta \equiv \min (\frac{1}{4r^2/|\mu _y|},b)\). Taking \(C_1 = \left( 2+\sqrt{\frac{2}{\pi }}\frac{{\tilde{\sigma }}}{c}\right) \), \(C_2=\alpha \), \(C_3=(1+A)y^{\frac{1}{2}}\), \(C_4=\beta y\) completes the proof of the claim. \(\square \)

6.6 Asymptotic Limit Version 2

An important special case of the general framework considered in the previous two sections occurs when only the x-component of \(\mathbf{X}_{\tau }\) is random. In particular, let us assume that the increments \(\varDelta \mathbf{X}_i = h \mathbf{Z}_i = h \cdot (U,V)\) obey \(U \in \{ -(r+2),-(r+1),\ldots ,r+1,r+2\}\) but \(V \equiv -1\). Then the assumptions of Theorem 1 are never satisfied, as

$$\begin{aligned} \varphi _{\mathbf{Z}_i}(u_1,u_2)=e^{-\sqrt{-1}u_2} \sum _{k=-(r+2)}^{(r+2)} P(U=k)e^{\sqrt{-1}ku_1} \end{aligned}$$

obeys \(|\varphi _{\mathbf{Z}_i}|=1\) for \(u_1\) an integer multiple of \(2\pi \) and any \(u_2 \in \mathbb {R}\). Since we cannot use Theorem 1, a different approach is required in this case.

Here we use the following local central limit theorem for U:

Theorem 3

Let \(X_1,X_2, \ldots \) be i.i.d. copies of an integer-valued random variable X of mean \(\mu \) and variance \(\sigma ^2\). Suppose furthermore that there is no infinite subprogression \(a+q\mathbb { Z}\) of \(\mathbb {Z}\) with \(q>1\) for which X takes values almost surely in \(a+q\mathbb {Z}\). Then one has

$$\begin{aligned} \mathbf{P}( S_n = m) = \frac{1}{\sqrt{2\pi n} \sigma } e^{-(m-n\mu )^2/2n \sigma ^2} + o(1/n^{1/2}) \end{aligned}$$

for all \(n \ge 1\) and all integers m, where the error term \(o(1/n^{1/2})\) is uniform in m.

Proof

This theorem may be found on Terence Tao’s blog [37]. \(\square \)
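The content of Theorem 3 is easy to visualize numerically. In the sketch below (ours; the law of X is an arbitrary choice satisfying the hypotheses) we compute the exact law of \(S_n\) by repeated convolution and compare it against the Gaussian prediction; the scaled sup-norm error should be small and decay as n grows.

```python
import numpy as np

# X takes values in {-2, -1, 0, 1, 3} with the (arbitrary) probabilities
# below; the support is not contained in any a + qZ with q > 1.
vals = np.array([-2, -1, 0, 1, 3])
probs = np.array([0.1, 0.3, 0.2, 0.25, 0.15])
mu = probs @ vals
sig2 = probs @ (vals - mu) ** 2

n = 400
offset = -vals.min()                       # shift so array indices are >= 0
step = np.zeros(vals.max() + offset + 1)
step[vals + offset] = probs
pmf = np.array([1.0])
for _ in range(n):                         # exact law of S_n by convolution
    pmf = np.convolve(pmf, step)
m = np.arange(len(pmf)) - n * offset       # support of S_n
gauss = np.exp(-(m - n * mu) ** 2 / (2 * n * sig2)) / np.sqrt(2 * np.pi * n * sig2)
print(np.sqrt(n) * np.abs(pmf - gauss).max())   # o(1) by the theorem
```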

The requirement in Theorem 3 “that there is no infinite subprogression \(a+q\mathbb { Z}\) of \(\mathbb {Z}\) with \(q>1\) for which X takes values almost surely in \(a+q\mathbb {Z}\)” is equivalent to the following one dimensional version of Definition 3:

Definition 4

We say that an integer valued random variable X is “confined to a sublattice of \(\mathbb {Z}\)” if there exists an \(n \in \mathbb {Z}\) and \(k \in \mathbb {Z}\) with \(|k| > 1\), such that the density of X is zero outside of \(n+k\mathbb {Z}\). We say that X is not confined to any sublattice of \(\mathbb {Z}\) if no such n and k exist.

Moreover, by an argument almost identical to that of Lemma 2, this is equivalent to the characteristic function \(\varphi _X(u)\) defined by

$$\begin{aligned} \varphi _X(u) = \sum _{j \in \mathbb {Z}} P(X = j) e^{\sqrt{-1}ju} \end{aligned}$$

obeying \(|\varphi _X(u)|<1\) unless u is an integer multiple of \(2\pi \).
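Definition 4 can be checked mechanically: X is confined to a sublattice precisely when the gcd g of the gaps between its support points exceeds 1, and in that case \(|\varphi _X|\) attains 1 at \(u = 2\pi /g\). The following sketch (ours; helper names are ad hoc) illustrates both characterizations.

```python
import math
import numpy as np

def confined_to_sublattice(support):
    """Definition 4 check: returns (confined?, g) where g is the gcd of the
    gaps between support points.  Helper name is ours."""
    support = sorted(set(support))
    g = 0
    for s in support[1:]:
        g = math.gcd(g, s - support[0])
    return g > 1, g

def char_fn(support, probs, u):
    """Characteristic function of X, as in the display above."""
    return sum(p * np.exp(1j * j * u) for j, p in zip(support, probs))

# {1, 4, 7} = 1 + 3Z is confined (g = 3), and accordingly |phi_X(2 pi/3)| = 1
# even though 2 pi/3 is not an integer multiple of 2 pi.
confined, g = confined_to_sublattice([1, 4, 7])
print(confined, g, abs(char_fn([1, 4, 7], [0.2, 0.5, 0.3], 2 * np.pi / g)))
```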

Following a sequence of steps almost identical to the previous section, we arrive at a version of the asymptotic limit applicable to this case.

Theorem 4

Let the inpainting domain D and undamaged area \({\mathcal {U}}\), as well as their discrete counterparts \(D_h\), \({\mathcal {U}}_h\) be as described in Sect. 6.1. Let \(u_0 : {\mathcal {U}} \rightarrow \mathbb {R}^d\) denote as usual the undamaged portion of the image, and assume \(u_0\) is bounded in norm, that is, there is an \(M > 0\) such that \(\Vert u_0\Vert \le M\) on \({\mathcal {U}}\). Suppose we inpaint \(D_h\) using Algorithm 1 or its semi-implicit extension, and denote the result by \(u_h : D_h \rightarrow \mathbb {R}^d\). Assume the assumptions of Sect. 6.1 hold and let \(a^*_r\) denote the stencil of our inpainting method. Assume that the stencil weights \({\tilde{w}}_\mathbf{j}\) assign zero mass outside of the line \(\mathbb {Z} \times \{ -1 \}\), and assume that the x-coordinate U of \(\mathbf{Z}_i = (U,V)\) is not confined to any sublattice of \(\mathbb {Z}\) in the sense of Definition 4, or equivalently that the characteristic function \(\varphi _U\) of U defined by

$$\begin{aligned} \varphi _U(u) = \sum _{j = -(r+2)}^{r+2} P(U = j) e^{\sqrt{-1}ju} \end{aligned}$$

obeys \(|\varphi _U(u)| <1\) unless u is an integer multiple of \(2\pi \). Let \(\varPi _{\theta ^*_r} : D \rightarrow \partial D\) denote the transport operator defined in (32) (recall that the transport direction \(\mathbf{g}^*_r\) obeys \(\mathbf{g}^*_r=(\mu _x,\mu _y)\)), and let \(\mathbf{x}=(x,y) \in D_h\). Let \(g_{\sigma (y,h)}\) denote the one-dimensional discrete Gaussian

$$\begin{aligned} g_{\sigma (y,h)}(ih)=\frac{e^{-\frac{d^{{\text {circ}}}_1(ih,0)^2}{2\sigma (y,h)^2}}}{\sum _{k=0}^{N-1} e^{-\frac{d^{{\text {circ}}}_1(kh,0)^2}{2\sigma (y,h)^2}}} \end{aligned}$$

with mean 0 and variance

$$\begin{aligned} \sigma (y,h)^2 = \sigma ^2 yh \end{aligned}$$

where \(\sigma ^2\) denotes the variance of U. Then

$$\begin{aligned} u_h(x,y) = (u_0 * g_{\sigma (y,h)})(\varPi _{\theta ^*_r}(x,y)) +o(1) \end{aligned}$$
(55)

Proof

The proof is nearly identical to that of Theorem 2 and is left as an exercise to the reader. \(\square \)
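In this special case the exit time is deterministic, so the variance formula is easy to verify directly. The following sketch (ours; the law of U and all sizes are arbitrary choices) confirms \({\text {Var}}(X_{\tau }) \approx \sigma ^2 yh\) by simulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# An arbitrary law for U (it need only have integer support and not be
# confined to a sublattice); grid and trial sizes below are ad hoc.
U_vals = np.array([-2, -1, 0, 1, 2])
U_p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
var_U = U_p @ U_vals ** 2 - (U_p @ U_vals) ** 2

N, y = 128, 0.5
h = 1.0 / N
m = round(y * N)                      # with V == -1 the walk exits in exactly m steps
steps = rng.choice(U_vals, size=(50_000, m), p=U_p)
print(np.var(h * steps.sum(axis=1)),  # empirical Var(X_tau - x)
      var_U * y * h)                  # sigma(y,h)^2 = sigma^2 * y * h
```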

6.7 Derivation of Fixed-Ratio Continuum Limit

In this section, we prove pointwise convergence of the discrete solution \(u_h(\mathbf{x})\) to the fixed ratio continuum limit \(u(\mathbf{x})\), for those points \(\mathbf{x} \in D\) for which \(\varPi _{\theta _r^*}(\mathbf{x}) \in \partial D\) is a continuity point of \(u_0\). Theorem 2 tells us that

$$\begin{aligned} u_h(\mathbf{x}) = {\tilde{g}}_{\sigma (y,h)} * u_0(\varPi _{\theta _r^*}(\mathbf{x})) + o(1), \end{aligned}$$

where \({\tilde{g}}_{\sigma (y,h)}\) is the blur kernel from Lemma 3. Since \(u(\mathbf{x}) = u_0(\varPi _{\theta _r^*}(\mathbf{x}))\), our job amounts to proving that

$$\begin{aligned} {\tilde{g}}_{\sigma (y,h)} * u_0(\mathbf{x}) \rightarrow u_0(\mathbf{x}) \qquad \text{ as } h \rightarrow 0 \end{aligned}$$

when \(\mathbf{x}\) is a continuity point of \(u_0\).

Before tackling this problem, it is worth digressing briefly to consider what would happen had we considered the original high-resolution vanishing viscosity limit of März and Bornemann, in which first \(h \rightarrow 0\) and then \(\epsilon \rightarrow 0\). Writing \(u_{\epsilon ,h}\) in place of \(u_h\) to make the dependence on \(\epsilon \) explicit, after \(h \rightarrow 0\), we are left with

$$\begin{aligned} u_{\epsilon } = \lim _{h \rightarrow 0} u_{\epsilon ,h} = G_{\sigma (y,\epsilon )} * u_0 \end{aligned}$$

where \(G_{\sigma (y,\epsilon )}\) is an integral convolution operator. For \(u_0 \in BV({\mathcal {U}})\) the limit

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} u_{\epsilon }(\mathbf{x}) \end{aligned}$$

can be characterized almost completely by well-known results for functions of bounded variation. For completeness, we quote these results and the relevant definitions here:

Definition 5

(Approximate limit) We say that \(u \in BV(\varOmega )\) has an approximate limit at \(\mathbf{x} \in \varOmega \) if there is a \(z \in \mathbb {R}\) such that

$$\begin{aligned} \lim _{\rho \downarrow 0} \frac{1}{|B_{\rho }(\mathbf{x})|}\int _{B_{\rho }(\mathbf{x})} |u(y)-z|dy = 0. \end{aligned}$$

The set \(S_u\) of points where this property does not hold is called the approximate discontinuity set. For any \(\mathbf{x} \in \varOmega \backslash S_u\), we define \({\tilde{u}}(\mathbf{x}) = z\).

Definition 6

(Approximate jump points) Given \(\mathbf{x} \in \varOmega \subseteq \mathbb {R}^n\), \(\mathbf{v} \in S^{n-1}\), and \(\rho > 0\), define:

$$\begin{aligned} B^+_{\rho }(\mathbf{x},\mathbf{v}):= & {} \{ \mathbf{y} \in B_{\rho }(\mathbf{x}) : (\mathbf{y}-\mathbf{x})\cdot \mathbf{v} > 0\} \\ B^-_{\rho }(\mathbf{x},\mathbf{v}):= & {} \{ \mathbf{y} \in B_{\rho }(\mathbf{x}) : (\mathbf{y}-\mathbf{x})\cdot \mathbf{v} < 0\}. \end{aligned}$$

Let \(u \in BV(\varOmega )\) with \(\varOmega \subseteq \mathbb {R}^n\), and let \(\mathbf{x} \in \varOmega \). We say that \(\mathbf{x}\) is an approximate jump point of u if there exist \(a,b \in \mathbb {R}\) and \(\mathbf{v} \in S^{n-1}\) such that \(a \ne b\) and

$$\begin{aligned} \lim _{\rho \downarrow 0} \frac{1}{|B^+_{\rho }(\mathbf{x},\mathbf{v})|}\int _{B^+_{\rho }(\mathbf{x},\mathbf{v})} |u(y)-a|dy = 0. \\ \lim _{\rho \downarrow 0} \frac{1}{|B^-_{\rho }(\mathbf{x},\mathbf{v})|}\int _{B^-_{\rho }(\mathbf{x},\mathbf{v})} |u(y)-b|dy = 0. \end{aligned}$$

The set of \(\mathbf{x} \in \varOmega \) where this property holds is called the approximate jump set \(J_u\) of u. For any \(\mathbf{x} \in J_u\), we define \(u^+(\mathbf{x}) =a\) and \(u^-(\mathbf{x}) =b\).

Theorem 5

If \(\rho _{\epsilon }\) is a family of mollifiers and if \(\mathbf{x} \in \varOmega \backslash S_u\), then the functions \(u * \rho _{\epsilon }(\mathbf{x})\) converge to \({\tilde{u}}(\mathbf{x})\) as \(\epsilon \downarrow 0\). Moreover, if \(\mathbf{x} \in J_u\), then

$$\begin{aligned} \lim _{\epsilon \downarrow 0} u * \rho _{\epsilon }(\mathbf{x}) = \frac{u^+(\mathbf{x})+u^-(\mathbf{x})}{2}. \end{aligned}$$

Proof

See Proposition 3.64 and Corollary 3.80 on pages 160 and 175 respectively of [1]. \(\square \)

März and Bornemann’s high-resolution vanishing viscosity limit is also the solution to a transport equation, differing from our fixed-ratio limit only in its transport operator, which we denote here by \(\varPi _{M}\) (see [7, 31] for details). If \(u_0 \in SBV({\mathcal {U}})\), the approximate discontinuity set \(S_{u_0}\) equals the approximate jump set \(J_{u_0}\), and in this case we have

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} u_{\epsilon }({\mathbf{x}})=u_{M}({\mathbf{x}}) \end{aligned}$$

where

$$\begin{aligned} u_{M}(\mathbf{x}) = u_0(\varPi _{M}({\mathbf{x}})) \quad \text{ if } \varPi _{M}({\mathbf{x}}) \notin J_{u_0} \end{aligned}$$

and

$$\begin{aligned} u_{M}(\mathbf{x}) = \alpha u^+_0(\varPi _{M}(\mathbf{x}))+(1-\alpha )u^-_0(\varPi _{M}(\mathbf{x})) \quad \text{ if } \varPi _{M}(\mathbf{x}) \in J_{u_0} \end{aligned}$$

for some \(\alpha \in [0,1]\). This is a result that März and Bornemann appear to be unaware of, as their approach is very different from ours.

In our case, however, \({\tilde{g}}_{\sigma (y,h)}\) is discrete and \(BV({\mathcal {U}})\) (or \(SBV({\mathcal {U}})\)), where functions are only defined up to a set of measure zero, does not give us the right setup to prove our result. Moreover, even for piecewise continuous boundary data with simple jump-type discontinuities, the behavior of \(u_h(\mathbf{x})\) when \(\varPi _{\theta _r^*}(\mathbf{x})\) is not a continuity point of \(u_0\) is in general rather complex. Therefore, we content ourselves with proving convergence in the case where \(\varPi _{\theta _r^*}(\mathbf{x})\) is a continuity point of \(u_0\).

Theorem 6

Assume \(u_0\) obeys the same conditions as in Theorem 2. Then for every \(\mathbf{x} \in D\) such that \(\varPi _{\theta _r^*}(\mathbf{x})\) is a continuity point of \(u_0\) we have

$$\begin{aligned} u_h(\mathbf{x}) \rightarrow u(\mathbf{x}) = u_0(\varPi _{\theta _r^*}(\mathbf{x})) \qquad \text{ as } h \rightarrow 0. \end{aligned}$$

where \(\varPi _{\theta _r^*}(\mathbf{x})\) is given by

$$\begin{aligned} \varPi _{\theta _r^*}(x,y)=(x-\cot (\theta _r^*)y \mod 1,0), \end{aligned}$$

and

$$\begin{aligned} \theta _r^* = \theta (\mathbf{g}_r^*) \in (0,\pi ) \end{aligned}$$

is the counterclockwise angle between the x-axis and the line \(L_{\mathbf{g}_r^*} := \{ \lambda \mathbf{g}_r^* : \lambda \in \mathbb {R} \}\), and \(\mathbf{g}^*_r\) is the center of mass of the stencil \(a^*_r\) which can be computed in two equivalent ways (Definition 1)

$$\begin{aligned} \mathbf{g}^*_r =\frac{\sum _{\mathbf{y} \in a^*_r} w_r(0,\mathbf{y})\mathbf{y}}{\sum _{\mathbf{y} \in a^*_r} w_r(0,\mathbf{y})}=\frac{\sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} {\tilde{w}}_r(0,\mathbf{y})\mathbf{y}}{\sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} {\tilde{w}}_r(0,\mathbf{y})}. \end{aligned}$$

Proof

Assume that \(\varPi _{\theta _r^*}(\mathbf{x}):=\hat{\mathbf{x}}\) is a continuity point of \(u_0\), and let \(\epsilon > 0\) be given. Note that

$$\begin{aligned} \Vert u_h(\mathbf{x}) - u_0(\hat{\mathbf{x}})\Vert\le & {} \Vert u_h(\mathbf{x}) - ({\tilde{g}}_{\sigma (y,h)} * u_0)(\hat{\mathbf{x}})\Vert \\&+ \Vert ({\tilde{g}}_{\sigma (y,h)} * u_0)(\hat{\mathbf{x}}) - u_0(\hat{\mathbf{x}})\Vert . \end{aligned}$$

By Theorem 2, for h sufficiently small the first term in the above sum is at most \(\frac{\epsilon }{2}\). It suffices to show that taking h sufficiently small bounds the second term by \(\frac{\epsilon }{2}\) as well. For convenience, within this proof, given \(\mathbf{x}:=(x_1,x_2), \mathbf{y}:=(y_1,y_2) \in [0,1) \times \mathbb {R}\), we define \(B_{r,h}(\mathbf{x})\) and \(\mathbf{x}-\mathbf{y}\) in the circular sense, that is

$$\begin{aligned} B_{r,h}(\mathbf{x}):= & {} \{ \mathbf{y} : d^{{\text {circ}}}_1(x_1,y_1)^2+(x_2-y_2)^2 \le r^2 \}\\ \mathbf{x}-\mathbf{y}:= & {} (x_1-y_1 \mod 1,x_2-y_2). \end{aligned}$$

For any \(r > 0\), since \({\tilde{g}}_{\sigma (y,h)}\) is non-negative and sums to one over \({\mathcal {U}}_h\), we have

$$\begin{aligned}&\Vert ({\tilde{g}}_{\sigma (y,h)} * u_0)(\hat{\mathbf{x}}) - u_0(\hat{\mathbf{x}})\Vert \\&\quad = \left\| \sum _{\mathbf{y} \in {\mathcal {U}}_h} {\tilde{g}}_{\sigma (y,h)}(\hat{\mathbf{x}}-\mathbf{y})\left( u_0(\mathbf{y}) - u_0(\hat{\mathbf{x}})\right) \right\| \\&\quad \le \sum _{\mathbf{y} \in {\mathcal {U}}_h \cap B_{r,h}(\hat{\mathbf{x}})} {\tilde{g}}_{\sigma (y,h)}(\hat{\mathbf{x}}-\mathbf{y})\Vert u_0(\mathbf{y}) - u_0(\hat{\mathbf{x}})\Vert \\&\qquad + \sum _{\mathbf{y} \in {\mathcal {U}}_h \backslash B_{r,h}(\hat{\mathbf{x}})} {\tilde{g}}_{\sigma (y,h)}(\hat{\mathbf{x}}-\mathbf{y})\Vert u_0(\mathbf{y}) - u_0(\hat{\mathbf{x}})\Vert . \end{aligned}$$

Take r small enough that \(\mathbf{y} \in B_{r,h}(\hat{\mathbf{x}})\) implies \(\Vert u_0(\mathbf{y})-u_0(\hat{\mathbf{x}})\Vert <\frac{\epsilon }{4}\), which is possible because \(\hat{\mathbf{x}}\) is a continuity point of \(u_0\). At the same time, since \(\sigma (y,h) \rightarrow 0\) as \(h \rightarrow 0\), taking h sufficiently small ensures that the mass of \({\tilde{g}}_{\sigma (y,h)}(\hat{\mathbf{x}}-\mathbf{y})\) outside of \(B_{r,h}(\hat{\mathbf{x}})\) is no more than \(\frac{\epsilon }{8M}\). Noting that \(\Vert u_0(\mathbf{y}) - u_0(\hat{\mathbf{x}})\Vert \le 2M\), this gives

$$\begin{aligned} \Vert ({\tilde{g}}_{\sigma (y,h)} * u_0)(\hat{\mathbf{x}}) - u_0(\hat{\mathbf{x}})\Vert\le & {} \sum _{\mathbf{y} \in {\mathcal {U}}_h \cap B_{r,h}(\hat{\mathbf{x}})} {\tilde{g}}_{\sigma (y,h)}(\hat{\mathbf{x}}-\mathbf{y})\frac{\epsilon }{4} \\&\quad + \sum _{\mathbf{y} \in {\mathcal {U}}_h \backslash B_{r,h}(\hat{\mathbf{x}})} {\tilde{g}}_{\sigma (y,h)}(\hat{\mathbf{x}}-\mathbf{y})\, 2M \\\le & {} 1 \cdot \frac{\epsilon }{4} +\frac{\epsilon }{8M} \cdot 2M \le \frac{\epsilon }{2} \end{aligned}$$

as required. \(\square \)
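The limit just proved is easy to compute with in practice: one only needs the center of mass of the stencil and the induced transport map. The sketch below (ours; the toy stencil is an arbitrary example) evaluates \(\mathbf{g}^*_r\), \(\theta ^*_r\), and \(\varPi _{\theta _r^*}\) as they appear in Theorem 6.

```python
import numpy as np

def limiting_transport(pts, w):
    """Center of mass g_r^* of a stencil (Definition 1) and the angle
    theta_r^* = theta(g_r^*) of the line it spans, taken modulo pi."""
    g = (w[:, None] * pts).sum(axis=0) / w.sum()
    return g, np.arctan2(g[1], g[0]) % np.pi

def transport_to_boundary(x, y, theta):
    """Pi_{theta_r^*}(x, y) = (x - cot(theta) y mod 1, 0), as in Theorem 6."""
    return (x - y / np.tan(theta)) % 1.0, 0.0

# toy stencil: three downward offsets with equal weight (an ad hoc example)
pts = np.array([(-2.0, -1.0), (-1.0, -1.0), (0.0, -1.0)])
g, th = limiting_transport(pts, np.ones(3))
print(g, np.rad2deg(th), transport_to_boundary(0.3, 0.2, th))
```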

Remark 13

Here we have derived the fixed-ratio continuum limit as a corollary of the asymptotic limit. This was done for the sake of efficiency; the fixed-ratio limit can also be derived as a standalone result. Moreover, the standalone result is much more general, covering convergence in \(L^p\) rather than just pointwise convergence, and the assumption that the stencil \(a^*_r\) not be confined to any sublattice of \(\mathbb {Z}^2\) (Definition 3) can be eliminated. Convergence rates are also given, which turn out to depend on both the choice of \(L^p\) norm and the regularity of the boundary data \(u_0\). However, the proof is very long and technical, so we have published it online only—please see [26].

7 Consequences

In this section we apply the fixed-ratio continuum limit and asymptotic limit of the previous section, that is, Theorems 2 and 6, in order to explain:

  1.

    The “kinking” and “blur” artifacts listed in Sect. 2.2 and illustrated in Figs. 5, 6, and 8.

  2.

    The differences between the direct form of Algorithm 1 and its semi-implicit extension in terms of coping with these artifacts.

We begin by considering kinking artifacts, and then go on to blur. We will see that the fixed-ratio continuum limit from Theorem 6 is all that is required to understand kinking artifacts, while blur artifacts require the asymptotic limit from Theorem 2. Shocks and cut-off isophotes are not considered, as they have already been adequately explained elsewhere [31, 32].

7.1 Kinking

We begin by exploring kinking artifacts. First we prove a fundamental distinction between the direct form of Algorithm 1 and the semi-implicit extension. Then we go on to look at the limiting transport directions associated with three specific methods: coherence transport, Guidefill, and semi-implicit Guidefill.

7.1.1 The Direct Form of Algorithm 1 Kinks, the Semi-Implicit Extension Need Not

An immediate corollary of Theorem 6 is that the direct form of Algorithm 1 is limited in terms of what directions it can extrapolate along, while this is not true of the semi-implicit extension. This follows from the geometric interpretation of the limiting transport direction \(\mathbf{g}^*_r\) as the center of mass of the stencil \(a^*_r\) with respect to the stencil weights \(\{ w_r(\mathbf{0},\mathbf{y})/W : \mathbf{y} \in a^*_r \}\). Specifically, since the original weights \(\{ \frac{w_r(\mathbf{0},\mathbf{y})}{W} \}_{\mathbf{y} \in a^*_r}\) are non-negative and sum to one, by the preservation of total mass (10) and inheritance of non-negativity (12) properties of equivalent weights, the same is true of the equivalent weights \(\{ \frac{{\tilde{w}}_r(\mathbf{0},\mathbf{y})}{W} \}_{\mathbf{y} \in \text{ Supp }(a^*_r)}\). Hence

$$\begin{aligned} \mathbf{g}^*_r = \sum _{\mathbf{y} \in \text{ Supp }(a^*_r)} \frac{{\tilde{w}}_r(\mathbf{0},\mathbf{y})}{W} \mathbf{y} \in \text{ Conv }(\text{ Supp }(a^*_r)), \end{aligned}$$

where \(\text{ Conv }(\text{ Supp }(a^*_r))\) denotes the convex hull of \(\text{ Supp }(a^*_r)\). But by (26) we have

$$\begin{aligned} \text{ Supp }(a^*_r) \subseteq {\left\{ \begin{array}{ll} {\bar{b}}^-_r &{} \text{ direct } \text{ form } \text{ of } \text{ Algorithm } \text{1 } \\ {\bar{b}}^0_r &{} \text{ semi-implicit } \text{ form } \end{array}\right. } \end{aligned}$$
(56)

where \({\bar{b}}^0_r\) and \({\bar{b}}^-_r\) are the dilated sets defined by (24) and (25) respectively. It follows that \(\mathbf{g}^*_r\) has to lie in the convex hull of \({\bar{b}}^-_r\) if the direct form of Algorithm 1 is used, or the convex hull of \({\bar{b}}^0_r\) if the semi-implicit form is used instead. That is

$$\begin{aligned} \mathbf{g}^*_r \in {\left\{ \begin{array}{ll} \text{ Conv }({\bar{b}}^-_r) &{} \text{ direct } \text{ form } \text{ of } \text{ Algorithm } \text{1. } \\ \text{ Conv }({\bar{b}}^0_r) &{} \text{ semi-implicit } \text{ form. } \end{array}\right. } \end{aligned}$$
(57)

This in turn immediately implies that if we use the direct form of Algorithm 1, then \(\theta (\mathbf{g}^*_r)\) is restricted to the cone

$$\begin{aligned} \theta _c \le \theta (\mathbf{g}^*_r) \le \pi - \theta _c \end{aligned}$$
(58)

where

$$\begin{aligned} \theta _c = \arcsin \left( \frac{1}{r}\right) \end{aligned}$$
(59)

is the smallest possible angle one can make given the restriction (57), while for the semi-implicit method we have

$$\begin{aligned} 0< \theta (\mathbf{g}^*_r) < \pi \end{aligned}$$
(60)

where the angles \(\theta = 0\) and \(\theta = \pi \) are omitted not because of the restriction (57), but because Theorem 6 does not apply in either case (indeed, the continuum limit u given by (30) is not even defined). The cone (58) is exactly what we saw in Fig. 5c. At the same time, the lack of restrictions implied by (60) is consistent with our experience in Figs. 12 and 14, where semi-implicit Guidefill was able to successfully extrapolate lines making an angle as small as \(1^{\circ }\) with the inpainting domain boundary for \(r=3\), well under the critical angle \(\theta _c= \arcsin (\frac{1}{3}) \approx 19.5^{\circ }\) where standard Guidefill breaks down. Figure 16 illustrates this result.
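For reference, the cone restriction (58)-(59) is trivial to tabulate; the snippet below (ours) prints the admissible range of \(\theta (\mathbf{g}^*_r)\) for the direct form at a few radii.

```python
import numpy as np

# The cone (58)-(59) of attainable transport directions for the direct form:
for r in (3, 5, 10):
    theta_c = np.degrees(np.arcsin(1.0 / r))
    print(f"r = {r:2d}: theta restricted to [{theta_c:.2f}, {180 - theta_c:.2f}] deg")
# r = 3 gives theta_c ~ 19.47 deg, the Guidefill breakdown angle noted above;
# the semi-implicit extension is subject to no such restriction (60).
```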

Fig. 16
figure 16

Convex hulls and transport directions: An immediate corollary of Theorem 6 is that the limiting transport direction \(\mathbf{g}^*_r\) lies in the convex hull of \({\bar{b}}^-_r\) for the direct form of Algorithm 1, and the convex hull of \({\bar{b}}^0_r\) for the semi-implicit extension. Since \(\text{ Conv }({\bar{b}}^-_r)\) contains only a cone of directions while \(\text{ Conv }({\bar{b}}^0_r)\) contains the full arc from 0 to \(\pi \), this explains why Guidefill (and more generally all direct methods of the general form given by Algorithm 1) fails for shallow angles, while this is not true of the semi-implicit extension. To illustrate this, we have repeated the experiment from Fig. 5 using Guidefill (a) and semi-implicit Guidefill (b), while superimposing the sets \(\text{ Conv }(-{\bar{b}}^-_r)\) and \(\text{ Conv }(-{\bar{b}}^0_r)\) respectively (we have negated the sets for convenience, which we may do since \(\mathbf{g}^*_r\) and \(-\mathbf{g}^*_r\) define the same transport equation). Note that in the case of semi-implicit Guidefill, the lines appear to get fainter as \(\theta (\mathbf{g}^*_r) \rightarrow 0\) and \(\theta (\mathbf{g}^*_r) \rightarrow \pi \). This likely has two causes. For one, the angular footprint of the red dot in Fig. 5a, i.e. the width of the dot multiplied by \(\sin \theta ^*_r\) (which tells you how much of the dot is “visible” from direction \(\theta ^*_r\)), goes to zero. Secondly, we will see in Sect. 7.3 (see in particular Fig. 23) that semi-implicit Guidefill has blur artifacts that become arbitrarily bad as \(\theta (\mathbf{g}^*_r) \rightarrow 0\) or \(\theta (\mathbf{g}^*_r) \rightarrow \pi \). Nevertheless, Fig. 12 demonstrates that semi-implicit Guidefill can successfully extrapolate lines with \(\theta \) very close to 0 without serious issues of faintness. All parameters are the same as in Fig. 5

7.1.2 Limiting Transport Directions for Coherence Transport, Guidefill, and Semi-Implicit Guidefill

Fig. 17
figure 17

Limiting transport direction \(\theta ^*_r=\theta (\mathbf{g}^*_r)\) as a function of the guidance direction \(\theta = \theta (\mathbf{g})\) for the three main methods: The theoretical limiting curves \(\theta ^*_r =F(\theta )\) obtained for coherence transport (a), (b), Guidefill (c), (d), and semi-implicit Guidefill (e), (f). The desired curve \(F(\theta )=\theta \) is highlighted in red. Coherence transport exhibits a staircase pattern with \(F(\theta ) \ne \theta \) for all but finitely many \(\theta \), Guidefill obeys \(F(\theta )=\theta \) for all \(\theta \in (\theta _c, \pi - \theta _c)\) where \(\theta _c\) is the critical angle (59), and semi-implicit Guidefill obeys \(F(\theta ) = \theta \) for all \(\theta \ne 0\). All angles are in degrees

Here we derive formulas for the limiting transport directions of coherence transport, Guidefill, and semi-implicit Guidefill in the limit as \(\mu \rightarrow \infty \). For convenience, in this section we rescale the limiting transport direction \(\mathbf{g}^*_r\) by a factor of \(W = \sum _{\mathbf{y} \in a^*_r} w_r(\mathbf{0},\mathbf{y})\), giving

$$\begin{aligned} \mathbf{g}^*_r = \sum _{\mathbf{y} \in a^*_r} w_r(\mathbf{0},\mathbf{y})\mathbf{y}. \end{aligned}$$
(61)

This is more convenient to work with and defines the same transport equation. In fact, the ability to rescale \(\mathbf{g}^*_r\) by any \(\lambda \ne 0\) without changing the underlying transport equation is a tool that we will use repeatedly in our arguments throughout this section. To make the dependence of the transport direction on \(\mu \) explicit, within this section we write this direction as \(\mathbf{g}^*_{r,\mu }\), and define

$$\begin{aligned} \mathbf{g}^*_r := \lim _{\mu \rightarrow \infty } \mathbf{g}^*_{r,\mu }. \end{aligned}$$

In order to resolve the ambiguity that \(\mathbf{g}^*_r\) and \(-\mathbf{g}^*_r\) define the same transport equation, we will frequently be taking angles modulo \(\pi \). This is reflected in the definition we have selected for \(\theta (\mathbf{v})\)—recall that this refers to the angle that the line \(L_{\mathbf{v}}=\{ \lambda \mathbf{v} : \lambda \in \mathbb {R} \}\) makes with the x-axis, and hence always lies in the range \([0,\pi )\). We define \(\mathbf{g}=(\cos \theta ,\sin \theta )\), \(\theta ^*_{r,\mu } :=\theta (\mathbf{g}^*_{r,\mu })\), \(\theta ^*_r :=\theta (\mathbf{g}^*_r)\) and consider the function \(\theta ^*_r = F(\theta )\). Our main concern is to determine when \(\theta ^*_r = \theta \) (no kinking) and when \(\theta ^*_r \ne \theta \) (kinking). Results are illustrated in Fig. 17.

Coherence transport We have already stated in Sect. 2 and illustrated in Fig. 5 that coherence transport in the limit \(\mu \rightarrow \infty \) kinks unless \(\mathbf{g}=\lambda \mathbf{v}\) for some \(\mathbf{v} \in B_{\epsilon ,h}(\mathbf{0})\) and some \(\lambda \in \mathbb {R}\). Under the assumptions of Sect. 6.1, a more precise statement is that coherence transport in the limit \(\mu \rightarrow \infty \) fails to kink if and only if \(\mathbf{g}=\lambda \mathbf{v}\) for some \(\mathbf{v} \in b^-_r\) and \(\lambda \in \mathbb {R}\). Before we prove this, let us make some definitions. We define the angular spectrum of \(b^-_r\) as

$$\begin{aligned} \varTheta (b^-_r) := \{ \theta _1,\ldots ,\theta _n \in [0,\pi ) : \exists {} \mathbf{y} \in b^-_r \text{ s.t. } \theta _i = \theta (\mathbf{y}) \}. \end{aligned}$$
(62)

In other words, \(\varTheta (b^-_r)\) is the collection of angles modulo \(\pi \) that are representable using members of \(b^-_r\) (or, to be more mathematically precise, the set of elements of the projective space \(\mathbb {RP}^1\) that are representable); see Fig. 18 for an illustration. The angular spectrum may be defined similarly for more general sets, and we do so in “Appendix E”. We will show that for coherence transport, when \(0< \theta < \pi \), we either have

$$\begin{aligned} \theta ^*_r \in \varTheta (b^-_r) \end{aligned}$$

or

$$\begin{aligned} \theta ^*_r = \frac{\theta _i+\theta _{i+1}}{2} \quad \text{ for } \text{ consecutive } \theta _i, \theta _{i+1} \in \varTheta (b^-_r). \end{aligned}$$

To begin, note that in this case (61) becomes

$$\begin{aligned} {\mathbf{g}}^*_{r,\mu } := \sum _{{\mathbf{y}} \in b^-_r} e^{-\frac{\mu ^2}{2r^2}({\mathbf{y}} \cdot {\mathbf{g}}^{\perp })^2}\frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert }, \end{aligned}$$

where \(b^-_r\) is the discrete half ball (23). Denote by \(\varPsi \) the set of minimizers of \(|\mathbf{y} \cdot \mathbf{g}^{\perp }|\) for \(\mathbf{y} \in b^-_r\), meaning that \(|\mathbf{y} \cdot \mathbf{g}^{\perp }|:=\varDelta \ge 0\) for all \(\mathbf{y} \in \varPsi \) and \(|\mathbf{y} \cdot \mathbf{g}^{\perp }|>\varDelta \) for all \(\mathbf{y} \in b^-_r \backslash \varPsi \). After rescaling by \(e^{\frac{\mu ^2}{2r^2}\varDelta ^2}\), the transport direction \(\mathbf{g}^*_{r,\mu }\) becomes

$$\begin{aligned} {\mathbf{g}}^*_{r,\mu }= & {} \sum _{{\mathbf{y}} \in \varPsi } \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert }+\sum _{{\mathbf{y}} \in b^-_r \backslash \varPsi } e^{-\frac{\mu ^2}{2r^2}\left\{ ({\mathbf{y}} \cdot {\mathbf{g}}^{\perp })^2 - \varDelta ^2 \right\} }\frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \nonumber \\\rightarrow & {} \sum _{{\mathbf{y}} \in \varPsi } \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \qquad \text{ as } \mu \rightarrow \infty . \end{aligned}$$
(63)

Note that \(|\mathbf{y} \cdot \mathbf{g}^{\perp }|\) represents the distance from the point \(\mathbf{y}\) to the line through the origin \(L_\mathbf{g}\). Thus computing the set \(\varPsi \) is equivalent to finding the set of points in \(b^-_r\) closest to a given line through the origin. In “Appendix E” we prove that as \(\theta \) sweeps out an arc from 0 to \(\pi \), the set \(\varPsi \) is a singleton for all but finitely many \(\theta \); the resulting lone minimizers form a sequence that we enumerate (in order of occurrence) as \(\mathbf{y}_1, \mathbf{y}_2,\ldots \mathbf{y}_{n'}\) (for some finite \(n'\)). Now, it turns out that \(n'=n\) and moreover for every \(\theta _i \in \varTheta (b^-_r)\) we have

$$\begin{aligned} \theta _i = \theta (\mathbf{y}_i) \end{aligned}$$
Fig. 18
figure 18

Illustration of the angular spectrum: the angular spectrum \(\varTheta (b^-_r)\) tells us which angles (modulo \(\pi \)), are representable using elements of \(b^-_r\). Here we have illustrated \(\varTheta (b^-_r)\), measured in degrees, for \(r=1,2,\ldots ,10\)

(“Appendix E”, Proposition 4). In other words, we have a 1-1 correspondence between (singleton) minimizers of \(|\mathbf{y} \cdot \mathbf{g}^{\perp }|\) and the angular spectrum \(\varTheta (b^-_r)\). Moreover, it can be shown that if \(\theta _i< \theta < \theta _{i+1}\) for some \(\theta _i, \theta _{i+1} \in \varTheta (b^-_r)\), then either \(\varPsi = \{ \mathbf{y}_i \}\) (for \(\theta \) close to \(\theta _i\)) or \(\varPsi = \{ \mathbf{y}_{i+1} \}\) (for \(\theta \) close to \(\theta _{i+1}\)) or \(\varPsi = \{\mathbf{y}_i, \mathbf{y}_{i+1} \}\) if \(\theta =\theta _{i,i+1}\), where \(\theta _{i,i+1} \in (\theta _i,\theta _{i+1})\) is a critical angle we refer to as a transition angle.

Definition 7

Given minimizers \(\{ \mathbf{y}_i \}_{i=1}^n\) defined as above we define the set of transition angles \(\{ \theta _{i,i+1} \}_{i=1}^{n-1}\) by

$$\begin{aligned} \theta _{i,i+1} = \theta (\mathbf{y}_i + \mathbf{y}_{i+1}) \quad \text{ for } 1 \le i \le n-1, \end{aligned}$$

where \(\theta (\mathbf{y})\) denotes as usual the angle between the line \(L_\mathbf{y}\) and the x-axis.

For convenience, we also define \(\theta _{0,1}=0\) and \(\theta _{n,n+1}=\pi \). Since one can also prove \(\varPsi = \{ \mathbf{y}_1 \}\) for \(0:=\theta _{0,1}<\theta <\theta _1\) and \(\varPsi = \{\mathbf{y}_n\}\) for \(\theta _n< \theta < \theta _{n,n+1}:=\pi \), with this notation we have the general result

$$\begin{aligned} \varPsi ={\left\{ \begin{array}{ll} \{\mathbf{y}_i\} \text{ if } \theta _i< \theta< \theta _{i,i+1} \\ \text{ for } \text{ some } i=1,\ldots ,n \\ \{\mathbf{y}_i,\mathbf{y}_{i+1}\} \text{ if } \theta =\theta _{i,i+1} \\ \text{ for } \text{ some } i=1,\ldots ,n-1 \\ \{\mathbf{y}_{i+1}\} \text{ if } \theta _{i,i+1}< \theta < \theta _{i+1} \\ \text{ for } \text{ some } i=0,\ldots ,n-1 \\ \end{array}\right. } \end{aligned}$$
Fig. 19
figure 19

Closest points and shallowest angles: we claim that the closest point in the set \(b^-_r \subset \mathbb {Z}^2\) to a given line \(L_{\mathbf{g}}\) (\(\mathbf{g}=(\cos \theta ,\sin \theta )\)) is always one of the two points casting the shallowest angle with \(L_\mathbf{g}\) on either side, as illustrated in a for \(\theta = 53^{\circ }\), \(r=4\). This statement does not hold if \(b^-_r\) is replaced with a generic \(A \subset \mathbb {Z}^2\), as demonstrated by the counterexample \(A=\{(1,2),(4,6),(6,4)\}\) and \(\theta = 45^{\circ }\) in b

(“Appendix E”, Proposition 5—note that we have carefully excluded \(\theta _{0,1}:=0\) and \(\theta _{n,n+1}=\pi \) from the middle case). In words, this means that the element(s) of \(b^-_r\) closest to the line \(L_\mathbf{g}\) are also the member(s) of \(b^-_r\) that cast the shallowest angles with \(L_\mathbf{g}\) from above and below. This statement is not true if \(b^-_r\) is replaced with a generic subset of \(\mathbb {Z}^2\)—see Fig. 19. The remaining cases are \(\theta =\theta _i \in \varTheta (b^-_r)\) and \(\theta =0\). We deal with the former first: in this case we have

$$\begin{aligned} \varPsi = \{ \mathbf{y} \in b^-_r : \theta (\mathbf{y}) = \theta _i \}, \end{aligned}$$

a set containing up to r members, all of which are parallel to each other and to \(\mathbf{g}\). In order to make these ideas more concrete, Example 1 gives explicit expressions for \(\varTheta (b^-_r)\) and \(\varPsi \) in the case \(r=3\). Also, in “Appendix E” Remark 20, we give an algorithm for computing \(\varTheta (b^-_r)\) and \(Y(b^-_r):=\{\mathbf{y}_1,\mathbf{y}_2,\ldots ,\mathbf{y}_n\}\) for any r.

Example 1

When \(r=3\) we have

$$\begin{aligned} \varTheta (b^-_3):= & {} \{\theta _1,\theta _2,\theta _3,\theta _4,\theta _5,\theta _6,\theta _7\}\\= & {} \{ \arctan (1/2), \frac{\pi }{4}, \arctan (2), \frac{\pi }{2},\\&\frac{\pi }{2}+\arctan (1/2),\frac{3\pi }{4},\frac{\pi }{2}+\arctan (2)\}. \end{aligned}$$

For \(0 < \theta \le \frac{\pi }{2}\) (we omit \(\frac{\pi }{2}< \theta < \pi \) for brevity), the set of minimizers \(\varPsi \) is given by

$$\begin{aligned} \varPsi = {\left\{ \begin{array}{ll} \{(-2,-1)\}:=\{\mathbf{y}_1\} \\ \text{ if } 0< \theta< \theta _{1,2}. \\ \{(-2,-1),(-1,-1)\}:=\{\mathbf{y}_1,\mathbf{y}_2\} \\ \text{ if } \theta =\theta _{1,2}. \\ \{(-1,-1)\}:=\{\mathbf{y}_2\} \\ \text{ if } \theta _{1,2}< \theta< \theta _2. \\ \{(-1,-1),(-2,-2)\}:=\{\mathbf{y}_2,2\mathbf{y}_2\} \\ \text{ if } \theta = \theta _2. \\ \{(-1,-1)\}:=\{\mathbf{y}_2\} \\ \text{ if } \theta _2< \theta< \theta _{2,3}. \\ \{(-1,-1),(-1,-2)\}:=\{\mathbf{y}_2,\mathbf{y}_3\} \\ \text{ if } \theta = \theta _{2,3}. \\ \{(-1,-2)\}=\{ \mathbf{y}_3\} \\ \text{ if } \theta _{2,3}< \theta< \theta _{3,4}. \\ \{(-1,-2),(0,-1)\}=\{\mathbf{y}_3,\mathbf{y}_4 \} \\ \text{ if } \theta =\theta _{3,4}. \\ \{(0,-1)\}:=\{\mathbf{y}_4\} \\ \text{ if } \theta _{3,4}< \theta < \theta _4. \\ \{(0,-1),(0,-2),(0,-3)\}:=\{\mathbf{y}_4,2\mathbf{y}_4,3\mathbf{y}_4\} \\ \text{ if } \theta = \theta _4. \end{array}\right. } \end{aligned}$$

where \(\theta _{0,1}=0\), \(\theta _{1,2}=\arctan (2/3)\), \(\theta _{2,3}=\arctan (3/2)\), \(\theta _{3,4}=\arctan (3)\).

When \(\varPsi \) is a singleton set, that is \(\varPsi = \{ \mathbf{y}_i \}\) for some \(1 \le i \le n\), (63) becomes \(\mathbf{g}^*_r = \frac{\mathbf{y}_i}{\Vert \mathbf{y}_i\Vert }\) and we have

$$\begin{aligned} \theta (\mathbf{g}^*_r) = \theta \left( \frac{\mathbf{y}_i}{\Vert \mathbf{y}_i\Vert }\right) = \theta _i \end{aligned}$$

On the other hand, if \(\theta = \theta _i \in \varTheta (b^-_r)\), \(\mathbf{g}^*_r\) is a sum of vectors all parallel to one another and to \(\mathbf{g}\), and we get \(\theta ^*_r = \theta _i\) again. This is the lone case in which coherence transport does not kink. Next, at the transition angles \(\theta _{i,i+1}\) where \(\varPsi = \{ \mathbf{y}_i, \mathbf{y}_{i+1} \}\), we have \(\mathbf{g}^*_r = \frac{\mathbf{y}_i}{\Vert \mathbf{y}_i\Vert }+\frac{\mathbf{y}_{i+1}}{\Vert \mathbf{y}_{i+1}\Vert }\), so that

$$\begin{aligned} \theta (\mathbf{g}^*_r) = \theta \left( \frac{\mathbf{y}_i}{\Vert \mathbf{y}_i\Vert }+\frac{\mathbf{y}_{i+1}}{\Vert \mathbf{y}_{i+1}\Vert }\right) = \frac{\theta _i+\theta _{i+1}}{2}, \end{aligned}$$

where we have used the observation (proved in “Appendix E”, Observation 7) that \(\theta (\mathbf{v}+\mathbf{w}) = \frac{\theta (\mathbf{v}) + \theta (\mathbf{w})}{2}\) holds for all unit vectors \(\mathbf{v}, \mathbf{w} \in S^1\). Finally, suppose \(\theta = 0\). Here we have \(\varPsi = \{ (i,-1) : -r+1 \le i \le r-1 \}\), giving

$$\begin{aligned} \mathbf{g}^*_r = e_2 \qquad \text{ for } \theta = 0 \end{aligned}$$

after rescaling. In summary, for \(\theta \in [0,\pi )\) we then have

$$\begin{aligned} \theta ^*_r = {\left\{ \begin{array}{ll} \frac{\pi }{2} &{} \text{ if } \theta = 0 \\ \theta _i &{} \text{ if } \theta _{i-1,i}< \theta < \theta _{i,i+1} \text{ for } \text{ some } i = 1,\ldots n \\ \frac{\theta _i+\theta _{i+1}}{2} &{} \text{ if } \theta = \theta _{i,i+1} \text{ for } \text{ some } i = 1,\ldots n \end{array}\right. } \end{aligned}$$
(64)

See Fig. 17a, b for an illustration of (64) for \(r=3\) and \(r=5\). In “Appendix E”, Corollary 2 we prove a generalization of this result that applies if, for example, the discrete ball used by coherence transport is replaced with a discrete square.
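The staircase (64) can also be reproduced numerically by implementing (63) directly: find the minimizers \(\varPsi \) of \(|\mathbf{y} \cdot \mathbf{g}^{\perp }|\) over \(b^-_r\) and sum their unit vectors. The sketch below is our own illustration (the function name and sample angles are ad hoc choices).

```python
import numpy as np

def limit_direction_ct(theta, r):
    """theta_r^* for coherence transport as mu -> infinity, via (63): sum the
    unit vectors of the minimizers of |y . g_perp| over b_r^-.  Ours."""
    pts = np.array([(i, j) for i in range(-r, r + 1) for j in range(-r, 0)
                    if i * i + j * j <= r * r], dtype=float)
    g_perp = np.array([-np.sin(theta), np.cos(theta)])
    d = np.abs(pts @ g_perp)
    Psi = pts[np.isclose(d, d.min(), atol=1e-12)]       # minimizer set Psi
    g = (Psi / np.linalg.norm(Psi, axis=1, keepdims=True)).sum(axis=0)
    return np.arctan2(g[1], g[0]) % np.pi               # angle modulo pi

for deg in (10, 30, 45, 60, 80):                        # sample angles (ours)
    print(deg, "->", round(np.rad2deg(limit_direction_ct(np.deg2rad(deg), 3)), 2))
# climbs the staircase of Fig. 17a: 26.57, 26.57, 45.0, 63.43, 90.0
```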

Guidefill. In this case (61) becomes

$$\begin{aligned} {\mathbf{g}}^*_{r,\mu } := \sum _{{\mathbf{y}} \in {\tilde{b}}^-_r} e^{-\frac{\mu ^2}{2r^2}({\mathbf{y}} \cdot {\mathbf{g}}^{\perp })^2}\frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert }, \end{aligned}$$

where \({\tilde{b}}^-_r\) is given by (23). It is useful to partition \({\tilde{b}}^-_r\) into a disjoint union of sets \(\ell ^-_k\) such that each \(\ell ^-_k\) is the collection of points in \({\tilde{b}}^-_r\) at distance k from the line \(L_\mathbf{g} = \{ \lambda \mathbf{g} : \lambda \in \mathbb {R}\}\), that is

$$\begin{aligned} \ell ^-_k =\{ n\mathbf{g}+m\mathbf{g}^{\perp } \in {\tilde{b}}^-_r : m = \pm k\}. \end{aligned}$$

Since the weights (6) decay exponentially with distance from \(L_\mathbf{g}\), so long as \(\ell ^-_{0}\) is non-empty we expect the contribution of the other \(\ell ^-_k\) to vanish in the limit. For Guidefill, \(\ell ^-_0\) can be explicitly parametrized as

$$\begin{aligned} \ell ^-_0:= & {} \{ n\mathbf{g} \in {\tilde{b}}^-_r : n\mathbf{g} \cdot e_2 \le -1\} \\= & {} \{ n\mathbf{g} : n=-r,\ldots ,-\lceil \csc \theta \rceil \}. \end{aligned}$$

For \(\ell ^-_0\) to be non-empty, we need \(\lceil \csc \theta \rceil \le r\), which occurs only if \(\theta _c \le \theta \le \pi -\theta _c\), where \(\theta _c\) is the same critical angle (59) from Sect. 7.1.1. If \(\ell ^-_0 \ne \emptyset \), we have

$$\begin{aligned} {\mathbf{g}}^*_{r,\mu }= & {} \sum _{{\mathbf{y}} \in \ell ^-_0} \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert }+\sum _{k=1}^r \sum _{{\mathbf{y}} \in \ell ^-_k} e^{-\frac{\mu ^2}{2r^2}k^2 \Vert {\mathbf{g}}\Vert ^2 }\frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \\\rightarrow & {} \sum _{{\mathbf{y}} \in \ell ^-_0} \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \qquad \text{ as } \mu \rightarrow \infty . \\= & {} \sum _{n=-r}^{-\lceil \csc \theta \rceil } n {\mathbf{g}} \\= & {} {\mathbf{g}} \qquad \text{ after } \text{ rescaling. } \end{aligned}$$

Hence we transport in the correct direction in this case.

On the other hand, if \(\ell ^-_0 = \emptyset \) but \(\theta \ne 0\), then the weights (6) concentrate all their mass into \(\ell ^-_1\), with the contribution of all other \(\ell ^-_k\) vanishing. Unfortunately, unlike \(\ell ^-_0\), the set \(\ell ^-_1\) is not parallel to \(\mathbf{g}\), so we expect kinking in this case. In general \(\ell ^-_1\) can consist of two parallel components on either side of \(\ell ^-_0\). But in this case, since the segment \(L_\mathbf{g} \cap {\tilde{b}}_r\) lies entirely above the line \(y=-1\), we know \(\ell ^-_1\) consists of just one component and we can write it explicitly as

$$\begin{aligned} \ell ^-_1 =\{ n\mathbf{g}-\text{ sgn }(\cos \theta )\mathbf{g}^{\perp } : n=-r+1,\ldots ,-1 \} \end{aligned}$$

(remember that \(\mathbf{g}^{\perp }\) denotes the counterclockwise rotation of \(\mathbf{g}\) by \(90^{\circ }\)). After rescaling by \(e^{\frac{\mu ^2}{2r^2}\Vert \mathbf{g}\Vert ^2}\) we obtain

$$\begin{aligned} {\mathbf{g}}^*_{r,\mu }= & {} \sum _{{\mathbf{y}} \in \ell ^-_1} \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert }+\sum _{k=2}^r \sum _{{\mathbf{y}} \in \ell ^-_k} e^{-\frac{\mu ^2}{2r^2}(k^2-1) \Vert {\mathbf{g}}\Vert ^2 }\frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \\\rightarrow & {} \sum _{{\mathbf{y}} \in \ell ^-_1} \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \qquad \text{ as } \mu \rightarrow \infty . \\= & {} \left( \sum _{n=-r+1}^{-1} \frac{n}{\sqrt{1+n^2}} \right) \mathbf{g}\\&\quad - \text{ sgn }(\cos \theta ) \left( \sum _{n=-r+1}^{-1} \frac{1}{\sqrt{1+n^2}} \right) {\mathbf{g}}^{\perp } \\= & {} {\mathbf{g}}+\text{ sgn } (\cos \theta )\alpha _r {\mathbf{g}}^{\perp } \qquad \text{ after } \text{ rescaling. } \end{aligned}$$

where

$$\begin{aligned} \alpha _r = \frac{\sum _{n=1}^{r-1} \frac{1}{\sqrt{1+n^2}}}{\sum _{n=1}^{r-1} \frac{n}{\sqrt{1+n^2}}}. \end{aligned}$$

Finally, if \(\theta =0\) we have \({\tilde{b}}^-_r = b^-_r\) and we obtain \(\mathbf{g}^*_r = e_2\) as for coherence transport. Defining \(\varDelta \theta _r = \arctan (\alpha _r)\), for \(\theta \in [0,\pi )\) we obtain

$$\begin{aligned} \theta ^*_r = {\left\{ \begin{array}{ll} \frac{\pi }{2} &{} \text{ if } \theta = 0 \\ \theta +\varDelta \theta _r &{} \text{ if } 0< \theta< \theta _c \\ \theta &{} \text{ if } \theta _c \le \theta \le \pi -\theta _c \\ \theta -\varDelta \theta _r &{} \text{ if } \pi - \theta _c< \theta < \pi . \end{array}\right. } \end{aligned}$$
(65)

In other words, aside from the exceptional case \(\theta =0\), we have \(\theta _r^*=\theta \) for the “well behaved” range of values \(\theta _c \le \theta \le \pi -\theta _c\), but \(\theta _r^*\) jumps by a constant angle \(\varDelta \theta _r\) near \(\theta = 0\) and \(\theta =\pi \). The first few values of \(\varDelta \theta _r\) are \(\varDelta \theta _3 \approx 35.8^{\circ }\), \(\varDelta \theta _4 \approx 30.0^{\circ }\), \(\varDelta \theta _5 \approx 25.9^{\circ }\). See Fig. 17c, d for an illustration of (65) for \(r=3\) and \(r=5\).
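The constants \(\varDelta \theta _r\) are straightforward to tabulate. A minimal sketch (taking \(\theta _c = \arcsin (1/r)\), as read off from the condition \(\lceil \csc \theta \rceil \le r\)) follows:

```python
# A minimal sketch: the Guidefill kink angle Delta theta_r = arctan(alpha_r)
# and the limiting direction (65). theta_c = arcsin(1/r) is read off from the
# condition ceil(csc(theta)) <= r.
import numpy as np

def delta_theta(r):
    n = np.arange(1, r)
    alpha_r = np.sum(1 / np.sqrt(1 + n**2)) / np.sum(n / np.sqrt(1 + n**2))
    return np.arctan(alpha_r)

def theta_star(theta, r):
    """Limiting transport direction (65) of Guidefill for theta in [0, pi)."""
    theta_c = np.arcsin(1 / r)
    if theta == 0:
        return np.pi / 2
    if theta < theta_c:
        return theta + delta_theta(r)
    if theta <= np.pi - theta_c:
        return theta
    return theta - delta_theta(r)

for r in (3, 4, 5):
    print(r, round(np.degrees(delta_theta(r)), 1))                # 35.8, 30.0, 25.9
print(round(np.degrees(theta_star(np.radians(10), r=3)), 1))      # 10 deg kinks to 45.8 deg
```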

Semi-implicit Guidefill. The analysis of semi-implicit Guidefill is the same as for Guidefill except that the set \({\tilde{b}}^-_r\) is replaced by \({\tilde{b}}^0_r\). Defining \(\ell ^0_k\) as the collection of points in \({\tilde{b}}^0_r\) at distance k from the line \(L_\mathbf{g} = \{ \lambda \mathbf{g} : \lambda \in \mathbb {R}\}\) as before, we find that in this case

$$\begin{aligned} \ell ^0_0 := \{ n\mathbf{g} \in {\tilde{b}}^0_r : n \mathbf{g} \cdot e_2 \le 0\} \end{aligned}$$

is never empty. In fact, for \(0 \le \theta < \pi \) we have

$$\begin{aligned} \ell ^0_0 = {\left\{ \begin{array}{ll} \{ n\mathbf{g} : -r \le n \le -1\} &{} \text{ if } 0<\theta <\pi \\ \{ n\mathbf{g} : -r \le n \le r, n \ne 0\} &{} \text{ if } \theta = 0. \end{array}\right. } \end{aligned}$$

If \(\theta > 0\), we proceed as before and find

$$\begin{aligned} {\mathbf{g}}^*_{r,\mu }= & {} \sum _{{\mathbf{y}} \in \ell ^0_0} \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert }+\sum _{k=1}^r \sum _{{\mathbf{y}} \in \ell ^0_k} e^{-\frac{\mu ^2}{2r^2}k^2 \Vert {\mathbf{g}}\Vert ^2 }\frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \\\rightarrow & {} \sum _{{\mathbf{y}} \in \ell ^0_0} \frac{\mathbf{y}}{\Vert {\mathbf{y}}\Vert } \qquad \text{ as } \mu \rightarrow \infty . \\= & {} \sum _{n=-r}^{-1} n {\mathbf{g}} \\= & {} {\mathbf{g}} \qquad \text{ after } \text{ rescaling. } \end{aligned}$$

However, if \(\theta =0\), this argument doesn’t work because the elements of \(\ell ^0_0\) all cancel each other out. In fact, in this case we have

Hence, in this case \(\mathbf{g}^*_r = e_2\) yet again, just like for Guidefill and coherence transport. In general, for \(0 \le \theta < \pi \), we have

$$\begin{aligned} \theta ^*_r = {\left\{ \begin{array}{ll} \frac{\pi }{2} &{} \text{ if } \theta = 0 \\ \theta &{} \text{ if } 0< \theta < \pi \end{array}\right. } \end{aligned}$$
(66)

In other words, semi-implicit Guidefill kinks only if \(\mathbf{g}\) is exactly parallel to the boundary of the inpainting domain. See Fig. 17e, f for an illustration of (66) for \(r=3\) and \(r=5\) (the curves are of course the same, since (66) is independent of r).

Remark 14

In contrast to Guidefill and coherence transport, (66) tells us that for semi-implicit Guidefill in the limit \(\mu \rightarrow \infty \) we have \(\theta ^*_r = \theta \) for all \(\theta \in (0,\pi )\), independent of r. This is in fact exactly the same prediction (albeit under stronger simplifying assumptions) that März and Bornemann obtained for coherence transport in their own continuum limit \(u_{M}\) as \(\mu \rightarrow \infty \) [7, p. 14]. We have in some sense come full circle—the original predictions of [7] for coherence transport under their high resolution and vanishing viscosity limit are the same as ours for semi-implicit Guidefill under our fixed ratio limit.

7.2 Numerical Validation of Limiting Transport Directions for Coherence Transport, Guidefill, and Semi-Implicit Guidefill

Fig. 20
figure 20

Stretching a dot into a line: In a we have an inpainting problem consisting of a red dot on a blue background, with the inpainting domain in yellow. In b we see the result of inpainting using coherence transport with \(\mu =40\), \(r=3\), and \(\mathbf{g}=(\cos 45^{\circ },\sin 45^{\circ })\). The dot is now stretched into a line, the orientation of which may be measured to deduce \(\mathbf{g}^*_r\)

Fig. 21
figure 21

Validation of limiting transport directions for coherence transport, Guidefill, and semi-implicit Guidefill: here we compare the limiting transport directions \(\theta ^*_r=\theta (\mathbf{g}^*_r)\) as a function of \(\theta = \theta (\mathbf{g})\) derived in Sect. 7.1.2 for coherence transport, Guidefill, and semi-implicit Guidefill as \(\mu \rightarrow \infty \) with the orientation of extrapolated isophotes in the inpainting problem shown in Fig. 20a (where \(\mu = 40\)). The theoretical curves are shown in blue, while the measured curves are shown in red. In every case we have \(r=3\)

In this experiment we compare the limiting transport directions derived in Sect. 7.1.2 for coherence transport, Guidefill, and semi-implicit Guidefill as \(\mu \rightarrow \infty \) with the orientation of extrapolated isophotes in an actual inpainted image \(u_h\) obtained in practice with finite \(\mu \). In each case we choose as our boundary data the image shown in Fig. 20a, consisting of a red dot on a blue background, with the inpainting domain shown in yellow. We run each algorithm with

$$\begin{aligned} \mathbf{g}=(\cos (k^{\circ }),\sin (k^{\circ })) \end{aligned}$$

for \(k=0,1,\ldots ,90\), with \(\mu =40\) fixed and for various values of r. The dot is then stretched into a line as in Fig. 20b, whose orientation gives \(\mathbf{g}^*_{r,\mu }\) and can be measured numerically.
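The orientation measurement can be sketched as follows; this is a hypothetical procedure (the precise measurement code is not specified here): isolate the stretched red dot by thresholding and take the principal component of the resulting pixel coordinates.

```python
# A minimal sketch of one way to measure the orientation of the stretched dot
# in Fig. 20b (hypothetical; not necessarily the procedure used for Fig. 21).
import numpy as np

def measure_orientation(img):
    """img: (H, W, 3) array in [0, 1]; returns an angle in degrees in [0, 180)."""
    red = (img[..., 0] > 0.5) & (img[..., 2] < 0.5)          # predominantly red pixels
    rows, cols = np.nonzero(red)
    coords = np.stack([cols, -rows.astype(float)], axis=1)   # image rows grow downward
    coords -= coords.mean(axis=0)
    # The leading eigenvector of the 2x2 covariance matrix gives the line direction.
    _, vecs = np.linalg.eigh(np.cov(coords.T))
    v = vecs[:, -1]
    return np.degrees(np.arctan2(v[1], v[0])) % 180.0
```

Comparing the angle measured this way against \(\theta (\mathbf{g}^*_{r,\mu })\) for \(k=0,1,\ldots ,90\) produces curves analogous to the red (measured) curves of Fig. 21.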

Results are shown in Fig. 21 for \(r=3\) (similar results may be found for different values of r, but these are omitted for brevity). The theoretical curves are shown in blue, while the measured curves are shown in red. While we see some smoothing out of the jump discontinuities in the case of coherence transport, this is expected as one can easily show that the convergence to \(\theta ^*=\lim _{\mu \rightarrow \infty } \theta ^*_{r,\mu }\) is pointwise but not uniform, becoming arbitrarily slow in the vicinity of the jumps. On the other hand, for Guidefill and semi-implicit Guidefill we see excellent agreement with theory even in the vicinity of jump discontinuities. This is to be expected as well, since one can easily show that the relevant limits are uniform in this case.

7.3 Consequences of the Asymptotic Limit

In this section, we apply the asymptotic limit from Theorem 2 to study the blur artifacts associated with Algorithm 1 and illustrated in Sect. 2.2. As a motivating example, we consider the situation illustrated in Fig. 22, where we extrapolate a vertical line using Guidefill with two different guidance directions: \(\mathbf{g}_1 = e_2\) and \(\mathbf{g}_2 = e_1\). We have already shown in Sect. 7.1.2 that these lead to the same continuum limit with transport direction \(\mathbf{g}^*_r = e_2\) (Fig. 17). However, \(\mathbf{g}_1 = e_2\) leads to a clean extrapolation (Fig. 22b) while the extrapolation using \(\mathbf{g}_2 = e_1\) suffers from heavy blur (Fig. 22c). Evidently, the fixed-ratio continuum limit from Theorem 6 is inadequate to explain this discrepancy. Instead we turn to the asymptotic limit—plugging the relevant parameters into Theorem 2, we obtain a theoretical result that is indistinguishable from the result obtained in reality (Fig. 22d, e). The asymptotic limit thus appears to yield excellent predictions of the blur obtained in practice.

Fig. 22
figure 22

Transport is not the whole story: in this experiment, the problem shown in a of inpainting \(D=[0,1]^2\) (\(200 \times 200\)px) given data on \([0,1] \times [-0.3,0)\) is solved using Guidefill with \(\mu = 100\), \(r=3\) and \(\mathbf{g}=(\cos \theta ,\sin \theta )\) for \(\theta = \frac{\pi }{2}\) (b) and \(\theta = 0\) (c). While Sect. 7.1.2 tells us that \(\theta = 0\) and \(\theta = \frac{\pi }{2}\) have the same continuum limit, evidently they are very different in reality. This is reflected in the asymptotic limit (Theorem 2), which is able to predict differences not reflected in the continuum limit. In d, e we compare horizontal slices of c at \(y=0.1\) and \(y=1\) respectively with the predictions of Theorem 2. In this case the predictions are accurate to within an error of 1/255, the minimum increment of an image on our machine

Although our asymptotic limit more closely approximates the real behavior of Algorithm 1 than the fixed-ratio continuum limit, the latter remains valid in the high resolution limit \(h \rightarrow 0\). This is illustrated in Fig. 24, where we repeatedly solve the same inpainting problem with \(D = [0,1) \times (0,0.5]\) but at increasingly high resolutions. By examining sections of constant y for differing values of h, we see explicitly the convergence of the discrete solution \(u_h\) to the fixed-ratio continuum limit u.

Remark 15

Since \(u_h\) converges to u obeying the transport PDE (30), Algorithm 1 may be viewed as a numerical method for solving (30). It is well known (see for example [28]) that numerical methods for transport equations tend to introduce blur, and therefore we should not be surprised that this is also the case for Algorithm 1. One way of studying this blur is the technique of modified equations [28, p. 117]. However, this is not the approach we have adopted in this paper.

Angular dependence of blur artifacts Here we briefly explore how the variance \(\sigma (y,h)^2\) depends on \(\mathbf{g}=(\cos \theta ,\sin \theta )\) as \(\theta \) changes. Since \(\sigma (y,h)^2=\frac{\gamma ^2yh}{|\mu _y|^3}\) can be quite complex in general, we explore this dependence numerically. Figure 23 illustrates the angular dependence of \(\sigma (y,h)\) as a function of \(\theta \in [0,\pi ]\) with \(y=1\) and \(h=0.01\) fixed, for the three main methods of interest—coherence transport, Guidefill, and semi-implicit Guidefill (note the log scale). In every case we have \(r=3\) and \(\mu = 100\).

Note that for semi-implicit Guidefill, since \(\mu _y \rightarrow 0\) as \(\theta \rightarrow 0\), we expect blur artifacts to become arbitrarily bad as \(\theta \rightarrow 0\), unless we also have \(\gamma ^2 \rightarrow 0\) as \(\theta \rightarrow 0\) in such a way that the ratio remains bounded. In fact,

$$\begin{aligned} \gamma ^2 = {\text {Var}}(\mu _x V_1 - \mu _y U_1)={\text {Var}}[(\mu _x,\mu _y)^{\perp } \cdot (U_1,V_1)] \end{aligned}$$

is the variance of the increments \(\mathbf{Z}_1=(U_1,V_1)\) orthogonal to the mean \((\mu _x,\mu _y)\), which does go to zero as \(\theta \rightarrow 0\). However, as Fig. 23 illustrates, it does not do so fast enough to compensate for the blow up of \(1/|\mu _y|^3\). Indeed, numerical experiments (omitted for brevity) demonstrate that the blur artifacts for semi-implicit Guidefill do become arbitrarily bad as \(\theta \rightarrow 0\) (Fig. 24).
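The numerical exploration itself is simple. A minimal sketch for coherence transport (again assuming weights proportional to \(e^{-\frac{\mu ^2}{2r^2}(\mathbf{y}\cdot \mathbf{g}^{\perp })^2}\) over \(b^-_r\)) evaluates \(\sigma (y,h)^2\) on a grid of angles:

```python
# A minimal sketch of the computation behind Fig. 23, shown here for coherence
# transport: sigma(y,h)^2 = gamma^2 * y * h / |mu_y|^3 as a function of theta.
# Weights proportional to exp(-mu^2/(2r^2) (y . g_perp)^2) over b^-_r are assumed.
import numpy as np

def sigma2(theta, r=3, mu=100.0, y=1.0, h=0.01):
    g_perp = np.array([-np.sin(theta), np.cos(theta)])
    pts = np.array([[i, j] for i in range(-r, r + 1) for j in range(-r, 0)
                    if i * i + j * j <= r * r], dtype=float)
    w = np.exp(-mu**2 / (2 * r**2) * (pts @ g_perp) ** 2)
    w /= w.sum()                          # stencil weights = law of the increment Z_1
    mx, my = w @ pts                      # mean increment (mu_x, mu_y)
    perp = pts @ np.array([-my, mx])      # mu_x * V_1 - mu_y * U_1, per lattice point
    gamma2 = w @ perp**2                  # orthogonal variance (E[perp] = 0 here)
    return gamma2 * y * h / abs(my) ** 3

# Essentially zero except at the transition angles (here arctan(2/3)), matching
# the discussion of degenerate stencils below.
for th in (np.radians(5), np.radians(30), np.arctan(2 / 3), np.radians(45), np.radians(85)):
    print(round(np.degrees(th), 2), sigma2(th))
```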

Note that for coherence transport, \(\sigma ^2(y,h) = 0\) for all but finitely many angles, explaining the method's apparent lack of blur artifacts (Fig. 6). These special angles correspond precisely to the jumps in Fig. 17a, where coherence transport puts its mass into more than one \(\mathbf{y} \in b^-_r\) (Sect. 7.1.2).

Fig. 23
figure 23

Angular variation of blurring artifacts: here we plot the angular dependence of the variance \(\sigma ^2(y,h)\) from the asymptotic blur kernel \(g_{\sigma (y,h)}\) in Theorem 2, for the three methods coherence transport, Guidefill, and semi-implicit Guidefill. We fix \(r=3\), take \(\mu \rightarrow \infty \), vary \(\mathbf{g}=(\cos \theta ,\sin \theta )\), and plot \(\sigma ^2(y,h)\) as a function of \(\theta \). We fix \(y=1\) and \(h=0.01\)

Degenerate stencils As noted above, coherence transport suffers from no blur artifacts for all but finitely many angles. To understand why this is, recall from the analysis of Sect. 7.1.2 that coherence transport “kinks” because in the limit as \(\mu \rightarrow \infty \), the stencil weights \(w_r(0,\mathbf{y})\) are zero for all but a single \(\mathbf{y}^* \in b^-_r\) which receives stencil weight one (except when \(\theta (\mathbf{g})\) is a transition angle or \(\theta (\mathbf{g})=0\), in which case—as observed in numerical experiments omitted for brevity—coherence transport does blur). We call such stencils degenerate.

Theorem 2 cannot be applied to degenerate stencils as we have

$$\begin{aligned} \varphi _{\mathbf{Z}_i}(\mathbf{u}) = e^{\sqrt{-1}{} \mathbf{y}^* \cdot \mathbf{u}} \end{aligned}$$

which has magnitude one irrespective of \(\mathbf{u}\). In this case, the “random” walk \(\mathbf{X}_{\tau }\) has become deterministic, as the increments \(\varDelta \mathbf{X}_i\) now obey \(\varDelta \mathbf{X}_i= h\mathbf{Z}_i = h\mathbf{y}^*\) with probability one. One may readily show that \(\rho _{\mathbf{X}_{\tau }}\) assigns mass 1 to a single pixel in \({\mathcal {U}}_h\), and hence there is no blur.
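The dichotomy is easy to see numerically: iterating a degenerate (one-point) stencil leaves the law of the walk a point mass, while any stencil with two or more support points spreads mass and hence blurs. A minimal sketch:

```python
# A minimal sketch: a degenerate (one-point) stencil yields a deterministic
# walk whose law stays a point mass, while a stencil with two or more support
# points spreads its mass over many pixels, producing blur.
import numpy as np

def walk_law(stencil, m):
    """Law of the m-step walk: m-fold convolution of the one-step stencil."""
    law = np.array([1.0])
    for _ in range(m):
        law = np.convolve(law, stencil)
    return law

print(walk_law(np.array([1.0]), 20))             # degenerate: still [1.]
print(walk_law(np.array([0.5, 0.5]), 20).max())  # binomial: max mass ~ 0.18
```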

Fig. 24
figure 24

Convergence to the fixed-ratio continuum limit: the continuum problem of inpainting the line \(\tan (73^{\circ })x - 0.1 \le y \le \tan (73^{\circ })x + 0.1\) with image domain \(\varOmega =[-1,1] \times [-0.5,0.5]\) and inpainting domain \(D = [-0.8,0.8] \times [-0.3,0.3]\) is rendered at a variety of resolutions and inpainted each time using Guidefill. Examining cross-sections of \(u_h\) at \(y=0.3\) (on the boundary of \(D_h\)), \(y=0.25\) (just inside), and \(y=0\) (in the middle of \(D_h\)) we notice a gradual deterioration of the initially sharp signal. This deterioration is to be expected in light of Theorem 2, as \(u_h(\mathbf{x})\) is related to a mollified version of \(u_0\) with a Gaussian mollifier \(g_{\sigma (y,h)}\). However, in light of Theorem 6, we also expect that as \(h \rightarrow 0\) and we approach the fixed-ratio continuum limit, this signal degradation should disappear. By examining the same slices at differing resolutions, we see that this is indeed the case

Fig. 25
figure 25

Extrapolating a diagonal line: similarly to Fig. 22, here we consider again the problem of inpainting \(D=[0,1]^2\) given data on \([0,1] \times [-0.3,0)\). This time \(u_0\) consists of a line making an angle of \(10^{\circ }\) with the horizontal, but the slice \(u_0(x,0)\) is the same step function as in Fig. 22. This time \(D_h\) is \(1000 \times 1000\)px. Inpainting is done using semi-implicit Guidefill (\(r=3\), \(\mathbf{g}=(\cos 10^{\circ },\sin 10^{\circ })\), \(\mu = 100\)). In c we show the initially sharp signal at \(y=0\), while d–f compare horizontal slices at \(y = \tan (10^{\circ }) \approx 0.18\), \(y = 2\tan (10^{\circ }) \approx 0.35\) and \(y=3\tan (10^{\circ }) \approx 0.53\) with the predictions of Theorem 2, once again obtaining a very good prediction. Compared with Fig. 22, notice that despite the fact that h has decreased by an order of magnitude, our loss of signal is much more rapid. This is consistent with the divergence \(\sigma (y,h) \rightarrow \infty \) as \(\theta \rightarrow 0\) we will encounter later in Fig. 23

Alternative interpolation schemes

As noted in Remark 7, if we were to modify semi-implicit Guidefill so that ghost pixels are defined not using bilinear interpolation, but by some other interpolation scheme that also

  1.

    can be expressed in terms of non-negative basis functions summing to one

  2.

    preserves polynomials of degree 1

we would obtain an identical continuum limit as for semi-implicit Guidefill. In the future, we would like to explore whether or not another interpolation scheme can be found such that \(\sigma (y,h)\) remains bounded as \(\theta \rightarrow 0\). This would amount to finding a scheme for which the orthogonal variance \(\gamma ^2 \rightarrow 0\) as \(\theta \rightarrow 0\) fast enough to compensate for the blow up of \(1/|\mu _y|^3\).
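For reference, bilinear interpolation itself satisfies both conditions; the following minimal check (with an arbitrary degree-1 polynomial standing in for the image) confirms this numerically:

```python
# A minimal check that bilinear interpolation on the unit cell satisfies the
# two conditions above: non-negative basis functions summing to one, and exact
# reproduction of polynomials of degree 1.
import numpy as np

rng = np.random.default_rng(0)
corners = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
p = lambda q: 2.0 - 3.0 * q[..., 0] + 0.5 * q[..., 1]   # an arbitrary degree-1 polynomial

for _ in range(1000):
    x, y = rng.random(2)
    w = np.array([(1 - x) * (1 - y), x * (1 - y), (1 - x) * y, x * y])
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)           # condition 1
    assert np.isclose(w @ p(corners), p(np.array([x, y])))       # condition 2
print("bilinear interpolation satisfies conditions 1 and 2")
```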

Summary

Theorem 2 and the experiments of this section have four main takeaway messages:

  • Blur gets worse as one moves further into the inpainting domain (i.e., as y increases)—Figs. 22 and 25.

  • Blur gets better as \(h \rightarrow 0\)—Fig. 24.

  • Blur is non-existent if the stencil weights are degenerate—that is, put all of their mass into a single real pixel \(\mathbf{y}\) (since in this case \(\mathbf{Z}_i\) is deterministic and all variances are 0).

  • For semi-implicit Guidefill, blur gets arbitrarily bad as \(\theta (\mathbf{g}) \rightarrow 0\).

8 Applications to Numerical Linear Algebra

Although we have derived our asymptotic and continuum limits in the context of image inpainting, they are abstract results that apply more generally. In particular, the asymptotic limit applies to any situation in which a sequence of vectors \(\{\mathbf{x}^{(m)} \}_{m \in \mathbb {N}}\) is generated recursively (for some \(r \in \mathbb {N}\)) by

$$\begin{aligned} A_0\mathbf{x}^{(m)}= A_1\mathbf{x}^{(m-1)}+A_2\mathbf{x}^{(m-2)}+\ldots +A_r\mathbf{x}^{(m-r)} \end{aligned}$$

where \(A_0,A_1,A_2,\ldots ,A_r\) are circulant or Toeplitz matrices with bandwidth at most \(2r+1\), with \(A_0\) an M-matrix (positive diagonal and non-positive off-diagonal entries) while \(A_1,A_2,\ldots ,A_r\) are element-wise non-negative. An example of a situation where this comes up is the application of damped Jacobi for iteratively solving the linear system

$$\begin{aligned} A\mathbf{x} = \mathbf{b} \end{aligned}$$

where A is a Toeplitz or circulant M-matrix. In this case the evolution of the error is given by

$$\begin{aligned} e^{(m)}={\mathcal {J}}_{\omega } e^{(m-1)}={\mathcal {J}}^m_{\omega } e^{(0)} \end{aligned}$$
(67)

where \(\omega \) is the damping parameter and the iteration matrix \({\mathcal {J}}_{\omega }\) is given by

$$\begin{aligned} {\mathcal {J}}_{\omega } = I - \omega D^{-1}A, \end{aligned}$$

with D denoting the diagonal of A. This can be viewed as a special case of the above with r determined by the bandwidth of A, \(A_0 = I\), \(A_1 = {\mathcal {J}}_{\omega }\), and

$$\begin{aligned} A_2=A_3=\ldots =A_r = O. \end{aligned}$$

One may easily verify that \({\mathcal {J}}_{\omega }\) is non-negative so long as \(\omega \in (0,1)\). Here we demonstrate how our asymptotic limit may be used to predict the evolution of the error of damped Jacobi applied to the one-dimensional convection-diffusion equation

$$\begin{aligned} -u_{xx}+\alpha u_x = f \end{aligned}$$
(68)

with Dirichlet boundary conditions.
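Before proceeding, note that the non-negativity of \({\mathcal {J}}_{\omega }\) claimed above is easy to verify numerically; a minimal sketch (the tridiagonal Toeplitz M-matrix below is an arbitrary illustrative choice) follows:

```python
# A minimal check that J_omega = I - omega D^{-1} A is element-wise non-negative
# for omega in (0, 1) when A is an M-matrix; the tridiagonal Toeplitz A used
# here is an arbitrary illustrative choice.
import numpy as np

N = 50
A = (np.diag(np.full(N, 2.0))
     + np.diag(np.full(N - 1, -1.2), -1)    # positive diagonal,
     + np.diag(np.full(N - 1, -0.8), 1))    # non-positive off-diagonals
D_inv = np.diag(1.0 / np.diag(A))
for omega in (0.1, 0.5, 0.9):
    J = np.eye(N) - omega * D_inv @ A
    assert J.min() >= 0.0
print("J_omega is element-wise non-negative for all tested omega")
```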

Boundary Conditions Technically, our asymptotic limit is only valid for periodic boundary conditions. To make our result rigorous for Dirichlet boundary conditions, we would need to change the stopping time \(\tau \) of the stopped random walk \(\mathbf{X}_{\tau }:=(X_{\tau },Y_{\tau })\) from Sect. 6.4 to

$$\begin{aligned} \tau := \inf \{n: X_n< 0 \text{ or } X_n > 1 \text{ or } Y_n < 0\}. \end{aligned}$$

However, we know that as \(h \rightarrow 0\) the asymptotic limit approaches the continuum limit, and one may easily see that simply padding the boundary data \(u_0\) on either side with an infinite strip of zeros leads to the same continuum limit—and hence this should also lead to a reasonable approximation for the asymptotic limit, at least away from the boundary at \(x=0\) and \(x=1\).

Discretization We consider two discretizations: second-order centered differences, leading to

$$\begin{aligned} A_{\text{ centered }} = {\text {tri}}\left[ -1-\frac{\alpha h}{2},2,-1+\frac{\alpha h}{2}\right] \end{aligned}$$

and first-order upwind differences

$$\begin{aligned} A_{\text{ upwind }} = {\text {tri}}\left[ -1,2-\alpha h,-1+\alpha h\right] \end{aligned}$$

Centered differences In this case, we obtain the iteration matrix

$$\begin{aligned} {\mathcal {J}}_{\omega ,\text{ centered }} = {\text {tri}}\left[ \frac{\omega }{2}\left( 1+\frac{\alpha h}{2}\right) ,1-\omega ,\frac{\omega }{2}\left( 1-\frac{\alpha h}{2}\right) \right] . \end{aligned}$$

Assuming \(0 \le \omega \le 1\) and \(-2 \le \alpha h \le 2\) we have \({\mathcal {J}}_{\omega ,\text{ centered }}\) non-negative, and hence by Sect. 6.4 we have the associated random walk with increments \((U,V)\) given by

$$\begin{aligned} U = {\left\{ \begin{array}{ll} 1 &{} \text{ with } \text{ probability } \frac{\omega }{2}\left( 1-\frac{\alpha h}{2}\right) \\ 0 &{} \text{ with } \text{ probability } 1-\omega \\ -1 &{} \text{ with } \text{ probability } \frac{\omega }{2}\left( 1+\frac{\alpha h}{2}\right) \\ \end{array}\right. } \end{aligned}$$

and \(V = -1\) with probability 1. Since V is deterministic, the second form of the asymptotic limit (Theorem 4) applies so long as \(\omega \ne 1\). If \(\omega = 1\), the support of U is constrained to the sublattice \(-1+2\cdot \mathbb {Z} \subsetneq \mathbb {Z}\) and hence the conditions of Theorem 4 are violated. We will see in Fig. 26 that in this case the asymptotic limit of Theorem 4 does not accurately predict the error evolution (67). To compute the asymptotic limit, we need only consider the mean \(\mu \) and variance \(\sigma ^2\) of U. We have

$$\begin{aligned} \mu = -\omega \frac{\alpha h}{2} \qquad \sigma ^2 = \omega - \frac{\omega ^2 \alpha ^2 h^2}{4} \end{aligned}$$

Thus, defining

$$\begin{aligned} g_{\sigma ,\mu }(n) := \frac{e^{-\frac{(n-\mu )^2}{2\sigma ^2}}}{\sum _{n'=1}^N e^{-\frac{(n'-\mu )^2}{2\sigma ^2}}} \end{aligned}$$

we have

$$\begin{aligned} e^{(m)}_k \approx (e^{(0)} * g_{m\sigma ,m \mu })_k. \end{aligned}$$
(69)

Figure 26 presents a comparison of the predicted error (69) with the real measured error, for \(N = 100\), \(\alpha h = 1\) fixed and for \(\omega = 0.5\) and \(\omega = 1\). Note that even in the case \(\omega = 1\), where the assumptions of Theorem 4 do not apply, (69) is still able to predict the overall shape of the error, up to high-frequency oscillations. For \(\omega \ne 1\), (69) is an excellent prediction of the error, except near the boundaries \(x=0\) and \(x=1\).
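The comparison behind Fig. 26 is easy to reproduce in outline. The following minimal sketch (the spike initial error and iteration count are illustrative choices, and boundaries are handled by zero padding) runs damped Jacobi on \(A_{\text{ centered }}\) and compares against the prediction (69):

```python
# A minimal sketch of the experiment of Fig. 26: damped Jacobi on the
# centered-difference matrix vs. the Gaussian prediction (69). The initial
# error (a spike) and the iteration count are illustrative choices.
import numpy as np

N, alpha_h, omega, m = 100, 1.0, 0.5, 40
sub, diag, sup = -1 - alpha_h / 2, 2.0, -1 + alpha_h / 2    # A = tri[sub, diag, sup]

e0 = np.zeros(N); e0[N // 2] = 1.0                          # initial error e^(0)
e = e0.copy()
for _ in range(m):                                          # e <- (I - omega D^{-1} A) e
    pad = np.concatenate(([0.0], e, [0.0]))                 # zero Dirichlet padding
    e = e - (omega / diag) * (sub * pad[:-2] + diag * e + sup * pad[2:])

mu = -omega * alpha_h / 2                                   # mean of U
sigma2 = omega - (omega * alpha_h / 2) ** 2                 # variance of U
n = np.arange(-N, N + 1)
g = np.exp(-(n - m * mu) ** 2 / (2 * m * sigma2)); g /= g.sum()
e0_pad = np.concatenate((np.zeros(N), e0, np.zeros(N)))
pred = np.array([g @ e0_pad[k:k + 2 * N + 1] for k in range(N)])
print(np.abs(e - pred).max())          # small: CLT error, which shrinks as m grows
```

The upwind case is identical apart from the stencil entries and the expressions for \(\mu \) and \(\sigma ^2\) given below.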

Fig. 26
figure 26

Error evolution of damped Jacobi: here we plot the evolution of the error of damped Jacobi applied to the one-dimensional convection-diffusion Eq. (68), discretized using centered differences with \(\alpha h = 1\) and on a grid of size \(N=100\). Also included is the theoretical error as predicted by the asymptotic limit (Theorem 4). We consider both \(\omega = 1\) and \(\omega =0.5\). In the former case, the asymptotic limit captures the overall trend of the error evolution, but fails to capture some high-frequency oscillations. This is because when \(\omega = 1\), the assumptions of Theorem 4 are not satisfied. On the other hand, when \(\omega = 0.5\), the conditions of Theorem 4 are satisfied and the asymptotic limit gives an excellent approximation of the true error evolution

Upwind In this case, we obtain the iteration matrix

$$\begin{aligned} {\mathcal {J}}_{\omega ,\text{ upwind }} = {\text {tri}}\left[ \frac{\omega }{2-\alpha h},1-\omega ,\omega \frac{1-\alpha h}{2-\alpha h}\right] . \end{aligned}$$

This time we assume \(0 \le \omega , \alpha h \le 1\) and consider the associated random walk with increments (UV) given by

$$\begin{aligned} U = {\left\{ \begin{array}{ll} 1 &{} \text{ with } \text{ probability } \omega \frac{1-\alpha h}{2-\alpha h} \\ 0 &{} \text{ with } \text{ probability } 1-\omega \\ -1 &{} \text{ with } \text{ probability } \frac{\omega }{2-\alpha h} \\ \end{array}\right. } \end{aligned}$$

and \(V = -1\) with probability 1. Again, the second form of the asymptotic limit (Theorem 4) applies so long as \(\omega \ne 1\). The mean \(\mu \) and variance \(\sigma ^2\) of U in this case are given by

$$\begin{aligned} \mu = -\omega \frac{\alpha h}{2-\alpha h} \qquad \sigma ^2 = \omega - \frac{\omega ^2 \alpha ^2 h^2}{(2-\alpha h)^2} \end{aligned}$$

Once again the predicted error is given by (69). Similar results (omitted for brevity) are obtained as in the centered differences case.

Remark 16

The multigrid algorithm [9, 39] is based in part on the observation that simple iterative methods such as damped Jacobi or Gauss–Seidel—while not necessarily effective at eliminating the error—can be very effective at smoothing it. It is well known that while Jacobi iteration (\(\omega =1\)) is typically not an effective smoother, damped Jacobi with \(\omega \ne 1\) often is, and indeed it is widely employed in the multigrid method [39]. The analysis of this section corroborates this well-known fact from the multigrid community—for \(\omega \ne 1\), Theorem 4 gives us an explicit formula for the asymptotic error at iteration m as a “smoothed” version of the initial error. However, when \(\omega = 1\), the assumptions of Theorem 4 are violated and this fails.

Remark 17

In its present form, our asymptotic limit cannot be used to analyze the error evolution of damped Jacobi applied to the linear systems arising from the discretization of partial differential equations in dimension \(d \ge 2\). To handle this case, we must first generalize our asymptotic limit to \(\mathbb {R}^{d+1}\). This is straightforward: the original form of Theorem 1, as presented in [36], is applicable to random walks in \(\mathbb {R}^d\); we quoted the two-dimensional version in Theorem 1 simply because this was the version relevant to the problem at hand. Since the asymptotic limit is derived from Theorem 1, generalizing it to arbitrary dimensions presents no difficulty.

9 Conclusions and Future Work

9.1 Conclusions

In this paper we have presented a detailed analysis of a class of geometric inpainting algorithms based on the idea of filling the inpainting domain in successive shells from the boundary inwards, where every pixel to be filled is assigned a color equal to a weighted average of its already filled neighbors. These methods are all special cases of a generic framework sketched in Algorithm 1. Methods in the literature falling within this framework include

  • Telea’s Algorithm [38].

  • Coherence Transport [7, 31].

  • Guidefill [27].

A subtle but important point about these methods is that pixels in the current inpainting boundary are filled independently. Noting this, we have proposed a semi-implicit extension of these methods in which pixels in the current shell are instead filled simultaneously by solving a linear system. We have also sketched in Algorithm 1 a straightforward extension of the original framework that is equivalent to solving this linear system using damped Jacobi or successive over-relaxation (SOR). A theoretical convergence analysis of these methods is presented for the semi-implicit extension of Guidefill, where we show that SOR is extremely effective. This analysis is backed up by numerical experiments.

As all of the algorithms listed above (with the exception of semi-implicit Guidefill, which is presented for the first time in this paper) are known to create some disturbing visual artifacts, the main objective of our analysis is to understand why these occur and whether or not they are inevitable. We focus specifically on kinking artifacts and blur artifacts. Other artifacts including the formation of shocks and extrapolated isophotes that end abruptly are discussed but not analyzed, as they have already been studied elsewhere [7, 32] and are well understood. Our analysis is based on three main ideas:

  • A continuum limit, which we use to explain kinking artifacts.

  • An asymptotic limit, which we use to analyze blur artifacts.

  • A connection to the theory of stopped random walks, from which both limits are derived.

Similarly to the earlier work of Bornemann and März  [7], our continuum limit is a transport equation. However, our limit process is different and so are the coefficients of the resulting transport equation. Moreover, numerical experiments show that our transport equation is a better reflection of the behaviour of Algorithm 1 (the direct form and our proposed semi-implicit extension) in practice, capable of accurately predicting kinking phenomena that are not captured by the alternative continuum limit proposed in [7]. The third core idea of our analysis, which is to relate Algorithm 1 and its extension to stopped random walks, is critical for two reasons. Firstly, it allows us to prove convergence to our continuum limit even for boundary data with low regularity, such as jump discontinuities. By contrast, the analysis in [7] demonstrates convergence under the assumption of smooth boundary data, which is an unrealistic assumption for images. Secondly, this connection is central to our analysis of blur artifacts, which we analyze based not on a continuum limit where \(h \rightarrow 0\), but rather on an asymptotic limit where h is very small but nonzero. Our asymptotic limit allows us to make quantitative predictions regarding blur that are in excellent agreement with numerical experiments, even for relatively low resolution images (e.g. Fig. 22, which is only \(200\times 200\)px). Although our analysis operates in a highly idealized setting (Sect. 6.1), our conclusions are far reaching. In particular, we prove the following:

  1.

    In the direct form of Algorithm 1, kinking artifacts will always be present. That is, certain isophotes cannot be extended into the inpainting domain without bending (Sect. 7.1.1).

  2.

    This is not true of the semi-implicit extension of Algorithm 1. In particular, semi-implicit Guidefill can extrapolate isophotes with any orientation, and moreover is able to do so efficiently by using SOR to solve the required linear system (Sects. 7.1.1, 7.1.2, Corollary 1).

  3.

    Blur artifacts exhibit an angular dependence, which for semi-implicit Guidefill becomes arbitrarily bad as the angle an extrapolated isophote makes with the boundary of the inpainting domain goes to zero. Thus, semi-implicit Guidefill pays a heavy price (on top of the need to solve a linear system for every shell) for its ability to successfully extrapolate such isophotes (Theorem 2, Fig. 23).

  4.

    Blur artifacts become less significant as the resolution of the image goes up, and get worse the further into the inpainting domain one extrapolates (Theorem 2).

  5.

    Methods that put all of their weight into a single pixel exhibit no blur, but can only extrapolate without kinking in finitely many directions (Sect. 7.3).

In addition, we have demonstrated that our asymptotic limit has applications outside of image processing. In particular, in Sect. 8 we showed how it could be used to predict the error evolution of damped Jacobi applied to a discretized one-dimensional convection-diffusion problem.

9.2 Future Work

There are three main questions we would like to explore in the future.

  1.

    Does there exist an algorithm within the framework we have described that avoids blurring artifacts and kinking artifacts at the same time? If not, is it at least possible to design an algorithm that, like semi-implicit Guidefill, avoids kinking artifacts so long as the guidance direction \(\mathbf{g}=(\cos \theta ,\sin \theta )\) obeys \(\theta \ne 0\), but for which the severity of blur as a function of \(\theta \) remains bounded? Could this be done, for example, by replacing the bilinear interpolation used to define ghost pixels with a more sophisticated interpolation scheme?

  2.

    What happens if the semi-implicit version of Algorithm 1 is generalized to a fully-implicit extension in which pixel colors are computed as a weighted average not only of their (known) already filled neighbors in \(A_{\epsilon ,h}(\mathbf{x}) \cap \varOmega \backslash D^{(k)}\) and (unknown) neighbors within the same shell \(\partial D^{(k)}_h\), but of all neighbors in \(A_{\epsilon ,h}(\mathbf{x})\)? Are there additional benefits in terms of artifact reduction and, if so, can the resulting linear system be solved efficiently enough to make this worthwhile?

  3.

    We would like to explore the connection of our asymptotic limit to problems in numerical linear algebra more deeply. In particular, we would like to generalize our asymptotic limit to arbitrary dimensions so that it can be applied to discretized partial differential equations in dimension \(d \ge 2\) (we have already sketched how to do this in Remark 17). We would also like to see whether our results can be applied to more sophisticated methods than damped Jacobi and—if so—whether or not any new insights can be gained by doing so.