1 Introduction

The Fourier series of a periodic function converges spectrally fast with respect to the number of terms in the series, that is, with an algebraic order that increases with the number of available derivatives and exponentially fast for analytic functions. Furthermore, the truncated Fourier series can be approximated via the fast Fourier transform (FFT) in a fast and stable manner [40]. As such, it is the go-to approach to approximate a periodic function. However, when the function in question is nonperiodic, the situation is very different. Regardless of how smooth this function is, convergence is slow in the \(L^2\) norm and there is a permanent oscillatory overshoot close to the endpoints due to the Gibbs phenomenon [42].

Fourier extensions have been shown to be an effective means for the approximation of nonperiodic functions while avoiding the Gibbs phenomenon [1, 4, 7, 17, 23, 24, 26]. The idea is as follows: For a function \(f \in L^2(-1,1)\), consider an approximant \(f_N\) given by

$$\begin{aligned} f_N(x) = \sum _{k=-n}^n c_k e^{\frac{i\pi }{T}kx}, \quad N = 2n+1, \end{aligned}$$
(1)

where the coefficients \(c_{-n},\ldots ,c_n\) are chosen to minimize the error \(\Vert f - f_N\Vert _{L^2(-1,1)}\), and \(T > 1\) is a user-determined parameter. This approximant \(f_N\) is the nth Fourier extension of f to the periodic interval \([-T,T]\). For the purposes of this paper, other kinds of Fourier extensions, which might come from a discrete sampling of f or regularization, are a modification of this.

There are many approximation schemes that avoid the Gibbs phenomenon. Chebyshev polynomial interpolants such as those implemented in the Chebfun [13, 36] and ApproxFun [30] software packages are extremely successful, so why consider Fourier extensions? First, discrete collocation versions of Fourier extensions sample the function on equispaced or near-equispaced grids, which in some situations are more natural than Chebyshev grids, whose points cluster near the endpoints [5]. Second, the approach generalizes naturally to higher dimensions. If one has a function on a bounded subset \(\Omega \subset {\mathbb {R}}^d\), then one can use multivariate Fourier series that are periodic on a d-dimensional bounding box containing \(\Omega \) [8, 18, 27]. Modifications of Fourier extensions that use discrete samples of a function are particularly relevant in this generalization, because the integrals defining the \(L^2(\Omega )\) norm can be difficult to compute.

Fourier extensions can be computed stably in \({\mathcal {O}}(N\log ^2(N))\) floating point operations, with the following important caveats ([20, 23, 26]). Computation of \(f_N\) is equivalent to solving a linear system with the so-called prolate matrix [37], the Toeplitz matrix \(G \in {\mathbb {R}}^{N\times N}\) with entries \(G_{k,j} = \mathrm {sinc}\left( (k-j)\frac{\pi }{T}\right) \), against the right-hand-side vector \({\mathbf {b}}\in {\mathbb {C}}^N\) with entries \(b_k = \left( \frac{T}{2}\right) ^{\frac{1}{2}}\int _{-1}^1 e^{-\frac{i\pi }{T}kx}f(x)\,\mathrm {d}x\) [26]. The prolate matrix is exponentially ill-conditioned [34, Eq. 63], so computation of the exact Fourier extension is practically impossible, even for moderately sized N. However, a truncated singular value decomposition (SVD) solution is only worse than the exact solution (in the \(L^2(-1,1)\) norm) by a small factor \({\mathcal {O}}(\varepsilon ^{\frac{1}{2}})\) in the limit as \(N \rightarrow \infty \), where \(\varepsilon > 0\) is the truncation parameter [3, 4]. Furthermore, using an oversampled least squares interpolation in equispaced points in \([-1,1]\) can bring this down to \({\mathcal {O}}(\varepsilon )\) for a sufficient oversampling rate [2,3,4]. At the heart of these facts is the observation that while the Fourier basis on \([-T,T]\) does not form a Schauder basis for \(L^2(-1,1)\), it satisfies the weaker conditions of a frame [3].
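To make the setup concrete, the following sketch (in Python with NumPy; ours, not taken from the cited references, and a dense \({\mathcal {O}}(N^3)\) computation rather than the fast algorithms just mentioned) assembles the Gram system for the unnormalized basis \(\{e^{\frac{i\pi }{T}kx}\}\) and computes a truncated-SVD Fourier extension:

```python
import numpy as np

def fourier_extension(f, n, T=2.0, eps=1e-12, M=2001):
    """Truncated-SVD Fourier extension of f from [-1,1] to [-T,T].

    A minimal dense O(N^3) sketch (N = 2n+1); the Gram matrix G below is a
    scaled prolate (Toeplitz sinc) matrix and is exponentially
    ill-conditioned, which is why the cutoff eps is essential.
    """
    k = np.arange(-n, n + 1)
    # Gram matrix of exp(i*pi*k*x/T) on L^2(-1,1): np.sinc(t) = sin(pi t)/(pi t),
    # so entry (j,l) equals 2*sin((l-j)*pi/T)/((l-j)*pi/T).
    G = (2.0 * np.sinc((k[None, :] - k[:, None]) / T)).astype(complex)
    # Moments b_j = int_{-1}^{1} exp(-i*pi*j*x/T) f(x) dx, by the trapezoid rule.
    x = np.linspace(-1.0, 1.0, M)
    w = np.full(M, 2.0 / (M - 1)); w[[0, -1]] /= 2.0
    b = np.exp(-1j * np.pi * np.outer(k, x) / T) @ (w * f(x))
    # Truncated-SVD solve: singular values below eps*sigma_max are discarded.
    c, *_ = np.linalg.lstsq(G, b, rcond=eps)
    # For real f the extension is real on [-1,1] up to rounding.
    return lambda t: np.real(np.exp(1j * np.pi * np.outer(np.atleast_1d(t), k) / T) @ c)

f_N = fourier_extension(lambda x: np.abs(x) ** 0.75, n=30)
print(f_N([0.0, 0.5, 0.99]))
```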

Fourier extensions which approximate a truncated SVD solution rather than the exact solution are called regularized Fourier extensions. An approximate SVD of the prolate matrix can be computed in \({\mathcal {O}}(N\log ^2(N))\) operations using the FFT and exploiting the so-called plunge region in the profile of its singular values [20]. This is a vast improvement on \({\mathcal {O}}(N^3)\) operations for a standard SVD. Fast algorithms for regularized, oversampled least squares interpolation Fourier extensions were developed in [26], building on the work of Lyon [23].

Previous convergence results on Fourier extensions have focused on convergence in the \(L^2\) norm, because the Fourier extension by definition minimizes the error in the \(L^2\) norm over the approximation space. Convergence in \(L^2\) of algebraic order k for functions in the Sobolev space \(H^k(-1,1)\) was proved by Adcock and Huybrechs [4, Thm. 2.1]. It follows immediately that convergence is superalgebraic for smooth functions. Exponential convergence in \(L^2\) and \(L^\infty \) norms for analytic functions was proved by Huybrechs for \(T = 2\) [17] and by Adcock et al. for general \(T > 1\) [4]. The proofs of exponential convergence appeal to connections between the Fourier extension problem and the sub-range Chebyshev polynomials [4], for which series approximations converge at an exponential rate that depends on analyticity in Bernstein ellipses in the complex plane. Regarding pointwise convergence of Fourier extensions for nonanalytic functions, there are no proofs in the literature. Some numerical exploration of pointwise convergence appears in [9, Sec. 2], but a rigorous theoretical foundation is lacking.

1.1 Summary of New Results

In this paper we prove that for f in the Hölder space \(C^{k,\alpha }([-1,1])\),

$$\begin{aligned} f(x) - f_N(x) = {\left\{ \begin{array}{ll} {\mathcal {O}}(N^{-k-\alpha }\log (N)) &{} \text { for } x \in [a,b] \subset (-1,1), \\ {\mathcal {O}}(N^{\frac{1}{2}-k-\alpha }) &{} \text { for } x \in [-1,1], \end{array}\right. } \end{aligned}$$
(2)

see Theorem 3.2. The factors of \(\log (N)\) and \(N^{\frac{1}{2}}\) come from bounds on the Lebesgue function associated with the Fourier extension, derived in Sect. 4, and the factor of \(N^{-k-\alpha }\) comes from a Jackson-type theorem on best uniform approximation by Fourier extensions, proved in Sect. 5.

This factor of \(N^{-k-\alpha }\) can be pessimistic if f is least regular at the boundary; in Sect. 5 we discuss how a weighted form of regularity (as opposed to Hölder regularity taken uniformly over the interval \([-1,1]\)) might yield a more natural correspondence between regularity and convergence rate. This is precisely the case in best polynomial approximation on an interval, where weighted moduli of continuity have a tight correspondence with best approximation errors [11, Ch. 7, Thm. 7.7].

From Eq. (2), it is immediate that if \(f \in C^{\alpha }([-1,1])\) where \(\alpha \in (0,1)\), then \(f_N\) converges to f uniformly in any subinterval \([a,b] \subset (-1,1)\), and if \(\alpha > \frac{1}{2}\), then we get uniform convergence over the whole interval \([-1,1]\).

We also prove a local pointwise convergence result, which states that if \(f \in L^2(-1,1)\), but f is uniformly Dini–Lipschitz in a subinterval \([a,b]\), then the Fourier extension converges uniformly in compact subintervals of \((a,b)\) (see Theorem 3.5). This is done by generalizing a localization theorem of Freud on convergence of orthogonal polynomial expansions in \([-1,1]\) (see Sect. 6).

A key insight of this paper is that the kernel associated with approximation by Fourier extension has an explicit formula that is related to the Christoffel–Darboux kernel of the Legendre polynomials on a circular arc (see Lemma 4.3). The asymptotics of these polynomials were derived by Krasovsky using Riemann–Hilbert analysis [10, 21, 22], which we use to derive asymptotics of the kernel. The Lebesgue function for Fourier extensions is estimated using these asymptotics in Theorem 4.1. We find that the Lebesgue function is \({\mathcal {O}}(\log (N))\) in the interior of \([-1,1]\) and \({\mathcal {O}}(N^{\frac{1}{2}})\) globally. This is just as with the Lebesgue function for Legendre series, and distinct from classical Fourier series, which has a \({\mathcal {O}}(\log N)\) Lebesgue function over the full periodic interval.

The results of this paper would become more interesting if they could be extended to regularized and oversampled interpolation versions of Fourier extensions, because as discussed above, these are the versions for which stable and efficient algorithms have been developed. The multivariate case is another direction in which this line of inquiry could be taken. We briefly discuss future research of this kind in Sect. 8.

The paper is structured as follows. Section 2 recounts the known results about convergence of Fourier extensions in the \(L^2\) norm. Section 3 gives new pointwise and uniform convergence theorems along with proofs that depend on results proved in the self-contained Sects. 4, 5, and 6. Section 4 is on the Lebesgue function for Fourier extensions. Section 5 is on uniform best approximation for Fourier extensions, in which Jackson- and Bernstein-type theorems are proved. Section 6 is on an analogue of Freud’s localization theorem for Fourier extensions. Section 7 provides the reader with results from numerical experiments, and Sect. 8 provides discussion. The appendix contains a derivation of asymptotics of Legendre polynomials on a circular arc, on the arc itself, from the Riemann–Hilbert analysis of Krasovsky [10, 21, 22].

2 Convergence of Fourier Extensions in \(L^2\)

In this section we summarize the already known results regarding convergence in the \(L^2\) norm.

2.1 Exponential Convergence

As is discussed in [1, 17], the Fourier extension \(f_N\) in Eq. (1) is a polynomial in the mapped variable \(t = m(x)\), where

$$\begin{aligned} m(x) = 2 \frac{\cos \left( \frac{\pi }{T}x \right) - \cos \left( \frac{\pi }{T} \right) }{1-\cos \left( \frac{\pi }{T}\right) } - 1. \end{aligned}$$
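A quick check of the endpoint values (immediate from the formula, though not spelled out in the text) shows how this map folds the interval:

$$\begin{aligned} m(\pm 1) = 2\frac{\cos \left( \frac{\pi }{T}\right) - \cos \left( \frac{\pi }{T}\right) }{1-\cos \left( \frac{\pi }{T}\right) } - 1 = -1, \qquad m(0) = 2\frac{1 - \cos \left( \frac{\pi }{T}\right) }{1-\cos \left( \frac{\pi }{T}\right) } - 1 = 1. \end{aligned}$$

Since m depends on x only through \(\cos \left( \frac{\pi }{T}x\right) \), it is even, and it maps \([-1,1]\) two-to-one onto \([-1,1]\); splitting into even and odd parts is what produces the two expansions described next.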

This change of variables transforms the Fourier extension problem into two series expansions in modified Jacobi polynomials [17]. Since exponential convergence in this setting is dictated by Bernstein ellipses in the complex plane, which are defined to be the closed contours,

$$\begin{aligned} {\mathcal {B}}(\rho ) = \left\{ \frac{1}{2}\left( \rho e^{i\theta } + \frac{1}{\rho } e^{-i\theta }\right) : \theta \in [-\pi ,\pi ] \right\} , \quad \rho > 1, \end{aligned}$$

it makes sense to consider the mapped contours,

$$\begin{aligned} {\mathcal {D}}(\rho ) := m^{-1} \left( {\mathcal {B}}(\rho ) \right) , \end{aligned}$$
(3)

as a candidate for determining the rate of exponential convergence for Fourier extensions. They are indeed the relevant contours, as was proven in the following theorem.

Theorem 2.1

(Adcock–Huybrechs [17, Thm. 3.14], [4, Thm. 2.3]) If f is analytic inside the mapped Bernstein ellipse \(\mathcal {D}(\rho ^\star )\) and continuous on \(\mathcal {D}(\rho ^\star )\) itself, then

$$\begin{aligned} \Vert f-f_N\Vert _{L^2(-1,1)} = {\mathcal {O}}(\rho ^{-n}) \Vert f\Vert _{L^\infty ({\mathcal {D}}(\rho ))}, \end{aligned}$$

where \(\rho < \min \left\{ \rho ^\star , \cot ^2\left( \frac{\pi }{4T}\right) \right\} \) and \(N = 2n+1\). The constant in the big \({\mathcal {O}}\) depends only on T.

Note that there is a T-dependent upper limit on the rate of exponential convergence.

2.2 Algebraic Convergence

For functions in the Sobolev space \(H^k(-1,1)\) of \(L^2(-1,1)\) functions whose kth weak derivatives are in \(L^2(-1,1)\), we have algebraic convergence of order k.

Theorem 2.2

(Adcock–Huybrechs [1, Thm. 2.1]) If \(f\in H^k(-1,1)\), then

$$\begin{aligned} \Vert f-f_N\Vert _{L^2(-1,1)} = {\mathcal {O}}(N^{-k}) \Vert f\Vert _{H^k(-1,1)}, \end{aligned}$$

where the constant in the big \({\mathcal {O}}\) depends only on k and T.

Corollary 2.3

If f is smooth, then \(f_N \rightarrow f\) superalgebraically in the \(L^2(-1,1)\) norm.

2.3 Subalgebraic Convergence

This elementary result says that Fourier extensions converge in the \(L^2\) norm for \(L^2\) functions.

Proposition 2.4

If \(f \in L^2(-1,1)\), then

$$\begin{aligned} \Vert f - f_N\Vert _{L^2(-1,1)} \rightarrow 0 \text { as } N \rightarrow \infty . \end{aligned}$$

Proof

Let \(g \in L^2(-T,T)\) be the function that is equal to f inside \([-1,1]\) and zero in the complement. Let \(g(x) = \sum _{k=-\infty }^\infty c_k e^{\frac{i\pi }{T}kx}\) be its Fourier series, and for all odd integers \(N = 2n+1\), define \(t_N(x) = \sum _{k=-n}^n c_k e^{\frac{i\pi }{T}kx}\). Then following the definitions of \(f_N\), g and \(t_N\), we have \(\Vert f - f_N \Vert _{L^2(-1,1)} \le \Vert f - t_N \Vert _{L^2(-1,1)} = \Vert g - t_N \Vert _{L^2(-T,T)} \rightarrow 0\) as \(N\rightarrow \infty \). \(\square \)

3 Pointwise and Uniform Convergence

We prove pointwise convergence rates for functions in various Hölder spaces. For \(k = 0,1,2,\ldots \) and \(\alpha \in [0,1]\), the Hölder space \(C^{k,\alpha }([-1,1])\) is the space

$$\begin{aligned} C^{k,\alpha }([-1,1]) : = \left\{ f \in C^k([-1,1]) : |f^{(k)}|_{C^{\alpha }([-1,1])} < \infty \right\} , \end{aligned}$$

where

$$\begin{aligned} |g|_{C^\alpha ([-1,1])} := \sup _{\begin{array}{c} x,y \in [-1,1] \\ x \ne y \end{array}} \frac{|g(x)-g(y)|}{|x-y|^\alpha }. \end{aligned}$$

It is a Banach space when endowed with the norm \(\Vert f\Vert _{C^{k,\alpha }([-1,1])} = \Vert f\Vert _{C^k([-1,1])} + |f^{(k)}|_{C^\alpha ([-1,1])}\) [14]. For \(\alpha \in [0,1]\), we write \(C^\alpha ([-1,1]) := C^{0,\alpha }([-1,1])\).

3.1 Exponential Convergence

The pointwise convergence result for analytic functions is the same as Theorem 2.1. In fact, Theorem 2.1 is a corollary of the following theorem.

Theorem 3.1

(Huybrechs [17, Theorem 3.14], Adcock–Huybrechs [4, Theorem 2.11], [1, Theorem 2.3]) If f is analytic inside of the mapped Bernstein ellipse \(\mathcal {D}(\rho ^\star )\) (see Eq. (3)) and continuous on \(\mathcal {D}(\rho ^\star )\) itself, then

$$\begin{aligned} \Vert f-f_N\Vert _{L^\infty (-1,1)} = {\mathcal {O}}(\rho ^{-n}) \Vert f\Vert _{L^\infty ({\mathcal {D}}(\rho ))}, \end{aligned}$$

where \(\rho < \min \left\{ \rho ^\star , \cot ^2\left( \frac{\pi }{4T}\right) \right\} \) and \(N= 2n+1\). The constant in the big \({\mathcal {O}}\) depends only on T.

3.2 Algebraic Convergence

Pointwise convergence for Hölder continuous functions is as follows.

Theorem 3.2

If \(f\in C^{k,\alpha }([-1,1])\), where \(k \ge 0\) and \(\alpha \in [0,1]\), then for all \([a,b] \subset (-1,1)\),

$$\begin{aligned} \Vert f - f_N\Vert _{L^\infty (a,b)} = {\mathcal {O}}(N^{- \alpha - k}\log N) |f^{(k)}|_{C^{\alpha }([-1,1])}. \end{aligned}$$

The constant in the big \({\mathcal {O}}\) depends on a, b, k, \(\alpha \), and T. Over the whole interval \([-1,1]\), we have

$$\begin{aligned} \Vert f - f_N\Vert _{L^\infty (-1,1)} = {\mathcal {O}}(N^{\frac{1}{2} - \alpha - k}) |f^{(k)}|_{C^{\alpha }([-1,1])}. \end{aligned}$$

The constant in the big \({\mathcal {O}}\) depends on k, \(\alpha \), and T.

We lose a half order of algebraic convergence at the endpoints, something that we could not possibly see in classical Fourier series because a periodic interval has no endpoints.

Corollary 3.3

If f is smooth, then \(f_N \rightarrow f\) superalgebraically in \(L^\infty (-1,1)\).

3.3 Subalgebraic Convergence

The loss of a half order of algebraic convergence at the endpoints predicted by Theorem 3.2 means that we require Hölder continuity of order greater than one half in order to guarantee uniform convergence over the whole interval.

Theorem 3.4

If \(f \in C^\alpha ([-1,1])\), where \(\alpha > \frac{1}{2}\), then

$$\begin{aligned} \Vert f - f_N\Vert _{L^\infty (-1,1)} \rightarrow 0 \text { as } N \rightarrow \infty . \end{aligned}$$

In order to guarantee local, pointwise convergence, there is a weak local continuity condition that can be employed as follows. A function f is uniformly Dini–Lipschitz in \([a,b]\) if [42],

$$\begin{aligned} \lim _{\delta \searrow 0}\sup _{\begin{array}{c} x,y \in [a,b] \\ |x-y|<\delta \end{array}} \left| (f(x)- f(y)) \log \delta \right| = 0. \end{aligned}$$
(4)

This is a very weak condition, weaker than the Hölder condition for any \(\alpha > 0\); for example, \(f(x) = (\log (e/|x|))^{-2}\) (with \(f(0) = 0\)) is uniformly Dini–Lipschitz near 0 but is not Hölder continuous there for any \(\alpha > 0\). Nevertheless, it is sufficient for convergence of Fourier extensions in the interior of \([-1,1]\).

Theorem 3.5

If \(f \in L^2(-1,1)\) is uniformly Dini–Lipschitz in \([a,b] \subseteq [-1,1]\), then

$$\begin{aligned} \Vert f - f_N\Vert _{L^\infty (c,d)} \rightarrow 0 \text { as } N \rightarrow \infty \end{aligned}$$

for all \([c,d] \subset (a,b)\).

Remark 3.6

This theorem is stronger than it might appear at first. It says that even if a function is merely in \(L^2(-1,1)\), and may for example have jump discontinuities, we still have pointwise convergence in regions where f is Dini–Lipschitz. However, the localization theorem (Theorem 6.1), which we use to prove this result, does not give any indication of the rate of convergence.

3.4 Proofs of the Results of This Section

For each odd positive integer \(N = 2n+1\), let \(P_N\) be the orthogonal projection from \(L^2(-1,1)\) onto the subspace \({\mathcal {H}}_N\),

$$\begin{aligned} {\mathcal {H}}_N = \mathrm {span}\{e^{\frac{i\pi }{T}kx}\}_{k=-n}^n. \end{aligned}$$

Then \(f_N = P_N(f)\), since \(f_N\) minimizes the \(L^2(-1,1)\) distance between f and \({\mathcal {H}}_N\). Let \(\{ e_k \}_{k = 1}^N\) be any orthonormal basis for \({\mathcal {H}}_N \subset L^2(-1,1)\). Then the kernel

$$\begin{aligned} K_N(x,y) = \sum _{k=1}^N e_k(x) \overline{e_k(y)} \end{aligned}$$

satisfies

$$\begin{aligned} P_Nf(x) = \int _{-1}^1 K_N(x,y) f(y) \mathrm {d}y \end{aligned}$$

for all \(f \in L^2(-1,1)\). The Lebesgue function for the projection \(P_N\) at a point \(x\in [-1,1]\) is the \(L^1\) norm of the kernel at x,

$$\begin{aligned} \Lambda (x;P_N) = \int _{-1}^{1} \left| K_N(x,y) \right| \mathrm {d}y. \end{aligned}$$

The best approximation error functional on \({\mathcal {H}}_N\) is defined for all \(f\in C([-1,1])\) by

$$\begin{aligned} E(f;{\mathcal {H}}_N) = \inf _{r_N \in {\mathcal {H}}_N} \Vert f - r_N\Vert _{L^\infty (-1,1)}. \end{aligned}$$
(5)

The importance of \(\Lambda (x;P_N)\) and \(E(f;{\mathcal {H}}_N)\) is encapsulated in Lebesgue’s lemma, which states that for any \({f\in C([-1,1])}\),

$$\begin{aligned} |f(x) - P_N(f)(x)| \le (1+\Lambda (x;P_N))E(f;{\mathcal {H}}_N) \end{aligned}$$
(6)

for all \(x \in [-1,1]\) [11, Ch. 2, Prop. 4.1], [32, Thm. 2.5.2].
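These quantities are easy to explore numerically. The sketch below (our construction, not from the paper) assembles the kernel from the unnormalized basis \(\varphi _k(x) = e^{\frac{i\pi }{T}kx}\) via \(K_N(x,y) = \sum _{j,k}\varphi _j(x)(G^{-1})_{j,k}\overline{\varphi _k(y)}\), where G is the Gram matrix, and estimates \(\Lambda (x;P_N)\) by quadrature:

```python
import numpy as np

def lebesgue_function(xs, n, T=2.0, M=4001):
    """Estimate Lambda(x; P_N) = int_{-1}^1 |K_N(x,y)| dy, N = 2n+1.

    Sketch using K_N(x,y) = phi(x)^T G^{-1} conj(phi(y)) in the unnormalized
    basis phi_k(x) = exp(i*pi*k*x/T); a pseudoinverse cutoff stands in for
    G^{-1} because G is exponentially ill-conditioned (keep n moderate).
    """
    k = np.arange(-n, n + 1)
    G = 2.0 * np.sinc((k[:, None] - k[None, :]) / T)   # Gram matrix on L^2(-1,1)
    y = np.linspace(-1.0, 1.0, M)
    w = np.full(M, 2.0 / (M - 1)); w[[0, -1]] /= 2.0   # trapezoid weights
    Phi_y = np.exp(1j * np.pi * np.outer(k, y) / T)
    Phi_x = np.exp(1j * np.pi * np.outer(k, np.atleast_1d(xs)) / T)
    K = Phi_x.T @ (np.linalg.pinv(G, rcond=1e-13) @ Phi_y.conj())
    return np.abs(K) @ w                               # L^1 norm in y

print(lebesgue_function(np.linspace(-1, 1, 9), n=12))  # grows toward x = +-1
```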

Now we can proceed to prove the pointwise convergence results stated above. The proofs depend on the content of Sects. 4, 5, and 6, which consist of self-contained results.

Lemma 3.7

Let \(f \in C([-1,1])\). Then for all closed subintervals \([a,b] \subset (-1,1)\), we have

$$\begin{aligned} \Vert f - P_N(f)\Vert _{L^\infty (a,b)}= {\mathcal {O}}(\log N) E\left( f;{\mathcal {H}}_N\right) , \end{aligned}$$

where the constant in the big \({\mathcal {O}}\) depends on a, b, and T. Over the whole interval \([-1,1]\), we have

$$\begin{aligned} \Vert f - P_N(f)\Vert _{L^\infty (-1,1)} = {\mathcal {O}}(N^{\frac{1}{2}}) E\left( f;{\mathcal {H}}_N\right) , \end{aligned}$$

where the constant in the big \({\mathcal {O}}\) depends only on T.

Proof

By Lebesgue’s lemma, given in Eq. (6), it suffices to show that \(\sup _{x \in [a,b]}\Lambda (x;P_N) = {\mathcal {O}}(\log N)\), and \(\sup _{x \in [-1,1]}\Lambda (x; P_N) = {\mathcal {O}}(N^{\frac{1}{2}})\). This is proved in Theorem 4.1. \(\square \)

Proof of Theorem 3.2

By Lemma 3.7, it suffices to show that for \(f \in C^{k,\alpha }([-1,1])\), we have \(E(f;{\mathcal {H}}_N) = {\mathcal {O}}\left( N^{-k-\alpha }\right) |f^{(k)}|_{C^\alpha ([-1,1])}\). This follows from Lemma 5.1 and Theorem 5.3. \(\square \)

Proof of Theorem 3.4

This follows from Theorem 3.2 with \(k = 0\), because \(N^{\frac{1}{2} - \alpha } \rightarrow 0\) as \(N\rightarrow \infty \) for all \(\alpha > \frac{1}{2}\). \(\square \)

Proof of Theorem 3.5

The following proof is an analogue of a proof of Freud for polynomial approximation ([15, Thm. IV.5.6]). Define the functions \(f_1\) and \(f_2\) by

$$\begin{aligned} f_1(x) = {\left\{ \begin{array}{ll} f(x) &{} \text { for } x \in [a,b], \\ f(a) &{} \text { for } x \in [-1,a), \\ f(b) &{} \text { for } x \in (b,1], \end{array}\right. } \end{aligned}$$

and \(f_2 = f - f_1\). Since \(f_2\) vanishes in \([a,b]\) and is in \(L^2(-1,1)\), we have by Theorem 6.1 that \(P_N(f_2) \rightarrow 0\) uniformly in all subintervals \([c,d] \subset (a,b)\). It is clear by the definition of \(f_1\) and the definition of Dini–Lipschitz continuity in Eq. (4) that \(f_1\) is also uniformly Dini–Lipschitz in \([-1,1]\). By Lemma 3.7,

$$\begin{aligned} \Vert f_1 - P_N(f_1)\Vert _{L^\infty (c,d)} = {\mathcal {O}}(\log N) E\left( f_1;{\mathcal {H}}_N\right) . \end{aligned}$$

By Lemma 5.2 and Theorem 5.3, \(E\left( f_1;{\mathcal {H}}_N\right) = o(1/\log N)\). This proves that \(P_N(f_1) \rightarrow f_1\) uniformly on all subintervals \([c,d] \subset (a,b)\). Now, since \(f = f_1 + f_2\), we have proved the result. \(\square \)

4 The Lebesgue Function of Fourier Extensions

Recall from Sect. 3 that the kernel associated with the Fourier extension operator \(P_N\) is the bivariate function on \([-1,1] \times [-1,1]\),

$$\begin{aligned} K_N(x,y) = \sum _{k=1}^N e_k(x) \overline{e_k(y)}, \end{aligned}$$

where \(\{e_k\}_{k=1}^N\) is any orthonormal basis for \({\mathcal {H}}_N\). We call this kernel the prolate kernel, because one particular choice of orthonormal basis is the discrete prolate spheroidal wave functions (DPSWFs). These functions, denoted by \(\{\xi _{k,N} \}_{k=1}^N\), are the N eigenfunctions of a time-band-limiting operator; specifically, there exist eigenvalues \(\{\lambda _{k,N} \}_{k=1}^N\) such that

$$\begin{aligned} \int _{-1}^1 \xi _{k,N}(y) \frac{\sin \left( \frac{N \pi }{T}(x-y)\right) }{\sin \left( \frac{ \pi }{T}(x-y)\right) } \, \mathrm {d}y = \lambda _{k,N} \xi _{k,N}(x) \end{aligned}$$

for \(k = 1,\ldots ,N\). DPSWFs play an important role in the analysis of perfectly bandlimited and nearly timelimited periodic signals, which was pioneered by Landau, Pollak, and Slepian in the 1970s [34]. More recently, they have also been shown to be important for the computation of Fourier extensions, because the regularized version of Fourier extensions projects onto the DPSWFs \(\xi _{k,N}\) with eigenvalues \(\lambda _{k,N} > \varepsilon \) for a given tolerance \(\varepsilon > 0\) [3, 4]. This is discussed in Sect. 8.
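The plunge region mentioned in Sect. 1 can be observed directly by discretizing this time-band-limiting operator; the following sketch (our discretization, not from [34]) computes the eigenvalues \(\lambda _{k,N}\):

```python
import numpy as np

def dpswf_eigenvalues(n, T=2.0, M=1200):
    """Eigenvalues lambda_{k,N} of the time-band-limiting operator, N = 2n+1.

    A sketch: trapezoid discretization, symmetrized with square-root weights.
    The 0/0 entries of the kernel on the diagonal are removable with limit N.
    """
    N = 2 * n + 1
    x = np.linspace(-1.0, 1.0, M)
    w = np.full(M, 2.0 / (M - 1)); w[[0, -1]] /= 2.0
    d = np.pi * (x[:, None] - x[None, :]) / T
    with np.errstate(invalid="ignore", divide="ignore"):
        D = np.sin(N * d) / np.sin(d)
    D[~np.isfinite(D)] = float(N)            # fill removable singularities
    S = np.sqrt(w)[:, None] * D * np.sqrt(w)[None, :]
    return np.sort(np.linalg.eigvalsh(S))[::-1][:N]  # N nontrivial eigenvalues

lam = dpswf_eigenvalues(n=15)
print(lam[:3], lam[-3:])   # a plateau, then the exponential "plunge" to zero
```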

The key outcome of this section is a proof of the following theorem.

Theorem 4.1

(Lebesgue function bounds)

(i) For each closed interval \([a,b] \subset (-1,1)\), the Lebesgue function satisfies

$$\begin{aligned} \sup _{x \in [a,b]}\Lambda (x;P_N) = {\mathcal {O}}(\log N). \end{aligned}$$

(ii) Over the whole interval \([-1,1]\), we have

$$\begin{aligned} \sup _{x \in [-1,1]}\Lambda (x;P_N) = {\mathcal {O}}(N^{\frac{1}{2}}). \end{aligned}$$

This will be proved by finding asymptotic formulae for the prolate kernel \(K_N\). The reader can verify that \(K_N\) is invariant under a change of orthonormal basis for \({\mathcal {H}}_N\), so a suitable choice of basis is desired. We have found that rather than the DPSWF basis, a basis related to orthogonal polynomials on the unit circle has been more amenable to analysis. For \(N = 2n+1\), recall the definition of the N-dimensional space \({\mathcal {H}}_N\),

$$\begin{aligned} {\mathcal {H}}_N = \mathrm{span}\left\{ e^{\frac{i\pi }{T}kx}\right\} _{k=-n}^n. \end{aligned}$$

Any function \(r_N \in {\mathcal {H}}_N\) is of the form

$$\begin{aligned} r_N(x) = e^{-\frac{i\pi }{T}nx} p_{2n}(e^{\frac{i\pi }{T}x}), \end{aligned}$$

where \(p_{2n}\) is a polynomial of degree 2n. Using this idea we prove the following lemma.

Lemma 4.2

(Orthonormal basis for \({\mathcal {H}}_N\)) Let \(\left\{ \Pi _{k}(z)\right\} _{k=0}^{\infty }\) be the (normalized) orthogonal polynomials on the unit circle with respect to the weight

$$\begin{aligned} f(\theta ) = 2T\cdot \chi _{\left[ -\frac{\pi }{T},\frac{\pi }{T}\right] }(\theta ), \quad \theta \in [-\pi ,\pi ]; \end{aligned}$$

i.e., for \(j,k = 0,1,2,\ldots ,\)

$$\begin{aligned} \frac{1}{2\pi }\int _{-\pi }^\pi \overline{\Pi _k(e^{i\theta })}\Pi _j(e^{i\theta }) \, f(\theta )\mathrm {d}\theta = \delta _{j,k}. \end{aligned}$$

Then the set

$$\begin{aligned} \left\{ e^{-\frac{i\pi }{T}nx} \cdot \Pi _k\left( e^{\frac{i\pi }{T}x} \right) \right\} _{k=0}^{2n} \end{aligned}$$

forms an orthonormal basis for \({\mathcal {H}}_N\).

Proof

By the observation immediately preceding this lemma, the set forms a basis for \({\mathcal {H}}_N\) because \(\{\Pi _k\}_{k=0}^{2n}\) forms a basis for polynomials of degree 2n. We need only show its orthonormality with respect to the inner product on \({\mathcal {H}}_N\) induced by \(L^2(-1,1)\). Let \(j , k \in \{0,\ldots , 2n\}\). Then, making the change of variables \(\theta = \frac{\pi }{T}x\), we have

$$\begin{aligned} \int _{-1}^1 \overline{e^{-\frac{i\pi }{T}nx} \cdot \Pi _j\left( e^{\frac{i\pi }{T}x} \right) } e^{-\frac{i\pi }{T}nx} \cdot \Pi _k\left( e^{\frac{i\pi }{T}x} \right) \, \mathrm {d}x= & {} \int _{-1}^1 \overline{\Pi _j\left( e^{\frac{i\pi }{T}x} \right) } \Pi _k\left( e^{\frac{i\pi }{T}x} \right) \, \mathrm {d}x \\= & {} \int _{-\frac{\pi }{T}}^{\frac{\pi }{T}} \overline{\Pi _j\left( e^{i\theta } \right) } \Pi _k\left( e^{i\theta }\right) \frac{T}{\pi }\,\mathrm {d}\theta \\= & {} \frac{1}{2\pi }\int _{-\pi }^\pi \overline{\Pi _j(e^{i\theta })}\Pi _k(e^{i\theta }) \, f(\theta )\mathrm {d}\theta . \end{aligned}$$

By the orthonormal relationship between \(\Pi _k\) and \(\Pi _j\) on the unit circle, the basis is orthonormal on \([-1,1]\). \(\square \)
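For moderate degrees, the basis of Lemma 4.2 can be realized numerically. The sketch below (our construction) builds \(\Pi _0,\ldots ,\Pi _{2n}\) by a weighted QR factorization of monomials sampled on the arc, then verifies the orthonormality of the mapped basis on \([-1,1]\); orthogonalizing monomials is ill-conditioned, so only small n are reliable:

```python
import numpy as np

def arc_polys(theta, deg, T=2.0, M=4000):
    """Sample Pi_0,...,Pi_deg at angles theta: the polynomials orthonormal on
    the unit circle w.r.t. the weight 2T on |t| <= pi/T (Lemma 4.2).
    A weighted-QR sketch."""
    t = np.linspace(-np.pi / T, np.pi / T, M)
    w = np.full(M, (t[-1] - t[0]) / (M - 1)); w[[0, -1]] /= 2.0
    w *= 2.0 * T / (2.0 * np.pi)                 # weight f = 2T and the 1/(2 pi)
    V = np.exp(1j * t)[:, None] ** np.arange(deg + 1)
    _, R = np.linalg.qr(np.sqrt(w)[:, None] * V)
    dgn = np.diag(R)
    C = np.linalg.solve(R, np.diag(dgn / np.abs(dgn)))  # positive leading coeffs
    return (np.exp(1j * np.asarray(theta))[:, None] ** np.arange(deg + 1)) @ C

# Check of Lemma 4.2: the mapped basis is orthonormal on L^2(-1,1).
T, n = 2.0, 6
x = np.linspace(-1.0, 1.0, 8001)
wx = np.full(x.size, 2.0 / (x.size - 1)); wx[[0, -1]] /= 2.0
B = np.exp(-1j * np.pi * n * x / T)[:, None] * arc_polys(np.pi * x / T, 2 * n, T)
print(np.max(np.abs((B.conj() * wx[:, None]).T @ B - np.eye(2 * n + 1))))
```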

The Christoffel–Darboux formula for orthogonal polynomials on the unit circle states that

$$\begin{aligned} \sum _{k=0}^{N-1} \overline{\Pi _k(\zeta )} \Pi _k(z) = \frac{\overline{\Pi ^*_N(\zeta )} \Pi ^*_N(z) - \overline{\Pi _N(\zeta )} \Pi _N(z) }{1-{\overline{\zeta }}z}, \qquad z, \zeta \in {\mathbb {C}}, \quad {\overline{\zeta }}z \ne 1, \end{aligned}$$

where \(\Pi ^*_N(z) = z^N \overline{\Pi _N\left( {\overline{z}}^{-1}\right) }\) (which is also a polynomial of degree N) [35, Thm 11.42]. On the unit circle itself, where \(z = e^{i\theta }\), \(\zeta =e^{i\phi }\), this reduces, after some elementary manipulations, to

$$\begin{aligned}&\sum _{k=0}^{N-1} \overline{\Pi _k(e^{i\phi })} \Pi _k(e^{i\theta })\nonumber \\&\quad = e^{i\frac{N-1}{2}(\theta - \phi )} \cdot \mathrm {Imag}\left( \frac{\overline{e^{-i\frac{N}{2}\phi } \cdot \Pi _N\left( e^{i\phi }\right) } \cdot e^{-i\frac{N}{2}\theta } \cdot \Pi _N\left( e^{i\theta }\right) }{\sin \left( \frac{\theta -\phi }{2} \right) }\right) . \end{aligned}$$
(7)

From this general formula for orthogonal polynomials on the unit circle, we prove the following lemma regarding the prolate kernel.

Lemma 4.3

(Prolate kernel formula) For all \(x,y \in \left[ -1,1 \right] \),

$$\begin{aligned} K_N(x,y) = \mathrm {Imag}\left( \frac{\overline{e^{-\frac{i\pi }{T}\frac{N}{2}y} \cdot \Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \cdot e^{-\frac{i\pi }{T}\frac{N}{2}x} \cdot \Pi _N\left( e^{\frac{i\pi }{T}x}\right) }{\sin \left( \frac{\pi }{2T}(x-y) \right) }\right) . \end{aligned}$$

The formula in fact holds for all \(x,y \in [-T,T]\).

Remark 4.4

Setting \(T= 1\) in this formula returns the Dirichlet kernel of classical Fourier series, because \(\Pi _N(z) = z^N\) for the trivial weight \(f(\theta ) \equiv 1\).
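To spell the remark out (a small computation not in the text): for \(T = 1\) the weight in Lemma 4.2 is the constant \(f(\theta ) = 2\), so the normalized polynomials are \(\Pi _N(z) = z^N/\sqrt{2}\), and the numerator in Lemma 4.3 collapses,

$$\begin{aligned} \overline{e^{-\frac{i\pi }{2}Ny} \cdot \tfrac{1}{\sqrt{2}}e^{i\pi Ny}} \cdot e^{-\frac{i\pi }{2}Nx} \cdot \tfrac{1}{\sqrt{2}}e^{i\pi Nx} = \tfrac{1}{2} e^{\frac{i\pi }{2}N(x-y)}, \quad \text {so} \quad K_N(x,y) = \frac{\sin \left( \frac{N\pi }{2}(x-y)\right) }{2\sin \left( \frac{\pi }{2}(x-y)\right) }, \end{aligned}$$

which is the Dirichlet kernel of the orthonormal system \(\left\{ \tfrac{1}{\sqrt{2}}e^{i\pi kx}\right\} _{k=-n}^n\) on \(L^2(-1,1)\).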

Proof

By the fact that \(\left\{ e^{-\frac{i\pi }{T}nx} \cdot \Pi _k\left( e^{\frac{i\pi }{T}x} \right) \right\} _{k=0}^{2n}\) is an orthonormal basis for \({\mathcal {H}}_N\), from Lemma 4.2, we have that

$$\begin{aligned} K_N(x,y)= & {} \sum _{k=0}^{2n} \overline{e^{-\frac{i\pi }{T}ny} \Pi _k\left( e^{\frac{i\pi }{T}y}\right) } e^{-\frac{i\pi }{T}nx} \Pi _k\left( e^{\frac{i\pi }{T}x}\right) \\= & {} e^{\frac{i\pi }{T}n(y-x)} \sum _{k=0}^{N-1} \overline{\Pi _k\left( e^{\frac{i\pi }{T}y}\right) } \Pi _k\left( e^{\frac{i\pi }{T}x}\right) . \end{aligned}$$

The proof is completed by considering the Christoffel–Darboux formula for orthogonal polynomials on the unit circle in Eq. (7) (note that \(\frac{N-1}{2} = n\)). \(\square \)

Now, to ascertain asymptotics of the prolate kernel, it is sufficient to ascertain asymptotics of the orthogonal polynomials \(\{\Pi _k(z)\}_{k=0}^\infty \). These polynomials have been studied before in the literature, and are known as the Legendre polynomials on a circular arc [25].

Theorem 4.5

Let \(\{\Pi _k\}_{k=0}^\infty \) be the (normalized) orthogonal polynomials on the unit circle with respect to the weight \(f(\theta ) = 2T\cdot \chi _{\left[ -\frac{\pi }{T},\frac{\pi }{T}\right] }(\theta )\), and for \(x\in [-1,1]\) define the variable \(\eta \in [0,\pi ]\) by

$$\begin{aligned} \eta = \cos ^{-1}\left( \frac{\sin \left( x\frac{\pi }{2T} \right) }{\sin \left( \frac{\pi }{2T} \right) }\right) . \end{aligned}$$

There exists a constant \(\delta > 0\) such that for \(x \in [-1+\delta ,1-\delta ],\)

$$\begin{aligned} \Pi _N\left( e^{\frac{i\pi }{T}x}\right)= & {} \frac{e^{\frac{i\pi }{T}\frac{N}{2}x}}{\sqrt{2T\sin \left( \frac{\pi }{2T}\right) }} \bigg (e^{-\frac{i\pi }{4T}}\left( \frac{\sin \left( (1+x)\frac{\pi }{2T}\right) }{\sin \left( (1-x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}}\cos \left( N\eta - \frac{\pi }{4}\right) \nonumber \\&- \quad e^{\frac{i\pi }{4T}}\left( \frac{\sin \left( (1-x)\frac{\pi }{2T}\right) }{\sin \left( (1+x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}}\sin \left( N\eta - \frac{\pi }{4}\right) \bigg ) + {\mathcal {O}}(N^{-1}), \end{aligned}$$
(8)

and for \(x \in [1-\delta ,1],\)

$$\begin{aligned} \Pi _N\left( e^{\frac{i\pi }{T}x}\right)= & {} \frac{ e^{\frac{i\pi }{T}\frac{N}{2}x}}{\sqrt{2T\sin \left( \frac{\pi }{2T}\right) }}\left( \frac{\pi }{2}N\eta \right) ^{\frac{1}{2}} \bigg (e^{-\frac{i\pi }{4T}}\left( \frac{\sin \left( (1+x)\frac{\pi }{2T}\right) }{\sin \left( (1-x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}}J_0\left( N\eta \right) \nonumber \\&- \quad e^{\frac{i\pi }{4T}}\left( \frac{\sin \left( (1-x)\frac{\pi }{2T}\right) }{\sin \left( (1+x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}}J_1\left( N\eta \right) \bigg ) + {\mathcal {O}}(N^{-\frac{1}{2}}). \end{aligned}$$
(9)

The constants in the big \({\mathcal {O}}\) depend only on T and \(\delta \). The asymptotics for \(x\in [-1,-1+\delta ]\) are found by using the relation \(\Pi _N\left( e^{-\frac{i\pi }{T}x}\right) = \overline{\Pi _N\left( e^{\frac{i\pi }{T}x}\right) }\).

In terms of magnitude with respect to N, we have

$$\begin{aligned} \Pi _N\left( e^{\frac{i\pi }{T}x}\right) = {\left\{ \begin{array}{ll} {\mathcal {O}}(1) &{} \text { for } x \in [-1+\delta ,1-\delta ], \\ {\mathcal {O}}(N^{\frac{1}{2}}) &{} \text { for } x \in [-1,-1+\delta ] \cup [1-\delta ,1]. \end{array}\right. } \end{aligned}$$
(10)

Remark 4.6

The asymptotic order of \(\Pi _N\left( e^{\frac{i\pi }{T}x}\right) \) with respect to N in Eq. (10) is the same as for the Nth (normalized) Legendre polynomial in \([-1,1]\) [35, Thm. 8.21.6]. Further discussion of how Legendre series approximations compare to Fourier extensions appears in Sect. 8.1.

Proof

This result follows directly from Lemma A.1 in Appendix A, because if we take \(\alpha = \pi - \pi /T\) and \(f_\alpha (\theta ) \equiv 1\), then the polynomials \(\Pi _N(z) = (2T)^{-\frac{1}{2}}\phi _N\left( -z,\alpha \right) \) satisfy the orthonormality conditions that define \(\Pi _N\) as in Lemma 4.2. To obtain the asymptotic formula above, make the change of variables \(\theta = \frac{\pi }{T}x + \pi \) in the asymptotic formulae for \(\phi _N(z,\alpha )\). Be careful to note that the endpoint with explicit formula given above (\(x = 1\)) corresponds to \(\theta = 2\pi - \alpha \), which is not the endpoint with explicit formula given in Lemma A.1 (\(\theta = \alpha \)). This was done to shorten the expressions for the asymptotics at the endpoints.

To complete the proof we must prove Eq. (10). For \(x \in [-1+\delta ,1-\delta ]\), all of the terms are clearly bounded above by \({\mathcal {O}}(\delta ^{-\frac{1}{4}}) = {\mathcal {O}}(1)\). Now let \(x\in [1-\delta ,1]\). We have \(\eta ^2 \le \frac{\pi ^2}{4}(1-\cos (\eta ))\) for all \(\eta \in \left[ 0,\frac{\pi }{2}\right] \) and \(x \sin \left( \frac{\pi }{2T}\right) \le \sin \left( x\frac{\pi }{2T}\right) \le x\frac{\pi }{2T}\) for all \(x \in [0,1]\). Assuming \(\delta < \frac{1}{2}\), we have \(x \in [0,1]\) and \(\eta \in \left[ 0,\frac{\pi }{2}\right] \), and hence \(\eta ^2 \le \frac{\pi ^2}{4}(1-x)\). Since \(1-x \in [0,1]\) we have \(1-x \le \sin \left( (1-x)\frac{\pi }{2T}\right) /\sin \left( \frac{\pi }{2T}\right) \), so \(\eta ^2 \le \sin \left( (1-x)\frac{\pi }{2T}\right) \frac{\pi ^2}{4\sin \left( \frac{\pi }{2T}\right) }\). This implies that

$$\begin{aligned} \left( \frac{\eta ^2\sin \left( (1+x)\frac{\pi }{2T}\right) }{\sin \left( (1-x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} = {\mathcal {O}}(1), \end{aligned}$$

uniformly for all \(x \in [0,1]\). Note also that Bessel functions are uniformly bounded in absolute value by 1 (see [31, Eq. 10.14.1]). This makes it clear that \(\Pi _N\left( e^{\frac{i\pi }{T}x}\right) = {\mathcal {O}}(N^{\frac{1}{2}})\) for \(x\in [1-\delta ,1]\). For \(x \in [-1,-1+\delta ]\), use the relation \(\Pi _N\left( e^{-\frac{i\pi }{T}x}\right) = \overline{\Pi _N\left( e^{\frac{i\pi }{T}x}\right) }\). \(\square \)

We now have the required results to prove Theorem 4.1.

Proof of Theorem 4.1 part (i)

Let \([a,b] \subset (-1,1)\), and choose \(\tau > 0\) sufficiently small so that \([a-\tau ,b+\tau ] \subset (-1,1)\). Applying the first part of Theorem 4.5 gives us

$$\begin{aligned} \Pi _N\left( e^{\frac{i\pi }{T}x}\right) = {\mathcal {O}}(1), \quad x \in [a-\tau ,b+\tau ]. \end{aligned}$$
(11)

For the proof of part (i) we need to bound the integral \(\int _{-1}^1 |K_N(x,y)| \,\mathrm {d}y\) uniformly for \(x \in [a,b]\). We do so by dividing the interval \([-1,1]\) into the following subsets:

$$\begin{aligned} I_1= & {} \left\{ y \in [-1,1] : |y - x| \le N^{-1} \right\} , \\ I_2= & {} \left\{ y \in [-1,1] : N^{-1}< |y - x| \le \tau \right\} ,\\ I_3= & {} \left\{ y \in [-1,1] : \tau < |y - x| \right\} . \end{aligned}$$

We will obtain estimates for the kernel for \(x \in [a,b]\) and y in each of \(I_1\), \(I_2\), and \(I_3\), and then estimate the associated integral over each of \(I_1\), \(I_2\), and \(I_3\).

For \(N > 1 / \tau \), we have that \(I_1\) and \(I_2\) are nonempty and are contained within \([a - \tau , b+ \tau ] \subset (-1,1)\). By Eq. (11),

$$\begin{aligned} \Pi _N\left( e^{\frac{i\pi }{T}y}\right) = {\mathcal {O}}(1), \quad y \in I_1 \cup I_2. \end{aligned}$$

For \(y \in I_1\), we have

$$\begin{aligned} K_N(x,y) = e^{\frac{i\pi }{T}n(y-x)} \sum _{k=0}^{N-1} \overline{\Pi _k(e^{\frac{i\pi }{T}y})} \Pi _k(e^{\frac{i\pi }{T}x}) = {\mathcal {O}}(N). \end{aligned}$$

This implies

$$\begin{aligned} \int _{I_1} |K_N(x,y)| \,\mathrm {d}y \le {\mathcal {O}}(N) \int _{I_1} \,\mathrm {d}y = {\mathcal {O}}(1), \end{aligned}$$

because \(|I_1| \le 2N^{-1}\).

By Lemma 4.3,

$$\begin{aligned} K_N(x,y) = \mathrm {Imag}\left( \frac{\overline{e^{-\frac{i\pi }{T}\frac{N}{2}y} \cdot \Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \cdot e^{-\frac{i\pi }{T}\frac{N}{2}x} \cdot \Pi _N\left( e^{\frac{i\pi }{T}x}\right) }{\sin \left( \frac{\pi }{2T}(x-y) \right) }\right) . \end{aligned}$$

Note that since the sine function is concave in \([0,\pi ]\), we have \(\sin (\theta ) \ge \frac{T}{\pi }\sin \left( \frac{\pi }{T}\right) \theta \) for \(\theta \in \left[ 0,\frac{\pi }{T}\right] \), and hence \(\left| \sin \left( \frac{\pi }{2T}(x-y)\right) \right| \ge \frac{1}{2}\sin \left( \frac{\pi }{T}\right) |x-y|\) for \(x,y \in [-1,1]\). Therefore, for all \(y \in [-1,1]\),

$$\begin{aligned} |K_N(x,y)| \le {\mathcal {O}}(1)\frac{1}{|x-y|} \left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| . \end{aligned}$$

For \(y \in I_2\), this can be reduced to \(|K_N(x,y)| \le {\mathcal {O}}(1)\frac{1}{|x-y|}\). Therefore,

$$\begin{aligned} \int _{I_2} |K_N(x,y)| \, \mathrm {d}y \le {\mathcal {O}}(1) \int _{I_2} \frac{1}{|x-y|} \, \mathrm {d}y \le {\mathcal {O}}(1) \int _{N^{-1}}^{\tau } \frac{1}{s} \, \mathrm {d}s = {\mathcal {O}}(\log (N)). \end{aligned}$$

For \(y \in I_3\), since \(|x-y|^{-1} < \tau ^{-1} = {\mathcal {O}}(1)\), we have \(|K_N(x,y)| \le {\mathcal {O}}(1) \left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| \). Therefore,

$$\begin{aligned}&\int _{I_3} |K_N(x,y)| \, \mathrm {d}y \le {\mathcal {O}}(1) \int _{I_3} \left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| \, \mathrm {d}y \le {\mathcal {O}}(1) \left( \int _{-1}^1 \left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| ^2 \, \mathrm {d}y \right) ^{\frac{1}{2}} \\&\quad = {\mathcal {O}}(1). \end{aligned}$$

This proves that \(\Lambda (x;P_N) = {\mathcal {O}}(\log (N))\) uniformly for \(x \in [a,b]\). \(\square \)

Proof of Theorem 4.1 part (ii)

Let \(\delta \in \left( 0,\frac{1}{4}\right) \) be sufficiently small so that Theorem 4.5 applies to the intervals \([-1+2\delta ,1-2\delta ]\) and \([1-2\delta ,1]\). Using part (i) of the present theorem, we have that for all \(x \in [-1+\delta , 1-\delta ]\) the Lebesgue function satisfies \(\Lambda (x;P_N) = {\mathcal {O}}(\log (N)) = {\mathcal {O}}(N^{\frac{1}{2}})\) uniformly in such x. Now, since \(\Pi _N\left( e^{-\frac{i\pi }{T}x}\right) = \overline{\Pi _N\left( e^{\frac{i\pi }{T}x}\right) }\), it follows that \(K_N(-x,y) = \overline{K_N(x,-y)}\), so that \(\Lambda (-x;P_N) = \Lambda (x;P_N)\). Therefore, to complete the proof we need only show that \(\Lambda (x;P_N) = {\mathcal {O}}(N^{\frac{1}{2}})\) uniformly for \(x \in [1-\delta ,1]\). For such x, we divide the interval \([-1,1]\) into the following subsets:

$$\begin{aligned} I_1= & {} \left\{ y \in [-1,1] : |y - x| \le N^{-1} \text { or } |1-y| \le N^{-1}\right\} , \\ I_2= & {} \left\{ y \in [-1,1] : N^{-1}< |y - x| \le \delta \text { and } |1-y|> N^{-1}\right\} ,\\ I_3= & {} \left\{ y \in [-1,1] : \delta < |y - x| \text { and } |1-y| > N^{-1} \right\} . \end{aligned}$$

By Theorem 4.5,

$$\begin{aligned} \Pi _N\left( e^{\frac{i\pi }{T}x}\right) = {\mathcal {O}}(N^{\frac{1}{2}}), \quad x \in [-1,1]. \end{aligned}$$

Therefore,

$$\begin{aligned} K_N(x,y) = e^{\frac{i\pi }{T}n(y-x)} \sum _{k=0}^{N-1} \overline{\Pi _k(e^{\frac{i\pi }{T}y})} \Pi _k(e^{\frac{i\pi }{T}x}) = {\mathcal {O}}(N^2). \end{aligned}$$

By the Cauchy–Schwarz inequality and the fact that \(|I_1| \le 3N^{-1}\), we have

$$\begin{aligned}&\int _{I_1} |K_N(x,y)| \,\mathrm {d}y \le \left( \int _{I_1} |K_N(x,y)|^2 \,\mathrm {d}y \right) ^{\frac{1}{2}} \left( \int _{I_1} \,\mathrm {d}y\right) ^{\frac{1}{2}} \\&\quad \le \left( \frac{3}{N}\right) ^{\frac{1}{2}}\left( \int _{-1}^1 |K_N(x,y)|^2 \,\mathrm {d}y \right) ^{\frac{1}{2}}. \end{aligned}$$

By the connection between \(K_N\) and \(P_N\), \(\int _{-1}^1 |K_N(x,y)|^2 \,\mathrm {d}y = P_N\left( \overline{K_N(x,\cdot )}\right) (x)\). Since \(\overline{K_N(x,y)} = K_N(y,x)\) and because \(K_N(\cdot ,x) \in {\mathcal {H}}_N\) for each \(x \in [-1,1]\), we have

$$\begin{aligned} \int _{-1}^1 |K_N(x,y)|^2 \,\mathrm {d}y = K_N(x,x). \end{aligned}$$

Therefore,

$$\begin{aligned} \int _{I_1} |K_N(x,y)| \,\mathrm {d}y = {\mathcal {O}}(N^{-\frac{1}{2}}) \left( {\mathcal {O}}(N^2)\right) ^{\frac{1}{2}} = {\mathcal {O}}(N^{\frac{1}{2}}). \end{aligned}$$

Just as in the proof of part (i) of the theorem, but this time using the estimate \(\Pi _N\left( e^{\frac{i\pi }{T}x}\right) = {\mathcal {O}}(N^{\frac{1}{2}})\), we have for all \(x,y \in [-1,1]\),

$$\begin{aligned} |K_N(x,y)| \le {\mathcal {O}}(N^{\frac{1}{2}})\frac{1}{|x-y|} \left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| . \end{aligned}$$

Therefore, for \(y \in I_3\),

$$\begin{aligned} |K_N(x,y)| \le {\mathcal {O}}(N^{\frac{1}{2}})\left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| , \end{aligned}$$

because \(|x-y| > \delta \) for \(y\in I_3\). Hence,

$$\begin{aligned} \int _{I_3} |K_N(x,y)| \, \mathrm {d}y\le & {} {\mathcal {O}}(N^{\frac{1}{2}}) \int _{I_3} \left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| \, \mathrm {d}y \\\le & {} {\mathcal {O}}(N^{\frac{1}{2}}) \left( \int _{-1}^1 \left| \Pi _N\left( e^{\frac{i\pi }{T}y}\right) \right| ^2 \, \mathrm {d}y \right) ^{\frac{1}{2}}\\= & {} {\mathcal {O}}(N^{\frac{1}{2}}). \end{aligned}$$

All that remains is to show that \(\int _{I_2} |K_N(x,y)| \, \mathrm {d}y = {\mathcal {O}}(N^{\frac{1}{2}})\) uniformly for \(x \in [1-\delta ,1]\). For \(x \in [1-\delta ,1]\) and \(y \in I_2\), we have \(y \in [1-2\delta ,1]\) so that the asymptotic expression in Theorem 4.5 holds. Define the variables

$$\begin{aligned} \eta = \cos ^{-1}\left( \frac{\sin \left( x\frac{\pi }{2T} \right) }{\sin \left( \frac{\pi }{2T} \right) }\right) , \quad \lambda = \cos ^{-1}\left( \frac{\sin \left( y\frac{\pi }{2T} \right) }{\sin \left( \frac{\pi }{2T} \right) }\right) . \end{aligned}$$

Take the asymptotic expressions for \(\Pi _N\) in Theorem 4.5 for the x and y currently in question, and consider the numerator in the formula for the kernel \(K_N(x,y)\) (Lemma 4.3). An asymptotic formula is as follows:

$$\begin{aligned}&\mathrm {Imag}\left( \overline{e^{-\frac{i\pi }{T}\frac{N}{2}y} \cdot \Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \cdot e^{-\frac{i\pi }{T}\frac{N}{2}x} \cdot \Pi _N\left( e^{\frac{i\pi }{T}x}\right) \right) \end{aligned}$$
(12)
$$\begin{aligned}&\quad = \frac{1}{2T} \left( \frac{\pi }{2}N\eta \right) ^{\frac{1}{2}} \left( \frac{\pi }{2}N\lambda \right) ^{\frac{1}{2}}\nonumber \\&\qquad \cdot \Bigg ( \left( \frac{\sin \left( (1+x)\frac{\pi }{2T}\right) }{\sin \left( (1-x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} J_0(N\eta ) \left( \frac{\sin \left( (1-y)\frac{\pi }{2T}\right) }{\sin \left( (1+y)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}}J_1\left( N\lambda \right) \nonumber \\&\qquad - \left( \frac{\sin \left( (1-x)\frac{\pi }{2T}\right) }{\sin \left( (1+x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} J_1(N\eta ) \left( \frac{\sin \left( (1+y)\frac{\pi }{2T}\right) }{\sin \left( (1-y)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} J_0\left( N\lambda \right) \Bigg ) + {\mathcal {O}}(1) \quad \end{aligned}$$
(13)
$$\begin{aligned}&\quad = N^{\frac{1}{2}}\frac{\pi }{4T} \nonumber \\&\qquad \cdot \Bigg (\left( \frac{\eta ^2\sin \left( (1+x)\frac{\pi }{2T}\right) }{\sin \left( (1-x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} J_0(N\eta ) \left( \frac{\sin \left( (1-y)\frac{\pi }{2T}\right) }{\sin \left( (1+y)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} \left( N\lambda \right) ^{\frac{1}{2}}J_1\left( N\lambda \right) \nonumber \\&\qquad - \left( \frac{\sin \left( (1-x)\frac{\pi }{2T}\right) }{\sin \left( (1+x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} \left( N\eta \right) ^{\frac{1}{2}}J_1\left( N\eta \right) \left( \frac{\lambda ^2\sin \left( (1+y)\frac{\pi }{2T}\right) }{\sin \left( (1-y)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} J_0(N\lambda ) \Bigg )\nonumber \\&\qquad + {\mathcal {O}}(1). \end{aligned}$$
(14)

This was an important step in the proof, because there was cancellation when we took the imaginary part. This cancellation is essential for the result to hold, and it is the reason for deriving and including a fully explicit description of the leading order asymptotics of the polynomials in Appendix A.

We will now proceed to find upper bounds on the resulting expression. We showed in the proof of Theorem 4.5 that

$$\begin{aligned} \left( \frac{\eta ^2\sin \left( (1+x)\frac{\pi }{2T}\right) }{\sin \left( (1-x)\frac{\pi }{2T}\right) }\right) ^{\frac{1}{4}} = {\mathcal {O}}(1) \end{aligned}$$

uniformly for all \(x \in [0,1]\). The same is true when x and \(\eta \) are replaced by y and \(\lambda \).

It is also straightforward to show that \((1-y)\frac{\pi }{2T} \le \sin \left( \frac{\pi }{2T}\right) \lambda ^2\) for \(y \in [0,1]\) and \(\lambda \in \left[ 0,\frac{\pi }{2}\right] \). From this, we have that for \(y \in I_2\), \(\lambda \ge \sqrt{\frac{\pi }{2TN}}\). Combining this with the fact that \(J_\alpha (t) = {\mathcal {O}}\left( t^{-\frac{1}{2}}\right) \) as \(t \rightarrow \infty \) (see [31, Eq. 10.17.3]), we get that \(J_0(N\lambda ) = {\mathcal {O}}\left( N^{-\frac{1}{4}}\right) \).

Note also that Bessel functions are uniformly bounded in absolute value by 1 (see [31, Eq. 10.14.1]). Furthermore, as \(t \rightarrow \infty \), \(t^{\frac{1}{2}}J_\alpha (t) = {\mathcal {O}}(1)\) (see [31, Eq. 10.17.3]). Collecting the bounds mentioned in the last three paragraphs, we conclude that for \(y \in I_2\), we have

$$\begin{aligned}&\mathrm {Imag}\left( \overline{e^{-\frac{i\pi }{T}\frac{N}{2}y} \cdot \Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \cdot e^{-\frac{i\pi }{T}\frac{N}{2}x} \cdot \Pi _N\left( e^{\frac{i\pi }{T}x}\right) \right) \nonumber \\&\quad = {\mathcal {O}}(N^{\frac{1}{2}})J_0(N\eta )(1-y)^{\frac{1}{4}} + {\mathcal {O}}(N^{\frac{1}{4}}). \end{aligned}$$
(15)

To conclude, we prove two refinements of Eq. (15), depending on whether \(x \in [1-\delta ,1-N^{-1}]\) or \(x \in [1-N^{-1},1]\). When \(x \in [1-\delta ,1-N^{-1}]\), we have \(J_0(N\eta ) = {\mathcal {O}}(N^{-\frac{1}{4}})\) (just like for \(y \in I_2\) discussed above), and so,

$$\begin{aligned} \mathrm {Imag}\left( \overline{e^{-\frac{i\pi }{T}\frac{N}{2}y} \cdot \Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \cdot e^{-\frac{i\pi }{T}\frac{N}{2}x} \cdot \Pi _N\left( e^{\frac{i\pi }{T}x}\right) \right) = {\mathcal {O}}(N^{\frac{1}{4}}). \end{aligned}$$

This implies that \(K_N(x,y) = {\mathcal {O}}\left( \frac{N^{\frac{1}{4}}}{|x-y|} \right) \) for \(x \in [1-\delta ,1-N^{-1}]\) and \(y\in I_2\). Therefore,

$$\begin{aligned} \int _{I_2} |K_N(x,y)| \, \mathrm {d}y \le {\mathcal {O}}(N^{\frac{1}{4}}) \int _{I_2} \frac{1}{|x-y|} \, \mathrm {d}y \le {\mathcal {O}}(N^{\frac{1}{4}}) \int _{N^{-1}}^{\delta } \frac{1}{s} \, \mathrm {d}s = {\mathcal {O}}(N^{\frac{1}{4}}\log (N)). \end{aligned}$$

Finally, when \(x \in [1- N^{-1},1]\) and \(y \in I_2\), we have \(1-y = x-y + 1-x \le x - y + N^{-1}\) (since \(x \ge y\)). By concavity of the function \(t \mapsto |t|^{\frac{1}{4}}\) at \(t = x-y > 0\), we have

$$\begin{aligned} (x-y + N^{-1})^{\frac{1}{4}} \le (x-y)^{\frac{1}{4}} + \frac{1}{4} N^{-1}(x-y)^{-\frac{3}{4}}. \end{aligned}$$

Substituting this bound into Eq. (15), we get,

$$\begin{aligned} K_N(x,y)= & {} {\mathcal {O}}\left( \frac{N^{\frac{1}{2}}(1-y)^{\frac{1}{4}}}{|x-y|} \right) \\= & {} {\mathcal {O}}\left( N^{\frac{1}{2}}|x-y|^{-\frac{3}{4}}\right) + {\mathcal {O}}\left( N^{-\frac{1}{2}}|x-y|^{-\frac{7}{4}}\right) . \end{aligned}$$

The integral is bounded in the predictable manner as follows:

$$\begin{aligned} \int _{I_2} |K_N(x,y)| \,\mathrm {d}y= & {} {\mathcal {O}}(N^{\frac{1}{2}})\int _{I_2} |x-y|^{-\frac{3}{4}}\,\mathrm {d}y + {\mathcal {O}}(N^{-\frac{1}{2}})\int _{I_2} |x-y|^{-\frac{7}{4}} \,\mathrm {d}y \\\le & {} {\mathcal {O}}(N^{\frac{1}{2}}) + {\mathcal {O}}(N^{-\frac{1}{2}})\int _{N^{-1}}^1 s^{-\frac{7}{4}} \, \mathrm {d}s \\= & {} {\mathcal {O}}(N^{\frac{1}{2}}) + {\mathcal {O}}(N^{-\frac{1}{2}} \cdot N^{\frac{3}{4}}) \\= & {} {\mathcal {O}}(N^{\frac{1}{2}}). \end{aligned}$$

Since this covers all \(x \in [-1,1]\) with finitely many uniform \({\mathcal {O}}(N^{\frac{1}{2}})\) upper bounds, we have the final result uniformly for all \(x \in [-1,1]\). \(\square \)

5 Best Uniform Approximation by Fourier Extensions

We will compare best uniform approximation in three spaces. For an odd positive integer \(N = 2n+1\), define

$$\begin{aligned} {\mathcal {H}}_N= & {} \mathrm{span}\left\{ e^{\frac{i\pi }{T}kx}\right\} _{k=-n}^n \subset C([-1,1]), \\ {\mathcal {T}}_N= & {} \mathrm{span}\left\{ e^{\frac{i\pi }{T}kx}\right\} _{k=-n}^n \subset C_{\mathrm {per}}([-T,T]), \\ {\mathcal {P}}_N= & {} \mathrm{span}\left\{ x^k\right\} _{k=0}^{2n} \subset C([-1,1]). \end{aligned}$$

We will see that best uniform approximation by Fourier extensions is more similar to best approximation by algebraic polynomials than to that by classical Fourier series. The best approximation error functionals for these spaces are defined by

$$\begin{aligned} E(f;{\mathcal {H}}_N)= & {} \inf _{r_N \in {\mathcal {H}}_N}\Vert f-r_N\Vert _{L^\infty (-1,1)} \quad \text { for all } f \in C([-1,1]), \\ E(g;{\mathcal {T}}_N)= & {} \inf _{t_N \in {\mathcal {T}}_N}\Vert g-t_N\Vert _{L^\infty (-T,T)} \,\,\quad \text { for all } g \in C_{\mathrm {per}}([-T,T]), \\ E(h;{\mathcal {P}}_N)= & {} \inf _{p_N \in {\mathcal {P}}_N}\Vert h-p_N\Vert _{L^\infty (-1,1)} \quad \text { for all } h \in C([-1,1]). \end{aligned}$$

We wish to find bounds in terms of N and the regularity of the functions to be approximated.

For \(f \in C([-1,1])\) the modulus of continuity is defined by [11, 28]

$$\begin{aligned} \omega (f;\delta ) = \sup _{\begin{array}{c} x,y \in [-1,1]\\ |x-y|<\delta \end{array}} |f(x) - f(y)|. \end{aligned}$$
(16)

For \(g \in C_{\mathrm {per}}([-T,T])\) we define the periodic modulus of continuity to be

$$\begin{aligned} \omega _{\mathrm {per}}(g;\delta ) = \sup _{\begin{array}{c} x,y \in [-T,T]\\ d_T(x,y)<\delta \end{array}} |g(x) - g(y)|, \end{aligned}$$

where \(d_T(x,y)\) is the distance between x and y as elements of the periodic interval \([-T,T]\). The following results are immediate.

Lemma 5.1

If f is in the Hölder space \(C^\alpha ([-1,1])\) for \(\alpha \in [0,1]\), then \(\omega (f;\delta ) \le \delta ^\alpha |f|_{C^\alpha ([-1,1])}\) for all \(\delta > 0\).

Lemma 5.2

If \(f \in C([-1,1])\) is uniformly Dini–Lipschitz [42], i.e.,

$$\begin{aligned} \lim _{\delta \searrow 0}\sup _{\begin{array}{c} x,y \in [-1,1]\\ |x-y|<\delta \end{array}} \left| (f(x)- f(y)) \log \delta \right| = 0, \end{aligned}$$

then \(\omega (f;\delta ) = o(1 / |\log \delta |)\).

5.1 A Jackson-Type Theorem

The original Jackson theorem for classical Fourier series asserts that for all \(k = 0,1,2,\ldots \) and all functions \(g \in C_{\mathrm {per}}^k([-T,T])\), we have

$$\begin{aligned} E(g;{\mathcal {T}}_N) = {\mathcal {O}}(N^{-k}) \, \omega _{\mathrm {per}}\left( g^{(k)};\frac{1}{N}\right) , \end{aligned}$$

where the constant in the big \({\mathcal {O}}\) depends on k and T [19, Thm. 1.IV].

There is also a polynomial version of Jackson’s theorem, which states that for all \(k = 0,1,2,\ldots \) and all functions \(h \in C^k([-1,1])\), we have

$$\begin{aligned} E(h;{\mathcal {P}}_N) = {\mathcal {O}}(N^{-k})\, \omega \left( h^{(k)};\frac{1}{N}\right) , \end{aligned}$$
(17)

where the constant in the big \({\mathcal {O}}\) depends only on k [19, Thm. 1.VIII]. We prove a version of Jackson’s theorem for Fourier extensions.

Theorem 5.3

(Jackson-type) For all \(k = 0,1,2,\ldots \) and all functions \(f \in C^k([-1,1])\),

$$\begin{aligned} E(f;{\mathcal {H}}_N) = {\mathcal {O}}(N^{-k}) \, \omega \left( f^{(k)};\frac{1}{N}\right) , \end{aligned}$$

where the constant in the big \({\mathcal {O}}\) depends only on k and T.

Lemma 5.4

(Periodic extension) Let \(f\in C^k([-1,1])\). Then f can be extended to a function \(g\in C_{\mathrm {per}}^k([-T,T])\) such that

$$\begin{aligned} \omega _{\mathrm {per}}(g^{(k)};\delta ) \le \frac{T}{T-1}\omega (f^{(k)};\delta ). \end{aligned}$$

Proof

First let \(k=0\). Define the function \(g \in C_{\mathrm {per}}([-T,T])\) such that for \(x\in [-1,1]\), \(g(x) = f(x)\), and for \(x \in [-T,T]\backslash [-1,1]\), g(x) is the linear function that interpolates f at \(\{-1,1\}\). We distinguish between four different cases for points \(x,y \in [-T,T]\) such that \(d_T(x,y) \le \delta \): (i) if \(x,y\in [-1,1]\), then

$$\begin{aligned} |g(x)-g(y)|=|f(x)-f(y)| \le \omega (f;\delta ); \end{aligned}$$

(ii) if \(x,y\in [-T,T]\backslash [-1,1]\), then since g is linear in this region,

$$\begin{aligned} |g(x)-g(y)|\le \frac{|f(1)-f(-1)|}{2(T-1)}\delta ; \end{aligned}$$

(iii) if \(x\in [-1,1],y\in [-T,T]\backslash [-1,1]\), then

$$\begin{aligned} |g(x)-g(y)|\le |f(\xi )-f(x)|+|g(y)-g(\xi )|\le \omega (f,\delta )+\frac{|f(1)-f(-1)|}{2(T-1)}\delta , \end{aligned}$$

where \(\xi \) is the closest of the endpoints to x; and (iv) if \(x\in [-T,T]\backslash [-1,1]\) and \(y\in [-1,1]\), the bound is similar to the previous one. Now it remains to bound \(|f(1)-f(-1)|\) in terms of \(\omega (f;\delta )\). For any positive integer m, we can use a telescoping sum,

$$\begin{aligned} |f(1)-f(-1)|\le \sum _{k=0}^{2m-1}\left| f\left( -1+\frac{k}{m}\right) -f \left( -1+\frac{k+1}{m}\right) \right| \le 2m\omega \left( f;\frac{1}{m}\right) . \end{aligned}$$

It suffices to take \(m>1/\delta \) to show that \( \frac{|f(1)-f(-1)|}{2(T-1)}\delta \le \frac{1}{T-1}\omega (f;\delta )\). Combining all four cases demonstrates \(\omega _{\mathrm {per}}(g;\delta )\le \frac{T}{T-1}\omega (f;\delta )\).

Now let \(k>0\) and choose as extension of f the \(2(k+1)\)th degree Hermite interpolant in the points \(x=1\) and \(x=-1\); then \(g^{(k)}(x)\) is the linear interpolation between \(f^{(k)}(1)\) and \(f^{(k)}(-1)\) for \(x\in [-T,T]\backslash [-1,1]\). By the case \(k=0\) proved above, \(\omega _{\mathrm {per}}(g^{(k)};\delta )\le \frac{T}{T-1}\omega (f^{(k)};\delta )\). \(\square \)

Proof of Theorem 5.3

Let \(f \in C^k([-1,1])\). By Lemma 5.4, this function can be extended to a function \(g \in C_{\mathrm {per}}^k([-T,T])\) such that \(\omega _{\mathrm {per}}(g^{(k)};\delta )\) is bounded by \(\frac{T}{T-1}\omega (f^{(k)};\delta )\). Let \(t_N \in {\mathcal {T}}_N\) be the best uniform approximation to g; then (trivially) there exists a function \(r_N \in {\mathcal {H}}_N\) such that \(r_N(x) = t_N(x)\) for all \(x \in [-1,1]\). Hence,

$$\begin{aligned} E(f;{\mathcal {H}}_N) \le \Vert f - r_N\Vert _{L^\infty (-1,1)} \le \Vert g - t_N\Vert _{L^\infty (-T,T)} = E(g;{\mathcal {T}}_N). \end{aligned}$$

The original Jackson theorem can now be used to bound \(E(g;{\mathcal {T}}_N)\):

$$\begin{aligned} E(g;{\mathcal {T}}_N) = {\mathcal {O}}(N^{-k}) \, \omega _{\mathrm {per}}\left( g^{(k)};\frac{1}{N}\right) \le {\mathcal {O}}(N^{-k}) \,\omega \left( f^{(k)};\frac{1}{N}\right) . \end{aligned}$$

This proves the result. \(\square \)

The combination of Lemma 5.1 and Theorem 5.3 yields the following useful fact. If \(f \in C^{k,\alpha }([-1,1])\) for \(k \ge 0\) and \(\alpha \in [0,1]\), then

$$\begin{aligned} E(f;{\mathcal {H}}_N) = {\mathcal {O}}(N^{-k-\alpha }) \, |f^{(k)}|_{C^\alpha ([-1,1])}. \end{aligned}$$

In what follows, we will see that this is not actually tight, in the sense that functions in \(C^{k,\alpha }([-1,1])\) can have best approximation errors decaying at a rate faster than \(N^{-k-\alpha }\). This is in contrast to the situation for classical Fourier series, in which the analogous bound is indeed tight (see Theorem 5.5).

5.2 A Bernstein-Type Theorem

While Jackson-type theorems bound the best approximation error functional by powers of N and moduli of continuity of derivatives, Bernstein-type theorems attempt to do the opposite.

Bernstein-type theorems follow from Bernstein inequalities. For classical Fourier series, the Bernstein inequality is

$$\begin{aligned} \Vert t_N'\Vert _\infty \le \frac{\pi }{T}n \Vert t_N\Vert _\infty \end{aligned}$$
(18)

for all \(t_N \in {\mathcal {T}}_N\), where \(N = 2n+1\) [11, Ch. 4, Th. 2.4]. Equality holds when \(t_N(x) \propto e^{\pm \frac{i\pi }{T}nx}\). From Bernstein’s inequality it is possible to show that there exists \(C_T >0\) such that [11, Ch. 7, Thm. 3.1],

$$\begin{aligned} \omega _{\mathrm {per}}\left( g;\frac{1}{N}\right) \le \frac{C_T}{n}\sum _{\begin{array}{c} k=3 \\ k \text { odd} \end{array}}^N E(g;{\mathcal {T}}_{k}). \end{aligned}$$

Now, this is not precisely a converse to Jackson’s theorem, but it implies the following tightness property.

Theorem 5.5

(Jackson–Bernstein [11, Ch. 7, Thm. 3.3]) Let \(g \in C_{\mathrm {per}}([-T,T])\) and \(\alpha \in (0,1)\). It holds that

$$\begin{aligned} E(g;{\mathcal {T}}_N) = {\mathcal {O}}(N^{-\alpha }) \iff \omega _{\mathrm {per}}\left( g;\delta \right) = {\mathcal {O}}(\delta ^{\alpha }). \end{aligned}$$

The direct analogue of Theorem 5.5 for best uniform approximation by algebraic polynomials in \(C([-1,1])\) is not true. Indeed, consider the function \(h(x) = (1-x^2)^\alpha \), whose modulus of continuity satisfies \(\omega (h;\delta ) = {\mathcal {O}}(\delta ^{\alpha })\) by Lemma 5.1. Define the function \(g(\theta ) = h\left( \cos \left( \theta \right) \right) = \left| \sin \left( \theta \right) \right| ^{2\alpha }\) for \(\theta \in [-\pi ,\pi ]\). If \(\alpha < \frac{1}{2}\), then \(g \in C^{2\alpha }([-\pi ,\pi ])\), so \(E(g;{\mathcal {T}}_N) = {\mathcal {O}}(N^{-2\alpha })\) by Theorem 5.5. Furthermore, the best approximations will be even since g is even, so the approximants are in fact polynomials in \(\cos (\theta )\). This implies that the best approximations to h are polynomials in x, showing that \(E(h;{\mathcal {P}}_N) = {\mathcal {O}}(N^{-2\alpha })\), twice as good as would be expected from Jackson’s theorem for algebraic polynomials (Eq. (17)).

It was only in the late twentieth century that characterizations of functions \(h\in C([-1,1])\) for which \(E(h;{\mathcal {P}}_N) = {\mathcal {O}}(N^{-\alpha })\) were developed [11, Ch. 8]. The key insight is to use weighted moduli of continuity. The weighted modulus of continuity with weight \(\phi : [-1,1] \rightarrow [0,\infty )\) for a function \(f \in C([-1,1])\) is defined as

$$\begin{aligned} \omega _{\phi }(f;\delta ) = \sup _{\begin{array}{c} x\pm h \in [-1,1]\\ 0 \le h < \phi (x)\delta \end{array}} |f(x+h) - f(x-h)|. \end{aligned}$$

Taking the constant weight \(\phi (x) = \frac{1}{2}\) recovers the standard modulus of continuity in Eq. (16).
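To make the definition concrete, here is a small Python sketch (assuming numpy) that estimates \(\omega _{\phi }(f;\delta )\) by brute force on a grid; the grid maximum is only a lower bound on the supremum, and the example function and parameters are illustrative:

```python
import numpy as np

def weighted_modulus(f, phi, delta, nx=2001, nh=400):
    # Brute-force lower bound for omega_phi(f; delta): maximize |f(x+h) - f(x-h)|
    # over grid points x in [-1,1] and steps 0 <= h < phi(x)*delta with x +/- h in [-1,1].
    x = np.linspace(-1.0, 1.0, nx)
    best = 0.0
    for h in np.linspace(0.0, delta * phi(x).max(), nh):
        ok = (x + h <= 1.0) & (x - h >= -1.0) & (h < phi(x) * delta)
        if ok.any():
            best = max(best, np.abs(f(x[ok] + h) - f(x[ok] - h)).max())
    return best

# Example: h(x) = sqrt(1 - x^2) has standard modulus omega(h; delta) of order
# sqrt(delta), but its weighted modulus with phi(x) = sqrt(1 - x^2) is
# observed to scale like delta, consistent with the faster rates discussed here.
phi = lambda t: np.sqrt(1.0 - t * t)
f = lambda t: np.sqrt(np.clip(1.0 - t * t, 0.0, None))
for d in [1e-1, 1e-2, 1e-3]:
    print(d, weighted_modulus(f, phi, d))
```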

It turns out that if this weighted modulus of continuity is used with \(\phi (x) = \sqrt{1-x^2}\), then there is a direct analogue of Theorem 5.5 for best uniform approximation by algebraic polynomials.

Theorem 5.6

(Ditzian–Totik [12, Cor. 7.2.5]) Let \(h \in C([-1,1])\) and \(\alpha \in (0,1)\). It holds that

$$\begin{aligned} E(h;{\mathcal {P}}_N) = {\mathcal {O}}(N^{-\alpha }) \iff \omega _{\phi }\left( h ;\delta \right) = {\mathcal {O}}(\delta ^{\alpha }), \end{aligned}$$

where \(\phi (x) = \sqrt{1-x^2}\).

The proof of Theorem 5.6 depends on the Bernstein inequality for algebraic polynomials, which states that for all \(p_N \in {\mathcal {P}}_N\),

$$\begin{aligned} \Vert \phi \cdot p_N' \Vert _{L^\infty (-1,1)} \le N\Vert p_N\Vert _{L^\infty (-1,1)}, \end{aligned}$$
(19)

where \(\phi (x) = \sqrt{1-x^2}\) [11, Ch. 4, Cor. 1.2]. Compare this with the Bernstein inequality for classical Fourier series (Eq. (18)). If we wish to remove the factor of \(\phi \) from the left-hand side of Eq. (19), then we must change N to \(N^2\) on the right-hand side; the result is Markov's inequality [11, Ch. 4, Thm. 1.4].

A Bernstein inequality for Fourier extensions was proved by Videnskii [38]; see also [6, p. 242] and [29, Sec. 2]. It states that for all \(r_N \in {\mathcal {H}}_N\),

$$\begin{aligned} \Vert \phi \cdot r_N'\Vert _{L^\infty (-1,1)} \le \frac{\pi }{T} n \Vert r_N\Vert _{L^\infty (-1,1)}, \end{aligned}$$
(20)

where the weight function is

$$\begin{aligned} \phi (x) = \frac{\sqrt{\sin \left( (1-x)\frac{\pi }{2T} \right) \sin \left( (1+x)\frac{\pi }{2T} \right) }}{\cos \left( x\frac{\pi }{2T}\right) }. \end{aligned}$$
(21)

Note that \(\sin \left( (1-x)\frac{\pi }{2T}\right) \sin \left( (1+x)\frac{\pi }{2T}\right) = \sin ^2\left( \frac{\pi }{2T}\right) - \sin ^2\left( x\frac{\pi }{2T}\right) \). Since \(\tan \) is convex on \([0,\frac{\pi }{2})\) and vanishes at 0, we have \(|\tan \left( x\frac{\pi }{2T}\right) | \le |x|\tan \left( \frac{\pi }{2T}\right) \) for \(x \in [-1,1]\), which rearranges to \(\sin ^2\left( \frac{\pi }{2T}\right) - \sin ^2\left( x\frac{\pi }{2T}\right) \ge \sin ^2\left( \frac{\pi }{2T}\right) (1-x^2)\cos ^2\left( x\frac{\pi }{2T}\right) \). Also, \(|\sin \left( \frac{\pi }{2T}(1\pm x)\right) | \le |\frac{\pi }{2T}(1\pm x)|\) and \(\cos \left( x\frac{\pi }{2T}\right) \in [\cos \left( \frac{\pi }{2T}\right) ,1]\) for \(x\in [-1,1]\). Therefore,

$$\begin{aligned} \sin \left( \frac{\pi }{2T}\right) \sqrt{1-x^2} \le \phi (x) \le \frac{\pi }{2T\cos \left( \frac{\pi }{2T}\right) }\sqrt{1-x^2}, \end{aligned}$$

and we can change Eq. (20) to

$$\begin{aligned} \Vert \phi \cdot r_N'\Vert _{L^\infty (-1,1)} \le \frac{\pi }{T\sin \left( \frac{\pi }{2T}\right) } n \Vert r_N\Vert _{L^\infty (-1,1)}, \end{aligned}$$
(22)

where \(\phi (x) = \sqrt{1-x^2}\). Using the Bernstein inequality in Eq. (22) we can prove a Bernstein-type theorem for Fourier extensions.
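The sandwich bounds on the weight are easy to verify numerically. The following Python check (assuming numpy; the value of T is arbitrary) evaluates both sides of the inequality on a fine grid:

```python
import numpy as np

# Numerical sanity check (not a proof) of the bounds on the weight in Eq. (21).
T = 1.3
c = np.pi / (2 * T)
x = np.linspace(-1.0, 1.0, 100001)
phi = np.sqrt(np.sin(c * (1 - x)) * np.sin(c * (1 + x))) / np.cos(c * x)
root = np.sqrt(1.0 - x * x)
assert np.all(np.sin(c) * root <= phi + 1e-12)          # lower bound
assert np.all(phi <= c / np.cos(c) * root + 1e-12)      # upper bound
```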

Theorem 5.7

(Bernstein-type) There exists a constant \(C_T>0\) such that for all \(f \in C([-1,1])\), the following holds:

$$\begin{aligned} \omega _{\phi }\left( f;\frac{1}{N}\right) \le \frac{C_T}{n}\sum _{\begin{array}{c} k=3 \\ k \text { odd} \end{array}}^N E(f;{\mathcal {H}}_k), \end{aligned}$$

where \(\phi (x) = \sqrt{1-x^2}\) and \(N = 2n+1\).

Proof

This follows directly from [11, Ch. 6, Thm. 6.2] and [11, Ch. 7, Thm. 5.1(b)], with \(r =1\), \(\mu = 1\), \(X = L^\infty (-1,1)\), \(\Phi _n = {\mathcal {H}}_N\), and \(Y = W_\infty ^1(\phi ) := \{f \in W^{1,1}(-1,1): \phi \cdot f' \in L^\infty (-1,1)\}\), where \(W^{1,1}(-1,1)\) is the Sobolev space of absolutely continuous functions on \((-1,1)\). \(\square \)

From this Bernstein-type theorem for Fourier extensions, we get one half of an equivalence theorem between best approximation errors and weighted moduli of continuity. For the full equivalence, one must prove Conjecture 5.9 below.

Theorem 5.8

Let \(f \in C([-1,1])\) and \(\alpha \in (0,1)\). It holds that

$$\begin{aligned} E(f;{\mathcal {H}}_N) = {\mathcal {O}}(N^{-\alpha }) \implies \omega _{\phi }\left( f; \delta \right) = {\mathcal {O}}(\delta ^\alpha ). \end{aligned}$$

If Conjecture 5.9 is true, then the reverse implication holds too.

Proof

The forward implication follows immediately from Theorem 5.7. Suppose now that Conjecture 5.9 is true. Given \(f \in C^1([-1,1])\), define F on the arc A by \(F\left( e^{\frac{i\pi }{T}x}\right) = f(x)\), and note that \(f \in C^1([-1,1])\) if and only if \(F \in C^1(A)\). Since \(x \mapsto q_n(e^{\frac{i\pi }{T}x}) \in {\mathcal {H}}_N\) and \(|\mu (e^{\frac{i\pi }{T}x})| \le \frac{\pi }{T}\phi (x)\) (in the notation of Conjecture 5.9), the conjecture would give

$$\begin{aligned} E(f;{\mathcal {H}}_N) \le \frac{C_T}{n}\Vert \phi \cdot f' \Vert _{L^\infty (-1,1)} \end{aligned}$$
(23)

for all \(f \in C^1([-1,1])\). We wish to extend this to all \(f \in W^{1,1}(-1,1)\) such that \(\phi \cdot f' \in L^\infty (-1,1)\) by a density argument, where \(W^{1,p}(-1,1)\) is the Sobolev space of \(L^p(-1,1)\) functions whose weak derivatives lie in \(L^p(-1,1)\). For such a function f, one can verify that the functions \(f_\rho (x) = f(\rho x)\) for \(\rho \in (0,1)\) satisfy: \(f_\rho \in W^{1,\infty }(-1,1)\), \(f_\rho \rightarrow f\) in \(L^\infty \) as \(\rho \rightarrow 1\), and \(\Vert \phi \cdot f_\rho '\Vert _\infty \le \Vert \phi \cdot f'\Vert _\infty \). For each \(\rho \) and \(\varepsilon > 0\), mollification of \(f_\rho \) yields \(f_{\rho ,\varepsilon } \in C^1([-1,1])\) such that \(\Vert f_{\rho ,\varepsilon } - f_\rho \Vert _{L^\infty (-1,1)} < \varepsilon \) and \(\Vert \phi \cdot f_{\rho ,\varepsilon }'\Vert _\infty \le \Vert \phi \cdot f_\rho '\Vert _\infty + \varepsilon \). Therefore there exists \(f_\varepsilon \in C^1([-1,1])\) such that \(\Vert f - f_\varepsilon \Vert _{L^\infty (-1,1)} < \varepsilon \) and \(\Vert \phi \cdot f_\varepsilon '\Vert _\infty \le \Vert \phi \cdot f'\Vert _\infty + \varepsilon \). Hence \(E(f;{\mathcal {H}}_N) \le \Vert f-f_\varepsilon \Vert _{L^\infty (-1,1)} + E(f_\varepsilon ;{\mathcal {H}}_N) \le \left( 1 + \frac{C_T}{n}\right) \varepsilon + \frac{C_T}{n}\Vert \phi \cdot f'\Vert _\infty \). Since \(\varepsilon \) is arbitrary, we have the desired inequality. A similar argument may be found in [11, p. 280].

From the above it would follow that there exists a constant \(C_T > 0\) such that

$$\begin{aligned} E(f;{\mathcal {H}}_N) \le C_T \omega _\phi \left( f;\frac{1}{N}\right) , \end{aligned}$$
(24)

from Eq. (23), [11, Ch. 6, Thm. 6.2], and [11, Ch. 7, Thm. 5.1(a)], with \(r =1\), \(\mu = 1\), \(X = L^\infty (-1,1)\), \(\Phi _n = {\mathcal {H}}_N\), and \(Y = W_\infty ^1(\phi ) := \{f \in W^{1,1}(-1,1): \phi \cdot f' \in L^\infty (-1,1)\}\). Equation (24) would imply that if \(\omega _{\phi }\left( f; \delta \right) = {\mathcal {O}}(\delta ^\alpha )\), then \(E(f;{\mathcal {H}}_N) = {\mathcal {O}}(N^{-\alpha })\), as required. \(\square \)

Conjecture 5.9

(Jackson inequality for polynomials on a circular arc) For any \(T > 1\), define the arc on the complex unit circle,

$$\begin{aligned} A = \left\{ e^{i\theta } : \theta \in \left( -\frac{\pi }{T},\frac{\pi }{T}\right) \right\} . \end{aligned}$$

There exists a constant \(C_T>0\) such that for all \(F \in C^1(A)\) and all \(n \in {\mathbb {N}}\), there exists a polynomial \(q_n\) of degree n such that

$$\begin{aligned} \sup _{z \in A} \left| F(z)-q_n(z) \right| \le \frac{C_T}{n} \sup _{z \in A} \left| \mu (z) F'(z) \right| , \end{aligned}$$

where \(\mu (z) = \sqrt{(z-e^{\frac{i\pi }{T}})(z-e^{-\frac{i\pi }{T}})}\).

Notice that the conjecture uses only nonnegative powers of z to approximate F, which means we do not need to utilize all of the functions in \({\mathcal {H}}_N\). That such an approximation is possible is plausible because, by Mergelyan's theorem [33, Thm. 20.5], polynomials are dense in the space C(A). It is also not surprising given the redundant nature of approximation by Fourier extensions.

6 A Localization Theorem for Fourier Extensions

The theorem proved in this section is a modification of a theorem of Freud ([15, Thm. IV.5.4]), which is a localization theorem for orthogonal polynomials on an interval. We, however, work with the orthonormal basis given in Lemma 4.2, and there are some clear differences between the two situations; we show that these differences do not change the statement of the result.

Theorem 6.1

(Localization theorem) Let \(f\in L^2(-1,1)\) be such that \(f(x) = 0\) for all \(x \in [a,b] \subseteq [-1,1]\). Then \(P_N(f) \rightarrow 0\) uniformly in all subintervals \([c,d] \subset (a,b)\).

Proof

First note that the pointwise error can be written in terms of the prolate kernel discussed in Sect. 4 as

$$\begin{aligned} P_N(f)(x) - f(x) = \int _{-1}^1 (f(y)-f(x))K_N(x,y)\mathrm {d}y. \end{aligned}$$

Let \(x \in [c,d] \subset (a,b)\), so that \(f(x) = 0\). By the formula for the prolate kernel (Lemma 4.3),

$$\begin{aligned}&P_N(f)(x) - f(x)\\&\quad = \int _{-1}^1 \frac{f(y)}{\sin \left( \frac{\pi }{2T}(x-y)\right) } \mathrm {Imag}\left( \overline{e^{-\frac{i\pi }{T}\frac{N}{2}y} \cdot \Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \cdot e^{-\frac{i\pi }{T}\frac{N}{2}x} \cdot \Pi _N\left( e^{\frac{i\pi }{T}x}\right) \right) \, \mathrm {d}y. \end{aligned}$$

By writing the imaginary part as \(\frac{1}{2i}\) times the difference between the expression and its complex conjugate, it is easy to see that for this expression to tend to zero as \(N \rightarrow \infty \), it is sufficient that for any f as in the statement of the theorem, we have

$$\begin{aligned} \lim _{N\rightarrow \infty } \int _{-1}^1 \frac{f(y)}{\sin \left( \frac{\pi }{2T}(x-y)\right) } e^{\frac{i\pi }{T}\frac{N}{2}y} \cdot \overline{\Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \, \mathrm {d}y = 0. \end{aligned}$$

To prove this we consider the functions

$$\begin{aligned} g_\xi (y) = \frac{f(y)e^{\frac{i\pi }{2T}y}}{\sin \left( \frac{\pi }{2T}(\xi -y)\right) } \end{aligned}$$

for \(\xi \in [c,d]\). It holds that \(g_{\xi } \in L^2(-1,1)\), because \(g_{\xi }\) is equal to 0 inside \([a,b]\) and equal to f (an \(L^2(-1,1)\) function) multiplied by a bounded function (\(y \mapsto e^{\frac{i\pi }{2T}y} / \sin \left( \frac{\pi }{2T}(\xi -y)\right) \)) outside of \([a,b]\).

Let \(\varepsilon > 0\). By Proposition 2.4, for any \(\xi \in [c,d]\), there exists \(K_\xi \in {\mathbb {N}}\) and a function \(h_{K_\xi } \in {\mathcal {H}}_{K_\xi }\) such that \(\left\| g_\xi - h_{K_\xi } \right\| _{L^2(-1,1)} < \varepsilon \). A key property of the function \(h_{K_\xi }\) is that for \(N \ge K_\xi \),

$$\begin{aligned} \int _{-1}^1 h_{K_\xi }(y) e^{\frac{i\pi }{T}\frac{N-1}{2}y} \cdot \overline{\Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \, \mathrm {d}y = 0, \end{aligned}$$
(25)

because \(h_{K_\xi }(y) e^{\frac{i\pi }{T}\frac{N-1}{2}y}\) is a polynomial of degree \(\frac{K_\xi -1}{2} + \frac{N-1}{2} \le N - 1\) in the variable \(z = \exp \left( \frac{i\pi }{T}y\right) \). Now, because the map \(x \mapsto g_x\) is a continuous mapping from \([c,d] \rightarrow L^2(-1,1)\), there exists an open interval \(I(\xi )\) containing \(\xi \) such that for all \(x \in I(\xi )\), \(\left\| g_x - h_{K_\xi } \right\| _{L^2(-1,1)} < \varepsilon \) is still valid. By the Heine–Borel theorem [33], the compact interval \([c,d]\) is covered by finitely many of these intervals \(I(\xi )\), which we denote by \(I(\xi _1), I(\xi _2),\dots ,I(\xi _s)\).

Let \(K_\varepsilon \) be an odd integer such that \(h_{K_{\xi _i}}\in {\mathcal {H}}_{K_\varepsilon }\) for \(i = 1,\ldots ,s\). For an arbitrary \(x\in [c,d]\) there is an interval \(I(\xi _r)\) such that \(x\in I(\xi _r)\), and for \(N>K_{\varepsilon }\) we have (using Eq. (25))

$$\begin{aligned}&\left| \int _{-1}^1 \frac{f(y)}{\sin \left( \frac{\pi }{2T}(x-y)\right) } e^{\frac{i\pi }{T}\frac{N}{2}y} \cdot \overline{\Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \, \mathrm {d}y\right| \\&\quad = \left| \int _{-1}^1 \left( g_{x}(y) - h_{K_{\xi _r}}(y) \right) e^{\frac{i\pi }{T}\frac{N-1}{2}y} \overline{\Pi _N\left( e^{\frac{i\pi }{T}y}\right) } \, \mathrm {d}y\right| \\&\quad \le \left( \int _{-1}^1 |g_{x}(y) - h_{K_{\xi _r}}(y)|^2 \,\mathrm {d}y\right) ^{\frac{1}{2}} \\&\qquad \cdot \left( \int _{-1}^1 \left| e^{\frac{i\pi }{T}\frac{N-1}{2}y} \overline{\Pi _N\left( e^{\frac{i\pi }{T}y}\right) }\right| ^2 \,\mathrm {d}y \right) ^{\frac{1}{2}} \\&\quad < \varepsilon . \end{aligned}$$

The last line uses the fact that the basis for \({\mathcal {H}}_N\) discussed in Lemma 4.2 is orthonormal, so that the second factor is equal to 1.

In conclusion, since \(\varepsilon \) is arbitrary and the inequality above is valid for all \(N > K_{\varepsilon }\), the integral must converge to zero as \(N \rightarrow \infty \), uniformly with respect to \(x \in [c,d]\), as required. \(\square \)

7 Numerical Experiments

In this section we provide numerically computed examples of pointwise and uniform convergence of Fourier extensions for functions with various regularity properties. It was discussed in the introduction that the linear system for computing the Fourier extension is extremely ill-conditioned, making computation of the exact solution practically impossible in standard precision. To deal with this issue, we used sufficiently high precision floating point arithmetic and we did not take N higher than 129, to ensure that the system could be inverted accurately. The right-hand-side vectors for the computations were also computed by quadrature in high precision floating point arithmetic.
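For concreteness, such a high-precision computation can be organized as in the following Python sketch, which uses the mpmath library for arbitrary-precision arithmetic. The precision, the quadrature call, and the function name are illustrative assumptions; this is a sketch of the procedure just described, not the code used for our experiments.

```python
from mpmath import mp, mpf, matrix, lu_solve, quad, exp, pi, sinc, sqrt

mp.dps = 300  # working precision in decimal digits; it must comfortably exceed
              # the digits lost to the exponential ill-conditioning of G

def exact_fourier_extension_coeffs(f, N, T):
    """Solve G c = b for the coefficients c_{-n}, ..., c_n of the Fourier
    extension of f, with G and b as defined in the introduction."""
    n = (N - 1) // 2
    G = matrix(N, N)
    for k in range(N):
        for j in range(N):
            G[k, j] = sinc((k - j) * pi / T)  # mpmath's sinc(x) = sin(x)/x
    b = matrix(N, 1)
    for row, k in enumerate(range(-n, n + 1)):
        b[row] = sqrt(mpf(T) / 2) * quad(
            lambda x: exp(-1j * pi * k * x / T) * f(x), [-1, 1]
        )
    return lu_solve(G, b)  # accurate only if mp.dps is large enough
```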

In practice, one would compute a fast regularized oversampled interpolation Fourier extension using the algorithm in [26], requiring only \(\mathcal {O}(N\log ^2(N))\) floating point operations. However, we are interested in the exact Fourier extension and want to avoid any artefacts that may be caused by the regularization or discretization of the domain.

In some cases, we compare the convergence rate of Fourier extensions to that of Legendre series, because we predict that the qualitative behavior of Legendre series will be similar (see Sect. 8). For the Legendre series approximations, we computed the coefficients one by one using adaptive quadrature in 64-bit floating point arithmetic. As such, the errors for the Legendre series approximations eventually stagnate due to rounding errors.

7.1 Analytic and Entire Functions

Theorem 3.1 gives an upper bound on the rate of exponential convergence of Fourier extension approximations to analytic functions. The regions of analyticity in the complex plane that dictate the rate are the mapped Bernstein ellipses \(\mathcal D(\rho )\), where \(\rho > 1\). The theorem is illustrated in Fig. 1, where we approximate an entire function and four analytic functions, each with a pole at a different location in the complex plane. All examples exhibit exponential convergence in the uniform norm at a rate that is predicted by Theorem 3.1. This is also the case for the entire function, for which the exponential convergence rate is limited by a T-dependent upper bound.

Fig. 1 We compute Fourier extension approximations to 5 functions: \(f(x) = e^x\) (yellow stars) and \(f(x) = \frac{1}{x-r}\) for \(r=0.3i,~0.6i,~1.5i,~2.0i\) (red circles, blue squares, green crosses and brown diamonds, respectively). The parameter T is 2.43. Left: the mapped Bernstein ellipses \(\mathcal {D}(\rho )\) in the complex plane, for \(\rho = 1.891, 3.454, 8.913\). The outermost outline (in blue) encloses the maximal mapped Bernstein ellipse; analyticity outside this largest region does not increase the exponential convergence rate. Right: the \(L^\infty (-1,1)\) error against values of N for each of the 5 functions. The black dashed lines indicate the convergence rates predicted by Theorem 3.1. (Color figures online)

7.2 Differentiable Functions

We investigate Fourier extension approximation of splines of degree \(d = 3, 9\), and 15 on the interval \(\left[ 0,\frac{1}{2}\right] \), which lie in the Hölder spaces \(C^{2,1}\left( \left[ 0,\frac{1}{2}\right] \right) \), \(C^{8,1}\left( \left[ 0,\frac{1}{2}\right] \right) \), and \(C^{14,1}\left( \left[ 0,\frac{1}{2}\right] \right) \), respectively. By Theorem 3.2, we expect the pointwise errors to be \(\mathcal {O}(N^{-d}\log N)\) in the interior and \(\mathcal {O}(N^{\frac{1}{2} - d})\) uniformly over the whole interval.
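The particular splines are not essential: any piecewise polynomial of degree d with \(C^{d-1}\) joins has the stated regularity. As a hypothetical stand-in (assuming numpy; this is not one of the splines used in the experiments), a single truncated power already lies in \(C^{d-1,1}\):

```python
import numpy as np

# A degree-d spline with one knot at r: the truncated power (x - r)_+^d.
# Its (d-1)st derivative is d! * (x - r)_+, which is Lipschitz, so the
# function lies in C^{d-1,1} on any interval containing r.
def truncated_power(x, r=0.3, d=3):
    return np.where(x > r, (x - r) ** d, 0.0)
```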

The spline functions and the pointwise approximation errors for Fourier extensions with various values of N are plotted in Fig. 2. The observed errors match the rates of convergence predicted by Theorem 3.2 reasonably well, sometimes decaying slightly faster. For comparison, we include the errors for a Legendre series approximation as a dashed line of the same color. See Sect. 8.1 for a full discussion comparing the convergence of Legendre series and Fourier extensions.

Fig. 2 Above: plots of splines of degree 3, 9, and 15 in \(C^{2,1}\), \(C^{8,1}\), and \(C^{14,1}\), respectively, with an interior point marked with a red circle and a boundary point marked with a blue square. Below: the pointwise error at the interior point (red circles) and the endpoint (blue squares) using Fourier extensions with \(T=2\) (full lines) and Legendre series (dashed lines), against the number of degrees of freedom, N. The black dotted lines without markers indicate the upper bounds on the algebraic rates of convergence predicted by Theorem 3.2. (Color figures online)

7.3 Nondifferentiable Functions

We investigate the approximation of functions with algebraic singularities, discontinuities, and Dini–Lipschitz continuity.

Functions with an algebraic singularity at the endpoint are studied in Fig. 3. We plot the pointwise errors for Fourier extension and Legendre series approximations to \(f(x) = x^\alpha \) for \(\alpha = \frac{3}{4}, \frac{1}{2}\), and \(\frac{1}{10}\). These functions lie in the Hölder spaces \(C^\alpha \left( \left[ 0,\frac{1}{2}\right] \right) \) for their respective values of \(\alpha \).

While Theorem 3.2 guarantees uniform convergence over \([-1,1]\) only for the first function (since \(\alpha \le \frac{1}{2}\) for the other two), in our experiments the errors for the other two functions converge to zero as well. We believe that this discrepancy is related to the weighted moduli of continuity of these functions being more favorable than the standard moduli (see Sect. 5). Overall, the observed convergence rates are sometimes better than the predicted rates, but when Fourier extensions are compared with Legendre series, we see similar rates of pointwise convergence, especially at the singular point. See Sect. 8.1 for a full discussion comparing the convergence of Legendre series and Fourier extensions.

Fig. 3 Above left: \(f(x)=x^{3/4}\). Above middle: \(f(x)=x^{1/2}\). Above right: \(f(x)=x^{1/10}\). Below: the pointwise error at an interior point (red circles), the singular endpoint (green crosses), and the nonsingular endpoint (blue squares) using Fourier extensions with \(T=2\) (full lines) and Legendre series (dashed lines), against the number of degrees of freedom, N. The black dotted lines without markers indicate the upper bounds on the algebraic rates of convergence predicted by Theorem 3.2. (Color figures online)

Three functions with a singularity in the interior are shown in Fig. 4. The first has an algebraic singularity: \(f(x) = |x-r|^{1/4}\) where \(r = 0.29384\) (chosen to avoid any symmetry with respect to the domain). We observe agreement with the expected convergence rate of \(\mathcal O(N^{-1/4}\log N)\) for the error at interior points. The second function has a jump:

$$\begin{aligned} f(x) = {\left\{ \begin{array}{ll} x &{} \text { if } x \in \left[ 0,\frac{1}{4}\right] , \\ 1 &{} \text{ if } x \in \left( \frac{1}{4},\frac{1}{2}\right] . \end{array}\right. } \end{aligned}$$
(26)

Even though the function is highly irregular because of the jump, this does not prevent convergence at points away from the jump, corroborating the local nature of Theorem 3.5. The last function is uniformly Dini–Lipschitz continuous in \(\left[ 0,\frac{1}{2}\right] \):

$$\begin{aligned} f(x) = {\left\{ \begin{array}{ll} (\log \left( |x-r|\right) )^{-2} &{} \text { if } x \in \left[ 0,\frac{1}{2}\right] \setminus \{r\}, \\ 0 &{} \text { if } x = r, \end{array}\right. } \end{aligned}$$
(27)

where \(r = 0.29384\) (chosen to avoid any symmetry with respect to the domain). In Fig. 4, the convergence rate of \(\mathcal {O}((\log (N))^{-1})\) expected from Lemma 3.7 is indeed observed.
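For reference, the two test functions in Eqs. (26) and (27) can be evaluated as in the following Python sketch (assuming numpy, vectorized over arrays; the function names are illustrative):

```python
import numpy as np

r = 0.29384

def f_jump(x):
    # Eq. (26): f(x) = x on [0, 1/4], and 1 on (1/4, 1/2].
    return np.where(x <= 0.25, x, 1.0)

def f_dini(x):
    # Eq. (27): f(x) = (log|x - r|)^{-2} away from r, and 0 at x = r.
    # At x == r, log(0) = -inf and (-inf)^(-2) = 0, matching the definition.
    with np.errstate(divide="ignore"):
        return np.where(x == r, 0.0, np.log(np.abs(x - r)) ** -2.0)
```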

Fig. 4 Above left: \(f(x)=|x-r|^{1/4}\) with \(r = 0.29384\). Above middle: the function with a jump given in Eq. (26). Above right: the Dini–Lipschitz continuous function given in Eq. (27); it has a strong cusp at \(x = 0.29384\). Below: the pointwise error at an interior point (red circles), the singular interior point (green crosses), and an endpoint (blue squares) using Fourier extensions with \(T=2\) (full lines) and Legendre series (dashed lines), against the number of degrees of freedom, N. The black dotted lines without markers in the bottom left plot indicate the upper bounds on the algebraic rates of convergence predicted by Theorem 3.2. The black dotted line without markers in the bottom right plot indicates the rate of convergence predicted by Lemma 3.7. (Color figures online)

In all three cases, we compared the convergence of Fourier extension approximations and Legendre series. While the observed convergence rates sometimes beat the pessimistic predictions of Theorem 3.2 and Lemma 3.7 (see Sect. 5), the rates for Fourier extensions and Legendre series agree with each other. See Sect. 8.1 for a full discussion comparing the convergence of Legendre series and Fourier extensions.

8 Discussion

We proved pointwise and uniform convergence results for Fourier extension approximations of functions in Hölder spaces and with local uniform Dini–Lipschitz conditions. This was achieved by proving upper bounds on the associated Lebesgue function and the decay rate of best uniform approximation error for Fourier extensions, then appealing to Lebesgue’s lemma.

8.1 Comparison to Legendre Series

For a function \(f \in L^2(-1,1)\), let us compare the Fourier extension approximant, \(f_N\), to the Legendre series approximant,

$$\begin{aligned} f^{\mathrm {L}}_N(x) = \sum _{k=0}^{N-1} a_k p^{L}_k(x), \quad a_k = \frac{1}{2}\int _{-1}^1 f(x) p^L_k(x) \, \mathrm {d}x, \end{aligned}$$

where \(p^L_k\) is the kth Legendre polynomial normalized so that \(\frac{1}{2}\int _{-1}^1 p^{L}_k(x)^2 \,\mathrm {d}x = 1\).
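Concretely, the coefficients \(a_k\) can be computed one by one with adaptive quadrature, as in Sect. 7. The following is a minimal Python sketch (assuming numpy and scipy; in 64-bit arithmetic the resulting series stagnates near machine precision):

```python
import numpy as np
from numpy.polynomial.legendre import Legendre
from scipy.integrate import quad

def legendre_coeffs(f, N):
    """Coefficients a_0, ..., a_{N-1} of the Legendre series defined above."""
    a = np.zeros(N)
    for k in range(N):
        # p^L_k = sqrt(2k + 1) * P_k satisfies (1/2) * int_{-1}^{1} (p^L_k)^2 dx = 1.
        pk = Legendre.basis(k) * np.sqrt(2 * k + 1)
        a[k], _ = quad(lambda x: 0.5 * f(x) * pk(x), -1.0, 1.0, limit=200)
    return a

a = legendre_coeffs(np.abs, 40)  # e.g. f(x) = |x|
```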

The Lebesgue function of this approximation scheme is \({\mathcal {O}}(\log N)\) for \(x \in [a,b] \subset (-1,1)\) and \({\mathcal {O}}(N^{\frac{1}{2}})\) uniformly for \(x \in [-1,1]\) [19, Ch. 1], [16], which is precisely the same as the Lebesgue function for Fourier extensions (see Theorem 4.1). Best uniform approximation by Fourier extensions was compared to best uniform approximation by algebraic polynomials in Sect. 5. For any \(f \in C^{k}([-1,1])\) for \(k \in {\mathbb {Z}}_{\ge 0}\), we have

$$\begin{aligned} E(f;{\mathcal {H}}_N) = {\mathcal {O}}(N^{-k})\omega \left( f^{(k)};\frac{1}{N}\right) \text { and } E(f;{\mathcal {P}}_N) = {\mathcal {O}}(N^{-k})\omega \left( f^{(k)};\frac{1}{N}\right) . \end{aligned}$$

It follows that for \(C^{k,\alpha }([-1,1])\) functions, the statement of Theorem 3.2 also applies to Legendre series approximations. The localized convergence result for Dini–Lipschitz functions, Theorem 3.5, also applies to Legendre series [15, Thm. IV.5.6]. Some of the experiments in Sect. 7 demonstrate these similarities.

Theorem 2.1 on exponential convergence differs from the exponential convergence results for Legendre series in two ways. First, the region in the complex plane that determines the rate of exponential convergence is a Bernstein ellipse for Legendre series, but a mapped Bernstein ellipse for Fourier extensions. Second, for Fourier extensions there is an upper limit of \(\cot ^2\left( \frac{\pi }{4T}\right) \) on the rate of exponential convergence regardless of the region of analyticity, whereas for Legendre series the rate can be arbitrarily fast, and for entire functions the rate of convergence is superexponential [39].

8.2 Extensions of This Work

It was mentioned in the introduction that our convergence results will be more applicable if we can extend them to regularized and oversampled interpolation versions of Fourier extensions, because those are the kinds of Fourier extensions for which stable and efficient algorithms have been developed.

Regularized Fourier extensions for a given regularization parameter \(\varepsilon > 0\) can be defined as follows. Suppose the matrix \(G \in {\mathbb {R}}^{N\times N}\),

$$\begin{aligned} G_{k,j} = \mathrm {sinc}\left( (k-j)\frac{\pi }{T}\right) , \end{aligned}$$

has eigendecomposition \(G = VSV^*\). Let \(S_\varepsilon \) be S but with all entries less than \(\varepsilon \) set to 0. The coefficients \({\mathbf {c}}^\varepsilon \in {\mathbb {C}}^N\) of the regularized Fourier extension of \(f \in L^2(-1,1)\) are given by

$$\begin{aligned} {\mathbf {c}}^\varepsilon = V S_\varepsilon ^\dagger V^* {\mathbf {b}}, \end{aligned}$$

where \(b_k = \left( \frac{T}{2}\right) ^{\frac{1}{2}}\int _{-1}^1 e^{-\frac{i\pi }{T}kx}f(x)\,\mathrm {d}x\) [26]. In other words, the solution is projected onto the eigenvectors of G whose eigenvalues are greater than or equal to \(\varepsilon \). These eigenvectors are the discrete prolate spheroidal sequences (DPSSs), which are the Fourier coefficients of the DPSWFs \(\{\xi _{k,N}\}_{k=1}^N\) discussed in Sect. 4 [34]. The regularized Fourier extension, therefore, finds the best approximation not in \({\mathcal {H}}_N\), but in the linear space \({\mathcal {H}}_{N,\varepsilon } \subset {\mathcal {H}}_N \subset L^2(-1,1)\), where

$$\begin{aligned} {\mathcal {H}}_{N,\varepsilon } = \mathrm{span}\left\{ \xi _{k,N} : \lambda _{k,N} \ge \varepsilon \right\} . \end{aligned}$$

Therefore, if the Lebesgue function \(\Lambda (x;P_{N,\varepsilon })\) (where \(P_{N,\varepsilon }\) is the projection operator from \(L^2(-1,1)\) to \({\mathcal {H}}_{N,\varepsilon }\)) and best approximation error functional \(E(f;{\mathcal {H}}_{N,\varepsilon })\) can be estimated as in Sects. 4 and 5, then we immediately obtain pointwise convergence results for regularized Fourier extensions by Lebesgue’s lemma. Extensions to the regularized oversampled interpolation version of Fourier extensions can be conducted by considering the analogous quantities for the periodic discrete prolate spheroidal sequences (PDPSSs) [26, 41].
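In standard double precision, this regularized extension is straightforward to form explicitly. The following Python sketch (assuming numpy; the vector \({\mathbf {b}}\) is taken as given, computed by quadrature as before, and the function name is illustrative) mirrors the definition above:

```python
import numpy as np

def regularized_fe_coeffs(b, T, eps):
    """Coefficients c^eps = V S_eps^dagger V^* b of the regularized Fourier extension."""
    N = len(b)
    k = np.arange(N)
    # Prolate matrix G_{k,j} = sinc((k - j) pi / T); note np.sinc(x) = sin(pi x)/(pi x).
    G = np.sinc((k[:, None] - k[None, :]) / T)
    S, V = np.linalg.eigh(G)            # eigendecomposition G = V S V^*
    S_dag = np.zeros_like(S)
    keep = S >= eps                     # drop eigenvalues below the threshold
    S_dag[keep] = 1.0 / S[keep]         # truncated pseudo-inverse of S
    return V @ (S_dag * (V.conj().T @ b))
```

The columns of V retained by the threshold are the DPSSs mentioned above, so this projection onto \({\mathcal {H}}_{N,\varepsilon }\) is implemented directly by the eigendecomposition.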

Generalization of this work to the multivariate case would be extremely interesting, because the shape of the domain \(\Omega \subset {\mathbb {R}}^d\) and regularity of its boundary will likely come into play [27].