
Greedy approximation by arbitrary sets

© 2020 Russian Academy of Sciences (DoM) and London Mathematical Society
Citation: P. A. Borodin, Izv. Math. 84 (2020), 246. DOI: 10.1070/IM8891


Abstract

We define various algorithms for greedy approximations by elements of an arbitrary set $M$ in a Banach space. We study the convergence of these algorithms in a Hilbert space under various geometric conditions on $M$. As a consequence, we obtain sufficient conditions for the additive semigroup generated by $M$ to be dense.


This paper was written with the financial support of a grant of the Government of the Russian Federation (project 14.W03.31.0031). Theorem 5 was proved within the research program of RFBR (grant no. 18-01-00333a).

§ 1. Introduction

Let $H$ denote a real Hilbert space with scalar product ${(\,\cdot\,,\cdot\,)}$ and norm ${\|\,\cdot\,\|}$. Let $S(H)$ be the unit sphere of $H$. For every subset $D\subset S(H)$, referred to as a dictionary, there is a greedy approximation algorithm. With every element $x=x_0\in H$ it associates a sequence

$x_{n+1}=x_n-(x_n,g_{n+1})g_{n+1}, \qquad n=0,1,2,\dots,$   (1)

where $g_{n+1}\in D$ is such that $|(x_n,g_{n+1})|=\max\{|(x_n,g)|\colon g\in D\}$

(the existence of $\max\{|(x,g)|\colon g\in D\}$ for every $x\in H$ is an additional condition on $D$; in the case when the maximum is attained at several elements of $D$, any one of them can be selected as $g_{n+1}$).
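To fix ideas, here is a minimal numerical sketch of the pure greedy algorithm (1) for a finite dictionary of unit vectors in $\mathbb R^d$; the function name pure_greedy, the use of numpy and the random dictionary are our own illustration and are not taken from the paper.

```python
import numpy as np

def pure_greedy(x, D, steps=100):
    """Pure greedy algorithm (1): at each step choose g in D maximizing
    |(x_n, g)| and subtract the projection (x_n, g) g."""
    x_n = np.asarray(x, dtype=float).copy()
    norms = [np.linalg.norm(x_n)]
    for _ in range(steps):
        inner = D @ x_n                      # scalar products (x_n, g) for all g in D
        g = D[np.argmax(np.abs(inner))]      # a maximizer of |(x_n, g)|
        x_n = x_n - np.dot(x_n, g) * g       # greedy step (1)
        norms.append(np.linalg.norm(x_n))
    return x_n, norms

# A complete dictionary of 10 random unit vectors in R^3: the residual norms tend to 0.
rng = np.random.default_rng(0)
D = rng.normal(size=(10, 3))
D /= np.linalg.norm(D, axis=1, keepdims=True)
_, norms = pure_greedy(rng.normal(size=3), D, steps=50)
print(norms[0], norms[-1])
```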

More precisely, this algorithm is called the pure greedy algorithm, in contrast to other approximation algorithms whose names contain the word 'greedy'; see [1].

It is known that the pure greedy algorithm converges for every complete dictionary $D$ (that is, when the linear combinations of elements of $D$ are dense in $H$). This means that for every initial element $x=x_0$ one has $x_n\to 0$ as $n\to\infty$ and $x$ is the sum of the series $\sum_{n=0}^\infty (x_n,g_{n+1})g_{n+1}$ (see [2] or [1], Ch. 2).

Note that the term $(x_n,g_{n+1})g_{n+1}$ in (1) is the nearest point to $x_n$ in the set $\Lambda(D)=\{\lambda g\colon \lambda\in\mathbb{R},\ g\in D\}$.

This observation leads to the definition of a greedy algorithm of approximation by an arbitrary subset of $H$ as introduced below.

Given any subset $M\subset H$ and any element $x\in H$, we write $\rho(x,M)=\inf\{\|x- y\|\colon y\in M\}$ for the distance from $x$ to $M$ and define the metric projection $P_M(x)=\{y\in M\colon \|x-y\|=\rho(x,M)\}$.

Let $M$ be proximal, that is, let $P_M(x)\ne \varnothing$ for every $x\in H$ (in particular, $M$ is closed).

We define the greedy approximation algorithm with respect to such a proximal set $M$ similarly to (1), that is, for any element $x=x_0\in H$ we consider the sequence

$x_{n+1}=x_n-y_{n+1}, \quad y_{n+1}\in P_M(x_n), \qquad n=0,1,2,\dots,$   (2)

(when $P_M(x_n)$ is not a singleton, we can choose any one of its elements as $y_{n+1}$). If $M=\Lambda(D)$ for some dictionary $D$, then the sequence (2) clearly coincides with (1). It is also clear that algorithm (2) works in an arbitrary Banach space $X$ and coincides with the well-known $X$-greedy algorithm ([1], Ch. 6) in the case when $M=\Lambda(D)$, where $D$ is a dictionary in $X$.
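For an arbitrary proximal set the only ingredient of algorithm (2) is the metric projection. Here is a minimal sketch, assuming $M$ is given as a finite array of points in $\mathbb R^d$ (so that $P_M$ reduces to a nearest-neighbour search; the function name greedy_by_set is ours):

```python
import numpy as np

def greedy_by_set(x, M, steps=50):
    """Greedy algorithm (2): y_{n+1} is a nearest point of M to x_n,
    and x_{n+1} = x_n - y_{n+1}."""
    x_n = np.asarray(x, dtype=float).copy()
    summands = []                            # the elements y_1, y_2, ... taken from M
    for _ in range(steps):
        dists = np.linalg.norm(M - x_n, axis=1)
        y = M[np.argmin(dists)]              # an element of the metric projection P_M(x_n)
        summands.append(y)
        x_n = x_n - y
    return x_n, summands
```

A finite set is of course not norm-reducing in general, so the residuals need not tend to zero; the sketch only illustrates the iteration itself.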

We say that algorithm (2) converges if $x_n\to 0$ as $n\to\infty$. In this case the initial element $x$ can be expressed as $\sum_{n=1}^\infty y_n$ in terms of elements from $M$.

A systematic study of such nonlinear approximations by sums of elements of a given set (without coefficients) has recently been started within the theory of density of semigroups in a Banach space [3] (see also §6 below). Particular cases of such approximations were studied much earlier. For example, approximations by simple partial fractions (logarithmic derivatives of polynomials) first appeared in a paper by Korevaar [4], and have a natural electrostatic interpretation (see [5]–[8] and the bibliography therein). Algorithm (2) and other algorithms of greedy approximation by an arbitrary set $M$ considered in this paper provide a natural way of constructing such approximations by sums of elements of $M$. In addition, these algorithms extend the geometric part of the classical greedy approximation theory, and it appears to the author that they may even open a new chapter of general geometric approximation theory.

We first investigate conditions on $M$ which are necessary or sufficient for the convergence of algorithm (2), that is, for

$x_n\to 0 \quad (n\to\infty)$   (3)

for every $x\in H$.

The condition $0\in M$ is clearly necessary for (3): otherwise some ball $B_r(0)$ will be disjoint from $M$ (recall that $M$ is closed), and even if $x_n$ lies in the twice smaller ball $B_{r/2}(0)$, the next element $x_{n+1}=x_n-y_{n+1}$ will not belong to $B_{r/2}(0)$ since the norm of $y_{n+1}\in P_M(x_n)$ is not smaller than $r$.

Furthermore, the convergence of the greedy algorithm forces $M$ to have the property

$\rho(x,M)<\|x\| \quad\text{for every } x\ne 0.$   (4)

Indeed, if $\rho(x,M)=\|x\|$ for some non-zero $x$ ("$>$" is impossible since $0\in M$), then $0\in P_M(x)$. Selecting the zero element as $y_{n+1}$ at each step of the greedy algorithm for $x_0=x$, we obtain $x_n=x$ ($n=1,2,\dots$), and there is no convergence.

A set $M$ satisfying (4) is said to be norm-reducing. Condition (4), which clearly makes sense in any Banach space, is equivalent to the presence of points of $M$ in every open ball $B(x,\|x\|)$, whose sphere passes through $0$. In this geometric sense, the property of being norm-reducing in a Hilbert space strengthens the property of being all-round. (A subset of a Banach space $X$ is said to be all-round [3] if it has a non-empty intersection with the open half-space $\{x\colon f(x)>0\}$ for any non-zero functional $f$ in the dual space $X^*$.) A norm-reducing subset of a non-reflexive Banach space does not have to be all-round; see §6 below.

When $M=\Lambda(D)$ for a dictionary $D$ in a Hilbert space, the property of being norm-reducing is equivalent to the property of being all-round, which is, in its turn, equivalent to $D$ being complete.

The purpose of this paper is to show that the norm-reducing condition (4) is generally insufficient for the convergence (3) of greedy approximations (Remark 1), but provides this convergence in a Hilbert space under certain additional assumptions on $M$ (Theorem 1). Moreover, we propose a modified semi-greedy algorithm of approximation by elements of an arbitrary set $M$, which converges for every initial element in the case when $M$ is a norm-reducing symmetric subset of a Hilbert space (Theorem 3). We also propose recursive greedy and semi-greedy algorithms, for which these convergence results hold without the symmetry condition on $M$ (Theorem 4). As a consequence, we prove the density of the additive semigroup generated by a norm-reducing set $M$ in a Hilbert space (Theorem 5). This result does not hold in an arbitrary non-reflexive space (Remark 3).

§ 2. Example of divergence for a norm-reducing set

We construct this example in the Hilbert space $l_2$ of sequences $x=(t_1,t_2,\dots)$ with the norm $\|x\|=\bigl(\sum_{k=1}^\infty t_k^2\bigr)^{1/2}$.

Let $e_n=(0,\dots ,0,1,0,\dots)$ be the basis vector of this space with 1 at the $n$th position, and let

We take a convex sequence of numbers $\lambda_n$ such that

and put $z_n=\lambda_ne_n-\lambda_{n+1}e_{n+1}$.

Remark 1.  The set

is symmetric and proximal in $l_2$. It satisfies condition (4), but the algorithm (2) of greedy approximation by elements of $M$ diverges for the initial element $x=e_1$.

Indeed, for every non-zero $x$ we have $\rho(x,\varepsilon_nS_n)<\|x\|$ for all sufficiently large $n$, whence $\rho(x,M)<\|x\|$ and $M$ is norm-reducing. Moreover,

and it follows from the compactness of the sets $\varepsilon_nS_n$ that $P_M(x)\ne \varnothing$. Since $P_M(0)=\{0\}$, we can see that $M$ is proximal.

For $x_0=e_1$, we have

whence $P_M(x_0)=\{z_1\}$, and the greedy algorithm (2) gives $x_1=x_0-z_1=\lambda_2e_2$ at the first step.

We now prove by induction that the algorithm produces $x_{k}=x_{k-1}-z_k=\lambda_{k+1}e_{k+1}$ at the $k$th step. Indeed, for $x_{k-1}=\lambda_ke_k$ we have

Therefore $P_M(x_{k-1})=\{z_k\}$, and $x_{k}=x_{k-1}-z_k=\lambda_{k+1}e_{k+1}$.

Since the $\lambda_k$ do not tend to zero, it turns out that the greedy algorithm diverges.
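The induction step can also be checked numerically. The sketch below takes, purely for illustration, the convex sequence $\lambda_n=\frac12(1+1/n)$ (so that $\lambda_1=1$ and $\lambda_n\to 1/2$), truncates $l_2$ to finitely many coordinates, and verifies that among the elements $\pm z_j$ the nearest one to $x_{k-1}=\lambda_k e_k$ is $z_k$, so that $\|x_k\|=\lambda_{k+1}$ stays bounded away from zero; the remaining compact pieces of the set $M$ are not modelled here.

```python
import numpy as np

K = 12                                                         # truncation dimension
lam = np.array([0.5 * (1 + 1 / n) for n in range(1, K + 2)])   # assumed convex sequence, limit 1/2

def e(n):
    """Basis vector e_n of the truncated space."""
    v = np.zeros(K + 1)
    v[n - 1] = 1.0
    return v

# z_n = lambda_n e_n - lambda_{n+1} e_{n+1}
z = [lam[n - 1] * e(n) - lam[n] * e(n + 1) for n in range(1, K)]

x = e(1)                                                       # x_0 = e_1
for k in range(1, K - 1):
    candidates = z + [-v for v in z]                           # the elements ±z_j
    nearest = min(candidates, key=lambda c: np.linalg.norm(x - c))
    assert np.allclose(nearest, z[k - 1])                      # the greedy step picks z_k
    x = x - nearest
    print(k, np.linalg.norm(x))                                # equals lambda_{k+1}; does not tend to 0
```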

§ 3. Sufficient conditions for convergence

The condition that $M$ should be a proximal set, which is needed for the greedy algorithm (2), is a fairly strong one. Therefore, it is reasonable to relax the requirements for the sequence of approximants $y_n$ so that the algorithm would also work for non-proximal (in particular, non-closed) sets.

Given a norm-reducing set $M$ in a Banach space $X$ such that $0\in M$, we define the weak greedy algorithm by associating the following system with every non-zero element $x=x_0\in X$:

$x_{n+1}=x_n-y_{n+1}, \quad\text{where } y_{n+1}\in M \text{ and } \|x_n-y_{n+1}\|\le \frac{\rho(x_n,M)+\|x_n\|}{2}, \qquad n=0,1,2,\dots$   (5)

(we can choose any element $y_{n+1}$ satisfying these conditions).

In the case of a Hilbert space and $M=\Lambda(D)$, where $D$ is a dictionary, this algorithm roughly corresponds to the weak greedy algorithm in [1], Ch. 2 (for the correspondence to be exact one should take the quadratic mean instead of the arithmetic mean in (5); however, the quadratic mean would look unnatural for an arbitrary Banach space). The arithmetic mean in (5) can of course be replaced by an arbitrary convex combination of $\rho(x_n,M)$ and $\|x_n\|$ with fixed coefficients playing the role of weakness parameters.
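Here is a minimal sketch of the weak greedy algorithm (5) for a finite set $M\subset\mathbb R^d$ containing $0$ (so that $\rho(x_n,M)\le\|x_n\|$ and the admissible set at every step is non-empty); the randomized choice among admissible elements is our own device, not part of the definition.

```python
import numpy as np

def weak_greedy(x, M, steps=200, seed=0):
    """Weak greedy algorithm (5): choose any y in M with
    ||x_n - y|| <= (rho(x_n, M) + ||x_n||) / 2 and set x_{n+1} = x_n - y."""
    rng = np.random.default_rng(seed)
    x_n = np.asarray(x, dtype=float).copy()
    for _ in range(steps):
        dists = np.linalg.norm(M - x_n, axis=1)
        threshold = 0.5 * (dists.min() + np.linalg.norm(x_n))
        admissible = np.flatnonzero(dists <= threshold)  # non-empty: a nearest point qualifies
        y = M[rng.choice(admissible)]
        x_n = x_n - y
    return x_n
```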

Theorem 1.  Let $M\ni 0$ be a symmetric all-round set in a Hilbert space $H$ such that $M/2\subset M$ (that is, for every $y\in M$ the element $y/2$ is also in $M$). Then the weak greedy algorithm (5) converges for each $x\in H$.

Here we write 'all-round' instead of 'norm-reducing' because these two properties coincide for subsets of a Hilbert space satisfying $M/2\subset M$. Instead of $M/2\subset M$, one can write $M/q\subset M$ for any fixed $q>1$. Remark 1 shows that the hypothesis $M/2\subset M$ in Theorem 1 cannot be omitted. The condition $0\in M$ is not burdensome: it only guarantees that the algorithm does not stop.

Many algebraic (but not geometric) aspects of the following proof go back to [2].

Proof.  1) We take an arbitrary initial element $x=x_0$. For every $n=0,1,2,\dots$, we have

so that the sequence $\|x_n\|$ decreases (non-strictly if $x_n=0$). The sequence $\varepsilon_n=\|x_n\|-\|x_{n+1}\|$

enjoys the following properties:

Equation (6)

Hence the sequence

is such that the series $\sum_{n=1}^\infty \alpha_n$ also converges, and we have $\alpha_n n \to 0$ along some subsequence $\Lambda\subset {\mathbb N}$ of the values of $n$.

2) Since $y_{n+1}/2\in M$, we have

whence, in view of (6), we obtain

Equation (7)

3) Consider two numbers $n,m\in \Lambda$ such that $n<m$. We have

Equation (8)

Let us rewrite the last term as

Equation (9)

where $\sigma_j=y_j/\|y_j\|$. We distinguish three types among all the indices $j\in \{n+1,\dots, m\}$.

I. $|(\sigma_j,x_m)|\le \|y_j\|$. For any such $j$, we have

Equation (10)

II. $|(\sigma_j,x_m)|> \|y_j\|$ and the line with direction $\sigma_j$ intersects the open ball $B_m=B(x_m,\rho(x_m,M))$; see Fig. 1.

Figure 1.

Figure 2.

Since $\pm y_j\in M$, both these points are outside the ball $B_m$. Then $y_j$ lies in the closed interval $[-a,a]$ in view of the inequality $\|y_j\|<|(\sigma_j,x_m)|$, and

Equation (11)

(the first equality holds by the high-school theorem on the square of the tangent).

III. $|(\sigma_j,x_m)|> \|y_j\|$ and the line with direction $\sigma_j$ does not intersect the open ball $B_m$; see Fig. 2.

In this case,

Equation (12)

4) In accordance with this partition of the values of $j$ into classes I, II, and III, the sum in (8) splits into three sums, for which the estimates (10)–(12) prepared above give

It is easy to see that the first two terms in the last expression tend to zero as $n,m\to\infty$, $n,m\in \Lambda$, because of (7) and the choice of $\Lambda$. The third term also tends to zero:

as $n,m\to\infty$, $n,m\in \Lambda$.

5) Thus the quantity (9) tends to zero as $n,m\in \Lambda$ tend to $\infty$. Hence $\{x_n\}_{n\in \Lambda}$ is a Cauchy sequence in view of (8). Let $z\in H$ be its limit. Clearly,

Equation (13)

If $z\ne 0$, then $\rho(z,M)<(1-\delta)\|z\|$ for some $\delta>0$ because $M$ is norm-reducing (we have already mentioned that this follows since $M$ is all-round and $M/2\subset M$). Therefore,

for all sufficiently large $n\in \Lambda$. Hence $\|x_{n+1}\|<(1-\delta/2)\|z\|$ for any sufficiently large $n\in \Lambda$. This contradicts (13).

As a result, we have $z=0$ and $x_n\to 0$ ($n\in \Lambda$), hence $x_n\to 0$ ($n\to \infty$) in view of the monotonicity of $\|x_n\|$ (see part 1 of the proof). $\Box$

Corollary 1.  Let $M$ be a symmetric all-round set in a Hilbert space $H$. Then any element $x\in H$ can be represented as a series

$x=\sum_{k=1}^\infty \frac{z_k}{2^{n_k}},$   (14)

where $z_k\in M$, $n_k\in\{0,1,2,\dots\}$.

Proof.  The set $M'=\bigl\{2^{-n}y\colon y\in M,\ n\in\{0,1,2,\dots\}\bigr\}$

is also symmetric and all-round, and $M'/2\subset M'$. Thus $M'$ satisfies the hypotheses of Theorem 1 and, by this theorem, every element $x\in H$ can be represented as a series $x=\sum_{k=1}^\infty y_k$ with $y_k\in M'$, that is, as a series (14). $\Box$

Theorem 2.  Let $M\ni 0$ be a norm-reducing set in a Hilbert space $H$. For each $x\in H$, the residuals $x_n$ in the weak greedy algorithm (5) converge weakly to zero.

Proof.  We have already seen in part 1 of the proof of Theorem 1 that $\|x_n\|$ is a decreasing sequence. Hence $\lim_{n\to\infty}\|x_n\|=R\le \|x\|$. We argue by contradiction. Suppose that $R>0$. The bounded sequence $\{x_n\}\subset H$ contains a weakly converging subsequence. Suppose that the elements $x_n$ converge weakly to $z\ne 0$ along some sequence $ \Lambda $ of the values of $n$. Clearly, $\|z\|\le R$. By hypothesis, there is an element $y\in M$ such that $\|z-y\|<\|z\|-\varepsilon$ for some $\varepsilon>0$.

All of this takes place in a separable closed subspace of $H$, which can be identified with the space $l_2$. Consider the coordinate projections

available in $l_2$.

There is an $N$ such that

Equation (15)

Equation (16)

Equation (17)

Since weak convergence in $l_2$ implies coordinate convergence, we can find a number $n\in \Lambda$ such that, for this $N$,

Equation (18)

Equation (19)

Applying (16) and (18), we get

Combining this with (17), (18), (15) and (19), we have

Consequently, $\rho^2(x_n,M)\le\|x_n-y\|^2<R^2-\varepsilon^2/20$, hence

This contradicts the inequality $\|x_k\|\ge R$.

Thus, every weak partial limit of $x_n$ is $0$, so this sequence converges weakly to zero. $\Box$

Remark 2.  For any norm-reducing set $M$ in a finite-dimensional normed space $X$, the weak greedy algorithm (5) converges for every $x\in X$.

Indeed, $\|x_n\|$ is a decreasing sequence bounded by $\|x\|$. Therefore $x_n\to z$ along some subsequence of the values of $n$. Similarly to part 5 of the proof of Theorem 1, we get $z=0$, hence $x_n\to 0$ ($n\to \infty$).

§ 4.  Semi-greedy approximations

If one is not too "greedy", it is possible to get convergence without the condition $M/2\subset M$ needed in Theorem 1.

Given a norm-reducing set $M\ni 0$ in an arbitrary Banach space $X$, we define the weak semi-greedy algorithm by associating the following sequence with every $x=x_0\in X$:

$x_{n+1}=x_n-y_{n+1}, \quad\text{where } y_{n+1}\in M \text{ and } \Bigl\|\frac{x_n}{2}-y_{n+1}\Bigr\|\le \frac{\rho(x_n/2,M)+\|x_n\|/2}{2}, \qquad n=0,1,2,\dots$   (20)

(we select any element $y_{n+1}$ that meets these conditions).

If, in addition, $M$ is a proximal set, one can define a pure semi-greedy algorithm with $y_{n+1}\in P_M(x_n/2)$ instead of (20). In the case of $M=\Lambda(D)$, where $D$ is a dictionary in a Hilbert space, this pure semi-greedy algorithm coincides with a particular type of the weak greedy algorithm in [1], Ch. 2, with weakness coefficient $t=1/2$. However, this coincidence is only formal: the semi-greedy algorithm differs fundamentally from all known greedy algorithms in that we subtract from $x_n$ not a best (or, in some sense, almost best) approximation of $x_n$, but a best or almost best approximation of $x_n/2$. For arbitrary sets $M$, the semi-greedy algorithm gives a worse approximation at every step than the algorithms (2) and (5), but it turns out to be more reliable in the sense of convergence.
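Here is a sketch of the weak semi-greedy step (20) under the same finite-dimensional simplification as before ($M$ a finite subset of $\mathbb R^d$ containing $0$); the only change compared with (5) is that the target of the approximation is $x_n/2$ rather than $x_n$.

```python
import numpy as np

def weak_semi_greedy(x, M, steps=200, seed=0):
    """Weak semi-greedy algorithm (20): choose any y in M with
    ||x_n/2 - y|| <= (rho(x_n/2, M) + ||x_n/2||) / 2 and set x_{n+1} = x_n - y."""
    rng = np.random.default_rng(seed)
    x_n = np.asarray(x, dtype=float).copy()
    for _ in range(steps):
        target = 0.5 * x_n                   # approximate the half-residual x_n / 2
        dists = np.linalg.norm(M - target, axis=1)
        threshold = 0.5 * (dists.min() + np.linalg.norm(target))
        y = M[rng.choice(np.flatnonzero(dists <= threshold))]
        x_n = x_n - y
    return x_n
```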

Theorem 3.  Let $M$ be a symmetric norm-reducing set in a Hilbert space $H$ with $0\in M$. Then the weak semi-greedy algorithm (20) converges for each $x\in H$.

Proof.  We follow the same scheme as in the proof of Theorem 1, but with some significant modifications.

1) Take any non-zero element $x=x_0$. If $x_n\ne 0$, then $\rho(x_n/2,M)<\|x_n\|/2$ and $y_{n+1}$ lies inside the ball $B(x_n/2, \|x_n\|/2)$, which has the closed interval $[0,x_n]$ as a diameter. Consequently, $y_{n+1}\ne x_n$, so that $x_{n+1}\ne 0$ (for all $n$), and

Hence the values $\|x_n\|$ decrease.

The sequence $\varepsilon_n=\|x_n\|-\|x_{n+1}\|$ satisfies

Hence the sequence

is such that

It follows that $\alpha_n n\to 0$ along some subsequence $\Lambda\subset\mathbb N$ of the values of $n$.

2) Since $y_{n+1}$ lies inside the ball with diameter $[0,x_n]$, the angle $\angle\, 0y_{n+1}x_n$ is greater than $90^\circ$. Therefore,

so that

Equation (21)

3) We take two numbers $n,m\in \Lambda$ with $n<m$. As in the proof of Theorem 1, we estimate the scalar product

Equation (22)

where $\sigma_j=y_j/\|y_j\|$.

We distinguish three types among all the indices $j\in \{n+1,\dots, m\}$.

I. $|(\sigma_j,x_m/2)|\le \|y_j\|$. For any such $j$, we have

Equation (23)

II. $|(\sigma_j,x_m/2)|> \|y_j\|$ and the line with direction $\sigma_j$ intersects the ball $B_m=B(x_m/2,\rho(x_m/2,M))$.

Just as in part 3 of the proof of Theorem 1 (one should replace $x_m$ by $x_m/2$ in Fig. 1 and in (11)), in this case we obtain

Equation (24)

III. $|(\sigma_j,x_m/2)|> \|y_j\|$ and the line with direction $\sigma_j$ does not intersect the ball $B_m$. Just as in part 3 of the proof of Theorem 1 (one should replace $x_m$ by $x_m/2$ in Fig. 2 and in (12)), in this case we obtain

Equation (25)

4) As in part 4 of the proof of Theorem 1, we use (23)–(25) to bound the absolute value of (22) by the sum

Equation (26)

In view of (21) and the choice of $\Lambda$, this sum tends to zero as $n,m\to\infty$, $n,m\in \Lambda$.

5) By (8), $\{x_n\}_{n\in \Lambda}$ is a Cauchy sequence. Repeating the arguments in part 5 of the proof of Theorem 1, we find that $x_n\to 0$ as $n\to \infty$. $\Box$

Corollary 2.  Let $M$ be a norm-reducing symmetric set in a Hilbert space $H$. Then every element $x\in H$ has a representation as a series

$x=\sum_{n=1}^\infty y_n,$   (27)

where $y_n\in M$ with $\sum_{n=1}^\infty \|y_n\|^2<\infty$.

It would be interesting to specify a non-trivial class of sets in this corollary for which the series (27) is absolutely convergent.

Theorem 2 remains valid for the modified algorithm (20) with $\theta x_n$ taken instead of $x_n/2$, where $\theta$ is any fixed value in $(0,1/2]$.

§ 5. Recursive algorithms

The symmetry condition on $M$ in Theorems 1 and 3 seems to be significant, although the author is not aware of any corresponding examples. In order to obtain convergence without symmetry, we need to change the algorithm in a natural way.

In the case of approximation by an asymmetric norm-reducing set $M$ it is convenient to use the recursive greedy algorithm described below. It is a modification of the algorithm invented by Livshitz [9] in the case when $M=\Lambda(D)$, where $D$ is a dictionary, to speed up the convergence rate compared to algorithm (1). In [10] Livshitz also applied this algorithm to an approximation by the set $\Lambda_+(D)=\{\lambda g\colon \lambda\ge 0,\, g\in D\}$, where $D$ is, generally speaking, an asymmetric dictionary in a Hilbert space.

Let $M$ be a norm-reducing set in an arbitrary Banach space $X$, and let $0\in M$. We define the recursive weak greedy algorithm (RWGA) and recursive weak semi-greedy algorithm (RWSGA) by associating with every $x=x_0\in X$ a sequence

according to the following rules.

For each $k$, the element $y_k$ either belongs to $M$ or it belongs to $-M$, and in the latter case there is a $j<k$ such that $y_j=-y_k\in M$ and $y_m\ne y_k$ for all $m=j+1,\dots,k-1$. Then $x_n$ (to be defined by induction) can be written as $x_n=x-z_1^{(n)}-\dots-z_{\nu_n}^{(n)},$

where $z_j^{(n)}\in M$. We put $M_n=M\cup \{-z_1^{(n)},\dots,-z_{\nu_n}^{(n)}\}$ and finally define the algorithm step $x_{n+1}=x_n-y_{n+1},$

where $y_{n+1}\in M_n$ and $\|x_n-y_{n+1}\|\le \frac{\rho(x_n,M_n)+\|x_n\|}{2}$ in the case of the RWGA, or $\bigl\|x_n/2-y_{n+1}\bigr\|\le \frac{\rho(x_n/2,M_n)+\|x_n\|/2}{2}$ in the case of the RWSGA

(such an element $y_{n+1}$ exists since $M_n$ is norm-reducing and $0\in M$).

When $M$ is proximal, all the $M_n$ are also proximal. Then one can define the recursive pure greedy and recursive pure semi-greedy algorithms by choosing $y_{n+1}\in P_{M_n}(x_n)$ and $y_{n+1}\in P_{M_n}(x_n/2)$, respectively.

If the RWGA or RWSGA converges, that is, $x_n\to 0$, then $x$ can be represented as a series $\sum_{n=1}^\infty y_n$ of elements in $M\cup (-M)$ all of whose partial sums are sums of elements in $M$.
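Here is a sketch of the recursive weak semi-greedy algorithm for a finite $M\subset\mathbb R^d$ containing $0$. The set $M_n$ is maintained explicitly as $M$ together with the negatives of the summands currently present in $x-x_n$; the weakness threshold is taken by analogy with (5) and (20). By construction, every partial sum returned is a sum of elements of $M$, that is, an element of the semigroup $R(M)$ considered in §6.

```python
import numpy as np

def recursive_weak_semi_greedy(x, M, steps=200, seed=0):
    """Recursive weak semi-greedy algorithm (RWSGA): at step n the admissible set is
    M_n = M together with {-z : z a summand of x - x_n}, and y_{n+1} is any element
    of M_n with ||x_n/2 - y|| <= (rho(x_n/2, M_n) + ||x_n/2||) / 2."""
    rng = np.random.default_rng(seed)
    M = [np.asarray(m, dtype=float) for m in M]
    x_n = np.asarray(x, dtype=float).copy()
    used = []                                          # the summands z_j^(n) of x - x_n
    partial_sums = []
    for _ in range(steps):
        candidates = [(m, None) for m in M] + [(-z, i) for i, z in enumerate(used)]
        target = 0.5 * x_n
        dists = np.array([np.linalg.norm(target - c) for c, _ in candidates])
        threshold = 0.5 * (dists.min() + np.linalg.norm(target))
        pick = int(rng.choice(np.flatnonzero(dists <= threshold)))
        y, cancel = candidates[pick]
        if cancel is None:
            used.append(y)                             # y in M joins the sum
        else:
            used.pop(cancel)                           # y = -z_j removes z_j from the sum
        x_n = x_n - y
        partial_sums.append(sum(used, np.zeros_like(x_n)))
    return x_n, partial_sums
```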

Theorem 4.  Let $M$ be a norm-reducing set in a Hilbert space $H$, and let $0\in M$. Then the recursive weak semi-greedy algorithm converges for every $x\in H$. If, moreover, $M/2\subset M$, then the recursive weak greedy algorithm also converges for each $x\in H$.

Proof.  This proof also follows the scheme of the proof of Theorem 1.

1) Clearly, the norms $\|x_n\|$ are decreasing, and the values $\varepsilon_n=\|x_{n-1}\|-\|x_n\|$ satisfy $\sum_{n=1}^\infty \varepsilon_n <\infty$.

Similarly to part 1 of the proof of Theorem 1 and of Theorem 3, we prove that $\sum_{n=1}^\infty \alpha_n < \infty$, where

We choose a subsequence $\Lambda\subset {\mathbb N}$ of $n$'s such that $\alpha_n n\to 0$.

2) Similarly to part 2 of the proof of Theorem 1 and of Theorem 3, we get

Equation (28)

3) Consider two numbers $n,m\in \Lambda$ such that $n<m$. We now give only an upper bound for $-(x_n-x_m,x_m)$ from (8), rather than a bound for the modulus: if this upper bound tends to zero as $n,m\to \infty$, $n,m\in \Lambda$, it will be sufficient for $\{x_n\}_{n\in \Lambda}$ to be a Cauchy sequence.

We have

where $\sigma_j=y_j/\|y_j\|$ as above, while the sum $\sum\nolimits'$ is taken over those $j\in\{n+1,\dots,m\}$ for which we have either

(a) $y_j\in M$ and $y_k\ne -y_j$ for all $k=j+1,\dots,m$ (so that $-y_j\in M_m$),

or

(b) $y_j\in (-M)$ and $y_k\ne -y_j$ for all $k=n+1,\dots,j-1$.

The terms corresponding to the remaining indices $j$ split into pairs of opposite values which cancel one another.

Next, we have

Equation (29)

where the sum $\sum\nolimits''$ contains only terms for which $(\sigma_j,x_m)<0$.

We split these terms into three types.

I. $-\|y_j\|\le(\sigma_j,x_m)\le 0$ in the case of RWGA or $-\|y_j\|\le(\sigma_j,x_m/2)\le 0$ in that of RWSGA. For such terms, we have

Equation (30)

II. $(\sigma_j,x_m)< -\|y_j\|$ in the case of RWGA or $(\sigma_j,x_m/2)< -\|y_j\|$ in that of RWSGA, and the line with direction $\sigma_j$ intersects the open ball

Since $(\sigma_j,x_m)<0$, the point $y_j$ lies in the lower part of the line in Fig. 1 (in the case of RWSGA, $x_m$ is replaced by $x_m/2$ in this figure).

In case (a) we have $-y_j\in M_m$, therefore $-y_j\notin B_m$, which together with $\|-y_j\|<-(\sigma_j,x_m)=|(\sigma_j,x_m)|$ implies $-y_j\in [0,a]$, and so $\|y_j\|\le \|a\|$.

In case (b) we have $-y_j\in M\subset M_m$, therefore $-y_j\notin B_m$, and we similarly get $-y_j\in [0,a]$ and $\|y_j\|\le \|a\|$.

Consequently, in both cases,

Equation (31)

For the details of the last step, see part 3 of the proof of Theorem 1.

III. $(\sigma_j,x_m)< -\|y_j\|$ in the case of RWGA or $(\sigma_j,x_m/2)< -\|y_j\|$ in that of RWSGA, and the line with direction $\sigma_j$ does not intersect $ B_m$. Similarly to part 3 of the proof of Theorem 1, in accordance with Fig. 2 (in the case of RWSGA, $x_m$ is replaced by $x_m/2$ in this figure) we obtain

Equation (32)

(in the case of RWGA, the estimate is even better: $2$ appears under the square root sign).

4) Similarly to part 4 of the proof of Theorem 1, we use (30)–(32) to bound the sum in (29) by the expression (26). In view of (28) and the choice of $\Lambda$, this expression tends to zero as $n,m\to\infty$, $n,m \in \Lambda$.

5) By (8), $\{x_n\}_{n\in\Lambda}$ turns out to be a Cauchy sequence. Repeating the arguments in part 5 of the proof of Theorem 1 and using the inequalities $\rho(z,M_n)\le \rho(z,M)< \|z\|$, which hold for all non-zero $z\in H$, we obtain $x_n\to 0$ as $n\to \infty$. $\Box$

§ 6. Applications to the theory of density of semigroups

The main problem of this theory [3] is to find conditions on a subset $M$ of a Banach space $X$ which are necessary or sufficient for the set

(the additive semigroup generated by $M$) to be dense in $X$ (that is, every element of $X$ can be approximated with arbitrary accuracy by finite sums of elements of $M$). A necessary condition for $R(M)$ to be dense in $X$ is that $M$ should be all-round; see [3]. There are examples of all-round sets $M$ in a Hilbert space $H$ such that the closure $\overline{R(M)}$ is not even an additive subgroup of $H$; see Example 1 in [3].

Condition (4) that $M$ should be norm-reducing is clearly not necessary for $R(M)$ to be dense in $H$, but it turns out to be sufficient for this to be so.

Theorem 5.  For any norm-reducing set $M$ in a Hilbert space $H$, the set $R(M)$ is dense in $H$.

Proof.  We complement $M$ by the zero element, take an arbitrary $x\in H$ and run the recursive weak semi-greedy algorithm for $x$. It converges by Theorem 4. Hence $x$ can be represented as a series $\sum_{n=1}^\infty y_n$ of elements of $M\cup(-M)$ all of whose partial sums $s_k=\sum_{n=1}^ky_n$ are sums of elements of $M$ by the definition of the recursive algorithm. Thus $R(M)\ni s_k\to x$ and, therefore, $\overline{R(M)}=H$. $\Box$

It should be possible to generalise Theorem 5 to some class of Banach spaces. Nevertheless, the theorem does not hold in an arbitrary Banach space.

Remark 3.  Each non-reflexive space $X$ contains a norm-reducing symmetric set $M$ such that $\overline{R(M)}\ne X$.

Indeed, by James' theorem ([11], Ch. 1), there is a functional $f\in X^*$ not attaining its norm: $|f(x)|\ne \|f\|\cdot \|x\|$ for any non-zero $x\in X$. The kernel $\ker f$ has the property that the metric projection $P_{\ker f}(x)$ is empty for every $x\notin\ker f$. Indeed, if $y\in P_{\ker f}(x)$, then for $q=x-y\ne 0$ we have $P_{\ker f}(q)\ni 0$ and $\|q\|=\rho(q,\ker f)=\frac{|f(q)|}{\|f\|}$

(for the last equality, see, for example, [12], Ch. 1), contradicting the fact that $f$ does not attain its norm. Thus for any $x\notin \ker f$ there is no nearest element in $\ker f$; in particular, $0$ is not a nearest element to $x$, hence there is a $y\in \ker f$ such that $\|x-y\|<\|x-0\|=\|x\|$. Therefore $\ker f$ is the desired set: it is symmetric and norm-reducing, while $R(\ker f)=\ker f$ is a proper closed subspace of $X$, so that $\overline{R(\ker f)}\ne X$.
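For a concrete illustration (a standard example, not taken from this paper): in the non-reflexive space $c_0$ the functional $f(x)=\sum_{n=1}^\infty 2^{-n}t_n$, $x=(t_1,t_2,\dots)$, has $\|f\|=1$, yet $|f(x)|<\|x\|$ for every non-zero $x\in c_0$ because $|t_n|<\|x\|$ for all sufficiently large $n$; hence $M=\ker f$ is a symmetric norm-reducing set with $\overline{R(M)}=\ker f\ne c_0$.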

§ 7. Conclusion

We outline several questions that naturally arise in connection with the results presented here.

1) Is it possible to omit the symmetry condition on $M$ in Theorems 1 and 3? In particular, does the greedy algorithm (2) or the weak greedy algorithm (5) converge in the case when $M=\Lambda_+(D)=\{\lambda g\colon\lambda\ge 0, g\in D\}$, where $D$ is (generally speaking) an asymmetric all-round dictionary in a Hilbert space? Reference [10] suggests that its author was also interested in the last question.

2) Is it possible to extend Theorems 1–5 to some class of Banach spaces (say, uniformly convex and uniformly smooth spaces)? It is especially interesting whether or not Theorems 2 and 5 hold for an arbitrary reflexive space. For non-reflexive spaces, they do not hold in view of the example in Remark 3.

3) Is it possible to estimate the rate of convergence of greedy algorithms under the hypotheses of Theorems 1, 3 and 4 for some classes of sets $M$ and initial elements $x$? For classical greedy algorithms, there are many estimates of this kind [1]. For example, if $M=\Lambda(D)$, where $D$ is a complete dictionary in a Hilbert space, then for each initial element $x$ which is a finite linear combination of elements of $D$, the norms $\|x_n\|$ of the residuals in the pure greedy algorithm (1) are bounded above by $O(n^{-\gamma})$ with $\gamma=0.182\dots$ (DeVore, Temlyakov, Konyagin and Silnichenko; see [1], Ch. 2). It seems quite reasonable to compare the greedy approximations $\|x_n\|$ in the algorithms of approximation by elements of $M$ considered above with the corresponding $n$-term approximations $\sigma_n(x)=\sigma_n(x,M)=\inf\bigl\{\bigl\|x-\sum_{k=1}^{n}y_k\bigr\|\colon y_1,\dots,y_n\in M\bigr\}$.

Clearly, $\|x_n\|\ge \sigma_n(x)$, as for the classical approximation with respect to a dictionary.
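For a small finite set this comparison can be carried out by brute force. The following sketch (our own illustration) computes the residual norms $\|x_n\|$ of algorithm (2) and the quantities $\sigma_n(x)$ for a toy set $M\subset\mathbb R^2$ containing $0$, so that the inequality $\|x_n\|\ge\sigma_n(x)$ can be observed row by row.

```python
import numpy as np
from itertools import product

def greedy_residuals(x, M, steps):
    """Residual norms ||x_n|| of the greedy algorithm (2) over a finite set M."""
    x_n = np.asarray(x, dtype=float).copy()
    norms = []
    for _ in range(steps):
        y = M[np.argmin(np.linalg.norm(M - x_n, axis=1))]
        x_n = x_n - y
        norms.append(np.linalg.norm(x_n))
    return norms

def best_n_term(x, M, n):
    """sigma_n(x): best approximation of x by sums of n elements of M (brute force)."""
    return min(np.linalg.norm(x - sum(c)) for c in product(M, repeat=n))

M = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.4, -0.7]])
x = np.array([0.3, 1.7])
for n, r in enumerate(greedy_residuals(x, M, steps=4), start=1):
    print(n, r, best_n_term(x, M, n))        # ||x_n|| >= sigma_n(x) in every row
```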

The author is deeply grateful to V. N. Temlyakov for valuable comments and advice.
