Abstract
We define various algorithms for greedy approximations by elements of an arbitrary set $M$ in a Banach space. We study the convergence of these algorithms in a Hilbert space under various geometric conditions on $M$. As a consequence, we obtain sufficient conditions for the additive semigroup generated by $M$ to be dense.
This paper was written with the financial support of a grant of the Government of the Russian Federation (project 14.W03.31.0031). Theorem 5 was proved within the research program of RFBR (grant no. 18-01-00333a).
§ 1. Introduction
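Let $H$ denote a real Hilbert space with scalar product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$. Let $S(H)$ be the unit sphere of $H$. For every subset $D \subset S(H)$, referred to as a dictionary, there is a greedy approximation algorithm. With every element $x \in H$ it associates a sequence
$$x_0 = x, \qquad x_{n+1} = x_n - \langle x_n, g_n\rangle g_n \quad (n = 0, 1, 2, \dots), \tag{1}$$
where $g_n \in D$ is such that
$$|\langle x_n, g_n\rangle| = \max_{g \in D} |\langle x_n, g\rangle|$$
(the existence of $g_n$ for every $x_n$ is an additional condition on $D$; in the case when the maximum is attained at several elements of $D$, any one of them can be selected as $g_n$).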
More precisely, this algorithm is called the pure greedy algorithm, in contrast to other approximation algorithms whose names contain the word 'greedy'; see [1].
It is known that the pure greedy algorithm converges for every complete dictionary $D$ (that is, when the linear combinations of elements of $D$ are dense in $H$). This means that for every initial element $x \in H$ one has $x_n \to 0$ as $n \to \infty$ and $x$ is the sum of the series $\sum_{n=0}^{\infty} \langle x_n, g_n\rangle g_n$ (see [2] or [1], Ch. 2).
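The iteration (1) can be sketched numerically for a finite dictionary in the plane. This is only an illustration under our own assumptions: the helper names `dot`, `norm`, `pure_greedy` and the two-element dictionary `D` are hypothetical and do not come from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def pure_greedy(x, dictionary, steps):
    """Residuals x_0, ..., x_steps of x_{n+1} = x_n - <x_n, g_n> g_n.

    `dictionary` is a finite list of unit vectors; g_n maximises |<x_n, g>|.
    """
    residuals = [list(x)]
    for _ in range(steps):
        xn = residuals[-1]
        g = max(dictionary, key=lambda d: abs(dot(xn, d)))
        c = dot(xn, g)
        residuals.append([a - c * b for a, b in zip(xn, g)])
    return residuals

s = 1 / math.sqrt(2)
D = [[1.0, 0.0], [s, s]]   # hypothetical dictionary: complete in R^2, not orthogonal
res = pure_greedy([1.0, -2.0], D, 60)
norms = [norm(r) for r in res]
```

With this non-orthogonal dictionary the algorithm does not terminate after two steps; the residual alternates between the two directions and its norm shrinks geometrically.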
Note that the term $\langle x_n, g_n\rangle g_n$ in (1) is the nearest point to $x_n$ in the set
$$\{\lambda g : g \in D,\ \lambda \in \mathbb{R}\}.$$
This observation leads to the definition of a greedy algorithm of approximation by an arbitrary subset of $H$ as introduced below.
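Given any subset $M \subset H$ and any element $x \in H$, we write $\rho(x, M) = \inf\{\|x - y\| : y \in M\}$ for the distance from $x$ to $M$ and define the metric projection $P_M(x) = \{y \in M : \|x - y\| = \rho(x, M)\}$.
Let $M$ be proximal, that is, let $P_M(x) \ne \emptyset$ for every $x \in H$ (in particular, $M$ is closed).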
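We define the greedy approximation algorithm with respect to such a proximal set $M$ similarly to (1), that is, for any element $x \in H$ we consider the sequence
$$x_0 = x, \qquad x_{n+1} = x_n - y_n, \qquad y_n \in P_M(x_n) \quad (n = 0, 1, 2, \dots) \tag{2}$$
(when $P_M(x_n)$ is not a singleton, we can choose any one of its elements as $y_n$). If $M = \{\lambda g : g \in D,\ \lambda \in \mathbb{R}\}$ for some dictionary $D$, then the sequence (2) clearly coincides with (1). It is also clear that algorithm (2) works in an arbitrary Banach space $X$ and coincides with the well-known $X$-greedy algorithm ([1], Ch. 6) in the case when $M = \{\lambda g : g \in D,\ \lambda \in \mathbb{R}\}$, where $D$ is a dictionary in $X$.
We say that algorithm (2) converges if $x_n \to 0$ as $n \to \infty$. In this case the initial element $x$ can be expressed as $x = \sum_{n=0}^{\infty} y_n$ in terms of elements $y_n$ from $M$.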
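As a rough illustration of algorithm (2), the sketch below runs the iteration "subtract a nearest point of the set" for a finite set in the plane. Everything specific here is our assumption rather than the paper's: the helper names, and the test set `M`, which consists of $0$ and the points $\pm 2^{-k} e_i$ for $k = 0, \dots, 60$ (a truncated stand-in for a set that accumulates at the origin).

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def nearest_point(x, M):
    """One element of the metric projection of x onto the finite list M."""
    return min(M, key=lambda y: norm(sub(x, y)))

def greedy_by_set(x, M, steps):
    """Residuals x_0, ..., x_steps of x_{n+1} = x_n - y_n, y_n nearest to x_n."""
    residuals = [list(x)]
    for _ in range(steps):
        xn = residuals[-1]
        residuals.append(sub(xn, nearest_point(xn, M)))
    return residuals

# Hypothetical test set: 0 and +-2^{-k} e_i in R^2 for k = 0..60.
M = [[0.0, 0.0]]
for k in range(61):
    t = 2.0 ** (-k)
    M += [[t, 0.0], [-t, 0.0], [0.0, t], [0.0, -t]]

res = greedy_by_set([0.7, 0.3], M, 40)
norms = [norm(r) for r in res]
```

Since the zero vector belongs to `M`, the distance to `M` never exceeds the norm of the current residual, so the norms in `norms` cannot increase.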
A systematic study of such nonlinear approximations by sums of elements of a given set (without coefficients) has recently been started within the theory of density of semigroups in a Banach space [3] (see also §6 below). Particular cases of such approximations were studied much earlier. For example, approximations by simple partial fractions (logarithmic derivatives of polynomials) first appeared in a paper by Korevaar [4], and have a natural electrostatic interpretation (see [5]–[8] and the bibliography therein). Algorithm (2) and other algorithms of greedy approximation by an arbitrary set considered in this paper provide a natural way of constructing such approximations by sums of elements of $M$. In addition, these algorithms extend the geometric part of the classical greedy approximation theory, and it appears to the author that they may even open a new chapter of general geometric approximation theory.
We first investigate conditions on $M$ which are necessary or sufficient for the convergence of algorithm (2), that is, for
$$x_n \to 0 \quad (n \to \infty) \tag{3}$$
for every $x \in H$.
The condition $0 \in M$ is clearly necessary for (3): otherwise some ball $B(0, r)$ will be disjoint from $M$ (recall that $M$ is closed), and even if $x_n$ lies in the twice smaller ball $B(0, r/2)$, the next element $x_{n+1} = x_n - y_n$ will not belong to $B(0, r/2)$ since the norm of $y_n$ is not smaller than $r$.
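Furthermore, the convergence of the greedy algorithm forces $M$ to have the property
$$\rho(x, M) < \|x\| \quad \text{for every non-zero } x \in H. \tag{4}$$
Indeed, if $\rho(x, M) = \|x\|$ for some non-zero $x$ ("$>$" is impossible since $0 \in M$), then $0 \in P_M(x)$. Selecting the zero element as $y_n$ at each step of the greedy algorithm for $x_0 = x$, we obtain $x_n = x$ ($n = 1, 2, \dots$), and there is no convergence.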
A set $M$ satisfying (4) is said to be norm-reducing. Condition (4), which clearly makes sense in any Banach space, is equivalent to the presence of points of $M$ in every open ball $B(x, \|x\|)$, $x \ne 0$, whose sphere passes through $0$. In this geometric sense, the property of being norm-reducing in a Hilbert space strengthens the property of being all-round. (A subset $M$ of a Banach space $X$ is said to be all-round [3] if it has a non-empty intersection with the open half-space $\{x \in X : f(x) > 0\}$ for any non-zero functional $f$ in the dual space $X^*$.) A norm-reducing subset of a non-reflexive Banach space does not have to be all-round; see §6 below.
When $M = \{\lambda g : g \in D,\ \lambda \in \mathbb{R}\}$ for a dictionary $D$ in a Hilbert space, the property of $M$ being norm-reducing is equivalent to the property of being all-round, which is, in its turn, equivalent to $D$ being complete.
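The first half of this equivalence can be checked by a short computation. The following is our own worked derivation, assuming the set in question is $M = \{\lambda g : g \in D,\ \lambda \in \mathbb{R}\}$ with $D$ a set of unit vectors, and writing $\rho(x, M)$ for the distance from $x$ to $M$.

```latex
% For a unit vector g and any x in H, minimise over the real coefficient \lambda:
\[
  \min_{\lambda \in \mathbb{R}} \|x - \lambda g\|^2
  = \min_{\lambda \in \mathbb{R}}
    \bigl( \|x\|^2 - 2\lambda \langle x, g \rangle + \lambda^2 \bigr)
  = \|x\|^2 - \langle x, g \rangle^2 ,
\]
% the minimum being attained at \lambda = \langle x, g \rangle.
% Taking the infimum over g in D gives
\[
  \rho(x, M)^2 = \|x\|^2 - \sup_{g \in D} \langle x, g \rangle^2 .
\]
% Hence \rho(x, M) < \|x\| holds for every non-zero x exactly when no non-zero x
% is orthogonal to all of D, that is, when the linear span of D is dense in H.
```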
The purpose of this paper is to show that the norm-reducing condition (4) is generally insufficient for the convergence (3) of greedy approximations (Remark 1), but provides this convergence in a Hilbert space under certain additional assumptions on $M$ (Theorem 1). Moreover, we propose a modified semi-greedy algorithm of approximation by elements of an arbitrary set $M$, which converges for every initial element in the case when $M$ is a norm-reducing symmetric subset of a Hilbert space (Theorem 3). We also propose recursive greedy and semi-greedy algorithms, for which these convergence results hold without the symmetry condition on $M$ (Theorem 4). As a consequence, we prove the density of the additive semigroup generated by a norm-reducing set in a Hilbert space (Theorem 5). This result does not hold in an arbitrary non-reflexive space (Remark 3).
§ 2. Example of divergence for a norm-reducing set
We construct this example in the Hilbert space $\ell_2$ of sequences $x = (x_1, x_2, \dots)$ with the norm
$$\|x\| = \Bigl(\sum_{k=1}^{\infty} x_k^2\Bigr)^{1/2}.$$
Let be the basis vector of this space with 1 at the th position, and let
We take a convex sequence of numbers such that
and put .
Remark 1. The set $M$ is symmetric and proximal in $\ell_2$. It satisfies condition (4), but the algorithm (2) of greedy approximation by elements of $M$ diverges for the initial element chosen below.
Indeed, for every non-zero we have for all sufficiently large , whence and is norm-reducing. Moreover,
and it follows from the compactness of the sets that . Since , we can see that is proximal.
For , we have
whence , and the greedy algorithm (2) gives at the first step.
We now prove by induction that the algorithm produces at the th step. Indeed, for we have
Therefore , and .
Since the do not tend to zero, it turns out that the greedy algorithm diverges.
§ 3. Sufficient conditions for convergence
The condition that $M$ should be a proximal set, which is needed for the greedy algorithm (2), is a fairly strong one. Therefore, it is reasonable to relax the requirements for the sequence of approximants so that the algorithm would also work for non-proximal (in particular, non-closed) sets.
Given a norm-reducing set in a Banach space such that , we define the weak greedy algorithm by associating the following system with every non-zero element :
(we can choose any element satisfying these conditions).
In the case of a Hilbert space and , where is a dictionary, this algorithm roughly corresponds to the weak greedy algorithm in [1], Ch. 2 (for the correspondence to be exact one should take the quadratic mean instead of the arithmetic mean in (5); however, the quadratic mean would look unnatural for an arbitrary Banach space). The arithmetic mean in (5) can of course be replaced by an arbitrary convex combination of and with fixed coefficients playing the role of weakness parameters.
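A minimal numerical sketch of a weak greedy step follows. It rests on an assumption about the form of rule (5): going by the remark on the arithmetic mean, we take the admissible approximants at $x_n$ to be those $y_n$ in the set with $\|x_n - y_n\| \le \tfrac12(\|x_n\| + \rho(x_n, M))$. The helper names and the finite test set `M` ($0$ and $\pm 2^{-k} e_i$, $k = 0, \dots, 60$, in the plane) are hypothetical, and picking the first admissible element in list order is just one arbitrary way of making the choice.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def weak_greedy_choice(x, M):
    """First y in M with |x - y| <= (|x| + dist(x, M)) / 2 (assumed rule)."""
    dists = [norm(sub(x, y)) for y in M]
    threshold = (norm(x) + min(dists)) / 2
    for y, d in zip(M, dists):
        if d <= threshold:
            return y  # always reached: a nearest point satisfies the bound

def weak_greedy(x, M, steps):
    """Residuals x_0, ..., x_steps of x_{n+1} = x_n - y_n with a weak choice of y_n."""
    residuals = [list(x)]
    for _ in range(steps):
        xn = residuals[-1]
        residuals.append(sub(xn, weak_greedy_choice(xn, M)))
    return residuals

# Hypothetical test set: 0 and +-2^{-k} e_i in R^2 for k = 0..60.
M = [[0.0, 0.0]]
for k in range(61):
    t = 2.0 ** (-k)
    M += [[t, 0.0], [-t, 0.0], [0.0, t], [0.0, -t]]

first_choice = weak_greedy_choice([0.7, 0.3], M)
res = weak_greedy([0.7, 0.3], M, 100)
norms = [norm(r) for r in res]
```

For the starting point `[0.7, 0.3]` the nearest element of `M` is `[0.5, 0.0]`, but the weak rule already accepts `[1.0, 0.0]`, which comes earlier in the list; this shows how the weak step may differ from the pure one while still reducing the norm.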
Theorem 1. Let be a symmetric all-round set in a Hilbert space such that (that is, for every the element is also in ). Then the weak greedy algorithm (5) converges for each .
Here we write 'all-round' instead of 'norm-reducing' because it is the same under the condition . Instead of , one can write for any fixed . Remark 1 shows that the hypothesis in Theorem 1 cannot be omitted. The condition is not burdensome. It guarantees that the algorithm does not stop.
Many algebraic (but not geometric) aspects of the following proof go back to [2].
Proof. 1) We take an arbitrary initial element . For every , we have
so that the sequence decreases (non-strictly if ). The sequence
enjoys the following properties:
Hence the sequence
is such that the series also converges, and we have along some subsequence of the values of .
2) Since , we have
whence, in view of (6), we obtain
3) Consider two numbers such that . We have
Let us rewrite the last term as
where . We distinguish three types among all the indices .
I. . For any such , we have
II. and the line with direction intersects the open ball ; see Fig. 1.
Since , both these points are outside the ball . Then lies in the closed interval in view of the inequality , and
(the first equality holds by the tangent-secant theorem: the square of the tangent equals the product of the secant and its external part).
III. and the line with direction does not intersect the open ball ; see Fig. 2.
In this case,
4) In accordance with this partition of the values of into classes I, II, and III, the sum in (8) splits into three sums, for which the estimates (10)–(12) prepared above give
It is easy to see that the first two terms in the last expression tend to zero as , , because of (7) and the choice of . The third term also tends to zero:
as , .
5) Thus the quantity (9) tends to zero as tend to . Hence is a Cauchy sequence in view of (8). Let be its limit. Clearly,
If , then for some because is norm-reducing (we have already mentioned that this follows since is all-round and ). Therefore,
for all sufficiently large . Hence for any sufficiently large . This contradicts (13).
As a result, we have and (), hence () in view of the monotonicity of (see part 1 of the proof).
Corollary 1. Let be a symmetric all-round set in a Hilbert space . Then any element can be represented as a series
where , .
is also symmetric and all-round, and . Thus satisfies the hypotheses of Theorem 1 and, by this theorem, every element can be represented as a series with , that is, as a series (14).
Theorem 2. Let be a norm-reducing set in a Hilbert space . For each , the residuals in the weak greedy algorithm (5) converge weakly to zero.
Proof. We have already seen in part 1 of the proof of Theorem 1 that is a decreasing sequence. Hence . We argue by contradiction. Suppose that . The bounded sequence contains a weakly converging subsequence. Suppose that the elements converge weakly to along some sequence of the values of . Clearly, . By hypothesis, there is an element such that for some .
All of this takes place in a separable part of , which can be identified with the space . Consider the coordinate projections
available in .
There is an such that
Since weak convergence in implies coordinate convergence, we can find a number such that, for this ,
Applying (16) and (18), we get
Combining this with (17), (18), (15) and (19), we have
Consequently, , hence
This contradicts the inequality .
Thus, every weak partial limit of is , so this sequence converges weakly to zero.
Remark 2. For any norm-reducing set in a finite-dimensional normed space , the weak greedy algorithm (5) converges for every .
Indeed, is a decreasing sequence bounded by . Therefore along some subsequence of the values of . Similarly to part 5 of the proof of Theorem 1, we get , hence ().
§ 4. Semi-greedy approximations
If one is not too "greedy", it is possible to get convergence without the condition needed in Theorem 1.
Given a norm-reducing set in an arbitrary Banach space , we define the weak semi-greedy algorithm by associating the following sequence with every :
(we select any element that meets these conditions).
If, in addition, $M$ is a proximal set, one can define a pure semi-greedy algorithm with $y_n \in P_M(x_n/2)$ instead of (20). In the case of $M = \{\lambda g : g \in D,\ \lambda \in \mathbb{R}\}$, where $D$ is a dictionary in a Hilbert space, this pure semi-greedy algorithm coincides with a particular type of the weak greedy algorithm in [1], Ch. 2, with weakness coefficient $1/2$. However, this coincidence is only formal. Essentially, the semi-greedy algorithm differs fundamentally from all known greedy algorithms: we subtract from $x_n$ not the best or, in some sense, almost best approximation of $x_n$, but the best or almost best approximation of $x_n/2$. For arbitrary sets $M$, the semi-greedy algorithm approximation is worse at every step than the algorithms (2) and (5), but it turns out to be more reliable in the sense of convergence.
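The pure semi-greedy step can be sketched in the same spirit. We assume, reading the description above, that the approximant subtracted from $x_n$ is a nearest point of the set to the half-vector $x_n/2$; the helper names and the finite test set `M` ($0$ and $\pm 2^{-k} e_i$, $k = 0, \dots, 60$, in the plane) are again hypothetical.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def semi_greedy(x, M, steps):
    """Residuals of x_{n+1} = x_n - y_n, where y_n is nearest in M to x_n / 2."""
    residuals = [list(x)]
    for _ in range(steps):
        xn = residuals[-1]
        half = [a / 2 for a in xn]
        yn = min(M, key=lambda y: norm(sub(half, y)))
        residuals.append(sub(xn, yn))
    return residuals

# Hypothetical test set: 0 and +-2^{-k} e_i in R^2 for k = 0..60.
M = [[0.0, 0.0]]
for k in range(61):
    t = 2.0 ** (-k)
    M += [[t, 0.0], [-t, 0.0], [0.0, t], [0.0, -t]]

res = semi_greedy([0.7, 0.3], M, 100)
norms = [norm(r) for r in res]
```

Here the first step subtracts `[0.25, 0.0]` rather than the nearest point `[0.5, 0.0]` of `M` to the starting vector, so the first residual is larger than in the pure greedy step, in line with the remark that the semi-greedy approximation is worse at each individual step.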
Theorem 3. Let be a symmetric norm-reducing set in a Hilbert space with . Then the weak semi-greedy algorithm (20) converges for each .
Proof. We follow the same scheme as in the proof of Theorem 1, but with some significant modifications.
1) Take any non-zero element . If , then and lies inside the ball , which has the closed interval as a diameter. Consequently, , so that (for all ), and
Hence the values decrease.
The sequence satisfies
Hence the sequence
is such that
It follows that along some subsequence of the values of .
2) Since is in the ball with diameter , the angle is greater than . Therefore,
so that
3) We take two numbers with . As in the proof of Theorem 1, we estimate the scalar product
where .
We distinguish three types among all the indices .
I. . For any such , we have
II. and the line with direction intersects the ball
Just as in part 3 of the proof of Theorem 1 (one should replace by in Fig. 1 and in (11)), in this case we obtain
III. and the line with direction does not intersect the ball . Just as in part 3 of the proof of Theorem 1 (one should replace by in Fig. 2 and in (12)), in this case we obtain
4) As in part 4 of the proof of Theorem 1, we use (23)–(25) to bound the absolute value of (22) by the sum
In view of (22) and the choice of , this sum tends to zero as , .
5) By (8), is a Cauchy sequence. Repeating the arguments in part 5 of the proof of Theorem 1, we find that as .
Corollary 2. Let be a norm-reducing symmetric set in a Hilbert space . Then every element has a representation as a series
where with .
It would be interesting to specify a non-trivial class of sets in this corollary for which the series (27) is absolutely convergent.
Theorem 2 remains valid for the modified algorithm (20) with taken instead of , where is any fixed value in .
§ 5. Recursive algorithms
The symmetry condition on $M$ in Theorems 1 and 3 seems to be significant, although the author is not aware of any corresponding examples. In order to obtain convergence without symmetry, we need to change the algorithm in a natural way.
In the case of approximation by an asymmetric norm-reducing set it is convenient to use the recursive greedy algorithm described below. It is a modification of the algorithm invented by Livshitz [9] in the case when , where is a dictionary, to speed up the convergence rate compared to algorithm (1). In [10] Livshitz also applied this algorithm to an approximation by the set , where is, generally speaking, an asymmetric dictionary in a Hilbert space.
Let be a norm-reducing set in an arbitrary Banach space , and let . We define the recursive weak greedy algorithm (RWGA) and recursive weak semi-greedy algorithm (RWSGA) by associating with every a sequence
according to the following rules.
For each , the element either belongs to or it belongs to , and in the latter case there is a such that and for all . Then (to be defined by induction) can be written as
where . We put and finally define the algorithm step
where and
(such an element exists since is norm-reducing and ).
When is proximal, all the are also proximal. Then one can define the recursive pure greedy and recursive pure semi-greedy algorithms by choosing and , respectively.
If the RWGA or RWSGA converges, that is, , then can be represented as a series of elements in all of whose partial sums are sums of elements in .
Theorem 4. Let be a norm-reducing set in a Hilbert space , and let . Then the recursive weak semi-greedy algorithm converges for every . If, moreover, , then the recursive weak greedy algorithm also converges for each .
Proof. This proof also follows the scheme of the proof of Theorem 1.
1) Clearly, the norms are decreasing, and the values satisfy .
Similarly to part 1 of the proof of Theorem 1 and of Theorem 3, we prove that , where
We choose a subsequence of 's such that .
2) Similarly to part 2 of the proof of Theorem 1 and of Theorem 3, we get
3) Consider two numbers such that . We now give only an upper bound for from (8), rather than a bound for the modulus: if this upper bound tends to zero as , , it will be sufficient for to be a Cauchy sequence.
We have
where as above, while the sum is taken over those for which we have either
(a) and for all (so that ),
or
(b) and for all .
The terms corresponding to the remaining indices split into pairs of opposite values which cancel one another.
Next, we have
where the sum contains only terms for which .
We split these terms into three types.
I. in the case of RWGA or in that of RWSGA. For such terms, we have
II. in the case of RWGA or in that of RWSGA, and the line with direction intersects the open ball
Since , the point lies in the lower part of the line in Fig. 1 (in the case of RWSGA, is replaced by in this figure).
In case (a) we have , therefore , which together with implies , and so .
In case (b) we have , therefore , and we similarly get and .
Consequently, in both cases,
For the details of the last step, see part 3 of the proof of Theorem 1.
III. in the case of RWGA or in that of RWSGA, and the line with direction does not intersect . Similarly to part 3 of the proof of Theorem 1, in accordance with Fig. 2 (in the case of RWSGA, is replaced by in this figure) we obtain
(in the case of RWGA, the estimate is even better: appears under the square root sign).
4) Similarly to part 4 of the proof of Theorem 1, we use (30)–(32) to bound the sum in (29) by the expression (26). In view of (28) and the choice of , this expression tends to zero as , .
5) By (8), turns out to be a Cauchy sequence. Repeating the arguments in part 5 of the proof of Theorem 1 and using the inequalities , which hold for all non-zero , we obtain as .
§ 6. Applications to the theory of density of semigroups
The main problem of this theory [3] is to find conditions on a subset $M$ of a Banach space $X$ which are necessary or sufficient for the set
$$R(M) = \{y_1 + \dots + y_n : y_k \in M,\ n \in \mathbb{N}\}$$
(the additive semigroup generated by $M$) to be dense in $X$ (that is, every element of $X$ can be approximated with arbitrary accuracy by finite sums of elements of $M$). A necessary condition for $R(M)$ to be dense in $X$ is that $M$ should be all-round; see [3]. There are examples of all-round sets $M$ in a Hilbert space $H$ such that the closure $\overline{R(M)}$ is not even an additive subgroup of $H$; see Example 1 in [3].
Condition (4) that $M$ should be norm-reducing is clearly not necessary for the additive semigroup $R(M)$ generated by $M$ to be dense in $H$, but it turns out to be sufficient for this to be so.
Theorem 5. For any norm-reducing set $M$ in a Hilbert space $H$, the additive semigroup $R(M)$ generated by $M$ is dense in $H$.
Proof. We complement by the zero element, take an arbitrary and run the recursive weak semi-greedy algorithm for . It converges by Theorem 4. Hence can be represented as a series of elements of all of whose partial sums are sums of elements of by the definition of the recursive algorithm. Thus and, therefore, .
It should be possible to generalise Theorem 5 for some class of Banach spaces. Nonetheless the theorem does not hold for an arbitrary Banach space.
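Remark 3. Each non-reflexive space $X$ contains a norm-reducing symmetric set $M$ such that the additive semigroup generated by $M$ is not dense in $X$.
Indeed, by James' theorem ([11], Ch. 1), there is a functional $f \in X^*$ not attaining its norm: $|f(x)| < \|f\|\,\|x\|$ for any non-zero $x \in X$. The kernel $\ker f$ has the property that the metric projection $P_{\ker f}(x)$ is empty for every $x \notin \ker f$. Indeed, if $y \in P_{\ker f}(x)$, then for $z = x - y$ we have $f(z) = f(x)$ and
$$\|z\| = \rho(x, \ker f) = \frac{|f(x)|}{\|f\|}$$
(for the last equality, see, for example, [12], Ch. 1), contradicting the fact that $f$ does not attain its norm. It turns out that for any $x \notin \ker f$ there is no nearest element in $\ker f$, hence there is a $y \in \ker f$ such that $\|x - y\| < \|x\|$. Thus $M = \ker f$ is the desired norm-reducing set.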
§ 7. Conclusion
We outline several questions that naturally arise in connection with the results presented here.
1) Is it possible to omit the symmetry condition on in Theorems 1 and 3? In particular, does the greedy algorithm (2) or the weak greedy algorithm (5) converge in the case when , where is (generally speaking) an asymmetric all-round dictionary in a Hilbert space? Reference [10] suggests that its author was also interested in the last question.
2) Is it possible to extend Theorems 1–5 to some class of Banach spaces (say, uniformly convex and uniformly smooth spaces)? It is especially interesting whether or not Theorems 2 and 5 hold for an arbitrary reflexive space. For non-reflexive spaces, they do not hold in view of the example in Remark 3.
3) Is it possible to estimate the rate of convergence of greedy algorithms under the hypotheses of Theorems 1, 3 and 4 for some classes of sets and initial elements ? For classical greedy algorithms, there are many estimates of this kind [1]. For example, if , where is a complete dictionary in a Hilbert space, then for each initial element which is a finite linear combination of elements of , the norms of the residuals in the pure greedy algorithm (1) are bounded above by with (DeVore, Temlyakov, Konyagin and Silnichenko; see [1], Ch. 2). It seems quite reasonable to compare the greedy approximations in the algorithms of approximation by elements of considered above with the corresponding -term approximations
Clearly, , as for the classical approximation with respect to a dictionary.
The author is deeply grateful to V. N. Temlyakov for valuable comments and advice.