On the proximal point algorithm and its Halpern-type variant for generalized monotone operators in Hilbert space

Kohlenbach, Ulrich

doi:10.1007/s11590-021-01738-9

Original Paper
Open access
Published: 16 April 2021

Volume 16, pages 611–621, (2022)
Optimization Letters

Ulrich Kohlenbach¹

Abstract

In a recent paper, Bauschke et al. study $\rho $-comonotonicity as a generalized notion of monotonicity of set-valued operators A in Hilbert space and characterize this condition on A in terms of the averagedness of its resolvent $J_A.$ In this note we show that this result makes it possible to adapt many proofs of properties of the proximal point algorithm PPA and its strongly convergent Halpern-type variant HPPA to this more general class of operators. This also applies to quantitative results on the rates of convergence or metastability (in the sense of T. Tao). E.g. using this approach we get a simple proof for the convergence of the PPA in the boundedly compact case for $\rho $-comonotone operators and obtain an effective rate of metastability. If A has a modulus of regularity w.r.t. $zer\, A$ we also get a rate of convergence to some zero of A even without any compactness assumption. We also study a Halpern-type variant HPPA of the PPA for $\rho $-comonotone operators, prove its strong convergence (without any compactness or regularity assumption) and give a rate of metastability.

1 Introduction

A central theme in convex optimization is the computation of zeros $z\in zer\,A:=A^{-1}(0)$ of (maximally) monotone set-valued operators $A\subseteq H\times H$ in Hilbert space H. This stems from the fact that for A being the subdifferential $\partial f$ of a proper, convex and lower semi-continuous function $f:H\rightarrow (-\infty ,\infty ],$ $zer\, A$ coincides with the set of minimizers of f.

An important algorithm for the approximation of zeros of A is the Proximal Point Algorithm PPA [17, 20]

$$\begin{aligned} x_{n+1}:=J_{\gamma _n A} x_n, \ \ (\gamma _n)\subset (0,\infty ), \end{aligned}$$

where $J_{\gamma _n A}:=(I+\gamma _n A)^{-1}:R(I+\gamma _n A)\rightarrow D(A)$ is the single-valued resolvent of $\gamma _n A$ and A is assumed to satisfy some range condition such as $\overline{D(A)} \subseteq R(I+\lambda A)$ for all $\lambda >0$ so that the iteration is defined for $x_0\in \overline{D(A)}\subseteq R(I+\gamma _0A).$ Here D(A) and R(A) denote the domain and range of A, respectively, as defined for set-valued mappings (see e.g. [4, 22]).

The range condition trivially holds for maximally monotone operators A such as $\partial f$ since then $R(I+\lambda A)=H.$

The crucial relation between A and $J_{\lambda A}$ is that the set of zeros of A coincides with the fixed point set of $J_{\lambda A}$ (which, therefore, in particular does not depend on the choice of $\lambda >0$). If A is monotone, then $J_{\lambda A}$ is firmly nonexpansive so that many results from metric fixed point theory apply (see e.g. [4] for all this).
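
To make the iteration concrete, here is a minimal Python sketch under illustrative assumptions that are not taken from the paper: H is the plane ${\mathbb R}^2$ identified with the complex numbers, and A is the single-valued linear operator $z\mapsto cz$ for a hypothetical constant c, so that the resolvent has the closed form $J_{\gamma A}z=z/(1+\gamma c)$ and $zer\,A=\{0\}.$ The names `resolvent` and `ppa` are ours.

```python
def resolvent(c: complex, gamma: float, z: complex) -> complex:
    """J_{gamma A} z = (I + gamma A)^{-1} z for the linear operator A z = c*z."""
    return z / (1 + gamma * c)


def ppa(c: complex, gammas: list, z0: complex) -> list:
    """Return the PPA iterates x_0, x_1, ..., x_{len(gammas)}."""
    xs = [z0]
    for gamma in gammas:
        xs.append(resolvent(c, gamma, xs[-1]))
    return xs


# Hypothetical data: c has negative real part, so A is not monotone.
c = complex(-0.1, 1.0)
iterates = ppa(c, [1.0] * 50, complex(3.0, -2.0))
print(abs(iterates[-1]))  # distance of the last iterate to the only zero, 0
```

For this particular c the fixed point set of every resolvent is $\{0\},$ independently of the step size, which matches the relation between $zer\,A$ and $Fix(J_{\lambda A})$ described above.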

In order to be able to treat functions f which are not necessarily convex, one needs to weaken the requirement of A to be monotone from

$$\begin{aligned} (+) \ \forall (x,u),(y,v)\in A \,( \langle x-y,u-v\rangle \ge 0) \end{aligned}$$

to e.g. stipulating

$$\begin{aligned} (++)\ \forall (x,u),(y,v)\in A \,( \langle x-y,u-v\rangle \ge \rho \Vert u-v\Vert ^2), \end{aligned}$$

where now $\rho $ may also be negative (see e.g. [7, 8]).

In the recent paper [5], this condition—called $\rho $-comonotonicity—is thoroughly investigated and related to properties of $J_{A}.$ One key result is that $J_{A}$ is an averaged mapping whenever $(++)$ holds with $\rho >-\frac{1}{2}.$ The averaged mappings form a larger class of mappings than the firmly nonexpansive ones but still have nice properties, e.g. they are strongly nonexpansive.

In the recent papers [12, 13], we studied from a quantitative point of view the PPA as well as a strongly convergent so-called Halpern-type variant HPPA (in Banach spaces), making use essentially only of the fact that all firmly nonexpansive mappings have a common so-called modulus for being strongly nonexpansive (see [10]). This also holds true for the class of averaged mappings if we have some control on the averaging constant (see [21]). Putting all this together, it is rather straightforward to see that the main results on the PPA and HPPA established in [12, 13] generalize (in the case of Hilbert spaces) to $\rho $-comonotone operators, which is the content of this short note. While the PPA has been considered for $\rho $-comonotone operators before (even for sequences of operators, error terms and relaxations, see [7]), our note shows that, by the connection between the comonotonicity of A and the averagedness of $J_A$ as established in [5], many proofs for properties of the PPA and the HPPA for monotone operators can be easily adapted to cover the $\rho $-comonotone case. We also provide new quantitative results on the convergence. For the HPPA, to the best of our knowledge, our note provides the first results in the absence of monotonicity.

2 Preparatory results

Throughout this paper H is a real Hilbert space and $A\subseteq H\times H$ a set-valued operator with the usual definitions of D(A) and $zer\, A.$ $\overline{D(A)}$ denotes the topological closure of D(A). We always assume that $D(A)\not =\emptyset .$

Definition 2.1

[5] Let $\rho \in {\mathbb R}.$ A is called $\rho $-comonotone if

$$\begin{aligned} \forall (x,u),(y,v)\in A \,( \langle x-y,u-v\rangle \ge \rho \Vert u-v\Vert ^2). \end{aligned}$$

In the case where $\rho <0$ which we are interested in, $\rho $-comonotonicity has been studied before in [7] under the name of $|\rho |$-cohypomonotonicity in the context of proximal methods as discussed in the introduction (see also Remark 3.4 below).
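
As an illustration of Definition 2.1 with $\rho <0,$ the following sketch tests the defining inequality on finitely many sampled points of a graph. It assumes, purely for illustration, the linear operator $Az=cz$ on ${\mathbb R}^2\cong {\mathbb C}$ with a hypothetical constant c. For this operator $\langle x-y,Ax-Ay\rangle =\mathrm{Re}(c)\,|x-y|^2$ and $\Vert Ax-Ay\Vert ^2=|c|^2|x-y|^2,$ so the largest admissible value is $\rho =\mathrm{Re}(c)/|c|^2.$ A sampled check of this kind can refute comonotonicity but cannot prove it.

```python
import random


def inner(x: complex, y: complex) -> float:
    """Real inner product on C viewed as the real Hilbert space R^2."""
    return (x * y.conjugate()).real


def is_comonotone_on_samples(graph, rho: float, tol: float = 1e-9) -> bool:
    """Check <x-y, u-v> >= rho * ||u-v||^2 (up to tol) for all sampled pairs."""
    return all(
        inner(x - y, u - v) >= rho * abs(u - v) ** 2 - tol
        for (x, u) in graph
        for (y, v) in graph
    )


random.seed(0)
c = complex(-0.1, 1.0)            # hypothetical operator A z = c*z
rho_best = c.real / abs(c) ** 2   # largest admissible rho for this A (negative)
points = [complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(40)]
graph = [(z, c * z) for z in points]

print(is_comonotone_on_samples(graph, rho_best))  # inequality holds on the samples
print(is_comonotone_on_samples(graph, 0.0))       # plain monotonicity is violated
```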

Let $J_A:=(I+A)^{-1}$ be the resolvent of A.

Proposition 2.2

Let $\rho \in {\mathbb R}, \lambda >0$ and A be $\rho $-comonotone. Then $D(J_{\lambda A})=R(I+\lambda A),$ $x\in J_{\lambda A}x\leftrightarrow x\in zer\, A$ and, if $\rho >-1,$ $J_A$ is at most single-valued and $zer\,A=Fix(J_{A}).$

Proof

This follows from [4, Proposition 23.2] and [5, Proposition 2.13]. $\square $

Lemma 2.3

If A is $\rho $-comonotone for $\rho \in {\mathbb R},$ then for $\lambda >0$ we have that $\lambda A$ is $\rho /\lambda $-comonotone.

Proof

If $u\in \lambda Ax, v\in \lambda Ay,$ then $\frac{u}{\lambda } \in Ax, \frac{v}{\lambda }\in Ay$ and so

$$\begin{aligned} \langle x-y,u-v\rangle = \lambda \left\langle x-y, \frac{u}{\lambda }-\frac{v}{\lambda }\right\rangle \ge \lambda \cdot \rho \left\| \frac{u}{\lambda }-\frac{v}{\lambda } \right\| ^2 =\frac{\rho }{\lambda } \Vert u-v\Vert ^2. \end{aligned}$$

$\square $

The following proposition, which is well-known for monotone operators, extends to $\rho $-comonotone operators:

Proposition 2.4

Let $A\subseteq H\times H$ be $\rho $-comonotone with $\rho \in {\mathbb R}.$ Let $\lambda ,\mu >0.$

1. If $\rho \ge -\frac{\lambda }{2},$ then $J_{\lambda A}$ is nonexpansive.
2. $J_{\lambda A}$ satisfies the resolvent equation in the following form: if $\rho > - \lambda ,-\mu ,$ then
$$\begin{aligned} J_{\lambda A} x=J_{\mu A}\left( \frac{\mu }{\lambda } x+ \left( 1-\frac{\mu }{\lambda }\right) J_{\lambda A} x\right) , \ \ \ x\in D(J_{\lambda A}). \end{aligned}$$
3. If $\rho \ge -\frac{\lambda }{2},-\frac{\mu }{2}$ then
$$\begin{aligned} \Vert x-J_{\mu A} x\Vert \le \left( 2+\frac{\mu }{\lambda }\right) \Vert x-J_{\lambda A} x \Vert \end{aligned}$$
for all $x\in R(I+\lambda A)\cap R(I+\mu A).$

Proof

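(1) By the assumptions and Lemma 2.3, $\lambda A$ is $(\rho /\lambda )$-comonotone with $\rho /\lambda \ge -\frac{1}{2}$ and hence, a fortiori, $-\frac{1}{2}$-comonotone. So, by [5, Proposition 3.11(iii)], $J_{\lambda A}$ is nonexpansive.

(2) follows as in [3, p. 105] using Proposition 2.2, which is applicable since, by Lemma 2.3, $\lambda A$ and $\mu A$ are $(\rho /\lambda )$- and $(\rho /\mu )$-comonotone, respectively, with $\rho /\lambda ,\rho /\mu >-1.$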

(3) Using (1) and (2) we get

$$\begin{aligned} \begin{array}{l} \left\| x-J_{\mu A} x\right\| \le \left\| x-J_{\lambda A} x\right\| + \left\| J_{\lambda A} x -J_{\mu A} x\right\| \\ \quad =\left\| x-J_{\lambda A} x\right\| + \left\| J_{\mu A} \left( \frac{\mu }{\lambda } x+(1- \frac{\mu }{\lambda })J_{\lambda A} x\right) -J_{\mu A} x\right\| \\ \quad \le \left\| x-J_{\lambda A} x\right\| + \left\| \frac{\mu }{\lambda } x+(1-\frac{\mu }{\lambda }) J_{\lambda A} x -x\right\| \\ \quad =\left\| x-J_{\lambda A} x\right\| + \left| 1-\frac{\mu }{\lambda }\right| \left\| x-J_{\lambda A}x\right\| \le \left( 2+\frac{\mu }{\lambda }\right) \left\| x-J_{\lambda A}x\right\| . \end{array}\end{aligned}$$

$\square $
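
The resolvent equation in (2) and the estimate in (3) can be sanity-checked numerically. The sketch below does this for the illustrative linear operator $Az=cz$ on ${\mathbb R}^2\cong {\mathbb C},$ where $J_{\lambda A}z=z/(1+\lambda c).$ The constant c, the parameters `lam`, `mu` and the test point x are hypothetical choices of ours, picked so that $\rho =\mathrm{Re}(c)/|c|^2\ge -\frac{\lambda }{2},-\frac{\mu }{2}.$

```python
def resolvent(c: complex, lam: float, z: complex) -> complex:
    """J_{lam A} z for the linear operator A z = c*z."""
    return z / (1 + lam * c)


c = complex(-0.1, 1.0)
rho = c.real / abs(c) ** 2     # about -0.099
lam, mu = 0.5, 2.0             # both satisfy rho >= -lam/2 and rho >= -mu/2
x = complex(1.5, -0.7)

j_lam = resolvent(c, lam, x)

# Resolvent equation: J_lam x = J_mu((mu/lam) x + (1 - mu/lam) J_lam x).
rhs = resolvent(c, mu, (mu / lam) * x + (1 - mu / lam) * j_lam)
identity_error = abs(j_lam - rhs)

# Estimate (3): |x - J_mu x| <= (2 + mu/lam) |x - J_lam x|.
bound_holds = abs(x - resolvent(c, mu, x)) <= (2 + mu / lam) * abs(x - j_lam)

print(identity_error, bound_holds)
```

The roles of $\lambda $ and $\mu $ can be swapped in the check, which then uses the constant $2+\frac{\lambda }{\mu }$ instead.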

Definition 2.5

[6]. Let $C\subseteq H$ be a nonempty subset of H and $T:C\rightarrow H$ be a mapping.

1. T is called $\alpha $-averaged with $\alpha \in (0,1)$ if $T=(1-\alpha )I+\alpha S,$ where $S:C\rightarrow H$ is nonexpansive.
2. T is called strongly nonexpansive (SNE) if T is nonexpansive and for all sequences $(x_n),(y_n)$ in C the following implication is true:
$$\begin{aligned}&\text{ if }\,\left( (x_n-y_n) \ \text{ bounded } \ \wedge \Vert x_n-y_n\Vert -\Vert Tx_n-Ty_n\Vert \rightarrow 0\right) , \text{ then }\,\\&\quad (x_n-y_n)-(Tx_n-Ty_n) \rightarrow 0. \end{aligned}$$

Lemma 2.6

[10, Lemma 2.2] $T:C\rightarrow H$ is strongly nonexpansive iff T has an SNE-modulus $\omega :(0,\infty )^2\rightarrow (0,\infty ),$ i.e.

$$\begin{aligned}&\forall b,\varepsilon >0 \,\forall x,y\in C\,\left( \Vert x-y\Vert \le b\wedge \Vert x-y\Vert -\Vert Tx-Ty\Vert<\omega (b,\varepsilon )\right. \\&\quad \left. \rightarrow \Vert (x-y)-(Tx-Ty)\Vert <\varepsilon \right) . \end{aligned}$$

The proof of [21, Proposition 2.7] establishes:

Proposition 2.7

[21]. Let $C\subseteq H$ be some subset of H and $T:C\rightarrow H$ be an $\alpha $-averaged mapping for some $\alpha \in (0,1).$ Then T is strongly nonexpansive with SNE-modulus

$$\begin{aligned} \omega _{\alpha }(b,\varepsilon ):=\frac{1-\alpha }{4b\alpha }\cdot \varepsilon ^2.\end{aligned}$$
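
A small numerical experiment can illustrate what the SNE-modulus of Proposition 2.7 asserts. The sketch below builds a hypothetical $\alpha $-averaged map on ${\mathbb R}^2\cong {\mathbb C}$ by taking S to be a rotation, then searches random pairs for a counterexample to the implication in Lemma 2.6. The values of `alpha`, `theta`, `b` and `eps` are arbitrary choices of ours.

```python
import cmath
import random


def sne_modulus(alpha: float, b: float, eps: float) -> float:
    """omega_alpha(b, eps) = (1 - alpha) / (4 * b * alpha) * eps**2."""
    return (1 - alpha) / (4 * b * alpha) * eps ** 2


def averaged_map(alpha: float, theta: float):
    """T = (1 - alpha) I + alpha S, with S the rotation by angle theta."""
    rotation = cmath.exp(1j * theta)
    return lambda z: (1 - alpha) * z + alpha * rotation * z


random.seed(1)
alpha, b, eps = 0.6, 4.0, 0.5
T = averaged_map(alpha, theta=2.0)
omega = sne_modulus(alpha, b, eps)

violations = 0
for _ in range(10_000):
    x = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    y = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    premise = abs(x - y) <= b and abs(x - y) - abs(T(x) - T(y)) < omega
    conclusion = abs((x - y) - (T(x) - T(y))) < eps
    if premise and not conclusion:
        violations += 1

print(violations)  # Proposition 2.7 predicts that no violation is found
```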

Proposition 2.8

Let $(\gamma _n)\subset (0,\infty ), \gamma >0$ be such that $\gamma _n\ge \gamma >0$ for all $n\in {\mathbb N}.$ Let $\rho \in (-\frac{\gamma }{2},0]$ and $A\subseteq H\times H$ be $\rho $-comonotone.

Then for each $n\in {\mathbb N},$ $J_{\gamma _n A}:R(I+\gamma _n A)\rightarrow D(A)$ is strongly nonexpansive with common SNE-modulus $\omega _{\alpha }$, where $\alpha :=\frac{1}{2((\rho /\gamma ) +1)}\in (0,1).$

In particular, if $D(A)\subseteq C\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _nA),$ then $(J_{\gamma _n A})$ (restricted to C) is a strongly nonexpansive sequence of mappings $C\rightarrow C$ in the sense of the papers [1, 2].

Proof

By the assumptions and Lemma 2.3, $\gamma _nA$ is $(\rho /\gamma _n)$-comonotone and so, since

$$\begin{aligned} \frac{\rho }{\gamma _n}\ge \frac{\rho }{\gamma }> -\frac{1}{2}, \end{aligned}$$

it a fortiori is $\eta $-comonotone with $\eta :=\frac{\rho }{\gamma }>-\frac{1}{2}.$ Hence by [5, Proposition 3.11(v)] applied to $\gamma _nA,$ the resolvent $J_{\gamma _n A}:R(I+\gamma _n A)\rightarrow D(A)$ is $\alpha $-averaged. The claim now follows from Proposition 2.7. $\square $
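
The averaging constant of Proposition 2.8 is easy to compute and, for simple operators, to check directly. The sketch below again assumes the illustrative operator $Az=cz$ on ${\mathbb R}^2\cong {\mathbb C}$ (our choice, not the paper's). Its resolvent is multiplication by $r=1/(1+\gamma _n c),$ and a map $z\mapsto rz$ is $\alpha $-averaged exactly when $|r-(1-\alpha )|\le \alpha .$ The helper name `averaging_constant` is hypothetical.

```python
def averaging_constant(rho: float, gamma: float) -> float:
    """alpha = 1 / (2 * (rho / gamma + 1)), for rho in (-gamma/2, 0]."""
    if not (-gamma / 2 < rho <= 0):
        raise ValueError("need rho in (-gamma/2, 0]")
    return 1 / (2 * (rho / gamma + 1))


c = complex(-0.1, 1.0)          # hypothetical operator A z = c*z
rho = c.real / abs(c) ** 2      # A is rho-comonotone, rho is about -0.099
gamma = 0.5                     # lower bound for the step sizes gamma_n
alpha = averaging_constant(rho, gamma)

for gamma_n in (0.5, 1.0, 4.0, 50.0):
    r = 1 / (1 + gamma_n * c)      # multiplier of the resolvent J_{gamma_n A}
    s = (r - (1 - alpha)) / alpha  # multiplier of S in T = (1 - alpha) I + alpha S
    # A small tolerance absorbs rounding in the borderline case gamma_n = gamma.
    print(gamma_n, abs(s) <= 1 + 1e-12)
```

Note that one and the same $\alpha $ works for every step size $\gamma _n\ge \gamma ,$ which is what makes the SNE-modulus common to the whole sequence of resolvents.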

3 The proximal point algorithm PPA for comonotone operators

Let $A\subseteq H\times H$ be $\rho $-comonotone, $(\gamma _n)\subset (0,\infty )$ and assume that $D(A)\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A).$ We assume that $zer\, A\not =\emptyset .$ The Proximal Point Algorithm PPA for A and $(\gamma _n)$ is defined by ($n\in {\mathbb N}=\{ 0,1,2,\ldots \}$)

$$\begin{aligned} x_{n+1}:=J_{\gamma _n A} x_n, \ \ x_0\in R(I+\gamma _0 A). \end{aligned}$$

Throughout this section we also assume that $\gamma _n\ge \gamma >0$ for all $n\in {\mathbb N}$ and that $\rho \in (-\frac{\gamma }{2},0].$

Proposition 3.1

1. We have
$$\begin{aligned} \lim \limits _{n\rightarrow \infty } \Vert x_n-J_{\gamma _0 A}x_n\Vert = \lim \limits _{n\rightarrow \infty } \Vert x_n-x_{n+1}\Vert =0. \end{aligned}$$
Moreover, with $\alpha :=\frac{1}{2((\rho /\gamma ) +1)}\in (0,1),$ $\omega _{\alpha }$ as in Proposition 2.7 and $b\ge \Vert x_0-p\Vert $ for some $p\in zer\,A,$
$$\begin{aligned} \Delta (\varepsilon ,L,b):=\left\lceil b/\omega _{\alpha } (b,\varepsilon ) \right\rceil +L+1 \end{aligned}$$
is a modulus of $\liminf $ (in the sense of [14]), i.e.
$$\begin{aligned} \forall L\in {\mathbb N},\,\varepsilon >0\,\exists n \,\left( L\le n\le \Delta (\varepsilon ,L,b)\ \text{ and } \ \Vert x_n-x_{n+1}\Vert < \varepsilon \right) . \end{aligned}$$
2. Define
$$\begin{aligned} u_n:=\frac{x_n-x_{n+1}}{\gamma _n}. \end{aligned}$$
Then $u_n\in Ax_{n+1},$ $\lim \limits _{n\rightarrow \infty } u_n=0$ and
$$\begin{aligned} \exists n\le \rho (\varepsilon ,b,\gamma ):=\Delta (\varepsilon \cdot \gamma ,0,b)\,\left( \Vert u_n\Vert <\varepsilon \right) . \end{aligned}$$

Proof

1) Let $p\in zer\,A.$ Then by Propositions 2.2 and 2.4 (using that $\gamma _nA$ is $>-\frac{1}{2}>-1$ comonotone)

$$\begin{aligned} \Vert x_{n+1} -p\Vert \le \Vert x_n-p\Vert \le b, \ \ n\in {\mathbb N},\end{aligned}$$

and so $(x_n)$ is Fejér monotone w.r.t. $zer\,A=Fix(J_{\gamma _n A})$ and $(\Vert x_n-p\Vert )$ is convergent. Thus

$$\begin{aligned} \left| \Vert J_{\gamma _n A} x_n-J_{\gamma _n A}p\Vert -\Vert x_n-p\Vert \right| =\left| \Vert x_{n+1} -p \Vert -\Vert x_n -p\Vert \right| \rightarrow 0. \end{aligned}$$

Hence by Proposition 2.8

$$\begin{aligned} \Vert x_{n+1}-x_n\Vert =\Vert J_{\gamma _n A} x_n-x_n\Vert \rightarrow 0. \end{aligned}$$

By Proposition 2.4(3) (which is applicable in the nontrivial case where $n\ge 1$ due to $x_n\in D(A)$ and the range condition)

$$\begin{aligned} \Vert x_n-J_{\gamma _0 A} x_n\Vert \le \left( 2+\frac{\gamma _0}{\gamma }\right) \, \Vert x_n-J_{\gamma _n A} x_n\Vert \end{aligned}$$

and so also $\lim \limits _{n\rightarrow \infty } \Vert x_n-J_{\gamma _0 A} x_n\Vert =0.$

The $\liminf $-bound is proved as in [13, Proposition 2.1] using Proposition 2.8. We include the proof here for completeness: Let $L\in {\mathbb N}$ and $\delta >0.$ Then there exists an $n\in {\mathbb N}$ with $L\le n\le L+ \left\lceil b/\delta \right\rceil +1$ such that

$$\begin{aligned} \Vert x_n-p\Vert -\Vert J_{\gamma _n A}x_n-J_{\gamma _n A}p\Vert = \Vert x_n-p\Vert -\Vert x_{n+1}-p\Vert < \delta \end{aligned}$$

since, otherwise,

$$\begin{aligned} b\ge \Vert x_L- p\Vert \ge \Vert x_L-p\Vert -\Vert x_{L+\left\lceil b/\delta \right\rceil +1}-p\Vert \ge (\left\lceil b/\delta \right\rceil +1)\cdot \delta > b.\end{aligned}$$

Now fix $\delta :=\omega _{\alpha }(b,\varepsilon ).$ Then Proposition 2.8 implies the existence of an n with $L\le n\le \Delta (\varepsilon ,L,b)$ such that

$$\begin{aligned} \Vert x_n-x_{n+1}\Vert =\Vert (x_n-p)-(J_{\gamma _n A}x_n-J_{\gamma _n A} p)\Vert < \varepsilon .\end{aligned}$$

2) is immediate from 1). $\square $
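
The modulus $\Delta $ of Proposition 3.1(1) is fully explicit, so it can be evaluated and compared with an actual run. The sketch below does this for the illustrative operator $Az=cz$ on ${\mathbb R}^2\cong {\mathbb C}$ with constant step size $\gamma _n:=\gamma $ and $p=0.$ All concrete numbers are hypothetical choices of ours. The bound is a worst-case guarantee over a whole class of operators, so a witness is typically found far earlier.

```python
import math


def sne_modulus(alpha: float, b: float, eps: float) -> float:
    """omega_alpha(b, eps) from Proposition 2.7."""
    return (1 - alpha) / (4 * b * alpha) * eps ** 2


def liminf_modulus(eps: float, L: int, b: float, alpha: float) -> int:
    """Delta(eps, L, b) = ceil(b / omega_alpha(b, eps)) + L + 1."""
    return math.ceil(b / sne_modulus(alpha, b, eps)) + L + 1


c = complex(-0.1, 1.0)                 # hypothetical operator A z = c*z
rho = c.real / abs(c) ** 2
gamma = 0.5                            # rho lies in (-gamma/2, 0]
alpha = 1 / (2 * (rho / gamma + 1))
x0 = complex(3.0, -2.0)
b = abs(x0)                            # b >= |x0 - p| with the zero p = 0
eps, L = 0.1, 5

bound = liminf_modulus(eps, L, b, alpha)

# Run the PPA far enough to inspect every index in [L, bound].
xs = [x0]
for _ in range(bound + 1):
    xs.append(xs[-1] / (1 + gamma * c))

witness = next(n for n in range(L, bound + 1) if abs(xs[n] - xs[n + 1]) < eps)
print(witness, bound)
```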

The PPA for maximally monotone operators, while being weakly convergent, in general fails to be strongly convergent, as shown in [9]. In the boundedly compact (i.e. finite dimensional) case there is in general no computable rate of convergence unless some strong metric regularity assumption is made (see [19] and [11]). However, in the boundedly compact case, one can get effective rates $\Psi $ of metastability in the sense of T. Tao [24, 25] for the Cauchy property of $(x_n),$ i.e.

$$\begin{aligned} \forall \varepsilon >0\,\forall g:{\mathbb N}\rightarrow {\mathbb N}\,\exists n\le \Psi (\varepsilon ,g)\, \forall i,j\in [n,n+g(n)]\ \left( \Vert x_i-x_j\Vert <\varepsilon \right) .\end{aligned}$$

Note that, noneffectively, this property implies the Cauchy property of $(x_n)$ and hence the existence of a limit x but does not allow one to convert $\Psi $ into an effective rate of convergence. One can additionally ensure that for $i\in [n,n+g(n)],$ $x_i$ is an approximate zero of A which guarantees that x is a zero of A.
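
To see what a metastability statement claims operationally, the following sketch searches a finite list of iterates for an index n whose whole window $[n,n+g(n)]$ is $\varepsilon $-stable. The function name `metastable_index`, the counter function g and the test sequence are hypothetical. A rate $\Psi (\varepsilon ,g)$ in the above sense would bound how far this search has to go, without inspecting the sequence.

```python
def metastable_index(xs, eps, g, search_limit):
    """Return the least n <= search_limit such that |x_i - x_j| < eps for all
    i, j in [n, n + g(n)], or None if no such n is found in the given data."""
    for n in range(search_limit + 1):
        window = xs[n : n + g(n) + 1]
        if len(window) < g(n) + 1:
            return None  # not enough iterates were computed to decide
        if all(abs(a - b) < eps for a in window for b in window):
            return n
    return None


# Hypothetical test sequence: PPA iterates for A z = c*z with constant step 1.
c = complex(-0.1, 1.0)
xs = [complex(3.0, -2.0)]
for _ in range(400):
    xs.append(xs[-1] / (1 + c))


def g(k: int) -> int:
    """An arbitrary counter function chosen for the demonstration."""
    return 2 * k + 10


n = metastable_index(xs, eps=1e-3, g=g, search_limit=100)
print(n)
```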

We now extend our rate of metastability for the PPA from [13] to the $\rho $-comonotone case:

Theorem 3.2

Let A be as above and assume additionally that $\overline{D(A)}\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A)$ is boundedly compact and $x_0\in \overline{D(A)}.$ Then $(x_n)$ strongly converges to a zero of A. Moreover, the rate of metastability $\Psi $ from [13, Theorem 2.12] also holds in our current situation with $\Delta $ being replaced by our definition in Proposition 3.1(1), i.e.

$$\begin{aligned}&(*)\ \forall k\in {\mathbb N}\,\forall g\in {\mathbb N}^{{\mathbb N}}\,\exists n\le \Psi (k,g,\beta ) \, \forall i,j\in [n,n+g(n)]\, \\&\qquad \quad \left( \Vert x_i-x_j\Vert \le \frac{1}{k+1} \ \text{ and } \ x_i\in \tilde{F}_k \right) , \end{aligned}$$

where

$$\begin{aligned} \tilde{F}_k:=\bigcap _{i\le k}\left\{ x\in \overline{D(A)}\,:\, \Vert x-J_{\gamma _i A} x\Vert \le \frac{1}{k+1}\right\} \end{aligned}$$

and $\beta $ is a modulus of total boundedness (in the sense of [13, Theorem 2.12]) for $\overline{D(A)}\cap \overline{B}(0,M),$ where $\overline{B}(0,M):=\{ x\in H: \Vert x\Vert \le M\},$ with $M\ge b+\Vert p\Vert $ and $b\ge \Vert x_0-p\Vert $ for some $p\in zer\,A.$

If $C\subseteq H$ is closed and convex with $\overline{D(A)}\subseteq C\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A),$ then without compactness assumption, $(x_n)$ converges weakly to a zero of A.

Proof

The proof of [13, Theorem 2.12] for the rate of metastability of $(x_n)$ can be taken without any changes observing that [14, Lemma 8.1] holds with the same proof in our context and that $\Phi $ can be shown to be an approximate F-bound as in [13, Proposition 2.11] using Propositions 2.4(3) and 3.1(1) instead of [13, Prop.2.3(ii),Prop.2.1].

Since $(x_n)$ is metastable (the first part of $(*)$), it is a Cauchy sequence and hence convergent with $x:=\lim _n x_n\in \overline{D(A)}.$ By the extra clause ‘$x_i\in \tilde{F}_k$’ in $(*),$ which strengthens the usual formulation of a rate of metastability, we can conclude that $x\in zer\,A.$ Indeed, choosing in $(*)$ for given $N\in {\mathbb N}$ the function $g(n):=N$ we get an $n_N\ge N$ with $\Vert x_{n_N}-J_{\gamma _0 A}x_{n_N}\Vert \le \frac{1}{k+1}.$ Using the nonexpansivity of $J_{\gamma _0 A}$ this implies that $x\in Fix(J_{\gamma _0 A})=zer\, A.$

For the weak convergence in the noncompact case we reason as follows: let w be a weak sequential cluster point of $(x_n).$ Then there is a subsequence $(x_{n_k})$ which weakly converges to w. By Proposition 3.1(1) $(x_{n_k})$ is an approximate fixed point sequence of $J_{\gamma _0 A}.$ Hence by Browder’s demiclosedness principle ([4, Corollary 4.28]) applied to $J_{\gamma _0 A}$ and C it follows that $w\in Fix(J_{\gamma _0 A}).$ Hence we can—using again the fact that $(x_n)$ is Fejér monotone w.r.t. $Fix(J_{\gamma _0 A})$—conclude that $(x_n)$ weakly converges to $w\in Fix(J_{\gamma _0 A})=zer\, A$ by [4, Theorem 5.5]. $\square $

Remark 3.3

The range condition in Theorem 3.2 is trivially satisfied if A is maximally $\rho $-comonotone (in the sense of [5, Definition 2.4.(iv)]) since then, by Lemma 2.3, $\lambda A$ is maximally $(\rho /\lambda )$-comonotone with $\rho /\lambda >-1$ for $\lambda \ge \gamma $ so that by [5, Corollary 2.12] $R(I+\lambda A)=H.$

Remark 3.4

Note that the conditions on $\rho ,\gamma _n$ made in [7] on their general PPA in the case of a single operator A and without relaxation (i.e. $\lambda _n:=1$) imply our condition that $\rho >-\frac{\inf \{\gamma _n:n\in {\mathbb N}\}}{2}:$ observing that $\rho $ in [7] corresponds to our $-\rho ,$ the conditions (iii),(iv) in [7, Theorem 3.1] state the existence of an $\varepsilon \in (0,1)$ s.t.

$$\begin{aligned} \frac{1}{1+\rho /\gamma _n}\le 2-\varepsilon , \ \gamma _n>-\rho , \ \ n\in {\mathbb N}.\end{aligned}$$

An easy calculation shows that this implies that

$$\begin{aligned} \inf \{ \gamma _n:n\in {\mathbb N}\}\ge -\rho \frac{2-\varepsilon }{1-\varepsilon }>-2\rho .\end{aligned}$$

Also the converse holds: let $\delta >0$ be such that $\gamma >-2\rho +\delta .$ Then the condition

$$\begin{aligned} \frac{1}{1+\frac{\rho }{\gamma _n}} < 2-\varepsilon \end{aligned}$$

is satisfied with $\varepsilon :=2-\frac{2\gamma }{\gamma +\delta }.$
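
Both directions of this calculation can be spelled out as follows. This is our own elaboration, assuming $\rho <0$ and writing $\gamma $ for a lower bound of $(\gamma _n)$ as in the rest of this section. Since $\gamma _n>-\rho $ gives $1+\rho /\gamma _n>0,$ the first condition can be inverted:

$$\begin{aligned} \frac{1}{1+\rho /\gamma _n}\le 2-\varepsilon \ \Rightarrow \ -\frac{\rho }{\gamma _n}\le 1-\frac{1}{2-\varepsilon }=\frac{1-\varepsilon }{2-\varepsilon } \ \Rightarrow \ \gamma _n\ge -\rho \,\frac{2-\varepsilon }{1-\varepsilon }, \qquad \frac{2-\varepsilon }{1-\varepsilon }=2+\frac{\varepsilon }{1-\varepsilon }>2. \end{aligned}$$

For the converse, $\gamma _n\ge \gamma >-2\rho +\delta $ yields $\frac{\rho }{\gamma _n}\ge \frac{\rho }{\gamma }>-\frac{\gamma -\delta }{2\gamma }$ and hence

$$\begin{aligned} 1+\frac{\rho }{\gamma _n}>\frac{\gamma +\delta }{2\gamma }, \qquad \frac{1}{1+\frac{\rho }{\gamma _n}}<\frac{2\gamma }{\gamma +\delta }=2-\varepsilon \quad \text{with}\quad \varepsilon =2-\frac{2\gamma }{\gamma +\delta }=\frac{2\delta }{\gamma +\delta }. \end{aligned}$$

Here $\varepsilon \in (0,1)$ because $\delta <\gamma ,$ which follows from $\gamma >-2\rho +\delta \ge \delta .$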

Error terms $u_n$ subject to the condition that $\sum \Vert u_n\Vert <\infty $ (implied by condition (vi) in [7, Theorem 3.1]) can be incorporated even in the quantitative part of our theorem (similar to [15, Theorem 4.5]). Our approach makes the relevance of the averagedness of $J_{\gamma _n A}$ explicit which only implicitly occurs in the proof of [7, Theorem 3.1].

Definition 3.5

[16] Let A be as at the beginning of this section with $p\in zer\,A$ and define $F(x):={{\,\mathrm{dist}\,}}(0_X,A(x))$ (with $F(x):=\infty $ for $x\not \in D(A)$). A function $\phi :(0,\infty ) \rightarrow (0,\infty )$ is called a ‘modulus of regularity for A w.r.t. $zer\,A$ and $\overline{B}(p,r)$ with $r>0$’ if for all $\varepsilon >0$ and $x\in \overline{B}(p,r):=\{ y\in H:\Vert y-p\Vert \le r\}$ one has

$$\begin{aligned} F(x)<\phi (\varepsilon ) \rightarrow {{\,\mathrm{dist}\,}}(x,zer\,F)<\varepsilon . \ \end{aligned}$$

As in [13, Lemma 2.6] (but reasoning in the proof of $zer\,F\subseteq zer\,A$ with, say, $\gamma _0A$ and $J_{\gamma _0A}$ instead of $A,J_A$) one shows:

Lemma 3.6

With F as defined in the previous definition, $zer\, F=zer\, A$ and so $(x_n)$ as defined by the PPA for A is Fejér monotone w.r.t. $zer\, F=zer\,A,$ i.e.

$$\begin{aligned} \forall p\in zer\,F\,\forall n\in {\mathbb N}\, (\Vert x_{n+1}-p\Vert \le \Vert x_n-p\Vert ). \end{aligned}$$

As in the case of [13, Theorem 2.8] one now gets

Theorem 3.7

Let A and $(\gamma _n)$ be as above and assume that $\overline{D(A)}\subseteq \bigcap \limits ^{\infty }_{n=0} R(I+\gamma _n A).$ Let $p\in zer\,A$ and $b\ge \Vert x_0-p\Vert .$ If A has a modulus $\phi $ of regularity w.r.t. $zer\, A$ and $\overline{B}(p,b),$ then $(x_n)$ converges to a zero $z:=\lim x_n$ of A with rate of convergence $\rho (\phi (\varepsilon /2),b,\gamma )+1,$ where $\rho $ is as in Proposition 3.1(2).

Proof

The proof is largely identical to that of [13, Theorem 2.8]. We only have to observe that in that latter proof it suffices to have the existence of an $n\le \rho (\varepsilon ,b,\gamma )\,(|F(x_{n+1})|\le \Vert u_n\Vert \le \varepsilon )$ (rather than that this holds for all $n\ge \rho (\varepsilon ,b,\gamma )$) and that this follows from Proposition 3.1(2). $\square $
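
For a concrete feel of the rate in Theorem 3.7, consider once more the illustrative operator $Az=cz$ on ${\mathbb R}^2\cong {\mathbb C}$ (our example, not the paper's). Here $F(x)=|c|\,|x|$ and ${{\,\mathrm{dist}\,}}(x,zer\,A)=|x|,$ so $\phi (\varepsilon ):=|c|\,\varepsilon $ is a modulus of regularity on every ball. The sketch evaluates $\rho (\phi (\varepsilon /2),b,\gamma )+1$ and checks that the iterates stay within $\varepsilon $ of the limit $z=0$ from that index on. The function is called `rho_bound` to avoid a clash with the comonotonicity constant.

```python
import math


def sne_modulus(alpha: float, b: float, eps: float) -> float:
    """omega_alpha(b, eps) from Proposition 2.7."""
    return (1 - alpha) / (4 * b * alpha) * eps ** 2


def rho_bound(eps: float, b: float, gamma: float, alpha: float) -> int:
    """rho(eps, b, gamma) = Delta(eps * gamma, 0, b) from Proposition 3.1(2)."""
    return math.ceil(b / sne_modulus(alpha, b, eps * gamma)) + 1


c = complex(-0.1, 1.0)                  # hypothetical operator A z = c*z
rho_const = c.real / abs(c) ** 2        # comonotonicity constant of A
gamma = 0.5                             # constant step size, rho_const > -gamma/2
alpha = 1 / (2 * (rho_const / gamma + 1))
x0 = complex(3.0, -2.0)
b = abs(x0)                             # b >= |x0 - p| with p = 0


def phi(eps: float) -> float:
    """Modulus of regularity of this particular A: F(x) < |c|*eps implies |x| < eps."""
    return abs(c) * eps


eps = 0.2
rate = rho_bound(phi(eps / 2), b, gamma, alpha) + 1

xs = [x0]
for _ in range(rate + 25):
    xs.append(xs[-1] / (1 + gamma * c))

print(rate, all(abs(xs[n]) < eps for n in range(rate, rate + 20)))
```

As with the modulus $\Delta ,$ the computed rate is a uniform guarantee and is far from sharp for this simple linear example.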

4 The Halpern-type proximal point algorithm HPPA for comonotone operators

Whereas the PPA, even for monotone operators A, is in general not strongly convergent ([9]), a Halpern-type variant converges strongly also for $\rho $-comonotone operators, as we show in this section.

Again we assume that $(\gamma _n)\subset (0,\infty )$ with $\gamma _n\ge \gamma >0$ for all $n\in {\mathbb N}$ and that A is $\rho $-comonotone with $\rho \in (-\frac{\gamma }{2},0]$ with $zer\,A\not =\emptyset .$ Let $C\subseteq H$ be a nonempty closed and convex subset such that $\overline{D(A)}\subseteq C\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A).$

Definition 4.1

[2, 18, 23]. Let $S\subseteq H$ be some nonempty subset of H and $T:S\rightarrow H$ a mapping and $(S_n)$ be a sequence of mappings $S_n:S\rightarrow H.$ Let $F((S_n)):=\bigcap _{n\in {\mathbb N}} Fix(S_n)$ be the set of all common fixed points of $S_n$ for all n. $(S_n)$ is said to satisfy the NST condition (I) with T if $F((S_n))\not =\emptyset ,$ $Fix(T)\subseteq F((S_n))$ and $x_n-Tx_n\rightarrow 0$ whenever $(x_n)$ is a bounded sequence in S with $x_n-S_nx_n\rightarrow 0.$

Proposition 4.2

Let $T:=J_{\gamma _0 A}:C\rightarrow C$ and $S_n:=J_{\gamma _n A}:C\rightarrow C.$ Then $(S_n)$ (strictly speaking the sequence of the restrictions of $S_n$ to C) satisfies the NST condition (I) with T.

Proof

Clearly, $Fix(T)=Fix(S_n)=zer\, A\not =\emptyset .$ Let $(x_n)$ be a bounded sequence in C with $\lim _n\Vert x_n-S_nx_n\Vert =0.$ Then by Proposition 2.4(3) (applied with $\lambda :=\gamma _n\ge \gamma $ and $\mu :=\gamma _0$) also $\lim _n \Vert x_n-Tx_n\Vert =0.$ $\square $

Theorem 4.3

Let $(\alpha _n)\subset (0,1]$ be such that $\lim _n\alpha _n=0$ and $\sum ^{\infty }_{n=0}\alpha _n=\infty .$ For $u,x_0\in C$ define the Halpern-type proximal point algorithm (HPPA) by

$$\begin{aligned} x_{n+1}:=\alpha _n u+(1-\alpha _n)J_{\gamma _n A} x_n\in C. \end{aligned}$$

Then $(x_n)$ strongly converges to the zero of A which is closest to u. Moreover, the rate of metastability from [12, Theorem 4.1] also holds for our current situation if $\omega _{\eta }$ is replaced by $\omega _{\alpha }$ from Proposition 2.7 above with $\alpha :=\frac{1}{2((\rho /\gamma )+1)}$ and $\omega _J(b,\varepsilon ):=\varepsilon .$

Proof

The strong convergence follows from [2, Theorem 3.1] whose assumptions are satisfied by Propositions 2.4(1), 2.8 and 4.2 using also that H has the fixed point property for nonexpansive mappings. The strong convergence also follows using [12, Theorem 4.1] which, moreover, gives the rate of metastability stated in the theorem. For this we only have to observe that the proof of [12, Theorem 4.1] only uses properties of $J_{\gamma _n A}$ which by the results stated above also hold true for $\rho $-comonotone operators A where now we use $\omega _{\alpha }$ and Proposition 2.8 instead of $\omega _{\eta }$ and [12, Lemma 2.4]. Finally, we note that we can take $\omega _J(b,\varepsilon ):=\varepsilon $ as modulus of uniform continuity for the normalized duality map on $\overline{B}(0,b)$ since we are in a Hilbert space. $\square $
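
The HPPA iteration of Theorem 4.3 can be sketched in a few lines. The example below uses a hypothetical operator on ${\mathbb R}^2,$ namely $A(x_1,x_2):=(-\frac{1}{4}x_1,0),$ which is $\rho $-comonotone with $\rho =-4$ and has the non-trivial zero set $zer\,A=\{ (0,t):t\in {\mathbb R}\} .$ The step sizes, the anchor u and the choice $\alpha _n:=1/(n+2)$ are our own illustrative assumptions, and $C:=H.$

```python
def hppa(resolvent, u, x0, gammas, alphas):
    """Iterate x_{n+1} = alpha_n * u + (1 - alpha_n) * J_{gamma_n A} x_n."""
    x = x0
    for gamma, a in zip(gammas, alphas):
        jx = resolvent(gamma, x)
        x = tuple(a * ui + (1 - a) * ji for ui, ji in zip(u, jx))
    return x


def resolvent(gamma, x):
    """J_{gamma A} for the hypothetical operator A(x1, x2) = (-0.25 * x1, 0)."""
    return (x[0] / (1 - 0.25 * gamma), x[1])


N = 5000
gammas = [10.0] * N                       # gamma = 10, so rho = -4 > -gamma/2
alphas = [1 / (n + 2) for n in range(N)]  # alpha_n -> 0 and sum alpha_n = infinity
u = (2.0, 3.0)
limit = hppa(resolvent, u, (-1.0, 7.0), gammas, alphas)
print(limit)  # should be close to (0.0, 3.0), the zero of A nearest to u
```

In contrast to the PPA, the limit depends on the anchor u and not on the starting point: changing `x0` leaves the limit unchanged, while changing the second coordinate of `u` moves it along $zer\,A.$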

Remark 4.4

Remark 3.3 applies here as well: if A is maximally $\rho $-comonotone, then the range condition is satisfied for any closed and convex subset $C\subseteq H$ satisfying $\overline{D(A)}\subseteq C.$

References

[1] Aoyama, K., Toyoda, M.: Approximation of zeros of accretive operators in a Banach space. Israel J. Math. 220, 803–816 (2017)
[2] Aoyama, K., Toyoda, M.: Approximation of common fixed points of strongly nonexpansive sequences in a Banach space. J. Fixed Point Theory and Appl. 21, Article no. 35 (2019)
[3] Barbu, V.: Nonlinear Semigroups and Differential Equations in Banach Spaces. Noordhoff International Publishing, Leyden (1976)
[4] Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, New York (2017)
[5] Bauschke, H.H., Moursi, W.A., Wang, X.: Generalized monotone operators and their averaged resolvents. Math. Program. Ser. B (2020). https://doi.org/10.1007/s10107-020-01500-6
[6] Bruck, R.E., Reich, S.: Nonexpansive projections and resolvents of accretive operators in Banach spaces. Houston J. Math. 3, 459–470 (1977)
[7] Combettes, P.L., Pennanen, T.: Proximal methods for cohypomonotone operators. SIAM J. Control Optim. 43, 731–742 (2004)
[8] Diakonikolas, J., Daskalakis, C., Jordan, M.I.: Efficient methods for structured nonconvex-nonconcave Min-Max optimization. arXiv:2011.00364
[9] Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29, 403–419 (1991)
[10] Kohlenbach, U.: On the quantitative asymptotic behavior of strongly nonexpansive mappings in Banach and geodesic spaces. Israel J. Math. 216, 215–246 (2016)
[11] Kohlenbach, U.: On the reverse mathematics and Weihrauch complexity of moduli of regularity and uniqueness. Computability 8, 377–387 (2019)
[12] Kohlenbach, U.: Quantitative analysis of a Halpern-type proximal point algorithm for accretive operators in Banach spaces. J. Nonlin. Convex Anal. 9, 2125–2138 (2020)
[13] Kohlenbach, U.: Quantitative results on the proximal point algorithm in uniformly convex Banach spaces. J. Convex Anal. 28, 11–18 (2021)
[14] Kohlenbach, U., Leuştean, L., Nicolae, A.: Quantitative results on Fejér monotone sequences. Commun. Contemp. Math. (2018). https://doi.org/10.1142/S0219199717500158
[15] Kohlenbach, U., López-Acedo, G., Nicolae, A.: Quantitative asymptotic regularity results for the composition of two mappings. Optimization 66, 1291–1299 (2017)
[16] Kohlenbach, U., López-Acedo, G., Nicolae, A.: Moduli of regularity and rates of convergence for Fejér monotone sequences. Israel J. Math. 232, 261–297 (2019)
[17] Martinet, B.: Régularisation d’inéquations variationnelles par approximations successives. Rev. Française Inf. Recherche Opérationnelle 4, 154–158 (1970)
[18] Nakajo, K., Shimoji, K., Takahashi, W.: Strong convergence to common fixed points of families of nonexpansive mappings in Banach spaces. J. Nonlinear Convex Anal. 8, 11–34 (2007)
[19] Neumann, E.: Computational problems in metric fixed point theory and their Weihrauch degrees. Log. Method. Comput. Sci. 11, 1–44 (2015)
[20] Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877–898 (1976)
[21] Sipoş, A.: Quantitative inconsistent feasibility for averaged mappings. arXiv:2001.01513 (2020)
[22] Takahashi, W.: Nonlinear Functional Analysis. Yokohama Publishers, Yokohama (2000)
[23] Takahashi, W.: Viscosity approximation methods for countable families of nonexpansive mappings in Banach spaces. Nonlinear Anal. 70, 719–734 (2009)
[24] Tao, T.: Soft analysis, hard analysis, and the finite convergence principle. Essay posted May 23, 2007. Appeared in: T. Tao, Structure and Randomness: Pages from Year One of a Mathematical Blog. AMS (2008)
[25] Tao, T.: Norm convergence of multiple ergodic averages for commuting transformations. Ergod. Theory. Dyn. Syst. 28, 657–688 (2008)

Acknowledgements

The author has been supported by the German Science Foundation (DFG Project KO 1737/6-1).

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Department of Mathematics, Technische Universität Darmstadt, Schlossgartenstraße 7, 64289, Darmstadt, Germany
Ulrich Kohlenbach

Corresponding author

Correspondence to Ulrich Kohlenbach.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kohlenbach, U. On the proximal point algorithm and its Halpern-type variant for generalized monotone operators in Hilbert space. Optim Lett 16, 611–621 (2022). https://doi.org/10.1007/s11590-021-01738-9

Received: 01 March 2021
Accepted: 08 April 2021
Published: 16 April 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11590-021-01738-9
