1 Introduction

A central theme in convex optimization is the computation of zeros \(z\in zer\,A:=A^{-1}(0)\) of (maximally) monotone set-valued operators \(A\subseteq H\times H\) in a Hilbert space H. This stems from the fact that, for A being the subdifferential \(\partial f\) of a proper, convex and lower semi-continuous function \(f:H\rightarrow (-\infty ,\infty ],\) \(zer\, A\) coincides with the set of minimizers of f.

An important algorithm for the approximation of zeros of A is the Proximal Point Algorithm PPA [17, 20]

$$\begin{aligned} x_{n+1}:=J_{\gamma _n A} x_n, \ \ (\gamma _n)\subset (0,\infty ), \end{aligned}$$

where \(J_{\gamma _n A}:=(I+\gamma _n A)^{-1}:R(I+\gamma _n A)\rightarrow D(A)\) is the single-valued resolvent of \(\gamma _n A\) and A is assumed to satisfy some range condition such as \(\overline{D(A)} \subseteq R(I+\lambda A)\) for all \(\lambda >0\) so that the iteration is defined for \(x_0\in \overline{D(A)}\subseteq R(I+\gamma _0A).\) Here D(A) and R(A) denote the domain and range of A, respectively, as defined for set-valued mappings (see e.g. [4, 22]).

The range condition trivially holds for maximally monotone operators A such as \(\partial f\) since then \(R(I+\lambda A)=H.\)

The crucial relation between A and \(J_{\lambda A}\) is that the set of zeros of A coincides with the fixed point set of \(J_{\lambda A}\) (which, therefore, in particular does not depend on the choice of \(\lambda >0\)). If A is monotone, then \(J_{\lambda A}\) is firmly nonexpansive so that many results from metric fixed point theory apply (see e.g. [4] for all this).

In order to be able to treat functions f which are not necessarily convex, one needs to weaken the requirement of A to be monotone from

$$\begin{aligned} (+) \ \forall (x,u),(y,v)\in A \,( \langle x-y,u-v\rangle \ge 0) \end{aligned}$$

to e.g. stipulating

$$\begin{aligned} (++)\ \forall (x,u),(y,v)\in A \,( \langle x-y,u-v\rangle \ge \rho \Vert u-v\Vert ^2), \end{aligned}$$

where now \(\rho \) may also be negative (see e.g. [7, 8]).

In the recent paper [5], this condition—called \(\rho \)-comonotonicity—is thoroughly investigated and related to properties of \(J_{A}.\) One key result is that \(J_{A}\) is an averaged mapping whenever \((++)\) holds with \(\rho >-\frac{1}{2}.\) The averaged mappings form a larger class of mappings than the firmly nonexpansive ones but still have nice properties, e.g. they are strongly nonexpansive.

In the recent papers [12, 13], we studied from a quantitative point of view the PPA as well as a strongly convergent so-called Halpern-type variant HPPA (in Banach spaces), making use essentially only of the fact that all firmly nonexpansive mappings have a common so-called modulus for being strongly nonexpansive (see [10]). This also holds true for the class of averaged mappings if we have some control on the averaging constant (see [21]). Putting all this together, it is rather straightforward to see that the main results on the PPA and HPPA established in [12, 13] generalize (in the case of Hilbert spaces) to \(\rho \)-comonotone operators, which is the content of this short note. While the PPA has been considered for \(\rho \)-comonotone operators before (even for sequences of operators, error terms and relaxations, see [7]), our note shows that, by the connection between the comonotonicity of A and the averagedness of \(J_A\) as established in [5], many proofs for properties of the PPA and the HPPA for monotone operators can be easily adapted to cover the \(\rho \)-comonotone case. We also provide new quantitative results on the convergence. For the HPPA, to the best of our knowledge, our note provides the first results in the absence of monotonicity.

2 Preparatory results

Throughout this paper H is a real Hilbert space and \(A\subseteq H\times H\) a set-valued operator with the usual definitions of D(A) and \(zer\, A.\) \(\overline{D(A)}\) denotes the topological closure of D(A). We always assume that \(D(A)\not =\emptyset .\)

Definition 2.1

[5] Let \(\rho \in {\mathbb R}.\) A is called \(\rho \)-comonotone if

$$\begin{aligned} \forall (x,u),(y,v)\in A \,( \langle x-y,u-v\rangle \ge \rho \Vert u-v\Vert ^2). \end{aligned}$$

In the case where \(\rho <0\) which we are interested in, \(\rho \)-comonotonicity has been studied before in [7] under the name of \(|\rho |\)-cohypomonotonicity in the context of proximal methods as discussed in the introduction (see also Remark 3.4 below).
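As a simple illustration (our own toy example, not taken from [5] or [7]), take \(H={\mathbb R}\) and the linear operator \(Ax:=ax\) with a fixed \(a<0.\) For \(u=ax\) and \(v=ay\) one computes

$$\begin{aligned} \langle x-y,u-v\rangle = a\,|x-y|^2 =\frac{1}{a}\,|u-v|^2, \end{aligned}$$

so that A is \(\rho \)-comonotone exactly for \(\rho \le \frac{1}{a}<0,\) while A is not monotone.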

Let \(J_A:=(I+A)^{-1}\) be the resolvent of A.

Proposition 2.2

Let \(\rho \in {\mathbb R}, \lambda >0\) and A be \(\rho \)-comonotone. Then \(D(J_{\lambda A})=R(I+\lambda A),\) \(x\in J_{\lambda A}x\leftrightarrow x\in zer\, A\) and, if \(\rho >-1,\) \(J_A\) is at most single-valued and \(zer\,A=Fix(J_{A}).\)

Proof

[4, Proposition 23.2] and [5, Proposition 2.13]. \(\square \)

Lemma 2.3

If A is \(\rho \)-comonotone for \(\rho \in {\mathbb R},\) then for \(\lambda >0\) we have that \(\lambda A\) is \(\rho /\lambda \)-comonotone.

Proof

If \(u\in \lambda Ax, v\in \lambda Ay,\) then \(\frac{u}{\lambda } \in Ax, \frac{v}{\lambda }\in Ay\) and so

$$\begin{aligned} \langle x-y,u-v\rangle = \lambda \left\langle x-y, \frac{u}{\lambda }-\frac{v}{\lambda }\right\rangle \ge \lambda \cdot \rho \left\| \frac{u}{\lambda }-\frac{v}{\lambda } \right\| ^2 =\frac{\rho }{\lambda } \Vert u-v\Vert ^2. \end{aligned}$$

\(\square \)

The following proposition, which is well-known for monotone operators, extends to \(\rho \)-comonotone operators:

Proposition 2.4

Let \(A\subseteq H\times H\) be \(\rho \)-comonotone with \(\rho \in {\mathbb R}.\) Let \(\lambda ,\mu >0.\)

  1.

    If \(\rho \ge -\frac{\lambda }{2},\) then \(J_{\lambda A}\) is nonexpansive.

  2.

    \(J_{\lambda A}\) satisfies the resolvent equation in the following form: if \(\rho > - \lambda ,-\mu ,\) then

    $$\begin{aligned} J_{\lambda A} x=J_{\mu A}\left( \frac{\mu }{\lambda } x+ \left( 1-\frac{\mu }{\lambda }\right) J_{\lambda A} x\right) , \ \ \ x\in D(J_{\lambda A}). \end{aligned}$$

  3.

    If \(\rho \ge -\frac{\lambda }{2},-\frac{\mu }{2}\) then

    $$\begin{aligned} \Vert x-J_{\mu A} x\Vert \le \left( 2+\frac{\mu }{\lambda }\right) \Vert x-J_{\lambda A} x \Vert \end{aligned}$$

    for all \(x\in R(I+\lambda A)\cap R(I+\mu A).\)

Proof

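(1) By the assumptions and Lemma 2.3, \(\lambda A\) is \(\frac{\rho }{\lambda }\)-comonotone with \(\frac{\rho }{\lambda }\ge -\frac{1}{2},\) hence \(-\frac{1}{2}\)-comonotone, and so, by [5, Proposition 3.11(iii)], \(J_{\lambda A}\) is nonexpansive.

(2) follows as in [3, p. 105] using Proposition 2.2, which is applicable since, by Lemma 2.3, \(\lambda A\) and \(\mu A\) are \(\frac{\rho }{\lambda }\)- and \(\frac{\rho }{\mu }\)-comonotone, respectively, with \(\frac{\rho }{\lambda },\frac{\rho }{\mu }>-1.\)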

(3) Using (1) and (2) we get

$$\begin{aligned} \begin{array}{l} \left\| x-J_{\mu A} x\right\| \le \left\| x-J_{\lambda A} x\right\| + \left\| J_{\lambda A} x -J_{\mu A} x\right\| \\ \quad =\left\| x-J_{\lambda A} x\right\| + \left\| J_{\mu A} \left( \frac{\mu }{\lambda } x+(1- \frac{\mu }{\lambda })J_{\lambda A} x\right) -J_{\mu A} x\right\| \\ \quad \le \left\| x-J_{\lambda A} x\right\| + \left\| \frac{\mu }{\lambda } x+(1-\frac{\mu }{\lambda }) J_{\lambda A} x -x\right\| \\ \quad =\left\| x-J_{\lambda A} x\right\| + \left| 1-\frac{\mu }{\lambda }\right| \left\| x-J_{\lambda A}x\right\| \le \left( 2+\frac{\mu }{\lambda }\right) \left\| x-J_{\lambda A}x\right\| . \end{array}\end{aligned}$$

\(\square \)
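The statements (2) and (3) can be checked numerically in a toy case. The following Python sketch is our own illustration under stated assumptions: \(H={\mathbb R}\) and \(Ax:=ax\) with \(a<0,\) which is \(\rho \)-comonotone with \(\rho =\frac{1}{a}\) since \(\langle x-y,ax-ay\rangle =\frac{1}{a}|ax-ay|^2,\) and whose resolvent is \(J_{\lambda A}x=x/(1+\lambda a)\) whenever \(1+\lambda a\not =0.\) The function names are hypothetical.

```python
def resolvent(a: float, lam: float, x: float) -> float:
    """J_{lam*A} x = (I + lam*A)^{-1} x for the scalar operator A x = a*x."""
    denom = 1.0 + lam * a
    if denom == 0.0:
        raise ZeroDivisionError("I + lam*A is not invertible")
    return x / denom


def check_resolvent_identity(a: float, lam: float, mu: float, x: float):
    """Return both sides of the resolvent equation in Proposition 2.4(2)."""
    jx = resolvent(a, lam, x)
    lhs = jx
    rhs = resolvent(a, mu, (mu / lam) * x + (1.0 - mu / lam) * jx)
    return lhs, rhs


def check_bound(a: float, lam: float, mu: float, x: float):
    """Return (|x - J_mu x|, (2 + mu/lam) * |x - J_lam x|), cf. Proposition 2.4(3)."""
    left = abs(x - resolvent(a, mu, x))
    right = (2.0 + mu / lam) * abs(x - resolvent(a, lam, x))
    return left, right


if __name__ == "__main__":
    # a = -1 gives rho = -1; lam = 4 and mu = 3 satisfy rho >= -lam/2, -mu/2.
    print(check_resolvent_identity(-1.0, 4.0, 3.0, 1.5))
    print(check_bound(-1.0, 4.0, 3.0, 1.5))
```

For these parameters both sides of the resolvent equation agree and the left-hand side of the inequality in (3) stays below the right-hand side.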

Definition 2.5

[6]. Let \(C\subseteq H\) be a nonempty subset of H and \(T:C\rightarrow H\) be a mapping.

  1.

    T is called \(\alpha \)-averaged with \(\alpha \in (0,1)\) if \(T=(1-\alpha )I+\alpha S,\) where \(S:C\rightarrow H\) is nonexpansive.

  2.

    T is called strongly nonexpansive (SNE) if T is nonexpansive and for all sequences \((x_n),(y_n)\) in C the following implication is true:

    $$\begin{aligned}&\text{ if }\,\left( (x_n-y_n) \ \text{ bounded } \ \wedge \Vert x_n-y_n\Vert -\Vert Tx_n-Ty_n\Vert \rightarrow 0\right) , \text{ then }\,\\&\quad (x_n-y_n)-(Tx_n-Ty_n) \rightarrow 0. \end{aligned}$$

Lemma 2.6

[10, Lemma 2.2] \(T:C\rightarrow H\) is strongly nonexpansive iff T has an SNE-modulus \(\omega :(0,\infty )^2\rightarrow (0,\infty ),\) i.e.

$$\begin{aligned}&\forall b,\varepsilon >0 \,\forall x,y\in C\,\left( \Vert x-y\Vert \le b\wedge \Vert x-y\Vert -\Vert Tx-Ty\Vert<\omega (b,\varepsilon )\right. \\&\quad \left. \rightarrow \Vert (x-y)-(Tx-Ty)\Vert <\varepsilon \right) . \end{aligned}$$

The proof of [21, Proposition 2.7] establishes:

Proposition 2.7

[21]. Let \(C\subseteq H\) be some subset of H and \(T:C\rightarrow H\) be an \(\alpha \)-averaged mapping for some \(\alpha \in (0,1).\) Then T is strongly nonexpansive with SNE-modulus

$$\begin{aligned} \omega _{\alpha }(b,\varepsilon ):=\frac{1-\alpha }{4b\alpha }\cdot \varepsilon ^2.\end{aligned}$$
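For orientation, one way to see that \(\omega _{\alpha }\) works in the Hilbert space setting (a sketch, not necessarily the argument given in [21]) uses the well-known characterization of \(\alpha \)-averaged mappings by the inequality

$$\begin{aligned} \Vert Tx-Ty\Vert ^2\le \Vert x-y\Vert ^2-\frac{1-\alpha }{\alpha }\Vert (x-y)-(Tx-Ty)\Vert ^2, \ \ x,y\in C. \end{aligned}$$

If \(\Vert x-y\Vert \le b\) and \(\Vert x-y\Vert -\Vert Tx-Ty\Vert <\omega _{\alpha }(b,\varepsilon ),\) then, since T is nonexpansive,

$$\begin{aligned} \frac{1-\alpha }{\alpha }\Vert (x-y)-(Tx-Ty)\Vert ^2&\le \left( \Vert x-y\Vert -\Vert Tx-Ty\Vert \right) \left( \Vert x-y\Vert +\Vert Tx-Ty\Vert \right) \\&\le 2b\left( \Vert x-y\Vert -\Vert Tx-Ty\Vert \right) < 2b\,\omega _{\alpha }(b,\varepsilon )=\frac{1-\alpha }{2\alpha }\,\varepsilon ^2, \end{aligned}$$

and hence \(\Vert (x-y)-(Tx-Ty)\Vert ^2<\varepsilon ^2/2<\varepsilon ^2.\)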

Proposition 2.8

Let \((\gamma _n)\subset (0,\infty ), \gamma >0\) be such that \(\gamma _n\ge \gamma >0\) for all \(n\in {\mathbb N}.\) Let \(\rho \in (-\frac{\gamma }{2},0]\) and \(A\subseteq H\times H\) be \(\rho \)-comonotone.

Then for each \(n\in {\mathbb N},\) \(J_{\gamma _n A}:R(I+\gamma _n A)\rightarrow D(A)\) is strongly nonexpansive with common SNE-modulus \(\omega _{\alpha }\), where \(\alpha :=\frac{1}{2((\rho /\gamma ) +1)}\in (0,1).\)

In particular, if \(D(A)\subseteq C\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _nA),\) then \((J_{\gamma _n A})\) (restricted to C) is a strongly nonexpansive sequence of mappings \(C\rightarrow C\) in the sense of the papers [1, 2].

Proof

By the assumptions and Lemma 2.3, \(\gamma _nA\) is \((\rho /\gamma _n)\)-comonotone and so, since

$$\begin{aligned} \frac{\rho }{\gamma _n}\ge \frac{\rho }{\gamma }> -\frac{1}{2}, \end{aligned}$$

it a fortiori is \(\eta \)-comonotone with \(\eta :=\frac{\rho }{\gamma }>-\frac{1}{2}.\) Hence by [5, Proposition 3.11(v)] applied to \(\gamma _nA,\) the resolvent \(J_{\gamma _n A}:R(I+\gamma _n A)\rightarrow D(A)\) is \(\alpha \)-averaged. The claim now follows from Proposition 2.7. \(\square \)

3 The proximal point algorithm PPA for comonotone operators

Let \(A\subseteq H\times H\) be \(\rho \)-comonotone, \((\gamma _n)\subset (0,\infty )\) and assume that \(D(A)\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A).\) We assume that \(zer\, A\not =\emptyset .\) The Proximal Point Algorithm PPA for A and \((\gamma _n)\) is defined by (\(n\in {\mathbb N}=\{ 0,1,2,\ldots \}\))

$$\begin{aligned} x_{n+1}:=J_{\gamma _n A} x_n, \ \ x_0\in R(I+\gamma _0 A). \end{aligned}$$

Throughout this section we also assume that \(\gamma _n\ge \gamma >0\) for all \(n\in {\mathbb N}\) and that \(\rho \in (-\frac{\gamma }{2},0].\)
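To illustrate the role of the assumption \(\rho >-\frac{\gamma }{2},\) the following Python sketch (our own toy example, not part of the cited results) runs the PPA on \(H={\mathbb R}\) for the hypothetical operator \(Ax:=ax\) with \(a=-1.\) Since \(\langle x-y,ax-ay\rangle =\frac{1}{a}|ax-ay|^2,\) this A is \(\rho \)-comonotone with \(\rho =\frac{1}{a}=-1,\) and \(J_{\gamma A}x=x/(1+\gamma a)\) whenever \(1+\gamma a\not =0.\) The function name ppa_scalar is an assumption made for this illustration.

```python
def ppa_scalar(a: float, gammas, x0: float):
    """Iterate x_{n+1} = J_{gamma_n A} x_n for A x = a*x on the real line.

    Returns the list [x_0, x_1, ..., x_N] with N = len(gammas).
    """
    xs = [x0]
    for gamma in gammas:
        denom = 1.0 + gamma * a  # I + gamma*A acts as multiplication by denom
        if denom == 0.0:
            raise ZeroDivisionError("resolvent undefined: I + gamma*A is singular")
        xs.append(xs[-1] / denom)
    return xs


if __name__ == "__main__":
    a = -1.0  # rho = 1/a = -1
    good = ppa_scalar(a, [4.0] * 20, 1.0)  # gamma = 4 > -2*rho
    bad = ppa_scalar(a, [1.5] * 20, 1.0)   # gamma = 1.5 < -2*rho
    print(abs(good[-1]), abs(bad[-1]))
```

With \(\gamma _n=4\) the resolvent is multiplication by \(-\frac{1}{3}\) and the iterates tend to the unique zero 0, whereas for \(\gamma _n=1.5<-2\rho \) the resolvent is multiplication by \(-2\) and the iterates diverge.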

Proposition 3.1

  1.

    $$\begin{aligned} \lim \limits _{n\rightarrow \infty } \Vert x_n-J_{\gamma _0 A}x_n\Vert = \lim \limits _{n\rightarrow \infty } \Vert x_n-x_{n+1}\Vert =0. \end{aligned}$$

    Moreover, with \(\alpha :=\frac{1}{2((\rho /\gamma ) +1)}\in (0,1),\) \(\omega _{\alpha }\) as in Proposition 2.7 and \(b\ge \Vert x_0-p\Vert \) for some \(p\in zer\,A,\)

    $$\begin{aligned} \Delta (\varepsilon ,L,b):=\left\lceil b/\omega _{\alpha } (b,\varepsilon ) \right\rceil +L+1 \end{aligned}$$

    is a modulus of \(\liminf \) (in the sense of [14]) i.e.

    $$\begin{aligned} \forall L\in {\mathbb N},\,\varepsilon >0\,\exists n \,\left( L\le n\le \Delta (\varepsilon ,L,b)\ \text{ and } \ \Vert x_n-x_{n+1}\Vert < \varepsilon \right) . \end{aligned}$$

  2.

    Define

    $$\begin{aligned} u_n:=\frac{x_n-x_{n+1}}{\gamma _n}. \end{aligned}$$

    Then \(u_n\in Ax_{n+1},\) \(\lim \limits _{n\rightarrow \infty } u_n=0\) and

    $$\begin{aligned} \exists n\le \rho (\varepsilon ,b,\gamma ):=\Delta (\varepsilon \cdot \gamma ,0,b)\,\left( \Vert u_n\Vert <\varepsilon \right) . \end{aligned}$$

Proof

1) Let \(p\in zer\,A.\) Then by Propositions 2.2 and 2.4 (using that \(\gamma _nA\) is \(>-\frac{1}{2}>-1\) comonotone)

$$\begin{aligned} \Vert x_{n+1} -p\Vert \le \Vert x_n-p\Vert \le b, \ \ n\in {\mathbb N},\end{aligned}$$

and so \((x_n)\) is Fejér monotone w.r.t. \(zer\,A=Fix(J_{\gamma _n A})\) and \((\Vert x_n-p\Vert )\) is convergent. Thus

$$\begin{aligned} \left| \Vert J_{\gamma _n A} x_n-J_{\gamma _n A}p\Vert -\Vert x_n-p\Vert \right| =\left| \Vert x_{n+1} -p \Vert -\Vert x_n -p\Vert \right| \rightarrow 0. \end{aligned}$$

Hence by Proposition 2.8

$$\begin{aligned} \Vert x_{n+1}-x_n\Vert =\Vert J_{\gamma _n A} x_n-x_n\Vert \rightarrow 0. \end{aligned}$$

By Proposition 2.4(3) (which is applicable in the nontrivial case where \(n\ge 1\) due to \(x_n\in D(A)\) and the range condition)

$$\begin{aligned} \Vert x_n-J_{\gamma _0 A} x_n\Vert \le \left( 2+\frac{\gamma _0}{\gamma }\right) \, \Vert x_n-J_{\gamma _n A} x_n\Vert \end{aligned}$$

and so also \(\lim \limits _{n\rightarrow \infty } \Vert x_n-J_{\gamma _0 A} x_n\Vert =0.\)

The \(\liminf \)-bound is proved as in [13, Proposition 2.1] using Proposition 2.8. We include the proof here for completeness: Let \(L\in {\mathbb N}\) and \(\delta >0.\) Then there exists an \(n\in {\mathbb N}\) with \(L\le n\le L+ \left\lceil b/\delta \right\rceil +1\) such that

$$\begin{aligned} \Vert x_n-p\Vert -\Vert J_{\gamma _n A}x_n-J_{\gamma _n A}p\Vert = \Vert x_n-p\Vert -\Vert x_{n+1}-p\Vert < \delta \end{aligned}$$

since, otherwise,

$$\begin{aligned} b\ge \Vert x_L- p\Vert \ge \Vert x_L-p\Vert -\Vert x_{L+\left\lceil b/\delta \right\rceil +1}-p\Vert \ge (\left\lceil b/\delta \right\rceil +1)\cdot \delta > b.\end{aligned}$$

Now fix \(\delta :=\omega _{\alpha }(b,\varepsilon ).\) Then Proposition 2.8 implies the existence of an n with \(L\le n\le \Delta (\varepsilon ,L,b)\) such that

$$\begin{aligned} \Vert x_n-x_{n+1}\Vert =\Vert (x_n-p)-(J_{\gamma _n A}x_n-J_{\gamma _n A} p)\Vert < \varepsilon .\end{aligned}$$

2) is immediate from 1). \(\square \)
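The bound \(\Delta \) is easy to evaluate. The following Python sketch (a minimal illustration with hypothetical function names) computes \(\omega _{\alpha }\) and \(\Delta \) from the formulas above and compares \(\Delta \) with the first index at which \(\Vert x_n-x_{n+1}\Vert <\varepsilon \) for a toy instance of our own choosing: \(H={\mathbb R},\) \(Ax:=ax\) with \(a<0\) (which is \(\frac{1}{a}\)-comonotone, with \(J_{\gamma A}x=x/(1+\gamma a)\)) and constant \(\gamma _n=\gamma .\)

```python
import math


def sne_modulus(alpha: float, b: float, eps: float) -> float:
    """omega_alpha(b, eps) = (1 - alpha) / (4*b*alpha) * eps**2 (Proposition 2.7)."""
    return (1.0 - alpha) / (4.0 * b * alpha) * eps ** 2


def liminf_modulus(rho: float, gamma: float, b: float, eps: float, L: int = 0) -> int:
    """Delta(eps, L, b) = ceil(b / omega_alpha(b, eps)) + L + 1 (Proposition 3.1(1))."""
    alpha = 1.0 / (2.0 * (rho / gamma + 1.0))
    return math.ceil(b / sne_modulus(alpha, b, eps)) + L + 1


def first_small_step(a: float, gamma: float, x0: float, eps: float, L: int, n_max: int):
    """Smallest n in [L, n_max] with |x_n - x_{n+1}| < eps for the PPA with
    constant step gamma applied to A x = a*x, or None if there is none."""
    x = x0
    for n in range(n_max + 1):
        x_next = x / (1.0 + gamma * a)  # resolvent step
        if n >= L and abs(x - x_next) < eps:
            return n
        x = x_next
    return None


if __name__ == "__main__":
    # a = -1 (so rho = -1), gamma = 4, x0 = 1, zero p = 0, hence b = 1.
    delta = liminf_modulus(rho=-1.0, gamma=4.0, b=1.0, eps=0.1, L=0)
    print(delta, first_small_step(-1.0, 4.0, 1.0, 0.1, 0, delta))
```

In this instance \(\Delta \) is of the order of 800 while the first admissible index is much smaller, which is to be expected from a bound that depends on A only via \(\rho ,\gamma \) and b.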

The PPA for maximally monotone operators, while being weakly convergent, in general fails to be strongly convergent, as shown in [9]. In the boundedly compact (i.e. finite dimensional) case there is in general no computable rate of convergence unless some strong metric regularity assumption is made (see [19] and [11]). However, in the boundedly compact case, one can get effective rates \(\Psi \) of metastability in the sense of T. Tao [24, 25] for the Cauchy property of \((x_n),\) i.e.

$$\begin{aligned} \forall \varepsilon >0\,\forall g:{\mathbb N}\rightarrow {\mathbb N}\,\exists n\le \Psi (\varepsilon ,g)\, \forall i,j\in [n,n+g(n)]\ \left( \Vert x_i-x_j\Vert <\varepsilon \right) .\end{aligned}$$

Note that, noneffectively, this property implies the Cauchy property of \((x_n)\) and hence the existence of a limit x but does not allow one to convert \(\Psi \) into an effective rate of convergence. One can additionally ensure that for \(i\in [n,n+g(n)],\) \(x_i\) is an approximate zero of A which guarantees that x is a zero of A.
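The metastability statement can be tested directly on a computed finite prefix of a real sequence. The following Python sketch is only an illustration of the definition (it does not compute the rate \(\Psi \) from [13]); the function name metastable_index is hypothetical.

```python
from typing import Callable, Optional, Sequence


def metastable_index(xs: Sequence[float], eps: float,
                     g: Callable[[int], int]) -> Optional[int]:
    """Smallest n such that |x_i - x_j| < eps for all i, j in [n, n + g(n)],
    considering only windows that fit inside the finite list xs."""
    for n in range(len(xs)):
        hi = n + g(n)
        if hi >= len(xs):
            continue  # window leaves the computed part of the sequence
        window = xs[n:hi + 1]
        # For real numbers, all pairwise distances are < eps iff max - min < eps.
        if max(window) - min(window) < eps:
            return n
    return None


if __name__ == "__main__":
    xs = [(-1.0 / 3.0) ** n for n in range(50)]  # toy sequence converging to 0
    print(metastable_index(xs, 0.01, lambda n: n + 5))
```

The counterfunction g determines how long the interval of stability has to be; a fast-growing g asks for a longer stable window and, in general, pushes the witness n further out.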

We now extend our rate of metastability for the PPA from [13] to the \(\rho \)-comonotone case:

Theorem 3.2

Let A be as above and assume additionally that \(\overline{D(A)}\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A)\) is boundedly compact and \(x_0\in \overline{D(A)}.\) Then \((x_n)\) strongly converges to a zero of A. Moreover, the rate of metastability \(\Psi \) from [13, Theorem 2.12] also holds in our current situation with \(\Delta \) being replaced by our definition in Proposition 3.1(1), i.e.

$$\begin{aligned}&(*)\ \forall k\in {\mathbb N}\,\forall g\in {\mathbb N}^{{\mathbb N}}\,\exists n\le \Psi (k,g,\beta ) \, \forall i,j\in [n,n+g(n)]\, \\&\qquad \quad \left( \Vert x_i-x_j\Vert \le \frac{1}{k+1} \ \text{ and } \ x_i\in \tilde{F}_k \right) , \end{aligned}$$

where

$$\begin{aligned} \tilde{F}_k:=\bigcap _{i\le k}\left\{ x\in \overline{D(A)}\,:\, \Vert x-J_{\gamma _i A} x\Vert \le \frac{1}{k+1}\right\} \end{aligned}$$

and \(\beta \) is a modulus of total boundedness (in the sense of [13, Theorem 2.12]) for \(\overline{D(A)}\cap \overline{B}(0,M),\) where \(\overline{B}(0,M):=\{ x\in H: \Vert x\Vert \le M\},\) with \(M\ge b+\Vert p\Vert \) and \(b\ge \Vert x_0-p\Vert \) for some \(p\in zer\,A.\)

If \(C\subseteq H\) is closed and convex with \(\overline{D(A)}\subseteq C\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A),\) then without compactness assumption, \((x_n)\) converges weakly to a zero of A.

Proof

The proof of [13, Theorem 2.12] for the rate of metastability of \((x_n)\) can be taken without any changes observing that [14, Lemma 8.1] holds with the same proof in our context and that \(\Phi \) can be shown to be an approximate F-bound as in [13, Proposition 2.11] using Propositions 2.4(3) and 3.1(1) instead of [13, Prop.2.3(ii),Prop.2.1].

Since \((x_n)\) is metastable (the first part of \((*)\)), it is a Cauchy sequence and hence convergent with \(x:=\lim _n x_n\in \overline{D(A)}.\) By the extra clause ‘\(x_i\in \tilde{F}_k\)’ in \((*),\) which strengthens the usual formulation of a rate of metastability, we can conclude that \(x\in zer\,A.\) Indeed, choosing in \((*)\) for given \(k,N\in {\mathbb N}\) the function \(g(n):=N\) we get an \(n_N\ge N\) with \(\Vert x_{n_N}-J_{\gamma _0 A}x_{n_N}\Vert \le \frac{1}{k+1}.\) Using the nonexpansivity of \(J_{\gamma _0 A}\) and letting \(N\rightarrow \infty \) this gives \(\Vert x-J_{\gamma _0 A}x\Vert \le \frac{1}{k+1}\) for every k, so that \(x\in Fix(J_{\gamma _0 A})=zer\, A.\)

For the weak convergence in the noncompact case we reason as follows: let w be a weak sequential cluster point of \((x_n).\) Then there is a subsequence \((x_{n_k})\) which weakly converges to w. By Proposition 3.1(1) \((x_{n_k})\) is an approximate fixed point sequence of \(J_{\gamma _0 A}.\) Hence by Browder’s demiclosedness principle ([4, Corollary 4.28]) applied to \(J_{\gamma _0 A}\) and C it follows that \(w\in Fix(J_{\gamma _0 A}).\) Hence we can—using again the fact that \((x_n)\) is Fejér monotone w.r.t. \(Fix(J_{\gamma _0 A})\)—conclude that \((x_n)\) weakly converges to \(w\in Fix(J_{\gamma _0 A})=zer\, A\) by [4, Theorem 5.5]. \(\square \)

Remark 3.3

The range condition in Theorem 3.2 is trivially satisfied if A is maximally \(\rho \)-comonotone (in the sense of [5, Definition 2.4.(iv)]) since then by Lemma 2.3 \(\lambda A\) is maximally \((\rho /\lambda )\)-comonotone with \(\rho /\lambda >-1\) for \(\lambda \ge \gamma \) so that by [5, Corollary 2.12] \(R(I+\lambda A)=H.\)

Remark 3.4

Note that the conditions on \(\rho ,\gamma _n\) made in [7] on their general PPA in the case of a single operator A and without relaxation (i.e. \(\lambda _n:=1\)) imply our condition that \(\rho >-\frac{\inf \{\gamma _n:n\in {\mathbb N}\}}{2}:\) observing that \(\rho \) in [7] corresponds to our \(-\rho ,\) the conditions (iii),(iv) in [7, Theorem 3.1] state the existence of an \(\varepsilon \in (0,1)\) s.t.

$$\begin{aligned} \frac{1}{1+\rho /\gamma _n}\le 2-\varepsilon , \ \gamma _n>-\rho , \ \ n\in {\mathbb N}.\end{aligned}$$

An easy calculation shows that this implies that

$$\begin{aligned} \inf \{ \gamma _n:n\in {\mathbb N}\}\ge -\rho \frac{2-\varepsilon }{1-\varepsilon }>-2\rho .\end{aligned}$$
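Spelled out for \(\rho <0\) (a sketch of the omitted computation): since \(\gamma _n+\rho >0,\) the first inequality reads \(\frac{\gamma _n}{\gamma _n+\rho }\le 2-\varepsilon ,\) and multiplying by \(\gamma _n+\rho \) gives

$$\begin{aligned} \gamma _n\le (2-\varepsilon )(\gamma _n+\rho ), \ \ \text{ i.e. } \ \ (1-\varepsilon )\gamma _n\ge -(2-\varepsilon )\rho , \ \ n\in {\mathbb N}. \end{aligned}$$

Dividing by \(1-\varepsilon >0\) yields the first estimate, and the strict inequality follows from \(\frac{2-\varepsilon }{1-\varepsilon }>2,\) which holds since \(2-\varepsilon >2-2\varepsilon .\)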

Also the converse holds: let \(\delta >0\) be such that \(\gamma >-2\rho +\delta .\) Then the condition

$$\begin{aligned} \frac{1}{1+\frac{\rho }{\gamma _n}} < 2-\varepsilon \end{aligned}$$

is satisfied with \(\varepsilon :=2-\frac{2\gamma }{\gamma +\delta }.\)
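To verify this choice (again a sketch of the omitted computation), note that \(2-\varepsilon =\frac{2\gamma }{\gamma +\delta }\) and that \(\varepsilon =\frac{2\delta }{\gamma +\delta }\in (0,1)\) because \(\delta <\gamma .\) Since \(\rho \le 0\) and \(\gamma _n\ge \gamma >-\rho ,\) the map \(t\mapsto \frac{t}{t+\rho }\) is nonincreasing on \((-\rho ,\infty )\) and so

$$\begin{aligned} \frac{1}{1+\frac{\rho }{\gamma _n}}=\frac{\gamma _n}{\gamma _n+\rho }\le \frac{\gamma }{\gamma +\rho }<\frac{2\gamma }{\gamma +\delta }, \end{aligned}$$

where the last inequality is equivalent to \(\gamma +\delta <2\gamma +2\rho ,\) i.e. to \(\gamma >-2\rho +\delta .\)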

Error terms \(u_n\) subject to the condition that \(\sum \Vert u_n\Vert <\infty \) (implied by condition (vi) in [7, Theorem 3.1]) can be incorporated even in the quantitative part of our theorem (similar to [15, Theorem 4.5]). Our approach makes the relevance of the averagedness of \(J_{\gamma _n A}\) explicit which only implicitly occurs in the proof of [7, Theorem 3.1].

Definition 3.5

[16] Let A be as at the beginning of this section with \(p\in zer\,A\) and define \(F(x):={{\,\mathrm{dist}\,}}(0,A(x))\) (with \(F(x):=\infty \) for \(x\not \in D(A)\)). A function \(\phi :(0,\infty ) \rightarrow (0,\infty )\) is called a ‘modulus of regularity for A w.r.t. \(zer\,A\) and \(\overline{B}(p,r)\) with \(r>0\)’ if for all \(\varepsilon >0\) and \(x\in \overline{B}(p,r):=\{ y\in H:\Vert y-p\Vert \le r\}\) one has

$$\begin{aligned} F(x)<\phi (\varepsilon ) \rightarrow {{\,\mathrm{dist}\,}}(x,zer\,F)<\varepsilon . \ \end{aligned}$$

As in [13, Lemma 2.6] (but reasoning in the proof of \(zer\,F\subseteq zer\,A\) with, say, \(\gamma _0A\) and \(J_{\gamma _0A}\) instead of \(A,J_A\)) one shows:

Lemma 3.6

With F as defined in the previous definition, \(zer\, F=zer\, A\) and so \((x_n)\) as defined by the PPA for A is Fejér monotone w.r.t. \(zer\, F=zer\,A,\) i.e.

$$\begin{aligned} \forall p\in zer\,F\,\forall n\in {\mathbb N}\, (\Vert x_{n+1}-p\Vert \le \Vert x_n-p\Vert ). \end{aligned}$$

As in the case of [13, Theorem 2.8] one now gets

Theorem 3.7

Let A and \((\gamma _n)\) be as above and assume that \(\overline{D(A)}\subseteq \bigcap \limits ^{\infty }_{n=0} R(I+\gamma _n A).\) Let \(p\in zer\,A\) and \(b\ge \Vert x_0-p\Vert .\) If A has a modulus \(\phi \) of regularity w.r.t. \(zer\, A\) and \(\overline{B}(p,b),\) then \((x_n)\) converges to a zero \(z:=\lim x_n\) of A with rate of convergence \(\rho (\phi (\varepsilon /2),b,\gamma )+1,\) where \(\rho \) is as in Proposition 3.1(2).

Proof

The proof is largely identical to that of [13, Theorem 2.8]. We only have to observe that in that latter proof it suffices to have the existence of an \(n\le \rho (\varepsilon ,b,\gamma )\,(|F(x_{n+1})|\le \Vert u_n\Vert \le \varepsilon )\) (rather than that this holds for all \(n\ge \rho (\varepsilon ,b,\gamma )\)) and that this follows from Proposition 3.1(2). \(\square \)

4 The Halpern-type proximal point algorithm HPPA for comonotone operators

Whereas the PPA, even for monotone operators A, is in general not strongly convergent ([9]), a Halpern-type variant strongly converges also for \(\rho \)-comonotone operators, as we show in this section.

Again we assume that \((\gamma _n)\subset (0,\infty )\) with \(\gamma _n\ge \gamma >0\) for all \(n\in {\mathbb N}\) and that A is \(\rho \)-comonotone with \(\rho \in (-\frac{\gamma }{2},0]\) and \(zer\,A\not =\emptyset .\) Let \(C\subseteq H\) be a nonempty closed and convex subset such that \(\overline{D(A)}\subseteq C\subseteq \bigcap ^{\infty }_{n=0} R(I+\gamma _n A).\)

Definition 4.1

[2, 18, 23]. Let \(S\subseteq H\) be some nonempty subset of H and \(T:S\rightarrow H\) a mapping and \((S_n)\) be a sequence of mappings \(S_n:S\rightarrow H.\) Let \(F((S_n)):=\bigcap _{n\in {\mathbb N}} Fix(S_n)\) be the set of all common fixed points of \(S_n\) for all n. \((S_n)\) is said to satisfy the NST condition (I) with T if \(F((S_n))\not =\emptyset ,\) \(Fix(T)\subseteq F((S_n))\) and \(x_n-Tx_n\rightarrow 0\) whenever \((x_n)\) is a bounded sequence in S with \(x_n-S_nx_n\rightarrow 0.\)

Proposition 4.2

Let \(T:=J_{\gamma _0 A}:C\rightarrow C\) and \(S_n:=J_{\gamma _n A}:C\rightarrow C.\) Then \((S_n)\) (strictly speaking the sequence of the restrictions of \(S_n\) to C) satisfies the NST condition (I) with T.

Proof

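Clearly, \(Fix(T)=Fix(S_n)=zer\, A\not =\emptyset .\) Let \((x_n)\) be a bounded sequence in C with \(\lim _n\Vert x_n-S_nx_n\Vert =0.\) Then by Proposition 2.4(3) (applied with \(\lambda :=\gamma _n\) and \(\mu :=\gamma _0\)) we have \(\Vert x_n-Tx_n\Vert \le \left( 2+\frac{\gamma _0}{\gamma }\right) \Vert x_n-S_nx_n\Vert \) and so also \(\lim _n \Vert x_n-Tx_n\Vert =0.\) \(\square \)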

Theorem 4.3

Let \((\alpha _n)\subset (0,1]\) be such that \(\lim _n\alpha _n=0\) and \(\sum ^{\infty }_{n=0}\alpha _n=\infty .\) For \(u,x_0\in C\) define the Halpern-type proximal point algorithm (HPPA) by

$$\begin{aligned} x_{n+1}:=\alpha _n u+(1-\alpha _n)J_{\gamma _n A} x_n\in C. \end{aligned}$$

Then \((x_n)\) strongly converges to the zero of A which is closest to u. Moreover, the rate of metastability from [12, Theorem 4.1] also holds for our current situation if \(\omega _{\eta }\) is replaced by \(\omega _{\alpha }\) from Proposition 2.7 above with \(\alpha :=\frac{1}{2((\rho /\gamma )+1)}\) and \(\omega _J(b,\varepsilon ):=\varepsilon .\)

Proof

The strong convergence follows from [2, Theorem 3.1] whose assumptions are satisfied by Propositions 2.4(1), 2.8 and 4.2 using also that H has the fixed point property for nonexpansive mappings. The strong convergence also follows using [12, Theorem 4.1] which, moreover, gives the rate of metastability stated in the theorem. For this we only have to observe that the proof of [12, Theorem 4.1] only uses properties of \(J_{\gamma _n A}\) which by the results stated above also hold true for \(\rho \)-comonotone operators A where now we use \(\omega _{\alpha }\) and Proposition 2.8 instead of \(\omega _{\eta }\) and [12, Lemma 2.4]. Finally, we note that we can take \(\omega _J(b,\varepsilon ):=\varepsilon \) as modulus of uniform continuity for the normalized duality map on \(\overline{B}(0,b)\) since we are in a Hilbert space. \(\square \)
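The HPPA can be tried out on a toy instance in which \(zer\,A\) is not a singleton. The following Python sketch is our own illustration under stated assumptions: \(H=C={\mathbb R}^2\) and \(A(x_1,x_2):=(ax_1,0)\) with \(a<0,\) which is \(\frac{1}{a}\)-comonotone since \(\langle x-y,Ax-Ay\rangle =a(x_1-y_1)^2=\frac{1}{a}\Vert Ax-Ay\Vert ^2.\) Here \(zer\,A=\{ 0\} \times {\mathbb R},\) the resolvent is \(J_{\gamma A}(x_1,x_2)=(x_1/(1+\gamma a),x_2)\) for \(1+\gamma a\not =0,\) and the zero closest to \(u=(u_1,u_2)\) is \((0,u_2).\) The function name hppa_2d and the choice \(\alpha _n:=\frac{1}{n+2}\) are assumptions made for this example.

```python
def hppa_2d(a, gammas, alphas, u, x0):
    """Halpern-type PPA x_{n+1} = alpha_n*u + (1 - alpha_n)*J_{gamma_n A} x_n
    for A(x1, x2) = (a*x1, 0) on R^2, with resolvent
    J_{gamma A}(x1, x2) = (x1 / (1 + gamma*a), x2)."""
    x1, x2 = x0
    for gamma, alpha in zip(gammas, alphas):
        j1, j2 = x1 / (1.0 + gamma * a), x2  # resolvent step
        x1 = alpha * u[0] + (1.0 - alpha) * j1  # Halpern averaging with anchor u
        x2 = alpha * u[1] + (1.0 - alpha) * j2
    return x1, x2


if __name__ == "__main__":
    N = 20000
    gammas = [4.0] * N                           # gamma = 4 > -2*rho for rho = -1
    alphas = [1.0 / (n + 2) for n in range(N)]   # alpha_n -> 0, sum alpha_n = infinity
    print(hppa_2d(-1.0, gammas, alphas, u=(1.0, 2.0), x0=(3.0, -1.0)))
```

For these data the iterates approach \((0,2),\) the zero of A closest to \(u=(1,2);\) with \(\alpha _n=\frac{1}{n+2}\) the approach is slow, of order \(\frac{1}{n}\) in this example.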

Remark 4.4

Remark 3.3 applies here as well: if A is maximally \(\rho \)-comonotone, then the range condition is satisfied for any closed and convex subset \(C\subseteq H\) satisfying \(\overline{D(A)}\subseteq C.\)