1 Introduction

This work is devoted to convergence properties of the Broyden-like method for systems of equations in which some of the equations are linear. Among other results, it provides the first answer to the decades-old question of whether the Broyden-like matrices converge under the standard assumptions for q-superlinear convergence of the iterates, albeit only for a special case.

Given a smooth nonlinear mapping \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\), Broyden’s method [3] aims at finding \(\bar {u}\in \mathbb {R}^{n}\) with:

$$ F(\bar{u}) = 0. $$

It is a well-established member of the class of quasi-Newton methods and shares its local q-superlinear convergence, cf. [9, 14, 15, 21, 23]. The Broyden-like method generalizes Broyden’s method by allowing an additional parameter σk in the matrix update. It reads as follows.

Algorithm 1 (Broyden-like method) Given \(u^{0}\in \mathbb {R}^{n}\), an invertible \(B_{0}\in \mathbb {R}^{n\times n}\) and parameters σk with \(\inf _{k}\sigma _{k}=\sigma _{\min }>0\), iterate for k = 0,1,2,…: if F(uk) = 0, terminate with output u∗ := uk; otherwise set \(s^{k}:=-B_{k}^{-1}F(u^{k})\), \(u^{k+1}:=u^{k}+s^{k}\) and \(B_{k+1}:=B_{k}+\sigma _{k} F(u^{k+1})(s^{k})^{T}/{\lVert s^{k}\rVert }^{2}\).

For (σk) ≡ 1, we recover Broyden’s method. An appropriate choice of σk ensures that Bk+ 1 is invertible whenever Bk is invertible; in fact, by the Sherman-Morrison formula, all choices of σk except a single one maintain invertibility. The Broyden-like method is well known, cf. [22], [28, Section 6] and [16, Algorithm 1].
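To make the scheme concrete, the following sketch implements one iteration of Algorithm 1 together with the Sherman-Morrison scalar that governs invertibility of Bk+1 (the update is rank-one, so det Bk+1 equals det Bk times this scalar). The test problem and initial data are hypothetical; with σk ≡ 1, the loop performs Broyden's method.

```python
import numpy as np

def broyden_like_step(F, u, B, sigma):
    """One iteration of the Broyden-like method (Algorithm 1, sketch)."""
    s = np.linalg.solve(B, -F(u))            # quasi-Newton step s^k = -B_k^{-1} F(u^k)
    u_new = u + s                            # u^{k+1} = u^k + s^k
    # rank-one update B_{k+1} = B_k + sigma_k F(u^{k+1}) (s^k)^T / ||s^k||^2
    B_new = B + sigma * np.outer(F(u_new), s) / (s @ s)
    return u_new, s, B_new

def sherman_morrison_denominator(F, u_new, s, B, sigma):
    """det(B_{k+1})/det(B_k); B_{k+1} is invertible iff this scalar is nonzero."""
    return 1.0 + sigma * (s @ np.linalg.solve(B, F(u_new))) / (s @ s)

# hypothetical mixed system: F_1 nonlinear, F_2 affine
F = lambda u: np.array([u[0] ** 2 + u[1] - 3.0, u[0] + u[1] - 3.0])
u = np.array([1.5, 0.5])
B = np.array([[3.0, 1.0], [1.0, 1.0]])
for k in range(12):
    if np.linalg.norm(F(u)) < 1e-13:
        break                                # root found (up to rounding)
    u, s, B = broyden_like_step(F, u, B, sigma=1.0)
```

Exactly one value of σk makes the scalar vanish; every other choice keeps Bk+1 invertible, in line with the remark above.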

In this work, we consider Algorithm 1 for mixed linear-nonlinear systems of equations. That is, there exists J ⊂ {1,…,n} such that \(F_{j}(u)={a_{j}^{T}} u + b_{j}\), where \(a_{j}\in \mathbb {R}^{n}\) and \(b_{j}\in \mathbb {R}\), for all j ∈ J. In addition, we suppose that the initial matrix B0 agrees with the Jacobian of F in the rows that correspond to (some of) the affine components of F, i.e., \({B_{0}^{j}} = {a_{j}^{T}}\) for all j ∈ J. For j ∉ J, the functions Fj can be nonlinear and \({B_{0}^{j}}\) is not restricted. This framework includes many practically relevant systems of equations. Also, it fits two standard suggestions for the choice of B0, which are to use \(B_{0}=F^{\prime }(u^{0})\) or a finite difference approximation of \(F^{\prime }(u^{0})\). In the following, we speak of exact initialization if \({B_{0}^{j}} = {a_{j}^{T}}\) for all j ∈ J.

This article is divided into four parts. In the first part, we show that exact initialization ensures that the steps (sk)k≥ 1 stay in a subspace \({\mathcal {S}}\) and that they can be generated by applying Algorithm 1 to a lower-dimensional mapping \(G:\mathbb {R}^{d}\rightarrow \mathbb {R}^{d}\), where d is the dimension of \({\mathcal {S}}\). This extends results from [18].

The second part is concerned with the consequences of the first part for the convergence of the Broyden-like matrices (Bk). We point out that it is still largely open whether (Bk) converges and that several renowned researchers have mentioned this issue in their works, cf. the survey articles [8, Example 5.3], [21, p. 117], [14, p. 306] and [2, p. 940]. The convergence of (Bk) is of interest, for example, because it is closely related to the rate of convergence of (uk), see, e.g., Lemmas 2 and 3. For invertible \(F^{\prime }(\bar {u})\), there is only one result available: it is established in [22, Theorem 5.7] and in [17] that if the sequence of steps (sk) is uniformly linearly independent, then (Bk) converges and \(\lim _{k\to \infty } B_{k}=F^{\prime }(\bar {u})\). We include the precise result as Theorem 4. Unfortunately, conditions that imply uniform linear independence of (sk) are unknown, and we are not aware of a single example, be it theoretical or numerical, in which (sk) actually is uniformly linearly independent. In the setting of this work, at any rate, (sk)k≥ 1 is confined to the subspace \({\mathcal {S}}\) and thus violates uniform linear independence. After extending the notion of uniform linear independence to subspaces, we generalize the above convergence result for (Bk) to the setting of this work, cf. Theorem 5. In doing so, we also obtain a formula for the limit of (Bk).

In the third part, we observe that if F has only one nonlinear component function and B0 is initialized exactly, then the generalized convergence result from the second part implies that (Bk) converges whenever the iterates (uk) converge, and this holds for regular and for singular \(F^{\prime }(\bar {u})\), cf. Corollary 2. Although the assumption of only one nonlinear component function is very restrictive, we stress that this is the first time that convergence of (Bk) has been shown for n > 1 and invertible \(F^{\prime }(\bar {u})\). We will also see that even though each Bk agrees with \(F^{\prime }(\bar {u})\) in n − 1 of n rows, the limit of (Bk) is generally not \(F^{\prime }(\bar {u})\).

We continue the third part by paying special attention to the case that σk = 1 for all k ≥ k0 and some k0 ≥ 0, i.e., Algorithm 1 turns into Broyden’s method. The result of the first part implies that in this case, Broyden’s method essentially reduces to the one-dimensional secant method. This yields a comprehensive characterization of the convergence of (uk) including a lower bound for its q-order, which in turn allows us to establish significantly stronger convergence properties of (Bk) than for the Broyden-like method, cf. Theorem 6. For affine F, we prove finite convergence if σk = 1 is selected at least once, cf. Theorem 7. The third part concludes with a brief application of the developed convergence theory to two examples from the literature.

In the last part, we verify the results from the third part in high-precision numerical experiments. Among other findings, we observe that if \(F^{\prime }(\bar {u})\) is invertible, then choosing \((\sigma _{k})_{k\geq k_{0}}\equiv 1\) for some k0 ≥ 0 leads to much faster convergence than, e.g., (σk) ≡ 0.99, whereas this is not the case if \(F^{\prime }(\bar {u})\) is not invertible.

The convergence theory of Broyden’s method and of specific versions of the Broyden-like method is developed in, e.g., [4, 12, 16, 22]. Besides the result mentioned above, there is only one further result available on the convergence of the Broyden(-like) matrices: in [19], it was recently shown for Broyden’s method that if \(F^{\prime }(\bar {u})\) is singular with some additional structure, then \(({\lVert B_{k+1}-B_{k}\rVert })\) converges q-linearly to zero under appropriate assumptions, so (Bk) converges.

For other quasi-Newton updates, convergence results are available. We are aware of results for the SR1 update [5, 11, 30], for the Powell-symmetric-Broyden update [26], for the DFP and the BFGS update [13], and for the convex Broyden class excluding the DFP update [29].

This paper is organized as follows. In Section 2, we collect preparatory results and we present the generalization of uniform linear independence that is useful for subspaces. In Section 3, we prove the subspace property of (sk)k≥ 1 and show that (sk)k≥ 1 can be obtained by applying Algorithm 1 to a suitable mapping \(G:\mathbb {R}^{d}\rightarrow \mathbb {R}^{d}\). Section 4 contains the convergence results for the Broyden-like matrices and the application to examples from the literature. Section 5 presents numerical experiments and Section 6 summarizes.

Notation

We use \(\mathbb {N}=\{1,2,3,\ldots \}\). For \(n\in \mathbb {N}\) we set [n] := {1,2,…,n}, [n]0 := [n] ∪{0} and [0] := ∅. The Euclidean norm of \(v\in \mathbb {R}^{n}\) is \({\lVert v\rVert }\), while \({\lVert A\rVert }\) is the spectral norm if \(A\in \mathbb {R}^{m\times n}\). For \(A\in \mathbb {R}^{m\times n}\), Aj indicates the jth row of A, regarded as a row vector, whereas \(A^{i,j}\in \mathbb {R}\) is the usual notation for entries. The span of \(C\subset \mathbb {R}^{n}\) is indicated by 〈C〉. We will use tacitly that Algorithm 1 cannot generate a step sk satisfying sk = 0. For k ≥ 0, we define:

$$ E_{k}:= B_{k} - F^{\prime}(\bar{u}) \qquad\text{ and }\qquad \hat s^{k} := \frac{s^{k}}{{\left\|s^{k}\right\|}}, $$

where the first definition assumes that Algorithm 1 has generated (Bk) and (uk) with \(\lim _{k\to \infty }u^{k}=\bar {u}\) for some \(\bar {u}\) at which F is differentiable, while the second definition already makes sense if Algorithm 1 has generated sk. We employ the q-order of convergence and the r-order of convergence in this work. They are studied in, e.g., [25, Section 9].

2 Preliminaries

2.1 Convergence of the Broyden-like method

The main convergence result for Algorithm 1 reads as follows.

Theorem 1

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) be differentiable in a neighborhood of \(\bar {u}\) with \(F(\bar {u})=0\) and let \({\lVert F^{\prime }(u)-F^{\prime }(\bar {u})\rVert }\leq L{\lVert u-\bar {u}\rVert }^{\alpha }\) for all u from this neighborhood and constants L,α > 0. Let \(F^{\prime }(\bar {u})\) be invertible. If Algorithm 1 generates a sequence (uk) that satisfies \({\sum }_{k}{\lVert u^{k}-\bar {u}\rVert }^{\alpha }<\infty \), then there holds:

$$ \sum\limits_{k=0}^{\infty} \left( \frac{{\left\|u^{k+1}-\bar{u}\right\|}}{{\left\|u^{k}-\bar{u}\right\|}}\right)^{2} < \infty, $$
(1)

implying that (uk) converges q-superlinearly to \(\bar {u}\).

Moreover, there are δ,ε > 0 such that for every (u0,B0) with \({\lVert u^{0}-\bar {u}\rVert }\leq \delta \) and \({\lVert B_{0}-F^{\prime }(\bar {u})\rVert }\leq \varepsilon \), Algorithm 1 either terminates with output \(u^{\ast }=\bar {u}\) or it generates (uk) such that all Bk are invertible and \({\sum }_{k}{\lVert u^{k}-\bar {u}\rVert }^{\alpha }<\infty \).

Proof

This follows from [20, Theorem 1]. □

If we restrict attention to Broyden’s method instead of the Broyden-like method, then a stronger result is available, namely Gay’s theorem on 2n-step q-quadratic convergence [12, Theorem 3.1]. For mixed linear–nonlinear systems with exact initialization, this result has recently been improved.

Theorem 2

Let \(n\in \mathbb {N}\), d ∈ [n]0 and J := [n] ∖ [d]. Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) satisfy \(F_{j}(u)={a_{j}^{T}} u + b_{j}\) for all j ∈ J, where \(a_{j}\in \mathbb {R}^{n}\) and \(b_{j}\in \mathbb {R}\) for all j ∈ J. Let F be differentiable in a neighborhood of \(\bar {u}\) with \(F(\bar {u})=0\) and let \({\lVert F^{\prime }(u)-F^{\prime }(\bar {u})\rVert }\leq L{\lVert u-\bar {u}\rVert }\) for all u from this neighborhood and a constant L > 0. Let \(F^{\prime }(\bar {u})\) be invertible. Then there are δ,ε > 0 and C > 0 such that for every (u0,B0) with \({\lVert u^{0}-\bar {u}\rVert }\leq \delta \), \({\lVert B_{0}-F^{\prime }(\bar {u})\rVert }\leq \varepsilon \), and \({B_{0}^{j}} = {a_{j}^{T}}\) for all j ∈ J, Algorithm 1 with (σk) ≡ 1 either terminates with output \(u^{\ast }=\bar {u}\) or it generates (uk) that satisfies (1) and:

$$ {\left\|u^{k+2d}-\bar{u}\right\|}\leq C{\left\|u^{k}-\bar{u}\right\|}^{2} \qquad\forall k\geq 1. $$

In particular, (uk) converges q-superlinearly and with r-order at least \(2^{1/(2d)}\) to \(\bar {u}\), and all Bk are invertible.

Proof

See [18]. □

2.2 Convergence of the Broyden-like updates

If (uk) converges and the Broyden-like updates remain bounded, then \(F(\lim _{k\to \infty } u^{k})=0\).

Lemma 1

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) be continuous at \(\bar {u}\). Let (uk) and (Bk) be generated by Algorithm 1. Suppose that \(u^{k}\to \bar {u}\) and \(\sup _{k\geq 0}{\lVert B_{k+1}-B_{k}\rVert }<\infty \). Then \(F(\bar {u})=0\).

Proof

From \(\sup _{k\geq 0}{\lVert B_{k+1}-B_{k}\rVert }<\infty \), we infer \(\sup _{k\geq 0}\frac {{\lVert F(u^{k+1})\rVert }}{{\lVert s^{k}\rVert }}<\infty \). The convergence of (uk) yields \(\lim _{k\to \infty }{\lVert s^{k}\rVert }=0\), so \(\lim _{k\to \infty }{\lVert F(u^{k})\rVert }=0\), whence \(F(\bar {u})=0\). □
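The first inference in this proof uses that the Broyden-like update is rank-one: since \(B_{k+1}-B_{k}=\sigma _{k}F(u^{k+1})(s^{k})^{T}/{\lVert s^{k}\rVert }^{2}\) and the spectral norm of an outer product abT is ∥a∥∥b∥, one has \({\lVert B_{k+1}-B_{k}\rVert }=\sigma _{k}{\lVert F(u^{k+1})\rVert }/{\lVert s^{k}\rVert }\) exactly. A quick numerical check of this identity with arbitrary (hypothetical) data:

```python
import numpy as np

sigma = 0.8
F_next = np.array([0.3, -1.2, 0.5])   # stands in for F(u^{k+1})
s = np.array([1.0, 2.0, -1.0])        # stands in for the step s^k

# rank-one Broyden-like update B_{k+1} - B_k
update = sigma * np.outer(F_next, s) / (s @ s)

lhs = np.linalg.norm(update, 2)                           # spectral norm
rhs = sigma * np.linalg.norm(F_next) / np.linalg.norm(s)  # sigma ||F|| / ||s||
```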

If (uk) and the Broyden-like matrices converge, then the convergence of (uk) is q-superlinear.

Lemma 2

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) be differentiable at \(\bar {u}\) with \(F^{\prime }(\bar {u})\) invertible. Let (uk) and (Bk) be generated by Algorithm 1. Suppose that \(u^{k}\to \bar {u}\) and \({\lVert B_{k+1}-B_{k}\rVert }\to 0\) for \(k\to \infty \). Then (uk) converges q-superlinearly to \(\bar {u}\).

Proof

Due to the invertibility of \(F^{\prime }(\bar {u})\) and \(u^{k}\to \bar {u}\), there is C > 0 such that:

$$ \begin{array}{llll} {\left\|u^{k+1}-\bar{u}\right\|} & \leq C {\left\|F(u^{k+1})-F(\bar{u})\right\|} = \frac{C}{\sigma_{k}} {\left\|B_{k+1}-B_{k}\right\|}{\left\|s^{k}\right\|}\\ & \leq \frac{C}{\sigma_{\min}}{\left\|B_{k+1}-B_{k}\right\|}\left( {\left\|u^{k+1}-\bar{u}\right\|}+{\left\|u^{k}-\bar{u}\right\|}\right) \end{array} $$

for all k sufficiently large. Here, we also used that \(F(\bar {u})=0\) by Lemma 1. Subtracting \(\frac {C}{\sigma _{\min \limits }}{\lVert B_{k+1}-B_{k}\rVert }{\lVert u^{k+1}-\bar {u}\rVert }\) and taking the limit yields the claim. □

Next we show that convergence of (uk) with q-order at least γ > 1 implies convergence of \(({\lVert B_{k+1}-B_{k}\rVert })\) with r-order at least γ, cf. also [25, 9.1.8&9.2.7].

Lemma 3

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) and let (uk) and (Bk) be generated by Algorithm 1. Suppose that (uk) converges to some \(\bar {u}\) and that F satisfies \({\lVert F(u)-F(\bar {u})\rVert }\leq L{\lVert u-\bar {u}\rVert }\) for all u in a neighborhood of \(\bar {u}\) and some constant L > 0. Let γ > 1.

  1.

    If \(F(\bar {u})=0\) and there is C > 0 such that for all k sufficiently large:

    $$ {\left\|u^{k+1}-\bar{u}\right\|}\leq C {\left\|u^{k}-\bar{u}\right\|}^{\gamma} $$
    (2)

    is satisfied, then there exists \(\hat C>0\) such that:

    $$ {\left\|B_{k+1}-B_{k}\right\|} \leq \hat C {\left\|u^{k}-\bar{u}\right\|}^{\gamma-1} $$
    (3)

    for all sufficiently large k.

  2.

    If \(C,\hat C>0\) exist such that (2) and (3) are satisfied for all sufficiently large k, then we have \(F(\bar {u})=0\) and \(\lim _{k\to \infty }\lVert {B_{k+1}-B_{k}}\rVert ^{\frac {1}{p^{k}}}=0\) for all p ∈ [1,γ). In particular, \({\sum }_{k}\lVert {B_{k+1}-B_{k}}\rVert <\infty \) and (Bk) converges.

Proof

  • Proof of 1: Since (2) implies q-superlinear convergence of (uk), we obtain from a well-known result of Dennis and Moré that \({\lVert u^{k}-\bar {u}\rVert }/{\lVert s^{k}\rVert }\to 1\) for \(k\to \infty \), cf. [7, Lemma 2.1]. The Lipschitz-type property of F at \(\bar {u}\), \(F(\bar {u})=0\) and (2) hence yield:

    $$ {\left\|B_{k+1}-B_{k}\right\|} = \sigma_{k}\frac{{\lVert F(u^{k+1})-F(\bar{u})\rVert}}{{\lVert s^{k}\rVert}} \leq \hat C{\left\|u^{k}-\bar{u}\right\|}^{\gamma-1} $$

    for all sufficiently large k and a constant \(\hat C>0\), which proves (3).

  • Proof of 2: Lemma 1 yields \(F(\bar {u})=0\) due to (3). To prove the remaining claims it suffices to establish that

    $$ \lim_{k\to\infty}\left( {\lVert B_{k+1}-B_{k}\rVert}^{\frac{1}{\gamma-1}}\right)^{\frac{1}{p^{k}}}=0 \qquad \forall p\in[1,\gamma). $$
    (4)

    As (uk) has q-order at least γ by (2), its r-order is also at least γ, cf. [25, 9.3.2], thus \(\lim _{k\to \infty }{\lVert u^{k}-\bar {u}\rVert }^{\frac {1}{p^{k}}}=0\) for all p ∈ [1,γ), so (4) follows from (3).

Remark 1

For Broyden’s method, it is unknown whether (2) holds for any γ > 1 if n > 1, cf. also [18]. For n = 1, it is known that (2) holds with γ equal to the golden mean [31]. In Theorem 6, we show that this result extends to arbitrary n provided F has n − 1 affine component functions and B0 is initialized exactly.
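The golden-mean order of the one-dimensional secant method is easy to observe numerically. The following sketch (hypothetical test function \(f(x)=x^{2}-2\)) estimates the q-order from consecutive errors; the estimates approach \(\varphi =\frac {1+\sqrt 5}{2}\approx 1.618\) before rounding takes over.

```python
import math

def secant(f, x0, x1, iters=7):
    """Secant method: the one-dimensional instance of Broyden's method."""
    xs = [x0, x1]
    for _ in range(iters):
        f0, f1 = f(xs[-2]), f(xs[-1])
        xs.append(xs[-1] - f1 * (xs[-1] - xs[-2]) / (f1 - f0))
    return xs

f = lambda x: x * x - 2.0
root = math.sqrt(2.0)
xs = secant(f, 1.0, 2.0)
errs = [abs(x - root) for x in xs]
# q-order estimates log(e_{k+1}/e_k)/log(e_k/e_{k-1}), skipping rounding-level errors
orders = [math.log(errs[k + 1] / errs[k]) / math.log(errs[k] / errs[k - 1])
          for k in range(1, len(errs) - 1) if errs[k + 1] > 1e-14]
```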

2.3 Uniform linear independence of dimension d

The following definition is the appropriate generalization of uniform linear independence for the purposes of this paper.

Definition 1

Let \(n\in \mathbb {N}\) and \(d\in \mathbb {N}\). The sequence of vectors \((s^{k})\subset \mathbb {R}^{n}\setminus \{0\}\) is called uniformly linearly independent of dimension d iff there exist constants \(m\in \mathbb {N}\) and ρ > 0 such that for every sufficiently large k the set:

$$ \bigl\{ s^{k}, s^{k+1}, \ldots, s^{k+m} \bigr\} $$

contains d vectors \(s^{k_{1}}, \ldots , s^{k_{d}}\) such that all singular values of the matrix:

$$ \begin{pmatrix} \frac{s^{k_{1}}}{{\lVert s^{k_{1}}\rVert}} & \frac{s^{k_{2}}}{{\lVert s^{k_{2}}\rVert}} & {\ldots} & \frac{s^{k_{d}}}{{\lVert s^{k_{d}}\rVert}} \end{pmatrix} \in\mathbb{R}^{n\times d} $$

are larger than ρ.

Remark 2

The usual notion of uniform linear independence, cf. [5, (AS.4)], is recovered for d = n. If d is not specified, then it is understood that d = n.
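For a given window of steps, the condition in Definition 1 can be tested directly: normalize the d selected steps, collect them as columns and compare the smallest singular value against ρ. A small sketch with hypothetical vectors:

```python
import numpy as np

def well_spread(steps, rho):
    """True iff the normalized steps, as columns of an n x d matrix,
    have all singular values larger than rho (the test in Definition 1)."""
    M = np.column_stack([s / np.linalg.norm(s) for s in steps])
    return np.linalg.svd(M, compute_uv=False).min() > rho

# two orthogonal directions in R^3: both singular values equal 1
good = well_spread([np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 0.0])], rho=0.5)
# two antiparallel directions: smallest singular value is 0
bad = well_spread([np.array([1.0, 1.0, 0.0]), np.array([-2.0, -2.0, 0.0])], rho=0.5)
```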

3 Behavior of the Broyden-like method on mixed systems

To conveniently state results for mixed linear–nonlinear systems of equations, we will use the following assumption.

Assumption 1

Let \(n\in \mathbb {N}\), d ∈ [n]0 and J := [n] ∖ [d]. Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) satisfy \(F_{j}(u)={a_{j}^{T}} u + b_{j}\) for all j ∈ J, where \(a_{j}\in \mathbb {R}^{n}\) and \(b_{j}\in \mathbb {R}\) for all j ∈ J. Let \(B_{0}\in \mathbb {R}^{n\times n}\) satisfy \({B_{0}^{j}}={a_{j}^{T}}\) for all j ∈ J and suppose that B0 is invertible.

Remark 3

Due to \({B_{0}^{j}}={a_{j}^{T}}\) for all j ∈ J and the invertibility of B0, Assumption 1 implies \(\dim ({\langle \{a_{j}\}_{j\in J}\rangle })=n-d\), hence \(\dim ({\langle \{a_{j}\}_{j\in J}\rangle }^{\perp })=d\).

The first result establishes basic properties of Algorithm 1 under Assumption 1. It generalizes [18, Lemma 2.1].

Lemma 4

Let Assumption 1 hold and let (uk), (sk) and (Bk) be generated by Algorithm 1. Then we have for each j ∈ J and all k ≥ 1 the identities \({B_{k}^{j}} = {a_{j}^{T}}\), Fj(uk) = 0, \({a_{j}^{T}} s^{k}=0\) and Bkaj = B1aj.

Proof

The proof of [18, Lemma 2.1] applies without changes. □

Under the assumptions of Lemma 4, the sequence (sk) necessarily violates uniform linear independence unless J = ∅.

Corollary 1

Any selection \(\{s^{k_{1}}, \ldots , s^{k_{d+1}}\}\) of d + 1 vectors from the sequence (sk)k≥ 1 of Lemma 4 is linearly dependent.

Proof

Lemma 4 yields \({a_{j}^{T}} s^{k}=0\) for all j ∈ J and all k ≥ 1, thus \(s^{k}\in {\langle \{a_{j}\}_{j\in J}\rangle }^{\perp }\) for all k ≥ 1. The claim follows from \(\dim ({\langle \{a_{j}\}_{j\in J}\rangle }^{\perp })=d\). □
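The invariants of Lemma 4 and Corollary 1 are easy to observe numerically: with exact initialization, the rows j ∈ J of Bk never change, Fj(uk) vanishes for k ≥ 1, and every step after the first lies in the d-dimensional subspace orthogonal to all aj. A sketch on a hypothetical mixed system with n = 3, d = 1, J = {2,3} and σk ≡ 1:

```python
import numpy as np

# hypothetical mixed system: F_1 nonlinear, F_2 and F_3 affine with rows a_2, a_3
a2, b2 = np.array([1.0, 1.0, 0.0]), -2.0
a3, b3 = np.array([0.0, 1.0, 1.0]), -3.0
F = lambda u: np.array([u[0] ** 3 + u[1] - u[2] - 1.0, a2 @ u + b2, a3 @ u + b3])

u = np.array([1.5, 0.5, 2.5])
B = np.array([[3.0, 1.0, -1.0], a2, a3])   # exact initialization in rows 2 and 3
steps, mats = [], []
for k in range(8):
    if np.linalg.norm(F(u)) < 1e-12:
        break
    s = np.linalg.solve(B, -F(u))          # step of Algorithm 1
    u = u + s
    B = B + np.outer(F(u), s) / (s @ s)    # Broyden update (sigma_k = 1)
    steps.append(s)
    mats.append(B)
```

Afterwards, every Bk keeps the rows a2T and a3T, all steps s1, s2, … are collinear (Corollary 1 with d = 1), and the iterates converge to a root even though the steps are far from uniformly linearly independent.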

To conveniently state the next result, we introduce some notation.

Definition 2

Let Assumption 1 hold. We set \({\mathcal {A}}:={\langle \{a_{j}\}_{j\in J}\rangle }\) and \({\mathcal {S}}:={\mathcal {A}}^{\perp }\). Furthermore, we let \(\{{\mathfrak {s}}^{i}\}_{i\in [d]}\) be an orthonormal basis of \({\mathcal {S}}\) and we denote \(S:=\begin {pmatrix} {\mathfrak {s}}^{1} & {\ldots } & {\mathfrak {s}}^{d} \end {pmatrix}\in \mathbb {R}^{n\times d}\). For any matrix \(B\in \mathbb {R}^{n\times n}\), we denote:

$$ \widetilde B:=\begin{pmatrix}B^{1} \\ {\vdots} \\ B^{d}\end{pmatrix}\in\mathbb{R}^{d\times n} \qquad\text{ and similarly }\qquad \widetilde F(u):=\begin{pmatrix}F_{1}(u) \\ {\vdots} \\ F_{d}(u)\end{pmatrix}. $$
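In computations, the matrix S of Definition 2 can be obtained from the right singular vectors of the matrix that stacks the rows ajT: the singular vectors belonging to the zero singular values form an orthonormal basis of the null space \({\mathcal {S}}={\mathcal {A}}^{\perp }\). A sketch with a hypothetical row a1 = (1,1,1)T, so n = 3 and d = 2:

```python
import numpy as np

def orthonormal_basis_of_complement(A_rows):
    """Columns span the orthogonal complement of span{a_j} (the space S)."""
    A = np.atleast_2d(np.array(A_rows, dtype=float))    # (n-d) x n, rows a_j^T
    _, sv, Vt = np.linalg.svd(A)                        # full SVD, Vt is n x n
    rank = int((sv > 1e-12 * sv.max()).sum())
    return Vt[rank:].T                                  # n x (n - rank)

S = orthonormal_basis_of_complement([[1.0, 1.0, 1.0]])
```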

We show that under Assumption 1, the iterates (uk)k≥ 1 obtained by applying Algorithm 1 to F can also be generated by applying it to a lower-dimensional mapping \(G:\mathbb {R}^{d}\rightarrow \mathbb {R}^{d}\). The following result extends [18, Theorem 2.3].

Theorem 3

Let Assumption 1 hold and let (uk), (Bk) and (σk) be generated by Algorithm 1, where each Bk is assumed to be invertible. Define:

$$ G:\mathbb{R}^{d}\rightarrow\mathbb{R}^{d}, \qquad G(w):=\widetilde F(u^{1} + S w) $$

as well as:

$$ C_{0}:=\widetilde B_{1} S\in\mathbb{R}^{d\times d}, \qquad w^{0}:=0\in\mathbb{R}^{d}, \qquad\text{ and }\qquad \tau_{k}:=\sigma_{k+1}\quad\forall k\geq 0. $$

Then the application of Algorithm 1 to G with initial guess (w0,C0) and updating sequence (τk) generates sequences (wk) and (Ck) with the following properties:

  1.

    Each Ck is invertible and for all k ≥ 1, there hold:

    $$ u^{k} = u^{1} + S w^{k-1}, \qquad \widetilde F(u^{k}) = G(w^{k-1}) \qquad\text{and}\qquad C_{k-1} = \widetilde B_{k} S. $$
    (5)
  2.

    The iterates (uk) converge to \(\bar {u}\in \mathbb {R}^{n}\) if and only if there is \(\bar w\in \mathbb {R}^{d}\) such that (wk) converges to \(\bar w\). If (uk) and (wk) converge to \(\bar {u}\) and \(\bar w\), respectively, then we have for all k ≥ 1:

    $$ \bar u = u^{1} + S\bar w \qquad\text{ and }\qquad {\left\|u^{k}-\bar{u}\right\|} = {\left\|w^{k-1}-\bar w\right\|}. $$
    (6)
  3.

    The matrices (Bk) converge to \(B\in \mathbb {R}^{n\times n}\) if and only if there is \(C\in \mathbb {R}^{d\times d}\) such that (Ck) converges to C. If (Bk) and (Ck) converge to B and C, respectively, then we have for all k ≥ 1:

    $$ C = \widetilde B S \qquad\text{ and }\qquad {\left\|C_{k}-C\right\|} = {\left\|B_{k}-B\right\|}. $$

Proof

  • Proof of 1: The proof of [18, Theorem 2.3], which is for (σk) ≡ 1, can be used almost verbatim.

  • Proof of 2: We will use several times that \({\lVert S v\rVert }={\lVert v\rVert }\) for all \(v\in \mathbb {R}^{d}\) because the columns of S are orthonormal. Let (uk) converge to \(\bar {u}\). From (5), it follows that \(u^{n}-u^{m} = S(w^{n-1}-w^{m-1})\) for all n,m ≥ 1, which implies that (wk) is a Cauchy sequence, hence convergent. Denoting the limit by \(\bar w\), we deduce from (5) that \(\bar {u} = u^{1} + S\bar w\), which in turn yields \({\lVert u^{k}-\bar {u}\rVert } = {\lVert S(w^{k-1}-\bar w)\rVert }\), hence \({\lVert u^{k}-\bar {u}\rVert } = {\lVert w^{k-1}-\bar w\rVert }\). If (wk) converges to \(\bar w\), then we can argue similarly.

  • Proof of 3: Let (Bk) converge to B. From (5), it follows that \({\lVert C_{n-1}-C_{m-1}\rVert }\leq {\lVert \widetilde B_{n}-\widetilde B_{m}\rVert }={\lVert B_{n}-B_{m}\rVert }\) for all n,m ≥ 1, where we used that \({\lVert S\rVert }=1\) and that \({B_{n}^{j}} - {B_{m}^{j}} = 0\) for all j ∈ J due to Lemma 4. This implies that (Ck) is a Cauchy sequence, hence convergent. Denoting the limit by C, we deduce from (5) that \(C = \widetilde B S\). Let now (Ck) converge to C. We denote by \(A\in \mathbb {R}^{n\times (n-d)}\) the matrix:

    $$ A := \begin{pmatrix} \mathfrak{a}^{1} & {\ldots} & \mathfrak{a}^{n-d} \end{pmatrix}, $$

    where \(\{\mathfrak {a}^{i}\}_{i\in [n-d]}\) is an orthonormal basis of \({\mathcal {A}}\). Furthermore, let \(\hat S\in \mathbb {R}^{n\times n}\) be given by \(\hat S:=\begin {pmatrix}S & A \end {pmatrix}\). Since \({B_{k}^{j}} S = {a_{j}^{T}} S = 0\) and BkA = B1A for all j ∈ J and all k ≥ 1 by Lemma 4, we infer that:

    $$ B_{k} \hat S = \begin{pmatrix} \begin{array}{ccc}\widetilde B_{k} S \\ 0 \end{array} \biggl\lvert\biggr. & B_{k} A \end{pmatrix} = \begin{pmatrix} \begin{array}{ccc}C_{k-1} \\ 0 \end{array} \biggl\lvert\biggr. & B_{1} A \end{pmatrix}, $$
    (7)

    where we also used the identity \(\widetilde B_{k} S = C_{k-1}\) from (5). Since \(\hat S \hat S^{T} = I\), it follows that:

    $$ B_{k} = \begin{pmatrix} \begin{array}{ccc}C_{k-1} \\ 0 \end{array} \biggl\lvert\biggr. & B_{1} A \end{pmatrix} \hat S^{T} $$

    for all k ≥ 1. Since (Ck) converges, we see that (Bk) converges, too. Denoting the limit of (Bk) by B, we conclude from (5) that \(C = \widetilde B S\) and from (7) that \({\lVert C_{k-1}-C\rVert }={\lVert (B_{k} - B)\hat S\rVert } ={\lVert B_{k} - B\rVert }\), where we used that \(\hat S\) is orthogonal.
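Theorem 3 can be verified numerically: run Algorithm 1 on F, then on the reduced mapping G with the data (w0, C0) defined above, and compare the two runs via (5). A sketch on a hypothetical 2 × 2 system with one affine row, using σk ≡ 1 (hence τk ≡ 1):

```python
import numpy as np

# hypothetical system: F_1 nonlinear, F_2(u) = a^T u + b affine (n = 2, d = 1)
a, b = np.array([1.0, 1.0]), -3.0
F = lambda u: np.array([u[0] ** 2 + u[1] - 3.0, a @ u + b])

def broyden(F, u, B, iters):
    """Algorithm 1 with sigma_k = 1; records iterates and matrices."""
    us, Bs = [u], [B]
    for _ in range(iters):
        s = np.linalg.solve(B, -F(u))
        u = u + s
        B = B + np.outer(F(u), s) / (s @ s)
        us.append(u)
        Bs.append(B)
    return us, Bs

u0, B0 = np.array([1.5, 0.5]), np.array([[3.0, 1.0], a])  # exact row 2
us, Bs = broyden(F, u0, B0, 6)

S = np.array([[1.0], [-1.0]]) / np.sqrt(2.0)   # orthonormal basis of {a}^perp
G = lambda w: F(us[1] + S @ w)[:1]             # reduced mapping G: R^1 -> R^1
w0, C0 = np.zeros(1), Bs[1][:1] @ S            # C_0 = B_1^1 S  (1 x 1)
ws, Cs = broyden(G, w0, C0, 5)
```

Up to rounding, the two runs satisfy \(u^{k}=u^{1}+Sw^{k-1}\) and \(C_{k-1}=\widetilde B_{k}S\) for all k ≥ 1, as stated in (5).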

Remark 4

Theorem 3 does not require invertibility of \(F^{\prime }(\bar {u})\), which allows us to derive results for singular \(F^{\prime }(\bar {u})\), too, cf. Theorems 6 and 7.

4 Convergence of the Broyden-like matrices

4.1 The general result

From [22, Theorem 5.7], we recall the following sufficient condition for convergence of (Bk) to \(F^{\prime }(\bar {u})\).

Theorem 4

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) be strictly differentiable at \(\bar {u}\). Let (uk), (sk) and (Bk) be generated by Algorithm 1. Let (uk) converge to \(\bar {u}\) and let (sk) be uniformly linearly independent. Then \(B:=\lim _{k\to \infty } B_{k}\) exists and satisfies \(B=F^{\prime }(\bar {u})\). Moreover, we have \(F(\bar {u})=0\). If, in addition, \(F^{\prime }(\bar {u})\) is invertible, then (uk) converges q-superlinearly.

Proof

There are three differences compared with [22, Theorem 5.7]. The first is that we have replaced continuous differentiability of F by strict differentiability; it is easy to verify that the proof of [22, Theorem 5.7] still holds under this weaker assumption. The second and third differences are the added statements that \(F(\bar {u})=0\) and that (uk) converges q-superlinearly, which follow from Lemma 1 and Lemma 2, respectively. □

Corollary 1 shows that for mixed linear–nonlinear systems with exact initialization, the uniform linear independence required in Theorem 4 does not hold. The following result extends Theorem 4 to mixed systems. We recall that the matrix S is introduced in Definition 2.

Theorem 5

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\). Let Assumption 1 hold and let (uk), (sk) and (Bk) be generated by Algorithm 1, where each Bk is assumed to be invertible. Let (uk) converge to \(\bar {u}\) and suppose that \(w\mapsto \widetilde F(\bar {u}+Sw)\) is strictly differentiable at w = 0. Let (sk) be uniformly linearly independent of dimension d. Then \(B:=\lim _{k\to \infty } B_{k}\) exists and satisfies \(\widetilde B S = \widetilde F^{\prime }(\bar {u}) S\), Baj = B1aj and \(B^{j} = {a_{j}^{T}}=F_{j}^{\prime }(\bar {u})\) for all j ∈ J. Moreover, we have \(F(\bar {u})=0\). If \(\widetilde F^{\prime }(\bar {u}) S\) is invertible, then (uk) converges q-superlinearly. If F is strictly differentiable at \(\bar {u}\), then \(E:=\lim _{k\to \infty } E_{k}\) exists and satisfies \(E = E_{1}(I-SS^{T})\).

Proof

For d = n, we have J = ∅, \(\widetilde E=E\) and \(S\in \mathbb {R}^{n\times n}\) is orthogonal, so the result is equivalent to Theorem 4 and there is nothing to prove. For d < n, we begin by noting that Lemma 4 yields \({B_{k}^{j}} = {a_{j}^{T}}\) and Bkaj = B1aj for all j ∈ J and all k ≥ 1, which carries over to \(\lim _{k\to \infty } B_{k}\) if it exists. Next we show the existence of \(\lim _{k\to \infty } B_{k}\). By applying Theorem 3, we obtain sequences (Ck) and (wk) and a point \(\bar w\) as stated in that theorem. Part 3 of that theorem shows that for convergence of (Bk), it suffices to demonstrate the convergence of (Ck). Denoting \({s_{w}^{k}}:=w^{k+1}-w^{k}\), we now prove that \(({s_{w}^{k}})\subset \mathbb {R}^{d}\setminus \{0\}\) is uniformly linearly independent (of dimension d). Indeed, using (5), we have:

$$ \hat s^{k} = \frac{S s_{w}^{k-1}}{{\lVert s^{k}\rVert}} = \frac{S(w^{k}-w^{k-1})}{{\lVert S(w^{k}-w^{k-1})\rVert}} = \frac{S(w^{k}-w^{k-1})}{{\lVert w^{k}-w^{k-1}\rVert}}. $$

This implies that the matrix \(\hat S^{k}\) appearing in the definition of uniform linear independence of dimension d of (sk) and the matrix appearing in the definition of uniform linear independence of \(({s_{w}^{k}})\) have identical singular values, so the uniform linear independence of dimension d of (sk) implies the uniform linear independence of \(({s_{w}^{k}})\). The uniform linear independence of \(({s_{w}^{k}})\) and the results of Theorem 3 allow us to apply Theorem 4 to G, (wk), \(({s_{w}^{k}})\), and (Ck). This yields convergence of (Ck) to \(G^{\prime }(\bar w)=\widetilde F^{\prime }(\bar {u}) S\), which by part 3 of Theorem 3 implies \(\widetilde BS = \widetilde F^{\prime }(\bar {u}) S\). Since (Bk) converges, Lemma 1 supplies \(F(\bar {u})=0\) and Theorem 4 implies q-superlinear convergence of (wk), from which the q-superlinear convergence of (uk) follows by use of (6). If F is strictly differentiable at \(\bar {u}\), then the claims for B imply that E exists and satisfies \(\widetilde E S = 0\) as well as Ej = 0 and Eaj = E1aj for all j ∈ J. It is easy to see that these conditions are equivalent to \(E = E_{1}(I-SS^{T})\). □

Remark 5

  1.

    If F is strictly differentiable at \(\bar {u}\), then \(\widetilde F(\bar {u}+Sw)\) is strictly differentiable at w = 0. If \(F^{\prime }(\bar {u})\) is invertible, then \(\widetilde F^{\prime }(\bar {u})S\) is invertible.

  2.

    To illustrate the conditions obtained for B, let us consider the case that \({\mathcal {S}}=\{(s_{1},s_{2},\ldots ,s_{n})^{T}\in \mathbb {R}^{n}: s_{j}=0 \forall j>d\}\). In this case, we can use for S the first d columns of the n × n identity matrix. Thus, \(\widetilde {B} S\) consists of the entries Bi,j, i,j ∈ [d], and \(\widetilde B S = \widetilde F^{\prime }(\bar {u})S\) states that the first d × d block of B agrees with the respective block of \(F^{\prime }(\bar {u})\). From Baj = B1aj for all j ∈ J, we obtain in addition that the entries Bi,j, i ∈ [d], j ∈ [n] ∖ [d], are the same as in B1. If F is strictly differentiable at \(\bar {u}\), then this implies that Bi,j, i ∈ [d], j ∈ [n] ∖ [d], cannot equal the respective entries of \(F^{\prime }(\bar {u})\) if the rank of \((E_{0}^{i,j})_{i\in [d],j\in [n]\setminus [d]}\) is larger than one.

4.2 The special case d = 1

Sufficient conditions for uniform linear independence of \((s^{k})\subset \mathbb {R}^{n}\) are unknown for Broyden’s method if n > 1 (hence also for the more general Algorithm 1). However, any sequence \((s^{k})\subset \mathbb {R}^{n}\setminus \{0\}\) is uniformly linearly independent of dimension 1, hence Theorem 5 implies the following result.

Corollary 2

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\). Let Assumption 1 hold for d = 1 and let (uk), (sk) and (Bk) be generated by Algorithm 1, where each Bk is assumed to be invertible. Let (uk) converge to \(\bar {u}\) and suppose that \(t\mapsto F_{1}(\bar {u}+t\bar s)\) is strictly differentiable at t = 0, where \(\bar s:=S\). Then \(B:=\lim _{k\to \infty } B_{k}\) exists and satisfies \(B^{1} \bar s = F_{1}^{\prime }(\bar {u})(\bar s)\), \(B^{1} a_{j} = {B_{1}^{1}} a_{j}\) and \(B^{j} = {a_{j}^{T}}=F_{j}^{\prime }(\bar {u})\) for all j > 1. Moreover, we have \(F(\bar {u})=0\). If \(F_{1}^{\prime }(\bar {u})(\bar s)\neq 0\), then (uk) converges q-superlinearly. If F1 is strictly differentiable at \(\bar {u}\), then \(E:=\lim _{k\to \infty } E_{k}\) exists and satisfies \(E^{1} = {E_{1}^{1}} (I-\bar s \bar s^{T})\) and Ej = 0 for all j > 1; in particular, (Bk) converges to \(F^{\prime }(\bar {u})\) iff \({E_{1}^{1}} a_{j} = 0\) for all j > 1.

Remark 6

Under the assumptions of Corollary 2, each Bk agrees with \(F^{\prime }(\bar {u})\) in all rows except the first and \(B:=\lim _{k\to \infty }B_{k}\) exists, yet B will usually be different from \(F^{\prime }(\bar {u})\) (provided \(F^{\prime }(\bar {u})\) exists). If, say, \(\bar s\) is the first canonical unit vector, then \(E^{1} = \begin {pmatrix}0 & E_{1}^{1,2} & {\ldots } & E_{1}^{1,n}\end {pmatrix}\); hence, E = 0 holds iff \(B_{1}^{1,j}=\left [F_{1}^{\prime }(\bar {u})\right ]_{j}\) for all j > 1, where \([F_{1}^{\prime }(\bar {u})]_{j}\) indicates the jth component of the vector \(F_{1}^{\prime }(\bar {u})\). This also shows that if \({\lVert E_{0}\rVert }\) is large, then \({\lVert E\rVert }\) will usually be large, too. The numerical results in Section 5 and our numerical experience from other work confirm that (Bk) will frequently not converge to \(F^{\prime }(\bar {u})\) and indicate that this also holds in more nonlinear settings.

We now focus on Broyden’s method, where (σk) ≡ 1. In fact, it is enough if σk = 1 for all k sufficiently large. For this case, we can strengthen the findings of Corollary 2 in several ways, for instance by providing orders of convergence for (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\). These results are derived by exploiting the fact that if σk = 1 for a \(k\in \mathbb {N}\), then sk+ 1 and thus uk+ 2 can also be generated by the one-dimensional secant method, cf. the proof of part 1 of Theorem 6. Correspondingly, let us first argue for the one-dimensional case.

Lemma 5

Let \(G:\mathbb {R}\rightarrow \mathbb {R}\). Let (wk), \(({s_{w}^{k}})\) and (Ck) be generated by Algorithm 1 applied to G, using an update sequence (τk) that satisfies:

$$ \lim_{k\to\infty}\frac{\tau_{k+1}}{\tau_{k}} = 1. $$

Let (wk) converge to \(\bar w\) with \(G(\bar w)=0\). For k ≥ 0, respectively k ≥ 1, define:

$$ {q_{k}^{G}} := \frac{{\lvert w^{k+1}-\bar{w}\rvert}}{{\lvert w^{k}-\bar{w}\rvert}} \qquad\text{ and }\qquad {Q_{k}^{G}} := \frac{{\lvert C_{k+1}-C_{k}\rvert}}{{\lvert C_{k}-C_{k-1}\rvert}}. $$

Then the following statements hold:

  1.

    Let G be differentiable at \(\bar w\) with \(G^{\prime }(\bar w)\neq 0\). Let \(\varphi :=\frac {1+\sqrt 5}{2}\) and suppose that:

    $$ \lim_{k\to\infty}\frac{{\lvert w^{k+1}-\bar w\rvert}}{{\lvert w^{k}-\bar w\rvert}^{\varphi}} $$
    (8)

    exists. Then we have:

    $$ \lim_{k\to\infty}\frac{{Q_{k}^{G}}}{q_{k-2}^{G}} = 1. $$

    If, in addition, \(\lim _{k\to \infty }\tau _{k}=1\) is satisfied, then there holds:

    $$ \lim_{k\to\infty}\frac{{\lvert C_{k+1}-C_{k}\rvert}}{{\lvert C_{k}-C_{k-1}\rvert}^{\varphi}}={\lvert G^{\prime}(\bar w)\rvert}^{1-\varphi}. $$
  2.

    Let \(m_{0}\in \mathbb {N}\), κ ∈ (0,1) and \(\hat \kappa >0\). Let G be m0 + 1 times differentiable at \(\bar w\). Let \(G^{(m)}(\bar w)=0\) for all m ∈ [m0] and \(G^{(m_{0}+1)}(\bar w)\neq 0\). Suppose that:

    $$ \lim_{k\to\infty} {q_{k}^{G}} = \kappa \qquad\text{ and }\qquad \lim_{k\to\infty}\frac{{\lvert {s_{w}^{k}}\rvert}}{{\lvert w^{k}-\bar w\rvert}} = \hat\kappa $$

    are satisfied. Then we have:

    $$ \lim_{k\to\infty}{Q_{k}^{G}} = \kappa^{m_{0}}. $$

Proof

  • Proof of 1: Using \(G(\bar w)=0,\) we find:

    $$ \begin{array}{llll} \frac{\tau_{k-1}}{\tau_{k}}\cdot\frac{{\lvert C_{k+1}-C_{k}\rvert}}{{\lvert C_{k}-C_{k-1}\rvert}} & = \frac{{\lvert G(w^{k+1})\rvert}{\lvert s_{w}^{k-1}\rvert}}{{\lvert {s_{w}^{k}}\rvert}{\lvert G(w^{k})\rvert}}\\ & = \frac{{\lvert G^{\prime}(\bar{w})(w^{k+1}-\bar{w})+o({\lvert w^{k+1}-\bar{w}\rvert})\rvert}{\lvert s_{w}^{k-1}\rvert}}{{\lvert {s_{w}^{k}}\rvert}{\lvert G^{\prime}(\bar{w})(w^{k}-\bar{w})+o({\lvert w^{k}-\bar{w}\rvert})\rvert}} \end{array} $$

    for all k ≥ 1. As (8) implies that (wk) converges q-superlinearly, a well-known lemma of Dennis and Moré, cf. [7, Lemma 2.1], yields \(\lim _{k\to \infty }\frac {{\lvert {s_{w}^{k}}\rvert }}{{\lvert w^{k}-\bar {w}\rvert }}=1\). Therefore, we have:

    $$ \begin{array}{llll} \lim_{k\to\infty}\frac{{Q_{k}^{G}}}{q_{k-2}^{G}} & = \lim_{k\to\infty} \frac{{\lvert C_{k+1}-C_{k}\rvert}}{{\lvert C_{k}-C_{k-1}\rvert}}\frac{{\lvert w^{k-2}-\bar{w}\rvert}}{{\lvert w^{k-1}-\bar{w}\rvert}}\\ & = \lim_{k\to\infty} \frac{{\lvert G^{\prime}(\bar{w})\rvert}{\lvert w^{k+1}-\bar{w}\rvert}{\lvert w^{k-1}-\bar{w}\rvert}}{{\lvert w^{k}-\bar{w}\rvert}{\lvert G^{\prime}(\bar{w})\rvert}{\lvert w^{k}-\bar{w}\rvert}} \frac{{\lvert w^{k-2}-\bar{w}\rvert}}{{\lvert w^{k-1}-\bar{w}\rvert}} \\ & = \lim_{k\to\infty} \frac{{\lvert w^{k+1}-\bar{w}\rvert}{\lvert w^{k-2}-\bar{w}\rvert}}{{\lvert w^{k}-\bar{w}\rvert}^{2}}, \end{array} $$

    provided the latter limit exists. By applying (8) multiple times, we obtain:

    $$ \lim_{k\to\infty} \frac{{\lvert w^{k+1} - \bar{w}\rvert}{\lvert w^{k-2} - \bar{w}\rvert}}{{\lvert w^{k}-\bar{w}\rvert}^{2}} = \lim_{k\to\infty} \mu^{\varphi-1-\frac{1}{\varphi}}{\lvert w^{k-1} - \bar{w}\rvert}^{\varphi^{2}-2\varphi+\frac{1}{\varphi}} = 1, $$

    where \(\mu \in [0,\infty )\) denotes the limit from (8) and where we used the identities \(\varphi ^{2}-2\varphi +\frac {1}{\varphi } = -\varphi +1+\frac {1}{\varphi } = \varphi -1-\frac {1}{\varphi }=0\) that follow from φ2φ − 1 = 0. Similar considerations show that:

    $$ \lim_{k\to\infty}\frac{{\lvert C_{k+1}-C_{k}\rvert}}{{\lvert C_{k}-C_{k-1}\rvert}^{\varphi}}= \bar \mu \lim_{k\to\infty} \frac{{\lvert w^{k+1}-\bar{w}\rvert}}{{\lvert w^{k}-\bar{w}\rvert}}\cdot \frac{{\lvert w^{k-1}-\bar{w}\rvert}^{\varphi}}{{\lvert w^{k}-\bar{w}\rvert}^{\varphi}} = \bar \mu $$

    for \(\bar \mu :={\lvert G^{\prime }(\bar {w})\rvert }^{1-\varphi }\), where we used (8) to obtain the final equality.

  • Proof of 2: Let us prove the claim for m0 = 1; it is readily generalized to arbitrary m0 ≥ 1. Taylor expansion around \(\bar {w}\) together with \(G(\bar {w})=0\) implies

    $$ \begin{array}{llll} & \lim_{k\to\infty} \frac{{\lvert G(w^{k+1})\rvert}}{{\lvert G(w^{k})\rvert}}\\ & \enspace = \lim_{k\to\infty}\frac{{\lvert G^{\prime}(\bar{w})(w^{k+1}-\bar{w})+\frac{1}{2} G^{\prime\prime}(\bar{w})(w^{k+1}-\bar{w})^{2}+o({\lvert w^{k+1}-\bar{w}\rvert}^{2})\rvert}}{{\lvert G^{\prime}(\bar{w})(w^{k}-\bar{w})+\frac{1}{2} G^{\prime\prime}(\bar{w})(w^{k}-\bar{w})^{2}+o({\lvert w^{k}-\bar{w}\rvert}^{2})\rvert}}\\ & \enspace = \lim_{k\to\infty}\frac{{\lvert G^{\prime\prime}(\bar{w})\rvert}}{{\lvert G^{\prime\prime}(\bar{w})\rvert}}\cdot\frac{{\lvert w^{k+1}-\bar{w}\rvert}^{2}}{{\lvert w^{k}-\bar{w}\rvert}^{2}} = \kappa^{2} = \kappa^{m_{0}+1}. \end{array} $$

    By assumption, we have \(\hat \kappa =\lim _{k\to \infty }\frac {{\lvert {s_{w}^{k}}\rvert }}{{\lvert w^{k}-\bar {w}\rvert }}>0\), hence:

    $$ \lim_{k\to\infty}\frac{{\lvert s_{w}^{k-1}\rvert}}{{\lvert {s_{w}^{k}}\rvert}} = \lim_{k\to\infty}\frac{\hat\kappa{\lvert w^{k-1}-\bar{w}\rvert}}{\hat\kappa{\lvert w^{k}-\bar{w}\rvert}} = \frac{1}{\kappa}. $$

    By definition, there holds for all k ≥ 1:

    $$ \frac{\tau_{k-1}}{\tau_{k}}\cdot {Q_{k}^{G}} = \frac{{\lvert G(w^{k+1})\rvert}}{{\lvert G(w^{k})\rvert}} \cdot \frac{{\lvert s_{w}^{k-1}\rvert}}{{\lvert {s_{w}^{k}}\rvert}}. $$

    Taking the limit for \(k\to \infty \) yields the claim.
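As a toy computation matching the setting of part 2 with m0 = 1 (our own illustration, not an experiment from the paper), consider the secant method applied to G(w) = w², whose root 0 is a double root: the iterate quotients tend to the root κ = (√5 − 1)/2 of x² + x − 1 in (0,1), and the quotients of successive slope differences tend to κ^{m0} = κ.

```python
# Our own toy computation for the case m0 = 1: secant method on G(w) = w^2,
# where G'(0) = 0 and G''(0) != 0, so convergence is only q-linear.

G = lambda w: w * w

w = [1.0, 0.9]                       # two starting points
for _ in range(30):                  # plain secant iteration
    a, b = w[-2], w[-1]
    w.append(b - G(b) * (b - a) / (G(b) - G(a)))

# secant slopes C_k and the quotients appearing in Lemma 5
C = [(G(w[k]) - G(w[k - 1])) / (w[k] - w[k - 1]) for k in range(1, len(w))]
kappa = (5 ** 0.5 - 1) / 2                   # root of x^2 + x - 1 in (0,1)
q = w[-1] / w[-2]                            # iterate quotient -> kappa
Q = (C[-1] - C[-2]) / (C[-2] - C[-3])        # slope-difference quotient -> kappa
```

For G(w) = w² the reciprocals 1/w_k satisfy a Fibonacci-type recursion, which is why both quotients approach κ geometrically.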

We now provide a detailed description of the convergence behavior of Algorithm 1 with σk = 1 for all large k and d = 1, where F has n − 1 affine component functions F2,…,Fn. We first present a result for nonlinear F1 and then deal with affine F1.

Theorem 6

Let Assumption 1 hold for d = 1 and let (uk), (sk) and (Bk) be generated by Algorithm 1, with each Bk invertible. Suppose that σk = 1 for all k large enough and that (uk) converges to some \(\bar {u}\). Set \(\bar s:=S\) and define:

$$ q_{k} := \frac{{\left\|u^{k+1}-\bar{u}\right\|}}{{\left\|u^{k}-\bar{u}\right\|}} \qquad\text{ and }\qquad Q_{k} := \frac{{\left\|B_{k+1}-B_{k}\right\|}}{{\left\|B_{k}-B_{k-1}\right\|}} $$

for all k ≥ 0, respectively, k ≥ 1. Then the following statements hold:

  1.

    Let \(t\mapsto F_{1}(\bar {u}+t\bar s)\) be twice differentiable near t = 0 with \(t\mapsto F_{1}^{\prime \prime }(\bar {u} + t \bar s)(\bar s,\bar s)\) continuous at t = 0 and \(F_{1}^{\prime }(\bar {u})(\bar s)\neq 0\). Then we have:

    $$ \limsup_{k\to\infty}\frac{{\left\|u^{k+1}-\bar{u}\right\|}}{{\left\|u^{k}-\bar{u}\right\|}^{\varphi}} \leq \left\lvert\frac{F_{1}^{\prime\prime}(\bar{u})(\bar s,\bar s)}{2 F_{1}^{\prime}(\bar{u})(\bar s)}\right\rvert^{\frac{1}{\varphi}}, $$
    (9)

    where \(\varphi :=\frac {1+\sqrt 5}{2}\). For all p ∈ [1,φ), there holds:

    $$ \lim_{k\to\infty}{\left\|B_{k+1}-B_{k}\right\|}^{\frac{1}{p^{k}}}=0. $$
    (10)

    If, in addition, \(F_{1}^{\prime \prime }(\bar {u})(\bar s,\bar s)\neq 0\), then (9) holds with equality and \(\limsup \) replaced by \(\lim \), and we have:

    $$ \lim_{k\to\infty}\frac{{\left\|B_{k+1}-B_{k}\right\|}}{{\left\|B_{k}-B_{k-1}\right\|}^{\varphi}}=\left\lvert F_{1}^{\prime}(\bar{u})(\bar s)\right\rvert^{1-\varphi}\qquad\text{and}\qquad \lim_{k\to\infty}\frac{Q_{k}}{q_{k-2}} = 1. $$
    (11)
  2.

Let \(m_{0}\in \mathbb {N}\) and denote by κ the unique root of the polynomial \(x^{m_{0}+1}+x^{m_{0}}-1\) in (0,1). Let \(t\mapsto F_{1}(\bar {u}+t\bar s)\) be m0 + 1 times differentiable near t = 0 with its (m0 + 1)th derivative continuous at t = 0. If \(F_{1}^{(m)}(\bar {u})(\bar s,\ldots ,\bar s)=0\) for all m ∈ [m0] and \(F_{1}^{(m_{0}+1)}(\bar {u})(\bar s,\ldots ,\bar s)\neq 0\), then:

    $$ \lim_{k\to\infty} q_{k} = \kappa \qquad\text{ and }\qquad \lim_{k\to\infty} Q_{k} = \kappa^{m_{0}}. $$

Proof

  • Proof of 1: From Theorem 3, we obtain \(G:\mathbb {R}\rightarrow \mathbb {R}\), (wk), (Ck), and \(\bar w\) as stated in that theorem. We let \({s_{w}^{k}}:=w^{k+1}-w^{k}\) for all k ≥ 0. Due to \(C_{k} {s_{w}^{k}} ({s_{w}^{k}})^{T} / {\lvert {s_{w}^{k}}\rvert }^{2} = C_{k}\), we have \(C_{k+1}=(G(w^{k+1})-G(w^{k}))/{s_{w}^{k}}\) if σk = 1 and thus Algorithm 1 for G agrees with the one-dimensional secant method for all sufficiently large k. As \((G(w^{k+1})-G(w^{k}))/{s_{w}^{k}} \to G^{\prime }(\bar w)\) for \(k\to \infty \), we obtain the convergence of (Ck), thus \(G(\bar w)=0\) by Lemma 1. Furthermore, there holds \(G^{\prime }(\bar w) = \widetilde F^{\prime }(\bar {u}) S = F_{1}^{\prime }(\bar {u})(\bar s)\neq 0\). Since (wk) converges to \(\bar w\) with \(G(\bar w)=0\) and \(G^{\prime }(\bar w)\neq 0\), classical results for the secant method, cf. [31, (6)], yield that if \(G^{\prime \prime }(\bar w)\neq 0\), then:

    $$ \lim_{k\to\infty}\frac{{\lvert w^{k}-\bar w\rvert}}{{\lvert w^{k-1}-\bar w\rvert}^{\varphi}} =\left\lvert\frac{G^{\prime\prime}(\bar w)}{2 G^{\prime}(\bar w)}\right\rvert^{\frac{1}{\varphi}}, $$

which by use of (5) is readily transformed into (9) with equality and \(\limsup \) replaced by \(\lim \). The inequality (9) itself follows similarly. The r-order (10) follows from Lemma 3 using that \(F(\bar {u})=0\) due to Corollary 2. Since \({Q_{k}^{G}}=Q_{k+1}\) and \(q_{k-2}^{G} = q_{k-1}\) by (5) and (6), Lemma 5, part 1 yields (11).

  • Proof of 2: We argue only for m0 = 1. It follows from Corollary 2 that \(F(\bar {u})=0\). It is a standard result for the one-dimensional secant method, cf. [10, Section 2.2.2], that \(\lim _{k\to \infty }{q_{k}^{G}} = \kappa \), hence \(\lim _{k\to \infty }q_{k} = \kappa \), too. The claim on (Qk) follows via \(({Q_{k}^{G}})\) from Lemma 5, part 2 if we can show that there is \(\hat \kappa >0\) such that:

    $$ \lim_{k\to\infty}\frac{{\lvert {s_{w}^{k}}\rvert}}{{\lvert w^{k}-\bar{w}\rvert}} = \hat\kappa. $$

    Using \(G^{\prime }(\bar w)=0\), \(G^{\prime \prime }(\bar w)\neq 0\), and \(\lim _{k\to \infty } {q_{k}^{G}}=\kappa \), elementary considerations show that there is an index k0 such that \((w^{k}-\bar w)_{k\geq k_{0}}\) converges to zero without changing signs. For sufficiently large k, we thus have:

    $$ \left\lvert {s_{w}^{k}}\right\rvert = \left\lvert (w^{k+1} - \bar w) - (w^{k} - \bar w)\right\rvert = (1-{q_{k}^{G}})\left\lvert w^{k}-\bar w\right\rvert, $$

    hence, the desired limit exists with \(\hat \kappa =1-\kappa >0\).

Remark 7

  1.

    If \(F^{\prime }(\bar {u})\) is invertible, then \(F_{1}^{\prime }(\bar {u})(\bar s)\neq 0\). Indeed, since \(\bar s\in {\mathcal {S}}\) and since \(F_{j}^{\prime }(\bar {u})={a_{j}^{T}}\in {\mathcal {A}} = {\mathcal {S}}^{\perp }\) for all j > 1, we have \(F_{j}^{\prime }(\bar {u})(\bar s)=0\) for all j > 1; hence, \(F_{1}^{\prime }(\bar {u})(\bar s)=0\) would imply \(F^{\prime }(\bar {u})(\bar s)=0\).

  2.

(9) and (10) show that (uk) has q-order no less than φ and \(({\lVert B_{k+1}-B_{k}\rVert })\) has r-order no less than φ. If \(F_{1}^{\prime \prime }(\bar {u})(\bar s,\bar s)\neq 0\), then the additional part of statement 1 implies that both (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\) have q-order and r-order φ, cf. [25, 9.3.3]. For (uk), the q-order φ improves the best available result, which is the 2-step q-quadratic convergence ensured by Theorem 2 for d = 1. Moreover, the example in Section 4.3.2 shows that if \(F_{1}^{\prime \prime }(\bar {u})(\bar s,\bar s)=0\), then a q-order higher than φ is possible.

  3.

For m0 = 1, Theorem 6, part 2 is related to the results in [6, 19].

  4.

Corollary 2 is valid under the assumptions of Theorem 6, so in parts 1 and 2, we also have \(F(\bar {u})=0\) and B satisfies the conditions from that corollary.

In the affine setting, Algorithm 1 terminates after finitely many steps if the Jacobian is regular and σk = 1 for at least one k ≥ 1; if the Jacobian is singular but F has a root, the algorithm terminates at the first iterate. More precisely, we have the following result.

Theorem 7

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) be affine. Let Assumption 1 hold for d = 1 and let (uk), (sk) and (Bk) be generated by Algorithm 1, with each Bk invertible. Let F(u0)≠ 0. Then the following statements hold:

  1.

    Let \(F^{\prime }\) be invertible. Then F has a unique root \(\bar {u}\). If there is an index k ≥ 1 with σk = 1, then \(u^{k+1}=\bar {u}\) or \(u^{k+2}=\bar {u}\), hence the algorithm terminates in iteration k + 1 or k + 2 with output \(u^{\ast }=\bar {u}\). If the algorithm does not terminate with output \(u^{\ast } = \bar {u}\), then (uk) converges to \(\bar {u}\) and satisfies (1).

  2.

    Let \(F^{\prime }\) be singular. If F has a root, then F(u1) = 0. If F does not have a root, then the algorithm generates a diverging sequence (uk) such that F(uk) = (ω,0,…,0)T for all k ≥ 1 and some ω≠ 0.

Proof

  • Proof of 1: From [22, Theorem 3.2], we know that for affine F with invertible \(F^{\prime }\), Algorithm 1 converges q-superlinearly for any u0 if all Bk are invertible and the algorithm does not terminate with output \(u^{\ast }=\bar {u}\). (Since d = 1, it is also not difficult to establish this directly.) Theorem 1 now yields (1). Corollary 2 yields the convergence of (Bk). It remains to prove that if σk = 1 and F(uk+ 1)≠ 0, then F(uk+ 2) = 0. Since Fj(uk) = 0 for all j > 1 and all k ≥ 1 by Lemma 4, we have to show that F1(uk+ 2) = 0. As in the proof of Theorem 6, we use Theorem 3 to obtain \(\{w^{j}\}_{j=0}^{k+1}\) and \(\{C_{j}\}_{j=0}^{k+1}\) by applying Algorithm 1 to the affine function \(G:\mathbb {R}\rightarrow \mathbb {R}\), \(G(w):=F_{1}(u^{1}+w\bar s)\), where \(\bar s:=S\). In view of (5), we have to show that G(wk+ 1) = 0. From τk− 1 = σk = 1, it follows that \(C_{k} = (G(w^{k})-G(w^{k-1}))/(w^{k}-w^{k-1}) = G^{\prime }\). Using Ck(wk+ 1wk) = −G(wk), we find \(G(w^{k+1}) = G(w^{k}) + G^{\prime }\cdot (w^{k+1} - w^{k}) = G(w^{k})-G(w^{k})=0\), hence F(uk+ 2) = 0.

  • Proof of 2: Defining \(A:=F^{\prime }\), we note that A has rank n − 1 since \(A \bar s=0\) and since n − 1 rows of A agree with the invertible B0. Thus, \(A^{1}\) can be expressed as a linear combination of \(\{A^{j}\}_{j=2}^{n}\). Since F has a root and since Fj(u1) = 0 for all j > 1 by Lemma 4, it readily follows that F1(u1) = 0, whence F(u1) = 0. Now suppose that F does not have a root. By applying Theorem 3 again, we obtain that \(G^{\prime }=A\bar s = 0\); hence, G is constant, say \(G\equiv \omega \) for some \(\omega \in \mathbb {R}\). Since F has no root, we must have ω≠ 0. Since G is constant, there holds F1(uk) = G(wk− 1) = ω for all k ≥ 1. The sequence (uk) cannot be convergent because Corollary 2 would entail that the limit point is a root of F.

Remark 8

  1.

    The starting point u0 is arbitrary in Theorem 7.

  2.

The finite convergence in Theorem 7, part 1 is related to the 2n-step convergence of Broyden’s method for regular linear systems [12, 24]. Indeed, in the proof of Theorem 7, part 1, we can replace the computation for showing G(wk+ 1) = 0 by an application of the 2n-step convergence to G using that due to τk− 1 = 1, \(s_{w}^{k-1}\) and \({s_{w}^{k}}\) are the Broyden steps for the initial data (wk− 1,Ck− 1).

  3.

If in Theorem 7, part 1, Algorithm 1 does not terminate with \(u^{\ast }=\bar {u}\), then \(\lim _{k\to \infty } E_{k}\) exists and satisfies the conditions from Corollary 2.
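The finite termination of Theorem 7, part 1 can be observed directly in a small double-precision sketch (our own 2 × 2 toy data, not from the paper): for affine F(u) = Au with regular A, exact initialization in row 2 of B0, and σ1 = 1, the iterate u² or u³ must be the exact root, here the origin.

```python
# Our own 2x2 toy instance of Theorem 7, part 1 in double precision.

A = [[2.0, 1.0], [1.0, 3.0]]             # regular Jacobian
B = [[2.5, 0.7], [1.0, 3.0]]             # row 2 exact, row 1 perturbed
u = [1.0, 1.0]
sigma = [0.5, 1.0, 1.0, 1.0]             # sigma_1 = 1

def mv(M, x):                            # 2x2 matrix-vector product
    return [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]

def solve2(M, r):                        # solve M s = r by Cramer's rule
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * r[0] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]

hit = None                               # first k with F(u^k) = 0 (up to roundoff)
for k in range(4):
    F = mv(A, u)
    if max(abs(F[0]), abs(F[1])) < 1e-10:
        hit = k
        break
    s = solve2(B, [-F[0], -F[1]])        # Broyden-like step
    u = [u[0] + s[0], u[1] + s[1]]
    Fn = mv(A, u)                        # equals y_k - B_k s_k
    n2 = s[0] ** 2 + s[1] ** 2
    for i in range(2):                   # rank-one Broyden-like update
        for j in range(2):
            B[i][j] += sigma[k] * Fn[i] * s[j] / n2
```

In this run, u¹ and u² are not roots, and u³ solves the system up to roundoff, in line with the theorem.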

4.3 Application to two examples from the literature

We illustrate some of our findings on two examples from the literature. The second example also hints at two extensions.

4.3.1 An example by Dennis and Schnabel

In [9, Example 8.1.3] and [9, Lemma 8.2.7], it is shown that for:

$$ F:\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}, \qquad F(u)=\begin{pmatrix} u_{1} + u_{2} - 3 \\ {u_{1}^{2}} + {u_{2}^{2}} - 9 \end{pmatrix} $$

with root \(\bar {u}=(0,3)^{T}\), the initial data:

$$ u^{0} = \begin{pmatrix} 1 \\ 5 \end{pmatrix} \qquad\text{ and }\qquad B_{0} = F^{\prime}(u^{0}) = \begin{pmatrix} 1 & 1 \\ 2 & 10 \end{pmatrix} $$

yields sequences (uk) and (Bk) with \(u^{k}\to \bar {u}\) for \(k\to \infty \) and:

$$ B_{1} = \begin{pmatrix} 1 & 1 \\ 0.375 & 8.625 \end{pmatrix}, \qquad B:=\lim_{k\to\infty} B_{k} = \begin{pmatrix} 1 & 1 \\1.5 & 7.5 \end{pmatrix}, \qquad F^{\prime}(\bar{u}) = \begin{pmatrix} 1 & 1 \\ 0 & 6 \end{pmatrix}. $$

The affine component F1 has coefficient vector a1 = (1,1)T, so \({\mathcal {S}}={\langle \{a_{1}\}\rangle }^{\perp }=\{t\bar s:t\in \mathbb {R}\}\) with \(\bar s:=\frac {1}{\sqrt 2} (1,-1)^{T}\). Theorem 3 yields that \((s^{k})_{k\geq 1}\subset {\mathcal {S}}\) and (F1(uk))k≥ 1 ≡ 0. Of course, this can also be verified directly, cf. also [9, Example 8.1.3 and Lemma 8.2.7]. In agreement with Theorem 5 and Corollary 2, there holds \(\widetilde B S = B^{2} \bar s = -3\sqrt {2} = \widetilde F^{\prime }(\bar {u})S \), \(B^{1} = {B_{0}^{1}}\) and B(1,1)T = B1(1,1)T. (From B1, \(F^{\prime }(\bar {u})\) and \(\bar s,\) we can actually determine the limit B.) Because of \(F_{2}^{\prime }(\bar {u})\bar s\neq 0\neq F_{2}^{\prime \prime }(\bar {u})(\bar s,\bar s)\), Theorem 6, part 1 yields q-order φ for (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\) as well as the validity of (11).
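This example can be replayed in a few lines of double-precision Python (our own sketch; roundoff in the update directions limits how closely the computed limit matches B, so the comparison below is only to three decimal places):

```python
# Double-precision replay of the Dennis-Schnabel example; B_1 and the limit
# matrix B are checked against the values reported in the text.

def F(u):
    return [u[0] + u[1] - 3.0, u[0] ** 2 + u[1] ** 2 - 9.0]

def solve2(M, r):                        # solve M s = r by Cramer's rule
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * r[0] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]

u = [1.0, 5.0]
B = [[1.0, 1.0], [2.0, 10.0]]            # B_0 = F'(u^0)
B1 = None
for k in range(20):                      # Broyden's method, sigma_k = 1
    Fu = F(u)
    if max(abs(Fu[0]), abs(Fu[1])) < 1e-11:
        break
    s = solve2(B, [-Fu[0], -Fu[1]])
    u = [u[0] + s[0], u[1] + s[1]]
    Fn = F(u)                            # equals y_k - B_k s_k
    n2 = s[0] ** 2 + s[1] ** 2
    for i in range(2):                   # rank-one Broyden update
        for j in range(2):
            B[i][j] += Fn[i] * s[j] / n2
    if k == 0:
        B1 = [row[:] for row in B]
```

The run reproduces B1, converges to the root (0,3)T, and ends with B close to the limit matrix stated above rather than the Jacobian.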

4.3.2 An example by Dennis and Moré

In [8, Example 5.3], Dennis and Moré consider Broyden’s method for:

$$ F:\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}, \qquad F(u)=\begin{pmatrix} u_{1} \\ u_{2}+{u_{2}^{3}} \end{pmatrix} $$

with root \(\bar {u}=(0,0)^{T}\) and note that for any \(\delta ,\epsilon \in \mathbb {R}\) the initial data:

$$ u^{0} = \begin{pmatrix} 0 \\ \epsilon \end{pmatrix} \qquad\text{ and }\qquad B_{0} = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1 \end{pmatrix} $$
(12)

yields a sequence (Bk) with \(B_{k}^{1,1}=1+\delta \) for all k ≥ 0. Hence, the incorrect entry 1 + δ is never corrected (assuming δ≠ 0), preventing convergence of (Bk) to \(F^{\prime }(\bar {u})\). According to [8], “The above example points out that one of the disadvantages of Broyden’s method is that it is not self-correcting. In particular, Bk depends upon each Bj with j < k and thus it may retain information which is irrelevant or even harmful.” It is well known that the BFGS method is self-correcting, cf., e.g., [1, 27].

We show that the iterates (uk) converge rapidly despite the incorrect entry 1 + δ in all Bk. The affine component F1 has coefficient vector a1 = (1,0)T, thus \({\mathcal {S}}={\langle \{a_{1}\}\rangle }^{\perp }=\{(0,t)^{T}: t\in \mathbb {R}\}\). We set \(\bar s:=(0,1)^{T}\) and observe \((s^{k})_{k\geq 0}\subset {\mathcal {S}}\) as well as (F1(uk))k≥ 0 ≡ 0. It is not difficult to see that Theorem 3 and, in turn, Theorem 6, part 1 apply, even though Assumption 1 is not satisfied in this example. Theorem 6, part 1 implies that if (uk) converges to \(\bar {u}\), then it has a q-order no smaller than φ and \(({\lVert B_{k+1}-B_{k}\rVert })\) goes to zero with r-order no smaller than φ. The fast convergence is enabled by the fact that Broyden’s method effectively reduces to the one-dimensional secant method. It should also be noted that (Bk) converges to \(F^{\prime }(\bar {u})\) in \({\mathcal {S}}\), i.e., \((B_{k}-F^{\prime }(\bar {u}))S\to 0\), cf. Corollary 2. Furthermore, since B0S = 1 correctly approximates the affine part of F2 and since F2 does not contain a quadratic part, it can be shown that \(({\lVert B_{k+1}-B_{k}\rVert })\) has q-order 2, which implies that (uk) has q-order 2, too. The numerical experiments confirm the q-order 2, cf. Section 5.2.2.
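Both effects can be seen in a short double-precision run of (12) with the hypothetical choices δ = 0.5 and ε = 0.3 (our own values, not from [8]): the entry 1 + δ is reproduced exactly by every update, while the second component converges rapidly along the subspace u1 = 0.

```python
# Our own double-precision run of the Dennis-More example with (12),
# delta = 0.5, epsilon = 0.3: B^{1,1} = 1 + delta survives every update.

def F(u):
    return [u[0], u[1] + u[1] ** 3]

def solve2(M, r):                        # solve M s = r by Cramer's rule
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * r[0] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]

delta, eps = 0.5, 0.3
u = [0.0, eps]
B = [[1.0 + delta, 0.0], [0.0, 1.0]]
top_left = []                            # history of B_k^{1,1}
for k in range(15):                      # Broyden's method, sigma_k = 1
    Fu = F(u)
    if max(abs(Fu[0]), abs(Fu[1])) < 1e-13:
        break
    s = solve2(B, [-Fu[0], -Fu[1]])
    u = [u[0] + s[0], u[1] + s[1]]
    Fn = F(u)
    n2 = s[0] ** 2 + s[1] ** 2
    for i in range(2):                   # rank-one Broyden update
        for j in range(2):
            B[i][j] += Fn[i] * s[j] / n2
    top_left.append(B[0][0])
```

Since F1(uk) = 0 exactly, the rank-one updates never touch the first row, so B_k^{1,1} = 1 + δ in every iteration, yet only a handful of steps are needed to drive the residual below the tolerance.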

5 Numerical experiments

We use numerical examples to verify Corollary 2 and Theorems 6 and 7. We first present the design of the experiments and then provide the examples and results.

5.1 Design of the experiments

5.1.1 Implementation and accuracy

We use the variable precision arithmetic (vpa) of Matlab 2020b. Unless stated otherwise, we work with a precision of 10000 digits and replace the termination criterion F(uk) = 0 in Algorithm 1 by \({\lVert F(u^{k})\rVert }\leq 10^{-5000}\). By \(\bar k,\) we denote the final value of k.

5.1.2 Known solution and random initialization

All examples have root \(\bar {u}=0\) and the experiments are set up in such a way that convergence to \(\bar {u}\) takes place in all runs except possibly a handful that are discarded. Except in the second example, the initial guess (u0,B0) is randomly generated using Matlab’s function rand to satisfy \(u^{0}\in [-\alpha ,\alpha ]^{n}\) and \(B_{0}=F^{\prime }(u^{0})+\hat \alpha {\lVert F^{\prime }(u^{0})\rVert }R\). Here, \(R\in \mathbb {R}^{n\times n}\) is a matrix with Rj = 0 for all j > 1 and the entries in R1 randomly drawn from [− 1,1]. The values of \(\alpha \in [10^{-3},1000]\) and \(\hat \alpha \in [0,1000]\) will be specified within each example.
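The structure of this initialization can be sketched as follows (a Python sketch of the Matlab setup; the dimension, the seed, and the identity stand-in for \(F^{\prime }(u^{0})\) are our own choices, only the row structure of R matters):

```python
# Sketch of the randomized initialization B_0 = F'(u^0) + alpha_hat *
# ||F'(u^0)|| * R, where R is nonzero only in its first row.
import random

n, alpha, alpha_hat = 4, 0.5, 0.1
random.seed(0)

u0 = [random.uniform(-alpha, alpha) for _ in range(n)]
J = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # stand-in Jacobian
normJ = max(sum(abs(x) for x in row) for row in J)                  # infinity norm

R = [[0.0] * n for _ in range(n)]        # R^j = 0 for all j > 1
R[0] = [random.uniform(-1.0, 1.0) for _ in range(n)]

B0 = [[J[i][j] + alpha_hat * normJ * R[i][j] for j in range(n)] for i in range(n)]
```

By construction, B0 agrees with the Jacobian in all rows except the first, so the exact-initialization assumption holds regardless of \(\hat \alpha \).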

5.1.3 Quantities of interest

To display the course of Algorithm 1, we use the norm of Fk := F(uk), the error \({\lVert E_{k}\rVert }\), the quotients qk and Qk introduced in Theorem 6, and furthermore:

$$ \beta_{k}:={\lVert B_{k}-B_{k-1}\rVert}, \qquad {C_{k}^{u}} := \frac{{\lVert u^{k}-\bar{u}\rVert}}{{\left\|u^{k-1}-\bar{u}\right\|}^{\varphi}},\qquad {C_{k}^{B}}:=\frac{{\lVert B_{k}-B_{k-1}\rVert}}{{\left\|B_{k-1}-B_{k-2}\right\|}^{\varphi}}, $$

as well as:

$$ \mathcal{R}_{k}^{B}:=\Bigl(\log\bigl({\left\|B_{k}-B_{k-1}\right\|}^{-1}\bigr)\Bigr)^{\frac{1}{k}} $$

and:

$$ {{\mathcal{Q}}_{k}^{u}} := \frac{\log({\lVert u^{k}-\bar{u}\rVert})}{\log({\lVert u^{k-1}-\bar{u}\rVert})},\qquad {{\mathcal{Q}}_{k}^{B}}:=\frac{\log({\lVert B_{k}-B_{k-1}\rVert})}{\log({\lVert B_{k-1}-B_{k-2}\rVert})}. $$

We note that \({{\mathcal {Q}}_{k}^{u}}\) and \({{\mathcal {Q}}_{k}^{B}}\) approximate the q-order of convergence while \(\mathcal {R}_{k}^{B}\) approximates the r-order. Whenever any of these quantities is undefined, we set it to − 1, e.g., β0 := − 1. We will use these quantities to confirm that (Bk) converges, cf. Corollary 2, and to assess the convergence order of (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\), cf. Theorem 6. We are also interested in whether \({\lVert E_{k}\rVert }\to 0\), i.e., whether (Bk) converges to the true Jacobian \(F^{\prime }(\bar {u})\), cf. for instance Remark 6.
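A quick synthetic sanity check of these estimators (our own construction, independent of the experiments): for an artificial error sequence with exact q-order φ, the log-quotient estimator returns φ, while the r-order estimator \((\log (1/e_{k}))^{1/k}\) increases towards φ from below.

```python
# Synthetic check: e_{k+1} = e_k^phi has exact q-order phi by construction.
import math

phi = (1 + 5 ** 0.5) / 2
e = [0.5]
for _ in range(10):
    e.append(e[-1] ** phi)               # q-order phi by construction

# log-quotient (q-order) and r-order estimators, as defined above
Qu = [math.log(e[k]) / math.log(e[k - 1]) for k in range(1, len(e))]
R = [math.log(1.0 / e[k]) ** (1.0 / k) for k in range(1, len(e))]
```

This behavior explains why the q-order estimators in the tables stabilize quickly, whereas the r-order estimator approaches its limit only slowly.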

5.1.4 Single runs and cumulative runs

We use single runs and cumulative runs. For single runs, we display the quantities of interest during the course of the algorithm. A cumulative run consists of 1000 single runs with initial data varying according to Section 5.1.2, unless stated otherwise. Let us briefly describe the aggregated quantities that we use to assess cumulative runs. For instance, to gauge the q-order of \(({\lVert B_{k+1}-B_{k}\rVert })\), we compute for each single run of a cumulative run the number:

$$ {\mathcal{Q}}_{j} :=\min_{k_{0}(j)\leq k\leq\bar k(j)} {{\mathcal{Q}}_{k}^{B}}, $$

where j ∈ [1000] indicates the respective single run and we consistently use \(k_{0}(j):=\min \limits \{100,\lfloor 0.75\bar k(j)\rfloor \}\). As outcome of the cumulative run, we display:

$$ {\mathcal{Q}}_{B}^{-}:=\min_{j\in[1000]}{\mathcal{Q}}_{j}\qquad\text{ and }\qquad {\mathcal{Q}}_{B}^{+}:=\max_{j\in[1000]}{\mathcal{Q}}_{j}. $$

If the stronger conditions in Theorem 6 1 hold, then \({\mathcal {Q}}_{B}^{-}\) and \({\mathcal {Q}}_{B}^{+}\) should both be close to the golden mean φ. If the convergence is of lower order in any of the 1000 single runs, then we expect \({\mathcal {Q}}_{B}^{-}\) to be smaller than φ.

In the same way as just presented for \({\mathcal {Q}}_{B}^{-}\) and \({\mathcal {Q}}_{B}^{+}\), we derive \({\lVert E\rVert }^{-}\), \({\lVert E\rVert }^{+}\), \(q^{-}\), \(q^{+}\), \({\mathcal {Q}}_{u}^{-}\), \({\mathcal {Q}}_{u}^{+}\), \(\beta ^{-}\), \(\beta ^{+}\), \(Q^{-}\), \(Q^{+}\), \(\mathcal {R}_{u}^{-}\), and \(\mathcal {R}_{u}^{+}\) from the respective quantities used in single runs. In addition, we use:

$$ \lVert F\rVert^{-}:=\min_{j\in\left[1000\right]} \lVert F\left( u^{\bar{k}(j)}\right)\rVert \qquad \text{ and }\qquad \lVert F\rVert^{+}:=\max_{j\in\left[1000\right]} \lVert F\left( u^{\bar{k}(j)}\right)\rVert. $$

To keep the tables for cumulative runs of a reasonable size, we will omit some of these quantities, but what is omitted varies from example to example.

5.2 Numerical examples

5.2.1 Example 1

To verify the results of Theorem 6, part 1, we consider \(F:\mathbb {R}^{10}\rightarrow \mathbb {R}^{10}\) given by:

$$ F(u)=\begin{pmatrix} u_{1} \cdot \left[ {\prod}_{j=2}^{10}\left( u_{j}+(-1)^{j}\right)\right]\\ A u \end{pmatrix}, $$

where \(A\in \mathbb {R}^{9\times 10}\) is a random matrix with entries in [− 1,1] that is changed after each of the 1000 single runs of the cumulative run. The randomly generated A is only accepted if the resulting \(F^{\prime }(\bar {u})\) is invertible. We use α = 0.001 in this example. A single and a cumulative run with (σk) ≡ 1 and \(\hat \alpha =0\) are displayed in Tables 1 and 2. The results agree with Theorem 6, part 1. For instance, it is apparent that (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\) converge with q-order φ ≈ 1.618 and that \(\lim _{k\to \infty }\frac {Q_{k}}{q_{k-2}}=1\) (since A is random, we expect \(F_{1}^{\prime \prime }(\bar {u})(\bar s,\bar s)\neq 0\)). Table 2 also shows results for a cumulative run with (σk) ≡ 1 and \(\hat \alpha =0.1\). In accordance with Theorem 6, part 1, deviating from the choice \(B_{0}=F^{\prime }(u^{0})\) does not affect the q-order of convergence. Next we keep \(\hat \alpha =0.1\) and let σk = 0.5 for k ≤ 3 and (σk)k≥ 4 ≡ 1. Theorem 6, part 1 predicts that this choice of (σk) maintains q-order φ for (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\), and Table 2 confirms this.

Table 1 Example 1: Single run with \(\hat \alpha =0\), i.e., \(B_{0}=F^{\prime }(\bar {u})\)
Table 2 Example 1: Cumulative runs with \(\hat \alpha =0\) (1st row), \(\hat \alpha =0.1\) (2nd row), \(\hat \alpha =0.1\) and σ0,1,2,3 = 0.5 (3rd), \(\hat \alpha =0\) and (σk) ≡ 0.99 (4th), \(\hat \alpha =0\) and \(\sigma _{k} = 1-(k+2)^{-4}\) (5th), and \(\hat \alpha =0\) and \(\sigma _{k} = 1-(k+2)^{-4}\) with higher precision (6th)

In contrast, if we choose \(\hat \alpha =0\) and (σk) ≡ 0.99, then the order of convergence drops significantly and the same holds for \((\sigma _{k})\equiv 1-(k+2)^{-4}\), cf. Table 2. In fact, except for some special cases it can be shown that (uk) can only converge with q-order greater than one if σk → 1 fast enough. In particular, for (σk) ≡ 0.99 and \((\sigma _{k})\equiv 1-(k+2)^{-4}\), both (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\) have q-order 1. To confirm this for \((\sigma _{k})\equiv 1-(k+2)^{-4}\), we repeat the cumulative run with a higher precision of 100000 digits, using \({\lVert F(u^{k})\rVert }\leq 10^{-50000}\) as termination criterion and only 100 single runs instead of 1000. We view the results in Table 2 as being in line with q-order 1. In any case, it is apparent that for (σk) ≡ 0.99 and \((\sigma _{k})\equiv 1-(k+2)^{-4}\) the q-order of convergence is not φ anymore and that \(({\lVert B_{k+1}-B_{k}\rVert })\) converges to zero at least q-linearly for all choices of (σk); hence, (Bk) converges, which validates Corollary 2. The values of \({\lVert E\rVert }^{-}\) show that (Bk) never converges to \(F^{\prime }(\bar {u})\).

5.2.2 Example 2

We provide results for the example by Dennis and Moré discussed in Section 4.3.2, which concerns Broyden’s method, so (σk) ≡ 1. A single run is displayed in Table 3 and four cumulative runs in Table 4. For the single run and the first cumulative run, we use (u0,B0) that satisfy (12) with randomly generated δ,𝜖 ∈ [− 0.5,0.5]. The results confirm that, as argued in Section 4.3.2, both (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\) have q-order 2. Because of \(F_{2}^{\prime \prime }(\bar {u})=0\), this does not contradict Theorem 6, part 1.

Table 3 Example 2: Single run with initial data of the form (12)
Table 4 Example 2: Cumulative runs with initial data of the form (12) (first row), random u0 without exact initialization (2nd), random u0 without exact initialization with higher precision (3rd), and random u0 with exact initialization (4th)

In the second cumulative run, we let u0 = (𝜖1,𝜖2)T for random numbers 𝜖1,𝜖2 ∈ [− 0.5,0.5], while keeping B0 as in (12) with δ ∈ [− 0.5,0.5]. Due to 𝜖1≠ 0, we cannot expect (sk) to belong to a one-dimensional subspace; hence, Theorem 6 does not apply anymore. Correspondingly, the second row in Table 4 shows that (uk) does not attain the q-order φ but suggests that the q-order may still have a lower bound larger than 1. This view is further encouraged by the fact that the r-order of \(({\lVert B_{k+1}-B_{k}\rVert })\) seems to admit such a lower bound, too, which is a necessary condition for (uk) to have a q-order, cf. Lemma 3. To investigate the potential q-order of (uk) further, we repeat the cumulative run at a higher precision using \({\lVert F(u^{k})\rVert }\leq 10^{-100000}\) as termination criterion and 400 single runs. The results are contained in Table 4 and support the existence of a q-order larger than one for (uk).

In the third cumulative run, whose results are depicted in the last row of Table 4, we keep the choice u0 = (𝜖1,𝜖2)T from the second cumulative run, but use \(B_{0}=F^{\prime }(u^{0})\) as initial matrix, so that \({B_{0}^{1}} = F_{1}^{\prime }(u^{0})\) and hence Assumption 1 holds. In turn, Theorem 6, part 1 applies, which ensures q-order no smaller than φ for (uk) and r-order no smaller than φ for \(({\lVert B_{k+1}-B_{k}\rVert })\). It can be argued in the same way as in Section 4.3.2 that both sequences actually converge with q-order 2. Table 4 confirms this q-order.

The values of \({\lVert E\rVert }^{-}\) in Table 4 show that (Bk) never converges to \(F^{\prime }(\bar {u})\). Yet, since \(({\lVert B_{k+1}-B_{k}\rVert })\) declines quickly, the convergence of (uk) is still rapid.

5.2.3 Example 3 a

We turn to Theorem 6, part 2, where \(F^{\prime }(\bar {u})\) is singular. Let:

$$ F:\mathbb{R}^{3}\rightarrow\mathbb{R}^{3},\qquad F(u)=\begin{pmatrix} {u_{2}^{2}}-2 {u_{3}^{3}} \\ u_{1} + u_{2} + u_{3} \\ 5 u_{1} \end{pmatrix}. $$

Because of \({\mathcal {A}}^{\perp } = {\langle \{(0,1,-1)^{T}\}\rangle }\), we have \(\bar s=\frac {1}{\sqrt {2}}(0,1,-1)^{T}\), hence \(F_{1}^{\prime }(0)=0\) and \(F_{1}^{\prime \prime }(0)(\bar s,\bar s)=2\neq 0\), which implies \(\lim _{k\to \infty }q_{k}=\lim _{k\to \infty } Q_{k}=\frac {\sqrt {5}-1}{2}\approx 0.618\) for the choice (σk) ≡ 1 that we consider first. We use \(\alpha =\hat \alpha =0.01\) in this example. The results of a cumulative run with (σk) ≡ 1 are displayed in Table 5 and are in perfect agreement with Theorem 6, part 2. Table 5 also provides results for (σk) ≡ 0.99, which are similar to those for (σk) ≡ 1. Moreover, it features \(\iota ^{-}\) and \(\iota ^{+}\), which denote the minimal, respectively, maximal number of iterations over all single runs within a cumulative run. As in the previous examples, we consistently find \(B_{k}\not \to F^{\prime }(\bar {u})\).

Table 5 Example 3 a and b: Two cumulative runs in a with (σk) ≡ 1 (top) and (σk) ≡ 0.99 (below top) and in b with (σk) ≡ 1 (above bottom) and (σk) ≡ 0.99 (bottom)

5.2.4 Example 3 b

We change F1 in example 3 a, using \(F_{1}(u)={u_{2}^{3}}-2 {u_{3}^{3}}\) instead. This results in \(F_{1}^{\prime }(0)=0\), \(F_{1}^{\prime \prime }(0)(\bar s,\bar s)=0\) and \(F_{1}^{\prime \prime \prime }(0)(\bar s,\bar s,\bar s)\neq 0\), so Theorem 6, part 2 implies \(\lim _{k\to \infty } q_{k} \approx 0.755\) and \(\lim _{k\to \infty } Q_{k} \approx 0.570\). Table 5 confirms this for (σk) ≡ 1 and shows that the choice (σk) ≡ 0.99 induces only marginal changes. Overall, example 3 exhibits a remarkably uniform convergence behavior of iterates and matrix updates, as evidenced, for instance, by the fact that \(q^{-} = q^{+}\) and \(Q^{-} = Q^{+}\). Table 6 exemplifies this for example 3 b in a single run with (σk) ≡ 1. Since this uniformity is characteristic of singular \(F^{\prime }(\bar {u})\) of rank n − 1, cf. also [19], we used \({\lVert F(u^{k})\rVert }\leq 10^{-500}\) as termination criterion in example 3 and the cumulative runs consisted of 100 single runs.

Table 6 Example 3 b: Single run with (σk) ≡ 1

5.2.5 Example 4

To verify Theorem 7, part 1 we consider F(u) = Au, where \(A\in \mathbb {R}^{10\times 10}\) is an invertible random matrix with entries in [− 1000,1000] that is changed after each single run of the cumulative run. We choose \(\alpha =\hat \alpha =1000\). In the first cumulative run, we use σ4 = 1 and σk = 0.1 otherwise. Theorem 7, part 1 guarantees F(u6) = 0 if F(uk)≠ 0 for 0 ≤ k ≤ 5. Table 7 shows that \(\iota ^{-} = \iota ^{+} = 6\), so all runs take exactly 6 steps. On a side note, we remark that \(Q^{-} = Q^{+} = 9\) can easily be proven. The second experiment displayed in Table 7 uses \((\sigma _{k})\equiv 1-(k+2)^{-4}\). The outcome is in line with Theorem 7, part 1, which asserts global q-superlinear, but not finite, convergence for this choice of (σk), as well as convergence of (Bk). As in example 1, it can be shown that the q-order of (uk) and \(({\lVert B_{k+1}-B_{k}\rVert })\) is 1. To verify this, we repeat the cumulative run with \((\sigma _{k})\equiv 1-(k+2)^{-4}\), using a precision of 100000 digits and \({\lVert F(u^{k})\rVert }\leq 10^{-50000}\) as termination criterion, but only 100 single runs. The result in Table 7 is in line with q-order 1. Despite the fact that all Bk agree with A in n − 1 of n rows, the difference between Bk and A in the last 25% of iterations is large in norm, which, however, does not prevent finite convergence if σk = 1 for at least one k ≥ 1; cf. Theorem 7 and Remark 6.

Table 7 Example 4: Cumulative runs with σ0,1,2,3,5 = 0.1 and σ4 = 1 (top), with \((\sigma _{k})\equiv 1-(k+2)^{-4}\) (middle), with \((\sigma _{k})\equiv 1-(k+2)^{-4}\) and higher precision (bottom)

6 Summary

We have shown that, up to a translation, the iterates of the Broyden-like method for mixed linear–nonlinear systems of equations can be obtained by applying the Broyden-like method to a lower-dimensional mapping, provided that the rows of the initial matrix agree with the rows of the Jacobian for (some of) the linear equations. We have used this subspace property to extend a sufficient condition for convergence of the Broyden-like matrices. For the special case that at most one equation is nonlinear, we have concluded that the Broyden-like matrices converge whenever the iterates converge. For Broyden’s method, we could, in addition, quantify how fast the iterates and the matrix updates converge, and prove finite convergence if the system is linear. We verified these results in high-precision numerical experiments.