Abstract
In optimization, one of the main challenges of the widely used family of quasi-Newton methods is to find an estimate of the Hessian matrix that is as close as possible to the true matrix. In this paper, we develop a new update formula for the Hessian estimate, starting from the Powell-Symmetric-Broyden (PSB) formula and adding information from the previous steps of the optimization path. This leads to a multisecant version of PSB, which we call generalized PSB (gPSB), but which, as was proven before, does not exist in general. We provide a novel interpretation of this non-existence. In addition, we derive a formula that satisfies the multisecant condition and is as close to symmetric as possible, as well as a second formula with the reverse properties. Subsequently, we add enforcement of the last secant equation and present a comparison between the different methods.
References
Beiranvand, V., Hare, W., Lucet, Y.: Best practices for comparing optimization algorithms. Optim. Eng. 18(4), 815–848 (2017)
Bertolazzi, E.: Quasi-Newton methods for minimization (2011). http://www.ing.unitn.it/~bertolaz/2-teaching/2011-2012/AA-2011-2012-OPTIM/lezioni/slides-mQN.pdf
Boutet, N., Haelterman, R., Degroote, J.: Secant update version of quasi-Newton PSB with weighted multisecant equations. Comput. Optim. Appl. pp. 1–26 (2020). https://biblio.ugent.be/publication/8644687/file/8644688
Boyd, S., Dattorro, J.: Alternating projections. EE392o, Stanford University (2003). https://pdfs.semanticscholar.org/1ed0/e86a12d31f1897b96b081489101a79da818a.pdf
Broyden, C.: On the discovery of the “good Broyden” method. Math. Program. 87(2), 209–213 (2000)
Broyden, C.G.: A class of methods for solving nonlinear simultaneous equations. Math. Comput. 19(92), 577–593 (1965)
Broyden, C.G.: Quasi-Newton methods and their application to function minimisation. Math. Comput. 21(99), 368–381 (1967)
Cheney, W., Goldstein, A.A.: Proximity maps for convex sets. Proc. Am. Math. Soc. 10(3), 448–450 (1959)
Courrieu, P.: Fast computation of Moore-Penrose inverse matrices. arXiv preprint arXiv:0804.4809 (2008)
Degroote, J., Bathe, K.J., Vierendeels, J.: Performance of a new partitioned procedure versus a monolithic procedure in fluid-structure interaction. Comput. Struct. 87(11–12), 793–801 (2009)
Degroote, J., Hojjat, M., Stavropoulou, E., Wüchner, R., Bletzinger, K.U.: Partitioned solution of an unsteady adjoint for strongly coupled fluid-structure interactions and application to parameter identification of a one-dimensional problem. Struct. Multidiscip. Optim. 47(1), 77–94 (2013)
Dennis, J., Walker, H.F.: Convergence theorems for least-change secant update methods. SIAM J. Numer. Anal. 18(6), 949–987 (1981)
Dennis Jr., J.E., Moré, J.J.: Quasi-Newton methods, motivation and theory. SIAM Rev. 19(1), 46–89 (1977)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)
DuPré, A.M., Kass, S.: Distance and parallelism between flats in Rn. Linear Algebra Appl. 171, 99–107 (1992)
Errico, R.M.: What is an adjoint model? Bull. Am. Meteorol. Soc. 78(11), 2577–2591 (1997)
Fang, H.R., Saad, Y.: Two classes of multisecant methods for nonlinear acceleration. Numer. Linear Algebra Appl. 16(3), 197–221 (2009)
Gould, N.I., Orban, D., Toint, P.L.: CUTEst: A constrained and unconstrained testing environment with safe threads for mathematical optimization. Comput. Optim. Appl. 60(3), 545–557 (2015)
Gratton, S., Malmedy, V., Toint, P.L.: Quasi-Newton updates with weighted secant equations. Optim. Methods Softw. 30(4), 748–755 (2015)
Gratton, S., Toint, P.: Multi-secant equations, approximate invariant subspaces and multigrid optimization. Ph.D. thesis, tech. rep., Dept of Mathematics, FUNDP, Namur (B) (2007). http://perso.fundp.ac.be/~phtoint/pubs/TR07-11.pdf
Gross, J., Trenkler, G.: On the least squares distance between affine subspaces. Linear Algebra Appl. 237, 269–276 (1996)
Haelterman, R.: Analytical study of the least squares quasi-Newton method for interaction problems. Ph.D. thesis, Ghent University (2009). https://biblio.ugent.be/publication/720660
Haelterman, R., Bogaers, A., Degroote, J., Boutet, N.: Quasi-Newton methods for the acceleration of multi-physics codes. Int. J. Appl. Math. 47(3), 352–360 (2017)
Haelterman, R., Bogaers, A.E., Scheufele, K., Uekermann, B., Mehl, M.: Improving the performance of the partitioned QN-ILS procedure for fluid-structure interaction problems: Filtering. Comput. Struct. 171, 9–17 (2016)
Haelterman, R., Degroote, J., Van Heule, D., Vierendeels, J.: The quasi-Newton least squares method: A new and fast secant method analyzed for linear systems. SIAM J. Numer. Anal. 47(3), 2347–2368 (2009)
Khalfan, H.F., Byrd, R.H., Schnabel, R.B.: A theoretical and experimental study of the symmetric rank-one update. SIAM J. Optim. 3(1), 1–24 (1993)
Kim, D., Sra, S., Dhillon, I.S.: A new projected quasi-Newton approach for the nonnegative least squares problem. Tech. rep., Computer Science Department, University of Texas at Austin (2006). https://pdfs.semanticscholar.org/1e8c/118ad4e92c0927b19ec2bcb1ae8623aebde7.pdf
Mielczarek, D.: Minimal projections onto spaces of symmetric matrices. Univ. Iagel. Acta Math. 44, 69–82 (2006)
Morales, J.L.: Variational quasi-Newton formulas for systems of nonlinear equations and optimization problems. (2008). http://users.eecs.northwestern.edu/~morales/PSfiles/PSB.pdf
Moré, J.J., Thuente, D.J.: Line search algorithms with guaranteed sufficient decrease. ACM Trans. Math. Softw. (TOMS) 20(3), 286–307 (1994)
Pang, C.J.: Accelerating the alternating projection algorithm for the case of affine subspaces using supporting hyperplanes. Linear Algebra Appl. 469, 419–439 (2015)
Plessix, R.E.: A review of the adjoint-state method for computing the gradient of a functional with geophysical applications. Geophys. J. Int. 167(2), 495–503 (2006)
Powell, M.: Beyond symmetric Broyden for updating quadratic models in minimization without derivatives. Math. Program. 138(1–2), 475–500 (2013)
Powell, M.J.: A new algorithm for unconstrained optimization. In: Nonlinear Programming, pp. 31–65. Elsevier (1970). https://www.sciencedirect.com/science/article/pii/B9780125970501500063
Rheinboldt, W.C.: Quasi-Newton methods. Lecture Notes, TU Munich (2000). https://www-m2.ma.tum.de/foswiki/pub/M2/Allgemeines/SemWs09/quasi-newt.pdf
Scheufele, K., Mehl, M.: Robust multisecant Quasi-Newton variants for parallel fluid-structure simulations–and other multiphysics applications. SIAM J. Sci. Comput. 39(5), S404–S433 (2017)
Schnabel, R.B.: Quasi-Newton methods using multiple secant equations. Tech. rep., DTIC Document (1983). http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA131444
Appendices
Preliminary lemmas
In the following proofs, we will apply the following simplifications (and their transposed versions). Let \(X \in \mathbb {R}^{n \times m}\) and \(Y \in \mathbb {R}^{n \times m}\), and let \(\mathbf{x} \) denote the last column of X (likewise \(\mathbf {y}\) for Y). We give here some useful results.
Lemma A.1

\(XX^+\mathbf{x} =\mathbf{x} \) (and, by transposition, \(\mathbf{x} ^T(X^+)^TX^T=\mathbf{x} ^T\)).
Proof
\(XX^+\mathbf{x} \) is the last column of \(XX^+X\), but \(XX^+X=X\). So \(XX^+\mathbf{x} \) is the last column of X which is \(\mathbf{x} \). The second form is simply the transposed expression. \(\square \)
Lemma A.2

\(\mathbf{x} ^TXX^+=\mathbf{x} ^T\) (and, by transposition, \((X^+)^TX^T\mathbf{x} =\mathbf{x} \)).
Proof
\(\mathbf{x} ^TXX^+\) is the last row of \(X^TXX^+\). But \(X^TXX^+=X^TX(X^T X)^{-1} X^T=X^T\). So \(\mathbf{x} ^TXX^+\) is the last row of \(X^T\) which is \(\mathbf{x} ^T\). The second form is simply the transposed expression. \(\square \)
Lemma A.3

\(YX^+\mathbf{x} =\mathbf {y}\) (and, by transposition, \(\mathbf{x} ^T(X^+)^TY^T=\mathbf {y}^T\)).
Proof
\(YX^+\mathbf{x} \) is the last column of \(YX^+X\). But \(YX^+X=Y\). So \(YX^+\mathbf{x} \) is the last column of Y which is \(\mathbf {y}\). The second form is simply the transposed expression. \(\square \)
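These three identities are easy to confirm numerically. The following NumPy sketch (our illustration, with arbitrary sizes; X is assumed to have full column rank, as in the proofs of Lemmas A.2 and A.3) checks all three lemmas on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 3
X = rng.standard_normal((n, m))   # full column rank (almost surely)
Y = rng.standard_normal((n, m))
x, y = X[:, -1], Y[:, -1]         # last columns of X and Y
Xp = np.linalg.pinv(X)            # Moore-Penrose pseudoinverse X^+

# Lemma A.1: X X^+ x = x  (x is the last column of X X^+ X = X)
assert np.allclose(X @ Xp @ x, x)
# Lemma A.2: x^T X X^+ = x^T  (x^T is the last row of X^T X X^+ = X^T)
assert np.allclose(x @ X @ Xp, x)
# Lemma A.3: Y X^+ x = y  (y is the last column of Y X^+ X = Y)
assert np.allclose(Y @ Xp @ x, y)
```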
Alternating Projections applied on PSB
1.1 Closed convex sets
Here are the proofs that the sets used in Section 3.2 are closed and convex.
Lemma B.1
The set of symmetric matrices is a closed convex set.
Proof
The set of symmetric matrices is a vector subspace of the finite-dimensional space \(\mathbb {R}^{n \times n}\). It is therefore closed and convex. \(\square \)
Lemma B.2
The set of multisecant matrices is a closed convex set.
Proof
The set of multisecant matrices is the set of matrices A satisfying the linear equations \(AS=Y\). This is an affine subspace of a finite-dimensional space, so it is closed and convex. \(\square \)
1.2 Projection on the set of multisecant matrices
The formula for the projection on \(K_{MS}\) is (3.2):
Proof
For readability, and to avoid handling too many subscripts, we slightly change the notation in this development. We write:
-
We omit the subscript i for S and Y.
-
j, k the scalar coordinates within a vector or a matrix (j-th row, k-th column).
We start with the following optimization problem:
We take the Lagrangian of the system:
We now take the partial derivative with respect to \(A_{j,k}\). We first note that:
We find:
The system can thus be written as:
Putting (B.1) into (B.2), we find:
Putting this back into (B.1), we have a new update formula:
\(\square \)
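As a numerical illustration (ours, not from the paper), the resulting projection can be checked with NumPy. We use our reading of (3.2), namely \(P_{K_{MS}}(B)=B+(Y-BS)S^+\), assuming S has full column rank; the sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 2
S = rng.standard_normal((n, m))   # full column rank (almost surely)
Y = rng.standard_normal((n, m))
B = rng.standard_normal((n, n))   # current Hessian estimate

# Projection of B onto K_MS = {A : A S = Y} in the Frobenius norm
Sp = np.linalg.pinv(S)            # here S^+ = (S^T S)^{-1} S^T
B_proj = B + (Y - B @ S) @ Sp

assert np.allclose(B_proj @ S, Y)             # all secant equations hold
# the correction has rank at most m, as expected for a multisecant update
assert np.linalg.matrix_rank(B_proj - B) <= m
```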
1.3 Generalized PSB
Proof
As explained in Sect. 3.1, we use alternating projections. We project alternately on the set of multisecant matrices (\(K_{MS}\)) and on the subspace of symmetric matrices (\(K_{Sym}\)):
-
\(K_{MS}\): Defined in (3.2) for the projection on the set of multisecant matrices. We call the projection \(_jB\).
-
\(K_{Sym}\): Equation (3.1) for the projection on the set of symmetric matrices. We call the projection \(_j\bar{B}\).
We recall that \((S^T S)^{-1} S^T=S^+\) (the Moore-Penrose pseudoinverse, since S has full column rank). We start from \(_0B\) and develop:
After these first two projections, we continue:
We define:
We can easily see that, when \(j=2\), corresponds to and, when \(j=3\), corresponds to . We also easily check that is the projection of using equation (3.1), and that projecting with (3.2) gives .
Finally, taking the limit \(j \rightarrow \infty \), we see that the sequences converge to two different formulas. On one side, we have , the symmetric formula closest to the space of matrices satisfying the multiple secant equations: gPSB Sym. This proves Theorem 1.
The second formula, , gives the matrix that satisfies the multiple secant equations and is closest to the set of symmetric matrices: gPSB MS. This proves Theorem 2. \(\square \)
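The alternating-projection construction above can be sketched in a few lines of NumPy (our illustration, not the authors' code; sizes are arbitrary and the projections are our reading of (3.1) and (3.2)). Each iterate \(_jB\) satisfies the multisecant equations exactly and each \(_j\bar{B}\) is exactly symmetric, while the two sequences converge toward gPSB MS and gPSB Sym respectively:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 2
S = rng.standard_normal((n, m))
Y = rng.standard_normal((n, m))  # generic Y: S^T Y != Y^T S, so gPSB itself does not exist
B = rng.standard_normal((n, n))
Sp = np.linalg.pinv(S)

def proj_sym(A):                 # projection on K_Sym, cf. (3.1)
    return (A + A.T) / 2

def proj_ms(A):                  # projection on K_MS, cf. (3.2)
    return A + (Y - A @ S) @ Sp

for _ in range(500):             # alternate until (numerical) convergence
    B_ms = proj_ms(B)            # _jB: satisfies all secant equations exactly
    B = proj_sym(B_ms)           # _j\bar{B}: exactly symmetric
B_sym = B

assert np.allclose(B_ms @ S, Y)        # gPSB MS: multisecant equations hold
assert np.allclose(B_sym, B_sym.T)     # gPSB Sym: symmetric
assert not np.allclose(B_ms, B_sym)    # limits differ since S^T Y != Y^T S
```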
1.4 Existence conditions of gPSB
1.4.1 Theorem 3
We give here a proof of Theorem 3, which is an alternative proof of Schnabel's theorem [37] on the non-existence of gPSB in general.
Proof
We are looking for \(B \in \mathbb {R}^{n \times n}: B S=Y\) and \(B=B^T\) with S and \(Y \in \mathbb {R}^{n \times m}\).
Step 1: Construction of \(S^{\texttt {syst}}\)
We will first construct a matrix that we call \(S^{\texttt {syst}}\). Using the notation \(X_{*,k}\) for the k-th column of a matrix X, we notice that \(B \mathbf {s}_{*,1}=\mathbf {y}_{*,1}\) can be expressed as \(S^1 \mathbf{b} =\mathbf {y}_{*,1}\) where
-
\(\mathbf{b} =\texttt {vec}(B)\), a column vector containing every element of B.
-
\(\mathbf {s}_{*,1}\) is the first column of S and \(\mathbf {y}_{*,1}\) the first column of Y.
-
\(S^1\) is a block diagonal \(n \times n^2\) matrix containing the row vector \(\mathbf {s}_{*,1}^T\) in each block of the diagonal.
We create \(S^i\) and \(\mathbf {y}_{*,i}\) in the same way for the following columns of S and Y.
We now express the symmetry condition in the same form: \(\Sigma \mathbf{b} =\mathbf{0} \), where \(\Sigma \) is a \(\frac{n(n-1)}{2} \times n^2\) matrix. For each couple \(\{b_{ij},b_{ji}\}\) (\(1 \le i \le n-1\), \(i+1 \le j \le n\)), it has a row containing 0 everywhere except at the positions corresponding to \(b_{ij}\) and \(b_{ji}\), where the values are 1 and \(-1\) respectively.
Imposing the two conditions \(B S=Y\) and \(B=B^T\) together, we thus have:
Here we used the symbol \([X_1|X_2]\) to indicate the concatenation of column vectors next to each other. See illustration in Example 1.
Example 1
For instance, for \(n=3\) and \(m=2\), we have:
Step 2: Linear dependence for \(m=2\)
Let us build \(S^{\texttt {syst}}\) for \(m=2\). We then apply the following linear combination:
-
We multiply the first n rows by the coefficients of \(\mathbf {s}_{*,2}\).
-
We multiply the next n rows by the coefficients of \(-\mathbf {s}_{*,1}\).
-
For the columns corresponding to \(b_{ii}\), the linear combination of the rows is then equal to zero. There are only zeroes in the rows of \(\Sigma \) for these columns.
-
For the columns corresponding to \(b_{ij}\) with \(i \ne j\), the linear combination of the first 2n rows gives a value \(\delta _{ij}\). We notice that \(\delta _{ij}=-\delta _{ji}\). For these columns, there is a single row of \(\Sigma \) having 1 in the column corresponding to \(b_{ij}\) (\(i<j\)), \(-1\) in the column of \(b_{ji}\) and 0 everywhere else; no other row of \(\Sigma \) has a non-zero value in the columns of \(b_{ij}\) or \(b_{ji}\). So we multiply that row by \(-\delta _{ij}\) and the linear combination gives 0 for those columns, too.
See Example 2 for an illustration.
Applying this linear combination to the matrix yields a row containing only zeroes. So there is at least one linear dependency between the rows of \(S^{\texttt {syst}}\): the rank of \(S^{\texttt {syst}}\) is at least one less than its number of rows.
Example 2
For our example with \(n=3\), we associate the following coefficients to the successive rows:
Using these coefficients to define a linear combination of the rows, we can easily check that this combination leads to \(\mathbf{0} \).
Step 3: Existence conditions for \(m=2\)
\(\Rightarrow \) If \(BS=Y\) has a solution, then \(S^{\texttt {syst}} \mathbf{b} =\mathbf {y}_i\) has a solution, too. In this case, the linear dependency between the rows of matrix \(S^{\texttt {syst}}\) also holds for \(\mathbf {y}_i\). Applied to \(\mathbf {y}_i\), this linear combination gives:
\(\Leftarrow \) Conversely, if \(\mathbf {s}_{*,2}^T\mathbf {y}_{*,1}+(-\mathbf {s}_{*,1}^T)\mathbf {y}_{*,2}=0\), then applying the linear combination to \(\mathbf {y}_i\) gives 0. The linear dependency of \(S^{\texttt {syst}}\) then also holds for \(\mathbf {y}_i\), so \(S^{\texttt {syst}} \mathbf{b} =\mathbf {y}_i\) has a solution, which is then also the case for \(BS=Y\).
Step 4: Conditions for \(m>2\)
Extending to higher values of m, for each pair i, j of the columns of S and Y, we have a solution if and only if \(\mathbf {s}_{*,i}^T\mathbf {y}_{*,j}=\mathbf {s}_{*,j}^T\mathbf {y}_{*,i}=\mathbf {y}_{*,i}^T\mathbf {s}_{*,j}\). For the second equality we used the fact that the transpose of a scalar is the scalar itself. This is equivalent to \(\left[ S^TY\right] _{i,j}=\left[ Y^TS\right] _{i,j}\) which leads to \(S^TY=Y^TS\). \(\square \)
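The condition \(S^TY=Y^TS\) can be illustrated numerically (our sketch, arbitrary sizes): when Y is produced by a symmetric matrix the condition holds, while for generic Y it fails, so no symmetric multisecant B exists:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 5, 3
S = rng.standard_normal((n, m))

# If Y = B S for some symmetric B, then S^T Y = S^T B S is symmetric:
B = rng.standard_normal((n, n))
B = (B + B.T) / 2                      # make B symmetric
Y = B @ S
assert np.allclose(S.T @ Y, Y.T @ S)   # condition of Theorem 3 holds

# For generic Y the condition fails, so no symmetric B with B S = Y exists:
Y_bad = rng.standard_normal((n, m))
assert not np.allclose(S.T @ Y_bad, Y_bad.T @ S)
```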
1.4.2 Theorem 4
We give here the proof of Theorem 4.
Proof
Step 0: Linear dependencies for \(m=2\)
This has been shown above in the alternative proof of Theorem 3. By reasoning column by column, we found a single linear combination involving all the rows.
Step 1: Linear dependencies for \(m=3\)
For \(m=3\), we have \(S^{\texttt {syst}}_3=[S_1^T| S_2^T| S_3^T| \Sigma ^T]^T\) (see the previous proof for details about the construction of this matrix). Thanks to Step 0, we know that we have 3 linear combinations (one for each couple of columns of the matrix S). However, those linear combinations could be equivalent, as one of them could be a combination of the other two. We have to prove that those combinations are distinct. We proceed by contradiction: we assume that such a dependency exists and prove that this is only possible if S is not full-rank. We define \(S^{\texttt {syst}}_{2}=[S_1^T| S_2^T| \Sigma ^T]^T\).
The following facts are noted:
-
(1)
By hypothesis, we have \(n \ge 3\).
-
(2)
The dimension of \(S^{\texttt {syst}}_{2}\) is \(\frac{n(n+3)}{2} \times n^2\). The number of rows is less than or equal to the number of columns.
-
(3)
As we have proven that, in \(S^{\texttt {syst}}_{2}\), there is only one linear dependency involving every row, we can take a subset of \(\frac{n(n+3)}{2}-1\) rows without losing information or reducing the rank of the system.
-
(4)
There are at least \(n+1\) independent rows in \(S^{\texttt {syst}}_{2}\). Indeed, we have n independent rows in \(S_1\) and at least one extra independent row in \(S_2\), because S is full-rank. Therefore, the dimension of the span of its rows is at least \(n+1\).
-
(5)
The dimension of the span of the rows of \(S_3\) is at most n.
This leads us to the following conclusion: if the span of \(S_3\) is included in the span of \(S^{\texttt {syst}}_{2}\), then there is only one linear dependency in \(S^{\texttt {syst}}_3\); otherwise, there are 3 distinct linear combinations.
For the proof by contradiction, we assume that the span of \(S_3\) is included in the span of \(S^{\texttt {syst}}_{2}\). Then, for each row of \(S_3\), there exists a linear combination of the rows of \(S^{\texttt {syst}}_{2}\) that is equal to it. Thanks to point (3) above, we know that we can remove one row of \(S^{\texttt {syst}}_{2}\). We thus start by replacing the first row of \(S^{\texttt {syst}}_{2}\) by the first row of \(S_3\) (see Example 3).
Example 3
For the case \(n=3\), we have then:
We then apply the following linear combination:
-
We multiply the first n rows by the coefficients of \(\mathbf {s}_{*,2}\).
-
We multiply the \(n+1\)-th row by \(-\mathbf {s}_{1,3}\).
-
We multiply the \(n-1\) following rows by the coefficients of \(-\mathbf {s}_{*,1}\).
-
For the columns corresponding to \(b_{ii}\), the linear combination of the rows is then equal to zero (there are only zeroes in the rows of \(\Sigma \) for these columns).
-
For the columns corresponding to \(b_{ij}\) with \(i \ne j\) and \(i,j \ne 1\), the linear combination of the first 2n rows gives a value \(\delta _{ij}\).
-
We notice that \(\delta _{ij}=-\delta _{ji}\).
-
But for these columns, there is a single row of \(\Sigma \) having 1 in the column corresponding to \(b_{ij}\) (\(i<j)\), \(-1\) in the column of \(b_{ji}\) and 0 everywhere else.
-
No other row of \(\Sigma \) has a non-zero value under the column of \(b_{ij}\) or \(b_{ji}\).
-
So we multiply each such row by \(-\delta _{ij}\) and the linear combination gives 0 for those columns, too.
-
For the columns corresponding to \(b_{1i}\) and \(b_{i1}\) with \(i \ne 1\), we should also have \(\delta _{1i}=-\delta _{i1}\), but the equation is not automatically satisfied. This gives us \(n-1\) equations (for \(i=2,\dots ,n\)):
$$\begin{aligned} \mathbf {s}_{1,2}\mathbf {s}_{i,3}-\mathbf {s}_{i,2}\mathbf {s}_{1,3}=\mathbf {s}_{1,2}\mathbf {s}_{i,1}-\mathbf {s}_{i,2}\mathbf {s}_{1,1}\end{aligned}$$(B.3)
Applying the same reasoning, but after replacing the \(n+1\)-th row of \(S^{\texttt {syst}}_{2}\) by the first row of \(S_3\), leads to another \(n-1\) equations:
Combining (B.3) and (B.4) leads to:
We can now apply the same process, successively replacing the i-th and \((n+i)\)-th rows of \(S^{\texttt {syst}}_{2}\) by the i-th row of \(S_3\), for \(i=2,\dots ,n\). This leads to the general equations:
The consequence is \(\mathbf {s}_{*,3}=k(\mathbf {s}_{*,1}+\mathbf {s}_{*,2})\). S is then not full-rank, which contradicts our hypothesis. So the span of \(S_3\) is not included in the span of \(S^{\texttt {syst}}_{2}\) and, for \(m=3\), we have 3 distinct linear combinations.
Step 2: Linear dependencies for \(m>3\)
The proof extends to higher values of m by applying the reasoning of Step 1 to every combination of 3 submatrices \(S_i\). Each additional column of S thus adds one new linear dependency with every other column. In general, for given n and m, we have a system of \(m n + \frac{n^2-n}{2}\) equations in \(n^2\) variables, but the rank of the matrix is only \(m n + \frac{n^2-n}{2} - \frac{m^2-m}{2}\). \(\square \)
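The statement on the rank of \(S^{\texttt {syst}}\) can be checked numerically. The sketch below (ours) assembles \(S^{\texttt {syst}}\) for arbitrary n and m, using \(\mathbf{b} =\texttt {vec}(B)\) with the rows of B stacked (so that \(S^i = I_n \otimes \mathbf {s}_{*,i}^T\), matching the block-diagonal description above), and verifies that the rank deficiency is exactly \(\frac{m^2-m}{2}\):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 4, 3
S = rng.standard_normal((n, m))   # full-rank almost surely, n >= 3

# One block S^i = kron(I_n, s_{*,i}^T) per column of S encodes B s_i = y_i,
# with b = vec(B) stacking the rows of B.
blocks = [np.kron(np.eye(n), S[:, i].reshape(1, n)) for i in range(m)]

# Symmetry rows Sigma: one row per couple {b_ij, b_ji}, i < j.
sigma_rows = []
for i in range(n):
    for j in range(i + 1, n):
        row = np.zeros(n * n)
        row[i * n + j] = 1.0      # coefficient of b_{ij}
        row[j * n + i] = -1.0     # coefficient of b_{ji}
        sigma_rows.append(row)
S_syst = np.vstack(blocks + [np.array(sigma_rows)])

rows = m * n + n * (n - 1) // 2
rank = np.linalg.matrix_rank(S_syst)
assert S_syst.shape == (rows, n * n)
# Theorem 4: exactly m(m-1)/2 linear dependencies among the rows
assert rank == rows - m * (m - 1) // 2
```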
SUgPSB
Alternating Projections applied on SUgPSB
1.1 Closed convex sets
Here are the proofs that the sets used in Sect. 4.1 are closed and convex.
Lemma D.1
The set of symmetric matrices satisfying the last secant equation is a closed convex set.
Proof
This set is in fact the intersection of the set \(K_{Sym}\) and the set \(K_{SU}\), the latter being a special case of \(K_{MS}\) (one secant equation instead of multiple secant equations).
In Lemma B.1, we have already proved that \(K_{Sym}\) is closed and convex.
By applying Lemma B.2 to \(K_{SU}\), where we take only one column of S and Y, we prove that \(K_{SU}\) is a closed convex set.
As both sets (\(K_{Sym}\) and \(K_{SU}\)) are closed and convex, their intersection \(K_{SymSU}\) is closed and convex. \(\square \)
Lemma D.2
Within the set of multisecant matrices for given secant equations, the subset of matrices that are nearest to being symmetric is a closed convex set.
Proof
The set of multisecant matrices (\(K_{MS}\)) is an affine subspace. The set of symmetric matrices (\(K_{Sym}\)) is a vector subspace, which is a special case of an affine subspace containing the origin.
The end space \(E_1\), as defined in [15, 21], is the set of points \(E_1:=\left\{ \mathbf{x} \in L_1: d(\mathbf{x} ,L_2)=d(L_1,L_2)\right\} \), where \(L_1\) and \(L_2\) are affine subspaces and d is the distance between a point and an affine subspace or between two affine subspaces. In Theorem 2 of [15], it is shown that \(\mathbf{x} \in E_1\) solves an equation of the form \(A\mathbf{x} =\mathbf{b} \), so \(E_1\) is an affine subspace. Applying this with \(K_{MS}\) as \(L_1\) and \(K_{Sym}\) as \(L_2\), we prove that \(K_{MS\triangleright Sym}\) is an affine subspace, so it is closed and convex. \(\square \)
1.2 PSB non symmetric start
We prove here formula (4.1).
Theorem 9
(PSB - Direct form from a non-symmetric matrix) Let \(B_i \in \mathbb {R}^{n \times n}\), \(\mathbf {y}_i\) and \(\mathbf {s}_i \in \mathbb {R}^{n \times 1}\), and \(S_i\) full-rank. Let \(B_{i+1}\) be such that:
-
\(B_{i+1}\mathbf {s}_i=\mathbf {y}_i\)
-
\(\left\| B_{i+1}-B_i\right\| _{F}\) is minimal
Then, \(B_{i+1}\) is given by:
With
-
\(\bar{B}_i=\frac{B_i+B^T_i}{2}\)
-
\(\bar{\mathbf {w}_i}=\mathbf {y}_i-\bar{B}_i\mathbf {s}_i\)
Proof
We will apply the method of alternating projections with:
-
\(K_{Sym}\): equation (3.1) for the symmetrical projection, giving \(_j\bar{B}\)
-
\(K_{SU}\): Broyden “good” [5, 6] for the secant update projection, giving \(_jB\)
Let us start with \(_0B\). We first project onto the set of symmetric matrices. For readability, and as we work within one step of the quasi-Newton process, we omit the subscript i referring to the quasi-Newton step. We have:
We should now project onto the set of symmetric matrices again. But the result is already symmetric, so the alternating projection has already converged. \(\square \)
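The direct form can be verified numerically. The sketch below assumes that formula (4.1), not reproduced here, is the classical PSB update built from \(\bar{B}_i\) and \(\bar{\mathbf {w}}_i\) as defined above (our reading of the theorem); it checks the secant equation and the symmetry of the result:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
B = rng.standard_normal((n, n))        # non-symmetric starting matrix B_i
s = rng.standard_normal(n)
y = rng.standard_normal(n)

B_bar = (B + B.T) / 2                  # symmetric part \bar{B}_i of B_i
w = y - B_bar @ s                      # \bar{w}_i = y_i - \bar{B}_i s_i
ss = s @ s                             # s^T s

# PSB update applied to the symmetrized matrix (our reading of (4.1)):
B_next = B_bar + (np.outer(w, s) + np.outer(s, w)) / ss \
         - (s @ w) * np.outer(s, s) / ss**2

assert np.allclose(B_next @ s, y)      # secant equation B_{i+1} s_i = y_i
assert np.allclose(B_next, B_next.T)   # B_{i+1} is symmetric
```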
1.3 SUgPSB Sym & SUgPSB MS
Proof
We project alternatively on the 2 sets: \(K_{MS\triangleright Sym}\) and \(K_{SymSU}\).
We find that (D.1) is equal to (D.3): the sequence converges to a fixed point. Equations (D.1) and (D.2) thus lead respectively to the formulas of Theorems 6 (SUgPSB MS) and 5 (SUgPSB Sym). \(\square \)
Boutet, N., Haelterman, R. & Degroote, J. Secant Update generalized version of PSB: a new approach. Comput Optim Appl 78, 953–982 (2021). https://doi.org/10.1007/s10589-020-00256-1