Abstract
A thick-restart Lanczos type algorithm is proposed for Hermitian J-symmetric matrices. Since Hermitian J-symmetric matrices possess doubly degenerate spectra or doubly multiple eigenvalues with a simple relation between the degenerate eigenvectors, we can improve the convergence of the Lanczos algorithm by restricting the search space of the Krylov subspace to that spanned by one of each pair of the degenerate eigenvector pairs. We show that the Lanczos iteration is compatible with the J-symmetry, so that the subspace can be split into two subspaces that are orthogonal to each other. The proposed algorithm searches for eigenvectors in one of the two subspaces without the multiplicity. The other eigenvectors paired to them can be easily reconstructed with the simple relation from the J-symmetry. We test our algorithm on randomly generated small dense matrices and a sparse large matrix originating from a quantum field theory.
Similar content being viewed by others
1 Introduction
In theoretical physics, symmetry is an important guiding principle to search for laws of nature. When numerically simulating a physics system that has a symmetry, it is preferable to retain the symmetry property in the numerical simulation algorithm. Hermitian matrices often appear in analyzing physics systems, and spectra analysis is important to understand the nature. In this case the Hermitian symmetry must be considered in the spectra analysis and the eigensolver algorithm. Developing eigensolver algorithms for spectra analysis is one of the major research areas in applied mathematics.
In physics, matrices having both Hermiticity symmetry and J-symmetry exist. Let A be a Hermitian and J-symmetric matrix in \({\mathbb {C}}^{n\times n}\). The J-symmetric property of A is defined by
where J is a skew-symmetric matrix in \({\mathbb {R}}^{n\times n}\). We take this definition from [15] for the J-symmetry.Footnote 1 Throughout this paper, we employ the following notations: \(^{{T}}\) for matrix transposition, \(^{{H}}\) for Hermitian conjugation, and \(^*\) for complex conjugation. We use I and \(O\) to denote the identity and null matrices with appropriate sizes, respectively. The eigenvalues of A are doubly degenerated, or doubly multiple, with the J-symmetry. Let x be an eigenvector of A associated to the eigenvalue \(\lambda\), satisfying \(A x = \lambda x\). The vector defined by
is also the eigenvector of A associated to \(\lambda\) because of the symmetry (1).
Any Hermitian J-symmetric matrix A has the following block structure. Without loss of generality, J is
where the size of each block is \((n/2)\times (n/2)\) and n is an even number. Based on this, the symmetry (1) requires
where \(A_{11}\) and \(A_{12}\) are \((n/2)\times (n/2)\) matrices satisfying \(A_{11}^{{H}}=A_{11}\) and \(A_{12}^{{T}}=-A_{12}\). We can also explicitly construct eigenvectors x and \(y=J x^*\) associated with an eigenvalue \(\lambda\) in the block structure. Then A can be diagonalized as
where \(\varLambda =\mathrm {diag}(\lambda _1,\lambda _2,\dots ,\lambda _{n/2})\), and \(X_1\) and \(X_2\) are \((n/2)\times (n/2)\) matrices satisfying \(X_1^{{H}}X_1+X_2^{{H}}X_2=I\) and \(X_1^{{T}}X_2-X_2^{{T}}X_1=O\).
To investigate the spectrum of a sparse matrix in a large dimension, iterative eigensolver algorithms are generally employed. Iterative eigensolver algorithms in the literature, such as the Lanczos algorithm for Hermitian matrices, do not consider the J-symmetry. Therefore, even after convergence on one of a pair of the degenerate eigenvector pairs, the algorithm continues to search for the other eigenvector associated to the converged eigenvalue even though the other eigenvector can be easily reconstructed from the converged one. In this article, we restrict ourselves to improving the Lanczos algorithm for large Hermitian J-symmetric matrices by incorporating the J-symmetry property.
We could improve the convergence of the Lanczos algorithm by restricting the search space of the Krylov subspace of A to that spanned by one of each pair of the doubly degenerate eigenvector pairs by imposing complete orthogonality to the subspace spanned by the other eigenvectors paired to them. However, this strategy cannot be directly realized before knowing the invariant subspace of A. We find that the Krylov subspace \({\mathscr {K}}_k(A,v)=\mathrm {span}\{v,Av,\dots ,A^{k-1}v\}\) generated with the Lanczos algorithm with a starting vector v is orthogonal to \({\mathscr {K}}_k(A,Jv^*)=\mathrm {span}\{Jv^*,A Jv^*,\dots ,A^{k-1} Jv^*\}\) with the same Lanczos algorithm with the starting vector \(Jv^*\). On the basis of this property, we can obtain only one half of the degenerate eigenvectors in \({\mathscr {K}}_k(A,v)\) and can reconstruct the other half via (2). This can achieve the strategy just stated above.
Early attempts have been made to incorporate the symmetry property or structure of matrices into eigensolver algorithms in [7, 15] in which dense matrix algorithms were developed for the Hermitian J-symmetric matrices that appeared in a quantum mechanical system with time-reversal and inversion symmetry. The dense matrix algorithms retain or respect the matrix structure (4) during the computation [7, 15]. To focus on the algorithm for large sparse matrices, we do not discuss algorithms for dense matrices in this paper. The one similar to our algorithm that incorporates the symmetry properties of matrices was studied in [3,4,5, 13], in which the Lanczos type eigensolvers for complex J-skew-symmetric matrices or Hamiltonian matrices were investigated.Footnote 2 Because the symmetry treated in [3,4,5, 13] is different from the Hermitian J-symmetry, their algorithm cannot be directly applied to Hermitian J-symmetric matrices, even though it is based on the Lanczos algorithm. See also [12] for numerical algorithms for structured eigenvalue problems.
This paper is organized as follows. In the next section, we prove the orthogonality between the Krylov subspace \({\mathscr {K}}_k(A,v)\) generated with the Lanczos iteration to \({\mathscr {K}}_k(A,Jv^*)\) with the same Lanczos iteration. Having observed the orthogonality, we describe the restarting method based on the Krylov–Schur transformation method [19] and the thick-restart method [20, 21]. Then, we propose the thick-restart Lanczos algorithm for Hermitian J-symmetric matrices in Sect. 3. We estimate the computational cost in terms of the matrix–vector multiplication. In Sect. 4, we test the proposed algorithm for two types of the Hermitian J-symmetric matrix. The one type is an artificial randomly generated matrix satisfying the structure of (4) and (5), and the other matrix originates from quantum field theory. The convergence behavior and the computational cost are compared with those of the standard thick-restart Lanczos algorithm. We summarize this paper in the last section.
2 Lanczos iteration and J-symmetry
The Lanczos iteration transforms A to a tridiagonal form by generating the orthonormal basis vectors. The Lanczos decomposition after m-step iteration starting with a unit vector \(v_1\) is given by
where \(V_m=\left[ v_1,v_2,\dots ,v_m\right]\) and \(v_j \in {\mathbb {C}}^n\), and \(e_m\) is the m-dimensional unit vector in the m-th direction. The basis vectors are orthonormal, \(V_{m+1}^{{H}}V_{m+1}=I\). \(T_m\) denotes the \(m\times m\) tridiagonal matrix that is given by
The approximate eigenpairs are obtained from the eigenpairs of \(T_m\). Because of the Hermiticity property of A and the recurrence structure of the Lanczos iteration, all \(\alpha _j\) and \(\beta _j\) can be taken to be real. Thus, \(T_m\) becomes a real symmetric matrix. In the standard Lanczos iteration, the Hermitian symmetry of A is respected in the form \(T_m\).
We also investigate the J-symmetry property of decomposition (6). After taking the complex conjugate of (6) followed by multiplying it by J from the left-hand side, we have
Using the J-symmetry and Hermiticity, \(JA^*=A^{{H}}J=AJ\), and by defining \(w_j \equiv Jv_j^*\), we obtain
The columns of \(W_{m+1}\) are also orthonormal, \(W_{m+1}^{{H}}W_{m+1}=I\). Consequently, the vectors \(W_{m+1}\) have the same Lanczos decomposition as that of \(V_{m+1}\) when it starts from \(w_1 = Jv_1^*\). Furthermore, we can prove the orthogonality between \(W_{m+1}\) and \(V_{m+1}\).
To implement the thick-restart method [20, 21], we show the J-symmetry property for a generalized decomposition similar to the Krylov–Schur decomposition [19] instead of the Lanczos decomposition in the following part of the paper. After preparing two lemmas related to the J-symmetry, we prove the main theorem for the orthogonality property between \(W_{m+1}\) and \(V_{m+1}\) on the generalized decomposition. Subsequently, the orthogonality properties for the Lanczos decomposition and the thick-restart method are proved as corollaries.
We first show orthogonal properties between v and \(w=Jv^*\).
Lemma 1
Let v be an arbitrary vector in \({\mathbb {C}}^n\), and J and A be matrices satisfying (1). Then, the vector w defined by \(w \equiv J v^*\) satisfies
Proof
From the definition of w, it follows that
where \(J^{{T}} = -J\) is used. The identity \(v^{{T}} Jv = v^{{T}} J^{{T}} v\) and (13) yield
Thus, \(w^{{H}} v = 0\). Similarly,
where \(J^{{T}}=-J\) and \(JA=A^{{T}}J\) are used. The identity \(v^{{T}} A^{{T}} J v= v^{{T}} J^{{T}} A v\) and (15) yield
Thus, \(w^{{H}} A v = 0\). \(\square\)
We have the following lemma that is a generalization of the relation between (6) and (9).
Lemma 2
Let \(v_1,v_2,\ldots ,v_k,v_{k+1}\) be the vectors having the following relation:
where \(V_k=[v_1,\ldots ,v_k]\), \(S_k\) is a matrix in \({\mathbb {R}}^{k\times k}\), and b is a vector in \({\mathbb {R}}^k\) for a Hermitian J-symmetric matrix A satisfying (1). Then, the vectors \(w_j = J v_j^*\) \((j=1,\ldots ,k+1)\) satisfy the following decomposition:
where \(W_k=JV_k^*=[w_1,\ldots ,w_k]\).
Proof
By taking the complex conjugate of (17) followed by multiplying it by J from the left-hand side, we obtain (18) using \(JA^{{T}}=AJ\) and \(A^* = A^{{T}}\). \(\square\)
Using Lemmas 1 and 2, we can show the orthogonality properties between \(W_{k+1}\) and \(V_{k+1}\) as follows.
Theorem 1
Let \(V_{k+1}=[v_1,\ldots ,v_{k+1}]\) be a matrix satisfying (17), and \(W_{k+1}=JV_{k+1}^*\). If the matrices \(V_{k}\) and \(W_{k}\) are orthogonal and A-orthogonal to each other: \(V_k^{{H}}W_k=O\) and \(V_k^{{H}}AW_k=O\), then the matrices \(V_{k+1}\) and \(W_{k+1}\) also satisfy the following orthogonality relations:
Proof
The decomposition (18) follows from Lemma 2. Multiplying \(W_k^{{H}}\) and \(V_k^{{H}}\) to (17) and (18), respectively, yields
From the premise that \(W_k^{{H}} A V_k=V_k^{{H}} A W_k=O\) and \(W_k^{{H}} V_k =V_k^{{H}} W_k=O\), it follows that
Together with Lemma 1 and the premise, we find
Multiplying \(w_{k+1}^{{H}}\) and \(v_{k+1}^{{H}}\) to (17) and (18), respectively, yields
Because of (23) and (24) as well as Lemma 1, the right-hand sides of (26) and (27) vanish. Thus,
Together with Lemma 1 and the premise, we find
Therefore, \(V_{k+1}\) and \(W_{k+1}\) are orthogonal and A-orthogonal to each other. \(\square\)
Using Theorem 1 as well as Lemmas 1 and 2 , we can show that the Lanczos vectors \(V_{m+1}\) generated with (6) are orthogonal and A-orthogonal to \(W_{m+1}=JV_{m+1}^*\).
Corollary 1
The m-step Lanczos vectors \(V_{m+1}\) with \(m\ge 1\) generated with a unit vector \(v_1\) for a Hermitian J-symmetric matrix A satisfy
with
when no breakdown occurs.
Proof
Because the Lanczos decomposition (6) is a particular form of (17) with \(k\rightarrow m\), a real matrix \(S_m \rightarrow T_m\) and a real vector \(b \rightarrow \beta _m e_m\), Lemma 2 can be applied to obtain (9). Because \(w_1^{{H}} v_1=0\) and \(w_1^{{H}} A v_1=0\) hold from Lemma 1, we can apply Theorem 1 to (6) and (9) when \(m=1\). Moreover, the Lanczos decomposition retains its form applicable to Theorem 1 for any \(m>1\). Therefore, the corollary follows from Theorem 1 and Lemmas 1 and 2 by induction. \(\square\)
We cannot simultaneously find degenerate pairs of eigenvectors with the standard single-vector Lanczos process. This is true for computation with exact arithmetic. However, with finite precision arithmetic, the single-vector Lanczos process would generate a small overlap to \(W_k\) via round-off errors, so that even after the convergence of an eigenvector, a late convergence to the other paired eigenvector would be possible. Because this behavior is rather accidental, a block-type Lanczos algorithm has to be applied to accelerate the convergence for degenerate eigenvectors [1, 2, 8, 18, 22].
We further investigate the structure of the Lanczos decomposition. According to Corollary 1 and the Lanczos decompositions (6) and (9), the block-type decomposition can be constructed as:
where we define
where the size of \(E^{{T}}_{[m]}\) is \(2\times 2m\). When \(m = n/2\), the decomposition should terminate, because \(V_{[n/2]}\) completely block-tridiagonalize A and the Krylov subspaces \({\mathscr {K}}_{n/2}(A,v)\) and \({\mathscr {K}}_{n/2}(A,Jv^*)\) span the entire eigenspace of A. Because \({\mathscr {K}}_{n/2}(A,v)\) and \({\mathscr {K}}_{n/2}(A,Jv^*)\) are orthonormal and (6) and (9) are independent iterations, the Lanczos iteration for (6) terminates at \(m=n/2\) regardless of the iteration for (9). We, therefore, can construct eigenvectors from \({\mathscr {K}}_{n/2}(A,v)\) without the eigenvalue multiplicity associated with the J-symmetry. In other words, the standard Lanczos iteration with exact precision arithmetic is enough to find all the eigenvectors without multiplicity. However, this is impractical because it requires exact precision arithmetic. With the finite precision, the orthogonality to \({\mathscr {K}}_{n/2}(A,Jv^*)\) is not maintained because of round-off errors and, eventually, eigenvectors, including multiplicity, could be extracted from the single-vector Lanczos iteration, as stated previously.
By using Corollary 1 and with the above analysis, we can construct a Lanczos type algorithm in which the orthogonality to \(W_k\) is enforced to search for eigenvectors without multiplicity of eigenvalues associated to the J-symmetry. Additionally, the other vectors paired to them can be easily reconstructed. However, for a practical numerical algorithm of the Lanczos type iteration, the iteration should terminate at a finite step, and a restarting mechanism is required [6, 20, 21]. The most useful and simplest but effective restarting method is the so-called thick-restart method [20, 21] that is a specialization of the Krylov–Schur transformation [19] to Hermitian matrices. To involve the thick-restart method to the Lanczos algorithm with the J-symmetry, we have to prove the orthogonality between \(V_k\) and \(W_k\) after the Krylov–Schur transformation and restarting. To achieve this, we have the following corollary.
Corollary 2
Let \(V_{m+1}\) and \(W_{m+1}\) be the orthonormal matrices containing basis vectors generated with the m-step Lanczos process, (6) and (9), and \(Z_k\) be an orthonormal matrix in \({\mathbb {R}}^{k\times k}\). The Krylov–Schur transformation with \(Z_k\) on the Lanczos decomposition is defined by
Then, the matrices \(U_{m+1} = [ U_{m}, v_{m+1} ]\) and \(Q_{m+1} = [Q_{m}, w_{m+1} ]\) satisfy the orthogonal relations: \(U_{m+1}^{{H}} Q_{m+1}=O\) and \(U_{m+1}^{{H}} A Q_{m+1}=O\).
Proof
Because \(U_m\) and \(Q_m\) satisfy \(U_m^{{H}} Q_m=O\) and \(Q_m^{{H}} A U_m=O\) and the decompositions (35) and (36) are particular forms of the decomposition in Lemma 2, the orthogonality and A-orthogonality between \(U_{m+1}\) and \(Q_{m+1}\) simply follow from Theorem 1. \(\square\)
For the thick-restart method, \(Z_m\) is chosen to diagonalize \(T_m\), and the dimension of the decomposition is reduced from m to \(k<m\) with a selection criterion for vectors \(V_k \leftarrow V_m\). In the reduction, the last vectors are kept hold as \(v_{k+1} \leftarrow v_{m+1}\) and \(w_{k+1} \leftarrow w_{m+1}\) to retain the decomposition form properly. The orthogonality properties of the new basis \((U_{m+1},Q_{m+1})\) and the reduced basis \((U_{k+1},Q_{k+1})\) still hold according to Corollary 2. After restarting, the Lanczos iteration continues to keep the decomposed form applicable to Theorem 1. Because the structures of the Lanczos and Krylov–Schur decompositions for \(W_k\) and \(Q_k\) are the same as those for \(V_k\) and \(U_k\), respectively, we do not need to explicitly iterate the Lanczos algorithm for \(W_k\) and \(Q_k\). Consequently, we can continue the Lanczos thick-restart cycle only in the subspace \({\mathscr {K}}_k(A,v)\) that is orthogonal and A-orthogonal to \({\mathscr {K}}_k(A,Jv^*)\). We note that according to Corollaries 1 and 2, all the eigenpairs without multiplicity can be obtained with the thick-restart Lanczos algorithm using exact precision arithmetic. This is impractical and we must enforce the orthogonality between \({\mathscr {K}}_k(A,v)\) and \({\mathscr {K}}_k(A,Jv^*)\) for a practical algorithm.
3 Thick-restart Lanczos algorithm with J-symmetry
Based on Theorem 1 as well as Corollaries 1 and 2, we construct a thick-restart Lanczos algorithm for Hermitian J-symmetric matrices (TRLAN–JSYM) which efficiently searches for eigenvectors without the multiplicity of eigenvalues in \({\mathscr {K}}_k(A,v)\). Algorithm 1 shows the TRLAN–JSYM algorithm. The Lanczos iteration with the J-symmetry is described in Algorithm 2. We include the invert mode for the small eigenvalues.
The main difference from the standard thick-restart Lanczos algorithm (TRLAN) is in Algorithm 2, where we simultaneously construct \(w_{j}\) using \(w_{j}=Jv_j^*\) and enforce the orthogonality of \(V_{j+1}\) to \(W_j =J V_j^*\) to avoid the contamination of the search space \({\mathscr {K}}_k(A,v)\) from \({\mathscr {K}}_k(A,Jv^*)\).
We will compare the efficiency between the TRLAN–JSYM and the standard TRLAN algorithms in Sect. 4. For the comparison, we estimate the computational cost of the TRLAN–JSYM and the standard TRLAN algorithms as follows. We count the total number of matrix–vector multiplication \(Av_j\) or \(A^{-1}v_j\) contained in the Lanczos step Algorithm 2. For the invert mode with a large sparse matrix, it could require an iterative linear solver and a computational cost to obtain \(A^{-1}v_j\). To focus on the computational cost comparison between the TRLAN–JSYM and TRLAN algorithms, assuming the computation cost of a single inversion is identical between the two algorithms, we do not count the cost involved in the inversion and regard the single inversion operation \(A^{-1}v_j\) as one matrix–vector multiplication for the invert mode. We neglect the cost that explicitly computes the true residual at line 28 in Algorithm 1. For the first outer iteration, the count is m, and after restarting, it is \(m-k\). The thickness k for restarting is defined by
as shown in line 48 of Algorithm 1, where \(\mathrm {icnv}\) is the number of converged eigenvectors and \(\mathrm {mwin}\) is the initial thickness for restarting. When all desired eigenvectors are obtained at an outer iteration \(N_{\mathrm {conv}}\), the upper and lower bounds of the total number of matrix–vector multiplication \(N_{\mathrm {MV}}\) is estimated as
where \(\mathrm {nev}\) is the number of desired eigenvectors without the multiplicity of J-symmetry. The inequality (41) follows from the fact that \(\mathrm {icnv}\) increases monotonically from zero to \(\mathrm {nev}\) toward the convergence.
The same cost estimate can be derived for the standard TRLAN algorithm. The TRLAN algorithm can be obtained by removing \(W_m\) from Algorithms 1 and 2. Therefore, the cost bound for the TRLAN algorithm is identical to (41). However, to find all eigenvectors paired with the J-symmetry using the TRLAN algorithm, \(\mathrm {nev}\) for the TRLAN must be double that of the TRLAN–JSYM. Thus it is natural to double all parameters for the TRLAN algorithm than those of the TRLAN–JSYM. Therefore, the upper and lower bounds of the total number of matrix–vector multiplication \(N_{\mathrm {MV}}\) of the TRLAN algorithm is
where the parameters \((\mathrm {nev},\mathrm {mwin},m)\) are those of the TRLAN–JSYM, and we introduced \(N'_{\mathrm {conv}}\) for the number of outer iterations because it could be different from that of the TRLAN–JSYM.
Although the computational cost of the Gram–Schmidt orthonormalization of the Lanczos step is minor compared to that of the matrix–vector multiplication, we briefly discuss the cost here. As shown in Algorithm 2 for the TRLAN–JSYM, the number of vectors to orthonormalize is 2m and the cost scales with \(O((2m)^2)\). On the other hand, it scales with \(O(m^2)\) to orthonormalize \(V_m\) with the standard Lanczos algorithm. As described above, it is natural to double the parameters for the TRLAN. The cost of the Lanczos part to orthonormalize \(V_{2m}\) then becomes \(O((2m)^2)\). Therefore, the scaling of the cost is the same for both algorithms.
Our naive estimates on the computational cost are (41) and (42), where we assume that the parameters of the TRLAN is twice as large as those of the TRLAN–JSYM. Although \(N_{\mathrm {conv}}\) and \(N'_{\mathrm {conv}}\) depend on \((\mathrm {nev},\mathrm {mwin},m)\) and the algorithm itself, if \(N_{\mathrm {conv}}\simeq N'_{\mathrm {conv}}\) holds, the TRLAN–JSYM algorithm has a better performance than the TRLAN algorithm. We will see whether the condition \(N_{\mathrm {conv}}\simeq N'_{\mathrm {conv}}\) holds or not in the numerical tests on the two types of the matrices in the next section.
So far, we have described the single-vector Lanczos iteration type algorithm to introduce the TRLAN–JSYM. If the matrix A has a dense cluster of eigenvalues or multiple eigenvalues other than those with the J-symmetry, we need to incorporate the block type Lanczos iteration in the algorithm for efficiency. We can extend the proposed algorithm to the blocked version in the same manner as it was conducted for the standard thick-restart Lanczos algorithm [18, 22]. The study on the block version will be addressed in future studies and we have only shown the single vector version to demonstrate the idea for simplicity.
4 Numerical test
In this section, we show two numerical tests to explore the efficiency of the TRLAN–JSYM algorithm compared to the TRLAN algorithm for Hermitian J-symmetric matrices. The first test is conducted for randomly generated Hermitian J-symmetric matrices satisfying the structure (4). The second test is applied to a matrix in quantum field theory. We refer to these two test cases as Case A and Case B, respectively.
We implement both the algorithms, TRLAN–JSYM and TRLAN, with Fortran 2003. The numerical tests were performed on a single node of the subsystem A of the ITO supercomputer system of Kyushu university [16]. The code is parallelized using OpenMP and the Intel MKL library and executed with 36 threads.
4.1 Case A
4.1.1 Definition of the test matrix (case A)
We generated ten matrices with a size of \(2000\times 2000\). These sample matrices are randomly generated, as explained below. Although these matrices are dense, we employ them to explore the proposed Algorithm 1.
To randomly generate matrices A with the structure (4), we employ (5). The eigenvalues \(\varLambda\) are generated from uniformly distributed random real numbers in \(\{x\in {\mathbb {R}}: 0<x<1\}\), and the elements of matrices \(X_1\) and \(X_2\) are constructed from uniformly distributed random complex numbers in \(\{z \in {\mathbb {C}}: -1<\mathrm {Re}(z)< 1, -1< \mathrm {Im}(z) <1\}\). The constraints \(X_1^{{H}}X_1+X_2^{{H}}X_2=I\) and \(X_1^{{T}}X_2-X_2^{{T}}X_1=O\) are then imposed by a Gram–Schmidt algorithm similar to that used in Algorithm 2. Fig. 1 shows large eigenvalues for the ten random matrices. We solve the largest several eigenvalues with the TRLAN–JSYM and TRLAN algorithms and compare the convergence behavior.
4.1.2 Numerical results (case A)
Table 1 shows the algorithmic parameters used in this test. For the stopping condition of the algorithms, we employ \(\mathrm {tol}=10^{-13}\) for tolerance. We also tabulate the number of outer iteration counts \(N_{\mathrm {conv}}\) and matrix–vector multiplications \(N_{\mathrm {MV}}\) for convergence in the table. The minimal, average, and maximal values from the ten samples are shown in square brackets, respectively.
Figures 2 and 3 are the convergence histories of the eigenvalues and corresponding residuals for the 1st random matrix, respectively. The left panels show the result with the TRLAN–JSYM algorithm with \((\mathrm {nev},\mathrm {mwin},m)=(10,20,50)\), and the right panels show the result with the TRLAN algorithm with (20, 40, 100). We employ this doubled parameter for the TRLAN, as discussed in the previous section. The behavior of the TRLAN–JSYM is smooth, while it reorders several times for the TRLAN, even though the TRLAN algorithm successfully captures all the eigenvalues with multiplicity two. The sorting algorithm for the eigenvalues and the property of the Lanczos iteration to the J-symmetry cause the reordering of the eigenvalues for the TRLAN. As mentioned in Sect. 2, the TRLAN tends to evaluate one eigenvector of a pair of the doubly degenerate eigenvalues in the early stage of the iterations. Because of the finite precision arithmetic, it loses complete orthogonality to the other half of the degenerate eigenspace during the iterations, so that the other eigenvalue paired to the converged eigenvalue emerges in the later stage. Similar behaviors are also observed for other random matrices at the same algorithmic parameter. For the cases with a larger m, we do not observe the eigenvalue reordering with the TRLAN algorithm, because they quickly converge. Increasing m, \(N_{\mathrm {conv}}\) rapidly decreases, as shown in Table 1. However, \(N_{\mathrm {MV}}\) is almost constant or slightly increasing.
We compare the computational cost of the algorithms according to the discussion done in the previous section. The natural parameter choice for the TRLAN algorithm is to employ numbers twice as large as those of the TRLAN–JSYM algorithm. The \(N_{\mathrm {conv}}\) are comparable among algorithms paired with the doubled parameters, revealing that \(N_{\mathrm {conv}}\simeq N'_{\mathrm {conv}}\) holds for (41) and (42). The number of matrix–vector multiplications of the TRLAN–JSYM algorithm is smaller by roughly a factor of two than that of the TRLAN algorithm, as seen in the table. Even with the same maximum Krylov dimension m for both algorithms, e.g. the TRLAN–JSYM with \((\mathrm {nev},\mathrm {mwin},m)=(5,10,100)\) and the TRLAN with (10, 20, 100), the TRLAN–JSYM algorithm still beats the TRLAN algorithm because m does not drastically change the number of matrix–vector multiplications.
4.2 Case B
4.2.1 Definition of the test matrix (case B)
We evaluate the proposed algorithm 1 for a matrix that appears in a quantum field theory called the twisted Eguchi-Kawai (TEK) model with adjoint fermions [9,10,11]. The equation of motion for the adjoint fermions follows from a matrix D called the Wilson–Dirac operator in the physics literature. The matrix \(D=(D_{i,j})\) is defined by
where i and j are collective indices of \(i=(a,\alpha )\) and \(j =(b,\beta )\), respectively. \(V_\mu\) are \((N^2-1)\times (N^2-1)\) matrices satisfying \(V_{\mu }^{{T}}=V_{\mu }^{{H}}\), i.e. real orthonormal matrices in the adjoint representation of the SU(N) group, and a, b denote the group indices running in \(1,\dots ,N^2-1\). \(\gamma _\mu\) denotes \(4\times 4\) Hermitian matrices satisfying the anti-commuting relation \(\{\gamma _\mu ,\gamma _\nu \} = 2 \delta _{\mu ,\nu }I\), and \(\alpha ,\beta\) denote spin indices running from 1 to 4. An explicit form of \(\gamma _\mu\) can be seen in [17]. The parameter \(\kappa\) implicitly determines the mass of the fermion. For more details of D, we refer to [11, 14].
The matrix D satisfies the following properties:
where \(\gamma _5 = \gamma _4\gamma _1\gamma _2\gamma _3\) and \(C=\gamma _4 \gamma _2\). The matrices \(\gamma _5\) and C act only on the spin indices in this notation. We employ the definition for \(\gamma _\mu\) from [17] and give the explicit form in Appendix 1. In this case, \(\gamma _5\) is real and symmetric and C is real and skew symmetric \(C^{{T}}=-C\). Monte Carlo methods have been used for simulating quantum field theories. For the system considered herein, the quantum field \(V_\mu\) corresponds to the stochastic variable in a Monte Carlo algorithm. The spectrum of D becomes stochastic because it depends on \(V_\mu\).
We test the proposed algorithm for the matrix A defined by
The matrix A is Hermitian and J-symmetric with \(J = C\gamma _5\). The distribution of the small eigenvalues of A is physically important because it carries the information about the dynamics of the theory. The details of the algebraic property of D and A are given in Appendix.
4.2.2 Numerical results (case B)
We set \(N=289\) of SU(N) for the test. The dimension of A is \(4\times (289^2-1)=334{,}080\). The ensemble for \(V_\mu\) is generated with a Monte Carlo algorithm at a parameter set of the TEK model. We employ a single Monte Carlo sample of \(V_\mu\) for the test.
We compare the convergence behavior of the eigenvalues between the proposed algorithm (TRLAN–JSYM) and the standard (single vector) thick-restart Lanczos algorithm (TRLAN). We use the normal and invert modes for solving large and small eigenvalues, respectively. The conjugate–gradient (CG) algorithm is used in the invert mode. The algorithmic parameters, the number of desired eigenvalues \(\mathrm {nev}\), the restart window size \(\mathrm {mwin}\), and the maximum size of the search dimension m are shown in Table 2. We also tabulate the results of the outer iteration count, the number of matrix–vector multiplications, and the computational time for the convergence. The timings are shown as reference values, showing how the cost of the matrix–vector multiplication dominates the computational time in actual applications. We note that the convergence behavior of the CG in the invert mode was almost identical between the two algorithms as has been assumed in Sect. 3, justifying the cost comparison in terms of the number of matrix–vector multiplications of \(A^{-1}v_j\) in the invert mode. Compared with the TRLAN–JSYM, we double the parameters of the TRLAN to find all doubly degenerate eigenvalues. We set the tolerance to be \(10^{-13}\) for the eigensolvers.
Figures 4 and 5 show the convergence behavior and residual history of the large eigenvalues, respectively. The algorithmic parameters are \((\mathrm {nev},\mathrm {mwin},m)=(8,16,48)\) for the TRLAN–JSYM and (16, 32, 96) for the TRLAN, respectively. We observe similar convergence behavior as in Case A, where several reorderings occur among approximate eigenvalues during the iterations in the TRLAN algorithm. The same convergence behavior is seen in Figs. 6 and 7 for the small eigenvalues.
The computational costs are compared in Table 2. According to the discussion done in Sect. 3, most cases satisfy \(N_{\mathrm {conv}}\simeq N'_{\mathrm {conv}}\) for (41) and (42) among algorithms paired with doubled parameters, and \(N_{\mathrm {MV}}\) for the TRLAN–JSYM algorithm is approximately twice as small as that for the TRLAN algorithm. Three cases, \((\mathrm {nev},\mathrm {mwin},m)=(16,32,48)\) for both modes, and \((\mathrm {nev},\mathrm {mwin},m)=(8,16,24)\) for the invert mode, are the exceptions. In these cases, one or two eigenvalues, which are the largest for the invert mode or the smallest for the normal mode among \(\mathrm {nev}\) eigenvalues, show slow convergence. Even for these cases, however, the TRLAN–JSYM shows smoother convergence behavior than that of the TRLAN. All the cases we have investigated show that the TRLAN–JSYM algorithm has better computational cost than that of TRLAN regarding \(N_{\mathrm {MV}}\). The computational timings are roughly proportional to \(N_{\mathrm {MV}}\), indicating that the matrix–vector multiplication dominates the timings. With the invert mode for small eigenvalue problems, the timings are well proportional to the number of matrix–vector multiplications, because the CG algorithm is used for \(A^{-1}v\) and the cost of the Lanczos and the true residual computing parts become negligible.
5 Summary
In this study, we have shown the orthogonality and A-orthogonality between the two Krylov subspaces, \({\mathscr {K}}_k(A,v)\) and \({\mathscr {K}}_k(A,Jv^*)\) that are generated with the Lanczos algorithm for Hermitian J-symmetric matrices A. By employing this property, we proposed the thick-restarted Lanczos algorithm for Hermitian J-symmetric matrices (TRLAN–JSYM) using which we could efficiently search for one half of the doubly degenerate eigenvectors in \({\mathscr {K}}_k(A,v)\) without the need to explicitly construct \({\mathscr {K}}_k(A,Jv^*)\). The other half of the degenerate eigenvectors are simply constructed from the converged eigenvectors by utilizing the J-symmetry property.
We demonstrated the proposed algorithm TRLAN–JSYM for two test cases, the random matrices, and the fermion matrix from the quantum field theory called the TEK model. The convergence observed for the TRLAN–JSYM algorithm was smoother than that for the TRLAN algorithm, as expected. The TRLAN–JSYM algorithm performed better than the TRLAN algorithm regarding the matrix–vector multiplication. The TRLAN algorithm shows the reordering of eigenvalues among the degenerated eigenvalues caused by the loss of orthogonality between the eigenvectors paired with J-symmetry during the standard Lanczos iteration with the finite precision arithmetic. We did not discuss the mathematical background on the loss of orthogonality in the Lanczos algorithm. If exact arithmetic was employed for both of the algorithms, the TRLAN algorithm becomes identical to the TRLAN–JSYM algorithm, according to Corollary 1. However, our algorithm enforces the orthogonality to achieve the smooth convergence behavior at finite precision arithmetic, resulting in better performance.
Notes
The literature shows another definition, \((JA)=(JA)^T\), for the J-symmetry; however, this is opposite from ours because \(JAJ^{-1}=-A^{T}\).
References
Baglama, J., Calvetti, D., Reichel, L.: IRBL: an implicitly restarted Block–Lanczos method for large-scale Hermitian Eigenproblems. SIAM J. Sci. Comput. 24(5), 1650–1677 (2003). https://doi.org/10.1137/S1064827501397949
Bai, Z., Demmel, J., Dongarra, J., Ruhe, A., van der Vorst, H.: Templates for the solution of algebraic eigenvalue problems. Soc. Ind. Appl. Math. (2000). https://doi.org/10.1137/1.9780898719581
Benner, P., Faßbender, H.: An implicitly restarted symplectic Lanczos method for the Hamiltonian eigenvalue problem. Linear Algebra Appl. 263, 75–111 (1997). https://doi.org/10.1016/S0024-3795(96)00524-1
Benner, P., Faßbender, H., Stoll, M.: A Hamiltonian Krylov–Schur-type method based on the symplectic Lanczos process. Linear Algebra Appl. 435(3), 578–600 (2011). https://doi.org/10.1016/j.laa.2010.04.048
Benner, P., Faßbender, H., Yang, C.: Some remarks on the complex \(J\)-symmetric eigenproblem. Linear Algebra Appl. 544, 407–442 (2018). https://doi.org/10.1016/j.laa.2018.01.014
Calvetti, D., Reichel, L., Sorensen, D.C.: An implicitly restarted Lanczos method for large symmetric eigenvalue problems. Electron. Trans. Numer. Anal. 2, 1–21 (1994)
Dongarra, J., Gabriel, J., Koelling, D., Wilkinson, J.: The eigenvalue problem for Hermitian matrices with time reversal symmetry. Linear Algebra Appl. 60, 27–42 (1984). https://doi.org/10.1016/0024-3795(84)90068-5
Golub, G., Underwood, R.: The Block Lanczos method for computing eigenvalues. In: Rice, J.R. (ed.) Mathematical Software, pp. 361–377. Academic Press, Cambridge (1977). https://doi.org/10.1016/B978-0-12-587260-7.50018-2
González-Arroyo, A., Okawa, M.: A twisted model for large \(N\) lattice gauge theory. Phys. Lett. B 120, 174–178 (1983). https://doi.org/10.1016/0370-2693(83)90647-0
González-Arroyo, A., Okawa, M.: Twisted-Eguchi–Kawai model: a reduced model for large-\(N\) lattice gauge theory. Phys. Rev. D 27, 2397 (1983). https://doi.org/10.1103/PhysRevD.27.2397
González-Arroyo, A., Okawa, M.: Twisted space-time reduced model of large \(N\) QCD with two adjoint Wilson fermions. Phys. Rev. D 88, 014514 (2013). https://doi.org/10.1103/PhysRevD.88.014514
Kressner, D.: Numerical Methods for General and Structured Eigenvalue Problems. Lecture Notes in Computational Science and Engineering, vol. 46, 1st edn. Springer, Berlin (2005). https://doi.org/10.1007/3-540-28502-4. https://www.springer.com/gp/book/9783540245469
Mehrmann, V., Watkins, D.: Structure-preserving methods for computing eigenpairs of large sparse Skew-Hamiltonian/Hamiltonian pencils. SIAM J. Sci. Comput. 22(6), 1905–1925 (2001). https://doi.org/10.1137/S1064827500366434
Montvay, I.: Supersymmetric Yang–Mills theory on the lattice. Int. J. Mod. Phys. A 17, 2377–2412 (2002). https://doi.org/10.1142/S0217751X0201090X
Petkov, M.G., Ivanov, I.G.: Solution of symmetric and Hermitian \(J\)-symmetric eigenvalue problem. Mathematica Balkanica. New Series 8, 337–349 (1994). http://www.math.bas.bg/infres/MathBalk/MB-08/MB-08-337-349.pdf
Research Institute for Information Technology, Kyushu University, “Introduction of ITO,” 2018, https://www.cc.kyushu-u.ac.jp/scp/eng/system/01_into.html
Rothe, H.J.: Lattice Gauge Theories, 4th edn. World Scientific, Singapore (2012). https://doi.org/10.1142/8229
Shimizu, N., Mizusaki, T., Utsuno, Y., Tsunoda, Y.: Thick-restart Block Lanczos method for large-scale shell-model calculations. Comput. Phys. Commun. 244, 372–384 (2019). https://doi.org/10.1016/j.cpc.2019.06.011
Stewart, G.W.: A Krylov–Schur algorithm for large eigenproblems. SIAM J. Matrix Anal. Appl. 23(3), 601–614 (2001). https://doi.org/10.1137/S0895479800371529
Wu, K., Canning, A., Simon, H., Wang, L.W.: Thick-restart Lanczos method for electronic structure calculations. J. Comput. Phys. 154(1), 156–173 (1999). https://doi.org/10.1006/jcph.1999.6306
Wu, K., Simon, H.: Thick-restart Lanczos method for large symmetric eigenvalue problems. SIAM J. Matrix Anal. Appl. 22(2), 602–616 (2000). https://doi.org/10.1137/S0895479898334605
Zhou, Y., Saad, Y.: Block Krylov–Schur method for large symmetric eigenvalue problems. Numer. Algorithms 47(4), 341–359 (2008). https://doi.org/10.1007/s11075-008-9192-9
Acknowledgements
Numerical computations are performed on the ITO supercomputer system of Kyushu university. This work is partly supported by Priority Issue 9 to be tackled by using Post K Computer. We thank Antonio González-Arroyo and Masanori Okawa for comments on the draft and the details of the Wilson-Dirac operator in the adjoint representation. We are very grateful to the anonymous reviewer for his/her comments that enhanced the quality of the manuscript.
Funding
Funding was provided by MEXT.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Properties of the matrices D (43) and A (46)
Properties of the matrices D (43) and A (46)
In this appendix we show the algebraic properties of the matrices D (43) and A (46), including the J-symmetry. We employ the following explicit form for \(\gamma _\mu\):
In addition to these, we also have \(\gamma _5=\gamma _4\gamma _1\gamma _2\gamma _3\) as
These \(\gamma _\mu\) matrices have the following properties:
The matrix \(C=\gamma _4\gamma _2\) has the following properties:
We show the properties of (44) and (45) in detail. To simplify the proof, we suppress the matrix indices of (43) and write it as
where the direct product of the spinor index and the color index is implicit.
Equation (44) is shown as:
where \(\{\gamma _5,\gamma _\mu \}=0\) is used. Because \(V_\mu ^{{T}} = V_\mu ^{{H}}\) for the matrices in the adjoint representation of the SU(N) group, the last line is identical to
Next, we show (45) in:
where we used (54). The last line is identical to
We finally show the J-symmetry of A (46). The Hermiticity of A is apparent from (44) and (46). The properties of \(J\equiv C\gamma _5=\gamma _4\gamma _2\gamma _5\) are:
Because \(\gamma _2, \gamma _4,\) and \(\gamma _5\) are real matrices, J is real. Therefore, the properties of \(J\equiv C\gamma _5\) follows those in (1). The J-symmetry of A (46) is shown as:
where we used \(C\gamma _5=\gamma _5C, C^{{T}}C = I, \gamma _5^{{T}}=\gamma _5, (\gamma _5)^2=I\), and (45). Therefore, A of (46) is J-symmetric.
Rights and permissions
This article is published under an open access license. Please check the 'Copyright Information' section either on this page or in the PDF for details of this license and what re-use is permitted. If your intended use exceeds what is permitted by the license or if you are unable to locate the licence and re-use information, please contact the Rights and Permissions team.
About this article
Cite this article
Ishikawa, KI., Sogabe, T. A thick-restart Lanczos type method for Hermitian J-symmetric eigenvalue problems. Japan J. Indust. Appl. Math. 38, 233–256 (2021). https://doi.org/10.1007/s13160-020-00435-x
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13160-020-00435-x