Technical Note The following article is Open access

Robust mixing in self-consistent linearized augmented planewave calculations

, and

Published 14 August 2020 © 2020 The Author(s). Published by IOP Publishing Ltd
, , Citation Jongmin Kim et al 2020 Electron. Struct. 2 037001 DOI 10.1088/2516-1075/ababde

2516-1075/2/3/037001

Abstract

We devise a mixing algorithm for full-potential (FP) all-electron calculations in the linearized augmented planewave (LAPW) method. Pulay's direct inversion in the iterative subspace is complemented with the Kerker preconditioner and further improvements to achieve smooth convergence, avoiding charge sloshing and noise in the exchange–correlation potential. As the Kerker preconditioner was originally designed for the planewave basis, we have adapted it to the FP-LAPW method and implemented in the exciting code. Applications to the 2 × 2 Au(111) surface with a vacancy and to the Pd(111) surface demonstrate that this approach and our implementation work reliably with both density and potential mixing.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Density-functional theory [1, 2] is used for a wide range of problems that cover electronic, mechanical, and vibrational properties of atoms, molecules, and solids. The diversity of applications to various materials requires that numerical algorithms employed in electronic-structure codes are robust. Otherwise, a poorly chosen or implemented algorithm may lead to unnecessary long computation times as well as to non-converging calculations [3]. Such issues hinder not only individual calculations but also are particularly harmful for high-throughput studies.

One particular problem commonly recurring in electronic-structure theory is convergence of the self-consistent Kohn–Sham equation:

Equation (1)

where epsilonik , ${v}_{\mathrm{K}\mathrm{S}}\left(\mathbf{r}\right)$, and ${\psi }_{\mathrm{i}\mathbf{k}}\left(\mathbf{r}\right)$ represent Kohn–Sham eigenenergies, potential, and wavefunctions, respectively. ${v}_{\mathrm{K}\mathrm{S}}\left(\mathbf{r}\right)$ depends on the electron density:

Equation (2)

where ωk is the weight of the k-point and fik the occupation factor. Equation (1) defines a non-linear eigenvalue problem. To solve it, one considers a linear problem with a fixed potential ${v}_{\mathrm{K}\mathrm{S}}^{\text{in}}\left(\mathbf{r}\right)$. After computing ${\psi }_{\mathrm{i}\mathbf{k}}\left(\mathbf{r}\right)$ and subsequently calculating $\rho \left(\mathbf{r}\right)$, one obtains a new potential ${v}_{\mathrm{K}\mathrm{S}}^{\text{out}}\left(\mathbf{r}\right)$ that serves, in principle, as ${v}_{\mathrm{K}\mathrm{S}}^{\text{in}}\left(\mathbf{r}\right)$ for the next iteration of the KS equation (equation (1)). This procedure is repeated until self-consistency is reached, namely, ${v}_{\mathrm{K}\mathrm{S}}^{\text{in}}\left(\mathbf{r}\right)\approx {v}_{\mathrm{K}\mathrm{S}}^{\text{out}}\left(\mathbf{r}\right)$.

In practice, at every step, the potential is constructed as a mixture (linear combination) of ${v}_{\mathrm{K}\mathrm{S}}^{\text{out}}\left(\mathbf{r}\right)$ and history of ${v}_{\mathrm{K}\mathrm{S}}^{\text{in}}\left(\mathbf{r}\right)$. One can likewise target self-consistency in the electron density instead of the potential, applying a corresponding mixing scheme. Regardless of the choice, the success and efficiency of this procedure depends on how exactly either of these two quantities is updated, and the problem resembles a multivariate optimization. As a result, a number of mixing methods [412] relies on standard optimization techniques or is intimately related to them. Despite successful applications to numerous systems, there are challenging cases that require further improvement of these methods. In particular, metallic systems with large unit cells suffer from an instability known as charge sloshing [3, 1319]. This issue originates from (i) a constant non-zero susceptibility, χ, of metallic systems at small wave-vectors and/or (ii) a factor of G−2 in the Hartree potential at small G components. It can, e.g., be seen in the error of the output potential

Equation (3)

that indicates that a small error in the input potential for short G is amplified due to these two factors.

Several studies have offered a solution to this problem [1316, 2022]. Among them, the Kerker preconditioner is widely used to reduce the charge-sloshing instability induced at small G (long-wavelength) [14], thus making the self-consistency procedure stable for metallic systems with large unit cells.

Mixing schemes are most frequently developed with planewave/pseudopotential methods in mind. Nevertheless, one can adopt the same techniques for all-electron implementations. For instance, variants of multisecant Broyden method have been successfully applied in full-potential (FP) calculations with the linearized augmented planewave (LAPW) basis set [23, 24]. Corresponding implementations have become the default choice in two FP-LAPW codes, i.e. exciting [25] and Wien2k [26]. Our experience shows that this method, especially a multisecant Broyden type-1 method, which estimates a Jacobian [23], performs well in most cases, but is unstable with respect to charge-sloshing. Unfortunately, also the aforementioned solutions to this problem target planewave implementations. Mixing potentials rather than densities, pose additional numerical difficulties. For instance, the generalized-gradient approximation (GGA) for the exchange–correlation functional may lead to noisy potentials in the low-density regions, making the problem of finding a self-consistent potential ill-conditioned.

In this paper, we tackle the issues of charge-sloshing and noisy potentials in FP-LAPW calculations. We implement the Kerker scheme and a modified Pulay mixer in the exciting code and discuss their performance with two benchmark systems: (i) the 2 × 2 Au(111) surface with a vacancy and (ii) the Pd(111) surface. We show that the modified Pulay mixer with the Kerker preconditioner is robust for both density and potential mixing.

2. Kerker mixing

Considering the ith step of the self-consistent field procedure, the (i + 1)st guess of the Kohn–Sham potential can be obtained via a simple linear relation:

Equation (4)

where α is some prefactor. This simple approach may work in many cases, however, it fails in calculations of metallic systems with a large unit cell. The linear-response properties in these systems are such that a small change in ${v}_{i}^{\text{in}}$ triggers a large change in ${v}_{i}^{\text{out}}$ at short wavevectors G as outlined above. Such an instability causes massive charge fluctuations over the self-consistency cycle which is known as charge sloshing. These short-G fluctuations can be suppressed using the Kerker scheme [14]. In a planewave basis, the mixing relation in equation (4) transforms into

Equation (5)

where λ is a parameter that determines the range of G over which changes of the potential are suppressed.

Equation (5) is straightforward to implement in the planewave methods, but not as simple in others. To describe its implementation in the FP-LAPW method, we briefly sketch its characteristics. In this method, the unit cell is divided into the interstitial region (I) and atomic spheres (commonly known as muffin-tin spheres), that are centered at the atomic positions. The potential (as is the density) is expanded in terms of planewaves in I and in spherical harmonics in the atomic spheres (labeled α), respectively:

Equation (6)

Since performing a Fourier transform is not feasible in this case, calculating ${v}_{i+1}^{\text{in}}$ directly from equation (5) is not practical.

To make equation (5) applicable in the FP-LAPW case, we transform it to the real space and obtain

Equation (7)

The operator $\hat{K}={\left({\nabla }^{2}-{\lambda }^{2}\right)}^{-1}$ cannot be applied to a function directly. Instead, we calculate $V\left(\mathbf{r}\right)=\hat{K}f\left(\mathbf{r}\right)$ as a solution of the screened Poisson equation:

Equation (8)

where $f\left(\mathbf{r}\right)$ is some function. If we set $f\left(\mathbf{r}\right)=-4\pi \rho \left(\mathbf{r}\right)$, $V\left(\mathbf{r}\right)$ corresponds to the potential due to the charge density $\rho \left(\mathbf{r}\right)$ assuming the screened Coulomb interaction. Keeping in mind that our goal is merely to apply the operator ${\left({\nabla }^{2}-{\lambda }^{2}\right)}^{-1}$ on a given function, we treat equation (8) as if we had an electrostatics problem in which the input $\rho \left(\mathbf{r}\right)=-\left[{v}_{i}^{\text{out}}\left(\mathbf{r}\right)-{v}_{i}^{\text{in}}\left(\mathbf{r}\right)\right]/4\pi $ is given. From this, we can then obtain $V\left(\mathbf{r}\right)$.

Equation (8) can be solved in the spirit of the pseudo-charge method suggested by Weinert [28]. It was originally proposed for calculating the Hartree potential in FP-LAPW calculations, but the same idea can be exploited for solving the screened Poisson equation as suggested in reference [29]. The original purpose of the algorithm presented in that study was an implementation of the screened exchange in a hybrid exchange–correlation functional but it also meets our needs. The crucial steps of the method are described below.

At first, we calculate the screened potential in the interstitial region. To do so, the charge density $\rho \left(\mathbf{r}\right)$ is replaced by a smooth pseudo-charge density $\overline{\rho }\left(\mathbf{r}\right)$, for which it is easy to perform a Fourier transform. Note that the input charge density is given in the same form as the potential in equation (6). The pseudo-charge density is constructed as

Equation (9)

where the first term is the given interstitial density. Note that unlike in equation (6) planewaves are allowed to enter the muffin-tin spheres in equation (9). The second term is a smooth function that is non-zero only in the muffin-tin spheres. ${\tilde {\rho }}^{\alpha }\left(\mathbf{r}\right)$ are chosen such that $\overline{\rho }\left(\mathbf{r}\right)$ yields the correct screened potential in the interstitial region. To ensure it, we expand the second term as follows:

Equation (10)

where ${Q}_{lm}^{\alpha }$ are constants chosen in such a way that $\overline{\rho }\left(\mathbf{r}\right)$ has the same charge multipoles as $\rho \left(\mathbf{r}\right)$. Since we consider the screened Coulomb interaction, the multipoles inside the muffin-tin spheres have to be calculated as

Equation (11)

where il (r) is the modified spherical Bessel function of the first kind. Finally, the radial functions in equation (10) are defined as

Equation (12)

with n = Rα Gmax/4. Due to this choice of ${\sigma }_{lm}^{\alpha }\left(r\right)$, the Fourier transform of ${\tilde {\rho }}^{\alpha }\left(\mathbf{r}\right)$ can be performed analytically, yielding

Equation (13)

With the Fourier transforms of both parts in equation (9), the potential in the interstitial reads

Equation (14)

To obtain the potential in the muffin-tin region, we solve the Dirichlet boundary-value problem with the original density $\rho \left(\mathbf{r}\right)$. Employing the Green-function method [30], the potential reads

Equation (15)

where the Green function and its normal derivative are expressed as

Equation (16)

and

Equation (17)

respectively. kl (r) is the modified spherical Bessel function of the second kind, and r< (r>) is the maximum (minimum) value between r and r'. $\frac{\partial G}{\partial {n}^{\prime }}$ is the normal derivative at the atomic sphere boundary.

The method described here is not limited to mixing potentials. The input and output potentials in equation (7) can be replaced by respective densities, which does not change the procedure of solving the screened Poisson equation.

3. Pulay mixing

The Pulay mixer [6, 7] also known as direct inversion in the iterative subspace represents an improvement over the linear mixer. It uses a sequence of input potentials ${v}_{i}^{\text{in}}\left(\mathbf{r}\right)$ and residuals ${R}_{i}\left(\mathbf{r}\right)={v}_{i}^{\text{out}}\left(\mathbf{r}\right)-{v}_{i}^{\text{in}}\left(\mathbf{r}\right)$ of previous iterations to estimate the optimal potential and the corresponding residual in the following manner:

Equation (18)

with the weights ωi subject to the constraint

Equation (19)

Minimizing the residual ${R}_{\text{opt}}^{\text{in}}$, the weights ωi are determined by [15, 16]

Equation (20)

with

Equation (21)

According to the conventional Pulay mixing scheme, the next trial potential is given as

Equation (22)

where α is the mixing parameter. The optimal value of α depends on the system [7]. For example, it is typically bigger in the case of semiconductors and insulators than in the case of metallic systems.

Equation (21) contains an integral over the entire unit cell consisting of contributions from the interstitial region and the muffin-tin spheres. When ${v}_{i}^{\text{in}}\left(\mathbf{r}\right)$ in the interstitial region is noisy (e.g. in case the GGA is employed), this noise propagates into the calculated weights resulting in a numerically problematic potential in equation (18). We solve this issue by calculating equation (21) as

Equation (23)

In other words, we ignore the interstitial part of the residuals at this stage and, thus, ${R}_{\text{opt}}^{\text{in}}$ is minimized strictly within the atomic spheres only. The summation in equation (18) is, however, still performed over entire unit cell. The level of noise in the charge density is typically noticeably lower compared to that in the potential, and, hence, employing equation (23) is not useful in case of density mixing. Therefore, we use different equations, i.e. equations (21) and (23), for density and potential mixing, respectively. In both cases, we refer to the method as simple Pulay mixing.

Previous studies [15, 16] have shown that introducing the inverse Kerker metric in equation (21) helps to prevent charge sloshing. The matrix elements in this approach read

Equation (24)

Similarly to equation (5), this equation is written in the planewave basis and is thus not directly applicable in FP-LAPW calculations. We solve this issue analogously to the case of the Kerker mixer: Rj (G)/G2 corresponds to the bare Coulomb potential due to the charge density $\rho \left(\mathbf{r}\right)=-{R}_{j}\left(\mathbf{G}\right)/4\pi $. Finding such a potential is a standard problem in an FP-LAPW code, which is routinely solved using Weinert's method [28]. The expression for the matrix elements can be formally written in real space:

Equation (25)

The inner integral is evaluated over the entire space. If the range of the outer integral is the whole unit cell, equation (25) is equivalent to equation (24). We use it for mixing densities.

The Kerker transformation can also be used as a preconditioner in the Pulay method [15, 16, 18, 19]. In this case, the Kohn–Sham potential is updated as

Equation (26)

The expression on the right-hand side exactly matches equation (5), and it can be evaluated using the same procedure as described in section 2. We use this preconditioner only in the first n iterations (typically, n = 5). Afterward, equation (22) is employed. This modification of the Pulay method with the inverse Kerker metric and the Kerker preconditioning (termed PulayKP) is the most successful method as we will show in the example below.

Both methods, the simple Pulay method and the one with the Kerker preconditioner and the inverse Kerker metric, depend on the mixing parameter α that we set to 0.4 for all considered cases. The latter method contains also the screening parameter λ, which strongly influences the convergence properties of the method. Previous studies [36, 37] have suggested to use the wavevector of the Thomas–Fermi screening, kTF. Following reference [37], we estimate it as

Equation (27)

where N(ɛF) corresponds to the density of states at the Fermi energy.

4. Computational details

The first system studied here is the 2 × 2 Au(111) surface with a vacancy. It consists of 5 layers of Au and is termed 5L-(2 × 2) Au(111)-V. The second example is the Pd(111) surface with 15 atomic layers, termed 15L-Pd(111). Both structures are displayed in figure 1. The corresponding slabs are constructed based on the bulk geometries with the lattice constants of 4.19 Å and 3.95 Å for Au and Pd, respectively. These structural parameters are obtained using the Perdew–Berke–Ernzerhof (PBE) exchange–correlation functional. Further structural optimizations of the slabs are not performed. In order to eliminate spurious interactions between the periodic images of the metal slabs, the vacuum spacing in the z direction is set to 30 Å and 20 Å for both surfaces.

Figure 1.

Figure 1. Left: top view (top) and side view (bottom) of a 2 × 2 Au(111) surface with a vacancy. Right: same for a Pd(111) surface with 15 layers. The black lines indicates the unit cells.

Standard image High-resolution image

The above described methods have been implemented in the all-electron FP code exciting [25], which is then used for all calculations. It employs the LAPW plus local orbitals basis-set (LAPW + lo) [3133]. The chosen muffin-tin radii are ${R}_{\text{MT}}^{\text{Au}}=$ 2.4 bohr and ${R}_{\text{MT}}^{\text{Pd}}=$ 1.9 bohr. An LAPW cutoff of RMT Gmax = 7 is adopted in all considered systems. Exchange and correlation effects are described within the GGA using the PBE parametrization [27]. The Perdew–Wang parametrization [34] of the local-density approximation (LDA) is applied for the sake of comparison. The Kohn–Sham potential is evaluated with a planewave cutoff of ${G}_{\mathrm{max}}^{v}=$ 18 bohr−1. The Brillioun zone is sampled on a 4 × 4 × 1 and a 10 × 10 × 1 k-mesh for 5L-(2 × 2) Au(111)-V and 15L-Pd(111), respectively. The self-consistency cycle is considered converged once the total-energy difference between two consecutive steps is smaller than 10−6 Ha. Simple Pulay, Pulay with Kerker preconditioner and inverse Kerker metric, and a variant of the multisecant Broyden method, termed msec, employ stored potentials (densities) and residuals obtained from m = 12 previous iterations for the 5L-(2 × 2) Au(111)-V system. In 15L-Pd(111) calculations, we have set m = 30, since it is required for a stable performance.

5. Results and discussion

We start discussing the performance of the implemented mixing methods with the case of 5L-(2 × 2) Au(111)-V. This is a metallic system with a large unit cell and, therefore, it is prone to the charge-sloshing instability [1319]. Indeed, 5L-(2 × 2) Au(111)-V is a too complicated problem for the linear mixing and the Kerker method. To illustrate this, we perform calculations applying these methods by mixing the potentials and set the corresponding parameters in equations (4) and (5) to α = 0.4 and λ = 1.0 bohr−1. To monitor the convergence, we display in figure 2 the variation of the total energy between two consecutive steps, ΔE. With the linear mixer, this quantity not only does not converge during the first 100 iterations to a desired threshold, but remains large overall, i.e. within range of 104–105 Ha. The huge values of ΔE reflect excessive changes in the Kohn–Sham potential. ${v}_{\text{KS}}\left(\mathbf{r}\right)$ varies so much in the muffin-tin region that it heavily affects the radial functions in the LAPW basis. They acquire a different nodal structure and, hence, trigger enormous fluctuations of the total energy. The Kerker mixing, in turn, shows an improved performance compared to the linear mixing. ΔE decays with the number of iterations, yet this happens too slowly for the Kerker mixer to be practical. Still, we acknowledge that it suppresses charge sloshing.

Figure 2.

Figure 2. Convergence of total energy (in Ha) for 5L-(2 × 2) Au(111)-V using linear and Kerker mixing schemes.

Standard image High-resolution image

A further issue that makes 5L-(2 × 2) Au(111)-V challenging is the (typically) noisy GGA potential in the vacuum region. To demonstrate this fact, we compare in figure 3 the self-consistent exchange–correlation potentials ${v}_{\text{xc}}\left(\mathbf{r}\right)$ obtained by LDA and GGA, together with the corresponding electron densities, for 5L-(2 × 2) Au(111)-V. While the densities are essentially indistinguishable on the displayed scale, the differences in the exchange–correlation potentials are immediately apparent. The LDA potential is smooth throughout the unit cell and only slightly jagged in the low-density region (ρ < 10−4 bohr). The noise obviously stems from the gradient term (∇ρ/ρ4/3), which is prone to rapid variations in low-density regions [35] since a small variation in $\rho \left(\mathbf{r}\right)$ can cause a large variation in ${v}_{\text{xc}}\left(\mathbf{r}\right)$. Our numerical tests show that this issue persists even if we increase the LAPW cutoff to RMT Gmax = 8.5 and the planewave cutoff for the potential to ${G}_{\mathrm{max}}^{v}=$ 32 bohr−1. Thus, GGA calculations of low-dimensional systems, i.e. with vacuum, are less stable than those using the LDA, in particular when mixing potentials.

Figure 3.

Figure 3. Exchange–correlation potential and charge density (blue lines) of 5L-(2 × 2) Au(111)-V perpendicular to the surface. The exchange–correlation potential from PBE (top) and LDA (bottom) are shown as black and red lines, respectively.

Standard image High-resolution image

Our implementation of both the simple Pulay and the modified Pulay approaches is insensitive to both of these instabilities. This is demonstrated in figure 4 (top panel) where the convergence behavior of the total energy (using the potential mixing) is compared for these two methods and msec scheme. We note that msec does not include 'unpredicted components' for controlling step size and 'scaling' of planewaves as well as radial functions, compared to a method developed by Marks and Luke [23]. If these elements were considered, the convergence is expected to be more faster and stable. All three methods employed here, converge to the target precision within 25–50 steps, in contrast to the linear mixing and the Kerker method (see figure 2). Simple Pulay and Pulay–KP perform better than the msec method, where the noise in the potential is not taken care of.

Figure 4.

Figure 4. Convergence of total energy (in Ha) for 5L-(2 × 2) Au(111)-V with msec, simple Pulay, and Pulay with Kerker preconditioner and inverse Kerker metric (Pulay–KP) methods by using potential (top) and density (bottom) mixing.

Standard image High-resolution image

We also apply these approaches for mixing densities. The convergence behavior of the total energy is shown in the bottom panel of figure 4. The simple Pulay and msec methods do not reach self-consistency within 100 steps, while the Pulay–KP method converges in just a few iterations more than in the case of potential mixing. Overall, our calculations for 5L-(2 × 2) Au(111)-V show that the Pulay–KP method is the most efficient method among the considered ones.

Our second benchmark case is 15-layer Pd(111). Besides being metallic and having a large unit cell, this system has a high density of states near the Fermi level, therefore charge-sloshing is even more pronounced [3, 38]. The total-energy convergence in calculations using different approaches, i.e. msec, simple Pulay, and Pulay–KP is presented in figure 5. Regardless whether the density or the potential is mixed, the only method that succeeds for this system is Pulay–KP. The other two methods do not converge for either kind of mixing. In fact, msec leads in case of density mixing to such an unphysical density that the self-consistency loop cannot be terminated. Pulay–KP, however, reaches the target precision for the total energy in 35 (density) and 36 (potential) steps, respectively. This further indicates that Pulay–PK successfully attenuates the long-wavelength instability which induces charge sloshing.

Figure 5.

Figure 5. Convergence of total energy (in Ha) for 15L-Pd(111) with msec, simple Pulay, and Pulay–KP methods by using potential (top) and density (bottom) mixings.

Standard image High-resolution image

6. Summary and conclusions

In summary, we have reformulated the Kerker preconditioner and the inverse Kerker metric to make them applicable in FP-LAPW calculations. We have implemented the Pulay mixing algorithm modified with these features in the electronic-structure code exciting. Our applications demonstrate that this method is robust and superior to the standard Pulay method and the msec approach.

Acknowledgments

Work supported by the European Community's Horizon 2020 Research and Innovation Program under Marie Skłodowska-Curie Grant Agreement No. 675867. Partial support from the the Deutsche Forschungsgemeinschaft (DFG), Projektnummer 182087777—SFB 951 is acknowledged. All data are stored in the NOMAD Repository, http://dx.doi.org/10.17172/NOMAD/2020.03.25-1. Details on the calculations can be looked up there.

Please wait… references are loading.
10.1088/2516-1075/ababde