Identification of high-dielectric constant compounds from statistical design

Gopakumar, Abhijith; Pal, Koushik; Wolverton, Chris

doi:10.1038/s41524-022-00832-5

Published: 07 July 2022

npj Computational Materials volume 8, Article number: 146 (2022)

Abstract

The discovery of high-dielectric materials is crucial to increasing the efficiency of electronic devices and batteries. Here, we report three previously unexplored materials with very high dielectric constants (69 < ϵ < 101) and large band gaps (2.9 < E_g(eV) < 5.5) obtained by screening materials databases using statistical optimization algorithms aided by artificial neural networks (ANN). Two of these new dielectrics are mixed-anion compounds (Eu₅SiCl₆O₄ and HoClO) and are shown to be thermodynamically stable against common semiconductors via phase diagram analysis. We also uncovered four other materials with relatively large dielectric constants (20 < ϵ < 40) and band gaps (2.3 < E_g(eV) < 2.7). While the ANN training-data are obtained from the Materials Project, the search-space consists of materials from the Open Quantum Materials Database (OQMD), demonstrating a successful implementation of cross-database materials design. Overall, we report the dielectric properties of the 17 materials that were selected in our design workflow, computed using ab initio calculations. The high-dielectric materials predicted in this work open up further experimental research opportunities.

Introduction

Dielectric materials are among the most vital components for microelectronic device manufacturing. They are used in memory devices, capacitor-based energy storage, field-effect transistors, etc^1,2,3. The dielectric constant (denoted here as ϵ), more commonly referred to as the relative permittivity, is the factor by which the electric field strength decreases inside a material compared to the vacuum when it is placed near a finite electric charge. The ϵ values of commonly used dielectric materials range between 20 and 30^1,4,5—for example, Ta₂O₅ (ϵ ~ 23–27, E_g = 4.2 eV)^1,2,6,7 and TiO₂ (ϵ = 27, E_g = 3.5 eV)^1,2,8. There is a high demand to find novel materials with high ϵ to increase the device performance and reliability. Typically, ϵ and E_g are inversely related^2,9 in a compound. As a result, although several materials are reported to have even larger ϵ values, they often have a small E_g^9,10,11,12, making the dielectric vulnerable to leakage currents under exposure to large electric fields^1,2. Therefore, compounds with high ϵ and large band gaps are preferred while designing charge storage applications and microelectronic devices.

One method to find high-ϵ compounds is to calculate, using ab initio methods such as density functional theory (DFT), the dielectric constants and band gaps of the many compounds available in large materials databases such as the Open Quantum Materials Database (OQMD)^13,14 and the Materials Project (MP)¹⁵. However, since the accurate calculation of dielectric properties using density functional perturbation theory¹⁶ (DFPT) is computationally very expensive, it would be practically infeasible to estimate the dielectric constants of the tens of thousands of materials available in those databases using high-throughput methods. In this work, we employ an advanced screening strategy to identify compounds with better dielectric properties. The goal is to find dielectric materials with large values for both ϵ and E_g by screening materials databases while conducting as few DFPT calculations as possible. To accomplish this task, we have employed a materials design strategy comprising statistical optimization models and DFPT calculations on a small set of compounds. While our training set consists of a small amount of data (dielectric constants) from the MP, the search-space contains a vast set of compounds available in the OQMD.

Several online data repositories exist today that are dedicated to hosting large sets of open-sourced inorganic crystal structure data generated from high-throughput (HT) DFT calculations such as the MP¹⁵, OQMD^13,14, and AFLOWLib¹⁷ among others^18,19. The design and discovery of novel materials using statistical modeling has become an active research area^20,21,22 in recent times, largely attributed to the availability of such HT datasets. Recently, multiple studies have reported HT-generation of dielectric data and subsequent analysis^9,23,24. For example, Morita et al. reported²⁵ machine learning modeling of data from MP^11,12,15 to assess the reliability of the theoretical models currently available to describe the dielectric properties of crystals.

In this work, we use the MP dataset of 1864 dielectric tensors^11,12 to train statistical models and subsequently identify dielectrics from the set of stable materials in the OQMD. Thus the MP data forms the training-data and the set of materials from OQMD forms the search-space for the materials design. This work is a successful demonstration of a scenario in which data obtained from multiple sources is used to discover new compounds. The cross-database design was possible because the representation vectors (also called feature vectors in machine learning) generated for equivalent materials in MP and OQMD differ only negligibly. Overall, we conducted three design cycles, which required us to perform dielectric calculations for just 17 materials using DFPT. We report the dielectric constant values of all 17 materials. Three of them (HoClO, Eu₅SiCl₆O₄, and Tl₃PbBr₅) have very large ϵ (69 < ϵ < 101) and E_g (2.9 eV < E_g < 5.5 eV) values, making them part of the Pareto front of the known data, and four others (Sr₂LuBiO₆, Bi₅IO₇, Bi₃ClO₄, and Bi₃BrO₄) have moderately large ϵ (20 < ϵ < 40) and E_g (2.3 eV < E_g < 2.7 eV) values.

Results

Materials design strategy

Our objective is to find large band gap materials with optimal dielectric constants. Since the dielectric tensor of a compound has nine components, the optimization of all nine components leads to a nine-objective optimization problem which is difficult to solve with training-data of size ~2000. Thus, we specifically optimize the largest eigenvalue of the dielectric tensor, referred to from here onward as ϵ, via statistical modeling through the materials design workflow, as depicted in Fig. 1. The workflow is similar to the strategies that have been previously reported in literature^26,27, where each design cycle consists of three steps—data processing, statistical modeling, and ab initio DFPT calculations. The largest eigenvalue of the total dielectric tensor is chosen as the property to be optimized because that is the highest possible dielectric behavior from a single crystal when it is aligned perfectly along the corresponding direction between two metallic plates. The total dielectric tensor is calculated as the sum of ionic and electronic dielectric tensors. The good agreement between dielectric tensor eigenvalues obtained from MP’s DFPT HT framework and experimentally measured dielectric constant values was reported by Petousis et al.²⁸. We preferred the largest eigenvalue over the average of eigenvalues because the latter value may severely underestimate the highest possible dielectric behavior from a single crystal (Supplementary Fig. 1), even though it is a popular choice to estimate the polycrystalline dielectric constant^12,28. The new data produced from DFPT calculations at the end of each cycle is fed into the next design cycle. In the first step, we collected the relevant data from the MP database (training-data) and OQMD (search-space). All materials in the training-data have a known value for ϵ and E_g, while the materials in the search-space have known values of E_g but their ϵ values are unknown. 
In the second step, Modeling, we created an ensemble of artificial neural network (ANN)²⁹ models, fit on the training-data, which learn to predict the ϵ value of materials when their crystal structures and E_g values are known. Using this ANN ensemble, we predicted the ϵ of each material in the search-space. Since the prediction was done from an ensemble, the results were a distribution of ϵ values for each material, contrary to the usage of a single ANN model where a single prediction value is obtained. The trained ANN ensemble was used to predict the ϵ-distributions of 11,102 stable non-metallic materials in the search-space, obtained from the OQMD.
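To make the choice of target property concrete, the following minimal Python sketch sums a hypothetical electronic and ionic tensor and compares the largest eigenvalue of the total tensor with the average of its eigenvalues. The tensor values are invented for illustration and are not taken from MP or OQMD. The closed-form eigenvalue routine for symmetric 3×3 matrices is a standard trigonometric method chosen here to avoid external dependencies; it is not necessarily what the authors used.

```python
import math

def add_tensors(a, b):
    """Element-wise sum of two 3x3 tensors given as nested lists."""
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def largest_eigenvalue_sym3(m):
    """Largest eigenvalue of a symmetric 3x3 matrix (trigonometric closed form)."""
    off_diag = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    if off_diag == 0:
        # Already diagonal: the eigenvalues are the diagonal entries.
        return max(m[0][0], m[1][1], m[2][2])
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    p = math.sqrt((sum((m[i][i] - q) ** 2 for i in range(3)) + 2.0 * off_diag) / 6.0)
    b = [[(m[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against round-off
    return q + 2.0 * p * math.cos(math.acos(r) / 3.0)

def mean_eigenvalue(m):
    """Average of the eigenvalues, which equals trace / 3."""
    return (m[0][0] + m[1][1] + m[2][2]) / 3.0

# Hypothetical tensors for a strongly anisotropic crystal (illustrative values only).
electronic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 5.0]]
ionic = [[6.0, 1.0, 0.0], [1.0, 6.0, 0.0], [0.0, 0.0, 55.0]]

total = add_tensors(electronic, ionic)
eps_max = largest_eigenvalue_sym3(total)
eps_avg = mean_eigenvalue(total)
print(f"largest eigenvalue: {eps_max:.2f}, average of eigenvalues: {eps_avg:.2f}")
```

For this anisotropic example the average of the eigenvalues is less than half of the largest eigenvalue, which is the kind of underestimate the text describes.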

**Fig. 1: Materials design workflow used in this work.**

Further, the predicted distribution of ϵ was input into the Efficient Global Optimization (EGO)²⁶ algorithm. EGO takes into account the distribution’s mean and standard deviation to rank the materials in search-space based on their potential to increase the chances of finding high-ϵ materials in this workflow within as few design cycles as possible. In this work, the optimization in dielectrics refers to the identification of dielectrics with large ϵ values. The reason for employing an EGO algorithm to explore the search-space is to account for the uncertainty in ANN model predictions when the available training-data may not have sampled the material space uniformly. The advantages of EGO-based optimization in materials design were first reported and benchmarked by Balachandran et al.^26,30,31. In this work, we used the EGO algorithm to select the best candidates that are either predicted to have a high ϵ value or have a large uncertainty in their ANN-ensemble predictions. Materials that belong to the latter category are from the regions of materials yet to be sampled by the training-data. The DFPT characterization of such materials is expected to increase the reliability of ANN-ensemble predictions after each design cycle and eventually lead to better optimization of dielectrics during the course of this work.

The metric that is used to rank the materials is called expected improvement, or E(I). More details on how E(I) is calculated are provided in the “Methods” section. A few (5–6) materials with the highest values of E(I) were selected in this step and carried on to the next step, the DFPT calculations. In this final step, the dielectric tensors of the selected materials were calculated using DFPT. If the DFPT results show that any of the materials have a high value of E_g and ϵ, we stop the design workflow at that point. Otherwise, a new design cycle is started after transferring the newly computed ϵ values and the corresponding materials to the training-data from the search-space. With an increased size of training-data, the ANN ensemble is expected to have less uncertainty in ϵ predictions in the new design cycle. The design cycle was repeated with feedback three times in total in this work until three materials with very large values for E_g and ϵ were found.

Data

A dataset containing information about crystal structures, chemical compositions, band gap energy values, and dielectric tensors of 1864 stable materials was obtained from the MP^11,12,15 data repository. This dataset was used to generate the training-data. The target property, ϵ, was obtained for each material in this database from its calculated dielectric tensor. Another dataset consisting of 11,102 stable, non-metallic materials containing information about crystal structures, chemical compositions, and band gap energy values was obtained from OQMD^13,14. This OQMD dataset was used to generate the search-space in which the search to find dielectrics was conducted. The dielectric tensor data of all crystals included in the search-space were unknown at the beginning of this work.

The materials need to be represented as vectors of uniform length in order to be input into a statistical model. We generated the material representations using the Magpie³² crystal property generator tool. Magpie generates a set of physical features (such as the mean electronegativity of constituent atoms, average coordination number inside the unit cell, etc.) from a given chemical composition and crystal structure. Within Magpie, the crystal’s structure-related features are generated by building Voronoi tessellations inside the crystal and finding the nearest neighbors of each individual atom³³. Magpie generated 271 input features that include 145 composition-based, and 126 structure-based features to represent each material. In addition to these, the material’s DFT E_g value was also added as an extra feature to the representation vector since it is already known for all materials in both MP and OQMD datasets. The addition of E_g increased the size of the representation vector to 272, which was generated for each material in training-data and search-space. The input feature-vector size was further reduced to 100 using the widely-used feature reduction techniques such as principal component analysis and model-based selection, implemented in the Scikit-learn python library³⁴. The set of material representation vectors of training-data and the search-space, in addition to the target values associated with the training-data, completes the first step of materials design as depicted in Fig. 1. The size of the training-dataset increases after each design cycle as a result of conducting DFPT calculations on new materials from the search-space.

Statistical modeling utilizing data from multiple computational material databases is prone to errors arising from the differences in the DFT parameters used in each database’s high-throughput calculation strategy. Here, we have investigated the difference in Magpie-generated features for equivalent materials in OQMD and MP, cross-referenced based on their associated Inorganic Crystal Structure Database³⁵ (ICSD) Collection Codes. In total, 1717 out of 1864 materials in training-data had an ICSD Collection Code associated with them. The crystal structures from OQMD corresponding to all the 1717 ICSD materials were obtained, and their Magpie-generated features were compared against those of the structures obtained from MP as a part of the training-data. The results, as plotted in Fig. 2a, show negligible (≤2%) relative difference in 263 out of a total of 271 Magpie features, while the other eight features have low relative differences (≤7%). All 145 composition-based features are computed to be identical across the databases, as expected. The finite difference in some of the structure-based features originates from the difference in the accuracy of crystal structural minimization across databases. Band gap, which joins the Magpie features to form the final material representation vector, was also compared between OQMD and MP for the 1717 equivalent materials, as shown in Fig. 2b. Band gap values showed a mean and median absolute deviation of 0.1 eV and 0.0 eV respectively, pointing toward a negligible difference between the calculations of band gap for materials included in the training-data across OQMD and MP. Overall, the materials representation vector considered in this design is generated in a cross-comparable manner across OQMD and MP structures with very low errors.

**Fig. 2: Comparison of material representation vectors between the OQMD and MP structures.**

The ϵ values in the training-data obtained from MP are predominantly concentrated in the range of 0 to 25, making it difficult to model the data reliably for materials with large ϵ due to a possible bias toward smaller values. Less than 5% of the materials in the training-data have ϵ > 50. The median of ϵ values in the MP dataset is 12.2 while the mean and standard deviation are 20.2 and 42.8 respectively. The distribution of ϵ in training-data is shown in Supplementary Fig. 2. The large spread of ϵ values is decreased upon a log-scale transformation, as shown in Fig. 3a. A smaller spread of target values helps stabilize the machine learning model during the training by reducing the probability of excessive changes in internal parameters, such as the weights in an ANN. We also analyzed the correlation between ϵ and E_g values for the materials in the training-data, and it is given in Supplementary Fig. 3.
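The effect of the log-scale transformation on a right-skewed target can be sketched as follows. The values below are hypothetical and only mimic the skew described above; they are not taken from the MP dataset. The base of the logarithm (10) is also an assumption, since the text does not specify it.

```python
import math
import statistics

# Hypothetical, right-skewed dielectric constants (illustrative values only).
eps_values = [5.0, 8.0, 12.0, 15.0, 25.0, 60.0, 300.0]

# Transform the targets before training (base 10 is an assumption).
log_eps = [math.log10(v) for v in eps_values]

raw_spread = statistics.stdev(eps_values)
log_spread = statistics.stdev(log_eps)
print(f"standard deviation, raw targets: {raw_spread:.2f}")
print(f"standard deviation, log targets: {log_spread:.2f}")

# Predictions made in log space are mapped back with the inverse transform.
recovered = [10.0 ** v for v in log_eps]
```

A model trained on the transformed targets predicts log ϵ, so its outputs need the inverse transform shown in the last line before they can be compared with DFPT values.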

The original dataset downloaded from MP listed BeO (MP ID: mp-1794) as having large ab initio computed values for ϵ(=312) and E_g(=8.2 eV). This large value of ϵ is possibly caused by the improper relaxation of the primitive cell of BeO in MP that leads to a large volume change. Hence, the succeeding calculations on this compound such as DFPT may be incorrect. We conducted a separate DFT cell-relaxation and DFPT calculation for BeO using VASP starting with the MP’s initial structure and find that the computed ϵ value for the correctly relaxed structure is 4—well in agreement with the previously reported values in literature³⁶. This compound was removed from the training-data before proceeding further. We looked up other materials in training-data with very high ϵ and smaller E_g individually and confirmed that they did not have a large cell-volume change upon relaxation in MP.

Statistical modeling

The predictions from trained machine learning models, such as ANNs, are often prone to errors arising from the insufficient sampling of material space by training-data. We needed to quantify the uncertainty associated with the ϵ value predictions even though the available ANN algorithms explicitly do not provide that value from a single ANN model. So we created an ensemble of ANNs, each of which was trained on a randomly chosen subset of the training-data, and has different architectures and internal parameters. An ANN ensemble containing 2000 independent ANN models was created and trained at each design cycle. Each ANN in the ensemble predicted a single ϵ value upon inputting a material-representation vector, resulting in a distribution of 2000 predicted ϵ values for each material in the search-space. The standard deviation of each of the predicted ϵ-distribution was defined as the uncertainty of ANN modeling for the corresponding material.
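The idea of using the spread of an ensemble as an uncertainty estimate can be sketched with the Python standard library alone. To stay dependency-free, the sketch swaps the ANNs for simple least-squares lines on a one-dimensional toy problem and varies only the training subset, not the architecture. The data, the ensemble size of 200, the subset size, and all function names are assumptions made for illustration, not the authors' setup.

```python
import random
import statistics

def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def train_ensemble(data, n_models, subset_size, rng):
    """Train each model on its own randomly chosen subset of the training data."""
    return [fit_line(rng.sample(data, subset_size)) for _ in range(n_models)]

def predict_distribution(ensemble, x):
    """Mean and standard deviation of the ensemble predictions at x."""
    preds = [slope * x + intercept for slope, intercept in ensemble]
    return statistics.mean(preds), statistics.stdev(preds)

rng = random.Random(0)
# Toy training data sampled only on the interval [0, 1].
data = [(i / 39, 2.0 * (i / 39) + 1.0 + rng.gauss(0.0, 0.3)) for i in range(40)]
ensemble = train_ensemble(data, n_models=200, subset_size=20, rng=rng)

mu_in, sigma_in = predict_distribution(ensemble, 0.5)    # well-sampled region
mu_out, sigma_out = predict_distribution(ensemble, 5.0)  # far from the training data
print(f"inside:  mean={mu_in:.2f}, std={sigma_in:.3f}")
print(f"outside: mean={mu_out:.2f}, std={sigma_out:.3f}")
```

The standard deviation is much larger for the query far from the training data, mirroring how the ensemble flags regions of the material space that the training-data has not sampled.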

Further, a statistical single-objective optimization algorithm, called EGO^{26,37,38,39,40}, was used in this work to evaluate the ϵ-distribution and quantify a measure of probable optimization associated with each material in the search-space. EGO is not a method to model the data and predict ϵ. Instead, EGO is an algorithm to select the best candidates from a given search-space, based on their ϵ-distributions predicted by the ANN ensemble, in order to discover as many high-ϵ materials from as few design cycles as possible. Here, the desired optimization is the maximization of ϵ among all the materials in the search-space. The quantified measure of predicted optimization in EGO is called expected improvement, denoted as E(I). Conceptually, the E(I) of a material in search-space is the quantified probability with which a DFPT calculation of ϵ for that material will lead to the identification of high-ϵ material in the design workflow within as few design cycles as possible. Figure 3a shows the results from an ANN model validation as a part of model training during the second design cycle. The values of E(I) computed for the same validation data split from the training-data are shown in Fig. 3b. A simplified illustration of E(I) with the help of an example is given below.

Example illustration of E(I)

Suppose the predicted ϵ-distribution belonging to a material M₁ in the search-space has a large standard deviation. Then it is highly probable that the material M₁ belongs to a part of the material representation vector space which was not sampled very well in the training set. Computing the ϵ of M₁ using DFPT and feeding back that information to the training-data will lead to better ANN modeling in the subsequent design cycles. Thus, M₁ will have a large value of E(I). Now consider another material M₂ in search-space with a large mean and a small standard deviation for its predicted ϵ-distribution. The material M₂ belongs to a part of the material representation vector space that was sufficiently sampled by the training-data. So it is highly probable that M₂ will turn out to be a high-ϵ material upon DFPT calculations. Because of that, M₂ will also have a large value of E(I).

In EGO, the calculation of E(I) for a general optimization problem proceeds as follows (also shown in Fig. 4).

Let Y be the target property to be maximized and φ(Y) be the predicted distribution of Y for a given search-space material. The value φ(Y = y) is the probability density at Y = y. The largest value of the target property in the training-data is denoted as ${y}_{t}^{{\rm{max}}}$. The EGO algorithm, as formulated by Jones et al.³⁸, computes the expected improvement, E(I), as:

$$E(I)=\int_{y_{t}^{\mathrm{max}}}^{\infty}\left(y-y_{t}^{\mathrm{max}}\right)\,\varphi (Y=y)\,dy \tag{1}$$

As mentioned in Balachandran et al.²⁶, if the predicted distribution is approximated as a normal (i.e., Gaussian) distribution with a mean μ and a standard deviation σ, the above equation can be re-written as:

$$E(I)=\sigma \left[\phi (z)+z\,\Phi (z)\right] \tag{2}$$

where $z=\frac{\mu -{y}_{t}^{{\rm{max}}}}{\sigma }$, and ϕ and Φ are, respectively, the probability density function and the cumulative distribution function³⁸ of the standard normal distribution.

For dielectric design, Y is the dielectric constant (ϵ) of a candidate material, and ${y}_{t}^{{\rm{max}}}$ is the highest value of ϵ in the training-data obtained from DFPT calculations. In the MP dataset, the largest ϵ value is for TiO₂ with ϵ = 988 and E_g = 1.8 eV. But our goal in this work is to find materials with large ϵ’s, not necessarily higher than 988 as long as the E_g’s are greater than 1.8 eV. Thus the ${y}_{t}^{{\rm{max}}}$ in this work was set at 100.0 for all design cycles, instead of setting it at 988.0, to consider the search-space materials whose ϵ values are predicted to be sufficiently high. The φ(Y) is approximated to be a normal distribution with the same mean, μ, and standard deviation, σ, as that of the original ϵ-distribution predicted by the ANN ensemble for each search-space material.
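The closed-form expression for E(I) under the normal approximation can be evaluated with a few lines of Python. This is a minimal sketch: the function names and the (μ, σ) values for the three example materials are hypothetical, and only the threshold of 100 comes from the text. The first two examples mirror the materials M₁ and M₂ in the illustration above.

```python
import math

def normal_pdf(z):
    """Probability density function of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def expected_improvement(mu, sigma, y_max=100.0):
    """E(I) = sigma * [pdf(z) + z * cdf(z)] with z = (mu - y_max) / sigma."""
    if sigma <= 0.0:
        # Degenerate distribution: the improvement is known exactly.
        return max(mu - y_max, 0.0)
    z = (mu - y_max) / sigma
    return sigma * (normal_pdf(z) + z * normal_cdf(z))

# Hypothetical ensemble predictions (mean and standard deviation of eps).
ei_uncertain = expected_improvement(mu=60.0, sigma=40.0)   # like M1: large uncertainty
ei_confident = expected_improvement(mu=120.0, sigma=5.0)   # like M2: large mean, small spread
ei_neither = expected_improvement(mu=20.0, sigma=5.0)      # low mean and small spread

print(f"uncertain: {ei_uncertain:.3f}")
print(f"confident: {ei_confident:.3f}")
print(f"neither:   {ei_neither:.3e}")
```

Both the uncertain and the confident candidate receive a sizeable E(I), while a material predicted with confidence to lie far below the threshold receives a value close to zero and would not be selected.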

Design cycles with feedback

The ϵ values of a few materials selected from the statistical modeling are computed from DFPT calculations, as shown in the final segment of a design cycle in Fig. 1. The results from the DFPT calculations are used to determine whether to conduct any further design cycles. In this work, we conducted the design cycles until at least one high-ϵ dielectric with a large E_g is identified. When no such materials are found during a design cycle, all the selected materials along with their newly DFPT-estimated ϵ values are transferred from search-space to training-data, resulting in a feedback of information prior to the beginning of the next design cycle. The feedback is one of the most crucial parts of our material design workflow because it results in a better sampling of material representation vector space by training-data and thus, more reliable ANN model predictions during the next design cycle. The advantage of the feedback mechanism is prominent during the quantification of uncertainty which is used directly by the EGO algorithm to identify the best candidates for the next set of DFPT calculations. After the end of a design cycle, the uncertainty on predicting the ϵ values is decreased for the set of materials which are similar to the materials whose ϵ values were calculated using DFPT in the given cycle.

In addition to the feedback mechanism, another factor that influenced the candidate selection in the design workflow is the minimum cutoff imposed on the band gap values of materials when they are included in the search-space. The reason for implementing a cutoff is to externally introduce a character of multi-objective optimization in this work. Without explicitly setting a minimum band gap limit, the candidate selection process that is dictated by the EGO algorithm tries to optimize only a single objective, which is the ϵ value. We conducted three design cycles sequentially with feedback of the newly calculated data into training-data after each cycle. In the first design cycle, we set no band gap minimum cutoffs to allow the full exploration of the search-space that consists of 11,102 non-metals from OQMD. In the second design cycle, a minimum cutoff of 2.25 eV was set, leaving 6191 materials in the search-space. In the final cycle, the minimum cutoff was increased to 5 eV to limit the candidate selection only to the materials with very high E_g. Hence, the search-space size in the final cycle was reduced to 1046 materials. The workflow that we adopted in this work deviates from the ideal situation where a dedicated multi-objective optimization statistical algorithm will be used to find a material with high ϵ and large E_g values. Since the band gap values are already available for all materials in the search-space, the best approach here was to implement a statistical optimization algorithm to quickly find high-ϵ materials while the preference for large band gap values is achieved by manually setting a minimum cutoff. This work stands as an example for the modifications required to practically implement the statistical algorithms that are often benchmarked on idealistic scenarios.
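One design cycle with a band gap cutoff and feedback can be sketched as below. The five materials, their E(I) values, and the "computed" ϵ values are hypothetical placeholders; in the actual workflow E(I) is recomputed from a retrained ANN ensemble in every cycle and ϵ comes from DFPT. Treating the cutoff as inclusive is also an assumption.

```python
def select_candidates(search_space, min_band_gap, n_select):
    """Apply the band gap cutoff (None means no cutoff), then take the top E(I) values."""
    eligible = [m for m in search_space
                if min_band_gap is None or m["band_gap"] >= min_band_gap]
    ranked = sorted(eligible, key=lambda m: m["ei"], reverse=True)
    return ranked[:n_select]

def feed_back(training, search_space, selected, computed_eps):
    """Move the selected materials and their new eps values into the training data."""
    names = {m["name"] for m in selected}
    for m in selected:
        training.append({"name": m["name"], "band_gap": m["band_gap"],
                         "eps": computed_eps[m["name"]]})
    # Return the search-space without the materials that were just evaluated.
    return [m for m in search_space if m["name"] not in names]

# Hypothetical search-space entries: band gap in eV and a precomputed E(I).
search_space = [
    {"name": "A", "band_gap": 0.5, "ei": 9.0},
    {"name": "B", "band_gap": 2.9, "ei": 7.0},
    {"name": "C", "band_gap": 5.2, "ei": 3.0},
    {"name": "D", "band_gap": 5.5, "ei": 2.0},
    {"name": "E", "band_gap": 1.0, "ei": 8.0},
]

# Cutoff schedule from the text: none, 2.25 eV, and 5 eV.
picks_no_cutoff = [m["name"] for m in select_candidates(search_space, None, 2)]
picks_2p25 = [m["name"] for m in select_candidates(search_space, 2.25, 2)]
picks_5p0 = [m["name"] for m in select_candidates(search_space, 5.0, 2)]

# Feedback after the first cycle, using placeholder eps values.
training = []
first_cycle = select_candidates(search_space, None, 2)
remaining = feed_back(training, search_space, first_cycle, {"A": 370.0, "E": 12.0})
print(picks_no_cutoff, picks_2p25, picks_5p0, [m["name"] for m in remaining])
```

Without a cutoff the ranking is driven by E(I) alone and picks the small-gap materials, which is the behavior reported for the first design cycle; raising the cutoff shifts the selection toward large-gap candidates.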

New dielectric materials

The materials that are part of the Pareto front of MP data are listed in Table 1, while the Pareto front of training-data at each design cycle is plotted in Fig. 5. Since larger values of both ϵ and E_g are considered optimal in this study, each material in the Pareto front is non-dominated: no other material in the corresponding training-data is at least as good in both ϵ and E_g and strictly better in one of them. Therefore, the modification of the training-data’s Pareto front by any of the newly calculated dielectric constants after each design cycle may indicate the identification of suitable, high-dielectric materials.
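A Pareto front for the two objectives (ϵ, E_g) can be computed with a simple dominance check. The sketch below applies it only to the seven new materials whose values are quoted in this article, so the result is the front among those seven, not the front of the full training-data. The function names are illustrative.

```python
def dominates(a, b):
    """True if a is at least as good as b in both objectives and strictly better in one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])

def pareto_front(materials):
    """Names of the materials that no other material dominates."""
    return {
        name
        for name, point in materials.items()
        if not any(dominates(other, point)
                   for other_name, other in materials.items()
                   if other_name != name)
    }

# (eps, E_g in eV) for the seven materials highlighted in this work.
new_materials = {
    "Tl3PbBr5": (101.0, 2.9),
    "Eu5SiCl6O4": (69.0, 5.5),
    "HoClO": (75.0, 5.2),
    "Bi5IO7": (36.0, 2.7),
    "Bi3ClO4": (39.0, 2.3),
    "Bi3BrO4": (39.0, 2.3),
    "Sr2LuBiO6": (24.0, 2.4),
}

front = pareto_front(new_materials)
print(sorted(front))
```

Among these seven, only Tl₃PbBr₅, HoClO, and Eu₅SiCl₆O₄ are non-dominated; the bismuth compounds and Sr₂LuBiO₆ are all dominated by Tl₃PbBr₅.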

Table 1 The Pareto front of dielectric materials dataset from Materials Project.

**Fig. 5: Evolution of the Pareto front with design cycles.**

During the first design cycle, the EGO algorithm picked out the five most promising candidates with the largest E(I) values in the search-space. The ϵ values of these five selected materials were calculated using DFPT. Two materials among them turned out to have very high ϵ values (~370) but very low E_g (~0.5 eV). The low E_g values are not unexpected since the EGO algorithm implemented in this work aims to maximize only the ϵ values. None of the materials selected in this cycle modified the Pareto front of the MP dataset, as shown in Fig. 5a. The ϵ values of these five materials were appended to the training-data prior to starting the next design cycle.

Five materials were selected in the second cycle and their dielectric constants were calculated. Our calculations predict a large dielectric constant for one of the five new materials—tetragonal Tl₃PbBr₅ (ϵ = 101, E_g = 2.9 eV). Tl₃PbBr₅ joined the Pareto front, as shown in Fig. 5b. Three other new materials—Bi₅IO₇ (ϵ = 36, E_g = 2.7 eV), Bi₃ClO₄ (ϵ = 39, E_g = 2.3 eV), and Bi₃BrO₄ (ϵ = 39, E_g = 2.3 eV), have moderately large ϵ values, even though they did not improve the existing Pareto front. All the five new materials were appended into the training-data before proceeding to begin the third design cycle.

During the third and final design cycle consisting of only materials with very large E_g in search-space, seven new candidate materials were selected to do DFPT calculations. Two among them—Eu₅SiCl₆O₄ (ϵ = 69, E_g = 5.5 eV) and HoClO (ϵ = 75, E_g = 5.2 eV) joined the Pareto front due to their large ϵ and E_g values, as shown in Fig. 5c. In total, three new dielectric materials in the Pareto front were discovered after three design cycles and 17 new DFPT calculations were performed in the entire workflow. No further design cycles were conducted since we have already identified multiple compounds with high ϵ and E_g, which remained unexplored experimentally.

The ϵ values of all 17 materials which were obtained in this work are given in Table 2. The ϵ and E_g of all materials belonging to the Pareto front of the MP dataset are listed in Table 1 for comparison. Among all the newly discovered dielectrics with large ϵ values, tetragonal HoClO and monoclinic Eu₅SiCl₆O₄ stand out because of their very large DFT-calculated band gap energies (5.2 eV and 5.5 eV respectively). These two rare earth oxychlorides are reported to have been experimentally synthesized^41,42,43,44 but, to the best of our knowledge, their dielectric properties have remained unstudied. Both of these compounds are mixed-anionic inorganic compounds, a class of emerging functional materials⁴⁵. Interestingly, the monoclinic Eu₅SiCl₆O₄ has 32 atoms in its primitive unit cell, a size that often exceeds the maximum cutoff on the number of atomic sites in HT studies involving computationally expensive material properties^11,19.

Table 2 Dielectric constants of 17 materials calculated using DFT in this work.

Thermodynamic stability of a dielectric when in contact with Si or other semiconductors is an important requirement for it to be used in electronic applications. Several of the high-ϵ dielectrics identified in the published literature were shown to be unstable while forming an interface with Si in subsequent experimental studies conducted at or above room temperature. The formation of SiO_x and other undesired metal oxides was reported at the interface between Si and popular high-ϵ dielectrics such as Ta₂O₃^46,47,48, TiO₂^49,50, BaTiO₃⁵¹, and SrTiO₃^52,53. The thermodynamic stability between two compounds can be assessed from the phase diagram involving those compounds. In this work, the phase diagram is constructed by computing the convex hull⁵⁴ of formation energies of all the materials that belong to a given phase space spanned by their constituent elements. Each of the compounds that form the convex hull not only has the lowest formation energy at its composition but also has lower energy than any linear combination of other materials in that phase space. The difference between the formation energy of a compound and the energy of the convex hull at the same composition is called the hull distance (E_hd). By definition, each material that is on the convex hull has a hull distance of zero (i.e., E_hd = 0) and is considered to be stable. On the other hand, every material that falls above the convex hull is considered metastable (0 < E_hd ≤ 50 meV per atom) or unstable (E_hd > 50 meV per atom) depending on the magnitude of E_hd according to the heuristic conventions adopted in literature^{31,55,56,57,58}. The presence of a tie-line between two compounds in a convex hull phase diagram indicates that they are thermodynamically stable phases when in contact with each other. 
Our thermodynamic stability analysis on Ta₂O₃, TiO₂, BaTiO₃, and SrTiO₃ in OQMD using the qmpy API¹⁴ showed no tie-lines connecting any of them to Si, indicating they are unstable when in contact with Si. This is consistent with the published results^{46,47,48,49,50,51,52,53}. We also analyzed Gd₂O₃, a high-ϵ dielectric (ϵ ~ 20)⁵⁹ that is proven to be stable against Si⁶⁰, and found that a tie-line does exist between Si and Gd₂O₃. These phase diagram plots are provided in Supplementary Fig. 6. In Fig. 6, we report a phase diagram to assess the stability of the newly discovered high-ϵ dielectrics, HoClO and Eu₅SiCl₆O₄. The phase diagram shows that both these materials are thermodynamically stable in contact with semiconductors such as Si, Ge, GaAs, GaN, and SiC at 0 K, a requirement for them to be used in microelectronic devices where an interface with one of the common semiconductors is often necessary⁶¹. The next most promising candidate, tetragonal Tl₃PbBr₅, has a very large ϵ (101) but possesses a relatively smaller band gap (2.9 eV) and is computed to be thermodynamically metastable at 0 K (E_hd = 16 meV per atom) according to the data obtained from the OQMD. Tl₃PbBr₅ is also reported in the literature to have been experimentally synthesized^62,63,64, without any mention of its dielectric properties.
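The hull-distance convention described above maps directly onto a small classification helper. The function below is a hypothetical illustration and not part of qmpy; the numerical tolerance used to decide whether a material sits on the hull is an assumption.

```python
def classify_stability(e_hd_mev_per_atom, tolerance=1e-9):
    """Classify a compound from its hull distance (meV per atom).

    on the hull (E_hd = 0) -> "stable"
    0 < E_hd <= 50         -> "metastable"
    E_hd > 50              -> "unstable"
    """
    if e_hd_mev_per_atom < -tolerance:
        raise ValueError("hull distance cannot be negative")
    if e_hd_mev_per_atom <= tolerance:
        return "stable"
    if e_hd_mev_per_atom <= 50.0:
        return "metastable"
    return "unstable"

# Tl3PbBr5 has E_hd = 16 meV per atom according to the OQMD data quoted above.
print(classify_stability(16.0))
# A compound on the convex hull has E_hd = 0.
print(classify_stability(0.0))
```

The 50 meV per atom threshold is a heuristic, so a "metastable" label signals that synthesis may be plausible rather than guaranteeing it.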

**Fig. 6: Phase diagram of all stable compounds in Ho-Cl-O-Eu-Si-Ge-Ga-As-C-N phase space from OQMD (as of January 2022).**

Discussion

We report the identification of three dielectric materials that combine a high dielectric constant with a large band gap: HoClO (ϵ = 75, E_g = 5.2 eV), Eu₅SiCl₆O₄ (ϵ = 69, E_g = 5.5 eV), and Tl₃PbBr₅ (ϵ = 101, E_g = 2.9 eV). These compounds modify the Pareto front of the previously known high-throughput dielectric constant data available from the MP database. Our screening strategy also uncovers four other dielectric materials with large E_g and moderately large ϵ, namely Sr₂LuBiO₆ (ϵ = 24, E_g = 2.4 eV), Bi₅IO₇ (ϵ = 36, E_g = 2.7 eV), Bi₃ClO₄ (ϵ = 39, E_g = 2.3 eV), and Bi₃BrO₄ (ϵ = 39, E_g = 2.3 eV), at the cost of conducting only 17 DFPT calculations overall. We use the data available in open-source databases (OQMD, MP) to build a statistical optimization model and use it to select the best candidates from among 11,102 stable non-metals available in the OQMD. Among the newly discovered dielectrics, two mixed-anionic materials, HoClO and Eu₅SiCl₆O₄, are shown to have tie-lines with multiple commonly used semiconductors on their phase diagrams, which indicates that they can coexist in thermodynamic equilibrium with those semiconductors.
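As a small worked example of the Pareto-front idea, the sketch below finds the non-dominated materials among the seven compounds listed above, using only the (ϵ, E_g) values quoted in this section. The Materials Project data that define the full Pareto front in this work are not reproduced here, so the result describes this seven-compound set only.

```python
from typing import Dict, List, Tuple

# (dielectric constant, band gap in eV), as quoted in the text above.
candidates: Dict[str, Tuple[float, float]] = {
    "HoClO": (75, 5.2),
    "Eu5SiCl6O4": (69, 5.5),
    "Tl3PbBr5": (101, 2.9),
    "Sr2LuBiO6": (24, 2.4),
    "Bi5IO7": (36, 2.7),
    "Bi3ClO4": (39, 2.3),
    "Bi3BrO4": (39, 2.3),
}


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b if it is at least as good in both objectives and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of the points that no other point dominates (both objectives maximized)."""
    return sorted(
        name
        for name, value in points.items()
        if not any(dominates(other, value) for other in points.values())
    )


print(pareto_front(candidates))
```

Within this set, only HoClO, Eu₅SiCl₆O₄, and Tl₃PbBr₅ are non-dominated: each of the other four compounds has both a lower ϵ and a lower E_g than HoClO.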

The presence of rare earth elements such as Ho and Eu in dielectrics can be a challenge for their use in practical applications. However, ongoing efforts toward increasing their availability, such as efficient recycling of rare earth materials^65,66, can result in a sufficient supply of these elements for mass production of small electronic components. In particular, Ho is an underutilized element in industry⁶⁷ even though it is more abundant in the earth’s crust than other widely mined elements such as Mo, Bi, and precious metals⁶⁸. Eu is more abundant in the earth’s crust than Ho and some of the heavily mined elements such as W and As⁶⁸. Hence, an active exploration of cheaper and easier extraction methods for rare earth elements may make it feasible to include them in mass-produced electronics in the near future. The presence of toxic elements such as Pb and Tl can stand as a barrier against including Tl₃PbBr₅ in consumer electronics. Since mixed-anionic materials are an emerging class of functional materials, our identification of promising dielectric materials in this family opens up further research opportunities in the rational design of high-performance dielectrics and their experimental characterization.

We also assessed the thermodynamic stability of the new dielectrics by creating a large convex hull diagram containing the two best new dielectrics (HoClO and Eu₅SiCl₆O₄) and several materials commonly used in electronics. The relevance of this analysis is discussed in detail, along with examples of previously reported high-ϵ dielectrics^{46,47,48,49,50,51,52,53} that were later found to be unstable when in contact with common electronic component materials such as Si. Our convex hull analysis indicates that both HoClO and Eu₅SiCl₆O₄ are stable against the common electronic materials that we considered.

To understand what features of HoClO, Eu₅SiCl₆O₄, and Tl₃PbBr₅ make them the best dielectric candidates in this study, we calculated their electronic structures and partial densities of states (Supplementary Fig. 5). Our analysis shows that the top of the valence bands and the bottom of the conduction bands in these compounds consist primarily of contributions from the anions (Cl, Br) and cations (Ho, Eu, Tl), respectively. This indicates that having lighter anions (such as Cl, Br) is advantageous: their valence orbitals, which make up the valence band edge in these compounds, lie at lower energies, giving the relatively larger band gap that is desired in high-ϵ materials.

In addition to the identification of high-ϵ dielectrics, we successfully demonstrated an implementation of a cross-database statistical design for computational materials selection. Datasets from the MP and OQMD repositories are used in this work as training-data and search-space, respectively. The successful identification of new materials from such a workflow is another motivation for actively moving toward the interoperability of materials databases, which is one of the four pillars of the FAIR data principles⁶⁹ in scientific data management. Better interoperability across databases amplifies the flexibility in utilizing materials data while solving a complex materials problem.

Lastly, this work also stands as an example of the practical implementation of a computational design strategy for property optimization via data-informed material selection. A multi-objective optimization problem (maximizing both ϵ and E_g) is converted into a single-objective optimization (maximizing ϵ using statistical methods) combined with an explicit constraint on the band gap (favoring higher E_g), since E_g is already available for all materials in the search-space. This deviation from the ideal, statistically benchmarked multi-objective optimization workflows²⁷ enabled the efficient utilization of resources and resulted in the identification of three high-ϵ dielectrics at the cost of just 17 new DFPT calculations.
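The conversion described above, a band-gap constraint followed by a single-objective ranking on ϵ, can be sketched as follows. The text only states that statistical methods were used, so the choice of expected improvement as the ranking criterion, the band-gap threshold, and all material names and numbers below are assumptions made for illustration.

```python
import math
from typing import Dict, List


def expected_improvement(mean: float, std: float, best: float) -> float:
    """Expected improvement over `best` for a Gaussian prediction (maximization)."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


def select_candidates(
    materials: List[Dict], best_eps: float, min_gap: float, n_select: int
) -> List[str]:
    """Keep materials with E_g >= min_gap, then rank them by expected improvement in eps."""
    eligible = [m for m in materials if m["gap"] >= min_gap]
    ranked = sorted(
        eligible,
        key=lambda m: expected_improvement(m["eps_mean"], m["eps_std"], best_eps),
        reverse=True,
    )
    return [m["name"] for m in ranked[:n_select]]


# Hypothetical search space: predicted eps mean/std and a known band gap (eV).
materials = [
    {"name": "M1", "eps_mean": 60.0, "eps_std": 5.0, "gap": 1.0},
    {"name": "M2", "eps_mean": 40.0, "eps_std": 10.0, "gap": 3.0},
    {"name": "M3", "eps_mean": 45.0, "eps_std": 1.0, "gap": 4.0},
]
print(select_candidates(materials, best_eps=50.0, min_gap=2.0, n_select=2))
```

In this toy search space M1 is excluded by the band-gap constraint despite having the highest predicted ϵ. M2 is ranked above M3 because its larger prediction uncertainty leaves more room for improvement over the current best value.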

Methods

ANN modeling

The individual models in the ANN ensemble consisted of a single hidden layer with the number of neurons in the range of 10². The exact number of neurons varied randomly within a small range (10–30) to avoid any bias that may arise from the model architecture, since the subset of training-data for each ANN was randomly sampled. Each ANN ensemble consisted of 2000 independent ANNs. Thus, the ϵ-distribution for each material consisted of 2000 independent ϵ predictions. A new ANN ensemble was created and trained in each design cycle to learn the incremented training-data. The Nadam optimizer was used for network optimization during training. Both L2 layer regularization and an early-stopping callback, as implemented in Keras⁷⁰, were applied to each ANN in the ensemble to prevent over-fitting. On average, it took between 300 and 400 epochs to reach a local minimum of the loss function. Each epoch is a full iteration of fitting the training-data to update the internal weights of an ANN. Feature dimensionality reduction prior to the training of the ANNs was done using the principal component analysis algorithm implemented in scikit-learn³⁴. Model validation during the training of one randomly chosen ANN (out of the 2000 models in the second design cycle) is plotted in Fig. 3a for reference.
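The way an ensemble trained on random subsets yields a distribution of predictions for each material can be sketched as below. To keep the example self-contained, each ensemble member is a one-feature least-squares line standing in for a trained ANN; the Keras networks, the randomized hidden-layer size, and the PCA step described above are not modeled, and the subset fraction and training data are hypothetical.

```python
import random
import statistics
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept; a stand-in for training one ANN."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def ensemble_predictions(
    train: List[Tuple[float, float]],
    x_new: float,
    n_models: int = 2000,
    subset_fraction: float = 0.8,  # assumed value; not given in the text
    seed: int = 0,
) -> List[float]:
    """One prediction per ensemble member, each fitted on its own random subset."""
    rng = random.Random(seed)
    subset_size = max(2, int(subset_fraction * len(train)))
    predictions = []
    for _ in range(n_models):
        subset = rng.sample(train, subset_size)
        slope, intercept = fit_line([x for x, _ in subset], [y for _, y in subset])
        predictions.append(slope * x_new + intercept)
    return predictions


# Hypothetical one-feature training data with a little scatter.
train = [(float(x), 2.0 * x + 1.0 + 0.3 * (-1) ** x) for x in range(10)]
preds = ensemble_predictions(train, x_new=12.0)
print(len(preds), statistics.fmean(preds), statistics.stdev(preds))
```

The mean and spread of such a distribution are the quantities a statistical selection criterion can then use to weigh a high predicted ϵ against the uncertainty of that prediction.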

DFPT calculations

We performed all DFT calculations using the Vienna Ab initio Simulation Package (VASP)^71,72 with potentials derived using the projector-augmented wave^73,74 method. We calculated the total dielectric constant (the sum of the electronic and ionic components) for the selected materials using DFPT as implemented in VASP. All the compounds were fully relaxed before the dielectric calculations. We used an energy cutoff of 520 eV, a k-mesh of 6000 k-points per reciprocal atom, and an energy threshold of 10⁻⁸ eV during the self-consistent calculations. The forces on the atoms after structural relaxation were less than 10⁻³ eV Å⁻¹. We used the generalized gradient approximation⁷⁵ for the exchange-correlation energies of the electrons. A detailed discussion of the DFPT calculations is provided in the Supplementary Methods section of the Supplementary Material. We performed DFPT calculations on a set of well-known dielectrics and a few rare earth compounds, and benchmarked the results against values previously reported in the literature. These results, which are provided in Supplementary Table 2, indicate the reliability of our calculated ϵ values. Specifically, two rare earth oxides (EuO and Ho₂O₃) and one rare earth halide (EuF₂) were benchmarked to test the accuracy of standard DFPT calculations in modeling these compounds. Furthermore, our calculations reveal that no imaginary phonon modes appear in HoClO, Eu₅SiCl₆O₄, and Tl₃PbBr₅, the best high-ϵ materials identified in this work. More details are provided in Supplementary Table 1 and Supplementary Fig. 4.
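As a worked illustration of the statement that the total dielectric constant is the sum of the electronic and ionic components, the sketch below adds two 3×3 dielectric tensors and reduces the result to a single number. The tensor values are hypothetical, and the reduction to a scalar by averaging the diagonal (one third of the trace) is an assumption made here; this section does not state which scalar reduction was used.

```python
from typing import List

Tensor = List[List[float]]  # 3x3 dielectric tensor


def total_dielectric(electronic: Tensor, ionic: Tensor) -> Tensor:
    """Element-wise sum of the electronic and ionic contributions."""
    return [[electronic[i][j] + ionic[i][j] for j in range(3)] for i in range(3)]


def scalar_dielectric(tensor: Tensor) -> float:
    """One third of the trace (assumed reduction to a single dielectric constant)."""
    return sum(tensor[i][i] for i in range(3)) / 3.0


# Hypothetical tensors for a uniaxial crystal.
electronic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 5.0]]
ionic = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 20.0]]

total = total_dielectric(electronic, ionic)
print(total, scalar_dielectric(total))
```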

Data availability

The data used in building statistical design models are open-sourced and available via OQMD and Materials Project databases. Other data that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

The raw, unformatted codes used in this project for statistical materials design are available via GitHub at https://github.com/tachyontraveler/diel-design-scripts/tree/v0.1.0-alpha. Upon release, the latest versions of the scripts will be available at https://doi.org/10.5281/zenodo.6515841.

References

Ortiz, R. P., Facchetti, A. & Marks, T. J. High-k organic, inorganic, and hybrid dielectrics for low-voltage organic field-effect transistors. Chem. Rev. 110, 205–239 (2009).
Wang, B. et al. High-k gate dielectrics for emerging flexible and stretchable electronics. Chem. Rev. 118, 5690–5754 (2018).
Kingon, A. I., Maria, J.-P. & Streiffer, S. Alternative dielectrics to silicon dioxide for memory and logic devices. Nature 406, 1032 (2000).
Shevlin, S. A., Curioni, A. & Andreoni, W. Ab initio design of high-k dielectrics: La_xY_1−xAlO₃. Phys. Rev. Lett. 94, 146401 (2005).
Delugas, P., Fiorentini, V., Filippetti, A. & Pourtois, G. Cation charge anomalies and high-κ dielectric behavior in DyScO₃: ab initio density-functional and self-interaction-corrected calculations. Phys. Rev. B 75, 115126 (2007).
Iino, Y. et al. Organic thin-film transistors on a plastic substrate with anodically oxidized high-dielectric-constant insulators. Jpn. J. Appl. Phys. 42, 299 (2003).
Kukli, K. et al. Properties of tantalum oxide thin films grown by atomic layer deposition. Thin Solid Films 260, 135–142 (1995).
Ramajothi, J., Ochiai, S., Kojima, K. & Mizutani, T. Performance of organic field-effect transistor based on poly (3-hexylthiophene) as a semiconductor and titanium dioxide gate dielectrics by the solution process. Jpn. J. Appl. Phys. 47, 8279 (2008).
Lee, M., Youn, Y., Yim, K. & Han, S. High-throughput ab initio calculations on dielectric constant and band gap of non-oxide dielectrics. Sci. Rep. 8, 14794 (2018).
Wilk, G. D., Wallace, R. M. & Anthony, J. High-κ gate dielectrics: current status and materials properties considerations. J. Appl. Phys. 89, 5243–5275 (2001).
Petretto, G. et al. High-throughput density-functional perturbation theory phonons for inorganic materials. Sci. Data 5, 180065 (2018).
Petousis, I. et al. High-throughput screening of inorganic compounds for the discovery of novel dielectric and optical materials. Sci. Data 4, 160134 (2017).
Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with high-throughput density functional theory: the open quantum materials database (OQMD). JOM 65, 1501–1509 (2013).
Kirklin, S. et al. The open quantum materials database (OQMD): assessing the accuracy of dft formation energies. npj Comput. Mater. 1, 15010 (2015).
Jain, A. et al. The Materials Project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
Giannozzi, P. & Baroni, S. Density-Functional Perturbation Theory, 195–214 (Springer, 2005).
Curtarolo, S. et al. Aflowlib. org: a distributed materials properties repository from high-throughput ab initio calculations. Comput. Mater. Sci. 58, 227–235 (2012).
Draxl, C. & Scheffler, M. The nomad laboratory: from data sharing to artificial intelligence. J. Phys.: Mater. 2, 036001 (2019).
Choudhary, K. et al. High-throughput density functional perturbation theory and machine learning predictions of infrared, piezoelectric, and dielectric responses. npj Comput. Mater. 6, 1–13 (2020).
Pyzer-Knapp, E. O., Li, K. & Aspuru-Guzik, A. Learning from the Harvard Clean Energy Project: the use of neural networks to accelerate materials discovery. Adv. Funct. Mater. 25, 6495–6502 (2015).
Saal, J. E., Oliynyk, A. O. & Meredig, B. Machine learning in materials discovery: confirmed predictions and their underlying approaches. Annu. Rev. Mater. Res. 50, 49–69 (2020).
Park, C. W. & Wolverton, C. Developing an improved crystal graph convolutional neural network framework for accelerated materials discovery. Phys. Rev. Mater. 4, 063801 (2020).
Umeda, Y., Hayashi, H., Moriwake, H. & Tanaka, I. Prediction of dielectric constants using a combination of first principles calculations and machine learning. Jpn. J. Appl. Phys. 58, SLLC01 (2019).
Qu, J., Zagaceta, D., Zhang, W. & Zhu, Q. High dielectric ternary oxides from crystal structure prediction and high-throughput screening. Sci. Data 7, 1–10 (2020).
Morita, K., Davies, D. W., Butler, K. T. & Walsh, A. Modeling the dielectric constants of crystals using machine learning. J. Chem. Phys. 153, 024503 (2020).
Balachandran, P. V., Xue, D., Theiler, J., Hogden, J. & Lookman, T. Adaptive strategies for materials design using uncertainties. Sci. Rep. 6, 19660 (2016).
Gopakumar, A. M., Balachandran, P. V., Xue, D., Gubernatis, J. E. & Lookman, T. Multi-objective optimization for materials discovery via adaptive design. Sci. Rep. 8, 3738 (2018).
Petousis, I. et al. Benchmarking density functional perturbation theory to enable high-throughput screening of materials for dielectric constant and refractive index. Phys. Rev. B 93, 115151 (2016).
Jain, A. K., Mao, J. & Mohiuddin, K. M. Artificial neural networks: a tutorial. Computer 29, 31–44 (1996).
Balachandran, P. V., Young, J., Lookman, T. & Rondinelli, J. M. Learning from data to design functional materials without inversion symmetry. Nat. Commun. 8, 14282 (2017).
Balachandran, P. V. et al. Predictions of new ABO₃ perovskite compounds by combining machine learning and density functional theory. Phys. Rev. Mater. 2, 043802 (2018).
Ward, L., Agrawal, A., Choudhary, A. & Wolverton, C. A general-purpose machine learning framework for predicting properties of inorganic materials. npj Comput. Mater. 2, 16028 (2016).
Ward, L. et al. Including crystal structure attributes in machine learning models of formation energies via Voronoi Tessellations. Phys. Rev. B 96, 024104 (2017).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Belsky, A., Hellenbrandt, M., Karen, V. L. & Luksch, P. New developments in the Inorganic Crystal Structure Database (ICSD): accessibility in support of materials research and design. Acta Crystallogr., Sect. B: Struct. Sci. 58, 364–369 (2002).
Groh, D. et al. First-principles study of the optical properties of BeO in its ambient and high-pressure phases. J. Phys. Chem. Solids 70, 789–795 (2009).
Xue, D. et al. Accelerated search for materials with targeted properties by adaptive design. Nat. Commun. 7, 11241 (2016).
Jones, D. R., Schonlau, M. & Welch, W. J. Efficient global optimization of expensive black-box functions. J. Glob. Optim. 13, 455–492 (1998).
Solomou, A. et al. Multi-objective Bayesian materials discovery: application on the discovery of precipitation strengthened NiTi shape memory alloys through micromechanical modeling. Mater. Des. 160, 810–827 (2018).
Talapatra, A. et al. Autonomous efficient experiment design for materials discovery with bayesian model averaging. Phys. Rev. Mater. 2, 113803 (2018).
Templeton, D. & Dauben, C. H. Crystal structures of rare earth oxychlorides. J. Am. Chem. Soc. 75, 6069–6070 (1953).
Hölsä, J., Lahtinen, M., Lastusaari, M., Valkonen, J. & Viljanen, J. Stability of rare-earth oxychloride phases: bond valence study. J. Solid State Chem. 165, 48–55 (2002).
Basiev, T. et al. Hydration of strontium chloride and rare-earth element oxychlorides. Russ. J. Appl. Chem. 78, 1035–1037 (2005).
Jacobsen, H., Meyer, G., Schipper, W. & Blasse, G. Synthesis, structures and luminescence of two new Europium (II) Silicate-Chlorides, Eu₂SiO₃Cl₂ and Eu₅SiO₄Cl₆. Z. Anorg. Allg. Chem. 620, 451–456 (1994).
Kageyama, H. et al. Expanding frontiers in materials chemistry and physics with multiple anions. Nat. Commun. 9, 1–15 (2018).
Atanassova, E. & Spassov, D. X-ray photoelectron spectroscopy of thermal thin Ta₂O₅ films on Si. Appl. Surf. Sci. 135, 71–82 (1998).
Schlom, D. G. & Haeni, J. H. A thermodynamic approach to selecting alternative gate dielectrics. MRS Bull. 27, 198–204 (2002).
Alers, G. et al. Intermixing at the tantalum oxide/silicon interface in gate dielectric structures. Appl. Phys. Lett. 73, 1517–1519 (1998).
Perego, M., Seguini, G., Scarel, G., Fanciulli, M. & Wallrapp, F. Energy band alignment at TiO₂/Si interface with various interlayers. J. Appl. Phys. 103, 043509 (2008).
McCurdy, P. R., Sturgess, L. J., Kohli, S. & Fisher, E. R. Investigation of the PECVD TiO₂–Si (1 0 0) interface. Appl. Surf. Sci. 233, 69–79 (2004).
George, J. P. et al. Preferentially oriented BaTiO₃ thin films deposited on silicon with thin intermediate buffer layers. Nanoscale Res. Lett. 8, 1–7 (2013).
Hu, X. et al. The interface of epitaxial SrTiO₃ on silicon: in situ and ex situ studies. Appl. Phys. Lett. 82, 203–205 (2003).
Goncharova, L. et al. Interface structure and thermal stability of epitaxial SrTiO₃ thin films on Si (001). J. Appl. Phys. 100, 014912 (2006).
Barber, C. B., Dobkin, D. P. & Huhdanpaa, H. The quickhull algorithm for convex hulls. ACM Trans. Math. Softw. 22, 469–483 (1996).
Sun, W. et al. The thermodynamic scale of inorganic crystalline metastability. Sci. Adv. 2, e1600225 (2016).
Wu, Y., Lazic, P., Hautier, G., Persson, K. & Ceder, G. First principles high throughput screening of oxynitrides for water-splitting photocatalysts. Energy Environ. Sci. 6, 157–168 (2013).
Zakutayev, A. et al. Theoretical prediction and experimental realization of new stable inorganic materials using the inverse design approach. J. Am. Chem. Soc. 135, 10048–10054 (2013).
Pal, K. et al. Accelerated discovery of a large family of quaternary chalcogenides with very low lattice thermal conductivity. npj Comput. Mater. 7, 1–13 (2021).
Zhou, J.-P. et al. Properties of high k gate dielectric gadolinium oxide deposited on Si (1 0 0) by dual ion beam deposition (DIBD). J. Cryst. Growth 270, 21–29 (2004).
Kwo, J. et al. Properties of high κ gate dielectrics Gd₂O₃ and Y₂O₃ for Si. J. Appl. Phys. 89, 3920–3927 (2001).
Robertson, J. High dielectric constant gate oxides for metal oxide Si transistors. Rep. Prog. Phys. 69, 327 (2005).
Keller, H.-L. Darstellung und kristallstruktur von hoch-Tl₃PbBr₅. J. Less-Common Met. 78, 281–286 (1981).
Denysyuk, N. et al. Electronic structure of the high-temperature tetragonal Tl₃PbBr₅ phase. J. Alloy. Compd. 576, 271–278 (2013).
Ferrier, A., Velázquez, M., Portier, X., Doualan, J.-L. & Moncorgé, R. Tl₃PbBr₅: a possible crystal candidate for middle infrared nonlinear optics. J. Cryst. Growth 289, 357–365 (2006).
Qiu, Y. & Suh, S. Economic feasibility of recycling rare earth oxides from end-of-life lighting technologies. Resour. Conserv. Recycl. 150, 104432 (2019).
Amato, A. et al. Sustainability analysis of innovative technologies for the rare earth elements recovery. Renew. Sustain. Energy Rev. 106, 41–53 (2019).
Thornton, B. F. & Burdette, S. C. Homely holmium. Nat. Chem. 7, 532–532 (2015).
Yaroshevsky, A. Abundances of chemical elements in the earth’s crust. Geochem. Int. 44, 48–55 (2006).
Wilkinson, M. D. et al. The fair guiding principles for scientific data management and stewardship. Sci. Data 3, 1–9 (2016).
Chollet, F. et al. Keras. https://keras.io (2015).
Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6, 15–50 (1996).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169 (1996).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758 (1999).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953 (1994).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).


Acknowledgements

This work was funded by the SAMSUNG Global Research Outreach Program, and the U.S. Department of Commerce, National Institute of Standards and Technology as part of the Center for Hierarchical Materials Design (CHiMaD) award 70NANB14H012. We acknowledge the computing resources provided by (1) the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231, (2) Quest high-performance computing facility at Northwestern University which is jointly supported by the Office of the Provost, the Office for Research, and Northwestern University Information Technology, and (3) the Extreme Science and Engineering Discovery Environment (National Science Foundation Contract ACI-1548562).

Author information

Authors and Affiliations

Department of Materials Science and Engineering, Northwestern University, 2220 Campus Drive, Evanston, IL, 60208, USA
Abhijith Gopakumar, Koushik Pal & Chris Wolverton

Authors

Abhijith Gopakumar
Koushik Pal
Chris Wolverton

Contributions

A.G. devised computational strategies, wrote the manuscript, and conducted the calculations. K.P. provided important hands-on guidance in calculations and theoretical understanding. A.G. and C.W. modeled the project and analyzed the results. All authors have reviewed the manuscript.

Corresponding author

Correspondence to Chris Wolverton.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Material

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Gopakumar, A., Pal, K. & Wolverton, C. Identification of high-dielectric constant compounds from statistical design. npj Comput Mater 8, 146 (2022). https://doi.org/10.1038/s41524-022-00832-5


Received: 10 February 2022
Accepted: 13 June 2022
Published: 07 July 2022
DOI: https://doi.org/10.1038/s41524-022-00832-5