1 Introduction

The study of stellar properties and stellar evolution plays a central role in astrophysics. Observations of stars determine the chemical composition, age and distance of the varied components of the Milky Way Galaxy and hence form the basis for studies of Galactic evolution. Stellar abundances and their evolution, particularly for lithium, are also a crucial component of the study of Big-Bang nucleosynthesis. Understanding of the pulsational properties of Cepheids underlies their use as distance indicators and hence the basic unit of distance measurement in the Universe. The detailed properties of supernovae are important for the study of element nucleosynthesis, while supernovae of Type Ia are crucial for determining the large-scale properties of the Universe, including the evidence for a dominant component of ‘dark energy’. In all these cases an accurate understanding, and modelling, of stellar interiors and their evolution is required for reliable results.

Modelling stellar evolution depends on a detailed treatment of the physics of stellar interiors. Insofar as the star is regarded as nearly spherically symmetric the basic equations of stellar equilibrium are relatively straightforward (see Sect. 2.1), but the detailed properties, often referred to as microphysics, of matter in a star are extremely complex, yet of major importance to the modelling. This includes the thermodynamical properties, as specified by the equation of state, the interaction between matter and radiation described by the opacity, the nuclear processes generating energy and causing the evolution of the element composition, and the diffusion and settling of elements. Equally important are potential hydrodynamical processes caused by various instabilities which may contribute to the transport of energy and material, hence causing partial or full mixing of given regions in a star. It is obvious that sufficiently detailed observations of stellar properties, and comparison with models, may provide a possibility for testing the physics used in the model calculation, hence allowing investigations of physical processes far beyond the conditions that can be reached in a terrestrial laboratory.

Amongst stars, the Sun obviously plays a very special role, both to our daily life and as an astrophysical object. Its proximity allows very precise, and probably accurate, determination of its global parameters, as well as extremely detailed investigations of phenomena in the solar atmosphere, compared with other stars. Indications are that the Sun is typical for its mass and age (e.g., Gustafsson 1998; Robles et al. 2008; Strugarek et al. 2017),Footnote 1 although a detailed analysis by Reinhold et al. (2020) of photometric variability observed with the Kepler spacecraft indicated that solar magnetic activity may be rather low compared with solar-like stars. Also, conditions in the solar interior are relatively benign, providing some hope that reasonably realistic modelling can be carried out. Thus it is an ideal case for investigations of stellar structure and evolution. Interestingly, there still remain very significant discrepancies between the observed properties of the Sun and solar models.

A good overview of the development of the study of stellar structure and evolution was provided by Tassoul and Tassoul (2004). Also, Shaviv (2009) gave an excellent wide-ranging and deep description of the evolution of the field, including an extensive discussion of the relevant observational basis, the underlying physics, and related aspects, such as the early tension between estimates of the age of the Earth and the Sun. The application of physics to the understanding of stellar interiors developed from the middle of the nineteenth century. The first derivation of stellar models based on mechanical equilibrium was carried out by Lane (1870).Footnote 2 Further development of the theory of such models, summarized in an extensive bibliographical note by Chandrasekhar (1939), was carried out by Ritter, Lord Kelvin and others, culminating in the monograph by Emden (1907). These models were based on the condition of hydrostatic equilibrium, combined with a simplified, so-called polytropic, equation of state. Major advances came with the application of the theory of radiative transfer, and quantum-mechanical calculations of atomic absorption coefficients, to the energy transport in stellar interiors. This allowed theoretical estimates to be made of the relation between the stellar mass and luminosity, even without detailed knowledge about the stellar energy sources (for a masterly discussion of these developments, see Eddington 1926). Further investigations of the properties of stellar opacity led to the conclusion that stellar matter was dominated by hydrogen (Strömgren 1932, 1933), in agreement with the detailed determination of the composition of the solar photosphere by Russell (1929), as well as with the analysis of a broad range of stars by Unsöld (1931). Although stellar modelling had proceeded up to this point without any definite information about the sources of stellar energy, this issue was evidently of very great interest. 
As early as 1920 Eddington (1920) and others noted that the fusion of hydrogen into helium might produce the required energy, over the solar lifetime, but a mechanism making the fusion possible, given the strong Coulomb repulsion between the nuclei, was lacking. This mechanism was provided by Gamow’s development of the treatment of quantum-mechanical barrier penetration between reacting nuclei, resulting in the identification of the dominant reactions in hydrogen fusion through the PP chains and the CNO cycle (cf. Sect. 2.3.3) by von Weizsäcker (1937, 1938), Bethe and Critchfield (1938) and Bethe (1939). With this, the major ingredients required for the modelling of the solar interior and evolution had been established.

An important aspect of solar structure is the presence of an outer convection zone. Following the introduction by Schwarzschild (1906) of the criterion for convective instability in stellar atmospheres, Unsöld (1930) noted that such instability would be expected in the lower photosphere of the Sun. As a very important result, Biermann (1932) noted that the temperature gradient resulting from the consequent convective energy transport would in general be close to adiabatic; as a result, the structure of the convection zone depends little on the details of the convective energy transport. Also, he found that the resulting convective region in the Sun extended to very substantial depths, reaching a temperature of \(10^7\,\,\mathrm{K}\). Further calculations by, for example, Biermann (1942) and Rudkjøbing (1942), taking into account more detailed models of the solar atmosphere, generally confirmed these results. In an interesting short paper Strömgren (1950) summarized these early results. He noted that the presence of \({}^7\mathrm{Li}\) in the solar atmosphere clearly showed that convective mixing could extend at most to a temperature of \(3.5 \times 10^6 \,\mathrm{K}\),Footnote 3 beyond which lithium would be destroyed by nuclear reactions. He also pointed out that a revision of determinations of the composition of the solar atmosphere, relative to the one assumed by Biermann, had reduced the heavy-element abundance and that this would reduce the temperature at the base of the convection zone to the acceptable value of \(2.5 \times 10^6 \,\mathrm{K}\). Although these models are highly simplified, the use of the lithium abundance as a constraint on the extent of convective mixing, and the effect of a composition adjustment on the convection-zone depth, remain highly relevant, as discussed below.

Specific computations of solar models must satisfy the known observational constraints for the Sun, namely that solar radius and luminosity be reached at solar age, for a \(1 M_\odot \) model. As discussed by Schwarzschild et al. (1957) this can be achieved by adjusting the composition and the characteristics of the convection zone. They noted that no independent determination of the initial hydrogen and helium abundances \(X_0\) and \(Y_0\) was possible and consequently determined models for specified initial values of the hydrogen abundance. The convection zone was assumed to have an adiabatic stratification and to consist of fully ionized material, such that it was characterized by the adiabatic constant K in the relation \(p = K \rho ^{5/3}\) between pressure p and density \(\rho \). Given \(X_0\) the values of \(Y_0\) and K were then determined to obtain a model with the correct luminosity and radius. Although since substantially refined, this remains the basic principle for the calibration of solar models (see Sect. 2.6). A detailed discussion of the calibration of the properties of the convection zone was provided by Gough and Weiss (1976).

Given the calibration, the observed mass, radius and luminosity clearly provide no test of the validity of the solar model. An important potential for testing solar models became evident with the realization (Fowler 1958) that nuclear reactions in the solar core produce huge numbers of neutrinos which in principle may be measured, given a suitable detector (Davis 1964; Bahcall 1964). The first results of a large-scale experiment (Davis et al. 1968) surprisingly showed an upper limit to the neutrino flux substantially below the predictions of the then-current solar models. Further experiments using a variety of techniques, and additional computations, did not eliminate this discrepancy, the predictions being higher by a factor of 2–3 than the experiment, until the beginning of the present millennium.

An independent way of testing solar models, with potentially much higher selectivity, became available with the detection of solar oscillations (see Christensen-Dalsgaard 2004, for further details on the history of the field). Oscillations with periods near 5 min were discovered by Leighton et al. (1962). Their character as standing acoustic waves was proposed independently by Ulrich (1970) and Leibacher and Stein (1971), leading also to the expectation that their frequencies could be used to probe the outer parts of the Sun. This identification was confirmed observationally by Deubner (1975), whose data clearly showed the modal character of the oscillations. The observed modes had short horizontal wavelength and extended only a few per cent into the Sun. Indications of global oscillations in the solar diameter were presented by Hill et al. (1976), immediately suggesting that detailed information about the whole solar interior could be obtained from analysis of their frequencies (e.g., Christensen-Dalsgaard and Gough 1976). Although Hill’s data have not been confirmed by later studies, they served as important inspirations for such studies, now known as helioseismology.Footnote 4

Early analyses of the short-wavelength five-minute oscillations (Gough 1977c; Ulrich and Rhodes 1977) showed that the solar convection zone was substantially deeper than in the models of the epoch. A major breakthrough was the detection of global five-minute oscillations by Claverie et al. (1979) and Grec et al. (1980) and the subsequent identification of modes in the five-minute band over a broad range of horizontal wavelengths (Duvall and Harvey 1983). Observations of these modes have formed the basis for the dramatic development of helioseismology over the last three decades. With the increasing precision and detail of the observed oscillation frequencies, increasing sophistication was applied to solar modelling, generally leading to improved agreement between models and observations. Important examples were the realization that the opacity of the solar interior should be increased to match the inferred sound-speed profile (Christensen-Dalsgaard et al. 1985), that sophisticated equations of state were required to match the observed frequencies (Christensen-Dalsgaard et al. 1988), and that the inclusion of diffusion and settling substantially improved the agreement between the models and the Sun (Christensen-Dalsgaard et al. 1993). Remarkably, these developments in the model physics, motivated by, but not directly fitted to, the steadily improving observations, led to models in good overall agreement with the inferred solar structure (e.g., Christensen-Dalsgaard et al. 1996; Gough et al. 1996; Bahcall et al. 1997; Brun et al. 1998). The remaining discrepancies were nevertheless highly significant and clearly required changes to the physics of the solar interior. Interestingly, later revisions of the measured solar surface abundance now result in rather larger discrepancies between models and observations, indicating that more basic modifications to the modelling may be required.

In the present review I provide an overview of these issues, covering both the modelling and the sensitivity of solar models to the physical assumptions and the inferences drawn from various observations and their interpretation. Section 2 presents the tools required to model the Sun and its evolution, including some emphasis on the underlying physical properties of solar matter. In Sect. 3 I present a brief overview of the evolution of a solar-mass star. A detailed discussion of the sensitivity of solar models to changes in the model parameters or physics is provided in Sect. 4, using as reference case the widely used so-called Model S (Christensen-Dalsgaard et al. 1996). Section 5 discusses the observations available to test our understanding of solar structure and evolution, i.e., helioseismology, solar neutrinos and the details of the solar surface composition; in discussing the helioseismic results a brief presentation of results on solar internal rotation is also provided. In Sect. 6 the serious issues raised by the revised determinations of the solar composition after 2000 are discussed in detail, including the revisions to solar modelling which have attempted to obtain agreement with the helioseismically inferred structure under the constraints of these revised abundances. Finally, Sect. 7 gives a very brief presentation of studies of other stars, including the place of the Sun in relation to solar-like stars, and Sect. 8 provides a few concluding remarks. In support of the numerical results provided here, the Appendix briefly addresses the important issue of the numerical accuracy of the computed models.

2 Modelling the Sun

2.1 Basics of stellar modelling

Stellar models are generally calculated under a number of simplifying approximations, of varying justification. In most cases rotation and other effects causing departures from spherical symmetry are neglected and hence the star is regarded as spherically symmetric. Also, with the exception of convection, hydrodynamical instabilities are neglected, while convection is treated in a highly simplified manner. The mass of the star is assumed to be constant, so that no significant mass loss is included. In contrast to these simplifications of the ‘macrophysics’ the microphysics is included with considerable, although certainly inadequate, detail. In recent calculations effects of diffusion and settling are typically included, at least in computations of solar models. The result of these approximations is what is often called a ‘standard solar model’, although still obviously depending on the assumptions made in the details of the calculation.Footnote 5 Even so, such models computed independently, with recent formulations of the microphysics, give rather similar results. In this paper I generally restrict the discussion to standard models, although discussing the effects of some of the generalizations. It might be noted that the present Sun is in fact one case where the standard assumptions may have some validity: at least the Sun rotates sufficiently slowly that direct dynamical effects of rotation are likely to be negligible. On the other hand, rotation was probably faster in the past and the loss and redistribution of angular momentum may well have led to instabilities and hence mixing affecting the present composition profile.

With the assumption of spherical symmetry the model is characterized by the distance r to the centre. Hydrostatic equilibrium requires a balance between the pressure gradient and gravity which may then be written as

$$\begin{aligned} {\mathrm{d}p \over \mathrm{d}r} = - {G m \rho \over r^2}, \end{aligned}$$
(1)

where p is pressure, \(\rho \) is density, m is the mass of the sphere contained within r, and G is the gravitational constant. Also, obviously,

$$\begin{aligned} {\mathrm{d}m \over \mathrm{d}r} = 4 \pi r^2 \rho . \end{aligned}$$
(2)

The energy equation relates the energy generation to the energy flow and the change in the internal energy of the gas:

$$\begin{aligned} {\mathrm{d}L \over \mathrm{d}r} = 4 \pi r^2 \left[ \rho \epsilon - \rho {\mathrm{d}\over \mathrm{d}t }\left( {e \over \rho }\right) + {p \over \rho }{\mathrm{d}\rho \over \mathrm{d}t }\right] ; \end{aligned}$$
(3)

here L is the energy flow through the surface of the sphere of radius r, \(\epsilon \) is the rate of nuclear energy generationFootnote 6 per unit mass and unit time, e is the internal energy per unit volume and t is time.Footnote 7 The gradient of temperature T is determined by the requirements of energy transport, from the central regions where nuclear reactions take place to the surface where the energy is radiated. The temperature gradient is conventionally written in terms of \(\nabla = \mathrm{d}\ln T / \mathrm{d}\ln p\) as

$$\begin{aligned} {\mathrm{d}T \over \mathrm{d}r} = \nabla {T \over p} {\mathrm{d}p \over \mathrm{d}r}. \end{aligned}$$
(4)

The form of \(\nabla \) depends on the mode of energy transport; for radiative transport in the diffusion approximation

$$\begin{aligned} \nabla = \nabla _{\mathrm{rad}}\equiv {3 \over 16 \pi a {\tilde{c}} G} {\kappa p \over T^4}{L(r) \over m (r)}, \end{aligned}$$
(5)

where \(\kappa \) is the opacity, a is the radiation energy density constant and \({\tilde{c}}\) is the speed of light. Finally, we need to consider the rate of change of the composition, which controls stellar evolution. In a main-sequence star such as the Sun the dominant effect is the burning of hydrogen; however, we must also take into account the changes in composition resulting from diffusion and settling. The rate of change of the abundance \(X_i\) by mass of element i is therefore given by

$$\begin{aligned} {\partial X_i \over \partial t} = {{\mathcal {R}}}_i + {1 \over r^2 \rho } {\partial \over \partial r} \left[ r^2 \rho \left( D_i {\partial X_i \over \partial r} + V_i X_i \right) \right] , \end{aligned}$$
(6)

where \({{\mathcal {R}}}_i\) is the rate of change resulting from nuclear reactions, \(D_i\) is the diffusion coefficient and \(V_i\) is the settling velocity.

To these basic equations we must add the treatment of the microphysics. This is discussed in Sect. 2.3 below.

I have so far ignored the convective instability. This sets in if the density decreases more slowly with position than for an adiabatic change, i.e.,

$$\begin{aligned} {\mathrm{d}\ln \rho \over \mathrm{d}\ln p} < {1 \over \varGamma _1}, \end{aligned}$$
(7)

where \(\varGamma _1 = (\partial \ln p / \partial \ln \rho )_{\mathrm{ad}}\), the derivative being taken for an adiabatic change. In stellar modelling this condition is often replaced by

$$\begin{aligned} {\mathrm{d}\ln T \over \mathrm{d}\ln p} \equiv \nabla > \nabla _{\mathrm{ad}} \equiv \left( \mathrm{d}\ln T \over \mathrm{d}\ln p \right) _{\mathrm{ad}}, \end{aligned}$$
(8)

which is equivalent in the case of a uniform composition.Footnote 8 Thus a layer is convectively unstable if the radiative gradient \(\nabla _{\mathrm{rad}}\) (cf. Eq. 5) exceeds \(\nabla _{\mathrm{ad}}\). In this case convective motion sets in, with hotter gas rising and cooler gas sinking, both contributing to the energy transport towards the surface. The structure of the convective flow should clearly be such that the combined radiative and convective energy transport at any point in the convection zone match the luminosity. The conditions in stellar interiors are such that complex, possibly turbulent, flows are expected over a broad range of scales (e.g., Schumacher and Sreenivasan 2020). Also, the convective flux at a given location obviously represents conditions over a range of positions in the star, sampled by a moving convective eddy, so that convective transport is intrinsically non-local. As a related issue, motion is inevitably induced outside the immediate unstable region, also potentially affecting the energy transport and structure, although this is often ignored. However, in computations of stellar evolution these complexities are almost always reduced to a grossly simplified local description which allows the computation of the average temperature gradient in terms of local conditions, as

$$\begin{aligned} \nabla = \nabla _{\mathrm{conv}}(\rho , T, L, \ldots ), \end{aligned}$$
(9)

applied in regions of convective instability (see Sect. 2.5).
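As a rough illustration of how the radiative gradient of Eq. (5) enters the instability criterion of Eq. (8), the following minimal Python sketch evaluates \(\nabla _{\mathrm{rad}}\) in cgs units and compares it with \(\nabla _{\mathrm{ad}}\). The function names, the numerical values of a and \({\tilde{c}}\), and the default \(\nabla _{\mathrm{ad}} = 0.4\) (appropriate only for a fully ionized ideal gas with negligible radiation pressure) are assumptions made for this example; the input values in the usage lines are round illustrative numbers, not taken from any solar model.

```python
import math

# Physical constants in cgs units (values assumed for this sketch).
A_RAD = 7.5657e-15        # radiation energy density constant a [erg cm^-3 K^-4]
C_LIGHT = 2.99792458e10   # speed of light [cm s^-1]
G_GRAV = 6.67232e-8       # gravitational constant [cm^3 g^-1 s^-2], as adopted in the text


def nabla_rad(kappa, p, T, L, m):
    """Radiative temperature gradient of Eq. (5), all quantities in cgs units."""
    prefactor = 3.0 / (16.0 * math.pi * A_RAD * C_LIGHT * G_GRAV)
    return prefactor * kappa * p / T**4 * L / m


def is_convectively_unstable(kappa, p, T, L, m, nabla_ad=0.4):
    """Instability test: the radiative gradient exceeds the adiabatic one.

    nabla_ad = 0.4 is an assumption (fully ionized ideal gas); in ionization
    zones the true adiabatic gradient is smaller.
    """
    return nabla_rad(kappa, p, T, L, m) > nabla_ad


# Hypothetical, order-of-magnitude inputs for a layer deep in a solar-type star.
p, T = 5.6e13, 2.2e6                     # pressure [dyn cm^-2], temperature [K]
L, m = 3.846e33, 0.975 * 1.989e33        # luminosity [erg s^-1], enclosed mass [g]
for kappa in (10.0, 40.0):               # opacity [cm^2 g^-1]
    print(kappa, nabla_rad(kappa, p, T, L, m), is_convectively_unstable(kappa, p, T, L, m))
```

Since \(\nabla _{\mathrm{rad}}\) is linear in \(\kappa \), the sketch makes explicit why an increase in opacity at fixed p, T, L and m pushes a layer towards convective instability.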

The equations are supplemented by boundary conditions. The centre, which is a regular singular point, can be treated through a series expansion in r. For example, it follows from Eq. (2) for the mass and Eq. (1) of hydrostatic support that

$$\begin{aligned} m = {4 \over 3} \pi \rho _{\mathrm{c}} r^3 + \cdots , \quad p = p_{\mathrm{c}} - {2 \over 3} \pi G \rho _{\mathrm{c}}^2 r^2 + \cdots , \end{aligned}$$
(10)

where \(\rho _{\mathrm{c}}\) and \(p_{\mathrm{c}}\) are the central density and pressure. A discussion of the expansions to second significant order in r, and techniques for incorporating them in the central boundary conditions, was given by Christensen-Dalsgaard (1982). At the surface, the model must include the stellar atmosphere. Since this requires a more complex description of radiative transfer than provided by the diffusion approximation (Eq. 5), separately calculated detailed atmospheric models are often matched to the interior solution, thus effectively providing the surface boundary condition. Simpler alternatives are discussed in Sect. 2.4.
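The leading terms of the central expansion follow from inserting \(\rho \simeq \rho _{\mathrm{c}}\) into Eq. (2), which gives \(m \simeq (4/3) \pi \rho _{\mathrm{c}} r^3\), and then into Eq. (1), which gives \(\mathrm{d}p / \mathrm{d}r \simeq -(4/3) \pi G \rho _{\mathrm{c}}^2 r\). The Python sketch below checks this numerically by integrating Eqs (1) and (2) outwards from the centre for a sphere of uniform density, an assumption under which the leading terms are exact. The function names, the midpoint integration scheme and the input values are illustrative choices for this example, not part of any stellar-evolution code.

```python
import math

G = 6.67232e-8  # gravitational constant [cm^3 g^-1 s^-2], as adopted in the text


def rhs(r, m, rho):
    """Right-hand sides of Eqs (2) and (1): (dm/dr, dp/dr)."""
    dm_dr = 4.0 * math.pi * r**2 * rho
    # At the centre m/r^2 -> 0, so the pressure gradient vanishes there.
    dp_dr = -G * m * rho / r**2 if r > 0.0 else 0.0
    return dm_dr, dp_dr


def integrate_uniform_sphere(rho_c, p_c, r_max, n_steps=2000):
    """Midpoint-rule integration of Eqs (1)-(2) from r = 0 at constant density."""
    h = r_max / n_steps
    m, p = 0.0, p_c
    for i in range(n_steps):
        r = i * h
        dm1, _ = rhs(r, m, rho_c)
        # Evaluate the derivatives at the midpoint of the step.
        dm2, dp2 = rhs(r + 0.5 * h, m + 0.5 * h * dm1, rho_c)
        m += h * dm2
        p += h * dp2
    return m, p


def central_expansion(rho_c, p_c, r):
    """Leading terms of the series in r about the centre."""
    m = 4.0 / 3.0 * math.pi * rho_c * r**3
    p = p_c - 2.0 / 3.0 * math.pi * G * rho_c**2 * r**2
    return m, p


# Hypothetical round numbers: density [g cm^-3], pressure [dyn cm^-2], radius [cm].
rho_c, p_c, r_max = 150.0, 2.4e17, 1.0e9
print(integrate_uniform_sphere(rho_c, p_c, r_max))
print(central_expansion(rho_c, p_c, r_max))
```

In a real model the density is not uniform, so the series must be carried to higher order, as in the reference cited above; the sketch only confirms the leading behaviour.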

The equations and boundary conditions are most often solved using finite-difference methods, by what in the stellar-evolution community is known as the Henyey technique (e.g., Henyey et al. 1959, 1964).Footnote 9 This was discussed in some detail by Clayton (1968) and Kippenhahn et al. (2012). The presence of the time dependence, in the energy equation and the description of the composition evolution, is an additional complication. The detailed implementation in the Aarhus STellar Evolution Code (ASTEC), used in the following to compute examples of solar models, was discussed in some detail by Christensen-Dalsgaard (2008).

An important issue is the question of numerical accuracy, in the sense of providing an accurate solution to the problem, given the assumptions about micro- and macrophysics. It is evident that the accuracy must be substantially higher than the effects of, for example, those potential errors in the physics which are investigated through comparisons between the models and observations. Ab initio analyses of the computational errors are unlikely to be useful, given the complexity of the equations. As discussed in the Appendix, computations with differing spatial and temporal resolution provide estimates of the intrinsic precision of the calculation. Additional tests, which may also uncover errors in programming, are provided by comparisons between independently computed models, with carefully controlled identical physics (e.g., Gabriel 1991; Christensen-Dalsgaard and Reiter 1995; Lebreton et al. 2008; Monteiro 2008).

2.2 Basic properties of the Sun

The Sun is unique amongst stars in that its global parameters can be determined with high precision. From planetary motion the product \(G M_\odot \) of the gravitational constant and the solar mass is known with very high accuracy, as \(1.32712438 \times 10^{26} \,\mathrm{cm}^3 \,\mathrm{s}^{-2}\). Even though G is the least precisely determined of the fundamental constants this still allows the solar mass to be determined with a precision far exceeding the precision of the determination of other stellar masses. The 2014 recommendations of CODATAFootnote 10 (Mohr et al. 2016) give a value \(G = (6.67408 \pm 0.00031) \times 10^{-8}\,\,\mathrm{cm}^3\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-2}\), corresponding to \(\,M_\odot = 1.98848 \times 10^{33} \,\mathrm{g}\). However, the solar mass has traditionally been taken to be \(\,M_\odot = 1.989 \times 10^{33} \,\mathrm{g}\), corresponding to \(G = 6.672320 \times 10^{-8}\,\,\mathrm{cm}^3\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-2}\); in the calculations reported in the present paper I use the latter values of \(\,M_\odot \) and G, even though these are not entirely consistent with the CODATA 2014 recommendations. I note that Christensen-Dalsgaard et al. (2005) found that variations to G and \(\,M_\odot \), keeping their product fixed, had very small effects on the resulting solar models.
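The arithmetic linking the quoted values of \(G M_\odot \), G and \(\,M_\odot \) can be verified with a few lines of Python. The sketch below uses only numbers given in the text; the constant and function names are chosen for this example.

```python
GM_SUN = 1.32712438e26       # G * M_sun from planetary motion [cm^3 s^-2]
G_CODATA_2014 = 6.67408e-8   # CODATA 2014 value of G [cm^3 g^-1 s^-2]
M_SUN_TRADITIONAL = 1.989e33 # traditionally adopted solar mass [g]


def solar_mass(gm, g):
    """Solar mass implied by the product G*M_sun and a chosen value of G."""
    return gm / g


def implied_g(gm, mass):
    """Value of G implied by the product G*M_sun and a chosen solar mass."""
    return gm / mass


print(f"M_sun (CODATA 2014 G) = {solar_mass(GM_SUN, G_CODATA_2014):.5e} g")
print(f"G (traditional M_sun) = {implied_g(GM_SUN, M_SUN_TRADITIONAL):.6e} cm^3 g^-1 s^-2")
```

Because only the product \(G M_\odot \) is tightly constrained, either pair of values reproduces the same gravitational field; the choice matters only through terms in which G or \(\,M_\odot \) appears separately.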

The angular diameter of the Sun can be determined with very substantial precision, although the level in the solar atmosphere to which the value refers obviously has to be carefully specified. From such measurements, and the known mean distance between the Earth and the Sun, the solar photospheric radius, referring to the point where the temperature equals the effective temperature, has been determined as \((6.95508 \pm 0.00026) \times 10^{10} \,\mathrm{cm}\) by Brown and Christensen-Dalsgaard (1998); this was adopted by Cox (2000). Haberreiter et al. (2008) obtained the value \((6.95658 \pm 0.00014) \times 10^{10} \,\mathrm{cm}\), which within errors is consistent with the value of Brown and Christensen-Dalsgaard (1998). However, most solar modelling has used the older value \(\,R_\odot = 6.9599 \times 10^{10} \,\mathrm{cm}\) (Auwers 1891), as quoted, for example, by Allen (1973); thus, for most of the models presented here I use this value.

From bolometric measurements of the solar ‘constant’ from space the total solar luminosity can be determined, given the Sun-Earth distance, if it is assumed that the solar flux is independent of latitude; although no evidence has been found to question this assumption, it is perhaps of some concern that measurements of the solar irradiance have only been made close to the ecliptic plane. An additional complication is provided by the variation in solar irradiance with phase in the solar cycle of around 0.1%, peak to peak (for a review, see Fröhlich and Lean 2004); since the cause of this variation is uncertain it is difficult to estimate the appropriate luminosity corresponding to equilibrium conditions. The value \(\,L_\odot = 3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) (obtained from the average irradiance quoted by Willson 1997) has often been used and will generally be applied here. However, Kopp et al. (2016) have more recently obtained a revised irradiance, as an average over solar cycle 23, leading to \(\,L_\odot = 3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\).

The solar radius and luminosity are often used as units in characterizing other stars, although with some uncertainty about the precise values that are used. In 2015 this led to Resolution B3 of the International Astronomical UnionFootnote 11 (see Mamajek et al. 2015; Prša et al. 2016), defining the nominal solar radius \({{\mathcal {R}}}_\odot ^N = 6.957 \times 10^8 \,\mathrm{m}\), suitably rounded from the value obtained by Haberreiter et al. (2008), and the nominal solar luminosity \({{\mathcal {L}}}_\odot ^N = 3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) from Kopp et al. (2016).
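The conversion from irradiance to luminosity, under the stated assumption that the solar flux is independent of latitude, is \(L = 4 \pi d^2 S\), where d is the Sun-Earth distance and S the irradiance. The Python sketch below illustrates this and, as a further consistency check, evaluates the effective temperature from \(L = 4 \pi R^2 \sigma T_{\mathrm{eff}}^4\). The irradiance of \(1361 \,\mathrm{W}\,\mathrm{m}^{-2}\), the value of the astronomical unit and the Stefan-Boltzmann constant \(\sigma \) are not given in the text and are assumptions of this example, as are the function names.

```python
import math

AU_CM = 1.495978707e13     # astronomical unit [cm] (assumed value)
SIGMA_SB = 5.670374e-5     # Stefan-Boltzmann constant [erg cm^-2 s^-1 K^-4] (assumed value)
IRRADIANCE = 1.361e6       # assumed total solar irradiance [erg cm^-2 s^-1], i.e. 1361 W m^-2
R_SUN_NOMINAL = 6.957e10   # nominal solar radius quoted in the text [cm]
L_SUN_NOMINAL = 3.828e33   # nominal solar luminosity quoted in the text [erg s^-1]


def luminosity_from_irradiance(irradiance, distance=AU_CM):
    """Luminosity from irradiance, assuming isotropic emission: L = 4 pi d^2 S."""
    return 4.0 * math.pi * distance**2 * irradiance


def effective_temperature(luminosity, radius):
    """Effective temperature defined by L = 4 pi R^2 sigma T_eff^4."""
    return (luminosity / (4.0 * math.pi * radius**2 * SIGMA_SB)) ** 0.25


print(f"L     = {luminosity_from_irradiance(IRRADIANCE):.4e} erg/s")
print(f"T_eff = {effective_temperature(L_SUN_NOMINAL, R_SUN_NOMINAL):.1f} K")
```

With these assumed inputs the luminosity comes out close to the nominal value quoted above; the small residual reflects rounding of the adopted irradiance.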

The solar age \(t_\odot \) can be estimated from radioactive dating of meteorites combined with a model of the evolution of the solar system, relating the formation of the meteorites to the arrival of the Sun on the main sequence. Detailed discussions of meteoritic dating were provided by Wasserburg, in Bahcall and Pinsonneault (1995), and by Connelly et al. (2012). Wasserburg found \(t_\odot = (4.570 \pm 0.006) \times 10^9\) years, with very similar although more accurate values obtained by Connelly et al. Uncertainties in the modelling of the early solar system obviously affect how this relates to solar age. For simplicity, in the following I identify this age with the time since the arrival of the Sun on the main sequence.Footnote 12 Despite the remaining uncertainty this still provides an independent measure of a stellar age of far better accuracy than is available for any other star.

The solar surface abundance can be determined from spectroscopic analysis (for reviews, see Asplund 2005; Asplund et al. 2009). Additional information about the primordial composition of the solar system, and hence likely the Sun, is obtained from analysis of meteorites. A major difficulty is the lack of a reliable determination from spectroscopy of the solar helium abundance. Lines of helium, an element then not known from the laboratory, were first detected in the solar spectrum;Footnote 13 however, these lines are formed under rather uncertain, and very complex, conditions in the upper solar atmosphere, making an accurate abundance determination from the observed line strengths infeasible; the same is true of other noble gases, with neon being a particularly important example. For those elements with lines formed in deeper parts of the atmosphere the spectroscopic analysis yields reasonably precise abundance determinations (e.g., Allende Prieto 2016); however, given that the helium abundance is unknown these are only relative, typically specified as a fraction of the hydrogen abundance. Detailed analyses were provided by Anders and Grevesse (1989) and Grevesse and Noels (1993), the latter leading to a commonly used present ratio \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0245\) between the surface abundances \(X_{\mathrm{s}}\) and \(Z_{\mathrm{s}}\) by mass of hydrogen and elements heavier than helium, respectively. Also, for most refractory elements there is good agreement between the solar abundances and those inferred from primitive meteorites. A striking exception is the abundance of lithium which has been reduced in the solar photosphere by a factor of around 150, relative to the meteoritic abundance (Asplund et al. 2009). 
This is presumably the result of lithium destruction by nuclear reaction, which would take place to the observed extent over the solar lifetime at a temperature of around \(2.5 \times 10^6 \,\mathrm{K}\), indicating that matter currently at the solar surface has been mixed down to this temperature. On the other hand, the abundance of beryllium, which would be destroyed at temperatures above around \(3.5 \times 10^6 \,\mathrm{K}\), has apparently not been significantly reduced relative to the primordial value (Balachandran and Bell 1998; Asplund 2004), so that significant mixing has not reached this temperature. These abundance determinations obviously provide interesting constraints on mixing processes in the solar interior during solar evolution (see Sect. 5.3).

Since 2000 major revisions of solar abundance determinations have been carried out, through the use of three-dimensional (3D) hydrodynamical simulations of the solar atmosphere (Nordlund et al. 2009, see also Sect. 2.5). This resulted in a substantial decrease in the inferred abundances of, in particular, oxygen, carbon and nitrogen (for a summary, see Asplund et al. 2009), resulting in \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0181\). The resulting decrease in the opacity in the radiative interior has substantial consequences for solar models and their comparison with helioseismic results; I return to this in Sect. 6.

Observations of the solar surface show that the Sun is rotating differentially, with an angular velocity that is highest at the equator. This was evident already quite early from measurements of the apparent motion of sunspots across the solar disk (Carrington 1863), and has been observed also in the Doppler velocity of the solar atmosphere. In an analysis of an extended series of Doppler measurements, Ulrich et al. (1988) obtained the surface angular velocity \(\varOmega \) as

$$\begin{aligned} {\varOmega \over 2 \pi } = (451.5 - 65.3 \cos ^2 \theta - 66.7 \cos ^4 \theta ) \, \mathrm{nHz} \end{aligned}$$
(11)

as a function of co-latitude \(\theta \), corresponding to rotation periods of 25.6 d at the equator and 31.7 d at a latitude of \(60^\circ \).
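As a quick check of the quoted periods, the three-term fit of Eq. (11) can be evaluated directly. The following is a minimal sketch, not part of the original analysis; the function names are illustrative, and the leading coefficient is taken as 451.5 nHz, the value that reproduces the quoted periods of 25.6 d and 31.7 d.

```python
import math

NHZ_TO_HZ = 1.0e-9
SECONDS_PER_DAY = 86400.0

def surface_rotation_nhz(colatitude_deg, a=451.5, b=65.3, c=66.7):
    """Omega/(2 pi) in nHz from the three-term fit of Eq. (11)."""
    mu = math.cos(math.radians(colatitude_deg))
    return a - b * mu**2 - c * mu**4

def rotation_period_days(colatitude_deg):
    """Rotation period in days at the given co-latitude (degrees)."""
    freq_hz = surface_rotation_nhz(colatitude_deg) * NHZ_TO_HZ
    return 1.0 / freq_hz / SECONDS_PER_DAY

period_equator = rotation_period_days(90.0)  # co-latitude 90 deg is the equator
period_lat60 = rotation_period_days(30.0)    # latitude 60 deg is co-latitude 30 deg
print(f"equator: {period_equator:.1f} d, latitude 60 deg: {period_lat60:.1f} d")
```

Note that the fit is expressed in co-latitude, so a latitude of \(60^\circ \) corresponds to \(\theta = 30^\circ \).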

As discussed in Sect. 5.1, helioseismology has provided very detailed information about the properties of the solar interior. Here I note that the depth of the solar convection zone has been determined as 0.287R, with errors as small as 0.001R (e.g., Christensen-Dalsgaard et al. 1991; Basu and Antia 1997). Also, the effect of helium ionization on the sound speed in the outer parts of the solar convection zone allows a determination of the solar envelope helium abundance \(Y_{\mathrm{s}}\), although with some sensitivity to the equation of state; the results are close to \(Y_{\mathrm{s}} = 0.25\) (e.g., Vorontsov et al. 1991; Basu 1998).

2.3 Microphysics

Within the framework of ‘standard solar models’ most of the complexity in the calculation lies in the determination of the microphysics, and hence very considerable effort has gone into calculations of the relevant physics. In comparing the resulting models with observations, particularly helioseismic inferences, to test the validity of these physical results one must, however, obviously keep in mind potential errors in the approximations defining the standard models.

In this section I provide a relatively brief discussion of the various formulations that have been used for the physics. To illustrate some of the effects comparisons are made based on the structure of the present Sun discussed in more detail in Sect. 4 below. A detailed discussion of the physics of stellar interiors was provided by Cox and Giuli (1968) and updated by Weiss et al. (2004); for a concise review of the treatment of the equation of state and opacity, see Däppen and Guzik (2000).

2.3.1 Equation of state

The thermodynamic properties of stellar matter, defined by the equation of state, play a crucial role in stellar modelling. This directly involves the relation between pressure, density, temperature and composition. In addition, the adiabatic compressibility \(\varGamma _1\) affects the adiabatic sound speed (cf. Eq. 55) and hence the oscillation frequencies of the star, whereas other thermodynamic derivatives are important in the treatment of convective energy transport.

The treatment of the equation of state involves the determination of all relevant thermodynamic quantities, for example defined as functions of \((\rho , T,\{X_i\})\), where \(X_i\) are the abundances of the relevant elements; the composition is often characterized by the abundances X, Y and Z by mass of hydrogen, helium and heavier elements with, obviously, \(X + Y + Z = 1\). This should take into account the interaction between the different constituents of the gas, including partial ionization. Also, pressure and internal energy from radiation must be included, although they play a comparatively minor role in the Sun. An important constraint on the treatment is that it be thermodynamically consistent such that all thermodynamic relations are satisfied between the computed quantities (e.g., Däppen 1993). Thus it would not, for example, be consistent to add the contribution of Coulomb effects to pressure and internal energy without making corresponding corrections to other quantities, including the thermodynamical potentials that control the ionization.

A particular problem concerns ionization in the solar core. As pointed out by, e.g., Christensen-Dalsgaard and Däppen (1992), straightforward application of the Saha equation would predict a substantial degree of recombination of hydrogen at the centre of the Sun, yet the volume available to each hydrogen nucleus does not allow this. In fact, ionization must be largely controlled by interactions between the constituents of the gas, not included in the Saha equation, and often somewhat misleadingly denoted pressure ionization. These effects are taken into account in formulations of the equation of state at various levels of detail, generally showing that ionization is almost complete in the solar core. The simplest approach, which is certainly not thermodynamically consistent, is to enforce full ionization above a certain density or pressure.

A simple approximation to the solar equation of state is that of a fully ionized ideal gas, according to which

$$\begin{aligned} p \simeq {k_{\mathrm{B}}\rho T \over \mu m_{\mathrm{u}}}, \quad \nabla _{\mathrm{ad}}\simeq 2/5, \quad \varGamma _1 \simeq 5/3; \end{aligned}$$
(12)

here \(k_{\mathrm{B}}\) is Boltzmann’s constant, \(m_{\mathrm{u}}\) is the atomic mass unit and \(\mu \) is the mean molecular weight which can be approximated by

$$\begin{aligned} \mu = {4 \over 3 + 5 X - Z}. \end{aligned}$$
(13)

However, departures from this simple relation must obviously be taken into account in solar modelling. The most important of these is partial ionization, particularly relatively near the surface where hydrogen and helium ionize. Figure 1 shows the fractional ionization in a model of the present Sun. As discussed in Sect. 5.1.2 the effect of the ionization of helium on \(\varGamma _1\) provides a strong diagnostic of the solar envelope helium abundance.
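The fully ionized ideal-gas approximation of Eqs. (12) and (13) is simple enough to evaluate by hand; the sketch below does so in CGS units. The composition and central conditions used (\(X = 0.73\), \(Z = 0.02\) for the envelope; \(\rho = 150 \,\mathrm{g\,cm^{-3}}\), \(T = 1.57 \times 10^7 \,\mathrm{K}\), \(X = 0.34\) for the centre) are round illustrative values assumed here, not numbers taken from a specific model.

```python
K_B = 1.380649e-16    # Boltzmann constant [erg/K]
M_U = 1.66053907e-24  # atomic mass unit [g]

def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas, Eq. (13)."""
    return 4.0 / (3.0 + 5.0 * X - Z)

def ideal_gas_pressure(rho, T, X, Z):
    """Ideal-gas pressure of Eq. (12) in dyn/cm^2 (rho in g/cm^3, T in K)."""
    return K_B * rho * T / (mean_molecular_weight(X, Z) * M_U)

# Limiting cases: pure ionized hydrogen and pure ionized helium.
mu_pure_h = mean_molecular_weight(X=1.0, Z=0.0)   # two particles per m_u
mu_pure_he = mean_molecular_weight(X=0.0, Z=0.0)  # three particles per 4 m_u

# Assumed illustrative compositions (see lead-in).
mu_envelope = mean_molecular_weight(X=0.73, Z=0.02)
p_centre = ideal_gas_pressure(rho=150.0, T=1.57e7, X=0.34, Z=0.02)
print(f"mu(envelope) = {mu_envelope:.3f}, p(centre) = {p_centre:.2e} dyn/cm^2")
```

The limiting cases \(\mu = 1/2\) for pure hydrogen and \(\mu = 4/3\) for pure helium provide a simple sanity check on Eq. (13).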

Fig. 1
figure 1

Fractional ionization in a model of the present Sun (Model S; see Sect. 4.1), as a function of the logarithm of the temperature (in K; bottom) and of fractional radius (top). The ionization was calculated with the CEFF equation of state (see below). The solid curve shows the fraction of ionized hydrogen, the dashed and dot-dashed curves the fraction of singly and fully ionized helium, respectively, and the dotted curve shows the average degree of ionization of the heavy elements

Other effects are smaller but highly significant, particularly given the high precision with which the solar interior can be probed with helioseismology. Radiation pressure, \(p_{\mathrm{rad}} = 1/3 a T^4\), and other effects of radiation are small but not entirely negligible. Coulomb interactions between particles in the gas need to be taken into account; a measure of their importance is given by

$$\begin{aligned} \varGamma _{\mathrm{e}} = {e^2 \over d_{\mathrm{e}} k_{\mathrm{B}}T }, \quad \hbox {with} \quad d_{\mathrm{e}} = \left( {3 \over 4 \pi n_{\mathrm{e}}} \right) ^{1/3}, \end{aligned}$$
(14)

which determines the ratio between the average Coulomb and thermal energy of an electron; here e is the charge of an electron, and \(d_{\mathrm{e}}\) is the average distance between the electrons, \(n_{\mathrm{e}}\) being the electron density per unit volume. Also, in the core effects of partial electron degeneracy must be included; the importance of degeneracy is measured by

$$\begin{aligned} \zeta _{\mathrm{e}} = \lambda _{\mathrm{e}}^3 n_{\mathrm{e}} = {4 \over \sqrt{\pi }} F_{1/2} (\psi ) \simeq 2 e^\psi , \end{aligned}$$
(15)

where

$$\begin{aligned} \lambda _{\mathrm{e}} = {h \over (2 \pi m_{\mathrm{e}} k_{\mathrm{B}}T)^{1/2}} \end{aligned}$$
(16)

is the de Broglie wavelength of an electron, h being Planck’s constant and \(m_{\mathrm{e}}\) the mass of an electron. In Eq. (15) \(\psi \) is the electron degeneracy parameter and \(F_\nu (\psi )\) is the Fermi integral,

$$\begin{aligned} F_\nu (y) = \int _0^\infty {x^\nu \over 1 + \exp (x-y)} \mathrm{d}x. \end{aligned}$$
(17)
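The weak-degeneracy limit \(\zeta _{\mathrm{e}} \simeq 2 e^\psi \) in Eq. (15) can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the integrand as \(x^\nu /[1 + \exp (x - \psi )]\), the form for which that limit holds, substitutes \(x = t^2\) to remove the square-root behaviour at the origin, and uses a plain composite Simpson rule. The function names, the cut-off `t_max` and the step count are choices made here for illustration.

```python
import math

def fermi_integral(nu, psi, t_max=8.0, n=4000):
    """F_nu(psi) = int_0^inf x^nu / (1 + exp(x - psi)) dx.

    Uses the substitution x = t^2 and a composite Simpson rule on
    [0, t_max]; n must be even. The integrand is negligible beyond
    t_max for the moderately negative psi of interest here.
    """
    def integrand(t):
        return 2.0 * t ** (2.0 * nu + 1.0) / (1.0 + math.exp(t * t - psi))

    h = t_max / n
    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def zeta_e(psi):
    """Degeneracy measure of Eq. (15): (4 / sqrt(pi)) F_{1/2}(psi)."""
    return 4.0 / math.sqrt(math.pi) * fermi_integral(0.5, psi)

# Ratio of the exact value to the weak-degeneracy approximation 2 exp(psi).
ratio_weak = zeta_e(-5.0) / (2.0 * math.exp(-5.0))
ratio_moderate = zeta_e(-1.5) / (2.0 * math.exp(-1.5))
print(f"psi = -5.0: {ratio_weak:.4f}; psi = -1.5: {ratio_moderate:.4f}")
```

The ratio approaches unity as \(\psi \) becomes more negative, while at \(\psi \simeq -1.5\) the approximation is already off by several per cent.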

The last approximation in Eq. (15) is valid for small degeneracy, \(\psi \ll -1\); in this case the correction to the electron pressure \(p_{\mathrm{e}}\), relative to the value for an ideal non-degenerate electron gas, is

$$\begin{aligned} {p_{\mathrm{e}} \over n_{\mathrm{e}} k_{\mathrm{B}}T} -1 \simeq 2^{-5/2} e^\psi \simeq 2^{-7/2} \zeta _{\mathrm{e}} \end{aligned}$$
(18)

(see also Chandrasekhar 1939). Finally, the mean thermal energy of an electron is not negligible compared with the rest-mass energy of the electron near the solar centre, so relativistic effects should be taken into account; their importance is measured by

$$\begin{aligned} x_{\mathrm{e}} = {k_{\mathrm{B}}T \over m_{\mathrm{e}} {\tilde{c}}^2}; \end{aligned}$$
(19)

at the centre of the present Sun \(x_{\mathrm{e}} \simeq 0.0026\). As an important example, the relativistic effects cause a change

$$\begin{aligned} {\delta \varGamma _1 \over \varGamma _1} \simeq - {2 + 2X \over 3 +5 X} x_{\mathrm{e}} \end{aligned}$$
(20)

in \(\varGamma _1\), which is readily detectable from helioseismic analyses (Elliott and Kosovichev 1998).

The magnitudes of these departures from a simple ideal gas are summarized in Fig. 2, for a standard solar model. Given the precision of helioseismic inferences, none of the effects can be ignored. Coulomb effects are relatively substantial throughout the model, although peaking near the surface. Inclusion of these effects, in the so-called MHD equation of state (see below), was shown by Christensen-Dalsgaard et al. (1988) to lead to a substantial improvement in the agreement between the observed and computed frequencies. Electron degeneracy has a significant effect in the core of the model while, as already noted, relativistic effects for the electrons have been detected in helioseismic inversion (Elliott and Kosovichev 1998).
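To give a feeling for the numbers, the measures in Eqs. (14), (15), (19) and (20), together with \(p_{\mathrm{rad}}/p\), can be estimated at the solar centre. This is a rough sketch only: the central values (\(\rho = 150 \,\mathrm{g\,cm^{-3}}\), \(T = 1.57 \times 10^7 \,\mathrm{K}\), \(X = 0.34\), \(Z = 0.02\)) are assumed round numbers, the electron density is taken from full ionization as \(n_{\mathrm{e}} = \rho (1 + X)/(2 m_{\mathrm{u}})\), and the gas pressure from the ideal-gas law of Eq. (12). Gaussian CGS units are used throughout.

```python
import math

# Physical constants in Gaussian CGS units.
K_B = 1.380649e-16         # Boltzmann constant [erg/K]
M_U = 1.66053907e-24       # atomic mass unit [g]
M_E = 9.1093837e-28        # electron mass [g]
E_CHARGE = 4.80320471e-10  # elementary charge [esu]
H_PLANCK = 6.62607015e-27  # Planck constant [erg s]
C_LIGHT = 2.99792458e10    # speed of light [cm/s]
A_RAD = 7.565733e-15       # radiation density constant [erg cm^-3 K^-4]

def nonideal_measures(rho, T, X, Z=0.02):
    """Rough estimates of the non-ideal measures for a fully ionized gas."""
    kT = K_B * T
    n_e = rho * (1.0 + X) / (2.0 * M_U)          # assumes full ionization
    d_e = (3.0 / (4.0 * math.pi * n_e)) ** (1.0 / 3.0)
    gamma_e = E_CHARGE**2 / (d_e * kT)           # Eq. (14)
    lambda_e = H_PLANCK / math.sqrt(2.0 * math.pi * M_E * kT)  # Eq. (16)
    zeta_e = lambda_e**3 * n_e                   # Eq. (15)
    x_e = kT / (M_E * C_LIGHT**2)                # Eq. (19)
    dgamma1 = -(2.0 + 2.0 * X) / (3.0 + 5.0 * X) * x_e  # Eq. (20)
    mu = 4.0 / (3.0 + 5.0 * X - Z)               # Eq. (13)
    p_gas = K_B * rho * T / (mu * M_U)           # Eq. (12)
    p_rad = A_RAD * T**4 / 3.0
    return {
        "Gamma_e": gamma_e,
        "zeta_e": zeta_e,
        "x_e": x_e,
        "dGamma1/Gamma1": dgamma1,
        "p_rad/p": p_rad / (p_gas + p_rad),
    }

# Assumed illustrative central conditions (see lead-in).
centre = nonideal_measures(rho=150.0, T=1.57e7, X=0.34)
for name, value in centre.items():
    print(f"{name:>15s} = {value:.3e}")
```

With these inputs \(x_{\mathrm{e}}\) comes out close to the value 0.0026 quoted above, and the relative change in \(\varGamma _1\) is of order \(10^{-3}\), small but well within reach of helioseismic precision.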

Fig. 2
figure 2

Measures of non-ideal effects in the equation of state in a model of the present Sun (Model S; see Sect. 4.1), as a function of fractional radius (top panel) and temperature (bottom panel). The solid line shows \(\varGamma _{\mathrm{e}}\) (cf. Eq. 14) which measures the importance of Coulomb effects. The short-dashed line shows \(\zeta _{\mathrm{e}}\) (cf. Eq. 15) which measures effects of electron degeneracy. (Note that in \(\varGamma _{\mathrm{e}}\) and \(\zeta _{\mathrm{e}}\) the electron number density was obtained with the CEFF equation of state; see below.) The long-dashed line shows \(x_{\mathrm{e}}\) (cf. Eq. 19), the ratio between the thermal energy and rest-mass energy of electrons. Finally, the double-dot-dashed line shows \(p_{\mathrm{rad}}/p\), the ratio between radiation and total pressure

The computation of the equation of state has been reviewed by Däppen (1993, 2004, 2007, 2010), Christensen-Dalsgaard and Däppen (1992) and Baturin et al. (2013). Extensive discussions of issues related to the equation of state in astrophysical systems were provided by Čelebonović et al. (2004). The procedures can be divided into what has been called the chemical picture and the physical picture. In the former, the gas is treated as a mixture of different components (molecules, atoms, ions, nuclei and electrons) each contributing to the thermodynamical quantities. Approximations to the contributions from these components are used to determine the free energy of the system, and the equilibrium state is determined by minimizing the free energy at given temperature and density, say, under the relevant stoichiometric constraints. The level of complexity and, one may hope, realism of the formulation depends on the treatment of the different contributions to the free energy. In the physical picture, the basic constituents are taken to be nuclei and electrons, and the state of the gas, including the formation of ions and atoms, derives from the interaction between these constituents. In practice, this is dealt with in terms of activity expansions (Rogers 1981), the level of complexity depending on the number of terms included.

A simple form of the chemical picture is the so-called EFF equation of state (Eggleton et al. 1973). This treats ionization with the basic Saha equation, although adding a contribution to the free energy which ensures full ionization at high electron densities. Partial degeneracy and relativistic effects are covered with an approximate expansion. Because of its simplicity it can be included directly in a stellar evolution code and hence it has found fairly widespread use; however, it is certainly not sufficiently accurate to be used for computation of realistic solar models. An extension of this treatment, the CEFF equation of state including in addition Coulomb effects treated in the Debye–Hückel approximation, was introduced by Christensen-Dalsgaard and Däppen (1992). A comprehensive equation of state based on the chemical treatment has been provided in the so-called MHDFootnote 14 equation of state (Mihalas et al. 1988, 1990; Däppen et al. 1988; Nayfonov et al. 1999). This includes a probabilistic treatment of the occupation of states in atoms and ions (Hummer and Mihalas 1988), based on the perturbations caused by surrounding neutral and charged constituents of the gas, and including excluded-volume effects. Also, Coulomb effects and effects of partial degeneracy are taken into account. The MHD treatment and other physically realistic equations of state are too complex (so far) to be included directly into stellar evolution codes. Instead, they are used to set up tables which are then interpolated to obtain the quantities required in the evolution calculation. Thus both the table properties and the interpolation procedures become important for the accuracy of the representation of the physics. Issues of interpolation were addressed by Baturin et al. (2019).

The physical treatment of the equation of state, for realistic stellar mixtures, has been developed by the OPAL group at the Lawrence Livermore National Laboratory, in what they call the ACTEX equation of state (for ACTivity EXpansion), in connection with the calculation of opacities. For this purpose it has obviously been necessary to extend the treatment to include also a determination of atomic energy levels and their perturbations from the surrounding medium. The result is often referred to as the OPAL tables. Extensive tables, in the following OPAL 1996, were initially provided by Rogers et al. (1996), with later updates presented by Rogers and Nayfonov (2002).

Interestingly, relativistic effects were ignored in the original formulations of both the MHD and the OPAL tables, while they were included, in approximate form, in the simple formulation of Eggleton et al. (1973). Following the realization by Elliott and Kosovichev (1998), based on helioseismology, that this was inadequate, updated tables taking these effects into account have been produced by Gong et al. (2001b) and Rogers and Nayfonov (2002). The latter tables, with additional updates, are known as the OPAL 2005 equation-of-state tablesFootnote 15 and are seeing widespread use.

To illustrate the effects of using the different formulations, Figs. 3 and 4Footnote 16 show relative differences in p and \(\varGamma _1\) for various equations of state at the conditions in a model of the present Sun, using the OPAL 1996 equation of state as reference. It is clear that the inclusion of Coulomb effects in CEFF captures a substantial part of the inadequacies of the simple EFF formulation, although the remaining differences are certainly very significant. In the bottom panel of Fig. 4 it should be noticed that the MHD and OPAL 1996 formulations share the lack of proper treatment of relativistic effects and hence have very similar behaviour of \(\varGamma _1\) at the highest temperatures. This is corrected in both CEFF and OPAL 2005 which therefore show very similar departures from OPAL 1996 at high temperature. A detailed comparison between the MHD and OPAL formulations was carried out by Trampedach et al. (2006).

Fig. 3
figure 3

Comparison of equations of state at fixed \((\rho , T)\) and composition corresponding to the structure of the present Sun (specifically Model S of Christensen-Dalsgaard et al. 1996), in the sense (modified equation of state)–(model), plotted against the logarithm of the temperature in the model; the model used the original (OPAL 1996; Rogers et al. 1996) equation of state. The top panel shows the difference in pressure and the bottom panel the difference in \(\varGamma _1\). Solid lines show the EFF equation of state (Eggleton et al. 1973), and dashed lines the CEFF equation of state (Christensen-Dalsgaard and Däppen 1992). For the comparison the same relative composition of the heavy elements was chosen for the EFF and CEFF calculations as in the OPAL tables

Fig. 4
figure 4

As Fig. 3, but showing CEFF (black solid lines), the MHD equation of state (Mihalas et al. 1990, red dashed lines), the OPAL 2005 equation of state (Rogers and Nayfonov 2002, green dot-dashed lines), and the SAHA-S equation of state (Gryaznov et al. 2004, blue long-dashed lines). Note that the relative composition of the heavy elements may differ between the different implementations

Further developments of the MHD equation of state have been undertaken to emulate aspects of the OPAL equation of state in a flexible manner, allowing the calculation of extensive consistent and physically more realistic tables (Liang 2004; Däppen and Mao 2009), or developing a similar emulation in the simpler CEFF equation of state, which might enable bypassing the table calculations (Lin and Däppen 2010). A comprehensive update of the MHD equation of state is being prepared by R. Trampedach. The implementation of these developments in solar and stellar model calculations will be very interesting.

An independent development of an equation of state in the chemical picture has been carried out in the so-called SAHA-S formulation (Gryaznov et al. 2004; Baturin et al. 2013, 2017).Footnote 17 Results for this equation of state are shown in Fig. 4 with the blue long-dashed curve. Apart from a rather stronger variation in \(\varGamma _1\) in the atmosphere due to the wide variety of molecular species included, the SAHA-S formulation is clearly quite similar to OPAL 2005. Also, Alan W. Irwin has developed the FreeEOS formulation,Footnote 18 based on free-energy minimization (see Cassisi et al. 2003a), which allows rapid calculation of an equation of state that closely matches the OPAL equation of state.

2.3.2 Opacity

In stellar interiors, the diffusion approximation for radiative transfer, implied by Eq. (5), is adequate, and the opacity is determined as the Rosseland mean opacity,

$$\begin{aligned} \kappa ^{-1} \equiv \kappa _{\mathrm{R}}^{-1} = {\pi \over a {\tilde{c}}T^3} \int _0^\infty \kappa _\nu ^{-1} {\mathrm{d}B_\nu \over \mathrm{d}T} \mathrm{d}\nu \end{aligned}$$
(21)

(Rosseland 1924), where \(\kappa _\nu \) is the monochromatic opacity at (radiation) frequency \(\nu \) and \(B_\nu \) is the Planck function. The computation of stellar opacities is generally so complicated that opacities have to be obtained in stellar modelling through interpolation in tables. The computation of the tables includes contributions of transitions between the different levels of the atoms and ions in the gas, including as far as possible the effects of level perturbations; an extensive review of opacity calculations was provided by Pain et al. (2017). The thermodynamic state of the gas, including the degrees of ionization and the distribution amongst the levels, is an important ingredient in the calculation; indeed, both the MHD and the OPAL equations of state were developed as bases for new opacity calculations. Within the convection zone, solar structure is essentially independent of opacity, since the temperature gradient is nearly adiabatic. Below the convection zone the opacity is dominated by heavy elements; hence it is sensitive not only to the total heavy-element abundance Z but also to the relative distribution of the individual elements. This is illustrated in Fig. 5 showing the sensitivity of the opacity to variations in the dominant contributions to the heavy elements. Evidently iron is an important contribution to the opacity, particularly in the solar core, but other elements such as oxygen, neon and silicon also play major roles. Modelling the solar atmosphere requires low-temperature opacities, including effects of molecules; in the calculation of the structure of calibrated solar models the resulting uncertainties are largely suppressed by changes in the treatment of convection (cf. Fig. 28).
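The structure of Eq. (21), a harmonic mean of \(\kappa _\nu \) weighted by \(\mathrm{d}B_\nu /\mathrm{d}T\), can be illustrated with a toy calculation. The sketch below is not an opacity calculation: it works in the dimensionless frequency \(u = h \nu /(k_{\mathrm{B}}T)\), uses the fact that \(\int _0^\infty (\mathrm{d}B_\nu /\mathrm{d}T)\, \mathrm{d}\nu = a {\tilde{c}} T^3/\pi \) so that the prefactor in Eq. (21) is simply the normalization of the weight, and applies the mean to two assumed test opacities (a grey opacity and a \(\nu ^{-3}\) law). All function names are illustrative.

```python
import math

def rosseland_weight(u):
    """Shape of dB_nu/dT as a function of u = h nu / (k_B T), up to a constant."""
    if u == 0.0:
        return 0.0
    em = math.exp(-u)
    return u**4 * em / (1.0 - em) ** 2

def simpson(f, a, b, n=6000):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def rosseland_mean(kappa_of_u, u_max=60.0):
    """Weighted harmonic mean of kappa(u), following the form of Eq. (21)."""
    def weighted_inverse(u):
        # Guard u = 0, where the weight vanishes and kappa may diverge.
        return rosseland_weight(u) / kappa_of_u(u) if u > 0.0 else 0.0

    numerator = simpson(weighted_inverse, 0.0, u_max)
    normalization = simpson(rosseland_weight, 0.0, u_max)
    return normalization / numerator

# Toy test opacities (arbitrary units), assumed purely for illustration.
kappa_grey = rosseland_mean(lambda u: 0.4)
kappa_power_law = rosseland_mean(lambda u: u ** -3.0)
print(f"grey: {kappa_grey:.3f}, nu^-3 law: {kappa_power_law:.3e}")
```

A grey opacity is returned unchanged, while for the \(\nu ^{-3}\) law the mean is dominated by the high-frequency windows where the opacity is low. This is the reason why line absorption that fills in such windows has a disproportionate effect on \(\kappa _{\mathrm{R}}\).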

Fig. 5
figure 5

Courtesy of H. M. Antia

Logarithmic derivatives of the opacity with respect to contributions to the total heavy-element abundance of the different elements indicated, evaluated for OPAL opacities (Iglesias and Rogers 1996) in the radiative part of a standard solar model. The vertical dotted line marks the temperature at the base of the convection zone in the present Sun.

Early models used for helioseismic analysis generally used the Cox and Stewart (1970) and Cox and Tabor (1976) tables. An early inference of the solar internal sound speed (Christensen-Dalsgaard et al. 1985) showed that the solar sound speed was higher below the convection zone than the sound speed of a model using the Cox and Tabor (1976) tables, prompting the suggestion that the opacity had to be increased by around 20% at temperatures higher than \(2 \times 10^6 \,\mathrm{K}\). This followed an earlier plea by Simon (1982) for a reexamination of the opacity calculations in connection with problems in the interpretation of double-mode Cepheids and in the understanding of the excitation of oscillations in \(\beta \) Cephei stars; it was subsequently demonstrated by Andreasen and Petersen (1988) that agreement between observed and computed period ratios for double-mode \(\delta \) Scuti stars and Cepheids could be obtained by a substantial opacity increase, by a factor of 2.7, in the range \(\log T = 5.2{-}5.9\).

These results motivated a reanalysis of the opacities by the Livermore group, who pointed out (Iglesias et al. 1987) that the contribution from line absorption in metals had been seriously underestimated in earlier opacity calculations. This work resulted in the OPAL tables (e.g., Iglesias and Rogers 1991; Iglesias et al. 1992; Rogers and Iglesias 1992, 1994, in the following OPAL92). Owing to the inclusion of numerous transitions in iron-group elements and a better treatment of the level perturbations and associated line broadening these new calculations did indeed show very substantial opacity increases, qualitatively matching the requirements from the helioseismic sound-speed inference; also, this led largely to agreement with evolution models of the period ratios for RR Lyrae and Cepheid double-mode pulsators (e.g., Cox 1991; Moskalik et al. 1992; Kanbur and Simon 1994) and to opacity-driven instability in the \(\beta \) Cephei models (e.g., Cox et al. 1992; Kiriakidis et al. 1992; Moskalik and Dziembowski 1992). These results are excellent examples of stellar pulsations, and in particular helioseismology, providing input to the understanding of basic physical processes.

The OPAL tables, with further developments (e.g., Iglesias and Rogers 1996, in the following OPAL96),Footnote 19 have seen widespread use in solar and stellar modelling. In parallel with the OPAL calculations, independent calculations were carried out within the Opacity Project (OP) (Seaton et al. 1994), with results in good agreement with those of OPAL96 at relatively low density and temperature, although larger discrepancies were found under conditions relevant to the solar radiative interior (Iglesias and Rogers 1995). More recent updates to the OP opacities, in the following OP05, have decreased these discrepancies substantially, to a level of 5–10% (Seaton and Badnell 2004; Badnell et al. 2005).Footnote 20 A recent effort is under way at the CEA, France, resulting in the so-called OPAS tablesFootnote 21 (Blancard et al. 2012; Mondet et al. 2015). Also, the Los Alamos group has updated their calculations, in the OPLIB tables (Colgan et al. 2016).Footnote 22 A review of these recent opacity results was provided by Turck-Chièze et al. (2016), while Fig. 6 shows a comparison of the opacity values in a model of the present Sun.

Fig. 6
figure 6

Figure courtesy of Aldo Serenelli

Comparison of the OPAL, OPLIB and OPAS opacities (see text) relative to the OP opacities. The dashed curves are for the Grevesse and Sauval (1998) composition, while the solid curves are for the Asplund et al. (2009) composition (see also Sect. 6.1). From Villante, Serenelli and Vinyoles (in preparation).

The opacity tables discussed so far typically include few or no molecular lines. Thus the opacity at low temperature (often taken to be below \(10^4 \,\mathrm{K}\)) must be obtained from separate tables, suitably matched to the opacity at higher temperature. Tables provided by Kurucz (1991) and Alexander and Ferguson (1994) have often been used. A set of tables with a more complete equation of state and improved treatment of grains was provided by Ferguson et al. (2005).

I note that the potential uncertainties in the opacity calculations have gained renewed interest in connection with the apparent discrepancies between helioseismic inferences and solar models computed with revised inferences of solar surface composition. I return to this in Sect. 6.4.

2.3.3 Energy generation

The basic energy generation in the Sun takes place through hydrogen fusion to helium which may be schematically written as

$$\begin{aligned} 4 {{}^{1}\mathrm{H}}\rightarrow {{}^{4}\mathrm{He}}+ 2 \mathrm{e}^+ + 2 \nu _{\mathrm{e}}. \end{aligned}$$
(22)

Here the emission of the two positrons results from the required conversion of two protons to neutrons, as also implied by conservation of charge in the process, and the two electron neutrinos ensure conservation of lepton number. Evidently the positrons are immediately annihilated by two electrons, resulting in further release of energy. Thus the net reaction can formally be regarded as the fusion of four hydrogen atoms into a helium atom; this is convenient from the point of view of calculating the energy release based on tables of atomic masses. The result is that each reaction in Eq. (22) generates 26.73 MeV. However, the neutrinos have a negligible probability for interaction with matter in the Sun, and hence the energy contributed to the neutrinos must be subtracted to obtain the energy generation rate \(\epsilon \) actually available to the Sun. Thus \(\epsilon \) depends on the energy of the emitted neutrinos and hence on the details of the reactions resulting in the net reaction in Eq. (22). As discussed in Sect. 5.2 detection of the emitted neutrinos provides a crucial confirmation of the presence of nuclear reactions in the solar core and a probe of the properties of the neutrinos.

The detailed properties of nuclear reactions in stellar interiors have been discussed by, for example, Clayton (1968). Reactions require tunneling through the potential barrier resulting from the Coulomb repulsion between the two nuclei. Thus to a first approximation reactions between more highly charged nuclei are expected to have a lower probability. Also, the temperature dependence of the reactions depends strongly on the charges of the reacting nuclei. The dependence on temperature of the reaction rate \(r_{12}\) between two nuclei 1 and 2 is often approximated as \(r_{12} \propto T^n\), where

$$\begin{aligned} n = {\eta -2 \over 3}, \quad \eta = 42.487 ({{\mathcal {Z}}}_1^2 {{\mathcal {Z}}}_2^2 {{\mathcal {A}}})^{1/3} T_6^{-1/3}; \end{aligned}$$
(23)

here \({{\mathcal {Z}}}_1 e\) and \({{\mathcal {Z}}}_2 e\) are the charges of the two nuclei, \({{\mathcal {A}}} = {{\mathcal {A}}}_1 {{\mathcal {A}}}_2/( {{\mathcal {A}}}_1 + {{\mathcal {A}}}_2)\) is the reduced mass of the nuclei in atomic mass units, \({{\mathcal {A}}}_1\) and \({{\mathcal {A}}}_2\) being the masses of the nuclei, and \(T_6 = T/(10^6 \,\mathrm{K})\).Footnote 23 However, the specific properties of the interacting nuclei also play a major role for the reaction rate. Furthermore, the conversion of protons into neutrons and the production of neutrinos involve the weak interaction which takes place with comparatively low probability. This has a strong effect on the rates of reactions where this conversion takes place.
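The strong dependence of the temperature exponent on the nuclear charges is easily illustrated. The sketch below evaluates \(n\) in the standard Gamow-peak form, in which the charges enter as \(({{\mathcal {Z}}}_1 {{\mathcal {Z}}}_2)^2\) inside the cube root; for \({{\mathcal {Z}}}_1 = {{\mathcal {Z}}}_2 = 1\) this distinction is immaterial. Mass numbers are used in place of the exact nuclear masses, and the temperature \(T_6 = 15\), roughly that of the solar centre, is an assumed illustrative value.

```python
def reduced_mass(a1, a2):
    """Reduced mass in atomic mass units."""
    return a1 * a2 / (a1 + a2)

def temperature_exponent(z1, z2, a1, a2, t6):
    """Exponent n in r_12 ~ T^n from the Gamow-peak approximation.

    z1, z2: nuclear charges in units of e; a1, a2: masses in atomic
    mass units (mass numbers are used here); t6: temperature in 10^6 K.
    """
    eta = (42.487
           * (z1**2 * z2**2 * reduced_mass(a1, a2)) ** (1.0 / 3.0)
           * t6 ** (-1.0 / 3.0))
    return (eta - 2.0) / 3.0

T6_CENTRE = 15.0  # assumed illustrative value
n_pp = temperature_exponent(1, 1, 1.0, 1.0, T6_CENTRE)     # 1H + 1H
n_n14 = temperature_exponent(7, 1, 14.0, 1.0, T6_CENTRE)   # 14N + 1H
print(f"n(p + p) = {n_pp:.1f}, n(14N + p) = {n_n14:.1f}")
```

For the proton-proton reaction the exponent is close to 4, whereas for proton capture on a nucleus of charge 7 it is around 20, showing how rapidly the temperature sensitivity grows with the Coulomb barrier.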

The net reaction in Eq. (22) obviously has to take place through a number of intermediate steps. The dominant series of reactions starts directly with the fusion of two hydrogen nuclei; the full sequence of reactions isFootnote 24

$$\begin{aligned} {{}^{1}\mathrm{H}}({{}^{1}\mathrm{H}}, \mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{2}\mathrm{D}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{3}\mathrm{He}}({{}^{3}\mathrm{He}},2 {{}^{1}\mathrm{H}})\, {{}^{4}\mathrm{He}}. \end{aligned}$$
(24)

This sequence of reactions is known as the PP-I chain and clearly corresponds to Eq. (22). The average energy of the neutrinos lost in the first reaction in the chain is 0.263 MeV. Thus the effective energy production for each resulting \({{}^{4}\mathrm{He}}\) is 26.21 MeV.

Two alternative chains, PP-II and PP-III, continue with the fusion of \({{}^{3}\mathrm{He}}\) and \({{}^{4}\mathrm{He}}\) after the production of \({{}^{3}\mathrm{He}}\):

$$\begin{aligned} \begin{aligned} {{}^{3}\mathrm{He}}({{}^{4}\mathrm{He}}, \gamma )\,&{{}^{7}\mathrm{Be}}(\mathrm{e^-}, \nu _{\mathrm{e}})\,{{}^{7}\mathrm{Li}}({{}^{1}\mathrm{H}}, {{}^{4}\mathrm{He}})\,{{}^{4}\mathrm{He}}\qquad \qquad \,\, (\hbox {PP-II}) \\&\,\,\Downarrow \\&{{}^{7}\mathrm{Be}}({{}^{1}\mathrm{H}}, \gamma )\, {{}^{8}\mathrm{B}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{8}\mathrm{Be}}(, {{}^{4}\mathrm{He}})\,{{}^{4}\mathrm{He}}\qquad (\hbox {PP-III}) \end{aligned} \end{aligned}$$
(25)

Here the total average neutrino losses per produced \({{}^{4}\mathrm{He}}\) are 1.06 MeV and 7.46 MeV, respectively. At the centre of the present Sun the contributions of the PP-I, PP-II and PP-III reactions to the total energy generation by the PP chains, excluding neutrinos, are 23, 77 and 0.2%, respectively; owing to a much higher temperature sensitivity of the PP-II and PP-III chains the corresponding contributions to the solar luminosity are 77, 23 and 0.02%. However, even though insignificant for the energy generation, the PP-III chain is very important for the study of neutrino emission from the Sun due to the high energies of the neutrinos emitted in the decay of \({{}^{8}\mathrm{B}}\).
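The energy bookkeeping for the three chains follows directly from the numbers quoted above: 26.73 MeV per \({{}^{4}\mathrm{He}}\) produced, less the average neutrino losses. The sketch below simply carries out this subtraction; the only added ingredient is that each \({{}^{4}\mathrm{He}}\) produced through PP-I requires the first reaction to occur twice, so that the loss in that chain is \(2 \times 0.263\) MeV. Variable names are illustrative.

```python
Q_TOTAL_MEV = 26.73  # energy release per 4He produced, Eq. (22)

# Average neutrino energy lost per 4He produced [MeV].
NEUTRINO_LOSS_MEV = {
    "PP-I": 2 * 0.263,  # two p + p reactions per 4He
    "PP-II": 1.06,
    "PP-III": 7.46,
}

effective_mev = {
    chain: Q_TOTAL_MEV - loss for chain, loss in NEUTRINO_LOSS_MEV.items()
}
loss_fraction = {
    chain: loss / Q_TOTAL_MEV for chain, loss in NEUTRINO_LOSS_MEV.items()
}

for chain in NEUTRINO_LOSS_MEV:
    print(f"{chain:7s} effective energy {effective_mev[chain]:6.2f} MeV, "
          f"neutrino loss {100.0 * loss_fraction[chain]:4.1f}%")
```

With these rounded inputs PP-I yields about 26.2 MeV per \({{}^{4}\mathrm{He}}\), matching the value quoted above to within rounding of the last digit, while more than a quarter of the energy released in PP-III is carried away by neutrinos.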

Of the reactions in the PP chains the initial reaction, fusing two hydrogen nuclei, has by far the lowest rate per pair of reacting nuclei. This is a result of the effect of the weak interaction in the conversion of a proton into a neutron, coupled with the penetration of the Coulomb barrier.Footnote 25 Thus the overall rate of the chains is controlled by this reaction; since the charges of the interacting nuclei are relatively low, it has a modest temperature sensitivity, approximately as \(T^4\) [cf. Eq. (23)]. The distribution of the reactions between the different branches depends on the branching ratios at the reactions destroying \({{}^{3}\mathrm{He}}\) and \({{}^{7}\mathrm{Be}}\); as a result PP-II and in particular PP-III become more important with increasing temperature, with important consequences for the neutrino spectrum of the Sun.

In principle, the full reaction network should be considered as a function of time, to follow the changing abundances resulting from the nuclear reactions. In practice the relevant reaction timescales for the reactions involving \({{}^{2}\mathrm{D}}\), \({{}^{7}\mathrm{Be}}\) and \({{}^{7}\mathrm{Li}}\) are so short that the reactions can be assumed to be in equilibrium under solar conditions (e.g., Clayton 1968); the resulting equilibrium abundances are minute.Footnote 26 On the other hand, the timescales for the reactions involving \({{}^{3}\mathrm{He}}\) are comparable with the timescale of solar evolution, at least in the outer parts of the core; thus the calculation should follow the detailed evolution with time of the \({{}^{3}\mathrm{He}}\) abundance. The resulting abundance profile in a model of the present Sun is illustrated in Fig. 7; below the maximum \({{}^{3}\mathrm{He}}\) has reached nuclear equilibrium, with an abundance that increases with decreasing temperature. The location of this maximum moves further out with increasing age. It was found by Christensen-Dalsgaard et al. (1974) that the establishment of this \({{}^{3}\mathrm{He}}\) profile caused instability to a few low-degree g modes early in the evolution of the Sun.

Fig. 7

Evolution of the abundance of \({}^3 \mathrm{He}\). The solid curve shows the abundance in a model of the present Sun, while the dotted, dashed, dot-dashed, double-dot-dashed and long-dashed curves show the abundances at ages 0.5, 1.0, 2.0, 2.9 and 3.9 Gyr, respectively. The initial abundance was assumed to be zero

The primordial abundances of light elements, as inferred from solar-system abundances, are crucial constraints on models of the Big Bang (e.g. Geiss and Gloeckler 2007). This includes the abundances of \({{}^{2}\mathrm{D}}\) and \({{}^{3}\mathrm{He}}\), with \({{}^{2}\mathrm{D}}\) burning (cf. Eq. 24) taking place at sufficiently low temperature that the primordial \({{}^{2}\mathrm{D}}\) has largely been converted to \({{}^{3}\mathrm{He}}\). The \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio can be determined from the solar wind; the resulting value can probably be taken as representative for matter in the solar convection zone and hence provides a constraint on the extent to which the convection zone has been enriched by \({{}^{3}\mathrm{He}}\) resulting from hydrogen burning. This was used by, for example, Schatzman et al. (1981), Lebreton and Maeder (1987) and Vauclair and Richard (1998) to constrain the extent of turbulent mixing beneath the convection zone. Heber et al. (2003) investigated the time variation in the \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio from analysis of lunar regolith samples. After correction for secondary processes, using the presumed constant \({}^{20}\mathrm{Ne}/{}^{22}\mathrm{Ne}\) as reference, they deduced that the \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio has been approximately constant over roughly the past 4 Gyr, with an average value for the ratio of number densities of \((4.47 \pm 0.13) \times 10^{-4}\). This provides a further valuable constraint on the mixing history below the solar convection zone.Footnote 27

A second set of processes resulting in the net reaction in Eq. (22) involves successive reactions with isotopes of carbon, nitrogen and oxygen:

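$$\begin{aligned} {{}^{12}\mathrm{C}}+ {{}^{1}\mathrm{H}}&\rightarrow {{}^{13}\mathrm{N}}+ \gamma , \\ {{}^{13}\mathrm{N}}&\rightarrow {{}^{13}\mathrm{C}}+ \mathrm{e}^+ + \nu _{\mathrm{e}}, \\ {{}^{13}\mathrm{C}}+ {{}^{1}\mathrm{H}}&\rightarrow {{}^{14}\mathrm{N}}+ \gamma , \\ {{}^{14}\mathrm{N}}+ {{}^{1}\mathrm{H}}&\rightarrow {{}^{15}\mathrm{O}}+ \gamma , \\ {{}^{15}\mathrm{O}}&\rightarrow {{}^{15}\mathrm{N}}+ \mathrm{e}^+ + \nu _{\mathrm{e}}, \\ {{}^{15}\mathrm{N}}+ {{}^{1}\mathrm{H}}&\rightarrow {{}^{12}\mathrm{C}}+ {{}^{4}\mathrm{He}}. \end{aligned}$$
(26)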

This CNO cycle is obviously a catalytic process, with the net result of converting hydrogen into helium. The reaction with the lowest rate in this cycle is proton capture on \({{}^{14}\mathrm{N}}\) which therefore controls the overall rate of the cycle; this leads to a temperature dependence of roughly \(T^{20}\) under solar conditions, owing to the high nuclear charge of nitrogen [cf. Eq. (23)]. As a result, the CNO cycle is significant mainly very near the solar centre, and its importance increases rapidly with increasing age of the model, due to the increase in core temperature (cf. Fig. 8a). Owing to the strong temperature dependence it is strongly concentrated near the centre, as illustrated in Fig. 8b. Thus, although in the present Sun the central contribution to the energy-generation rate is 11%, the CNO cycle only contributes 1.3% to the luminosity. As a consequence of the \({}^{14}\mathrm{N}\) bottleneck in the CN cycles almost all the initial carbon is converted into nitrogen by the reactions. An additional side branch mainly serves to convert oxygen into nitrogen; under the conditions leading up to the present Sun this is relatively unimportant, causing an increase in the central abundance of \({{}^{14}\mathrm{N}}\) by around 12% in the present Sun, relative to the initial abundance.
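The temperature exponents quoted for the PP chains (\(T^4\)) and the CNO cycle (\(T^{20}\)) can be checked with a rough estimate. The sketch below assumes the standard non-resonant (Gamow-peak) form, in which the rate varies locally as \(T^n\) with \(n = (\tau - 2)/3\) and \(\tau = 42.48\,(Z_1^2 Z_2^2 A/T_6)^{1/3}\). The numerical coefficient, the use of mass numbers in place of nuclear masses in the reduced mass \(A\), the function name and the illustrative central temperature of \(15 \times 10^6\,\mathrm{K}\) are all assumptions made here; they may differ in detail from the expression in Eq. (23).

```python
def gamow_exponent(z1, z2, a1, a2, t6):
    """Return n = d ln(rate) / d ln T for a non-resonant reaction.

    Assumes rate ~ tau^2 exp(-tau) with tau = 42.48 (Z1^2 Z2^2 A / T6)^(1/3),
    where A is the reduced mass in atomic mass units (approximated here by
    mass numbers) and T6 is the temperature in units of 10^6 K.
    """
    reduced_mass = a1 * a2 / (a1 + a2)
    tau = 42.48 * (z1**2 * z2**2 * reduced_mass / t6) ** (1.0 / 3.0)
    return (tau - 2.0) / 3.0


T6_CENTRE = 15.0  # assumed, illustrative central temperature in 10^6 K

n_pp = gamow_exponent(1, 1, 1, 1, T6_CENTRE)    # 1H + 1H
n_n14 = gamow_exponent(7, 1, 14, 1, T6_CENTRE)  # 14N + 1H

print(f"1H + 1H : n = {n_pp:.1f}")
print(f"14N + 1H: n = {n_n14:.1f}")
```

Under these assumptions the exponents come out close to 4 and 20, respectively; the difference arises entirely from the factor \(Z_1^2 Z_2^2 A\), which is about 90 times larger for proton capture on \({{}^{14}\mathrm{N}}\).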

Fig. 8

Contributions of the CNO cycle to the energy generation in a solar model. Top panel: the ratio of \(\epsilon _{\mathrm{CNO}}\) to the total \(\epsilon \) at the centre of the model, as a function of age. Bottom panel: the fractional contribution \(\epsilon _{\mathrm{CNO}}/\epsilon \) as a function of position in a model of the present Sun

The computation of nuclear reaction rates requires nuclear parameters, determined from experiments or, in the case of the \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reaction, from theoretical considerations. In addition to affecting the energy-generation rate the details of the reactions have a substantial effect on the branching ratios in the PP chains and hence on the production rate of the high-energy \({{}^{8}\mathrm{B}}\) neutrinos. The reaction rate, averaged over the thermal energy distribution of the nuclei, is typically expressed as a function of temperature in terms of a factor describing the penetration of the Coulomb barrierFootnote 28 and a correction factor provided as an expansion in temperature. A substantial number of compilations of data for nuclear reactions have been made, starting with the classical, and much used, sets by Fowler et al. (1967, 1975). Bahcall and Pinsonneault (1995) provided an updated set of parameters specifically for the computation of solar models. Two extensive and commonly used compilations of parameters have been provided by Adelberger et al. (1998) and Angulo et al. (1999). Revised parameters for the important reaction \({{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}\), which controls the overall rate of the CNO cycle, have been obtained (Formicola et al. 2004; Angulo et al. 2005), reducing the rate by a factor of almost 2. An updated set of nuclear parameters specifically for solar modelling was provided by Adelberger et al. (2011), including also the revised rates for \({{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}\).

The nuclear reactions take place in a plasma, with charged particles that modify the interaction between the nuclei. A classical and widely used treatment of this effect was developed by Salpeter (1954), with a mean-field treatment of the plasma in the Debye–Hückel approximation; this shows that the nuclei are surrounded by clouds of electrons which partly screen the Coulomb repulsion between the nuclei and hence increase the reaction rate. Following criticism of Salpeter’s result by Shaviv and Shaviv (1996), Brüggen and Gough (1997, 2000) made a more careful analysis of the thermodynamical assumptions underlying the derivation, confirming Salpeter’s result and in the second paper extending it to take into account quantum-mechanical exclusion and polarization of the screening cloud; in the solar case, however, such effects are largely insignificant. On the other hand, the mean-field approximation may be questionable in cases, such as the solar core, where the average number of electrons within the radius of the screening cloud is very small. This has given rise to extensive discussions of dynamic effects in the screening (e.g. Shaviv and Shaviv 2001). Bahcall et al. (2002) argued that such effects, and other claims of problems with the Salpeter formulation, were irrelevant. However, molecular-dynamics simulations of stellar plasma strongly suggest that dynamical effects may in fact substantially influence the screening (Shaviv 2004a, b). Further investigations along these lines are clearly needed. Thus it is encouraging that Mussack et al. (2007) started independent molecular-dynamics simulations. Initial results by the group (Mao et al. 2009) confirmed the earlier conclusions by Shaviv; a more detailed analysis by Mussack and Däppen (2011) found evidence for a slight reduction in the reaction rate as a result of plasma effects. Interestingly, Weiss et al. (2001) noted that the solar structure as inferred from helioseismology (cf. Sect. 5.1.2) can be used to constrain the departures from the simple Salpeter formulation; in particular, they found that a model computed assuming no screening was inconsistent with the helioseismically inferred sound speed. These issues clearly need further investigations.

2.3.4 Diffusion and settling

As indicated in Eq. (6) the temporal evolution of stellar internal abundances must take into account effects of diffusion and settling. Crudely speaking, settling due to gravity and thermal effects tends to establish composition gradients; diffusion, described by the diffusion coefficient \(D_i\), tends to smooth out such gradients, including those that are established through nuclear reactions. A brief review of these processes was provided by Michaud and Proffitt (1993). They were discussed in some detail already by Eddington (1926); he concluded that they might lead to unacceptable changes in surface composition unless suppressed by processes that redistributed the composition, such as circulation.

A brief review of diffusion was provided by Thoul and Montalbán (2007). The basic equations describing the microscopic motion of matter in a star are the Boltzmann equations for the velocity distribution of each type of particle. The treatment of diffusion and settling in stars has generally been based on approximate solutions of the Boltzmann equations presented by Burgers (1969). This results in a set of equations for momentum, energy and mass conservation for each species which can be solved numerically to obtain the relevant quantities such as \(D_i\) and \(V_i\) in Eq. (6). The equations depend on the collisions between particles in the gas, greatly complicated by the long-range nature of the Coulomb force between charged particles (electrons and ions); these are typically described in terms of coefficients based on the screened Debye–Hückel potential, mentioned above in connection with Coulomb effects in the equation of state and electron screening in nuclear reactions, and depending on the ionization state of the ions. As emphasized initially by Michaud (1970) the gravitational force on the particles may be modified by radiative effects, depending on the detailed ionization and excitation state of the individual species and hence varying strongly between different elements or with position in the star.Footnote 29 It should be noted that the typical diffusion and settling timescales, although possibly short on a stellar evolution timescale, are generally much longer than the timescales associated with large-scale hydrodynamical motions. Thus regions affected by such motion, particularly convection zones, can generally be assumed to be fully mixed; in the solar case microscopic diffusion and settling are only relevant beneath the convective envelope. Formally, hydrodynamical mixing can be incorporated by maintaining Eq. (6) but with a very large value of \(D_i\) (e.g., Eggleton 1971).

Michaud and Proffitt (1993) presented relatively simple approximations to the diffusion and settling coefficients for hydrogen as well as for heavy elements regarded as trace elements (see also Christensen-Dalsgaard 2008). These were based on solutions of Burgers’ equations, adjusting coefficients to obtain a reasonable fit to the numerical results. These approximations were also compared with the results of the numerical solutions by Thoul et al. (1994) who in addition presented simpler, and rather less accurate, approximate expressions for the coefficients.

Although diffusion and settling have been considered since the early seventies (e.g., Michaud 1970) to explain peculiar abundances in some stars, it seems that Noerdlinger (1977) was the first to include these effects in solar modelling; indeed, the early estimates by Eddington (1926) suggested that the effects would be fairly small. In fact, including helium diffusion and settling Noerdlinger found a reduction of about 0.023 in the surface helium abundance \(Y_{\mathrm{s}}\), from the initial value. Roughly similar results were obtained by Gabriel et al. (1984) and Cox et al. (1989), the latter authors considering a broad range of elements, while Wambsganss (1988) found a much smaller reduction. Proffitt and Michaud (1991) provided a detailed comparison of these early results, although without explaining the discrepant value found by Wambsganss. Bahcall and Pinsonneault (1992a, b) made careful calculations of models with helium diffusion and settling, using the then up-to-date physics, and emphasizing the importance of calibrating the models to yield the observed present surface ratio \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) between the abundances of heavy elements and hydrogen; they found that the inclusion of diffusion and settling increased the neutrino capture rates from the models by up to around 10%. A careful analysis of the effects of heavy-element diffusion and settling on solar models and their neutrino fluxes was presented by Proffitt (1994).

Gabriel et al. (1984) concluded that the inclusion of helium diffusion and settling had little effect on the oscillation frequencies of the model, while Cox et al. (1989), in their more detailed treatment, actually found that the model with diffusion and settling showed a larger difference between observed and model frequencies than did the model that did not include these effects. However, Christensen-Dalsgaard et al. (1993) showed that the inclusion of helium diffusion and settling substantially decreased the difference in sound speed between the Sun and the model, as inferred from a helioseismic differential asymptotic inversion. Further inverse analyses of observed solar oscillation frequencies have confirmed this result, thus strongly supporting the reality of these effects in the Sun and contributing to making diffusion and settling a part of ‘the standard solar model’ (e.g., Christensen-Dalsgaard and Di Mauro 2007). Further evidence is the difference between the initial helium abundance required to calibrate solar models and the helioseismically inferred envelope helium abundance (see Sect. 5.1.2), which is largely accounted for by the effects of helium settling.

Detailed calculations of atomic data for the OPAL and OP opacity projects (cf. Sect. 2.3.2) have allowed precise calculations of the radiative effects on settling (Richer et al. 1998). As mentioned above such effects are highly selective, affecting different elements differently. As a result, not only does the heavy-element abundance change as a result of settling, but the relative mixture of the heavy elements varies as a function of stellar age and position in the star. As is evident from Fig. 5 this has a substantial effect on the opacities. To take such effects consistently into account the opacities must therefore be calculated from the appropriate mixture at each point in the model, requiring appropriately mixing monochromatic contributions from individual elements and calculating the Rosseland mean (cf. Eq. 21). Such calculations are feasible (Turcotte et al. 1998) although obviously very demanding on computing resources in terms of time and storage. Turcotte et al. (1998) carried out detailed calculations of this nature for the Sun. Here the relatively high temperatures and resulting ionization beneath the convective envelope, where diffusion and settling are relevant, result in modest effects of radiative acceleration and little variation in the relative heavy-element abundances. In fact, Fig. 14 of Turcotte et al. shows that neglecting radiative effects and assuming all heavy elements to settle at the same rate, corresponding to fully ionized oxygen, yield results somewhat closer to the full detailed treatment than does neglecting radiative effects and taking partial ionization fully into account. The rather reassuring conclusion is that, as far as solar modelling is concerned, the simple procedure of treating all heavy elements as one is adequate (see also Turcotte and Christensen-Dalsgaard 1998). This simpler approach, neglecting radiative effects, is in fact what is used for the models presented here.

The timescale of diffusion and settling, defined by Eq. (6), increases with increasing density and hence with depth beneath the stellar surface, as illustrated in Fig. 9. Since the convective envelope is fully mixed, the relevant timescale controlling the efficiency of diffusion is the value just below the convective envelope. In the solar case this is of order \(10^{11}\) years, resulting in a modest effect of diffusion over the solar lifetime. In somewhat more massive main-sequence stars, however, with thinner outer convection zones, the time scale is short compared with the evolution timescale; in the case illustrated for a \(2 \,M_\odot \) star, for example, it is around \(5 \times 10^6\) years. Thus settling has a dramatic effect on the surface abundance unless counteracted by other effects (Vauclair et al. 1974). This leads to a strong reduction in the helium abundance, likely eliminating instability due to helium driving in stars that might otherwise be expected to be pulsationally unstable (Turcotte et al. 2000). Also, differential radiative acceleration leads to a surface mixture of the heavy elements very different from the solar mixture, which is indeed observed in ‘chemically peculiar stars’, as already noted by Michaud (1970). Richer et al. (2000) pointed out that to match the observed abundances even in these cases compensating effects had to be included to reduce the effects of settling; they suggested either sub-surface turbulence, increasing the reservoir from which settling takes place, or mass loss bringing fresh material less affected by settling to the surface. An interesting analysis of these processes in controlling the observed abundances of Sirius was presented by Michaud et al. (2011). 
To obtain ‘normal’ composition in such stars, processes of this nature reducing the effects of settling are a fortiori required;Footnote 30 since most main-sequence stars somewhat more massive than the Sun rotate relatively rapidly, circulation or hydrodynamical instabilities induced by rotation are likely candidates (e.g., Zahn 1992, see also Sect. 7). Deal et al. (2020) investigated the combined effects of rotation and radiatively affected diffusion in main-sequence stars and found that this could account for the observed surface abundances for stars with masses below \(1.3\,M_\odot \). For more massive stars additional mixing processes appeared to be required. It should also be noted that such hydrodynamical models of the evolution of rotation are unable to account for the rotation observed in the solar interior (see Sect. 5.1.4). A complete model of the transport of composition and angular momentum in stellar interiors remains to be found.

Fig. 9

Image reproduced with permission from Aerts et al. (2010), copyright by Springer

Diffusion timescales for helium, defined by the term in \(V_i X_i\) in Eq. (6), for a model of the present Sun (dashed) and a zero-age main-sequence \(2 \,M_\odot \) model (continuous). The thinner red parts of the curves mark the fully mixed convection zones.

2.4 The near-surface layer

The treatment of the outermost layers of the model is complicated and affected by substantial physical uncertainties. In the atmosphere the diffusion approximation for radiative transport, implicit in Eq. (5), is no longer valid; here the full radiative-transfer equations need to be considered, including the details of the frequency dependence of absorption and emission. Such detailed stellar atmosphere models are available and can in principle be incorporated in the full solar model (e.g., Kurucz 1991, 1996; Gustafsson et al. 2008). However, additional complications arise from the effects of convection which induce motion in the atmosphere as well as strong lateral inhomogeneities in the thermal structure. Also, observations of the solar atmosphere strongly indicate the importance of non-radiative heating processes in the upper parts of the atmosphere, likely caused by acoustic or magnetic waves, or other forms of magnetic energy dissipation, for which no reliable models are available. The thermal structure just beneath the photosphere is strongly affected by the transition to convective energy transport, which determines the temperature gradient \(\nabla = \nabla _{\mathrm{conv}}\). Also, in this region convective velocities are a substantial fraction of the speed of sound, leading to significant momentum transport by convection described as a ‘turbulent pressure’, but most often ignored in the model calculations.

From the point of view of the global structure of the Sun, these near-surface problems are of lesser importance. In most of the convection zone the temperature gradient is very nearly adiabatic, \(\nabla \simeq \nabla _{\mathrm{ad}}\) (see also Fig. 12). Thus the structure is essentially determined by the (constant) value of the specific entropy \(s_{\mathrm{conv}}\); in other words, the variations of the thermodynamical quantities within this part of the convection zone lie on an adiabat. In fact, if the further approximation of a fully ionized ideal gas is made, such as is roughly valid except in the outer few per cent of the solar radius, \(\nabla _{\mathrm{ad}}\simeq 2/5\), \(\mathrm{d}\ln p / \mathrm{d}\ln \rho \simeq 5/3\), and the relation between pressure and density can be approximated by

$$\begin{aligned} p = K \rho ^\gamma , \end{aligned}$$
(27)

with \(\gamma = 5/3\). In this case, therefore, the properties of the convection zone are characterized by the adiabatic constant K. Such an approximation was generally used in early calculations of solar models (e.g., Schwarzschild et al. 1957). The structure of the convection zone determines its radial extent and hence affects the radius of the model. In the solar case the radius is known observationally with high precision; thus the adiabat of the adiabatic part of the convection zone [i.e., the value of K in the approximation in Eq. (27)] must be chosen such that the model has the observed radius. This is part of the calibration of solar models (see Sect. 2.6).
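The value \(\nabla _{\mathrm{ad}}\simeq 2/5\) quoted above follows directly from Eq. (27). As a brief sketch, assuming an ideal gas with constant mean molecular weight, so that \(p \propto \rho T\), and taking \(\nabla = \mathrm{d}\ln T/\mathrm{d}\ln p\), elimination of \(\rho \) gives

$$\begin{aligned} T \propto {p \over \rho } \propto p^{1 - 1/\gamma }, \qquad \nabla _{\mathrm{ad}}= {\mathrm{d}\ln T \over \mathrm{d}\ln p} = 1 - {1 \over \gamma } = {2 \over 5} \quad \mathrm{for} \quad \gamma = {5 \over 3}. \end{aligned}$$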

From this point of view the details of the treatment of the near-surface layers serve to determine \(s_{\mathrm{conv}}\) (or K). This is obtained from the specific entropy at the bottom of the atmosphere through the change in entropy resulting from integrating \(\nabla - \nabla _{\mathrm{ad}}\) over the significantly superadiabatic part of the convection zone. The treatment of convection typically involves parameters that can be adjusted to control the adiabat and hence the radius of the model; given such calibration to solar radius, the structure of the deeper parts of the model is largely insensitive to the details of the treatment of the atmosphere and the convective gradient (for an example, see Fig. 31 below).

I note that although the detailed modelling of the near-surface layers has only a modest effect on the internal properties of calibrated solar models, it has a substantial effect on the computed oscillation frequencies, which may affect the analysis of observed frequencies (see Sect. 5.1.1). Also, in computations of other stars no similar calibration based on the observed properties is generally possible. It is customary to apply solar-calibrated convection properties in these cases; although this is clearly not a priori justified, some support at least for only modest variations relative to the Sun over a substantial range of stellar parameters has been found from hydrodynamical simulations of near-surface convection (cf. Fig. 11).

Although the atmospheric structure can be implemented in terms of reasonably realistic models of the solar atmosphere, the usual procedure in modelling solar evolution is to base the atmospheric properties on a simple relation between temperature and optical depth \(\tau \), \(T = T(\tau )\); here \(\tau \) is defined by

$$\begin{aligned} {\mathrm{d}\tau \over \mathrm{d}r} = - \kappa \rho , \end{aligned}$$
(28)

with \(\tau = 0\) at the top of the atmosphere. This \(T(\tau )\) relation is often expressed in the form

$$\begin{aligned} T^4 = {3 \over 4} T_{\mathrm{eff}}^4 [\tau + q(\tau )], \end{aligned}$$
(29)

defining the (generalized) Hopf function q.Footnote 31 Given \(T(\tau )\), and the equation of state and opacity as functions of density and temperature, the atmospheric structure can be obtained by integrating the equation of hydrostatic support, which may be written as

$$\begin{aligned} {\mathrm{d}p \over \mathrm{d}\tau } = {g \over \kappa }, \end{aligned}$$
(30)

where the gravitational acceleration g can be taken to be constant, at least for main-sequence stars such as the Sun. This defines the photospheric pressure, e.g. at the point where \(T = T_{\mathrm{eff}}\), the effective temperature, and hence the outer boundary condition for the integration of the full equations of stellar structure.Footnote 32 The \(T(\tau )\) relation can be obtained from fitting to more detailed theoretical atmospheric models, as done, for example, by Morel et al. (1994), who used the Kurucz (1991) models. Alternatively, a fit to a semi-empirical model of the solar atmosphere can be used, such as the Krishna Swamy fit (Krishna Swamy 1966) or the Harvard-Smithsonian Reference Atmosphere (Gingerich et al. 1971). As an example, the Vernazza et al. (1981) Model C \(T(\tau )\) relation is shown in Fig. 10; here is also shown the result of using the following approximation for the Hopf function in Eq. (29):

$$\begin{aligned} q(\tau ) = 1.036 -0.3134 \exp (-2.448 \tau ) -0.2959 \exp (-30 \tau ). \end{aligned}$$
(31)

The approximation provides a reasonable fit to the observationally inferred temperature structure in that part of the atmosphere which dominates the determination of the photospheric pressure.
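As an illustration of how Eqs. (29) and (31) are used in practice, the following minimal sketch evaluates the temperature as a function of optical depth and locates the photospheric point where \(T = T_{\mathrm{eff}}\), i.e., where \(\tau + q(\tau ) = 4/3\). The function names, the bisection bracket and the value \(T_{\mathrm{eff}}= 5772\,\mathrm{K}\) are assumptions introduced here for illustration only.

```python
import math


def hopf_q(tau):
    """Approximate Hopf function of Eq. (31)."""
    return (1.036
            - 0.3134 * math.exp(-2.448 * tau)
            - 0.2959 * math.exp(-30.0 * tau))


def temperature(tau, t_eff):
    """Temperature from the T(tau) relation of Eq. (29)."""
    return t_eff * (0.75 * (tau + hopf_q(tau))) ** 0.25


def photospheric_tau(tol=1e-10):
    """Optical depth where T = T_eff, found by bisection on tau + q(tau) = 4/3.

    The bracket [0, 1] is an assumption; tau + q(tau) increases
    monotonically with tau, so the root in the bracket is unique.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + hopf_q(mid) < 4.0 / 3.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


T_EFF = 5772.0  # assumed solar effective temperature in K

tau_ph = photospheric_tau()
print(f"tau at T = T_eff: {tau_ph:.3f}")
print(f"T(tau = 0) / T_eff: {temperature(0.0, T_EFF) / T_EFF:.3f}")
```

With this fit the photospheric point lies at an optical depth somewhat below the value \(\tau = 2/3\) of the simple Eddington approximation, since \(q(\tau )\) here exceeds \(2/3\) in the relevant range.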

Fig. 10

Comparison of the temperature structure in Model C of Vernazza et al. (1981) (dashed curve), against monochromatic optical depth \(\tau \) at \(500 \,\mathrm{nm}\), and the fit given in Eq. (31) (solid curve). The red dot-dashed curve shows the \(T(\tau )\) relation, against Rosseland mean optical depth, obtained from matching a 3D hydrodynamical simulation (Trampedach et al. 2014b, see also Sect. 2.5)

\(T(\tau )\) relations based on a solar \(q(\tau )\) are often used for general modelling of stars, even though the atmospheric structure may have substantial variations with stellar properties. An interesting alternative is to determine \(q(\tau )\), as a function of stellar parameters, from averaged hydrodynamical simulations of the stellar near-surface layers (e.g. Trampedach et al. 2014b). An example based on a simulation for the present Sun is also shown in Fig. 10.

2.5 Treatment of convection

A detailed review of observational and theoretical aspects of solar convection was provided by Nordlund et al. (2009), while Rincon and Rieutord (2018) focused on the largest clearly observed scale of convection on the solar surface, the supergranulation. Further details, including the treatment of convection in a time-dependent environment such as a pulsating star, were reviewed by Houdek and Dupret (2015). As discussed below, extensive hydrodynamical simulations have been carried out of the near-surface convection in the Sun and other stars. However, direct inclusion of these simulations in stellar evolution calculations is impractical, owing to the computational expense; thus we must rely on simpler procedures. It is obviously preferable to have a physically motivated description of convection; as discussed above (see also Sect. 2.6), solar modelling requires one or more parameters which can be used to adjust the specific entropy in the adiabatic part of the convection zone and hence the radius of the model. In stellar modelling convection is typically treated by means of some variant of mixing-length model (e.g. Biermann 1932; Vitense 1953; Böhm-Vitense 1958); a more physically-based derivation of the description was provided by Gough (1977a, b), in terms of the linear growth and subsequent dissolution of unstable modes of convection. In the commonly used physical description of this prescriptionFootnote 33 (for further details, see Kippenhahn et al. 2012) convection is described by the motion of blobs over a distance \(\ell \), after which the blob is dissolved in the surroundings, giving up its excess heat. If the temperature difference between the blob and the surroundings is \(\varDelta T\) and the typical speed of the blob is \(v\), the convective flux is of order \(F_{\mathrm{con}}\sim v c_p \rho \varDelta T\), where \(c_p\) is the specific heat at constant pressure. 
Assuming, for simplicity, that the motion of the blob takes place adiabatically, \(\varDelta T \sim \ell T (\nabla - \nabla _{\mathrm{ad}}) /H_p\), where \(H_p = - (\mathrm{d}\ln p/\mathrm{d}r)^{-1}\) is the pressure scale height. Also, the speed of the element is determined by the work of the buoyancy force \(- \varDelta \rho g\) on the element, where \(\varDelta \rho \sim - \rho \varDelta T/T\) is the density difference between the blob and the surroundings, assuming the ideal gas law and pressure equilibrium between the blob and the surroundings. This gives \(\rho v^2 \sim - \ell g \varDelta \rho \sim \rho \ell ^2 g (\nabla - \nabla _{\mathrm{ad}})/H_p\). Thus we finally obtainFootnote 34

$$\begin{aligned} F_{\mathrm{con}}\sim \rho c_p T {\ell ^2 g^{1/2} \over H_p^{3/2}} (\nabla - \nabla _{\mathrm{ad}})^{3/2}. \end{aligned}$$
(32)

To this must be added the radiative flux

$$\begin{aligned} F_{\mathrm{rad}}= {4 a {\tilde{c}}T^4 \over 3 \kappa \rho } {\nabla \over H_p} \end{aligned}$$
(33)

(cf. Eq. 5); the total flux \(F = F_{\mathrm{con}}+ F_{\mathrm{rad}}\) must obviously match \(L/(4 \pi r^2)\), for equilibrium. This condition determines the temperature gradient in this description.
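The way in which flux balance fixes the temperature gradient can be sketched numerically. The example below is a toy version under stated assumptions: the order-of-magnitude relation in Eq. (32) is treated as an equality, the total flux is written as \(F = k \nabla _{\mathrm{rad}}\) with \(k = 4 a {\tilde{c}}T^4/(3 \kappa \rho H_p)\), thereby defining a radiative gradient \(\nabla _{\mathrm{rad}}\), and the ratio of the coefficient in Eq. (32) to \(k\) is collected into a single dimensionless `efficiency`. The condition \(F_{\mathrm{con}}+ F_{\mathrm{rad}}= F\) then becomes \(\nabla + \mathrm{efficiency} \times (\nabla - \nabla _{\mathrm{ad}})^{3/2} = \nabla _{\mathrm{rad}}\). The names `mlt_gradient` and `efficiency`, and the sample values, are hypothetical.

```python
def mlt_gradient(nabla_rad, nabla_ad, efficiency, tol=1e-12):
    """Solve nabla + efficiency * (nabla - nabla_ad)**1.5 = nabla_rad.

    Returns the temperature gradient nabla in a local mixing-length
    picture. If nabla_rad <= nabla_ad the layer is taken to be
    convectively stable and radiation carries the full flux.
    """
    if nabla_rad <= nabla_ad:
        return nabla_rad
    # The root lies between the adiabatic and the radiative gradient.
    lo, hi = nabla_ad, nabla_rad
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        residual = mid + efficiency * (mid - nabla_ad) ** 1.5 - nabla_rad
        if residual < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


NABLA_AD = 0.4  # fully ionized ideal gas

for eff in (0.0, 1.0, 1.0e8):
    nabla = mlt_gradient(1.0, NABLA_AD, eff)
    print(f"efficiency = {eff:8.1e}: nabla - nabla_ad = {nabla - NABLA_AD:.3e}")
```

In the limit of very efficient convection the superadiabatic gradient becomes very small, corresponding to the nearly adiabatic stratification in the bulk of the convection zone, while for vanishing efficiency the gradient reverts to the radiative value.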

This description obviously depends on the choice of \(\ell \); this is typically also regarded as a measure of the size of the convective elements. An almost universal, if not particularly strongly physically motivated, choice of \(\ell \) is to take it as a multiple of the pressure scale height,

$$\begin{aligned} \ell = \alpha _{\mathrm{ML}}H_p. \end{aligned}$$
(34)

From Eq. (32) it is obvious that \(F_{\mathrm{con}}\) then scales as \(\alpha _{\mathrm{ML}}^2\). Adjusting \(\alpha _{\mathrm{ML}}\) therefore modifies the convective efficacy and hence the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) required to transport the energy, thus fixing the specific entropy in the deeper parts of the convection zone. This in turn affects the structure of the convection zone, including its radial extent, and hence the radius of the star. As discussed in Sect. 2.6 the requirement that models of the present Sun have the correct radius is typically used to determine a value of \(\alpha _{\mathrm{ML}}\), which is then often used for the modelling of other stars.
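As a short check of this scaling, substituting Eq. (34) into Eq. (32), with the same order-of-magnitude meaning of \(\sim \), gives

$$\begin{aligned} F_{\mathrm{con}}\sim \rho c_p T \, \alpha _{\mathrm{ML}}^2 \, (g H_p)^{1/2} \, (\nabla - \nabla _{\mathrm{ad}})^{3/2}, \end{aligned}$$

so that, for a given convective flux and local thermodynamic state, the superadiabatic gradient varies as \(\nabla - \nabla _{\mathrm{ad}}\propto \alpha _{\mathrm{ML}}^{-4/3}\).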

In practice, further details are added. These involve a more complete thermodynamical description, the inclusion of factors of order unity in the relation for the average velocity and energy flux and expressions for the heat loss from the convective element. Although not of particular physical significance, the choice made for these aspects obviously affects the final expressions and must be taken into account in comparisons between different calculations, particularly when it comes to the value of \(\alpha _{\mathrm{ML}}\) required to calibrate the model. A detailed description of a commonly used formulation was provided by Böhm-Vitense (1958). It was pointed out by Gough and Weiss (1976) (see also Sect. 2.4) that solar models, with the appropriate calibration of the relevant convection parameters to obtain the proper radius, are largely insensitive to the details of the treatment of convection, although the specific values of \(\alpha _{\mathrm{ML}}\) may obviously differ. It is important to keep this in mind when comparing independent solar and stellar models. As an additional point I note that the preceding description is entirely local: it is assumed that \(F_{\mathrm{con}}\) is determined by conditions at a given point in the model, leading effectively to a relation of the form (9).

The motion of the convective elements also leads to transport of momentum which, when averaged, appears as a contribution to hydrostatic support in the form of a turbulent pressure of order

$$\begin{aligned} p_{\mathrm{t}}\sim \rho v^2 \sim {\rho \ell ^2 g \over H_p} (\nabla - \nabla _{\mathrm{ad}}) \; . \end{aligned}$$
(35)

Correspondingly, hydrostatic equilibrium, Eq. (1), is expressed in terms of \(p = p_{\mathrm{g}} + p_{\mathrm{t}}\), where \(p_{\mathrm{g}}\) is the thermodynamic pressure. On the other hand, the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) in Eqs. (32) and (35) is essentially a thermodynamic property and hence is determined by the gradient in \(p_{\mathrm{g}}\) or, if expressed in terms of \(p\) and \(p_{\mathrm{t}}\), the gradient of \(p_{\mathrm{t}}\). Consequently, including \(p_{\mathrm{t}}\) consistently in Eq. (1) increases the order of the system of differential equations within the convection zone, leading to severe numerical difficulties at the boundaries of the convection zone where the order changes (e.g., Stellingwerf 1976; Gough 1977b). A detailed analysis of the resulting singular points at the convection-zone boundaries was carried out by Gough (1977a). As a result, although the effect of the turbulent pressure on the hydrostatic structure has been included in some calculations based on a local treatment of convection (e.g., Henyey et al. 1965; Kosovichev 1995), \(\nabla - \nabla _{\mathrm{ad}}\) has generally been determined from the total pressure, thus avoiding the difficulties at the boundaries of the convection zone, but introducing some inconsistency (e.g., Baker and Gough 1979).

It is obvious that the local treatment of convection is an approximation, even in the simple physical picture employed here: a convective element senses conditions over a range of depths in the Sun during its motion; similarly, the convective flux at a given location must arise from an ensemble of convective elements originating at different depths. This indicates the need for a non-local description of convection, involving some averaging over the travel of a convective element and the elements contributing to the flux. Noting the similarity to the non-local nature of radiative transfer, Spiegel (1963) proposed an approximation to this averaging akin to the Eddington approximation, leading to a set of local differential equations, albeit of higher order, to describe the convective properties (see also Gough 1977a). This was implemented by Balmforth and Gough (1991) and Balmforth (1992) (Footnote 35). An advantage of the non-local formulation is that it bypasses the singularities caused by a consistent treatment of turbulent pressure in a local convection formulation; interestingly, Balmforth (1992) showed that the common inconsistent local treatment has a non-negligible effect on the properties of the model, compared with the local limit of the non-local treatment.

Alternative formulations for the convective properties have been developed on the basis of statistical descriptions of turbulence, thus including the full spectrum of convective eddies (e.g., Xiong 1977, 1989; Canuto and Mazzitelli 1991; Canuto et al. 1996) (for a more detailed discussion of such Reynolds stress models, see Houdek and Dupret 2015). Even so, the descriptions typically contain an adjustable parameter, most commonly related to a length scale, allowing the calibration of the surface radius of solar models.

A more physical description of convection is possible through numerical simulation (see Nordlund et al. 2009; Freytag et al. 2012). In practice this is restricted to fairly limited regions near the stellar surface, and even then requires simplified descriptions of the behaviour on scales smaller than the numerical grid (Footnote 36). Detailed modelling, including radiative effects in the stellar atmosphere, has been carried out by, for example, Stein and Nordlund (1989, 1998) and Wedemeyer et al. (2004). This also includes treatments of the equation of state and opacity which are consistent with global stellar models and hence immediately allow comparison with such models. Magic et al. (2013) and Trampedach et al. (2013) presented extensive grids of simulations for a range of stellar parameters, covering the main sequence and the lower part of the red-giant branch.

The simulations provide an alternative to the usual simplified stellar atmosphere models, which are assumed to be time independent and homogeneous in the horizontal direction. A very interesting aspect is that spectral line profiles calculated from the simulations and suitably averaged are in excellent agreement with observations, without the conventional ad hoc inclusion of additional line broadening through ‘microturbulence’ (e.g., Asplund et al. 2000). Also, the simulations provide a very good fit to the observed solar limb darkening, i.e., the variation across the solar disk of the intensity (Pereira et al. 2013).

The simulations of solar near-surface convection typically extend sufficiently deeply to cover that part of the convection zone where the temperature gradient is substantially superadiabatic (see Fig. 12). Thus they essentially define the specific entropy of the adiabatic part of the convection zone and hence fix the depth of the convection zone. Rosenthal et al. (1999) utilized this by extending an averaged simulation by means of a mixing-length envelope. Interestingly, they found that the resulting convection-zone depth was essentially consistent with the depth inferred from helioseismology (cf. Sect. 5.1.2), thus indicating that the simulation had successfully matched the actual solar adiabat.

As a generalization of these investigations, the simulations can be included in stellar modelling through grids of atmosphere models or suitable parameterization of simple formulations. A convenient procedure is to determine an effective mixing-length parameter \(\alpha _{\mathrm{ML}}(T_{\mathrm{eff}}, g)\) as a function of effective temperature and surface gravity, such as to reproduce the entropy of the adiabatic part of the convection zone (e.g., Ludwig et al. 1999, 2008; Trampedach et al. 1999, 2014a; Magic et al. 2015). It should be noted that since \(\alpha _{\mathrm{ML}}\) determines the entropy jump from the atmosphere to the interior of the convection zone, this calibration is intimately tied to the assumed atmospheric structure, e.g., specified by a \(T(\tau )\) relation also obtained from the simulations (Trampedach et al. 2014b). As an example, Fig. 11 shows the calibrated \(\alpha _{\mathrm{ML}}\) obtained by Trampedach et al. (2014a), as a function of \(T_{\mathrm{eff}}\) and \(\log g\). Interestingly, the variation of \(\alpha _{\mathrm{ML}}\) is modest in the central part of the diagram, along the evolution tracks of stars close to solar. Preliminary evolution calculations using these calibrations were carried out by Salaris and Cassisi (2015) and Mosumgaard et al. (2017, 2018). A similar analysis based on the calibration of the mixing-length parameter was carried out by Spada et al. (2018). As an alternative to using the fitted mixing length, Jørgensen et al. (2017) developed a method to include in stellar modelling the averaged structure of the near-surface layers obtained by interpolating in a grid of simulations. This was used by Jørgensen et al. (2018) to calculate a solar-evolution model incorporating such averaged structure in all models along the evolution track; similarly, Mosumgaard et al. (2020) calculated stellar evolution tracks for a range of masses, including the interpolated simulations along the evolution.

Fig. 11

Image reproduced with permission from Trampedach et al. (2014a), copyright by the authors

Mixing-length parameter \(\alpha _{\mathrm{ML}}\) obtained by fitting averaged 3D radiation-hydrodynamical simulations to stellar envelope models based on the Böhm-Vitense (1958) mixing-length treatment, shown using the colour scale, against effective temperature \(T_{\mathrm{eff}}\) (on a logarithmic scale) and \(\log g\). This is based on a fit to the simulations indicated by asterisks and the solar simulation shown with \(\odot \). Stellar evolution tracks, computed with the MESA code (Paxton et al. 2011), are shown for masses between 0.65 and \(4.5 \,M_\odot \), as indicated; the dashed segments mark pre-main-sequence evolution.

Apart from the calibration to match the solar radius (cf. Sect. 2.6) tests of the mixing-length parameter and its possible dependence on stellar properties can be carried out by comparing observations and models of red giants, whose effective temperature depends on the assumed \(\alpha _{\mathrm{ML}}\) (Salaris et al. 2002). A recent analysis was carried out by Tayar et al. (2017) based on APOGEE and Kepler observations, comparing with models computed with the YREC code (van Saders and Pinsonneault 2012). The model fits indicated a significant dependence on stellar metallicity, with \(\alpha _{\mathrm{ML}}\) increasing with increasing metallicity. Interestingly, calibrations based on 3D simulations (Magic et al. 2015) did not show this trend, nor did the results obtained by Tayar et al. match the values obtained by Trampedach et al. (2014a), shown in Fig. 11. However, it should be recalled that the effect of \(\alpha _{\mathrm{ML}}\) on stellar structure depends on other parameters in the mixing-length treatment, as well as on the assumed atmospheric structure and physics of the near-surface layers. Thus comparison of numerical values of \(\alpha _{\mathrm{ML}}\) or trends with, e.g., metallicity requires some care; the discrepancies may be caused by differences in other aspects of the modelling. In fact, in a detailed analysis Salaris et al. (2018), carefully taking into account the other uncertainties in the modelling of the near-surface layers, were unable to reproduce the results of Tayar et al. (2017); on the other hand, they did find some issues when \(\alpha \)-enhanced stars were included in the sample.

A comparison between different formulations of near-surface convection is provided in Fig. 12, in a format introduced by Gough and Weiss (1976). The complete solar models, corresponding to Model S of Christensen-Dalsgaard et al. (1996), have been calibrated to the same solar radius (cf. Sect. 2.6) through the adjustment of suitable parameters; this yields a depth of the convection zone which is essentially consistent with the helioseismically determined value. Evidently, regardless of the convection treatment the region of substantial superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) is confined to the near-surface layers, as would also be predicted from the simple analysis given above (cf. Eq. 32). Using the Canuto and Mazzitelli (1991) formulation leads to a rather higher and sharper peak in the superadiabatic gradient than for the Böhm-Vitense (1958) mixing-length formulation. On the other hand, it is striking that the detailed behaviour of the averaged superadiabatic gradient resulting from the Trampedach et al. (2013) simulation is in reasonable agreement with the results of the calibrated mixing-length treatment. As already noted, it also appears to lead to the correct adiabat in the deeper parts of the convection zone.

Fig. 12

(Adapted from Gough and Weiss 1976)

Properties of the solar convection zone. The lower abscissa is depth below the location where the temperature equals the effective temperature, whereas the upper abscissa is pressure p. The solid curve shows the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) in Model S of Christensen-Dalsgaard et al. (1996), using the Böhm-Vitense (1958) mixing-length treatment of convection, and the horizontal arrows indicate the extents of the hydrogen and helium ionization zones in this model. Also, the short-dashed curve shows \(\nabla - \nabla _{\mathrm{ad}}\) in a model corresponding to Model S, including calibration to the same surface radius, but using the Canuto and Mazzitelli (1991) treatment of convection, and the heavy long-dashed curve shows \(\nabla - \nabla _{\mathrm{ad}}\) in an average model resulting from hydrodynamical simulations of near-surface convection (Trampedach et al. 2013). The heavy dot-dashed line shows the mean superadiabatic gradient in a hydrodynamical simulation (Featherstone and Hindman 2016), excluding the outer parts of the convection zone; the initial increase in the most shallow part of the simulation is an artifact of the imposed boundary condition.

Physically realistic simulations of near-surface convection have been carried out extending over 96 Mm in the horizontal direction, thus for the first time also including the scale of supergranules, and to a depth of 20 Mm, around 10% of the convection zone (Stein et al. 2006, 2009; Nordlund and Stein 2009) (Footnote 37). Simulations have also been carried out which cover the bulk of the convection zone, but excluding the near-surface region: it is very difficult to include the very disparate range of temporal and spatial scales needed to cover the entire convection zone. Also, the microphysics of such simulations is typically somewhat simplified. On the other hand, the simulations take rotation into account, in an attempt to model the transport of angular momentum and hence the source of the surface differential rotation (cf. Eq. 11) and the variation of rotation within the convection zone (see also Sect. 5.1.4). A detailed review of these simulations was provided by Miesch (2005). As an example of their relation to global solar structure, Fig. 12 includes the average superadiabatic gradient from such a simulation, appropriately located relative to the global models. Apart from boundary effects the simulation is clearly in relatively good agreement with the simplified treatment, in particular confirming that this part of the model is very nearly isentropic.

An interesting issue was raised by Hanasoge et al. (2012) concerning the validity of the deeper convection simulations: based on local helioseismology (see Gizon and Birch 2005) using the time–distance technique, they obtained estimates of the convective velocity one or two orders of magnitude lower than obtained in the simulations, or indeed predicted from the simple estimate in Eq. (32). This was questioned by Greer et al. (2015), who used the ring-diagram technique and obtained results similar to those of the simulations. However, Hanasoge et al. (2020) showed, using a helioseismic technique based on coupling of mode eigenfunctions, that large-scale turbulence in the Sun is strongly suppressed compared with the results of global numerical simulations. Thus there is increasing observational evidence for possible limitations in our understanding of the dynamics of convection in the Sun, in particular at larger scales, where there is essentially no observational evidence for structured flows, unlike what is seen in global simulations of the solar convection zone (for a review, see Miesch 2005). A review of the helioseismic inferences of solar convection was provided by Hanasoge et al. (2016). Simulations by Cossette and Rast (2016) indicated that supergranules might be the largest coherent scales of convection, with energy transport in the deeper, essentially adiabatically stratified, parts of the convection zone being dominated by colder compact downflowing plumes. For a recent short review on solar convection, see Rast (2020).

2.6 Calibration of solar models

The Sun is unique amongst stars in that we have accurate determinations of its mass, radius and luminosity and an independent and relatively precise measure of its age from age determinations of meteorites (see Sect. 2.2). It is obvious that solar models should satisfy these constraints, as well as other observed properties of the Sun, particularly the present ratio between the abundances of heavy elements and hydrogen. Ideally, the constraints would provide tests of the models; in practice, the modelling includes a priori three unknown parameters which must be adjusted to match the observed properties: the initial hydrogen and heavy-element abundances \(X_0\) and \(Z_0\) and a parameter characterizing the efficacy of convection (see Sect. 2.5). This adjustment constitutes the calibration of solar models.

Some useful understanding of the sensitivity of the models to the parameters can be obtained from simple homology arguments (e.g., Kippenhahn et al. 2012). According to these, the luminosity approximately scales with mass and composition as

$$\begin{aligned} L \propto Z^{-1} (1 + X)^{-1} M^{5.5} \mu ^{7.5}, \end{aligned}$$
(36)

assuming Kramers opacity, with \(\kappa \propto Z (1 + X) \rho T^{-3.5}\), and with \(\mu \) given by Eq. (13). Obviously, the strong sensitivity to the average mean molecular weight means that relatively modest changes in the helium abundance can lead to the correct luminosity.
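The sensitivity implied by Eq. (36) can be illustrated numerically. The following is a minimal sketch, not part of any stellar evolution code: it assumes that Eq. (13) is the usual mean molecular weight of a fully ionized gas, \(\mu = 4/(3 + 5X - Z)\), and the function names and the composition values in the example are purely illustrative.

```python
def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas.

    Assumed form of Eq. (13): 1/mu = 2X + 3Y/4 + Z/2 with Y = 1 - X - Z.
    """
    return 4.0 / (3.0 + 5.0 * X - Z)


def homology_luminosity(X, Z, M=1.0):
    """Luminosity from the homology scaling of Eq. (36), arbitrary units."""
    mu = mean_molecular_weight(X, Z)
    return M**5.5 * mu**7.5 / (Z * (1.0 + X))


def luminosity_ratio(X_new, X_ref, Z=0.02, M=1.0):
    """Ratio L(X_new)/L(X_ref) at fixed Z and M."""
    return homology_luminosity(X_new, Z, M) / homology_luminosity(X_ref, Z, M)


if __name__ == "__main__":
    # Illustrative values: raise the helium abundance by 0.01 at fixed Z,
    # i.e. lower X from 0.70 to 0.69.
    ratio = luminosity_ratio(0.69, 0.70)
    print(f"L(X=0.69) / L(X=0.70) = {ratio:.4f}")
```

In this scaling a change of only 0.01 in the helium abundance shifts the luminosity by several per cent, which is why the initial composition is such an effective parameter for matching the solar luminosity.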

As discussed above, the efficacy of convection in the near-surface layers determines the specific entropy in the adiabatic part of the convection zone and hence the structure of the convection zone, thus controlling its extent and hence the radius of the model. (When the composition is fixed by obtaining the correct luminosity the extent of the radiative interior is largely determined.) With increasing efficacy the superadiabatic temperature gradient \(\nabla - \nabla _{\mathrm{ad}}\) required to transport the flux is decreased; hence the temperature in the convection zone is generally lower, the density (at given pressure) therefore higher, and the mass of the convection zone occupies a smaller volume, and hence a smaller extent in radius. Thus the radius of the model decreases with increasing efficacy. The actual reaction of the model is substantially more complex but leads to the same qualitative result.

As discussed in Sect. 2.5, the treatment of convection and hence the properties of the superadiabatic temperature gradient are typically obtained from the mixing-length treatment. According to Eqs. (32) and (34), assuming that \(F_{\mathrm{con}}\) carries most of the flux and is therefore essentially fixed, an increase in \(\alpha _{\mathrm{ML}}\) causes an increase in the convective efficacy and hence a decrease in \(\nabla - \nabla _{\mathrm{ad}}\), corresponding, according to the above argument, to a decrease in the model radius. Thus by adjusting \(\alpha _{\mathrm{ML}}\) a model with the correct radius can be obtained. In other simplified convection treatments, such as that of Canuto and Mazzitelli (1991), a similar efficiency parameter is typically introduced to allow radius calibration. When \(\alpha _{\mathrm{ML}}\) is obtained through fitting to 3D simulations (cf. Fig. 11) there is no a priori guarantee that this yields the value required to obtain the correct solar radius. In this case a correction factor can be applied to achieve the proper solar calibration (Mosumgaard et al. 2017). Of course, if the simulations provide a good representation of the outermost layers of the Sun, as already found to be the case by Rosenthal et al. (1999), this factor would be close to one, as has indeed been found in practice. The same correction factor is then applied when the fit to the 3D simulations is used for more general stellar modelling.
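The inverse relation between \(\alpha _{\mathrm{ML}}\) and the superadiabatic gradient can be made explicit with a rough scaling. This is only a sketch under stated assumptions: it takes the mixing-length flux in its standard efficient-convection form, \(F_{\mathrm{con}} \propto \ell ^2 (\nabla - \nabla _{\mathrm{ad}})^{3/2}\) with \(\ell = \alpha _{\mathrm{ML}} H_p\), which is consistent with the \(\alpha _{\mathrm{ML}}^2\) scaling of the flux and with the velocity estimate in Eq. (35), and it neglects heat loss from the convective elements. At fixed \(F_{\mathrm{con}}\) and fixed local thermodynamic state this gives

$$\begin{aligned} \nabla - \nabla _{\mathrm{ad}} \propto \alpha _{\mathrm{ML}}^{-4/3} , \end{aligned}$$

so that, for example, a 10% increase in \(\alpha _{\mathrm{ML}}\) lowers the superadiabatic gradient by roughly 12% in this limit. In the strongly superadiabatic near-surface layers, where convection is inefficient, the dependence differs in detail, but the sign of the effect is the same.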

The details of the calibration depend on whether or not diffusion and settling are included. If these effects are ignored the surface composition of the model hardly changes between the zero-age main sequence and the present age of the Sun. Although the present surface abundance \(X_{\mathrm{s}}\) of hydrogen is affected by the calibration of \(X_0\) the range of variation is typically so small that it can be ignored, and the (constant, in space and time) value of the heavy-element abundance, and hence \(Z_0\), is fixed from \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) and some suitable characteristic value of X. On the other hand, if diffusion and settling are included the change in the convection-zone composition must be taken into account and the value of \(Z_0\) must be adjusted to match properly \(Z_{\mathrm{s}}/X_{\mathrm{s}}\).

The formal calibration problem is then, when including diffusion and settling, to determine the set of parameters \(\{p_i\} = \{X_0, Z_0, \alpha _{\mathrm{ML}}\}\) to match the observables \(\{o_k\} = \{L_{\mathrm{s}}, Z_{\mathrm{s}}/X_{\mathrm{s}}, R\}\) to the solar values \(\{o_k^\odot \} = \{L_{\mathrm{s, \odot }}, (Z_{\mathrm{s}}/X_{\mathrm{s}})_\odot , R_{\odot }\}\). (Specifically, R is here taken to be the photospheric radius, defined at the point in the model where \(T = T_{\mathrm{eff}}\), the effective temperature.) This is greatly simplified by the fact that variations in the parameters generally are fairly limited. Thus in practice the corrections \(\{\delta p_i\}\) to the parameters can be found from the errors in the observables, using a fixed set of derivatives, as

$$\begin{aligned} \delta p_i = \sum _k (o_k^\odot -o_k) {\partial p_i \over \partial o_k}, \end{aligned}$$
(37)

where the derivatives \(\{\partial p_i / \partial o_k\}\) are obtained by varying the parameters in turn and inverting the resulting derivative matrix \(\{\partial o_k / \partial p_i\}\). I have found that the following values secure relatively rapid convergence of the iteration:

$$\begin{aligned} \begin{array}{ccc} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln L_{\mathrm{s}}} = 1.15 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln R} = -4.70 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = 0.148 \\ \displaystyle {\partial \ln X_0 \over \partial \ln L_{\mathrm{s}}} = -0.137 &{} \displaystyle {\partial \ln X_0 \over \partial \ln R} = -0.087 &{} \displaystyle {\partial \ln X_0 \over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = -0.132 \\ \displaystyle {\partial \ln Z_0 \over \partial \ln L_{\mathrm{s}}} = -0.111 &{} \displaystyle {\partial \ln Z_0 \over \partial \ln R} = 0.275 &{} \displaystyle {\partial \ln Z_0 \over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = 0.864. \end{array} \end{aligned}$$
(38)

These derivatives are incorporated in the ASTEC code (Christensen-Dalsgaard 2008) and allow efficient and automatic calculation of calibrated solar models. In the case where no iteration for \(Z_0\) is carried out the following values have been used:

$$\begin{aligned} \begin{array}{cc} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln L_{\mathrm{s}}} = 1.17 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln R} = -4.75 \\ \displaystyle {\partial \ln X_0 \over \partial \ln L_{\mathrm{s}}} = -0.154 &{} \displaystyle {\partial \ln X_0 \over \partial \ln R} = -0.045. \end{array} \end{aligned}$$
(39)

Convergence to a relative precision of \(10^{-7}\) is typically obtained in 5–7 iterations.
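A single correction step of this iteration can be sketched as follows. This is a hedged illustration, not the ASTEC implementation: it applies Eq. (37) in logarithmic form, which is an assumption suggested by the logarithmic derivatives tabulated in Eq. (38), and the function name, the dictionary layout and the numerical values in the example are hypothetical. Computing the model observables requires a full evolution calculation, which is not shown.

```python
import math

# Logarithmic derivatives d ln p_i / d ln o_k, copied from Eq. (38).
# Outer keys: parameters; inner keys: observables (L_s, R, Z_s/X_s).
DERIVATIVES = {
    "alpha_ML": {"L": 1.15, "R": -4.70, "ZX": 0.148},
    "X0": {"L": -0.137, "R": -0.087, "ZX": -0.132},
    "Z0": {"L": -0.111, "R": 0.275, "ZX": 0.864},
}


def calibration_step(params, model_obs, solar_obs):
    """Return corrected parameters after one step of Eq. (37), in log form.

    params    : current values of alpha_ML, X0, Z0
    model_obs : L, R, ZX of the current model at the solar age
    solar_obs : target solar values of L, R, ZX (same units as model_obs)
    """
    corrected = {}
    for name, value in params.items():
        # delta ln p_i = sum_k (ln o_k^sun - ln o_k) * d ln p_i / d ln o_k
        dlnp = sum(
            DERIVATIVES[name][k] * math.log(solar_obs[k] / model_obs[k])
            for k in solar_obs
        )
        corrected[name] = value * math.exp(dlnp)
    return corrected


if __name__ == "__main__":
    # Hypothetical trial model whose radius is 1% too large (solar units).
    params = {"alpha_ML": 1.80, "X0": 0.71, "Z0": 0.019}
    model = {"L": 1.0, "R": 1.01, "ZX": 0.0245}
    target = {"L": 1.0, "R": 1.0, "ZX": 0.0245}
    print(calibration_step(params, model, target))
```

In a full calibration this step would be repeated, recomputing the evolution sequence with the corrected parameters each time, until the observables match the solar values to the required precision. Note that a model radius that is too large leads to an increase in \(\alpha _{\mathrm{ML}}\), in line with the qualitative argument given above.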

3 The evolution of the Sun

To set the scene for this brief overview of solar evolution it is useful to recall the characteristic timescales of stars. Departure from hydrostatic equilibrium causes motion on a dynamical timescale, of order

$$\begin{aligned} t_{\mathrm{dyn}} = \left( {R^3 \over G M} \right) ^{1/2} \simeq 30 \, \mathrm{min} \left( {R \over \,R_\odot }\right) ^{3/2} \left( {M \over \,M_\odot }\right) ^{-1/2}. \end{aligned}$$
(40)

Evolution in phases where the energy is provided by release of gravitational energy happens on the Kelvin–Helmholtz timescale, of order

$$\begin{aligned} t_{\mathrm{KH}} = {G M^2 \over L R} \simeq 3 \times 10^7 \,\mathrm{year}\left( {M \over \,M_\odot } \right) ^2 \left( {R \over \,R_\odot }\right) ^{-1} \left( {L_{\mathrm{s}}\over \,L_\odot }\right) ^{-1}. \end{aligned}$$
(41)

As a result of the virial theorem (e.g., Kippenhahn et al. 2012) this is also the timescale for the cooling of the star as a result of loss of thermal energy. Finally, the timescale for nuclear burning on the main sequence can be estimated as

$$\begin{aligned} t_{\mathrm{nuc}} = {Q_{\mathrm{H}} q_{\mathrm{c}} X_0 M \over L} \simeq 10^{10} \,\mathrm{year}{M \over \,M_\odot } \left( {L_{\mathrm{s}}\over \,L_\odot }\right) ^{-1}, \end{aligned}$$
(42)

where \(Q_{\mathrm{H}}\) is the energy released per unit mass of consumed hydrogen and \(q_{\mathrm{c}} \simeq 0.1\) is the fraction of stellar mass that is involved in nuclear burning on the main sequence. Later stages of hydrogen burning typically involve smaller fractions of the mass and take place at higher luminosity and consequently have shorter duration; also, the burning of elements heavier than hydrogen releases far less energy per unit mass and the corresponding phases are therefore also relatively short.
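The three estimates in Eqs. (40)–(42) are easily evaluated for the Sun. The short sketch below does so in SI units; the physical constants are rounded standard values, while \(Q_{\mathrm{H}} \approx 6.3 \times 10^{14}\,\mathrm{J\,kg^{-1}}\) (roughly 26 MeV per four protons) and \(X_0 = 0.7\) are assumed illustrative inputs rather than values taken from the text.

```python
import math

G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30   # solar mass [kg]
R_SUN = 6.957e8    # solar radius [m]
L_SUN = 3.846e26   # solar luminosity [W]
YEAR = 3.156e7     # seconds per year

Q_H = 6.3e14       # assumed energy release per kg of hydrogen burnt [J/kg]
Q_C = 0.1          # mass fraction taking part in core burning (from the text)
X0 = 0.7           # assumed initial hydrogen abundance by mass


def t_dyn(M, R):
    """Dynamical timescale, Eq. (40), in seconds."""
    return math.sqrt(R**3 / (G * M))


def t_kh(M, R, L):
    """Kelvin-Helmholtz timescale, Eq. (41), in seconds."""
    return G * M**2 / (L * R)


def t_nuc(M, L, X=X0, q_c=Q_C, Q=Q_H):
    """Main-sequence nuclear timescale, Eq. (42), in seconds."""
    return Q * q_c * X * M / L


if __name__ == "__main__":
    print(f"t_dyn = {t_dyn(M_SUN, R_SUN) / 60:.1f} min")
    print(f"t_KH  = {t_kh(M_SUN, R_SUN, L_SUN) / YEAR:.2e} yr")
    print(f"t_nuc = {t_nuc(M_SUN, L_SUN) / YEAR:.2e} yr")
```

The results differ by many orders of magnitude, which is what justifies treating the Sun as being in hydrostatic and thermal equilibrium while following its slow nuclear evolution.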

3.1 Pre-main-sequence evolution

Stars, including the Sun, are born from the collapse of gas and dust in dense and cold molecular clouds. Brief reviews of star formation were provided by, for example, Lada and Shu (1990) and Stahler (1994); for an extensive review, see McKee and Ostriker (2007). The collapse is triggered by gravitational instabilities, likely through turbulence which may have been induced by supernova explosions (Padoan et al. 2016). Detailed simulations by Li et al. (2018) of star formation in externally driven turbulence successfully reproduced the common filamentary structure of interstellar clouds and the statistical properties of newly formed stellar systems. Evidence for the presence at the birth of the solar system of a nearby supernova, which may have contributed to the dynamics leading to the formation of the Sun, is provided by decay products of short-lived radioactive nuclides found in meteorites (e.g., Goswami and Vanhala 2000; Goodson et al. 2016), allowing a remarkably precise dating of different components of the early solar system (Connelly et al. 2012). Further diagnostics of the early history of the solar system are provided by the ratios of oxygen isotopes (Gounelle and Meibom 2007); in situ measurements of the solar wind by the Genesis spacecraft appear to have further complicated the picture (Gaidos et al. 2009). A detailed review of the environment of solar-system formation was given by Adams (2010).

The collapse of the cloud results in the formation of a core which subsequently accretes matter from the surrounding cloud; detailed simulations of these early phases of stellar evolution have been carried out by, for example, Baraffe et al. (2009). The angular momentum of the infalling material probably leads to the formation of a disk around the star while processes likely involving magnetic fields often result in outflow from the proto-star in highly collimated jets along the rotation axis (Shu et al. 2000), giving rise to the so-called Herbig–Haro objects (e.g., Reipurth and Bally 2001). The gravitational energy released in the contraction of the protostar partly goes to heating it up and is partly released as radiation from the star; the radiation finally stops the accretion and blows away the surrounding material, such that the star becomes directly observable: the star has reached the ‘birth line’.

In these early phases matter in the protostar is relatively cool, leading to a high opacity, and the luminosity is rather large. Consequently, models of the star in this phase are generally fully convective, evolving down the so-called Hayashi line (Hayashi and Hoshi 1961) with contraction at roughly constant effective temperature, and material in the star is fully mixed. In this phase the temperature in the core reaches a point where deuterium burning can take place, but since the initial deuterium content is tiny (around \(1.6 \times 10^{-5}\) of the hydrogen abundance), the energy release has little effect on the evolution. With further contraction the temperature in the central parts of the star becomes so high that convection ceases and the star develops a gradually growing central radiative region. In this initial contraction, where energy for the luminosity and the heating of stellar material is provided by release of gravitational energy, evolution takes place on the Kelvin–Helmholtz timescale (cf. Eq. 41), along the so-called Henyey line (Henyey et al. 1955) at increasing effective temperature and luminosity. With the onset of substantial nuclear energy release, readjustments of the structure of the star lead to a reduction in luminosity, and the star settles on the zero-age main sequence (ZAMS). These early evolutionary phases are illustrated in Fig. 13. An extensive description of star formation, although possibly not completely up to date, was given by Stahler and Palla (2004).

Fig. 13

(Adapted from Aerts et al. (2010); data courtesy of A. Miglio.)

Pre-main-sequence evolution of stars with masses between 0.8 and \(2 \,M_\odot \), as indicated, computed with the Liège stellar evolution code CLÉS (Scuflaire et al. 2008). The composition is \(X = 0.7\), \(Z = 0.02\). The crosses mark the age along the tracks, in steps of 1 Myr; the ages at the end of the tracks range from 87 Myr at \(0.8 \,M_\odot \) to 32 Myr at \(1.4 \,M_\odot \). The heavy dotted line is a sketch of the so-called birth line, as shown by Palla and Stahler (1993), where the star emerges in visible light from the material left over from its formation.

Interestingly, this somewhat simplistic picture has been questioned by more detailed modelling of the contraction phase, starting from the initial collapsing cloud. Wuchterl and Klessen (2001) and Wuchterl and Tscharnuter (2003) solved the spherically symmetric equations of radiation hydrodynamics, starting from a suitable isothermal model of the original cloud and following the formation of an optically thick protostellar core and the accretion of further matter on this core. They found that deuterium burning takes place during the accretion phase and that the model retains a substantial radiative core throughout the evolution; the later phases of the contraction are parallel to the fully convective Hayashi track, but at somewhat higher effective temperature. These calculations were criticized by Baraffe and Chabrier (2010) on the grounds of the assumed spherical symmetry of the infall. However, by considering episodic infall Baraffe et al. also found models with an early radiative core. Detailed 3D modelling of collapsing molecular clouds, coupled with spherically symmetric modelling of the resulting proto-stellar and pre-main-sequence evolution (Kuffmeier et al. 2018; Jensen and Haugbølle 2018) has confirmed the episodic nature of the accretion. Also, interestingly, the results provide a plausible explanation for the observed properties of young stellar clusters.

As discussed in Sect. 7.1, the detailed pre-main-sequence evolution could have important consequences for the interpretation of the present solar surface composition. Given the importance of rotation and disk formation, departures from spherical symmetry in the evolution of the star should clearly be taken into account in the modelling.

At the end of pre-main-sequence evolution, the temperature reaches a level where the full set of reactions in the PP chains (see Eqs. 24 and 25) sets in, supplying the energy lost from the stellar surface. At this point the contraction stops and the star enters its main-sequence evolution, with a balance between the nuclear energy generation and the energy loss from the surface, and hence taking place on a nuclear timescale.

It is likely that the early contraction, and the accretion of matter in the disk, leads to an initial rapid rotation of the star. In fact, it is observed that young stars generally rotate much more rapidly than the present Sun. However, in young open clusters where the stars may be assumed to share the same age substantial scatter in the rotation rates is found (e.g., Soderblom et al. 2001). This is a strong indication of the complex processes controlling the evolution of angular momentum in the initial phases of proto-stellar evolution, involving interactions between the star, the accreting disk and the outflows, likely of magnetic origin (Shu et al. 1994; Bodenheimer 1995), including magnetic locking between the outer layers of the star and the inner parts of a truncated accretion disk.

Disks are commonly observed around protostars, confirming also this part of the description (e.g., Greaves 2005; Williams and Cieza 2011). The ubiquitous presence of planetary systems around other stars (Batalha 2014; Winn and Fabrycky 2015) strongly suggests that the formation of planets in such protoplanetary disks is a common phenomenon. This likely takes place through the formation and subsequent coalescence of dust grains into objects of increasing size, and finally the formation of a planetary system (Lissauer 1993; Alibert et al. 2005; Montmerle et al. 2006; Johansen and Lambrechts 2017). Detailed discussions of the properties of such disks and the formation of planets were provided by Armitage (2011, 2017). Dramatic illustrations of these planet-forming processes have been obtained with the Atacama Large Millimeter/submillimeter Array (ALMA) high-resolution observations (e.g., ALMA Partnership et al. 2015; Isella et al. 2016; Harsono et al. 2018). An example is illustrated in Fig. 14; modelling by Dipierro et al. (2015) showed that the observed gaps are indeed consistent with the presence of newly formed planets. The planet-forming processes probably happen on a timescale comparable with, or shorter than, the gravitational contraction of the star. Thus the ages of meteorites as determined from radioactive dating likely provide good measures of the age of the Sun since it arrived on the main sequence.

Fig. 14

Image reproduced with permission from ALMA Partnership et al. (2015), copyright by AAS

ALMA observations, at a wavelength of 1 mm, of the planet-forming disk around the young star HL Tau. The lower-left inset shows the resolution.

3.2 Main-sequence evolution

The evolution after the arrival on the main sequence, past the present age of the Sun, is illustrated in Fig. 15. This is based on a model corresponding to Model S of Christensen-Dalsgaard et al. (1996), discussed in more detail in Sect. 4.1. Additional information about the variation with time of key quantities, normalized to values for the present Sun, is provided in Fig. 16. The evolution is obviously driven by the gradual conversion of hydrogen into helium in the core, leading to an increase in the mean molecular weight of matter in the core. This leads to a contraction of the core, an increase in the central density and temperature and, in accordance with Eq. (36), to an increase in the luminosity. This evolution may be understood in simple terms by noting, from Eq. (12), that the increase in \(\mu \) would cause a decrease in pressure inconsistent with hydrostatic balance, unless compensated for by an increase in \(\rho \) and T resulting from the contraction of the core. The increase in temperature, although partly counteracted by the decrease in X, leads to an increase in the energy-generation rate and, more importantly, to an increase in the radiative conductivity, and hence to the increase in the luminosity. Thus this effect is basic to the main-sequence evolution of stars; unless non-standard effects (such as mass loss; see Sect. 6.5) are relevant there is hardly any doubt that the solar luminosity has undergone a fairly substantial increase since the formation of the solar system. A detailed analysis of this behaviour, in terms of homology scaling, was provided by Gough (1990b).

Fig. 15

Evolution track in the Hertzsprung–Russell diagram of a model sequence passing through Model S of the present Sun (Christensen-Dalsgaard et al. 1996, see also Sect. 4). Diamonds mark models separated by \(1 \,\mathrm{Gyr}\) in age, and after an age of \(10 \,\mathrm{Gyr}\) plus symbols are at intervals of \(0.1 \,\mathrm{Gyr}\). The Sun symbol (\(\odot \)) indicates the location of the present Sun and the star shows the point where hydrogen has been exhausted at the centre

Such a change in the solar energy reaching the Earth might be expected to have climatic effects; in fact, a naive estimate based on black-body radiative balance indicates that the change of 30% in solar luminosity shown in Fig. 16 would cause a change of around 7% in the surface temperature of the Earth, i.e., around 20 K. Thus one might expect that the Earth was very substantially colder early in its history. In fact, Schwarzschild et al. (1957) already noted that, since in their calculations the solar luminosity two billion years ago was about 20% less than now, “[t]he average temperature on the earth’s surface must then have been just about at the freezing point of water, if we assume that it changes proportionally to the fourth root of the solar luminosity. Would such a low average temperature have been too cool for the algae known to have lived at that time?” In contrast to these models, the terrestrial surface temperature shows no indication of dramatic changes over the past 4 Gyr, with evidence for liquid water in even very old geological material (Mojzsis et al. 2001; Wilde et al. 2001; Rosing and Frei 2004). This problem has been dubbed ‘the faint early Sun problem’ (see also Güdel 2007), and has led to speculations about errors in our understanding of stellar evolution. It seems more likely, however, that the problem lies in the simplistic climate models used for these estimates of the temperature of the early Earth (e.g., Sagan and Mullen 1972). With a substantially stronger early greenhouse effect, perhaps caused by a higher content of \(\mathrm{CO_2}\), the present temperature could have been reached with a lower energy input. Modelling of the early terrestrial atmosphere by von Paris et al. (2008) suggested that the required abundances of greenhouse gases may be consistent with geological evidence. This was questioned by Rosing et al. (2010) who suggested that the dominant effect was a reduced cloud cover and hence lower terrestrial mean albedo than at present, resulting in a fainter Sun providing sufficient heating to achieve the required surface temperature on Earth. Shaviv (2003) and Svensmark (2006) noted that modulation of galactic cosmic rays by an initially stronger solar wind could have contributed to the warming of the early Earth, by similarly reducing the cloud cover. Variations with time of solar activity and their possible effects on planetary atmospheres were also discussed by Güdel (2007). There remains the problem of explaining the apparent stability of Earth’s temperature despite the variation in solar luminosity. Various feedback mechanisms of a geological nature have been proposed that may account for this (e.g., Walker et al. 1981), involving climate-dependent weathering of rocks and \(\mathrm{CO_2}\) outgassing from volcanoes; a detailed review of these processes was provided by Kump et al. (2000).Footnote 38 A comprehensive review of the ‘faint early Sun problem’ was provided by Feulner (2012).
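The naive black-body estimate quoted above can be reproduced in a few lines. The following Python sketch assumes fixed albedo and orbit, so that the surface temperature scales as the fourth root of the luminosity; the present-day mean surface temperature of 288 K is an illustrative assumption that is not taken from the text, and the 30% change is read here as a luminosity ratio of 1.3.

```python
def equilibrium_temperature_ratio(luminosity_ratio: float) -> float:
    """Black-body balance at fixed albedo and orbit: T scales as L**(1/4)."""
    return luminosity_ratio ** 0.25


# Assumed illustrative value (not from the text): present mean surface temperature.
T_NOW_K = 288.0

# Fractional temperature change for a 30% change in luminosity (cf. Fig. 16).
rise = equilibrium_temperature_ratio(1.30) - 1.0

# Temperature for a luminosity 20% below the present value,
# the case considered by Schwarzschild et al. (1957).
t_past = T_NOW_K * equilibrium_temperature_ratio(0.80)

print(f"fractional temperature change for a 30% change in L: {rise:.3f}")
print(f"corresponding change at {T_NOW_K:.0f} K: {rise * T_NOW_K:.1f} K")
print(f"temperature at 0.8 L_now: {t_past:.1f} K")
```

Under these assumptions the fractional change is close to 7%, or roughly 20 K, and the temperature for a 20% fainter Sun falls near the freezing point of water, in line with the quoted remark. The estimate ignores greenhouse and albedo feedbacks, which is precisely the simplification questioned in the discussion above.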

Fig. 16

Variation with age of quantities, normalized to the value at the present age of the Sun, in a \(1 \,M_\odot \) evolution sequence, including Model S of the present Sun (see Sect. 4.1). The top panel shows the evolution up to just after the present age, whereas the bottom panel continues the evolution beyond the exhaustion of hydrogen at the centre. Line styles and colours are indicated in the figure. R and \(L_{\mathrm{s}}\) are photospheric radius and surface luminosity, \(d_{\mathrm{cz}}\) is the depth of the convective envelope, in units of the surface radius, and \(T_{\mathrm{c}}\), \(X_{\mathrm{c}}\), \(\rho _{\mathrm{c}}\), \(\epsilon _{\mathrm{c}}\) and \(\kappa _{\mathrm{c}}\) are central temperature, hydrogen abundance, density, energy-generation rate and opacity. Values in the present Sun for most of the quantities are given in Table 2; in addition, \(\epsilon _{\mathrm{c}} = 17.06\,\,\mathrm{erg}\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-1}\) and \(\kappa _{\mathrm{c}} = 1.242\,\,\mathrm{cm}^2\,\,\mathrm{g}^{-1}\). At the end of the illustrated part of the evolution, the ratio \(\rho _{\mathrm{c}}/\rho _{\mathrm{c,\odot }}\) is around 340, corresponding to a central density of \(5.3 \times 10^4\,\,\mathrm{g}\,\,\mathrm{cm}^{-3}\)

Beyond the present Sun the increase in luminosity continues, as is evident from Figs. 15 and 16. Also, the radius increases monotonically during the central hydrogen burning. The evolution of the hydrogen-abundance profile is illustrated in Fig. 17. The nuclear reactions cause a gradual reduction of the hydrogen in the core, whereas helium settling, although fairly weak in the Sun, gives rise to an increase in the hydrogen abundance in the convection zone and the formation of a fairly sharp composition gradient at its base. When hydrogen is exhausted at the centre there is a gradual transition to hydrogen burning in a shell around a core consisting predominantly of helium; the core gradually grows in mass and contracts, leading to high central densities and a substantial degree of degeneracy, while the hydrogen-burning shell becomes quite thin. This enhances the increase in the stellar radius: for reasons that are not entirely understood (see, however, Faulkner 2004) the contraction of the core inside a burning shell leads to expansion of the region outside the shell. The resulting strong expansion of the stellar surface radius leads to a decrease of the effective temperature and strong increase in the depth of the convective envelope. The evolution initially takes place at nearly constant luminosity, on the so-called subgiant branch. Eventually, the star reaches a structure that, in terms of distance to the centre, is nearly fully convective, apart from a radiative core of very small radial extent; as a result, the star evolves towards higher luminosity with the increase in radius, parallel and close to the Hayashi track. At the final point illustrated in Fig. 16 the convective envelope extends over 68% of the mass, and 79% of the radius, of the model. As shown in Fig. 17 the resulting mixing with layers previously enriched in helium by settling leads to a reduction in the surface hydrogen abundance.

Fig. 17

Hydrogen abundance X against fractional mass m/M for a zero-age main-sequence model (dotted line), a model of age \(4.6 \,\mathrm{Gyr}\) (present Sun; solid line), a model of age \(9.5 \,\mathrm{Gyr}\), where hydrogen has just been exhausted at the centre (dashed line) and the model of age \(11.5 \,\mathrm{Gyr}\), the final model included in Fig. 15 (dot-dashed line). In the latter model the radiative core containing 32% of the stellar mass occupies only 21% of the stellar radius. The evolution sequence corresponds to Model S of Christensen-Dalsgaard et al. (1996, see Sect. 4.1)

For stars of slightly above solar mass and below there is a systematic decrease in the rotation rate with increasing age as the stars evolve on the main sequence (for stars of solar mass, see Skumanich 1972; Barnes 2003); this is assumed to result from the loss of angular momentum in a magnetized stellar wind (e.g., Kawaler 1998; Matt et al. 2015), presumably related to the generation of magnetic activity through dynamo action, as inferred in the Sun (for a review, see Charbonneau 2010). Regardless of the substantial spread in early rotation rates, these processes tend to lead to a well-defined rotation rate as a function of age and mass, after an initial converging phase (e.g., Gallet and Bouvier 2013). This forms the basis for gyrochronology, i.e., the determination of ages of stars based on their rotation periods (e.g., Barnes 2010; Epstein and Pinsonneault 2014). The details of these processes, and of the subsequent redistribution of angular momentum in the stellar interior, are highly uncertain, however (Charbonneau and MacGregor 1993; Gough and McIntyre 1998; Talon and Charbonnel 2003; Charbonnel and Talon 2005; Eggenberger et al. 2005). In the solar case the result of the angular-momentum loss and redistribution, as determined from helioseismology, is a nearly spatially unvarying rotation in the radiative interior, at a rate slightly below the equatorial surface rotation rate. These results, and their theoretical interpretation, are discussed in Sect. 5.1.4 in the light of helioseismic inferences of solar internal rotation. Interestingly, by combining asteroseismic determinations of stellar ages (cf. Sect. 7.2) with determinations of stellar rotation rates, van Saders et al. (2016) found that the steady decrease of rotation rate with increasing age slows down for stars older than a few Gyr, indicating a weakening of the magnetic braking. This would complicate the use of gyrochronology for age determination of stars older than the Sun.
However, I note that Barnes et al. (2016) questioned the analysis by van Saders et al. (2016).Footnote 39 Also, Lorenzo-Oliveira et al. (2020) inferred a rotation rate matching the expectations for normal spin-down for the solar twin HIP 102152, with an age of 8 Gyr inferred from isochrone fitting; however, there may be some question about the precision of the age and the modelling of the spin-down (van Saders, private communication). Thus further work is clearly required to define the limits of applicability of gyrochronology.
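To make the idea of gyrochronology concrete, the following Python sketch inverts a simple spin-down law for the age. It assumes the square-root relation between rotation period and age usually associated with Skumanich (1972), which is not spelled out in the text, calibrated on the Sun; the solar rotation period of 25.4 days is an illustrative assumption, and the function names are hypothetical. Actual gyrochronology relations also depend on mass or colour, and the weakened braking discussed above would invalidate this scaling for old stars.

```python
SUN_AGE_GYR = 4.6        # age adopted for the present Sun in Model S
SUN_PERIOD_DAYS = 25.4   # assumed illustrative solar rotation period


def skumanich_period(age_gyr: float) -> float:
    """Rotation period (days) for a solar-mass star, assuming P scales as t**(1/2)."""
    return SUN_PERIOD_DAYS * (age_gyr / SUN_AGE_GYR) ** 0.5


def gyro_age(period_days: float) -> float:
    """Age (Gyr) obtained by inverting the assumed square-root spin-down law."""
    return SUN_AGE_GYR * (period_days / SUN_PERIOD_DAYS) ** 2


if __name__ == "__main__":
    for age in (1.0, 2.0, 4.6, 8.0):
        period = skumanich_period(age)
        print(f"age {age:4.1f} Gyr -> period {period:5.1f} d -> "
              f"recovered age {gyro_age(period):4.1f} Gyr")
```

Because the age scales as the square of the period in this sketch, a given relative error in the measured period translates into twice that relative error in the inferred age.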

3.3 Late evolutionary stages

The later evolution of stars of solar mass is discussed in detail by Kippenhahn et al. (2012). The specific case of the Sun was considered by, for example, Jørgensen (1991) and Sackmann et al. (1993). With continuing core contraction and expansion of the envelope the star moves up along the Hayashi track as a red giant, reaching a luminosity of more than \(2000 \,L_\odot \) (for a review of red-giant evolution, see Salaris et al. 2002); needless to say, this is incompatible with life on Earth. The helium core heats up, partly as a result of the contraction and partly through heating from the hydrogen-burning shell whose temperature is forced to increase to match the energy required by the increasing luminosity. When the core reaches a temperature of around \(80 \times 10^6 \,\mathrm{K}\) helium burning starts, in the triple-alpha reaction producing \({}^{12}\mathrm{C}\). Since the core is strongly degenerate the pressure is essentially independent of temperature; thus the heating associated with helium ignition initially has no effect on the pressure and the burning takes place in a run-away process, a helium flash, where the core luminosity exceeds \(10^{10} \,L_\odot \) for several hours. However, the energy released is absorbed as gravitational energy in expanding the inner parts of the star; together with a decrease in the energy production from the hydrogen shell-burning, this results in a drop of the surface luminosity. Detailed calculations of the complex evolution through this phase have been carried out by, for example, Schlattl et al. (2001) and Cassisi et al. (2003b), and are also possible in the general-purpose MESA stellar evolution code (Paxton et al. 2011). Hydrodynamical simulations in two and three dimensions of the evolution during the flash were made by Mocák et al. (2008, 2009), confirming the importance of core convection in carrying away the energy generated during the flash. 
Only when degeneracy is lifted by the increase in temperature and decrease in density does the core expand and nuclear burning stabilize in a phase of quiet core helium burning; in addition to the triple-alpha reaction, \({}^{16}\mathrm{O}\) is produced from \({}^4\mathrm{He}+{}^{12}\mathrm{C}\). When helium is exhausted in the core the star again ascends along the Hayashi track, on the asymptotic giant branch. Here the star enters the so-called thermally pulsing phase where helium repeatedly ignites in helium flashes in a shell around the degenerate carbon-oxygen core, after which evolution settles down again over a timescale of a few thousand years (e.g., Herwig 2005). Finally, the star sheds its envelope through rapid mass loss (e.g., Willson 2000; Miller Bertolami 2016), leaving behind a hot and compact core consisting predominantly of carbon and oxygen. The Sun is expected to reach this point in its evolution at an age of around 12.4 Gyr, 7.8 Gyr from now. The ejected material may shine due to the excitation from the ultraviolet light emitted by the core, as a planetary nebula which quickly disperses, with a typical lifetime of order 10,000 years (e.g., Gesicki et al. 2018). The core contracts and cools over a very extended period as a white dwarf, from its initial surface temperature of more than \(10^5 \,\mathrm{K}\), reaching a surface temperature of \(4000 \,\mathrm{K}\) only after a further 10 Gyr.

The details of this evolution are still somewhat uncertain, depending in particular on the extent of mass loss in the red-giant phases, and on exotic processes that may cool the core and delay helium ignition. An uncertain issue of some practical importance is whether the solar radius at any point reaches a size such as to engulf the Earth, taking into account also the possible increase in the size of the Earth’s orbit resulting from mass loss from the Sun; this depends in part on the variation of the radius during the final thermal pulses. In a detailed analysis of the evolutionary scenarios, Rybicki and Denis (2001) concluded that ‘it seems probable that the Earth will be evaporated inside the Sun’. This was confirmed by more recent calculations by Schröder and Smith (2008), taking into account tidal interactions between the planet and the expanding Sun and dynamical drag in the solar atmosphere, as well as the compensating effects of solar mass loss and their influence on the orbit of the planet. According to their results, planets with a present distance from the Sun of less than around 1.15 AU would be engulfed when the Sun reaches the tip of the red-giant branch.

It is obvious that the continued increase of solar luminosity, even on the main sequence, will have had catastrophic climatic consequences long before this point is reached. Already Lovelock and Whitfield (1982) noted that the increase over only 150 million years would be larger than could be compensated for by a decreasing greenhouse effect caused by a decrease in the atmospheric \(\mathrm{CO_2}\) content, to the minimum level required for photosynthesis. In an interesting, if somewhat speculative, analysis Korycansky et al. (2001) pointed out the possibility of compensating for the increase in solar luminosity by increasing the size of the Earth’s orbit through engineering repeated, although infrequent, carefully controlled encounters with a substantial asteroid. It seems unlikely, however, that such a change could be rapid enough to negate the effect of the increase of the solar luminosity on the red-giant branch. Furthermore, it is hardly necessary to point out that the Earth may face more imminent threats to the climate as a result of anthropogenic effects on the composition of the atmosphere (e.g., Crowley 2000; Solomon et al. 2009; Cubasch et al. 2013).

4 ‘Standard’ solar models

As discussed in Sect. 2.1, the concept of ‘standard solar model’ has evolved greatly over the years; the term goes back at least to Bahcall et al. (1969) who introduced it in connection with calculations of the solar neutrino flux. It may now be taken to be a spherically symmetric model, including a relatively simple treatment of diffusion and gravitational settling, up-to-date equation of state, opacity and nuclear reactions, and a simple treatment of near-surface convection. Other potential hydrodynamical effects, including mixing processes in the radiative interior and the effects of rotation and its evolution, are ignored. The evolution of the concept can be followed in several sets of solar evolution calculations, often motivated by the solar neutrino problem (see Sect. 5.2) and, more recently, by the availability of detailed helioseismic constraints (see Sect. 5.1). An impressive example is provided by the efforts of John Bahcall over an extended period. As reviewed by Bahcall (1989), early models did not include diffusion (e.g., Bahcall and Shaviv 1968; Bahcall and Ulrich 1988). Bahcall and Pinsonneault (1992b) included diffusion of helium, whereas later models (e.g., Bahcall and Pinsonneault 1995; Bahcall et al. 2006) included diffusion of both helium and heavier elements. Other examples of standard model computations are Turck-Chièze et al. (1988), Cox et al. (1989), Guenther et al. (1992), Berthomieu et al. (1993), Turck-Chièze and Lopes (1993), Gabriel (1994, 1997), Chaboyer et al. (1995), Guenther et al. (1996), Richard et al. (1996), Schlattl et al. (1997), Brun et al. (1998), Elliott (1998), Morel et al. (1999), Neuforge-Verheecke et al. (2001a) and Serenelli et al. (2011). A recent comprehensive recomputation of solar models was carried out by Vinyoles et al. (2017), discussed in more detail in Sect. 6.4. A brief review of standard solar modelling was provided by Serenelli (2016).

As representative of standard models I here consider the so-called Model S of Christensen-Dalsgaard et al. (1996); details on the model calculation were provided by Christensen-Dalsgaard (2008). Although more than two decades old, and to some extent based on outdated physics, it is still seeing substantial use for a variety of applications, including as a reference for helioseismic inversions. Thus it provides a useful reference for discussing the effects of various updates to the model physics. Remarkably, as discussed in Sects. 5.1.2 and 5.2, such simple models are in reasonable agreement with observations of solar oscillations and neutrinos.

4.1 Model S

Model S was computed with the OPAL equation of state (Rogers et al. 1996) and the 1992 version of the OPAL opacities (Rogers and Iglesias 1992), with low-temperature opacities from Kurucz (1991).Footnote 40 Nuclear reaction parameters were generally obtained from Bahcall and Pinsonneault (1995), and electron screening was treated in the weak-screening approximation of Salpeter (1954). The computation was started from a static and chemically homogeneous zero-age main-sequence model, and the age of the present Sun, from that state, was assumed to be \(4.6 \,\mathrm{Gyr}\). The time evolution of the \({}^3\mathrm{He}\) abundance was followed, while the other reactions in the PP chains were assumed to be in nuclear equilibrium; to represent the pre-main-sequence evolution the initial \({}^3\mathrm{He}\) abundance was assumed to correspond to the evolution of the abundance at constant conditions for a period of \(5 \times 10^7 \,\mathrm{year}\), starting at zero abundance (see Christensen-Dalsgaard et al. 1974). Similarly, the CN part of the CNO cycle (cf. Eq. 26) was assumed to have reached nuclear equilibrium in the pre-main-sequence phase while the conversion of \({}^{16}\mathrm{O}\) into \({}^{14}\mathrm{N}\) was followed. The diffusion and settling of helium and heavy elements were computed in the approximation of Michaud and Proffitt (1993); the evolution of Z was computed neglecting the effect of nuclear reactions and representing \(D_i\) and \(V_i\) by the behaviour of fully ionized \({}^{16}\mathrm{O}\). Convection was treated in the Böhm-Vitense (1958) formalism. The atmospheric structure was computed using the VAL \(T(\tau )\) relation given by Eq. (31) and illustrated in Fig. 10. 
The initial composition was calibrated to obtain a present \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0245\) (Grevesse and Noels 1993), while the surface luminosity and radius were set to \(3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) and \(6.9599 \times 10^{10} \,\mathrm{cm}\), respectively, to an accuracy of better than \(10^{-6}\) (see Sect. 2.2).

Some basic quantities of the model of the present Sun are given in Table 2 below, together with properties of other similar models, discussed in detail in Sect. 4.2. Also, Fig. 18 shows the variation of X and Z through the model. It is striking that the settling of helium and heavy elements causes sharp gradients in X and Z just below the convection zone. Details of the model structure are provided at https://github.com/jcd11/LRSP_models.

Fig. 18

Hydrogen abundance X (top panel) and heavy-element abundance Z (lower panel) against fractional radius, in a model (Model S of Christensen-Dalsgaard et al. 1996) of the present Sun. The inset in the upper panel shows the hydrogen-abundance profile in the vicinity of the base of the convective envelope. The horizontal dotted lines show the initial values \(X_0\) and \(Z_0\)

It is perhaps of some interest to compare the structure of this model with an early calibrated model of solar structure. In Fig. 19 Model S is compared with a \(1 \,M_\odot \) model computed by Weymann (1957), as quoted by Schwarzschild (1958); the model has solar radius and approximately solar luminosity at an age of \(4.5 \,\mathrm{Gyr}\). It is evident that the hydrogen profile differs substantially between the two models, in part owing to the inclusion of settling in Model S, but more importantly because the Weymann model is less evolved. On the other hand, on this scale temperature and pressure look quite similar between the two models. In fact, the central temperature and pressure differ by less than 10%, although there are differences of up to nearly 30% in temperature elsewhere in the model and even larger differences in pressure. Another significant difference is in the depth of the convective envelope, which is around \(0.15 \,R_\odot \) in the Weymann model and \(0.29 \,R_\odot \) in Model S. Even so, given that Model S provides a reasonable representation of solar structure (see Sect. 5.1.2), it is evident that the early model succeeded in capturing important aspects of the structure of the Sun.

Fig. 19

Comparison of Model S of Christensen-Dalsgaard et al. (1996) (dashed curves) with a \(1 \,M_\odot \) model computed by Weymann (1957) (solid curves). The quantities illustrated are temperature T, in \(\,\mathrm{K}\) (top panel), pressure p, in \(\,\mathrm{dyn}\,\mathrm{cm}^{-2}\) (central panel) and hydrogen abundance X (bottom panel)

4.2 Sensitivity of the model to changes in physics or parameters

It is evident that the uncertainty in the input parameters, and physics, of the calculation introduces uncertainties in the model structure. A number of investigations have addressed aspects of these uncertainties. An early example is provided by Christensen-Dalsgaard (1988b) who considered several different changes to the model physics, analysing the effects on the model structure and the resulting oscillation frequencies. Remarkably, he found that the change to the structure was essentially linear in the change in opacity as represented by \(\log \kappa \), even for quite substantial changes. Such linearity in changes to \(\varGamma _1\) was also found by Christensen-Dalsgaard and Thompson (1991). Boothroyd and Sackmann (2003) considered a broad range of changes in the model parameters and physics, emphasizing comparisons with the helioseismically inferred sound speed obtained by Basu et al. (2000). A very ambitious investigation was carried out by Bahcall et al. (2006) who made a Monte Carlo simulation based on 10,000 models with random selections of 21 parameters characterizing the models, in this way assigning statistical properties to the computed model quantities, including detailed neutrino fluxes. It was demonstrated by Jørgensen and Christensen-Dalsgaard (2017) that, owing to the near linearity of the model response to changes in parameters (see also Bahcall and Serenelli 2005), this result could to a large extent be recovered much more economically by computing the relevant partial derivatives with respect to the model parameters; this opens the possibility for more extensive statistical analysis of this nature. 
A more systematic exploration of the linearity of the response of solar models was carried out by Villante and Ricci (2010) who linearized the equations of stellar structure in terms of various perturbations and, consistent with the numerical experiments discussed above, demonstrated that the resulting changes to the model closely matched the differences between models computed with the assumed perturbations.
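The economy gained from a near-linear model response can be illustrated with a toy calculation. The Python sketch below is not a solar model: `toy_model` is a hypothetical smooth function of three scaled parameters, and the assumed 1-sigma uncertainties are purely illustrative. It compares the spread of the output obtained from a Monte Carlo simulation with the spread predicted from finite-difference partial derivatives, assuming independent Gaussian errors.

```python
import math
import random
import statistics


def toy_model(params):
    """Hypothetical stand-in for one output of a calibrated solar model."""
    a, b, c = params
    return a ** 2.0 * b ** -0.5 * math.exp(0.3 * (c - 1.0))


REFERENCE = [1.0, 1.0, 1.0]   # reference values of the scaled parameters
SIGMA = [0.02, 0.05, 0.03]    # assumed 1-sigma uncertainties (illustrative)


def partial_derivatives(model, p0, step=1.0e-4):
    """Centred finite-difference derivatives of the model at p0."""
    grads = []
    for i in range(len(p0)):
        up, down = list(p0), list(p0)
        up[i] += step
        down[i] -= step
        grads.append((model(up) - model(down)) / (2.0 * step))
    return grads


def linear_sigma(model, p0, sigma):
    """Output spread from linear error propagation with independent errors."""
    grads = partial_derivatives(model, p0)
    return math.sqrt(sum((g * s) ** 2 for g, s in zip(grads, sigma)))


def monte_carlo_sigma(model, p0, sigma, n=10000, seed=1):
    """Output spread from n random realizations of the parameters."""
    rng = random.Random(seed)
    samples = [model([rng.gauss(m, s) for m, s in zip(p0, sigma)])
               for _ in range(n)]
    return statistics.stdev(samples)


if __name__ == "__main__":
    lin = linear_sigma(toy_model, REFERENCE, SIGMA)
    mc = monte_carlo_sigma(toy_model, REFERENCE, SIGMA)
    print(f"linear propagation: {lin:.4f}")
    print(f"Monte Carlo:        {mc:.4f}")
```

In this sketch the linear estimate requires two model evaluations per parameter, against 10,000 for the Monte Carlo run. The agreement naturally degrades if the response is strongly nonlinear over the range of the parameter uncertainties or if the errors are correlated.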

Here I consider some examples of changes to the model parameters and physics, emphasizing the updates that have taken place since the original computation of Model S. When not specifically mentioned, the physical properties and parameters of the models are the same as for Model S (see also Table 1), which is also in most cases used as reference. An overview of the models considered is provided by Table 1, while Table 2 gives basic properties of the models, and Table 3 presents the differences between the modified models and Model S. To put the results in context, Fig. 20 shows the helioseismically inferred differenceFootnote 41 in squared sound speed between the Sun and Model S. Note that the statistical errors in the inferences are barely visible, compared with the size of the symbols. The helioseismic results on solar structure are discussed in detail in Sect. 5.1.2.

Table 1 Parameters of solar models
Table 2 Characteristics of the models in Table 1
Table 3 Differences between the model quantities in Table 2 and the corresponding properties of Model [S]
Fig. 20

(Adapted from Basu et al. 1997)

Results of helioseismic inversions. Inferred relative differences in squared sound speed between the Sun and Model S in the sense (Sun)–(model). The vertical bars show \(1\,\sigma \) errors in the inferred values, based on the errors in the observed frequencies. The horizontal bars provide a measure of the resolution of the inversion.

To interpret the results of such model comparisons, it is useful to note some simple properties of the solar convection zone (see also Gough 1984b; Christensen-Dalsgaard 1997; Christensen-Dalsgaard et al. 1992, 2005). Apart from the relatively thin ionization zones of hydrogen and helium, pressure and density in the convection zone are approximately related by Eq. (27), with \(\gamma = 5/3\); also, since the mass of the convection zone is only around \(0.025 \,M_\odot \) we can, as a first approximation, assume that \(m \simeq M\) in the convection zone. In this case it is easy to show thatFootnote 42

$$\begin{aligned} c^2 = {\gamma p \over \rho } \simeq (\gamma - 1) G M \left( {1 \over r} - {1 \over R} \right) . \end{aligned}$$
(43)

It follows that c is unchanged at fixed r between models with the same mass and surface radius. Also,

$$\begin{aligned} {\delta _r p \over p} = {\delta _r \rho \over \rho } \simeq - {1 \over \gamma - 1} {\delta K \over K}, \end{aligned}$$
(44)

where \(\delta _r\) denotes the difference between two models at fixed r, and \(\delta K\) is the difference in K between the models. Finally, assuming the ideal gas law, Eq. (12),

$$\begin{aligned} {\delta _r T \over T} \simeq {\delta _r \mu \over \mu }, \end{aligned}$$
(45)

which is obviously constant.
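Equations (43) and (44) are easily checked numerically. The Python sketch below evaluates the sound speed from Eq. (43) at a fractional radius of 0.71, corresponding to the convection-zone depth of 0.29 quoted for Model S, and verifies the density response to a small change in K. The value of \(G M_\odot \) is a standard value assumed here, K is expressed in arbitrary units, and the function names are hypothetical.

```python
import math

GM_SUN = 1.32712440e26   # cm^3 s^-2, assumed standard solar value of G*M
R_SUN = 6.9599e10        # cm, photospheric radius used in Model S
GAMMA = 5.0 / 3.0


def sound_speed_squared(x, radius=R_SUN, gm=GM_SUN, gamma=GAMMA):
    """Eq. (43) at fractional radius x = r/R, in cm^2 s^-2."""
    r = x * radius
    return (gamma - 1.0) * gm * (1.0 / r - 1.0 / radius)


def density(x, k, radius=R_SUN, gm=GM_SUN, gamma=GAMMA):
    """Density from p = K rho^gamma, using gamma K rho^(gamma - 1) = c^2."""
    c2 = sound_speed_squared(x, radius, gm, gamma)
    return (c2 / (gamma * k)) ** (1.0 / (gamma - 1.0))


if __name__ == "__main__":
    x_base = 0.71
    c = math.sqrt(sound_speed_squared(x_base))
    print(f"sound speed at r/R = {x_base}: {c / 1.0e5:.0f} km/s")

    # Relative density change at fixed r for a small relative change in K.
    eps = 1.0e-3
    k_ref = 1.0  # arbitrary units; only the ratio matters
    actual = density(x_base, k_ref * (1.0 + eps)) / density(x_base, k_ref) - 1.0
    predicted = -eps / (GAMMA - 1.0)  # Eq. (44)
    print(f"delta rho / rho: actual {actual:.6e}, Eq. (44) {predicted:.6e}")
```

Since Eq. (43) does not involve K, the sound speed at fixed r is the same for both values of K, while the density, and hence the pressure, shifts by the nearly uniform relative amount given by Eq. (44).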

Since the effects of the changes are subtle, some care is required in specifying and computing the differences.Footnote 43 Here I consider differences (also denoted \(\delta _r\)) at fixed fractional radius r/R, where R is the photospheric radius. It should be noted, however, that Christensen-Dalsgaard and Thompson (1997) found differences \(\delta _m\) at fixed mass fraction m/M more illuminating for studies of the effects on oscillation frequencies of near-surface modifications to the model. Such differences are also more appropriate for studying evolutionary effects on stellar models. They showed that the two differences are related by

$$\begin{aligned} \delta _m f= & {} \delta _r f + \delta _m r {\mathrm{d}f \over \mathrm{d}r} \nonumber \\ \delta _r f= & {} \delta _m f + \delta _r m {\mathrm{d}f \over \mathrm{d}m}, \end{aligned}$$
(46)

for any model quantity f.
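The first of Eqs. (46) can be checked to first order with a pair of toy profiles. In the Python sketch below the functions `f1`, `f2` and the mass-radius relations are hypothetical analytical profiles with no physical meaning, chosen only so that both the quantity f and the radius at fixed mass differ slightly between the two 'models'.

```python
import math

EPS = 1.0e-3  # size of the hypothetical perturbation between the two models


def f1(r):
    """Quantity f in the reference model, as a function of radius."""
    return math.exp(-2.0 * r)


def f2(r):
    """Quantity f in the perturbed model."""
    return f1(r) * (1.0 + EPS * r)


def df1_dr(r):
    """Radial derivative of f in the reference model."""
    return -2.0 * f1(r)


def r1_of_m(m):
    """Radius at mass fraction m in the reference model (m = r^3)."""
    return m ** (1.0 / 3.0)


def r2_of_m(m):
    """Radius at mass fraction m in the perturbed model (m = r^(3 (1 + EPS)))."""
    return m ** (1.0 / (3.0 * (1.0 + EPS)))


def compare_differences(m):
    """Return (delta_m f, delta_r f + delta_m r * df/dr) at mass fraction m."""
    r1, r2 = r1_of_m(m), r2_of_m(m)
    delta_m_f = f2(r2) - f1(r1)   # difference at fixed mass
    delta_r_f = f2(r1) - f1(r1)   # difference at fixed radius
    delta_m_r = r2 - r1           # radius difference at fixed mass
    return delta_m_f, delta_r_f + delta_m_r * df1_dr(r1)


if __name__ == "__main__":
    for m in (0.1, 0.5, 0.9):
        lhs, rhs = compare_differences(m)
        print(f"m = {m:.1f}: delta_m f = {lhs:.6e}, right-hand side = {rhs:.6e}")
```

The two sides agree up to terms of second order in the perturbation, which illustrates how a purely geometrical shift of the mass-radius relation contributes to differences taken at fixed radius.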

A prerequisite for sensible studies of solar models and their dependence on the physics is that adequate numerical precision is reached. I discuss this in the Appendix.

I first consider changes in the global parameters characterizing the model. Figure 21 shows the effect of decreasing the model age to the now generally accepted value of 4.57 Gyr (see Sect. 2.2), compared with the reference value of 4.6 Gyr in Model S. To match the solar luminosity at this lower age, a slightly smaller initial hydrogen abundance is required, increasing \(\mu \) (cf. Eq. 36); on the other hand, the increased central hydrogen abundance reflects the shorter time spent in hydrogen burning. As predicted above, the sound-speed difference is virtually zero in the convection zone, except in the ionization zones near the surface where the change results from the change in composition and the resulting change in \(\varGamma _1\). Also, \(\delta _r \ln p\) and \(\delta _r \ln \rho \) are nearly constant and nearly identical in the bulk of the convection zone (cf. Eq. 44) and the change in temperature reflects the change in the mean molecular weight.

Fig. 21

Model changes at fixed fractional radius resulting from a change in age, from the reference value of 4.6 Gyr used in Model S to Model [Age] with an age of 4.57 Gyr (see Sect. 2.2), in the sense (Model [Age])–(Model S). The line styles are defined in the figure. The thin dotted line marks zero change

A related issue concerns the neglect of pre-main-sequence evolution in Model S, where evolution starts from an essentially homogeneous zero-age main-sequence model. This was investigated by Morel et al. (2000) who found that, with a shift in the evolution by 25 Myr, the resulting calibrated solar models differed by only a few parts in \(10^4\). Thus the assumption of an initial ZAMS model is adequate.

The effect of changing the radius, from the reference value of \(6.9599 \times 10^{10} \,\mathrm{cm}\) to the value of \(6.95508 \times 10^{10} \,\mathrm{cm}\) found by Brown and Christensen-Dalsgaard (1998), is illustrated in Fig. 22a. Here there is obviously a change in the sound speed in the convection zone, and consequently \(\delta _r \ln p\) and \(\delta _r \ln \rho \), while still approximately constant in the convection zone, differ. Considering the changes in the radiative interior, the use of differences at fixed r/R is in fact somewhat misleading in this case. Much of the change shown in Fig. 22a is essentially a geometrical effect, corresponding to the gradient term in the second of Eqs. (46); the corresponding differences at fixed m (see Fig. 22b) become very small in the deep interior. As a result, the value of \(X_0\) required to calibrate the model is virtually unchanged.

Fig. 22

Model changes at fixed fractional radius (a) and fixed mass (b), resulting from a change in photospheric radius, from the reference value of \(6.9599 \times 10^{10} \,\mathrm{cm}\) used in Model S to the value of \(6.95508 \times 10^{10} \,\mathrm{cm}\) in the sense (Model [\(R_{\mathrm{s}}\)])–(Model S). Line styles are as defined in Fig. 21

As illustrated in Fig. 23 the change in luminosity from the reference value of \(3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) to the value \(3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) inferred from Kopp et al. (2016) has modest effects on the structure. According to Eq. (36) the calibration to lower luminosity requires a decrease in \(\mu \) and hence an increase in X, accompanied by a decrease in temperature, which is evident in the figure. In the central regions the lower luminosity also corresponds to less nuclear burning of hydrogen and hence a larger hydrogen abundance. The difference in sound speed is minute.

Fig. 23

Model changes at fixed fractional radius resulting from a change in surface luminosity, from the reference value of \(3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) used in Model S to the value of \(3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) adopted by Mamajek et al. (2015) as the nominal solar luminosity, in the sense (Model [\(L_{\mathrm{s}}\)])–(Model S). Line styles are as defined in Fig. 21

As discussed in Sect. 2.3.1 the OPAL equation of state has been substantially updated since the computation of Model S. Figure 24 compares a model computed with the up-to-date OPAL 2005 version with Model S. The effects in the bulk of the model are rather modest, with somewhat larger changes in the near-surface layers. A significant failing in the earlier tables was the neglect of relativistic effects on the electrons in the central regions, which have a significant effect on \(\varGamma _1\) (see also Eq. 20). This in fact dominates the sound-speed difference in the deeper parts of the model in Fig. 24.

Fig. 24

Model changes at fixed fractional radius, between Model [Liv05] using the OPAL 2005 equation of state and Model S, in the sense (Model [Liv05])–(Model S). Line styles are as defined in Fig. 21, with the addition of the dotted green line showing \(\delta _r \varGamma _1/\varGamma _1\)

Perhaps the most uncertain aspect of the stellar internal microphysics is the opacity (see also Sects. 2.3.2, 6.4). Tripathy and Christensen-Dalsgaard (1998) made a detailed investigation of the effect on calibrated solar models of localized modifications to the opacity. They replaced \(\log \kappa \), \(\log \) being logarithm to base 10, by \(\log \kappa + \delta \log \kappa \), where

$$\begin{aligned} \delta \log \kappa = A_\kappa \exp [ -(\log T - \log T_\kappa )^2/\varDelta _\kappa ^2], \end{aligned}$$
(47)

for a range of \(\log T_\kappa \). They also demonstrated a nearly linear response for even fairly large modifications, by changing \(A_\kappa \) from 0.1 to 0.2. The response of solar models to opacity changes was also investigated by Villante and Ricci (2010). As examples, Fig. 25 shows the changes to the model resulting from opacity changes of the form given in Eq. (47) at \(\log T_\kappa = 7\) and 6.5. It is striking that the changes in temperature and hence sound speed are largely localized in the vicinity of the opacity change, with a somewhat broader response of pressure and density. For the deeper opacity change a modest change in the hydrogen abundance is required to calibrate the model to the correct luminosity: the increase in opacity would tend to reduce the luminosity and this is compensated by a decrease in X and hence an increase in \(\mu \), in accordance with the homology scaling in Eq. (36).
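The localized modification of Eq. (47) can be sketched in a few lines. The function names below are hypothetical, and the default amplitude and width are the values quoted in the caption of Fig. 25; this is a minimal illustration rather than the implementation used for the models.

```python
import numpy as np

def delta_log_kappa(log_T, log_T_kappa, A_kappa=0.02, Delta_kappa=0.02):
    """Gaussian change in log10(kappa) centred on log_T_kappa, Eq. (47)."""
    log_T = np.asarray(log_T, dtype=float)
    return A_kappa * np.exp(-((log_T - log_T_kappa) / Delta_kappa) ** 2)

def perturbed_kappa(kappa, log_T, log_T_kappa, **kwargs):
    """Apply the change: log kappa -> log kappa + delta log kappa."""
    return kappa * 10.0 ** delta_log_kappa(log_T, log_T_kappa, **kwargs)
```

For example, `perturbed_kappa(kappa, log_T, 7.0)` increases the opacity by a factor \(10^{0.02}\) at \(\log T = 7\) and leaves it essentially unchanged a few widths away.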

Fig. 25

Model changes at fixed fractional radius, resulting from localized changes to the opacity described by Eq. (47) with \(A_\kappa = 0.02, \varDelta _\kappa = 0.02\), in the sense (modified model)–(Model S). The top panel shows results for Model [Opc. 7.0], with \(\log T_\kappa = 7\), and the bottom panel results for Model [Opc. 6.5], with \(\log T_\kappa = 6.5\). Results are shown as a function of fractional radius (bottom abscissa) and \(\log T\) (top abscissa), and the line styles are defined in the figure

The behaviour of \(\delta _r \ln T\) can be understood from the equation for the temperature gradient (Eq. 45) which we write as

$$\begin{aligned} {\mathrm{d}\ln T \over \mathrm{d}r} = - {3 \over 4 a {\tilde{c}}} {\kappa \rho \over T^4} {L(r) \over 4 \pi r^2}, \end{aligned}$$
(48)

or

$$\begin{aligned} {\mathrm{d}\over \mathrm{d}r} \left( {\delta _r T \over T} \right) = - {3 \over 4 a {\tilde{c}}} {\kappa \rho \over T^4} {L(r) \over 4 \pi r^2} \left( {\delta _r \kappa \over \kappa } + {\delta _r \rho \over \rho } - 4 {\delta _r T \over T} \right) , \end{aligned}$$
(49)

where I neglected the perturbation to L. We write \(\delta _r \kappa / \kappa = (\delta \kappa /\kappa )_{\mathrm{int}} + \kappa _T \delta _r T/T\), where \( (\delta \kappa /\kappa )_{\mathrm{int}}\) is the intrinsic opacity change given by Eq. (47), \(\kappa _T = (\partial \ln \kappa / \partial \ln T)_{\rho , X_i}\) and I neglected the dependence of \(\kappa \) on \(\rho \) and composition. Then Eq. (49) can be written as

$$\begin{aligned} {\mathrm{d}\over \mathrm{d}\ln T} \left( {\delta _r T \over T} \right) + (4 - \kappa _T) {\delta _r T \over T} = \left( {\delta \kappa \over \kappa } \right) _{\mathrm{int}}, \end{aligned}$$
(50)

neglecting again \(\delta _r \rho / \rho \). In the outer parts of the Sun the temperature is largely fixed, for small changes in X, by Eq. (45). Assuming that \(\delta _r T/T \approx 0\) well outside the location \(T = T_\kappa \) of the change in the opacity, and taking \(\kappa _T\) as constant, Eq. (50) has the solution

$$\begin{aligned} {\delta _r T \over T} \approx T^{-(4 - \kappa _T)} \int _{\ln T_{\mathrm{s}}}^{\ln T} T'^{4 - \kappa _T} \left( {\delta \kappa \over \kappa } \right) _{\mathrm{int}} \mathrm{d}\ln T', \end{aligned}$$
(51)

where \(T_{\mathrm{s}}\) is the surface temperature. This explains the steep rise in the outer parts of the peak in \(\delta _r T/T\) (and hence \(\delta _r c^2/c^2\)) and, with \(\kappa _T\) typically around \(-2\) to \(-3\), the relatively rapid decay on the inner side.
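A minimal numerical sketch of Eq. (51) is given below. It assumes a constant \(\kappa _T = -2.5\) (within the range quoted above), a Gaussian intrinsic change of the form in Eq. (47) with the parameters of Fig. 25, and the first-order relation \((\delta \kappa /\kappa )_{\mathrm{int}} \approx \ln 10 \, \delta \log \kappa \). The function name and the mesh are illustrative choices.

```python
import numpy as np

def delta_T_response(log_T, dlnk_int, kappa_T=-2.5):
    """Evaluate Eq. (51) on a mesh that increases in log10(T).

    The first mesh point plays the role of ln T_s: it only needs to lie
    well outside the opacity perturbation, where the integrand vanishes.
    """
    x = np.log(10.0) * np.asarray(log_T, dtype=float)   # x = ln T
    s = 4.0 - kappa_T
    weight = np.exp(s * (x - x[0]))                      # T^(4 - kappa_T), rescaled
    integrand = weight * dlnk_int
    # Cumulative trapezoidal integral over ln T, starting from zero.
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)
    integral = np.concatenate(([0.0], np.cumsum(steps)))
    return integral / weight

log_T = np.linspace(6.0, 7.0, 10001)
dlnk_int = np.log(10.0) * 0.02 * np.exp(-((log_T - 6.5) / 0.02) ** 2)
dlnT = delta_T_response(log_T, dlnk_int)
```

In this sketch the response rises across the opacity bump, peaks slightly on the high-temperature side of \(T_\kappa \), and then decays as \(T^{-(4 - \kappa _T)}\) towards the interior.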

To analyse the properties of \(\delta _r p\) and \(\delta _r \rho \) I assume the ideal gas law, Eq. (12), and neglect the change in the mean molecular weight, such that \(\delta _r \ln \rho \approx \delta _r \ln p - \delta _r \ln T\). From Eq. (1) of hydrostatic equilibrium, neglecting the change in m, it then follows that

$$\begin{aligned} {\mathrm{d}\delta _r \ln p \over \mathrm{d}\ln p} \approx - \delta _r \ln T. \end{aligned}$$
(52)
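
The intermediate algebra behind Eq. (52) can be sketched as follows, assuming that Eq. (1) has the usual form \(\mathrm{d}p/\mathrm{d}r = - G m \rho /r^2\) and neglecting \(\delta _r m\), as stated above. Writing the equation in logarithmic form and perturbing at fixed r gives

$$\begin{aligned} {\mathrm{d}\ln p \over \mathrm{d}r}= & {} - {G m \rho \over r^2 p}, \nonumber \\ {\mathrm{d}\delta _r \ln p \over \mathrm{d}r}\approx & {} {\mathrm{d}\ln p \over \mathrm{d}r} \left( \delta _r \ln \rho - \delta _r \ln p \right) \approx - {\mathrm{d}\ln p \over \mathrm{d}r} \, \delta _r \ln T, \nonumber \end{aligned}$$

where the last step uses \(\delta _r \ln \rho \approx \delta _r \ln p - \delta _r \ln T\). Dividing by \(\mathrm{d}\ln p/\mathrm{d}r\) yields Eq. (52).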

Below the location of the opacity and temperature change pressure and density are relatively unaffected. Thus the local change in pressure is dominated by the increase with increasing r in the peak of \(\delta _r \ln T\), while \(\delta _r \ln \rho \) has a negative dip in this region, but follows the increase in \(\delta _r \ln p\) outside it. The global behaviour of \(\delta _r \ln p\) and \(\delta _r \ln \rho \) is constrained by the conservation of total mass, such that

$$\begin{aligned} \int _0^R \delta _r \rho r^2 \mathrm{d}r = 0. \end{aligned}$$
(53)

For \(\log T_\kappa = 7\) (top panel in Fig. 25) the region of positive \(\delta _r \ln \rho \) just outside the peak in \(\delta _r \ln T\) therefore forces a region of negative \(\delta _r \ln \rho \) in the outer parts of the model, including the convection zone where \(\delta _r \ln p\) and \(\delta _r \ln \rho \), according to Eq. (44), are approximately constant. For \(\log T_\kappa = 6.5\) (bottom panel) the region of negative \(\delta _r \ln \rho \) in the peak of \(\delta _r \ln T\) results in positive \(\delta _r \ln \rho \) and \(\delta _r \ln p\) in the convection zone. The effect on the hydrogen abundance is less clear in simple terms, although it must be related to the calibration to keep the luminosity fixed. Given that the changes in the deep interior are minute for \(\log T_\kappa = 6.5\), it is understandable that \(\delta _r X\) is very small in this case, except in the region just below the convection zone that is directly affected by changes in diffusion and settling.

The OPAL opacity tables were updated by Iglesias and Rogers (1996), relative to the Rogers and Iglesias (1992) tables used for Model S. As shown in Fig. 26, comparing models that both assumed the Grevesse and Noels (1993) solar composition but using respectively the OPAL96 and OPAL92 tables, the revision of the opacity calculation has some effect on the structure, including a relatively substantial change in the sound speed.

Fig. 26

Model changes at fixed fractional radius, between Model [OPAL96] which uses the OPAL96 opacities and Model S, where the OPAL92 tables were used, in the sense (Model [OPAL96])–(Model S). Line styles are as defined in Fig. 25

As noted by, for example, Tripathy and Christensen-Dalsgaard (1998) and Vinyoles et al. (2017), responses to localized opacity changes such as shown in Fig. 25 define ‘opacity kernels’ that can be used to reconstruct the effects of more general opacity changes. An example is illustrated in Fig. 27. Here the top panel shows a fit to the difference between the OPAL96 and OPAL92 tables in the radiative region, based on localized opacity changes of the form in Eq. (47) on a dense grid in \(\log T_\kappa \). Applying the resulting amplitudes to the corresponding model differences yields the red curves in the bottom panel, which are in excellent agreement with the direct differences between the OPAL96 and OPAL92 models, as illustrated in Fig. 26. The changes in \(c^2\) and \(\rho \) are dominated by the substantial negative opacity difference at relatively low temperature, yielding a negative \(\delta _r \ln c^2\) just below, and a negative \(\delta _r \ln \rho \) within, the convection zone. As noted above, the change in X, on the other hand, is insensitive to the opacity change in the outer parts of the radiative region, and hence the positive \(\delta \ln \kappa \) in the deeper regions results in a negative \(\delta _r X\).
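The reconstruction described above relies on the near-linearity of the model response. A hedged sketch of the two steps, fitting a table difference with functions of the form in Eq. (47) and superposing the corresponding model responses, could look as follows. The function names, the least-squares fit and the layout of the `kernels` array are assumptions for illustration; the fitting procedure actually used for Fig. 27 is not specified here.

```python
import numpy as np

def gaussian_basis(log_T, centres, width=0.02):
    """Columns are unit-amplitude profiles of the form in Eq. (47)."""
    log_T = np.asarray(log_T, dtype=float)[:, None]
    centres = np.asarray(centres, dtype=float)[None, :]
    return np.exp(-((log_T - centres) / width) ** 2)

def fit_amplitudes(log_T, target_dlogk, centres, width=0.02):
    """Least-squares amplitudes A_kappa(log T_kappa) reproducing a table difference."""
    basis = gaussian_basis(log_T, centres, width)
    amplitudes, *_ = np.linalg.lstsq(basis, target_dlogk, rcond=None)
    return amplitudes

def reconstruct(amplitudes, kernels, A_ref=0.02):
    """Superpose model responses, assuming linearity.

    kernels[j] is the model difference (e.g. delta_r ln c^2 on a radial mesh)
    obtained for amplitude A_ref at the j-th value of log T_kappa.
    """
    return (np.asarray(amplitudes) / A_ref) @ np.asarray(kernels)
```

Because neighbouring Gaussians overlap strongly on a grid with step 0.01 and width 0.02, the individual amplitudes are less well determined than the fitted opacity difference itself.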

Fig. 27

Top panel: The solid curve shows logarithmic differences between the OPAL96 and the OPAL92 opacity, in the sense (OPAL96)–(OPAL92), at fixed \(\rho \), T and composition in Model [OPAL96]. The dashed curve shows a fit of functions of the form in Eq. (47), with \(\varDelta _\kappa = 0.02\) and on a grid in \(\log T_\kappa \) between 7.2 and 6.2 with a step of 0.01. Bottom panel: differences \(\delta _r \ln c^2\) (solid curves), \(\delta _r \ln \rho \) (long-dashed curve) and \(\delta _r X\) (double-dot-dashed curve). The black curves show results from Fig. 26, whereas the red curves show reconstructions based on ‘opacity kernels’ such as shown in Fig. 25, using the fit shown in the top panel

The effects of changing the atmospheric opacity are illustrated in Fig. 28, comparing the more recent tables of Ferguson et al. (2005) with the Kurucz (1991) tables used in the computation of Model S. There are significant changes in pressure and density in the atmosphere, reflecting the integration of atmospheric structure at the given temperature structure (cf. Sect. 2.4, in particular Eqs. 28–30). However, as discussed by Christensen-Dalsgaard and Thompson (1997) the effects of such superficial changes in calibrated solar models are very strongly confined to the near-surface layers; the differences in the bulk of the convection zone and in the radiative interior are minute.Footnote 44

Fig. 28

Model changes at fixed fractional radius, between a model computed using the Ferguson et al. (2005) low-temperature opacities and Model [OPAL96], which used the Kurucz (1991) tables, in the sense (Model [Surf. opac.])–(Model [OPAL96]); in both cases the OPAL96 tables were used in the deeper parts of the model. Line styles are defined in the top panel

Relative to the Grevesse and Noels (1993) composition used in Model S a modest revision was proposed by Grevesse and Sauval (1998); the compositions are compared in Table 4 in Sect. 6.1 below. This composition has seen extensive use in solar modelling. The effects of this change on the model structure are illustrated in Fig. 29, using for both compositions the OPAL96 opacity tables. There is evidently some change, at a level that is significant compared with the helioseismic results in the sound speed, as well as a modest change in the hydrogen abundance required for luminosity calibration. In particular, the 10% change in the oxygen abundance (cf. Table 4) and the general decrease in the heavy-element abundance (cf. Table 3) cause a decrease in the opacity of up to 4% just below the convection zone, leading to a significant decrease in the sound speed in the outer parts of the radiative region, as shown in Fig. 29. As discussed in Sect. 6.1 the much greater revision since 2000 of the determination of the solar surface composition has had very substantial effects on solar models.

Fig. 29

Model changes at fixed fractional radius, between Model [GS98] using the Grevesse and Sauval (1998) composition and Model [Surf. opac.] which used the Grevesse and Noels (1993) composition (see Table 4), in the sense (Model [GS98])–(Model [Surf. opac.]); in both cases the Ferguson et al. (2005) atmospheric and the OPAL96 interior tables were used. Line styles are as defined in Fig. 28

An indication of the effects of the uncertainties in the opacity computations may be obtained by comparing the use of the OPAL tables with the results of the independent OP project (e.g., Seaton et al. 1994; Badnell et al. 2005; Seaton 2005); the differences between the tables are illustrated in Fig. 6 (note that this shows OPAL–OP). In Fig. 30 models computed with the OP and OPAL tables are compared, in both cases using the Grevesse and Sauval (1998) composition. The effect is clearly substantial, with an increase in the sound speed in the bulk of the radiative interior and in the hydrogen abundance resulting from the luminosity calibration. The model differences can at least qualitatively be understood from the opacity kernels discussed above. The differences in sound speed, pressure and density are probably dominated by the positive table differences at temperatures just below the convection zone, while the change in the hydrogen abundance is dominated by the negative table differences in the deeper parts of the model. Other comparisons of different opacity calculations were carried out, for example, by Neuforge-Verheecke et al. (2001b), who compared OPAL and the Los Alamos LEDCOP tables, and Le Pennec et al. (2015b), comparing OPAL and the recent OPAS tables (Blancard et al. 2012; Mondet et al. 2015) developed at CEA, France.

Fig. 30

Model changes at fixed fractional radius, between Model [OP05] using the OP05 opacity tables (e.g., Seaton 2005) and Model [GS98] using the OPAL96 tables, in the sense (Model [OP05])–(Model [GS98]); in both cases the GS98 composition and the Ferguson et al. (2005) low-temperature opacities were used. Line styles are as defined in Fig. 28

As discussed in Sect. 2.5 there is considerable uncertainty in the treatment of convection in the strongly super-adiabatic region just below the photosphere (see Fig. 12). In calibrated solar models, however, this has little effect on the structure of the bulk of the model. To illustrate this Fig. 31 shows differences between a model computed using the Canuto and Mazzitelli (1991) treatment, as implemented by Monteiro et al. (1996), and Model S. There are substantial differences in the near-surface region, but these are very strongly confined, with the differences being extremely small in the lower parts of the convection zone and the radiative interior (see also Christensen-Dalsgaard and Thompson 1997). This effect is similar to the effect of modifying the atmospheric opacity, shown in Fig. 28. As illustrated by the solid blue line, the difference in squared sound speed at fixed mass fraction is much more strongly confined near the surface than the difference at fixed fractional radius. It was argued by Christensen-Dalsgaard and Thompson (1997) that, consequently, \(\delta _q \ln c^2\) provides a better representation of the effects of the near-surface modification on the oscillation frequencies. In fact, model differences such as these or those shown in Fig. 28 provide a model for the near-surface errors in traditional structure and oscillation modelling which have an important effect on helio- and asteroseismic investigation. To illustrate this, Fig. 37 below shows frequency differences between the models illustrated in Fig. 31.

Fig. 31

Model changes at fixed fractional radius, between Model [CM] emulating the Canuto and Mazzitelli (1991) treatment of near-surface convection and Model S, in the sense (Model [CM])–(Model S). Line styles are as defined in Fig. 28, with the addition of the solid blue line which shows the difference \(\delta _q \ln c^2\) of squared sound speed at fixed mass fraction q

The effects of the updates to the nuclear reaction parameters since Model S are illustrated in Fig. 32. Panel (a) is based on a model computed with the Adelberger et al. (2011) parameters, while in panel (b) the NACRE (Angulo et al. 1999) reaction rates, with the Formicola et al. (2004) update of the \({{}^{14}\mathrm{N}}\) rate, were used. In both cases the dominant change to the overall reaction rate was at the highest temperatures and is closely related to updated quantities for the CNO reactions; at fixed conditions the energy generation decreased by 5–8% relative to the formulation used in Model S. This is directly reflected in the higher hydrogen abundance (see also Table 3) and hence higher sound speed in the core, in both cases. Calibration to fixed luminosity caused modest changes in the structure in the other parts of the models. It should be noticed that while the differences in \(\epsilon \) at fixed \(\rho \), T and composition for the Adelberger et al. (2011) rates are largely confined to the region where \(\log T \ge 7.1\), the differences in the NACRE rates extend more broadly, leading to the substantially larger model differences in the NACRE case (Fig. 32b).

Fig. 32

Model changes at fixed fractional radius, corresponding to changes in the nuclear reaction parameters, compared with Model S which used parameters largely from Bahcall and Pinsonneault (1995). a Differences for Model [Adelb11], using the Adelberger et al. (2011) parameters, in the sense (Model [Adelb11])–(Model S). b Differences for Model [NACRE] using the Angulo et al. (1999) (NACRE) parameters, with the reaction \({{}^{14}\mathrm{N}}+ {{}^{1}\mathrm{H}}\) updated by Formicola et al. (2004), in the sense (Model [NACRE])–(Model S). Line styles are as defined in Fig. 28

A potential simplification of the calculation is to assume that \({{}^{3}\mathrm{He}}\) is in nuclear equilibrium. The region where this is satisfied approximately corresponds to the rising part of the \({{}^{3}\mathrm{He}}\) abundance shown in Fig. 7 and hence in fact covers most of the region of nuclear energy generation in the present Sun. However, the change in the hydrogen abundance over solar evolution does depend on the details of the nuclear reactions. As illustrated in Fig. 33 assuming nuclear equilibrium of \({{}^{3}\mathrm{He}}\) throughout the evolution indeed generally has a minute effect on the resulting model of the present Sun. The peak in \(\delta _r X\) at \(r/R \approx 0.27\) corresponds closely to the peak in the \({{}^{3}\mathrm{He}}\) abundance (cf. Fig. 7) and probably reflects the local conversion of hydrogen into \({{}^{3}\mathrm{He}}\).

Fig. 33

Model changes at fixed fractional radius, between Model [\( {}^3\mathrm{He} \) eql.] where \({{}^{3}\mathrm{He}}\) is assumed to be in nuclear equilibrium and Model S, in the sense (Model [\( {}^3\mathrm{He} \) eql.])–(Model S). Line styles are defined in the figure

As mentioned in Sect. 2.3.3 there has been some discussion about the validity of the classical Salpeter (1954) model of static screening of nuclear reactions, with dynamical simulations indicating absence of screening (Mussack and Däppen 2011). The effects of switching off all screening of nuclear reactions are illustrated in Fig. 34. At fixed conditions corresponding to Model S this results in a reduction in the nuclear energy-generation rate of up to 9% near the centre, where the CNO cycle plays some role (cf. Fig. 8), and around 5% further out, where the PP chains dominate. To achieve luminosity calibration this is compensated by increases in temperature, hydrogen abundance and density, the latter increase requiring a decrease in density in the outer parts of the model to conserve the total mass (cf. Eq. 53). The effects show some similarity to the effects of the revision of nuclear parameters (Fig. 32), probably reflecting also here the larger reduction in the rates of the more temperature-sensitive reactions, but the changes are clearly of a much larger magnitude. Indeed, Weiss et al. (2001) pointed out that the resulting model is inconsistent with the constraints provided by the helioseismically determined sound speed (cf. Sect. 5.1.2; see also Christensen-Dalsgaard and Houdek 2010).

Fig. 34

Model changes at fixed fractional radius, between Model [No el.scrn] where electron screening is switched off and Model S, in the sense (Model [No el.scrn])–(Model S). Line styles are as defined in Fig. 33

To illustrate the sensitivity of the models to the detailed treatment of diffusion and settling Fig. 35 shows the effect of increasing \(D_i\) (cf. Eq. 6) by a factor 1.2 (panel a) or increasing both \(D_i\) and \(V_i\) by this factor (panel b). In the former case the effects are small, the dominant changes being confined to the core where the increased diffusion partly smoothes the hydrogen profile, leading to an increase in the hydrogen abundance, with a corresponding increase in the sound speed. There are additional even smaller changes associated with the gradient in hydrogen abundance caused by settling just below the convection zone. When also the settling velocity is increased the changes are more substantial, including a significant increase in the hydrogen abundance in the convection zone and a noticeable increase in the sound speed below the convection zone; note also the near-surface sound-speed changes, of similar shape but opposite sign to the effects of neglecting diffusion and settling (see Fig. 36 below) and, as in that case, reflecting the thermodynamic response to the change in the helium abundance.

Fig. 35

Model changes at fixed fractional radius, resulting from changes to the diffusion and settling coefficients, compared with Model S. a Differences for Model [Dc] where the diffusion coefficient \(D_i\) (cf. Eq. 6) was increased by a factor 1.2, in the sense (Model [Dc])–(Model S). b Differences for Model [DVc] where both \(D_i\) and the settling velocity \(V_i\) were increased by a factor 1.2, in the sense (Model [DVc])–(Model S). Line styles are as defined in Fig. 33

Fig. 36

Model changes at fixed fractional radius, comparing Model [No diff.] neglecting diffusion and settling with Model S, in the sense (Model [No diff.])–(Model S). Line styles are as defined in Fig. 33; in addition, in the right-hand expanded view of the outer helium and hydrogen ionization zones the green dotted curve shows \(\delta _r \ln \varGamma _1\)

Finally, it should be recalled that early ‘standard solar models’ did not include effects of diffusion and settling. It was shown by Christensen-Dalsgaard et al. (1993) that including just diffusion and settling of helium led to a substantial improvement in the comparison between the model and helioseismic inferences of sound speed, and hence more recent solar models, such as Model S, include full treatment of diffusion. To illustrate this effect Fig. 36 compares a model ignoring diffusion but otherwise corresponding to Model S, including the calibration, with Model S. It is evident that the change in the hydrogen abundance (which obviously to a large extent reflects the Model S hydrogen profile illustrated in Fig. 18) has a substantial effect on the sound speed, hence affecting the comparison with the helioseismic inference. There are more subtle effects on the sound speed near the surface that in part arise from the change in \(\varGamma _1\) caused by the change of the helium abundance in the helium ionization zones, and which affect the frequencies of acoustic modes. This effect illustrates the potential for helioseismic determination of the solar envelope helium abundance (see Sect. 5.1.3).

5 Tests of solar models

The models discussed so far have explicitly been computed to match the ‘classical’ observed quantities of the Sun: the initial composition \((X_0, Z_0)\) has been chosen to match the solar luminosity and present surface composition and the choice of mixing length has been made to match an assumed solar radius, at the assumed present age of the Sun. Since the model has thus been adjusted to match the observed \(L_{\mathrm{s}}\), R and \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) these quantities provide no independent test of the calculation, beyond the feeble constraint that apparently reasonable values of the required parameters can be found which match the observables.

As discussed in the introduction, very detailed independent testing of the model computation has become possible through helioseismology, by means of extensive observations of solar oscillations. Additional information relevant to the structure of the solar core results from the detection of neutrinos originating from the nuclear reactions (cf. Eq. 22). Finally, I briefly consider the surface abundances of light elements or isotopes which provide constraints on mixing processes in the solar interior.

5.1 Helioseismic tests of solar structure

Detailed reviews of the techniques and results of helioseismology have been provided by Christensen-Dalsgaard (2002), Basu and Antia (2008) and Aerts et al. (2010); an extensive review of solar oscillations and helioseismology was provided by Basu (2016) in Living Reviews in Solar Physics. A perhaps broader view, emphasizing also the limitations in the present results, was provided by Gough (2013b). None the less, it is appropriate here to provide a brief overview of the techniques of helioseismology and to summarize the results on the solar interior.

5.1.1 Properties of solar oscillations

Oscillations of the Sun are characterized by the degree l and azimuthal order m,Footnote 45 with \(|m| \le l\), of the spherical harmonic \(Y_l^m(\theta , \phi )\) describing the mode, where \(\theta \) is co-latitude and \(\phi \) is longitude, and by its radial order n. The degree provides a measure of the horizontal wave number \(k_{\mathrm{h}}\):

$$\begin{aligned} k_{\mathrm{h}}= {\sqrt{l(l+1)} \over r}, \end{aligned}$$
(54)

at distance r from the solar centre. Thus, except for radial modes (with \(l = 0\)), the average horizontal wavelength on the solar surface is \(\lambda _{\mathrm{h,s}} \simeq 2 \pi R/l\). The azimuthal order measures the number of nodal lines crossing the equator. The observed cyclic oscillation frequencies \(\nu \), between roughly 1 and 5 mHz, correspond to modes that predominantly have the character of standing acoustic waves, or p modes, and, at high degree, surface gravity waves, or f modes. In the case of the p modes, the frequencies are predominantly determined by the internal sound speed c, with

$$\begin{aligned} c^2 = {\varGamma _1 p \over \rho } \simeq {\varGamma _1 k_{\mathrm{B}}T \over \mu m_{\mathrm{u}}}, \end{aligned}$$
(55)

the latter expression assuming the ideal gas law (cf. Eq. 12).
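As a rough numerical illustration of Eqs. (54) and (55), the sketch below evaluates the horizontal wavenumber and the ideal-gas sound speed. The central temperature and mean molecular weight passed to the function are illustrative assumptions, not values taken from Model S, and the function names are hypothetical.

```python
import numpy as np

K_B = 1.380649e-16      # Boltzmann constant, erg/K
M_U = 1.66053907e-24    # atomic mass unit, g
R_SUN = 6.9599e10       # cm, reference radius quoted for Model S

def horizontal_wavenumber(l, r):
    """Horizontal wavenumber k_h of Eq. (54), in cm^-1 for r in cm."""
    return np.sqrt(l * (l + 1.0)) / r

def sound_speed_ideal_gas(T, mu, gamma1=5.0 / 3.0):
    """Sound speed from the second form of Eq. (55), in cm/s."""
    return np.sqrt(gamma1 * K_B * T / (mu * M_U))

# Assumed, illustrative central conditions (not Model S values).
c_centre = sound_speed_ideal_gas(T=1.57e7, mu=0.85)
# Horizontal wavelength at the surface for l = 100; close to 2 pi R / l.
lambda_h = 2.0 * np.pi / horizontal_wavenumber(100, R_SUN)
```

With these assumed inputs the central sound speed comes out at roughly \(5 \times 10^7 \,\mathrm{cm}\,\mathrm{s}^{-1}\), and the surface wavelength for \(l = 100\) differs from \(2 \pi R/l\) by less than one per cent.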

The f-mode frequencies are to a good approximation given by the deep-layer approximation for surface gravity waves, determined by the surface gravitational acceleration. Thus to leading order these modes provide little information about the structure of the solar interior, although a correction term, essentially reflecting the variation in the appropriate gravitational acceleration with mode properties, provides some sensitivity to the near-surface density profile (Gough 1993; Chitre et al. 1998). The dependence on surface gravity has been used to determine, on the basis of f-mode frequencies, the ‘seismic solar radius’ (Schou et al. 1997) and its variation with solar cycle (e.g., Kosovichev and Rozelot 2018a).

Rotation (or other departures from spherical symmetry) induces a dependence of the frequencies on the azimuthal order m. To leading order the effect of rotation simply corresponds to the advection of the oscillation patterns by the angular velocity as averaged over the region of the Sun sampled by a given mode.

From the dispersion relation for acoustic waves, and Eq. (54), it is straightforward to show that the modes are oscillatory as a function of r in the region of the Sun which lies outside an inner turning point, at distance \(r = r_{\mathrm{t}}\) from the centre satisfying

$$\begin{aligned} {c(r_{\mathrm{t}}) \over r_{\mathrm{t}}} = {\omega \over \sqrt{l(l+1)}}, \end{aligned}$$
(56)

and evanescent interior to this point; here \(\omega = 2 \pi \nu \) is the angular frequency of the mode. Since the sound speed generally increases with decreasing r, the turning point is close to the solar centre for very low degrees at the observed frequencies, the modes becoming increasingly confined near the surface with increasing degree. From a physical point of view this behaviour of the modes corresponds to total internal reflection, owing to the increase in the sound speed with depth, of sound waves corresponding to the given degree: the waves travel horizontally at the inner turning point. With increasing degree the initial direction of the waves at the solar surface is more strongly inclined from the vertical and the turning point is reached closer to the surface.
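The dependence of the turning point on degree can be illustrated by solving Eq. (56) on a tabulated sound-speed profile. The sketch below assumes that c/r decreases monotonically outwards, and it uses a toy analytic profile that is not a solar model; the function name `turning_point` is hypothetical.

```python
import numpy as np

R_SUN = 6.9599e10  # cm, reference radius quoted for Model S

def turning_point(r, c, nu, l):
    """Inner turning point r_t from Eq. (56).

    r, c : increasing radius mesh (cm) and sound speed (cm/s)
    nu   : cyclic frequency in Hz; l : degree, l >= 1
    """
    omega = 2.0 * np.pi * nu
    target = omega / np.sqrt(l * (l + 1.0))
    ratio = c / r                      # assumed to decrease outwards
    # np.interp needs increasing abscissae, so reverse both arrays.
    return np.interp(target, ratio[::-1], r[::-1])

# Toy sound-speed profile (assumed for illustration only).
x = np.linspace(0.001, 0.9999, 5000)
r = x * R_SUN
c = 5.0e7 * np.sqrt(1.0 - x**2)

rt_low = turning_point(r, c, 3.0e-3, 1)      # low degree: deep turning point
rt_high = turning_point(r, c, 3.0e-3, 100)   # higher degree: shallower
```

For this toy profile the \(l = 1\) mode at 3 mHz turns close to the centre, while the \(l = 100\) mode is confined to the outer layers, in line with the qualitative behaviour described above.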

The frequency of a given acoustic mode reflects predominantly the structure outside the turning point. The observed modes have degree from 0 to more than 1000, and hence turning points varying from very near the solar centre to immediately below the photosphere. This variation in sensitivity allows the determination of the structure with high resolution in the radial direction. Very crudely, the high-degree modes give information about the near-surface region of the Sun. Given this, modes of slightly lower degree can be used to determine the structure at slightly greater depth, and so on, the analysis continuing to the solar core. Similarly, modes of differing azimuthal order have different extent in latitude, those with \(|m| \simeq l\) being confined near the equator and modes with low |m| extending over all latitudes; thus observation of frequencies as a function of m over a range of degrees allows the determination of, for example, the angular velocity as a function of both latitude and distance from the centre.

For completeness I note that there have also been claims of observed solar oscillations with much longer periods. Such modes would be internal gravity waves, or g modes, with greater sensitivity to conditions in the solar core than the acoustic modes.

With the exception of the region just below the surface, and the atmosphere, solar oscillations can be treated as adiabatic to a very high precision. This approximation is generally used in computations of solar oscillation frequencies. However, nonadiabatic effects in the oscillations are undoubtedly important in the near-surface region, as are the processes that excite the modes. The physical treatment of these effects, involving the interaction between convection and the oscillations, is uncertain, and so therefore are their effects on the oscillation frequencies (for a review, see Houdek and Dupret 2015). Also, the structure of the near-surface region of the model is affected by the uncertain effects of convection, including the general neglect of turbulent pressure (cf. Sect. 2.4).

Such inadequacies in modelling the structure and the oscillations very near the solar surface appear to dominate the differences between the observed frequencies and frequencies of solar models (e.g., Christensen-Dalsgaard 1984a; Dziembowski et al. 1988; Christensen-Dalsgaard et al. 1996). Fortunately, the effect of these near-surface uncertainties on the frequencies in many cases has a relatively simple dependence on the mode frequency and degree. This follows from the fact that the physics of the modes, except at very high degree, in the near-surface layers is insensitive to the degree and so, therefore, is the direct effect of these layers on the oscillation frequencies. This, however, must be corrected for the fact that according to Eq. (56) higher-degree modes involve a smaller fraction of the star and hence are easier to perturb. A quantitative measure of this effect is provided by the mode inertia

$$\begin{aligned} E = {\int _V \rho |{\varvec{\delta }}{\varvec{r}}|^2 \mathrm{d}V \over M |{\varvec{\delta }}{\varvec{r}}|_{\mathrm{phot}}^2 }, \end{aligned}$$
(57)

where the integral is over the volume of the star, \({\varvec{\delta }}{\varvec{r}}\) is the displacement vector, and \(|{\varvec{\delta }}{\varvec{r}}|_{\mathrm{phot}}\) is its norm at the photosphere; it may be shown that the frequency shift from a near-surface modification is proportional to \(E^{-1}\) (e.g., Aerts et al. 2010). It is convenient to take out the frequency dependence of the inertia by considering, instead of E, \(Q = E/{\bar{E}}_0(\omega )\), where \({\bar{E}}_0(\omega )\) is the inertia of a radial mode, interpolated to the frequency \(\omega \) of the mode considered, effectively renormalizing the surface effect to the effect on radial modes. The resulting functional form of the effect on the frequencies of the near-surface uncertainties is reflected by the last term in Eq. (61) below (e.g. Christensen-Dalsgaard 1988b; Aerts et al. 2010). Given the very extensive data available on solar oscillations this property of the frequency differences caused by the near-surface effects to a large extent allows their consequences to be suppressed in the analysis of the observed oscillation frequencies, leading to reliable inferences of the internal structure (e.g., Dziembowski et al. 1990; Däppen et al. 1991; Gough 1996b). For distant stars, however, where only low-degree modes are observed, the surface errors represent a significant source of uncertainty in the analysis of the oscillation frequencies. Various procedures have been developed to suppress these effects in fits to the observed frequencies (e.g. Kjeldsen et al. 2008; Ball and Gizon 2014), or, alternatively, the fits can be based on frequency combinations defined to be largely insensitive to them (Roxburgh and Vorontsov 2003; Otí Floranes et al. 2005).
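
The inertia in Eq. (57) and the ratio \(Q\) can be evaluated numerically once the eigenfunctions are tabulated. The sketch below assumes a spherically symmetric model, with the displacement separated into radial and horizontal amplitudes `xi_r` and `xi_h` such that the angular normalization cancels between numerator and denominator, and takes the last mesh point as the photosphere. The function names and the choice of interpolating \(\log {\bar{E}}_0\) linearly in frequency are illustrative assumptions.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule on a possibly non-uniform mesh."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def mode_inertia(r, rho, xi_r, xi_h, l):
    """Dimensionless mode inertia E of Eq. (57).

    r, rho, xi_r, xi_h are arrays on the same radial mesh; the last
    point is taken to be the photosphere (assumption).
    """
    disp2 = xi_r**2 + l * (l + 1) * xi_h**2        # squared displacement amplitude
    mass = 4.0 * np.pi * trapezoid(rho * r**2, r)  # total mass M of the model
    integral = 4.0 * np.pi * trapezoid(rho * disp2 * r**2, r)
    return integral / (mass * disp2[-1])

def inertia_ratio(E_nl, omega_nl, omega_radial, E_radial):
    """Q = E / E0_bar(omega): inertia relative to radial modes.

    omega_radial must be increasing; log(E_radial) is interpolated
    linearly in frequency (an illustrative choice).
    """
    log_E0 = np.interp(omega_nl, omega_radial, np.log(E_radial))
    return E_nl / np.exp(log_E0)
```

By construction a radial mode evaluated at its own frequency has \(Q = 1\), while modes of higher degree, which involve a smaller part of the star, have \(Q < 1\).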

How errors in the near-surface region affect the oscillation frequencies can be illustrated by the model differences shown in Fig. 31, between a model using the Canuto and Mazzitelli (1991) treatment of convection and Model S which used the Böhm-Vitense (1958) mixing-length treatment. Frequency differences between these two models are shown in Fig. 37. To compensate for the fact that with increasing degree the modes involve a smaller part of the Sun (cf. Eq. 56) the differences have been scaled by the normalized \(Q_{nl}\), as discussed above. The figure clearly shows that with this scaling the frequency differences are indeed largely independent of the degree.

Fig. 37

Frequency differences for modes of degree \(l \le 100\), scaled by the inertia ratio \(Q_{nl}\), between a model emulating the Canuto and Mazzitelli (1991) treatment of near-surface convection and Model S, in the sense (modified model)–Model S. The corresponding model differences are shown in Fig. 31

Clearly an important goal is to understand the structure and oscillation dynamics in the near-surface layers better and eventually model them consistently in the calculation of the oscillation frequencies; in this context the otherwise strongly constrained solar case will serve as an important test. A key aspect is the treatment of convection in the equilibrium model and the oscillations (see also Sect. 2.5). Schlattl et al. (1997) used a detailed atmospheric model and modelled the outer layers of the convection zone by a variable mixing-length parameter matched to a two-dimensional hydrodynamical simulation of convection; they noted that the resulting model matched the observed solar oscillation frequencies better than did the normal model. A similar improvement of the frequencies was obtained by Rosenthal et al. (1995, 1999) and Robinson et al. (2003) by including suitable averages of convection simulations in the modelling (see also Sect. 2.5). Sonoi et al. (2015) and Ball et al. (2016) studied the effect on stellar oscillation frequencies of using averaged simulations as the outer parts of stellar models, for a range of stellar parameters. Magic and Weiss (2016) also considered the patching of averaged simulations to solar models and in addition devised corrections to the depth scale and density in normal one-dimensional models that mimicked the effects on the frequencies of the patching. In addition to normal simulations they carried out simulations with magnetic fields, representing more active areas of the solar surface, determining the effect of the resulting change in the structure of the solar layers on the oscillation frequencies, although without considering the direct effect of the field on the oscillations. The analysis was extended to a broad range of stellar parameters, ranging from the main sequence to the red-giant branch, by Trampedach et al. (2017), who emphasized the importance of both the expansion of the near-photospheric layers by the effect of turbulent pressure and the so-called ‘convective back-warming’, i.e., the effects of the convective fluctuations on the strongly temperature-sensitive opacity. In similar analyses, Sonoi et al. (2017) included also some effects of the perturbation to the turbulent pressure, based on a time-dependent convection formulation restricted to adiabatic oscillations, while Manchon et al. (2018) emphasized the sensitivity of the near-surface frequency shifts to the metallicity of the stars.

An equally important contribution to the deficiencies in the model frequencies is the physics of the oscillations in the near-surface region. Here the energetics of the oscillations, including the perturbations to the convective flux, must be taken into account in fully nonadiabatic calculations, and the perturbation to the turbulent pressure has a significant effect on the frequencies and the damping of the modes. To treat these effects requires a time-dependent modelling of convection (see Houdek and Dupret 2015, for a review). Time-dependent versions of the mixing-length theory were established by Unno (1967) and Gough (1977b) and have been further developed since then. With a few exceptions the nonadiabatic calculations show that the modes are intrinsically damped; they are excited to the observed amplitudes by stochastic forcing from convection, as confirmed by analysis of the observed amplitude distribution (Chaplin et al. 1997). Consequently the observed linewidths in the frequency power spectra provide a measure of the damping rates of the modes, allowing calibration of parameters in the convection modelling such that the computed damping rates match the observed linewidths. Combining results from hydrodynamical simulations of the outer layers with nonadiabatic computations using a non-local time-dependent convection treatment including also the turbulent-pressure perturbation, Houdek et al. (2017), as illustrated in Fig. 38, obtained a much improved fit to the solar observed frequencies, at the same time showing a reasonable fit to the observed damping rates. Analyses of intrinsic or induced oscillations in hydrodynamical simulations are providing further insight into the physics of the interaction between convection and the oscillations (Belkacem et al. 2019; Zhou et al. 2019), which may be used further to improve the simplified treatments based on mixing-length formulations. 
In an interesting analysis, Schou and Birch (2020) determined the frequency correction caused by the effect on the oscillations of convection dynamics by matching eigenfunctions in standard oscillation calculations to eigenfunctions resulting from the convection simulations.

Fig. 38

(Image reproduced with permission from Houdek et al. (2017), copyright by the authors.)

Differences, reduced to the case of radial modes (with \(l = 0\)), between observed and modelled solar oscillation frequencies against frequency, in the sense (Sun)–(Model). The dot-dashed curve uses adiabatic frequencies for a model essentially corresponding to Model S (Christensen-Dalsgaard et al. 1996, see Sect. 4.1). The solid curve is based on a model where the outermost layers were replaced by a suitable average of a three-dimensional radiative-hydrodynamic simulation of convection. In addition, the frequencies were obtained from nonadiabatic calculations taking the interaction with convection, including turbulent pressure, into account.

5.1.2 Investigations of the structure and physics of the solar interior

Very extensive helioseismic data have been acquired over the past decades, from ground-based networks of observatories and from space (for further details, see for example Christensen-Dalsgaard 2002; Aerts et al. 2010). In most cases observations of radial velocity are carried out, based on the Doppler effect, extending over months or years to achieve sufficient frequency resolution, reduce the background noise and follow possible temporal variations in the Sun. Spatially resolved observations are analysed to isolate modes corresponding to a few combinations of \((l, m)\) (footnote 46). From the resulting time series power spectra are constructed through Fourier transform, and the frequencies of solar oscillations are determined from the position of the peaks in the power spectra. Low-degree modes have been studied in great detail through observations in disk-integrated light, observing the Sun as a star, from the BiSON (Chaplin et al. 1996; Hale et al. 2016) and IRIS (Fossat 1991) networks, and with the GOLF instrument on the SOHO spacecraft (Gabriel et al. 1997). Modes of degree up to around 100 were studied for an extended period of time with the LOWL instrument (Tomczyk et al. 1995), extended to the two-station ECHO network, which has now stopped operation. Also, the six-station GONG network (Harvey et al. 1996) has yielded nearly continuous data for modes of degree up to around 150 since late 1995, whereas modes including even higher degrees were studied with the SOI/MDI instrument on SOHO (Scherrer et al. 1995; Rhodes et al. 1997). Since May 2010 these high-resolution observations have been taken over by the HMI instrument on the Solar Dynamics Observatory (Hoeksema et al. 2018), with regular MDI observations ending in April 2011. Detailed analyses of the BiSON low-degree observations were carried out by Broomhall et al. (2009) and Davies et al. (2014), while Larson and Schou (2015, 2018) analysed the MDI and HMI observations for modes of degree up to \(l \approx 300\). At even higher degree the modes lose their individual nature owing to the decreasing separation between adjacent modes and the increasing damping rates; thus the analysis of these modes is affected by systematic errors and interference between the modes (Korzennik et al. 2004; Rabello-Soares et al. 2008). Here special techniques are required for the frequency determination as discussed, e.g., by Reiter et al. (2015) and Reiter et al. (2020), who analysed a 66-day high-resolution set of MDI observations. It should be noted that, according to Eq. (56), these high-degree modes have their lower turning point quite close to the surface; this makes them particularly interesting for the study of the near-surface layers (e.g., Di Mauro et al. 2002), where thermodynamic effects associated with helium and hydrogen ionization become relevant, and where, as discussed above, the properties of the structure and the oscillations are somewhat uncertain. Very extensive high-resolution data are being obtained with HMI, but these have apparently so far not been analysed to determine properties of high-degree modes.
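
The step from a time series to a frequency estimate can be sketched as follows. This is a minimal illustration using a synthetic, undamped sinusoid; the 60 s cadence, the series length, the Hann window and the function names are assumptions chosen for the example, and the simple peak search stands in for the much more careful determinations described above.

```python
import numpy as np

def power_spectrum(signal, dt):
    """One-sided power spectrum of an evenly sampled time series.

    signal : velocity samples; dt : sampling interval in seconds.
    A Hann window is applied to reduce spectral leakage.
    """
    n = len(signal)
    windowed = (signal - np.mean(signal)) * np.hanning(n)
    spectrum = np.fft.rfft(windowed)
    freqs = np.fft.rfftfreq(n, d=dt)      # frequencies in Hz
    return freqs, np.abs(spectrum) ** 2

def peak_frequency(freqs, power):
    """Frequency of the highest peak in the spectrum."""
    return freqs[np.argmax(power)]

# Synthetic test signal: a single 3 mHz oscillation (illustrative values).
dt = 60.0                     # s, assumed cadence
n = 4096                      # number of samples (about 2.8 days)
t = dt * np.arange(n)
nu0 = 3.0e-3                  # Hz
velocity = np.sin(2.0 * np.pi * nu0 * t)

freqs, power = power_spectrum(velocity, dt)
nu_peak = peak_frequency(freqs, power)
resolution = 1.0 / (n * dt)   # frequency resolution set by the series length
```

The frequency resolution is the inverse of the duration of the series, which is why observations extending over months or years are needed to separate closely spaced modes.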

Owing to their great potential for helioseismic investigations the g modes have been the target of major observational efforts. García et al. (2007) inferred the presence of g modes with the expected nearly uniform period spacing from periodicities in the power spectrum of GOLF observations. However, a review by Appourchaux et al. (2010) found that the attempts up to that point to detect g modes were inconclusive. Recently, Fossat et al. (2017) claimed evidence for g modes of degree \(l = 1\) and 2 through an ingenious and complex analysis of the spacing between solar acoustic low-degree modes observed with GOLF. In a follow-up study Fossat and Schmider (2018) extended this to modes of degree 3 and 4. Interestingly, the results indicated a rapid rotation of the solar core, possibly at variance with the results obtained from the analysis of solar acoustic modes (see Fig. 44 below). However, Schunker et al. (2018), repeating the analysis, found that the results were very sensitive to the details of the fits, including the assumed starting time of the time series of observations. A similar sensitivity to the details of the analysis was found by Appourchaux and Corbard (2019), analysing a recalibrated version of the GOLF data (Appourchaux et al. 2018); on this basis they concluded that the results of Fossat et al. (2017) and Fossat and Schmider (2018) were artefacts of the methodology. Also, the physical effects that might introduce the g-mode signal in the acoustic-mode properties are so far unclear. Indeed, although already Kennedy et al. (1993) proposed this type of analysis they noted that the coupling between the modes is such that to leading order the p-mode frequencies are insensitive to g modes of odd degree (see also Gough 1993), in conflict with the inferences of Fossat et al. This was analysed in more detail by Böning et al. (2019) and Scherrer and Gough (2019). Furthermore, Scherrer and Gough confirmed and extended the results of Schunker et al. (2018) and tried, and failed, to find a similar signal in the MDI and HMI data; they also noted that the inferred rapid rotation of the solar core is difficult to reconcile with the constraints obtained from extensive analyses of well-observed solar acoustic modes (see Sect. 5.1.4). Thus the evidence for solar g modes remains uncertain, and I shall not consider them further in this review.

From Eq. (56) it follows that acoustic modes of low degree penetrate to the stellar core. This is particularly important for investigations of distant stars, where only low-degree modes are observed (see Sect. 7), but low-degree acoustic modes have also been important for the study of the solar core, not least in connection with the solar neutrino problem (e.g., Elsworth et al. 1990, see also Sect. 5.2). The cyclic frequencies \(\nu _{nl} = \omega _{nl}/2\pi \) of these modes satisfy the asymptotic relation (Tassoul 1980; Gough 1993)

$$\begin{aligned} \nu _{nl} \approx \varDelta \nu \left( n + {l \over 2} + \varepsilon \right) - d_{nl}, \end{aligned}$$
(58)

where the large frequency separation

$$\begin{aligned} \varDelta \nu = \left( 2 \int _0^R {\mathrm{d}r \over c} \right) ^{-1} \end{aligned}$$
(59)

is the inverse of the acoustic travel time across a stellar diameter and \(\varepsilon \) is a frequency-dependent phase related to the near-surface layers. Thus to leading order the frequencies are uniformly spaced in radial order, with degeneracy between modes with the same \(n + l/2\). This degeneracy is lifted by the small correction term \(d_{nl}\), leading to the small frequency separations

$$\begin{aligned} \delta \nu _{nl} = \nu _{nl} - \nu _{n-1\,l+2} \simeq - (4 l + 6) {\varDelta \nu \over 4 \pi ^2 \nu _{nl}} \int _0^R {\mathrm{d}c \over \mathrm{d}r} {\mathrm{d}r \over r}. \end{aligned}$$
(60)

Since the integral is strongly weighted towards the stellar centre, \(\delta \nu _{nl}\) is a useful diagnostic for the properties of the stellar core, including stellar age (e.g., Christensen-Dalsgaard 1984b, 1988a; Ulrich 1986, see also Sect. 6.2).
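
Equations (58)–(60) translate directly into a few lines of code. In the sketch below the function names, the dictionary layout of the frequency table and the numerical values \(\varDelta \nu = 135\,\mu \)Hz and \(\varepsilon = 1.5\) are illustrative assumptions; the values are roughly solar but are not fitted to data.

```python
import numpy as np

def large_separation(r, c):
    """Delta nu = (2 * int_0^R dr / c)^(-1), Eq. (59), by the trapezoidal rule.

    r : radial mesh from centre to surface; c : sound speed on that mesh.
    """
    integrand = 1.0 / c
    travel_time = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))
    return 1.0 / (2.0 * travel_time)

def asymptotic_frequency(n, l, delta_nu, epsilon, d_nl=0.0):
    """Cyclic frequency from the asymptotic relation, Eq. (58)."""
    return delta_nu * (n + l / 2 + epsilon) - d_nl

def small_separation(nu, n, l):
    """delta nu_{nl} = nu_{n,l} - nu_{n-1,l+2} (left-hand side of Eq. 60).

    nu is a dictionary of frequencies keyed by (n, l).
    """
    return nu[(n, l)] - nu[(n - 1, l + 2)]

# Illustrative values (assumptions, in microhertz).
DELTA_NU = 135.0
EPSILON = 1.5

# With d_nl = 0 the modes (n, l) and (n - 1, l + 2) are degenerate.
nu_leading = {
    (20, 0): asymptotic_frequency(20, 0, DELTA_NU, EPSILON),
    (19, 2): asymptotic_frequency(19, 2, DELTA_NU, EPSILON),
}
print(small_separation(nu_leading, 20, 0))
```

With \(d_{nl} = 0\) the small separation vanishes, which makes explicit that any observed non-zero \(\delta \nu _{nl}\) measures the correction term and hence the conditions in the core.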

The extensive sets of observed solar oscillation frequencies make possible detailed inferences of the properties of solar structure, through inverse analyses of the observations. Reviews of such inversion techniques were given by, for example, Gough and Thompson (1991), Gough (1996b), Basu and Antia (2008) and Basu (2016). Assuming adiabatic oscillations, the frequencies are determined by the dependence of pressure, density and gravity on r, as well as on \(\varGamma _1\) which relates the perturbations to pressure and density. However, given that the solar model satisfies the equations of hydrostatic support and mass, Eqs. (1) and (2), the mass m and p can be computed once \(\rho (r)\) is specified. It follows that the adiabatic oscillation frequencies are fully defined if \((\rho (r), \varGamma _1(r))\) is specified. Alternatively equivalent pairs can be used; given that the frequencies of acoustic modes are predominantly determined by the sound speed, convenient choices are \((c^2, \rho )\) or \((u, \varGamma _1)\), \(u = p/\rho \) being the squared isothermal sound speed.
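
The statement that \(m\) and \(p\) follow from \(\rho (r)\) can be made concrete with a short numerical sketch. It assumes that Eqs. (1) and (2) take the usual forms \(\mathrm{d}p/\mathrm{d}r = - G m \rho /r^2\) and \(\mathrm{d}m/\mathrm{d}r = 4 \pi r^2 \rho \), that the mesh starts at the centre (`r[0] = 0`), and that the surface pressure is specified as a boundary condition; the function name is hypothetical.

```python
import numpy as np

G = 6.674e-11   # gravitational constant, SI units

def mass_and_pressure(r, rho, p_surface=0.0):
    """Integrate the mass and hydrostatic equations for a given rho(r).

    r must start at the centre (r[0] = 0) and end at the surface.
    Returns the mass m(r) interior to r and the pressure p(r).
    """
    # dm/dr = 4 pi r^2 rho, integrated outwards from the centre.
    shell = 4.0 * np.pi * rho * r**2
    dm = 0.5 * (shell[1:] + shell[:-1]) * np.diff(r)
    m = np.concatenate(([0.0], np.cumsum(dm)))

    # Gravitational acceleration, with g = 0 at the centre.
    g = np.zeros_like(r)
    g[1:] = G * m[1:] / r[1:]**2

    # dp/dr = -rho g, integrated inwards from the surface.
    weight = rho * g
    dp = 0.5 * (weight[1:] + weight[:-1]) * np.diff(r)
    p = p_surface + np.concatenate((np.cumsum(dp[::-1])[::-1], [0.0]))
    return m, p
```

For a sphere of uniform density the result can be compared with the analytical central pressure \(p_{\mathrm{c}} = (2/3) \pi G \rho ^2 R^2\), which provides a simple check of the integration.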

It was demonstrated by Gough (1984a) that a simple asymptotic relation for the frequencies, first found by Duvall (1982), forms the basis for an approximate inversion for the solar sound speed; this was used for the first inferences of the solar internal sound speed by Christensen-Dalsgaard et al. (1985). Such asymptotic techniques have been further developed by, for example, Christensen-Dalsgaard et al. (1989), Vorontsov and Shibahashi (1991) and Marchenkov et al. (2000).

Alternatively, as originally noted by Gough (1978a) based on similar techniques in geophysics, a linearized relation that does not depend on the asymptotic properties can be established between corrections to the structure of a solar model, for example characterized by differences at fixed r \((\delta _r c^2, \delta _r \rho )\) between the Sun and the model, and the corresponding frequency differences. This is based on the fact that the oscillation frequencies satisfy a variational principle (e.g., Chandrasekhar 1964), such that the frequency corrections are independent of corrections to the eigenfunctions, to leading order. However, the analysis must also take into account the inadequacies of the modelling of the near-surface layers discussed above. As a result, the relative frequency differences can be written as

$$\begin{aligned} {\delta \omega _{nl} \over \omega _{nl}} = \int _0^{R_{\mathrm{s}}} \left[ K_{c^2, \rho }^{nl}(r) {\delta _r c^2 \over c^2}(r) + K_{\rho ,c^2}^{nl}(r) {\delta _r \rho \over \rho }(r) \right] \mathrm{d}r + Q_{nl}^{-1} {{\mathcal {F}}}(\omega _{nl}), \end{aligned}$$
(61)

where the kernels \(K_{c^2, \rho }^{nl}\) and \(K_{\rho , c^2}^{nl}\) are obtained from the reference solar model, and \(R_{\mathrm{s}}\) is the surface radius of the model. The last term takes into account differences between the model and the Sun resulting from the inadequate modelling of the superficial layers and their effects on the oscillations; here \(Q_{nl}\) is the mode inertia, normalized to the value for a radial mode at the same frequency, and \({{\mathcal {F}}}\) is a function of frequency characterizing these near-surface effects. To these relations must be added a constraint on the density difference resulting from the fact that the total mass of the model must be kept fixed; this can be expressed as

$$\begin{aligned} 0 = \int _0^{R_{\mathrm{s}}} {\delta _r \rho \over \rho } \rho r^2 \mathrm{d}r, \end{aligned}$$
(62)

(as noted already in Eq. 53; see also the related discussion of the model differences in Fig. 25), which is formally of the same form as Eq. (61). Thus this relation can be included directly in the analysis. With a sufficiently extensive set of observed modes the relations in Eq. (61) can be analysed to infer measures of the model differences. Various techniques have been developed to carry out inversions for the structure differences (see footnote 47; e.g., Gough 1978b, 1985; Dziembowski et al. 1990; Gough and Kosovichev 1990; Dziembowski et al. 1994; Antia 1996; Basu and Thompson 1996). In all cases the techniques are characterized by trade-off parameters which determine the balance between the desired error and resolution of the inferences, as well as the weight given to the suppression of unwanted contributions to the results; in inferring the differences in sound speed, for example, the so-called cross term, i.e., the contribution from the density differences, must be minimized. The technical details of the various inversion techniques were reviewed by Basu (2016), while Rabello-Soares et al. (1999) provided an analysis of the commonly used technique of optimally localized averages, including the appropriate choice of the required parameters.
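
The role of a trade-off parameter can be illustrated with a toy linear inversion. The sketch below uses regularized least squares with a second-difference smoothing penalty, which is a simpler stand-in for, and not the same as, the optimally localized averages mentioned above. The Gaussian kernels, the 'true' difference profile, the alternating data perturbation and all names are assumptions made for the example; the surface term and the cross term of Eq. (61), and the mass constraint of Eq. (62), are ignored.

```python
import numpy as np

def second_difference(n):
    """Matrix approximating the second derivative on n mesh points."""
    L = np.zeros((n - 2, n))
    for i in range(n - 2):
        L[i, i:i + 3] = (1.0, -2.0, 1.0)
    return L

def rls_inversion(A, d, mu):
    """Minimize |A f - d|^2 + mu |L f|^2 for the discretized profile f.

    A : (modes x mesh) matrix of kernels times mesh spacing
    d : data vector (e.g. relative frequency differences)
    mu: trade-off parameter; larger mu gives a smoother solution
    """
    L = second_difference(A.shape[1])
    return np.linalg.solve(A.T @ A + mu * (L.T @ L), A.T @ d)

# Toy problem (all values are illustrative assumptions).
n_grid, n_modes = 50, 30
r = np.linspace(0.0, 1.0, n_grid)
dr = r[1] - r[0]
centres = np.linspace(0.05, 0.95, n_modes)
A = np.exp(-0.5 * ((r[None, :] - centres[:, None]) / 0.1) ** 2) * dr
f_true = 0.01 * np.sin(2.0 * np.pi * r)
d = A @ f_true + 1.0e-5 * (-1.0) ** np.arange(n_modes)   # deterministic 'noise'

L = second_difference(n_grid)
f_smooth = rls_inversion(A, d, mu=1.0e-2)   # strong smoothing
f_rough = rls_inversion(A, d, mu=1.0e-8)    # weak smoothing

def roughness(f):
    return float(np.linalg.norm(L @ f))

def misfit(f):
    return float(np.linalg.norm(A @ f - d))
```

Increasing `mu` lowers the roughness of the solution at the cost of a larger misfit to the data; this is the same kind of balance that the trade-off parameters of the actual helioseismic techniques control.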

Although the oscillation frequencies depend predominantly on the sound speed it is also possible to carry out inversions to infer, for example, the density difference between the Sun and the model. It should be noticed, however, that the sound-speed and density inferences are not independent: given the assumed constraints of hydrostatic equilibrium and mass equation, determination of corrections to the hydrostatic structure in terms of \(\delta _r \rho \) and \(\delta _r u\) are in principle equivalent; indeed, Dziembowski et al. (1990) pointed out how \(\delta _r p\) and hence \(\delta _r \rho \) can be determined directly from \(\delta _r u\). Since \(c^2 = \varGamma _1 u\) and \(\varGamma _1 \simeq 5/3\) in most of the Sun, the inferences of \(\delta _r u\) and \(\delta _r c^2\) are also closely related.

As an example of the results on solar structure, Fig. 39 shows inferred sound-speed and density differences between the Sun and Model S (see Sect. 4.1 and Christensen-Dalsgaard et al. 1996) with the Grevesse and Noels (1993) heavy-element composition, as well as for Model [OPAL96], an updated version of this model, with the same composition but using the Iglesias and Rogers (1996) opacity tables with a slightly reduced opacity near the base of the convection zone and hence a reduced sound speed and an increased sound-speed difference relative to the Sun. This model was compared with Model S in Fig. 26. In addition, results are shown for Model [GS98] with the Grevesse and Sauval (1998) composition where the opacity is further somewhat reduced and the sound-speed difference to the Sun increased; the effects on the model of the change in composition were illustrated in the comparison with Model [Surf. opac.], very similar to Model [OPAL96], in Fig. 29. The properties of the models are summarized in Tables 1, 2 and 3. The analysis (see Basu et al. 1997) used a combination of LOWL and BiSON frequencies. Inversion was carried out by means of a technique of optimally localized averages, which explicitly characterizes the inferred quantities as averages of the differences with well-defined localized weight functions, the so-called averaging kernels; the widths of these provide a measure of the resolution of the inversion (see Basu 2016, for details). Also, the errors in the inferred differences are calculated from the quoted errors in the observed frequencies. The differences between the Sun and the models may be considered as relatively small, although very significant compared with the inferred errors, and highly systematic. In particular, the observational errors are much smaller than the effects of the relatively modest modification of the opacity tables illustrated by the dashed curve. In common with the model differences discussed in Sect. 4.2 the density differences are substantially larger than the sound-speed differences. Also, it is interesting that the differences are approximately constant in the convection zone, in accordance with Eq. (44). Independent analyses of other datasets (e.g., Gough et al. 1996; Kosovichev et al. 1997; Turck-Chièze et al. 1997; Couvidat et al. 2003) have yielded very similar results, when applied to the same reference models. As illustrated in Fig. 40a a model that does not include diffusion and settling of helium and heavy elements results in a much larger difference relative to the Sun (see also Christensen-Dalsgaard et al. 1993, and Fig. 36). It is striking that the old Model S, with some problems with the input physics, fortuitously yields the best agreement with the inferred solar structure. Lest this relatively good agreement between the Sun and the models lead to complacency, I note that the revised abundances obtained by, e.g., Asplund et al. (2009) cause much more dramatic effects on the comparison; these are discussed in Sect. 6.

Fig. 39

Results of helioseismic inversions. The symbols show inferred relative differences in squared sound speed (top) and density (bottom) between the Sun and the original Model S (Christensen-Dalsgaard et al. 1996, see also Sect. 4.1), in the sense (Sun)–(model); this uses the Grevesse and Noels (1993) (GN93) heavy-element composition and the Rogers and Iglesias (1992) OPAL opacity tables. The vertical bars show \(1\,\sigma \) errors in the inferred values, based on the errors, assumed statistically independent, in the observed frequencies. The horizontal bars extend from the first to the third quartile of the averaging kernels, to provide a measure of the resolution of the inversion (see Basu 2016). The dashed curves show results for the similar Model [OPAL96], which used the Iglesias and Rogers (1996) tables, whereas the dot-dashed curves are for Model [GS98], where the Grevesse and Sauval (1998) composition was used

Fig. 40

(a) As in Fig. 39 the symbols show inferred difference in squared sound speed between the Sun and the original Model S, in the sense (Sun)–(model). The dashed curve shows results for Model [GS98], using the Grevesse and Sauval (1998) composition, while the solid curve is for Model [No diff.], similar to Model S but neglecting diffusion and settling. (b) Relative differences between inferred solar squared sound speed \(c_\odot ^2\) obtained by correcting the reference model values with the helioseismically inferred differences (cf. Eq. 63). The solid curve compares the result obtained using the non-diffusive Model [No diff.] as reference with the result of using Model S, in the sense (Model [No diff.])–(Model S); the standard deviations were obtained by combining in quadrature the standard deviations inferred in the two inversions. The dashed curve similarly compares the Model [GS98] inference with that from Model S

For high-degree modes the near-surface effects can no longer be regarded as independent of degree. Consequently, in Eq. (61) \({{\mathcal {F}}}(\omega )\) must be replaced by an expansion in \({\tilde{w}} = (l+ 1/2)/\omega \), with \({{\mathcal {F}}}\) as the leading l-independent term (Gough and Vorontsov 1995). Di Mauro et al. (2002) implemented inversion techniques to take the expansion in \({\tilde{w}}\) into account and applied it to early high-degree observations from MDI. In fact, Reiter et al. (2015, 2020) carried out inverse analyses including high-degree modes, using Model S as a reference, and noted a substantial excess of the model sound speed within the upper 5% of the model (a tendency already hinted at in Fig. 39). However, since the analysis did not include the l-dependent terms in the near-surface correction the result should probably be regarded as preliminary.

Basu et al. (2000) carried out a detailed analysis of the various sources of uncertainties in the helioseismic inferences of solar internal properties, including the effects of different choices of observational data or reference models. They found, for example, that the sound-speed structure resulting from applying the inferred sound-speed difference to the reference model depended relatively little on the assumed reference model, within a reasonable range of models. Thus in this sense the analysis provides a robust determination of the solar internal sound speed. To illustrate this, Fig. 40b shows differences between solar squared sound speeds, reconstructed from the model sound speed and the helioseismically inferred sound-speed difference as

$$\begin{aligned} c_\odot ^2 = c_{\mathrm{mod}}^2\left( 1 + {\delta _r c^2 \over c^2} \right) , \end{aligned}$$
(63)

where \(c_{\mathrm{mod}}^2\) is the squared sound speed in the reference model. The figure shows differences of two such reconstructions relative to the reconstruction based on Model S, which is the model amongst those considered so far that most closely resembles the Sun (footnote 48). Even for the non-diffusive model (solid line), which shows a relatively substantial difference from the Sun, the departure from the Model S reconstruction is less than 0.1% in most of the Sun, and for the Grevesse and Sauval (1998) model (dashed line) the departure is much smaller. Apart from the uncertain central and near-surface regions the largest departures are found just below the convection zone, caused by the sharp gradients in sound-speed differences between the models in this region, which are not fully resolved owing to the finite resolution of the inversion.
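
The reconstruction in Eq. (63), and the comparison of two reconstructions of the kind shown in Fig. 40b, amount to the following arithmetic. The function names and the numbers in the usage example are hypothetical; the example is idealized in that both inversions are assumed to recover the difference from the Sun exactly.

```python
import numpy as np

def reconstruct_sound_speed_sq(c2_model, rel_diff):
    """Eq. (63): c_sun^2 = c_mod^2 * (1 + delta_r c^2 / c^2).

    c2_model : squared sound speed of the reference model
    rel_diff : inferred relative difference, in the sense (Sun) - (model)
    """
    return np.asarray(c2_model) * (1.0 + np.asarray(rel_diff))

def reconstruction_difference(c2_a, rel_a, c2_b, rel_b):
    """Relative difference between reconstructions from references A and B."""
    sun_a = reconstruct_sound_speed_sq(c2_a, rel_a)
    sun_b = reconstruct_sound_speed_sq(c2_b, rel_b)
    return (sun_a - sun_b) / sun_b

# Idealized usage: two reference models differing from the same 'Sun'
# by 0.2% and 1%, with both differences recovered exactly (assumption).
c2_sun = np.array([2.5e11, 1.0e11, 2.0e10])   # illustrative values, m^2 s^-2
rel_a = np.full(3, 0.002)
rel_b = np.full(3, 0.010)
c2_a = c2_sun / (1.0 + rel_a)
c2_b = c2_sun / (1.0 + rel_b)
print(reconstruction_difference(c2_a, rel_a, c2_b, rel_b))
```

In this idealized case the two reconstructions agree to rounding error; in a real inversion the residual differences arise from the finite resolution and the errors of the inferences, as discussed above.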

It is fairly common (e.g., Degl’Innocenti et al. 1997; Yang 2016) to compare solar models with an existing reconstructed solar sound speed computed as in Eq. (63), based on some inversion; in this case the choice of reference model in the original inversion clearly affects the comparison and hence enters as a component in the error in the inferred sound-speed difference. It is then important that the selection of reference models included in the estimate of that error is realistic (the error would obviously be overestimated by including, for example, models without diffusion and settling). On the other hand, in the analyses in the present paper and in Vinyoles et al. (2017), for example, the differences between the Sun and a model are inferred directly by using the model as reference for a helioseismic inverse analysis; in this case the results provide a direct estimate of the difference between the Sun and the specific model subject to the observational error, the finite resolution of the inversion and the success in suppressing the cross term and the surface contribution, but without involving contributions to the error from the choice of reference model. Even so, Vinyoles et al. did include in their error analysis a contribution obtained from the dispersion of inferences of the solar sound speed based on a set of reference models from Bahcall et al. (2006), varying the composition and other model parameters within the relevant errors. Further investigations of the error estimates, in particular the effect of error correlations, in solar structure inversions are certainly warranted.

The most noticeable feature of the sound-speed difference in Fig. 39 is the peak just below the convection zone. This is a region of a strong gradient in the hydrogen abundance caused by helium settling (cf. Fig. 18). Consequently, the difference can be reduced by partial mixing of the region; this would increase the hydrogen abundance, and hence decrease the mean molecular weight and increase the sound speed (cf. Eqs. 13 and 55). Such mixing might be induced by instabilities associated with the strong gradient in the angular velocity in this so-called tachocline (cf. Fig. 44) (e.g., Brun et al. 1999; Elliott and Gough 1999; Brun et al. 2002; Christensen-Dalsgaard and Di Mauro 2007). Evidence for partial mixing has also been obtained from inverse analyses designed to infer the composition structure of the solar interior (Antia and Chitre 1998; Takata and Shibahashi 2003). Christensen-Dalsgaard et al. (2018) demonstrated that by imposing a combination of a suitable modification of the opacity and suitable diffusive mixing the sound-speed difference can be strongly reduced and the peak in the tachocline region essentially removed; the resulting inferred difference in the squared sound speed is illustrated in Fig. 41. Such additional mixing is also implied by the partial destruction of lithium (see Sect. 5.3). The negative difference in the outer part of the core could be similarly reduced by partial mixing of this region, which would decrease the hydrogen abundance; it is not obvious, however, that any realistic mechanism is available which may cause such mixing.
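The chain of reasoning above (more hydrogen, lower mean molecular weight, higher sound speed) can be made concrete with a rough estimate. The sketch below assumes a fully ionized ideal gas, for which \(\mu \simeq 4/(3 + 5X - Z)\) and \(c^2 \propto T/\mu\), and compares two hydrogen abundances at fixed temperature. The abundance values and function names are illustrative assumptions, not taken from the models discussed here, and in a real model the temperature would adjust as well.

```python
def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas (assumed approximation).

    Uses 1/mu = 2X + 3Y/4 + Z/2 with Y = 1 - X - Z, i.e. mu = 4/(3 + 5X - Z).
    """
    return 4.0 / (3.0 + 5.0 * X - Z)


def relative_c2_change(X_before, X_after, Z):
    """Relative change in squared sound speed at fixed temperature.

    Assumes c^2 proportional to T/mu (ideal gas), so only mu changes.
    """
    return mean_molecular_weight(X_before, Z) / mean_molecular_weight(X_after, Z) - 1.0


# Hypothetical abundances: mixing raises X from 0.70 to 0.71 at fixed Z.
mu_before = mean_molecular_weight(0.70, 0.02)
mu_after = mean_molecular_weight(0.71, 0.02)
dc2 = relative_c2_change(0.70, 0.71, 0.02)
print(f"mu: {mu_before:.4f} -> {mu_after:.4f}, delta c^2 / c^2 = {dc2:+.4f}")
```

Under these assumptions a change of 0.01 in the hydrogen abundance shifts the squared sound speed by somewhat less than one per cent, which is large compared with the differences at the level of a few tenths of a per cent typically at issue in the inversions.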

Fig. 41

(Adapted from Christensen-Dalsgaard et al. 2018.)

The symbols show the inferred difference in squared sound speed between the Sun and a model with a suitably adjusted representation of turbulent diffusion beneath the convection zone (Christensen-Dalsgaard et al. 2018); for comparison the dashed curve shows a model corresponding to the original Model S. The models are based on the Asplund et al. (2009) composition, but with an opacity modification to restore the original Model S (see Christensen-Dalsgaard and Houdek 2010, and Fig. 58a below).

A detailed investigation of the solar internal structure was carried out by Basu et al. (2009). They used a combination of very extensive data on low-degree modes from the BiSON network, carefully corrected for the variations caused by the solar cycle, with data from the MDI experiment on SOHO; to test the sensitivity of the result to the assumed data they also considered other combinations of BiSON and MDI data. The analysis was carried out relative to several reference models, including Model S considered also in Fig. 39. Figure 42 shows the resulting sound-speed and density differences. The results are clearly similar to those shown for Model S in Fig. 39, obtained with a different dataset; indeed, Basu et al. found that the inferred sound-speed differences showed only minor dependence on the assumed dataset, although the detailed results in the core depended slightly on the choice of low-degree data. The main difference is in the convection zone where significant differences, increasing in magnitude towards the surface, are found. These are probably caused by residual effects in the treatment of the unavoidable errors in the modelling of the near-surface layers; the data used in Fig. 39 did not include modes of degree above 100 and hence were less sensitive to the near-surface structure of the Sun. I note that according to Eq. (43) we expect the sound speed in the bulk of the convection zone largely to match the solar sound speed, assuming that the model has the correct surface gravity, as was indeed found in the model comparisons in Sect. 4.2. Thus the inferred sound-speed differences between the Sun and the model in the convection zone further indicate problems with the inversion in the outer parts of the Sun, including incomplete suppression of the near-surface inadequacies in the modelling.

Fig. 42

(Adapted from Basu et al. 2009)

Inferred relative differences in squared sound speed (a) and density (b) between the Sun and Model S of Christensen-Dalsgaard et al. (1996), in the sense (Sun)–(model); the inversion is based on a combination of 14 years of low-degree data from BiSON observations, corrected for solar-cycle frequency variations, and data from the MDI experiment on the SOHO spacecraft. The vertical bars show \(1\,\sigma \) errors in the inferred values, based on the errors, assumed statistically independent, in the observed frequencies. The horizontal bars provide a measure of the resolution of the inversion.

In the case of density Basu et al. found substantial sensitivity, throughout the Sun, of the results to the choice of dataset. They pointed out that this is related to the constraint in Eq. (62). The density difference in the core is relatively weakly constrained by the observations and hence differs substantially between the inversion results for the different datasets; however, because of the constraint on the mass, any change in the density difference in the core has to be compensated in the rest of the model. This effect is further enhanced by the fact that the density is much higher in the core, requiring a proportionally larger relative change in the outer parts of the model.
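The compensation argument can be written out explicitly. As a sketch, assuming only that the total mass and the radius are the same for the Sun and the model (the two-zone split and the symbols \(M_{\mathrm{c}}\), \(M_{\mathrm{e}}\), \(\epsilon _{\mathrm{c}}\) and \(\epsilon _{\mathrm{e}}\) are introduced here purely for illustration), the density difference \(\delta _r \rho \) must satisfy

\[ \int _0^R \delta _r \rho \, r^2 \, \mathrm{d}r = \int _0^R \frac{\delta _r \rho }{\rho } \, \rho \, r^2 \, \mathrm{d}r = 0 . \]

If the relative difference is taken to be roughly constant, \(\epsilon _{\mathrm{c}}\), in a core of mass \(M_{\mathrm{c}}\) and \(\epsilon _{\mathrm{e}}\) in the remainder of mass \(M_{\mathrm{e}}\), this reduces to

\[ \epsilon _{\mathrm{c}} M_{\mathrm{c}} + \epsilon _{\mathrm{e}} M_{\mathrm{e}} = 0 , \qquad \epsilon _{\mathrm{e}} = - \epsilon _{\mathrm{c}} \frac{M_{\mathrm{c}}}{M_{\mathrm{e}}} . \]

Because the relative difference is weighted by \(\rho r^2\), a poorly constrained change in the dense core carries a large weight and forces a compensating change of opposite sign throughout the outer parts.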

The preceding discussion of helioseismic inversion implicitly assumed that the solar radius \(\,R_\odot \) was known, as indicated in Eq. (61). In fact, as discussed in Sect. 2.2 there have been several independent and not fully consistent determinations of \(\,R_\odot \); using an incorrect estimate could yield systematic errors in the inversion results. A preliminary analysis of these issues was carried out by Takata and Gough (2001, 2003), including ways to improve the determination of \(\,R_\odot \) as part of the inversion process. They found that the effects of errors in the assumed radius were small, but not quite insignificant compared with the statistical error in the inferences.

Given the seismically determined sound speed and density in the Sun one may construct a seismic model, i.e., a model that is consistent with the seismic results. The first step is to reconstruct relevant aspects of solar structure from seismically inferred differences, as in Eq. (63), with suitable extrapolation to the regions of the Sun not covered by the inversions. Additional properties, such as the mass distribution and pressure, can be obtained by invoking the equations of mass and hydrostatic equilibrium, Eqs. (1) and (2). This process may be iterated, by using the thus reconstructed model as reference for a new inversion (e.g., Antia 1996; Buldgen et al. 2020). Additional properties of the model can be constrained by further equations of stellar structure combined with suitably chosen aspects of the physics of the stellar interior (Antia and Chitre 1997; Takata and Shibahashi 1998). Such a model, based on using Model S as reference, was presented by Gough and Scherrer (2001), with the underlying analysis and further details discussed by Gough (2004). The analysis constrained the composition and temperature structure based on the nuclear energy-generation rate. Given the inferred local luminosity and temperature, the opacity could be estimated in the radiative region from Eqs. (4) and (5). Interestingly, the differences between the resulting inferred opacity and the model opacity (based essentially on the model heavy-element abundance, with the Grevesse and Noels (1993) composition) were at most around 1.5%.
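The step of recovering the mass distribution and pressure from a given density profile can be illustrated as follows. This is only a minimal sketch: it assumes the usual forms \(\mathrm{d}m/\mathrm{d}r = 4\pi r^2\rho \) and \(\mathrm{d}p/\mathrm{d}r = -Gm\rho /r^2\) for the equations of mass and hydrostatic equilibrium, uses simple trapezoidal integration with \(p = 0\) at the surface, and replaces the seismically inferred density by a hypothetical analytic profile \(\rho = \rho _{\mathrm{c}}(1 - r^2/R^2)\) so that the result can be checked. The function and variable names, and the nominal solar constants, are illustrative assumptions.

```python
import math

G = 6.674e-8       # gravitational constant, cgs
M_SUN = 1.989e33   # assumed solar mass, g
R_SUN = 6.957e10   # assumed solar radius, cm


def integrate_structure(r, rho):
    """Return mass m(r) and pressure p(r) for a density profile rho(r).

    Assumes r[0] == 0, r increasing, and zero pressure at the outermost point.
    """
    n = len(r)
    m = [0.0] * n
    for i in range(1, n):  # outward integration of dm/dr = 4 pi r^2 rho
        f0 = 4.0 * math.pi * r[i - 1] ** 2 * rho[i - 1]
        f1 = 4.0 * math.pi * r[i] ** 2 * rho[i]
        m[i] = m[i - 1] + 0.5 * (f0 + f1) * (r[i] - r[i - 1])

    def weight(i):  # G m rho / r^2, which tends to zero at the centre
        return G * m[i] * rho[i] / r[i] ** 2 if r[i] > 0.0 else 0.0

    p = [0.0] * n
    for i in range(n - 2, -1, -1):  # inward integration of dp/dr = -G m rho / r^2
        p[i] = p[i + 1] + 0.5 * (weight(i) + weight(i + 1)) * (r[i + 1] - r[i])
    return m, p


# Hypothetical test profile, normalized so that the total mass is M_SUN.
rho_c = 15.0 * M_SUN / (8.0 * math.pi * R_SUN ** 3)
n_points = 2001
r = [R_SUN * i / (n_points - 1) for i in range(n_points)]
rho = [rho_c * (1.0 - (x / R_SUN) ** 2) for x in r]

m, p = integrate_structure(r, rho)
print(f"total mass / M_sun = {m[-1] / M_SUN:.6f}")
print(f"central pressure = {p[0]:.3e} dyn cm^-2")
```

For this particular profile the central pressure is known analytically, \(p_{\mathrm{c}} = 4\pi G\rho _{\mathrm{c}}^2 R^2/15\), which provides a check on the integration. In an actual seismic model the density would instead come from the inversion, with extrapolation to the regions it does not cover.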

5.1.3 Specific aspects of the solar interior

In addition to the general behaviour of the internal solar sound speed and density, more specific aspects of the solar internal structure can be inferred. Already the early asymptotic sound-speed inversion by Christensen-Dalsgaard et al. (1985) showed indications of the location of the base of the convection zone. Further analyses have yielded tighter constraints on this point, understood as the location where the thermal gradient becomes substantially subadiabatic. Christensen-Dalsgaard et al. (1991) determined the depth \(d_{\mathrm{cz}}\) of the convection zone as \(d_{\mathrm{cz}} = (0.287 \pm 0.003) R\), a value confirmed by Kosovichev and Fedorova (1991). A very similar value, but with even higher precision, was determined by Basu and Antia (1997) and Basu (1998). Further information about conditions near the base of the convection zone can be inferred from analysis of an oscillatory behaviour in the frequencies induced by the relatively rapid variations in solar structure in this region; this corresponds to a so-called acoustic glitch, where the structure varies on a scale small compared with the local wavelength of the acoustic waves (e.g., Hill and Rosenwald 1986; Gough and Thompson 1988; Vorontsov 1988; Gough 1990a). In particular, with the normal treatment of convection the second derivative of sound speed is essentially discontinuous here; also, simple models of convective overshoot (e.g., Zahn 1991) predict a nearly adiabatic extension of the temperature gradient beneath the unstable region, followed by an essentially discontinuous jump to the radiative gradient, and hence a stronger variation in the sound speed. From analysis of the oscillatory frequency variations associated with this region the extent of such overshoot has been limited to a small fraction of a pressure scale height (e.g., Basu et al. 1994; Basu and Antia 1994; Monteiro et al. 1994; Roxburgh and Vorontsov 1994; Christensen-Dalsgaard et al. 1995). 
More detailed modelling of overshoot (e.g., Rempel 2004; Rogers et al. 2006; Xiong and Deng 2001) yields a smoother transition, possibly including a slightly subadiabatic region in the lower parts of the convection zone. Such overshoot models are not obviously constrained by the earlier helioseismic analyses but might still be amenable to helioseismic investigation. Christensen-Dalsgaard et al. (2011) found that a model with somewhat smoothed overshoot was in fact in better agreement with the helioseismic data than was a ‘standard’ model such as Model S; in the latter helium settling causes a relatively sharp change in the sound speed at the base of the convection zone and hence an oscillatory frequency variation larger than observed.
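The oscillatory signature of an acoustic glitch can be sketched in a simple form. In the commonly used asymptotic description (an assumption here, not spelled out in the text above) a sharp feature at acoustic depth \(\tau \) adds to the frequencies a component roughly proportional to \(\cos (4\pi \nu \tau + \phi )\), so that the signal repeats with a period \(1/(2\tau )\) in frequency. The amplitude, the phase and the round value \(\tau = 2000\,\mathrm{s}\) used below are purely illustrative and are not fitted to solar data.

```python
import math


def glitch_signal(nu_hz, tau_s, amplitude_hz=1.0e-7, phase=0.0):
    """Toy frequency perturbation from a sharp feature at acoustic depth tau_s."""
    return amplitude_hz * math.cos(4.0 * math.pi * nu_hz * tau_s + phase)


def glitch_period(tau_s):
    """Period of the oscillatory signal, measured in frequency (Hz)."""
    return 1.0 / (2.0 * tau_s)


tau = 2000.0  # hypothetical acoustic depth in seconds
period = glitch_period(tau)
print(f"period in frequency: {period * 1e6:.0f} microHz")
```

A deeper feature (larger \(\tau \)) therefore produces a more rapidly varying signal as a function of frequency, which is what allows contributions from different layers to be separated. A smoother transition, as in the more detailed overshoot models, would reduce the amplitude of the signal.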

The departures of \(\varGamma _1\) from the simple value for an ideal gas are of great interest as a diagnostic of the equation of state and composition of the convection zone. From the equation of state \(\varGamma _1\) is given as a function \(\varGamma _1(p, \rho , \{X_i\})\) of the hydrostatic structure and composition, with the abundance of helium having the strongest effect. The analysis of this dependence is simplified in the convection zone where the structure is characterized by the approximately adiabatic gradient and where composition can be assumed to be uniform due to the very efficient convective mixing. If the equation of state is assumed to be known, an inference of the sound speed can be used to infer the composition through its effect on \(\varGamma _1\). In fact, it was noted by Gough (1984b) and Däppen and Gough (1986) that the second ionization of helium produces a signature in \(\varGamma _1\), acting as an acoustic glitch, which is potentially a sensitive measure of the helium abundance. This effect has been analysed in a variety of ways, using both asymptotic and non-asymptotic techniques (e.g., Vorontsov et al. 1991; Dziembowski et al. 1991; Kosovichev et al. 1992; Antia and Basu 1994; Pérez Hernández and Christensen-Dalsgaard 1994; Kosovichev 1996; Basu 1998; Richard et al. 1998; Di Mauro et al. 2002). The resulting values of the envelope helium abundance \(Y_{\mathrm{s}}\) depend somewhat on the assumed equation of state, although values around \(Y_{\mathrm{s}} = 0.248\) are typically found, with a formal uncertainty as low as 0.001 and a somewhat larger systematic uncertainty estimated from the use of different equations of state. As an example, Basu and Antia (2004) obtained \(Y_{\mathrm{s}}= 0.2485 \pm 0.0034\), taking into account also uncertainties in the equation of state. It should be noted that this value is substantially lower than the primordial value \(Y_0 = 0.271\) required to calibrate solar models (cf. Sect. 2.6), thus confirming the importance of helium settling. Indeed, the helioseismically inferred value is close to, and independent of, the value obtained in standard solar models including settling; in Model S, for example, the envelope helium abundance is \(Y_{\mathrm{s}} = 0.245\). Using the value \(Y_{\mathrm{s}} = 0.2485 \pm 0.0034\) obtained by Basu and Antia (2004) and several different solar models, Serenelli and Basu (2010) estimated the primordial solar helium abundance as \(Y_0 = 0.278 \pm 0.006\), which is essentially consistent with the value obtained from the Model S calibration.

Observations of low-degree modes in distant stars provide a similar possibility for determining the envelope helium abundance through the effect on the acoustic-mode frequencies (e.g., Pérez Hernández and Christensen-Dalsgaard 1998; Lopes and Gough 2001; Houdek 2004; Houdek and Gough 2007a). As discussed in Sect. 7.2 this potential has been realized thanks to the very detailed asteroseismic data obtained with the Kepler mission.

It is interesting, particularly in the light of the revisions of solar surface abundances (cf. Sect. 6.1), that it appears possible to constrain also abundances of the dominant heavy elements through their effect on the equation of state. I return to this in Sect. 6.3.

The inverse analysis can be arranged to include also a contribution from the intrinsic difference \((\delta \varGamma _1)_{\mathrm{int}}\) between the solar equation of state and that assumed in the model, i.e., the difference in \(\varGamma _1\) at fixed \((p, \rho , \{X_i\})\), allowing inferences to be made of \((\delta \varGamma _1)_{\mathrm{int}}\), reflecting the errors in the equation of state used in the solar modelling (Basu and Christensen-Dalsgaard 1997). Examples are shown in Fig. 43, based on analyses carried out by Basu et al. (1999). It is evident that the OPAL results are generally closer to the Sun, although with some indications of the opposite tendency very close to the surface. It should be noted that a complete separation of the effects of the intrinsic differences in \(\varGamma _1\) and the differences in composition is not possible; however, the compositional effects are strongly constrained by the fact that the composition is uniform in the convection zone, and the effect of a reasonable uncertainty in, for example, the helium abundance on the results shown in Fig. 43 is modest (Rabello-Soares et al. 2000). Further details about the equation-of-state differences, and hence also further constraints on the composition, could be obtained with data on higher-degree modes (Di Mauro et al. 2002). It should also be noted that specific details on the equation of state can be investigated by comparing suitably parameterized formulations of the equation of state with the helioseismically inferred properties; an interesting example, involving a calibration of the size of hydrogen and helium atoms and ions, was presented by Baturin et al. (2000).

Fig. 43

(Adapted from Basu et al. 1999)

Relative intrinsic difference in \(\varGamma _1\) in the sense (Sun)–(model), inferred from inversion of oscillation frequencies of degree up to \(\sim 250\) obtained with the MDI instrument. The closed circles show results for a model using the MHD equation of state while the open circles are for a model computed with the OPAL equation of state. As in Fig. 39 the vertical and horizontal bars measure error and resolution, respectively.

In the core of the Sun it might be expected that the equation of state is relatively simple. It was therefore somewhat surprising that Elliott and Kosovichev (1998) found significant differences in \(\varGamma _1\) between the Sun and the model close to the solar centre, in an inversion based on \((\rho , \varGamma _1)\). They demonstrated that the differences arose solely from the neglect of relativistic effects on the electrons (cf. Eq. 20) in the versions of the OPAL and MHD equations of state used at the time for the model calculation. Taking these effects into account the inferred \(\delta _r \varGamma _1\) was consistent with zero in the core to within errors.

5.1.4 Investigations of solar internal rotation

Solar rotation induces a splitting of the observed frequencies according to their azimuthal order m. Howe (2009) provided an extensive review of the analysis of these data: inversion of the rotational splittings has provided detailed information about rotation in the solar interior (see also Thompson et al. 2003). Already early results (Duvall et al. 1984) indicated that the radiative interior of the Sun rotates at a nearly uniform rate close to but slightly below the surface equatorial rotation rate. This was in striking contrast to models of solar evolution which had led to the expectation of a possible relict rapidly rotating core left over from an initial state of rapid rotation (see Sect. 3.1). An important consequence of the slow rotation of the solar interior is that rotational oblateness causes no significant modification to the Sun’s outer gravitational field, at a level which might affect tests of Einstein’s theory of general relativity on the basis of planetary motion (e.g., Pijpers 1998; Roxburgh 2001; Mecheri et al. 2004; Antia et al. 2008).

Results of the rotational inferences are summarized in Fig. 44. Throughout the convection zone rotation varies with latitude in a manner similar to the directly observed surface differential rotation (cf. Eq. 11), although with subtler details such as the near-surface increase in rotation rate with depth (see also Corbard and Thompson 2002; Barekat et al. 2014). Interestingly, indications of this near-surface variation were already found in the early analysis of high-degree modes by Deubner et al. (1979). On the other hand, beneath the convection zone the rotation rate is essentially independent of position. The rotation rate of the solar core is highly uncertain, as indicated. Only modes of the lowest degree reach the core (cf. Eq. 56), so that few azimuthal orders are available to determine the splitting; also, even for these modes the contribution from the core to the rotational splitting is small. Several independent observations have confirmed the general near-uniformity of rotation in the deep solar interior (e.g., Chaplin et al. 2001b; Fossat et al. 2003; García et al. 2004). The slight indication in Fig. 44 of a downturn in the core is interesting but obviously not significant. Indeed, Corbard et al. (1998) found a strong dependence of the inferred core rotation on the details of the data and inversion techniques used. Also, Chaplin et al. (2004) demonstrated that with typical disk-integrated data, covering modes of degree \(l = 1{-}3\), only a difference in the core rotation, for \(r \le 0.2 R\), of at least order \(100 \,\mathrm{nHz}\) relative to the overall rate in the radiative interior would be significantly detectable.
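To illustrate why low-degree modes offer so little leverage on the core, the sketch below lists the azimuthal components of a multiplet. It assumes the simplest case of uniform rotation, for which the splitting is approximately \(\delta \nu _m = m \, \beta \, \varOmega /2\pi \) with \(\beta \simeq 1\) for high-order acoustic modes. The rotation rate of 430 nHz, the mode frequency and the sign convention for m are illustrative assumptions rather than values from the analyses cited.

```python
def multiplet(nu0_microhz, degree, rotation_nhz=430.0, beta=1.0):
    """Frequencies (microHz) of the 2l+1 components of a rotationally split mode.

    Assumes uniform rotation, so that the component of azimuthal order m is
    shifted by m * beta * rotation rate (converted from nHz to microHz).
    """
    return {m: nu0_microhz + m * beta * rotation_nhz * 1.0e-3
            for m in range(-degree, degree + 1)}


for degree in (1, 2, 3):
    components = multiplet(3000.0, degree)
    spacing_nhz = (components[1] - components[0]) * 1.0e3
    print(f"l = {degree}: {len(components)} components, "
          f"adjacent spacing = {spacing_nhz:.0f} nHz")
```

With at most seven components per multiplet for \(l \le 3\), and with the core contributing only a small part of each splitting, a modest change in core rotation is hard to distinguish from the observational errors.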

Fig. 44

Inferred solar internal rotation rate \(\varOmega /2 \pi \) as a function of fractional radius r/R, at the latitudes indicated. As in Fig. 39 errors and resolution are indicated by the vertical and horizontal bars. Results in the outer parts of the Sun, including the convection zone (the base of which is indicated by the dashed line) were obtained from analysis of 144 days of SOI/MDI data (Schou et al. 1998). Results below \(r = 0.45 R\), with no latitude resolution, were obtained by Chaplin et al. (1999) from a combination of BiSON and LOWL data. Note that the results become highly uncertain in the deep interior, although with no indication of a rapidly rotating core.

For completeness, I recall the claim made by Fossat et al. (2017) of rapid rotation of the solar core, although based on a questionable analysis claiming detection of solar dipolar g modes (see Sect. 5.1.2).

Given the availability of independent datasets it has been possible to test the consistency of the different observations and data-analysis techniques. While there seems to be considerable consistency between inferences of the solar internal structure based on different datasets (Basu et al. 2003), systematic differences remain between results on solar rotation (Schou et al. 2002); even in the latter case, however, the overall features in the inferred rotation rates are largely consistent. In a detailed analysis involving several different datasets and novel inversion techniques Eff-Darwich and Korzennik (2013) generally confirmed the results shown in Fig. 44, while emphasizing the uncertainty in the inference of rotation of the solar core.

The transition between the different latitudinal variation of the rotation in the convection zone and the radiative interior takes place in a relatively thin region, the tachocline (Spiegel and Zahn 1992). This region probably plays an important role in the presumed dynamo action responsible for the solar magnetic field and its cyclic 11-year variations (e.g., Miesch and Toomre 2009; Charbonneau 2020; Brun and Browning 2017). Thus detailed studies of its properties have been carried out (see Charbonneau et al. 1999, and references therein). Charbonneau et al. determined the width, defined as the region over which 84% of the variation in angular velocity takes place, to be \(w = (0.039 \pm 0.013) \,R_\odot \) (see footnote 49), with the centre of the transition being located at \(r_{\mathrm{c}} = (0.693 \pm 0.002) \,R_\odot \). The tachocline region was found to be slightly prolate, with \(r_{\mathrm{c}}\) being closer to the surface by \((0.024 \pm 0.004) \,R_\odot \) at a latitude of \(60^\circ \) than at the equator. No significant latitude variation in the width was found. It should be noted that the location of \(r_{\mathrm{c}}\) places most of the tachocline below the base of the convective envelope, at \(r_{\mathrm{bcz}} = (0.713 \pm 0.001) \,R_\odot \) (see Sect. 5.1.3). Antia and Basu (2011) confirmed the prolate nature of the tachocline and in addition found a statistically significant increase with latitude in its width.
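The statement that most of the tachocline lies below the base of the convection zone can be checked with the numbers quoted above. The sketch assumes that the transition in angular velocity follows an error-function profile, \(\tfrac{1}{2}[1 + \mathrm{erf}(2(r - r_{\mathrm{c}})/w)]\). This form is an assumption made here; it is at least consistent with the 84% definition of the width, since \(\mathrm{erf}(1) \simeq 0.843\). The quoted uncertainties are ignored and the names are illustrative.

```python
import math

R_C = 0.693    # centre of the transition, in units of the solar radius
WIDTH = 0.039  # tachocline width w, same units
R_BCZ = 0.713  # base of the convective envelope, same units


def transition_fraction(r, r_c=R_C, w=WIDTH):
    """Fraction of the angular-velocity change completed below radius r.

    Assumes an error-function profile, 0.5 * (1 + erf(2 (r - r_c) / w)).
    """
    return 0.5 * (1.0 + math.erf(2.0 * (r - r_c) / w))


within_width = (transition_fraction(R_C + WIDTH / 2.0)
                - transition_fraction(R_C - WIDTH / 2.0))
below_base = transition_fraction(R_BCZ)
print(f"within r_c +/- w/2: {within_width:.3f}; below the base: {below_base:.3f}")
```

With the central values, more than nine tenths of the change in angular velocity is completed below \(r_{\mathrm{bcz}}\) at the equator. Given the uncertainty of \(0.013 \,R_\odot \) in the width, this fraction should be regarded as indicative only.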

Rotation in the solar convection zone, including the latitudinal surface differential rotation, is controlled by dynamical transport processes within and just below the convection zone. Simple arguments, and early numerical simulations of the convection zone, indicate that the angular velocity should depend only on the distance to the rotation axis, in what has been called ‘rotation on cylinders’ (e.g., Gilman 1976), which is manifestly not shown by the helioseismic inferences in Fig. 44. More sophisticated simulations have produced rotation profiles quite similar to that of the Sun, with some suitable adjustment of parameters (see, for example, Miesch et al. 2006, and references therein).

Assuming that the Sun was born in a state of substantially more rapid rotation, some mechanism must obviously have been responsible for the transport of angular momentum from the deep interior to the convection zone, from which it has presumably been lost through coupling to the solar wind. Early models invoking angular-momentum transport through turbulent diffusion (e.g., Pinsonneault et al. 1989; Chaboyer et al. 1995) result in present internal rotation rates several times the surface value, and hence are definitely inconsistent with the helioseismic results. A detailed analysis of the instabilities induced by rotation and angular-momentum transport was carried out by Mathis et al. (2018), confirming that additional transport mechanisms would be required to account for the observed solar rotation profile. Angular-momentum transport by waves, originally proposed by Schatzman (1993) and further developed by Kumar and Quataert (1997) and Talon and Zahn (1998), may remain a possibility, although requiring a fairly elaborate combination of effects (Talon et al. 2002). As detailed by Talon and Charbonnel (2005) the model involves a so-called shear-layer oscillation just beneath the convection zone, similar to the oscillation demonstrated in the laboratory experiment of Plumb and McEwan (1978), which filters the gravity waves in such a way that in the deeper interior those waves dominate that tend to slow down the radiative interior. Talon and Charbonnel also developed the model to provide lithium destruction consistent with observations of open stellar clusters and of the Sun (see also Charbonnel and Talon 2005). The treatment of the gravity waves was based on a simplified model of wave excitation by convective eddies within the convection zone (e.g., Kumar and Quataert 1997) which in particular ignored the likely strong effects of the penetration of convective plumes into the stable layer underneath (e.g., Hurlburt et al. 1986). 
Rogers and Glatzmaier (2006) carried out detailed two-dimensional numerical calculations of the excitation of gravity waves in the solar convection zone and the resulting transport of angular momentum. They found strong effects of penetrating plumes just beneath the convection zone, and hence pointed out the need to consider the coupling between variations in rotation in the convection zone, the tachocline and the deep radiative interior. Also, they noted that the gravity-wave spectrum resulting from the simulation predicted a flux that was essentially independent of frequency, unlike the spectrum used by Talon and Charbonnel (2005) which was strongly peaked at low frequency. As a result, they questioned the viability of the gravity-wave mechanism to slow down the solar core. Further simulations by Rogers et al. (2008) of the properties of gravity waves in the solar radiative interior, driven at selected frequencies and wave numbers, confirmed the problems with the earlier models, particularly the effect of a flat spectrum and the importance of previously neglected wave-wave interactions, which are surely relevant under realistic circumstances where a large number of waves are excited simultaneously (see also Rogers 2007). Denissenkov et al. (2008) showed that the gravity-wave transport would have a tendency to produce large-scale oscillations in the angular velocity in the radiative interior which are not observed.

A plausible alternative is that angular-momentum transport is dominated by a weak primordial magnetic field (Charbonneau and MacGregor 1993; Gough and McIntyre 1998). A more detailed analysis of such mechanisms revealed a strong sensitivity to the assumed boundary conditions in the tachocline (Brun and Zahn 2006; Garaud and Rogers 2007; Garaud and Brummell 2008), in order to achieve the latitude-independent rotation in the radiative interior while satisfying Ferraro’s law of isorotation (Ferraro 1937), according to which the angular velocity has a constant value along poloidal magnetic field lines. This requires that the magnetic field is essentially confined to the radiative interior; were that not the case the latitude dependence of the angular velocity in the convection zone would penetrate into the radiation zone, which is not observed (for a review, see Garaud 2007). By imposing plausible boundary conditions at the base of the convection zone on simulations of the dynamics of the tachocline, Garaud and Garaud (2008) obtained the required confinement. The resulting angular velocity of the radiative interior was discussed in terms of a simple model by Garaud and Guervilly (2009). A recent analysis by Garaud (2020), based on simplified numerical modelling and scaling of the relevant physical processes, suggested that the tachocline could be non-magnetic but dominated by three-dimensional stratified turbulence, possibly implying that it is much thinner than \(0.01 \,R_\odot \); given the finite resolution of the inverse analyses, this may well be consistent with the helioseismic inferences.

Weak dynamo action in the radiative interior driven by magnetic instabilities (Tayler 1973; Spruit 2002) has also been proposed as a means of angular-momentum transport; since the resulting transport is much more efficient in the latitudinal than in the radial direction this could account for the latitude-independent rotation below the convection zone. Eggenberger et al. (2005) demonstrated that this formulation could indeed explain the evolution to the present solar internal rotation. However, Denissenkov and Pinsonneault (2007) questioned some aspects of the analysis of Spruit (2002) and hence the applicability of the mechanism to the solar spin-down. Also, Zahn et al. (2007), using three-dimensional magneto-hydrodynamical simulations, failed to find the required dynamo action under solar conditions, although Braithwaite and Spruit (2017) argued that this was caused by the unrealistically high magnetic diffusivity assumed. However, Fuller et al. (2019) revisited the Tayler–Spruit mechanism, including non-linear effects in the generation of the magnetic field; on this basis they found that the resulting angular-momentum transport plausibly could account for the nearly rigid rotation of the solar radiative interior. An overall analysis of the potential for magnetic effects to cause the observed solar rotation profile was presented by Eggenberger et al. (2019).

It is probably fair to say that we do not yet have a full understanding of the origin of the present solar internal rotation. One may hope that further constraints on the modelling may result from asteroseismic results on the internal rotation of other solar-like stars (see Sect. 7.2).

An extensive review of the dynamics of the convection zone and tachocline was provided by Miesch (2005), while Gough (2010) presented a detailed review of the angular-momentum coupling through the tachocline. An overall discussion of the dynamics of the solar radiative interior, in the light of the helioseismic results, was provided by Gough (2015).

5.1.5 Temporal changes of the solar interior

The availability of the detailed helioseismic data over an extended period has allowed studies of the time dependence of solar internal structure and dynamics, potentially related to the solar magnetic cycle. Changes in the oscillation frequencies, reflecting potential changes in solar structure, were first detected by Woodard and Noyes (1985); the dominant variations are closely correlated with the surface magnetic field (e.g., Woodard et al. 1991; Bachmann and Brown 1993; Chaplin et al. 2001a; Howe et al. 2002) and appear predominantly to be a near-surface effect. In an analysis of the response of the Sun and its oscillation frequencies to thermal perturbations Balmforth et al. (1996) concluded that deep-seated perturbations of this nature were inconsistent with the lack of substantial variations in the solar radius. However, tentative but very interesting evidence was found by Gough (1994) of a possible solar-cycle related change in the frequency signature of the acoustic glitch associated with the second helium ionization zone. Similar effects were found in more detailed analyses of observed frequencies by Basu and Mandel (2004) and Verner et al. (2006a), identified by Basu and Mandel as possibly arising from a magnetic effect on the equation of state. Gough (2013a) carried out a detailed analysis of this effect, however, raising some doubts about its reality. Also, Baldner and Basu (2008) found evidence for sound-speed changes between solar activity minimum and maximum at the base of the convection zone, although these have apparently so far not been further confirmed.

As reviewed by Howe (2009), the rotation throughout and possibly below the solar convection zone shows clear changes with solar cycle. The dominant variation has the form of zonal flows, i.e., regions of slightly faster and slower rotation, penetrating to substantial depth into the convection zone and shifting towards lower latitude with time. Such variations in the surface rotation rate were previously detected by Howard and LaBonte (1980), who identified them as torsional oscillations (see also Snodgrass and Howard 1985; Ulrich 2001). As shown by, e.g., Howe et al. (2000) the behaviour is similar to the shift towards the equator of the location of sunspots as the solar cycle advances, in the so-called butterfly diagram (Hathaway 2015). In addition, there is a band of more rapid flow moving towards the poles (Vorontsov et al. 2002). Results covering the last full 22-year magnetic cycle are shown in Fig. 45. Interestingly, the slow decline of solar cycle 23 was matched by a slower shift of the corresponding band of more rapid rotation (Howe et al. 2009). Also, the first appearance of cycle 24 was visible in the zonal flows well before the first appearance of active regions (Howe et al. 2013). Recent analyses (Howe 2016; Howe et al. 2018) show a continuation of this pattern; as illustrated in Fig. 45 the data now show the first indications of the appearance of cycle 25. The physical origin of these zonal flows is so far not understood, although mean-field dynamo models have reproduced some aspects of the flows, including the high-latitude branch (e.g., Rempel 2007, 2012; Pipin and Kosovichev 2019). Basu and Antia (2019) also studied the time variation of solar rotation over cycles 23 and 24. Interestingly, they found that the position and width of the tachocline did not vary, while there were significant variations with time in the change in rotation rate across the tachocline.

Fig. 45

Figure courtesy of R. Howe, adapted from Howe et al. (2018)

Rotation-rate residuals, relative to suitably defined average solar rotation rates, inferred from inversions at a target depth of \(0.01 \,R_\odot \) below the photosphere (Howe et al. 2018). Results are shown as a function of time and latitude, using the colour scale at the right. (Note that a rotation-rate variation of 1 nHz corresponds roughly to a flow speed of \(4 \, \mathrm{m \, s^{-1}}\) at the solar equator.) The analysis is based on data from the MDI, HMI and GONG experiments.
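The conversion quoted in the caption can be checked with a short sketch. It assumes that the residual is a cyclic rotation frequency in nHz and uses a nominal solar radius of \(6.957 \times 10^{8}\,\mathrm{m}\); the function name and its arguments are illustrative only and are not taken from Howe et al. (2018).

```python
import math

R_SUN_M = 6.957e8  # nominal solar radius in metres (assumed value)

def zonal_flow_speed(delta_rate_nhz, latitude_deg=0.0, depth_frac=0.0):
    """Linear flow speed (m/s) for a rotation-rate residual given in nHz.

    depth_frac is the depth below the photosphere in units of the solar radius.
    """
    delta_omega = 2.0 * math.pi * delta_rate_nhz * 1e-9   # angular rate, rad/s
    distance_from_axis = (R_SUN_M * (1.0 - depth_frac)
                          * math.cos(math.radians(latitude_deg)))
    return delta_omega * distance_from_axis

# 1 nHz at the equator and at the surface: roughly 4.4 m/s
print(f"{zonal_flow_speed(1.0):.2f} m/s")
```

At the target depth of \(0.01\,R_\odot\) the result changes by only 1%, so the rounded value of \(4 \, \mathrm{m \, s^{-1}}\) per nHz applies there as well.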

These helioseismically inferred variations in the Sun provide potentially important diagnostics of the apparent changes in solar activity, reflected by the delayed and unusually deep minimum between cycles 23 and 24 and the modest activity in cycle 24. In fact, Basu et al. (2012) noted, after the fact, that the difference between the frequency variations in the descending phases of cycle 22 and cycle 23 might have been used as a prediction of the unusual nature of the cycle 24 minimum, while Howe et al. (2017) speculated that changes in the frequency response to solar activity could reflect a more fundamental change in the solar dynamo. Kosovichev and Rozelot (2018b) analysed solar f-mode frequency splittings from the SOHO/MDI and SDO/HMI instruments extending over 21 years. Much of the variation was clearly correlated with the solar-activity cycle, while a coefficient related to the latitude variation of rotation showed longer-term trends, which might also indicate changes in the solar internal dynamics. It is evident that a continuation of such observations over further solar cycles is of great interest.

5.2 Solar neutrino results

As discussed in Sect. 2.3.3 the nuclear reactions generating energy in the solar core unavoidably produce electron neutrinos. Owing to their small cross section for interaction with matter the neutrinos escape the Sun essentially unhindered. From the total solar luminosity, and the energy generation of around 26 MeV for each produced \({{}^{4}\mathrm{He}}\), it is easy to calculate that the neutrino flux at the distance of the Earth from the Sun is around \(6 \times 10^{10}\,\,\mathrm{cm}^{-2}\,\,\mathrm{s}^{-1}\). It is evident that the detection of this neutrino flux would be a strong confirmation of the importance of nuclear reactions in the core of the present Sun, and potentially a very valuable diagnostic of conditions in the solar core. Here I provide a brief overview of some key aspects and results of the study of solar neutrinos. More detailed reviews of solar neutrino studies have been given by Bahcall (1989), Haxton (1995), Castellani et al. (1997), Kirsten (1999) and Turck-Chièze (1999); Bahcall and Peña Garay (2004) and McDonald (2004) have provided reviews on the theoretical and experimental situation, respectively, and general overviews were given by Haxton et al. (2006, 2013) and Haxton (2008).
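The estimate of the total neutrino flux can be reproduced in a few lines. The sketch below assumes that two electron neutrinos are emitted for each \({{}^{4}\mathrm{He}}\) nucleus produced, and uses nominal values of the solar luminosity and the astronomical unit that are not quoted in the text; the function name is illustrative.

```python
import math

L_SUN_ERG_S = 3.828e33        # nominal solar luminosity, erg/s (assumed value)
AU_CM = 1.495978707e13        # astronomical unit, cm
MEV_TO_ERG = 1.602176634e-6   # 1 MeV in erg

def solar_neutrino_flux(energy_per_he4_mev=26.0, neutrinos_per_he4=2):
    """Neutrino flux at 1 au (cm^-2 s^-1) from the luminosity constraint."""
    # Number of 4He nuclei that must be produced per second to supply L_sun
    he4_rate = L_SUN_ERG_S / (energy_per_he4_mev * MEV_TO_ERG)
    # Assumption: each 4He produced is accompanied by two electron neutrinos
    neutrino_rate = neutrinos_per_he4 * he4_rate
    # Spread isotropically over a sphere of radius 1 au
    return neutrino_rate / (4.0 * math.pi * AU_CM**2)

print(f"{solar_neutrino_flux():.2e} cm^-2 s^-1")   # approximately 6.5e10
```

The result depends only on the luminosity and the energy released per \({{}^{4}\mathrm{He}}\), not on the details of the solar model, which is why the total flux is a weak diagnostic compared with the individual spectral components discussed below.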

The detection of solar neutrinos depends critically on the detailed neutrino spectrum. An example, computed for a standard solar model and referring to observations at one astronomical unit, is shown in Fig. 46. Reactions involving \(\mathrm{e^+}\) decay have continuous spectra, reflecting the sharing of energy between the neutrino and the positron, whereas the reactions involving \(\mathrm{e^-}\) capture are characterized by line spectra. The spectrum is evidently dominated by the neutrinos from the \({{}^{1}\mathrm{H}}({{}^{1}\mathrm{H}}, \mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{2}\mathrm{D}}\) reaction, which, however, have a maximum energy of only 0.42 MeV. In contrast, neutrinos from the \({{}^{8}\mathrm{B}}(\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{8}\mathrm{Be}}\) reaction (a part of the PP-III chain; cf. Eq. 25) and the \({{}^{3}\mathrm{He}}({{}^{1}\mathrm{H}}, \mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{4}\mathrm{He}}\) (hep) reaction are relatively few in number but have energies up to 15 and 18.7 MeV, respectively.

Fig. 46

(Figure courtesy of A. Serenelli.)

The energy spectrum of neutrinos predicted by a standard model of the present Sun. The neutrino fluxes from continuous sources are given in units of \(\,\mathrm{cm}^{-2}\,\,\mathrm{s}^{-1}\,\,\mathrm{MeV}^{-1}\) (despite the ordinate label) at one astronomical unit; the line fluxes are in units of \(\,\mathrm{cm}^{-2}\,\,\mathrm{s}^{-1}\). The spectra from the PP chains are shown with continuous lines: ‘pp’ refers to the reaction \({{}^{1}\mathrm{H}}({{}^{1}\mathrm{H}}, \mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{2}\mathrm{D}}\), ‘\(\,{{}^{7}\mathrm{Be}}\)’ to the reaction \({{}^{7}\mathrm{Be}}(\mathrm{e^-}, \nu _{\mathrm{e}})\,{{}^{7}\mathrm{Li}}\), and ‘\(\,{{}^{8}\mathrm{B}}\)’ to the reaction \({{}^{8}\mathrm{B}}(\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{8}\mathrm{Be}}\). In addition, two reactions are included which are of no importance to the energy generation but of some significance to neutrino detections: ‘pep’ refers to the reaction \({{}^{1}\mathrm{H}}({{}^{1}\mathrm{H}}\,\, \mathrm{e^-}, \nu _{\mathrm{e}})\,{{}^{2}\mathrm{D}}\), and ‘hep’ to the reaction \({{}^{3}\mathrm{He}}({{}^{1}\mathrm{H}}, \mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{4}\mathrm{He}}\). The spectra from the CNO cycle are shown with dashed lines: ‘\(\,{{}^{13}\mathrm{N}}\)’ refers to the reaction \({{}^{13}\mathrm{N}}(\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{13}\mathrm{C}}\), ‘\(\,{{}^{15}\mathrm{O}}\)’ to the reaction \({{}^{15}\mathrm{O}}(\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{15}\mathrm{N}}\), and ‘\(\,{{}^{17}\mathrm{F}}\)’ to the reaction \({{}^{17}\mathrm{F}}(\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{17}\mathrm{O}}\). The neutrino spectra are based on Model B16-GS98, using the Grevesse and Sauval (1998) composition, from Vinyoles et al. (2017). The arrows at the top schematically indicate the sensitivity ranges of the various neutrino experiments (see text).

5.2.1 Problems with solar models?

The possibility of detecting high-energy neutrinos from the Sun was proposed by Fowler (1958), following a revision of nuclear reaction rates that indicated that the PP-III branch was more important than previously thought, and further analysed by Bahcall et al. (1963). The first specific experiment was developed by Raymond (Ray) Davis on the basis of the reaction

$$\begin{aligned} \nu _{\mathrm{e}} + {}^{37}\mathrm{Cl} \rightarrow \mathrm{e^-}+ {}^{37}\mathrm{Ar} \end{aligned}$$
(64)

(Bahcall 1964; Davis 1964).Footnote 50 The detector consisted of a tank containing about 380,000 l of \(\mathrm{C}_2\mathrm{Cl}_4\) at a depth of \(1480 \,\mathrm{m}\) in the Homestake mine in South Dakota; the use of \(\mathrm{C}_2\mathrm{Cl}_4\) provides a manageable way of handling the large amount of chlorine, and the location helps to reduce the background from cosmic rays. A discussion of the developments leading to this experiment and its results has been provided by Davis (2003). The reaction (64) on average takes place 15 times a month in the tank; the experiment is typically run for two months after which the argon produced is flushed from the tank with helium and counted, utilizing the fact that \({}^{37}\mathrm{Ar}\) is radioactive, with a half-life of 35 days. The neutrino capture rate is conventionally measured in Solar Neutrino Units (SNU): \(1 \,\mathrm{SNU}\) corresponds to \(10^{-36}\) reactions per second per target nucleus (in this case \({}^{37}\mathrm{Cl}\)). The initial results of the experiment (Davis et al. 1968) found an upper limit to the flux of \(3 \,\mathrm{SNU}\), while the then predicted flux for a ‘standard’ model of the time (Bahcall and Shaviv 1968) was around 20 SNU (e.g., Bahcall et al. 1968a). This was immediately recognized as a potentially serious problem for our understanding of solar structure and energy generation; an early review of the experimental and theoretical situation was given by Bahcall and Sears (1972). Despite continuing measurements and refinements of the modelling this discrepancy persisted: the final average measured value is \(2.56\pm 0.16\,\mathrm{(statistical)}\,\pm 0.16\,\mathrm{(systematic)} \,\mathrm{SNU}\) (Cleveland et al. 1998), while typical model predictions are around \(8 \,\mathrm{SNU}\) (e.g., Bahcall et al. 2001; Turck-Chièze et al. 2001). The Homestake experiment has now ended.
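To illustrate the meaning of the SNU, the measured rate can be converted into an approximate number of captures in the tank. The following sketch uses the tank volume and the final measured rate quoted above; the density of \(\mathrm{C}_2\mathrm{Cl}_4\) (about \(1.62\,\mathrm{g\,cm^{-3}}\)), its molar mass and the isotopic fraction of \({}^{37}\mathrm{Cl}\) (about 24%) are assumed values that do not appear in the text, and the function names are illustrative.

```python
AVOGADRO = 6.02214076e23   # particles per mole
SECONDS_PER_DAY = 86400.0

def cl37_atoms(volume_litres=380_000.0, density_g_cm3=1.62,
               molar_mass_g=165.8, cl37_fraction=0.242):
    """Approximate number of 37Cl nuclei in a tank of C2Cl4.

    Density, molar mass and isotopic fraction are assumed nominal values.
    """
    mass_g = volume_litres * 1000.0 * density_g_cm3
    molecules = mass_g / molar_mass_g * AVOGADRO
    return 4 * molecules * cl37_fraction   # four Cl atoms per molecule

def captures_per_day(rate_snu, n_targets):
    """Captures per day; 1 SNU = 1e-36 captures per second per target nucleus."""
    return rate_snu * 1e-36 * n_targets * SECONDS_PER_DAY

n_cl37 = cl37_atoms()
per_day = captures_per_day(2.56, n_cl37)
print(f"{n_cl37:.2e} target nuclei, {30 * per_day:.1f} captures per 30 days")
```

With these assumptions the tank contains roughly \(2 \times 10^{30}\) target nuclei, and the measured rate corresponds to about 14 captures per 30 days, in line with the figure of 15 per month given above.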

An overview of this and other neutrino experiments, discussed in the following, is provided by Fig. 47. The reaction (64) is sensitive only to neutrinos with energies exceeding 0.81 MeV and, as indicated in the figure, the predicted rate is dominated by \({{}^{8}\mathrm{B}}\) neutrinos. Thus it provides no information about the neutrinos from the basic \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reaction.

Fig. 47

Figure courtesy of A. Serenelli

Observed and computed neutrino capture rates, for a range of neutrino experiments. In all cases the hatched regions indicate the \(1\,\sigma \) uncertainties. The dark blue bars show the observed values, in SNU for the Cl and Ga experiments, and in terms of the \({}^8\mathrm{B}\) flux, relative to the computed value of \(5.46 \times 10^6 \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\), for the KamiokaNDE, SuperKamiokaNDE (SuperK) and SNO experiments. The other bars show the computed values, the dominant contributions being colour coded as indicated. Theoretical results are for the so-called model B16-GS98 (Vinyoles et al. 2017). For the observed values, see text.

A second type of neutrino experiment uses neutrino scattering on electrons in water, causing Čerenkov light from the resulting energetic electrons. Since the electrons are predominantly scattered in the forward direction the detection is sensitive to the direction of the incoming neutrinos, effectively producing a ‘neutrino image’ of the Sun, albeit with low resolution. Great care is taken to purify the water in the detector and shield it from background radiation, including active background detection in a surrounding volume of water. Even so, owing to the dominant background at lower energies, these experiments are limited to neutrino energies above a few MeV, i.e., to the \({{}^{8}\mathrm{B}}\) and hep neutrinos. Such experiments were initiated by Masatoshi Koshiba in Japan (see Koshiba 2003). Early experiments carried out with the KamiokaNDE detectorFootnote 51 in the Kamioka Mine in the Japanese Alps, with a detector volume of 2140 tons of water, found a neutrino flux of less than half the predicted value (Hirata et al. 1989); the experiment also confirmed that the neutrinos originated from the direction of the Sun. The experiment was upgraded to Super-KamiokaNDE, with an inner detector volume of 32,000 tons of water. Fukuda et al. (2001) reported a measured flux of \({{}^{8}\mathrm{B}}\) neutrinos, based on detection of around 18,000 neutrino events, of \((2.32 \pm 0.09) \times 10^6 \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\), i.e., 45% of the value predicted by Bahcall et al. (2001). The most recent results, from the so-called Super-Kamiokande-IV phase (Abe et al. 2016), yielded a measured flux of \((2.31 \pm 0.05) \times 10^6 \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\), and extended the sensitivity down to neutrino energies of 3.5 MeV; the resulting combined Super-Kamiokande result is \((2.35 \pm 0.04) \times 10^6 \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\). Interestingly, Abe et al. found a statistically significant day/night variation of around 4%.

These measurements of solar neutrinos led to the award of the 2002 Nobel Prize in Physics to Ray Davis and Masatoshi Koshiba (Davis 2003; Koshiba 2003). They shared the prize with Riccardo Giacconi, who received his share for work in X-ray astronomy.

Detection of the neutrinos from the \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reaction can be made with the reaction

$$\begin{aligned} \nu _{\mathrm{e}} + {}^{71}\mathrm{Ga} \rightarrow \mathrm{e^-}+ {}^{71}\mathrm{Ge}, \end{aligned}$$
(65)

which is sensitive to neutrinos with energies exceeding 0.23 MeV. The germanium must be extracted through chemical processing and counted. Two independent experiments were established to use this technique: the GALLEX experiment at the Laboratori Nazionali del Gran Sasso, located under the Gran Sasso mountain, Italy,Footnote 52 and the SAGEFootnote 53 experiment at the Baksan Neutrino Observatory in Northern Caucasus, Russia. The GALLEX experiment, using 30 tons of gallium, made the first detection of pp neutrinos (Anselmann et al. 1992), at a capture rate of around \(80 \,\mathrm{SNU}\). This was confirmed by the SAGE experiment which in its full configuration used 57 tons of gallium (Abdurashitov et al. 1994). This rate is essentially consistent with the flux of pp neutrinos (see Fig. 47) but leaves no room for contributions from neutrinos from the remaining reactions, which would lead to a total predicted capture rate of around \(130 \,\mathrm{SNU}\), as illustrated in Fig. 47. The final result from GALLEX, based on data between 1992 and 1997, was a capture rate of \(73.4\pm 6.0\,\mathrm{(statistical)}\,\pm 3.9\,\mathrm{(systematic)} \,\mathrm{SNU}\) (Hampel et al. 1999; Kaether et al. 2010); the project continued under the name of GNO during the period 1998–2003, with a combined GALLEX/GNO rate of \(69.3\pm 4.1\,\mathrm{(statistical)}\,\pm 3.6\,\mathrm{(systematic)} \,\mathrm{SNU}\) (Pandola 2004; Altmann et al. 2005). For SAGE a capture rate of \(65.4\pm 3.0\,\mathrm{(statistical)}\,\pm 2.7\,\mathrm{(systematic)} \,\mathrm{SNU}\) was found (Abdurashitov et al. 2009); they also showed that the combined results of GALLEX, GNO and SAGE yielded \(66.1 \pm 3.1 \,\mathrm{SNU}\).Footnote 54

The original discrepancy between the neutrino measurements and the model predictions immediately led to attempts to modify solar models so as to reduce the neutrino flux. In most cases, this was done under the constraint that the total solar nuclear energy generation rate was kept unchanged. The initial detection with the \({}^{37}\mathrm{Cl}\) experiment was predominantly sensitive to the \({{}^{8}\mathrm{B}}\) reactions. Thus the predicted capture rate depended strongly on the branching ratios between the PP-II and PP-I, and the PP-III and PP-II, chains (cf. Eqs. 24, 25), which in turn are very sensitive to temperature; at fixed nuclear luminosity the flux of \({{}^{8}\mathrm{B}}\) neutrinos scales roughly as \(T_{\mathrm{c}}^{18}\), where \(T_{\mathrm{c}}\) is the central temperature of the model. Sears (1964) had already noticed a close relation between the composition and the \({{}^{8}\mathrm{B}}\) neutrino flux: decreasing the heavy-element abundance and hence, to maintain the calibrated luminosity (cf. Eq. 36), decreasing the helium abundance and hence the mean molecular weight, reduced the central temperature and hence the neutrino flux. Following the initial measurements, Iben (1968, 1969) made an extensive analysis of this sensitivity and concluded that matching the observed upper limit would require an initial solar helium abundance \(Y_0\) of less than around 0.2; Iben concluded that this would be inconsistent with the Galactic helium abundance inferred from other objects, as well as with early estimates of the Big Bang helium production (e.g., Peebles 1966).
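The steepness of the \(T_{\mathrm{c}}^{18}\) scaling shows how small a change in central temperature was being sought. As a rough sketch, the following inverts the scaling for the ratio between the final measured \({}^{37}\mathrm{Cl}\) rate of \(2.56 \,\mathrm{SNU}\) and a typical prediction of \(8 \,\mathrm{SNU}\), under the simplifying assumption (made here only for illustration) that the whole capture rate scales like the \({{}^{8}\mathrm{B}}\) flux; the function name is hypothetical.

```python
def temperature_ratio_for_flux_ratio(flux_ratio, exponent=18.0):
    """Invert flux ~ T_c**exponent: the T_c ratio that yields a given flux ratio."""
    return flux_ratio ** (1.0 / exponent)

# Assumption: the whole 37Cl capture rate scales like the 8B flux
observed_over_predicted = 2.56 / 8.0
t_ratio = temperature_ratio_for_flux_ratio(observed_over_predicted)
print(f"T_c would need to be reduced by about {100.0 * (1.0 - t_ratio):.0f}%")
```

Under these assumptions a reduction of the central temperature by only about 6% would have sufficed, which explains why comparatively modest modifications to the models were explored so extensively.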

Other attempts to reduce the capture rate through reducing the core temperature of the model were considered. One possibility was substantial mixing of the core; this would increase the central hydrogen abundance and hence allow energy generation to take place at the required rate at a lower temperature (e.g., Bahcall et al. 1968b; Ezer and Cameron 1968). Dilke and Gough (1972) proposed that recent core mixing, in what they called ‘the solar spoon’, might have reduced the nuclear energy generation rate over a period of a few million years, such that the present neutrino capture rate would not be typical of a solar model in equilibrium; owing to the solar thermal timescale of several million years such a lack of equilibrium would not have immediately observable effects. The mixing was supposed to have been initiated through instability to oscillations (Christensen-Dalsgaard et al. 1974; Boury et al. 1975). An alternative mechanism to reduce the core temperature was to postulate a rapidly rotating core (Bartenwerfer 1973; Demarque et al. 1973);Footnote 55 this would reduce the gas pressure in the core required for hydrostatic balance and hence the temperature, potentially leading to models in agreement with the observed neutrino capture rate. A reduction in \(T_{\mathrm{c}}\) could also be accomplished by increasing the efficiency of radiative energy transport in the radiative interior or providing other, non-radiative, contributions to energy transport (e.g., Newman and Fowler 1976) and hence decreasing the temperature gradient; this was accomplished in the models of Iben (1969) through a reduced heavy-element abundance and hence reduced opacity. Joss (1974) proposed that this could be achieved, maintaining the observed solar surface composition, if the solar surface had been contaminated by infalling material rich in heavy elements; in that case the solar interior might have a much lower Z and hence a lower opacity. 
The idea of stellar pollution was revived in connection with the possibly detected high content of heavy elements in stars that host planetary systems; this could be the result of the accretion by the star of planets rich in heavy elements, which have migrated towards the star (Murray and Chaboyer 2002; Bazot et al. 2005). Also, as discussed in Sect. 6.5, accretion of metal-poor material has been invoked in the solar case to account for the discrepancy between the present observed solar surface abundance and helioseismic inferences of solar structure. A more extreme proposal invoked the presence of the so-called weakly interacting massive particles (‘WIMPs’). Such particles had been proposed to account for the ‘missing mass’, e.g., in clusters of galaxies and galactic halos (Steigman et al. 1978; Steigman and Turner 1985); in fact, there is strong evidence that such non-baryonic dark matter dominates the matter content of the Universe (see Sumner 2002, for a review). If present in the solar interior they could contribute to the energy transport and hence reduce the temperature gradient required for radiative transport (Faulkner and Gilliland 1985; Spergel and Press 1985; Gilliland et al. 1986). This initially appeared to have some support from helioseismology, models with WIMPs yielding improved agreement with early observations of solar oscillation frequencies (Däppen et al. 1986; Faulkner et al. 1986); however, improved observations (e.g., Gelly et al. 1988) and improved modelling (e.g., Christensen-Dalsgaard 1992) have shown that this apparent agreement was in fact spurious.Footnote 56

Given the improvements in the precision and extent of the solar oscillation measurements, it became increasingly difficult to imagine that such modified solar models could be found which were consistent both with the helioseismic inferences and with the neutrino capture rate. Elsworth et al. (1990) pointed out that the measurements of the small frequency separations between low-degree modes, sensitive to the properties of the solar core (cf. Eq. 60), were consistent with normal solar models but inconsistent with models proposed to reduce the neutrino flux (see also Christensen-Dalsgaard 1991). Dziembowski et al. (1990) obtained lower limits on the solar neutrino flux in models consistent with the results of helioseismic inversion and demonstrated that these were inconsistent with the measured neutrino rates. Admittedly, the helioseismic results are sensitive mainly to the sound speed and not directly to the temperature upon which the neutrino flux predominantly depends. Thus, assuming the ideal-gas approximation (Eq. 55) helioseismology constrains \(T/\mu \) but not T and \(\mu \) separately. Even so, given the very small difference between the solar and standard-model sound speed illustrated in Fig. 39, a remarkable degree of fine tuning would be required to reduce the temperature sufficiently to bring the neutrino predictions in line with observations, while keeping the sound speed in accordance with helioseismology (e.g., Bahcall et al. 1997). Also, models modified to eliminate the remaining differences between the model and the solar sound speed produce neutrino fluxes very similar to those of standard models (Turck-Chièze et al. 2001; Couvidat et al. 2003). Thus the evidence was very strong that the structure of solar models was basically correct, and that the solution to the neutrino problem had to be found elsewhere. It should be noted that this conclusion was also reached by, for example, Castellani et al. (1997) on the basis of analysis of apparent inconsistencies between the results of the different neutrino experiments which could not be resolved through modifications to the solar model.

5.2.2 Revision of neutrino physics: neutrino oscillations

Solutions to the neutrino discrepancy involving neutrino physics were considered very early. These are based on the existence of three different types, or flavours, of neutrinos: in addition to the electron neutrino (\(\nu _{\mathrm{e}}\)) produced in nuclear reactions in the Sun, muon (\(\nu _\mu \)) and tau (\(\nu _\tau \)) neutrinos also exist. Although, in the Standard Model of particle physics, neutrinos are massless, non-zero neutrino masses are possible in extensions of the model. Pontecorvo (1967) and Gribov and Pontecorvo (1969) noted that in this case the three mass eigenstates of the neutrinos, which control their propagation, would differ from the flavour eigenstates, causing an oscillation between the flavour states as the neutrinos propagate in vacuum. If an appropriate fraction of the electron neutrinos were to be converted into the other types, to which the \({}^{\mathrm{37}}\mathrm{Cl}\) experiment is not sensitive, the initial apparently anomalously low detection rate might be explained. A more detailed calculation of this effect, taking the neutrino spectrum into account, was carried out by Bahcall and Frautschi (1969). Interestingly, in a brief note Paternò (1981) pointed out that even the limited helioseismic data at that time provided support for such a mechanism. In addition to the vacuum oscillations, transitions between the neutrino flavours are mediated by the weak interaction between the neutrinos and the electrons in solar matter (Wolfenstein 1978; Mikheyev and Smirnov 1985); this is known as the MSW effect. The neutrino oscillations require that at least some of the neutrinos have mass, the mass of \(\nu _{\mathrm{e}}\) differing from that of the other types. The transition rate depends on differences such as \(\varDelta m_{12}^2 = m_2^2 - m_1^2\) between the squared masses of the interacting neutrinos and the so-called mixing angles, e.g., \(\theta _{12}\). 
As a result of the interaction with solar matter, the survival probability, i.e., the fraction of \(\nu _{\mathrm{e}}\) that reach terrestrial detectors, depends on the neutrino energy. A concise summary of neutrino oscillations was provided by Haxton et al. (2013), while Gonzalez-Garcia and Nir (2003) gave a detailed review of the physics of neutrino mixing.
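The energy dependence of the survival probability can be illustrated in the simplified two-flavour picture. The sketch below uses the standard two-flavour vacuum formula, \(P_{\mathrm{ee}} = 1 - \sin ^2 2\theta \, \sin ^2 (1.267\, \varDelta m^2 L / E)\), with \(\varDelta m^2\) in \(\mathrm{eV}^2\), \(L\) in km and \(E\) in GeV, together with its two limiting cases relevant to the Sun. The mixing angle \(\theta _{12} \approx 33.4^\circ \) is an assumed representative value, not taken from the text, and the three-flavour corrections are neglected.

```python
import math

def vacuum_survival(theta_rad, dm2_ev2, baseline_km, energy_gev):
    """Two-flavour vacuum survival probability of an electron neutrino."""
    phase = 1.267 * dm2_ev2 * baseline_km / energy_gev
    return 1.0 - math.sin(2.0 * theta_rad) ** 2 * math.sin(phase) ** 2

def averaged_vacuum_survival(theta_rad):
    """Low-energy limit: rapid oscillations average sin^2(phase) to 1/2."""
    return 1.0 - 0.5 * math.sin(2.0 * theta_rad) ** 2

def matter_dominated_survival(theta_rad):
    """High-energy limit of adiabatic matter-induced (MSW) conversion."""
    return math.sin(theta_rad) ** 2

THETA_12 = math.radians(33.4)   # assumed representative mixing angle

print(f"low energy : {averaged_vacuum_survival(THETA_12):.2f}")
print(f"high energy: {matter_dominated_survival(THETA_12):.2f}")
```

With this assumed angle the survival probability is close to 0.6 for low-energy neutrinos and close to 0.3 for the high-energy \({{}^{8}\mathrm{B}}\) neutrinos, with a smooth transition in between; the actual curve requires the full treatment reviewed by Gonzalez-Garcia and Nir (2003).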

It was found possible to choose neutrino parameters such that the predictions of neutrino oscillations were consistent with the neutrino observations from the \({}^{37}\mathrm{Cl}\), \({}^{71}\mathrm{Ga}\) and electron scattering experiments (for an overview, see Bahcall et al. 1998). Some independent evidence for neutrino oscillations, involving the muon neutrinos, had been obtained from measurements of neutrinos produced in the Earth’s atmosphere by reactions involving cosmic rays (e.g., Fukuda et al. 1998); this lent credence to the effect as an explanation of the solar neutrino deficit.

Decisive tests of the mechanism came from the Sudbury Neutrino Observatory (SNO) in Canada (see Boger et al. 2000; McDonald 2016). SNO measured solar high-energy (\({}^8\mathrm{B}\)) neutrinos through reactions with deuterium (\({{}^{2}\mathrm{D}}\)) in heavy water as well as through electron scattering. Thus the following neutrino reactions take place in the detector:

$$\begin{aligned} {\begin{array}{rl} \nu _{\mathrm{e}} + {{}^{2}\mathrm{D}}\rightarrow {{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}+ \mathrm{e^-}&{}\qquad \mathrm{(CC)}, \\ \nu _x + {{}^{2}\mathrm{D}}\rightarrow {{}^{1}\mathrm{H}}+ \mathrm{n}+ \nu _x &{}\qquad \mathrm{(NC)}, \\ \nu _x + \mathrm{e^-}\rightarrow \nu _x + \mathrm{e^-}&{}\qquad \mathrm{(ES)}, \end{array}} \end{aligned}$$
(66)

where \(\nu _x\) are neutrinos of any flavour. As indicated, the charged current (CC) reactions are sensitive only to the electron neutrinos, while the neutral current (NC) reactions are sensitive to all neutrino flavours; electron scattering (ES) is mainly sensitive to \(\nu _{\mathrm{e}}\) but also has some, if reduced, sensitivity to \(\nu _\mu \) and \(\nu _\tau \). In all cases the occurrence of a reaction is measured through the emission of Čerenkov light. In the case of electron scattering, as in the KamiokaNDE and SuperKamiokaNDE experiments, this is done for the electron on which the neutrino scatters; this has a strong directionality around the original direction of the neutrino. In the case of the CC reactions the electron again is detected, although with a different directional distribution. Finally, for the NC interactions the neutrons are detected. In the first phase of the experiment the neutrons reacted with \({{}^{2}\mathrm{D}}\) to produce a gamma-ray photon, which Compton scattered off electrons in the water, resulting in emission of Čerenkov light, in this case essentially isotropically. Thus from the angular distribution of the Čerenkov light, as well as from the energy spectra, the different reactions can be separated. In the second phase the sensitivity was increased by dissolving NaCl in the heavy water, the neutrons being detected through absorption in \({}^{35}\mathrm{Cl}\), gamma-ray emission, Compton scattering on electrons and Čerenkov-light emission. In the third and final phase the neutrons from the NC reactions were detected by strings of proportional counters suspended in the heavy-water container.

The initial analysis of the SNO results was based on comparing the rate measured with the charged-current reaction in Eq. (66) with the rate from previous electron-scattering KamiokaNDE and SuperKamiokaNDE measurements to deduce the number of \(\nu _\mu \) and \(\nu _\tau \), using the modest sensitivity of the electron-scattering experiments to these flavours. This provided a measure of the extent to which neutrino conversion has taken place and therefore allowed an estimate of the original neutrino production rate in the solar core. The striking result was that the answer agreed, to within errors, with the predictions of standard solar models (Ahmad et al. 2001).

The decisive demonstration of neutrino conversion was obtained from measurements with SNO of the neutrino flux based on the neutral-current reaction using neutron absorption in \({{}^{2}\mathrm{D}}\) (Ahmad et al. 2002), which yielded a flux at the Earth of \(\nu _{\mathrm{e}}\) from \({}^8 \mathrm{B}\) of \((1.76 \pm 0.10)\times 10^6 \,\mathrm{cm}^{-2}\ \mathrm{s}^{-1}\) and a flux of other neutrino types (\(\nu _\mu \) and \(\nu _\tau \)) of \((3.41 \pm 0.65)\times 10^6\,\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}\). The total \({}^8 \mathrm{B}\) flux was found to be \((5.09 \pm 0.62)\times 10^6\,\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}\), which, as also indicated in Fig. 47, is consistent with solar models. The final combined results of the three phases of the SNO experiment (Aharmim et al. 2013) yielded a total \({}^8 \mathrm{B}\) flux of \((5.25 \pm 0.20)\times 10^6\,\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}\), and a measured survival probability, at neutrino energy of 10 MeV, of \(0.317 \pm 0.018\), giving a very strong confirmation of the presence of neutrino oscillations. The 2015 Nobel Prize in Physics was awarded to Arthur B. McDonald for the detection of solar-neutrino oscillations (McDonald 2016). He shared it with Takaaki Kajita, who received the prize for the detection of oscillations of muon neutrinos produced by cosmic-ray interactions in the upper atmosphere of the Earth (Kajita 2016). The heavy-water phase of the SNO experiment ended in 2006.
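As a simple consistency check on these numbers, the ratio of the \(\nu _{\mathrm{e}}\) flux to the total flux from the 2002 results gives a crude estimate of the survival probability. The sketch below propagates the quoted uncertainties as if they were independent, which is an assumption made only for illustration; the published analyses treat the correlations between the fluxes properly.

```python
import math

def ratio_with_error(num, num_err, den, den_err):
    """Ratio and its uncertainty, assuming independent Gaussian errors."""
    ratio = num / den
    relative_error = math.hypot(num_err / num, den_err / den)
    return ratio, ratio * relative_error

# Fluxes in units of 1e6 cm^-2 s^-1, as quoted from Ahmad et al. (2002)
phi_e, phi_e_err = 1.76, 0.10
phi_total, phi_total_err = 5.09, 0.62

p_ee, p_ee_err = ratio_with_error(phi_e, phi_e_err, phi_total, phi_total_err)
print(f"survival probability estimate: {p_ee:.2f} +/- {p_ee_err:.2f}")
```

The resulting estimate, roughly \(0.35 \pm 0.05\), is compatible with the much more precise final value of \(0.317 \pm 0.018\) at 10 MeV.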

A broad range of neutrino results has been obtained over the last decade from the Borexino experiment (Alimonti et al. 2009). This uses a 300-ton liquid scintillator for real-time detection of solar neutrinos, established specifically to study the neutrinos resulting from the electron-capture decay of \({{}^{7}\mathrm{Be}}\) (cf. Eq. 25); furthermore, the low background in the detector allows measurement of the \({}^8\mathrm{B}\) neutrinos down to an energy of 2.8 MeV. This provides further constraints on the energy-dependence of the oscillations between different neutrino flavours. Arpesella et al. (2008) detected the signal from the \({{}^{7}\mathrm{Be}}\) neutrino line at 0.862 MeV, and obtained a reduction relative to the model predictions which is consistent with neutrino oscillations, given the parameters determined from the earlier experiments. Also, Bellini et al. (2010) considered the \({{}^{8}\mathrm{B}}\) spectrum at energies around 8.6 MeV; comparing the results with the previous \({{}^{7}\mathrm{Be}}\) results demonstrated for the first time, using the same detector, an energy dependence of the reduction in the flux of \(\nu _{\mathrm{e}}\) neutrinos that is consistent with the matter-induced effects being important at the higher, and not the lower, energy.

The sensitivity of the Borexino detector extends to energies much lower than the energy cut-off for the pp neutrinos (cf. Fig. 46). Thus, after careful purification of the detector material Bellini et al. (2014) determined this basic flux of solar neutrinos, taking neutrino oscillations into account, to be \((6.6 \pm 0.7) \times 10^{10}\,\,\mathrm{cm}^{-2}\,\,\mathrm{s}^{-1}\), fully consistent with solar models. Also the \(\nu _{\mathrm{e}}\) survival probability in this energy range was found to be \(0.64 \pm 0.12\). Agostini et al. (2019) determined the fluxes of pp, pep and \({}^7 \mathrm{Be}\) neutrinos, whereas Agostini et al. (2020b) determined the flux of \({}^8 \mathrm{B}\) neutrinos. Combining these results with computed neutrino fluxes from solar models provides a determination of the survival probability based on data from a single experiment. The results are illustrated in Fig. 48 for two solar models, compared with a prediction based on neutrino oscillations. The results clearly follow the expected energy dependence fairly well, but with a slight preference for the model with higher abundance Z of heavy elements (see also Sect. 6.2). The heavy-element abundance more directly affects the CNO neutrinos, but Agostini et al. (2019) were only able to establish an upper limit of a factor 1.5–2 higher than the model predictions.

Fig. 48

Image reproduced with permission from Agostini et al. (2020b), preprint v1

Electron neutrino survival probability against neutrino energy, based on comparing Borexino measurements (Agostini et al. 2020b) with solar models (Vinyoles et al. 2017). The red, blue and azure points show pp, \({}^7\mathrm{Be}\) and pep neutrinos (Agostini et al. 2019), and the black and grey points show the combined and the low- and high-energy results for \({}^8\mathrm{B}\). The model in the left panel used heavy-element abundances from Grevesse and Sauval (1998), while the model in the right panel is based on the lower Asplund et al. (2009) abundances (see Sect. 6.1). The curves show computed survival probabilities based on neutrino-oscillation parameters from Esteban et al. (2017).

A combined analysis of the Borexino results was presented by Agostini et al. (2018) and is illustrated in Fig. 49. Covering all reactions making substantial contributions to the nuclear energy generation it also allowed an estimate of the total solar nuclear luminosity, after taking flavour conversion into account. The result, \(L_{\mathrm{nucl}} = (3.89^{+0.35}_{-0.42}) \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\), is fully consistent with the observed solar luminosity and provides a first demonstration of the instantaneous solar nuclear equilibrium within a precision of 10%.
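The quoted consistency and precision can be verified directly. The sketch below compares the neutrino-inferred luminosity with a nominal photon luminosity of \(3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\), which is an assumed value not given in the text; the function name is illustrative.

```python
L_SUN_ERG_S = 3.828e33   # nominal photon luminosity, erg/s (assumed value)

# Neutrino-inferred nuclear luminosity as quoted from Agostini et al. (2018)
L_NUCL = 3.89e33
ERR_PLUS = 0.35e33
ERR_MINUS = 0.42e33

def within_interval(value, centre, err_minus, err_plus):
    """True if value lies inside the asymmetric 1-sigma interval around centre."""
    return centre - err_minus <= value <= centre + err_plus

consistent = within_interval(L_SUN_ERG_S, L_NUCL, ERR_MINUS, ERR_PLUS)
relative_precision = 0.5 * (ERR_PLUS + ERR_MINUS) / L_NUCL
print(f"consistent: {consistent}, precision: {100.0 * relative_precision:.0f}%")
```

The mean of the two error bars is about 10% of the central value, matching the precision stated above, and the photon luminosity lies well within the interval.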

Fig. 49

Figure courtesy of A. Serenelli and A. Ianni

Observed and computed neutrino capture rates, for the Borexino neutrino experiments (Alimonti et al. 2009). The observations, not corrected for neutrino oscillations, were obtained from Agostini et al. (2018). The results are normalized by the computed values of the B16-GS98 model of Vinyoles et al. (2017): \(5.98 \times 10^{10} \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\) for pp, \(1.44 \times 10^{8} \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\) for pep, \(4.93 \times 10^{9} \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\) for \({{}^{7}\mathrm{Be}}\), and \(5.46 \times 10^{6} \,\mathrm{cm}^{-2} \,\mathrm{s}^{-1}\) for \({{}^{8}\mathrm{B}}\). In all cases the hatched regions indicate the \(1\,\sigma \) uncertainties.
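As a rough illustration of what "fully consistent with solar models" means for the pp flux, the oscillation-corrected value of Bellini et al. (2014) quoted earlier, \((6.6 \pm 0.7) \times 10^{10}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}\), can be compared with the B16-GS98 value given in this caption. The sketch below assumes that the model uncertainty can be neglected relative to the observational one, which is a simplification made here and not a statement from the source.

```python
# Oscillation-corrected pp flux from Bellini et al. (2014), as quoted in the text (cm^-2 s^-1)
PP_OBS = 6.6e10
PP_OBS_SIGMA = 0.7e10

# B16-GS98 model value from the caption to Fig. 49 (cm^-2 s^-1)
PP_MODEL = 5.98e10

# ASSUMPTION: the model uncertainty is neglected in this simple comparison.
normalized = PP_OBS / PP_MODEL
normalized_sigma = PP_OBS_SIGMA / PP_MODEL
deviation = (PP_OBS - PP_MODEL) / PP_OBS_SIGMA

if __name__ == "__main__":
    print(f"observed / model = {normalized:.2f} +/- {normalized_sigma:.2f}")
    print(f"deviation = {deviation:.2f} sigma")
```

The same normalization by the model value is what is used for the quantities plotted in Fig. 49, although the observations shown there are not corrected for oscillations.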

Very recently, Agostini et al. (2020a) announced a robust detection by Borexino of CNO neutrinos, at a rate that is consistent with both the low- and the high-metallicity models.

In parallel with these efforts to study neutrino conversion from solar observations, extensive terrestrial experiments have been carried out, to obtain independent determinations of the neutrino-oscillation parameters. The KamLAND detector in Japan (e.g., Eguchi et al. 2003; Gando et al. 2011) measured the flux of electron antineutrinos \({\overline{\nu }}_{\mathrm{e}}\) from commercial nuclear reactors, with a clear signal of neutrino oscillations which placed constraints on the oscillation parameters. Other experiments have been developed that direct beams of neutrinos from accelerators towards neutrino detectors, over distances of several hundred kilometers. A beam of muon antineutrinos \({{\bar{\nu }}}_\mu \) from the Fermilab accelerator in Illinois was analysed in the MINOS experiment (Adamson et al. 2012) with two detectors: a near detector 1 km from the neutrino source and a far detector at the Soudan Underground Laboratory, 735 km away in Minnesota. The OPERA experiment (Agafonova et al. 2015, 2018) used a beam of neutrinos from CERN at Geneva to search for conversions from muon to tau neutrinos at the Gran Sasso Laboratory, 730 km away. In the NOvA experiment (e.g., Adamson et al. 2016) a beam of muon neutrinos was sent from the Fermilab accelerator to a detector in Ash River, Minnesota, 810 km away. The T2K experiment (e.g., Abe et al. 2017) sends a neutrino beam from the J-PARC accelerator at Tokai, Japan, to the Super-Kamiokande detector, 295 km away. A review of such accelerator experiments was provided by Nakaya and Plunkett (2016). However, the size of the Earth sets a natural limit to the scale of terrestrial experiments. Thus, observations of solar neutrinos remain a very important possibility for studying the properties of the neutrino experimentally, including the MSW effect and its consequences for the energy dependence of the survival probability. Bergström et al. (2016) analysed the solar and terrestrial neutrino data, as a basis for a comparison with the predicted solar model results. A comprehensive analysis of the available data, both solar and terrestrial, was carried out by Esteban et al. (2017), leading to the computed probability shown in Fig. 48, while Maltoni and Smirnov (2016) discussed the importance of solar neutrinos in investigations of neutrino physics and the resulting extensions beyond the Standard Model of particle physics.

With the improved understanding of the properties of the neutrinos, and with further neutrino experiments, we may increasingly use the observations of solar neutrinos as constraints on the properties of the solar core, complementary to those provided by helioseismology. An interesting example of such combined analysis of helioseismic and neutrino-based observations, to which I return in Sect. 6.2, was provided by Song et al. (2018). The present situation was summarized concisely and accurately by Haxton et al. (2013): “Effectively, the recent progress made on neutrino mixing angles and mass differences has turned the neutrino into a well-understood probe of the Sun. We now have two precise tools, helioseismology and neutrinos, that can be used to see into the solar interior. We have come full circle: The Homestake experiment was to have been a measurement of the solar core temperature, until the solar neutrino problem intervened”.

5.3 Abundances of light elements

The present solar surface abundances are the result of the composition of the initial Sun as well as the processes which may have modified the composition. Thus in this sense they represent an archaeological record of solar evolution. Since the convective envelope is mixed on a timescale of a few months its composition is uniform. Thus the relevant aspects are the evolution of the composition beneath the convection zone, as well as mixing processes which might link the composition in the deep solar interior to the solar surface. Furthermore, substantial mass loss may expose material from deeper layers at the surface (see also Sect. 6.5). In this manner the surface composition provides a time integral over solar evolution of the processes in the solar interior.

For refractory elements, meteoritic abundances provide a measure of the initial solar composition, e.g., measured relative to the abundance of silicon which has presumably not been significantly affected by processes in the solar interior. Very interesting cases are the light elements lithium, beryllium and boron which are destroyed over the solar lifetime by nuclear reactions at temperatures found in the solar interior. Specifically, lithium is very substantially reduced over a period corresponding to the solar age at temperatures above \(2.5 \times 10^6 \,\mathrm{K}\), while the corresponding critical temperatures for beryllium and boron are \(3.5 \times 10^6 \,\mathrm{K}\) and \(5 \times 10^6 \,\mathrm{K}\), respectively. Thus mixing down to these temperatures, or mass loss exposing material that has been at such temperatures, should be reflected in reductions in the abundances of these elements relative to the meteoritic values.

No significant depletion is found for boron (Cunha and Smith 1999) or beryllium (Balachandran and Bell 1998; Asplund 2004), limiting mixing to temperatures below \(3.5 \times 10^6 \,\mathrm{K}\). However, as mentioned in Sect. 2.2, the present solar surface abundance of lithium has been reduced by a factor of around 150 (Asplund et al. 2009) relative to the meteoritic value, indicating mixing to temperatures exceeding \(2.5 \times 10^6 \,\mathrm{K}\). As noted by Schatzman (1969) this is substantially higher than the temperature \(T_{\mathrm{bc}}\) at the base of the solar convection zone, during solar evolution (see footnote 57). Thus additional mixing, or mass loss, is required to account for the lithium depletion. It should be noted, however, that possible depletion in pre-main-sequence evolution, including a likely fully mixed convective phase, must be taken into account in using the lithium abundance as a diagnostic. Theoretical (e.g. Piau and Turck-Chièze 2002) and observational (e.g. Bouvier et al. 2018b) studies show that this depletion can be substantial, but strongly dependent on the details of the evolution, including effects of rotation. Also, the general uncertainty about the early evolution of stars (see Sect. 3.1) must be taken into account.

Additional constraints on mixing or mass-loss processes are provided by the ratio \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) between the abundances by number of \({{}^{3}\mathrm{He}}\) and \({{}^{4}\mathrm{He}}\). This has been measured in the solar wind and in lunar material, as deposited from the solar wind. As shown in Fig. 7 the nuclear reactions in the PP chains cause a build-up of the \({{}^{3}\mathrm{He}}\) abundance with solar evolution. This does not extend to the base of the convection zone; however, mixing extending substantially deeper (or corresponding mass loss) would evidently cause an increase in the isotope ratio at the solar surface and hence in the solar wind.

The observational evidence was discussed by Bochsler et al. (1990). They noted that the initial \({{}^{2}\mathrm{D}}\) in the Sun has been converted to \({{}^{3}\mathrm{He}}\) through the second reaction in the PP-I chain (cf. Eq. 24) which takes place at temperatures as low as those found in the present solar convection zone. From estimates of the primordial solar-system content of \({{}^{2}\mathrm{D}}\) and \({{}^{3}\mathrm{He}}\) they consequently estimated the initial ratio \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) in the Sun as around \(4.4 \times 10^{-4}\). From solar-wind measurements, either from satellites or from foils exposed in the Apollo missions, they found very similar values at present; also, analyses of lunar material indicate that the ratio has not varied much over the last few billion years (Heber et al. 2003). An investigation of the composition of the solar wind with the Genesis spacecraft yielded a value of \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}= (4.64 \pm 0.09) \times 10^{-4}\) (Heber et al. 2009), consistent with the earlier results (see footnote 58). The general conclusion, therefore, is that there has been little if any enrichment of the solar convection zone with \({{}^{3}\mathrm{He}}\) during solar evolution.

Models that include appropriately varying enhancements of the diffusion coefficient, assumed to be caused by turbulence, can indeed account for the observed lithium depletion (e.g., Vauclair et al. 1978; Schatzman et al. 1981; Lebreton and Maeder 1987). In the latter two cases the \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio was also considered, yielding a modest increase during solar evolution which may be inconsistent with the present observational situation. Christensen-Dalsgaard et al. (1992) presented a detailed analysis of simpler models, assuming rapid mixing over a region below the convection zone and taking into account the variation with time of the extent of the mixed region. They found that to a good approximation the typical lithium-destruction timescale, averaged over the mixed region and over solar age, could be approximated as twice the timescale at the base of the mixed region in the present Sun.
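The orders of magnitude involved can be illustrated with a deliberately crude estimate. The sketch below assumes that the surface lithium abundance decays exponentially with a single constant effective timescale over the solar age, and that all of the depletion by a factor of 150 occurred in this way; it therefore ignores the pre-main-sequence depletion discussed above. The age of 4.57 Gyr is the meteoritic value quoted elsewhere in this review, and the factor of two is the approximation attributed above to Christensen-Dalsgaard et al. (1992). The function names are hypothetical.

```python
import math

DEPLETION_FACTOR = 150.0   # reduction of surface lithium relative to meteoritic value
SOLAR_AGE_GYR = 4.57       # solar age used for the estimate


def depletion_in_dex(factor):
    """Depletion expressed on the logarithmic (base-10) abundance scale."""
    return math.log10(factor)


def effective_timescale(age, factor):
    """Timescale tau such that exp(-age / tau) = 1 / factor.

    ASSUMPTION: simple exponential decay with constant tau.
    """
    return age / math.log(factor)


if __name__ == "__main__":
    tau_eff = effective_timescale(SOLAR_AGE_GYR, DEPLETION_FACTOR)
    # Averaged timescale is approximated as twice the present timescale
    # at the base of the mixed region.
    tau_base = tau_eff / 2.0
    print(f"depletion: {depletion_in_dex(DEPLETION_FACTOR):.2f} dex")
    print(f"effective destruction timescale: {tau_eff:.2f} Gyr")
    print(f"implied present timescale at base of mixed region: {tau_base:.2f} Gyr")
```

Under these assumptions the required effective timescale is somewhat below 1 Gyr; any pre-main-sequence depletion would lengthen the timescale required on the main sequence.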

The helioseismic investigations have provided further information about conditions at the base of the convection zone. As discussed in Sect. 5.1.2, the localized difference between the solar and model sound speed beneath the convection zone (see Fig. 39) may indicate that the gradient in the hydrogen abundance, caused by helium settling, is too steep in this part of the models, suggesting the need for additional mixing. Also, the sharp gradient in the angular velocity in the tachocline (cf. Sect. 5.1.4) could give rise to dynamical processes leading to such mixing. Richard et al. (1996) considered rotationally induced turbulent mixing, following the description of Zahn (1992). They obtained a reasonable sound-speed profile below the convection zone, as well as the observed lithium depletion and the then assumed depletion of beryllium by a factor of two. In a similar analysis, Brun et al. (1999) obtained a smoothed sound-speed difference relative to the helioseismic results, together with the required lithium depletion, no depletion of beryllium, and a \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio consistent with the inferred values. Lithium destruction was also considered in the magnetically dominated model by Gough and McIntyre (1998) of the origin of the tachocline. Clearly any modelling of these effects should aim at simultaneously reducing the sound-speed difference just below the convection zone and obtaining the observed surface lithium abundance. A recent analysis by Jørgensen and Weiss (2018), based on various forms of imposed turbulent diffusion assumed to arise from convective overshoot, suggests that this may not be straightforward. In Sect. 6.5 I return to this issue, based on complex modelling by Zhang et al. (2019) including early accretion, mass loss and turbulent mixing.

It is evident that the diagnostics of the solar internal structure provided by these abundance determinations are less precise than those obtained from helioseismology. However, they must be kept in mind as constraints on any solar models. In particular, they provide integral measures of the dynamics in the solar interior over the solar lifetime, which is clearly closely related to the evolution of the solar internal rotation. More generally, the observed dependence on stellar parameters of lithium depletion in solar-like stars is an important diagnostic of these processes, and the solar results must be understood in this context (see, for example, Charbonnel and Talon 2005, and Sect. 7). Asteroseismic information about the internal rotation of stars is extremely important in this connection.

6 The solar abundance problem

The models presented in the preceding sections can be regarded as ‘classical’ solar models of the late twentieth century; they have been computed using well-established physics, including diffusion and settling, and are based on the observed parameters of the epoch. Interestingly, as discussed in Sect. 5, they are in reasonable if not full agreement with the helioseismic inferences and with the latest neutrino detections, taking into account flavour transitions. In this sense it is perhaps reasonable to regard them as ‘standard’ solar models.

Even so, the models obviously can, and should, be questioned. The remaining differences in structure and physics between the Sun and the model, discussed in Sect. 5.1.2, obviously need to be understood. More seriously, since around 2000 new determinations of the solar surface composition have led to substantial discrepancies between the resulting solar models and the helioseismic results, forcing us to reconsider the computation of solar models. This is discussed below. Indeed, we obviously need to question the simplified assumptions underlying the ‘standard’ model computation. One remaining serious uncertainty of potentially important consequences for solar evolution is the treatment of the loss and redistribution of angular momentum, briefly discussed in Sect. 5.1.4. Below I address a second issue, namely the assumption of no significant mass loss during solar evolution, which was considered by Sackmann and Boothroyd (2003) in connection with the ‘faint early Sun’ problem.

6.1 Revisions to the inferred solar composition

The ‘classical’ modelling of the solar atmosphere in terms of a static horizontally homogeneous layer is obviously oversimplified, given the highly inhomogeneous and dynamic nature of the atmosphere. This affects the profiles of the solar spectral lines and hence the determination of solar abundances. In such analyses a semi-empirical mean structure of the atmosphere is often used, based on observed properties such as the limb-darkening function, i.e., the variation in intensity with position on the solar disk; typical examples are Holweger and Müller (1974) and Vernazza et al. (1981). The dynamical aspects of the atmosphere are represented in terms of parameterized ‘micro- and macro-turbulence’, adjusted to match the observed line profiles. A second important issue is the departure from local thermodynamical equilibrium (LTE) in the population of the different states of ionization and excitation in the atoms in the atmosphere. Proper treatment of such non-LTE (NLTE) effects requires detailed accounting of the different radiative and collisional processes that affect the population (e.g., Mihalas 1978).

As discussed in Sect. 2.5, hydrodynamical simulations of solar convection now yield a realistic representation of conditions in the uppermost parts of the solar convection zone and in the solar atmosphere. In particular, the spectral line profiles can be reproduced without the use of additional parameters. Application of these results to the determination of the solar abundance (Allende Prieto et al. 2001, 2002; Asplund et al. 2004, 2005b) provided increasingly strong evidence for a need to revise the solar composition; in particular, the inferred abundances of carbon, nitrogen and oxygen were lower than previous determinations by more than 30%, resulting in \(Z_{\mathrm{s}}/X_{\mathrm{s}} = 0.0165\) (compared, e.g., with the value 0.0245 obtained by Grevesse and Noels (1993) and used in Model S). Overviews of these initial results were provided by Asplund (2005) and Asplund et al. (2005a) (in the following AGS05).
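Since abundances are usually quoted on a logarithmic scale, it may be helpful to translate the revisions quoted above into that scale. The following minimal sketch uses only the numbers given in this paragraph; the helper name is hypothetical, and the 30% figure is treated as an exact ratio purely for illustration.

```python
import math


def dex_change(ratio):
    """Change on the logarithmic (base-10) abundance scale for a given abundance ratio."""
    return math.log10(ratio)


# Values of Z_s / X_s quoted in the text
ZX_REVISED = 0.0165    # from the revised composition
ZX_GN93 = 0.0245       # Grevesse and Noels (1993), used in Model S

if __name__ == "__main__":
    # A reduction by 30% corresponds to a ratio of 0.70
    print(f"30% reduction  : {dex_change(0.70):+.3f} dex")
    ratio = ZX_REVISED / ZX_GN93
    print(f"Z_s/X_s ratio  : {ratio:.3f}")
    print(f"change in Z_s/X_s: {dex_change(ratio):+.3f} dex")
```

A reduction by more than 30% thus corresponds to a change exceeding about 0.15 dex, and the quoted change in \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) amounts to a reduction by roughly one third.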

As reviewed in detail by Basu and Antia (2008) and discussed extensively below, these changes in the composition assumed in solar modelling led to substantial changes in the model structure and a drastic increase in the helioseismically inferred difference between the Sun and the model (see Fig. 51), leading to questioning of the new abundance determinations. For example, Ayres et al. (2006) criticized the atmospheric models obtained from the hydrodynamical simulations, on the grounds that they failed to match the observed centre–limb variation over the solar disk in the continuum intensity; a similar objection was raised by Pinsonneault and Delahaye (2009). Also, Ayres et al. analysed weak CO features and obtained an abundance consistent with the old determinations.

Following this initial work, the Asplund et al. (AGS05) analysis was updated by improved hydrodynamical models; these did indeed, for the first time, succeed in reproducing the observed limb-darkening function, over a broad range of wavelengths (Pereira et al. 2013). Furthermore, the analysis included careful consideration of NLTE effects, whenever possible, and of the choice of atomic input data and of spectral lines and effects of line blending. The resulting comprehensive composition results were presented by Asplund et al. (2009) (in the following AGSS09). The revision led to a slight general increase in the abundances, although still far from recovering the old values. A recent update on these determinations was provided by Grevesse (2019).

Table 4 lists selected abundances from several determinations, including earlier results typically used in the computation of ‘standard’ solar models; I return to the Caffau et al. (2011) results below.

Table 4 Selected solar photospheric abundances in terms of number densities, on a logarithmic scale normalized such that \(\log N_{\mathrm{H}} = 12\), where \(N_{\mathrm{H}}\) is the number density of hydrogen and \(\log \) is to base 10; \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) is the corresponding ratio between the abundances by mass of heavy elements and hydrogen

Interestingly, the revision to the solar abundances brings them more closely in line with stars or other objects in the solar neighbourhood (e.g., Turck-Chièze et al. 2004; Morel 2009); in contrast, the previous solar abundances tended to be substantially higher than those of nearby hotter and therefore generally younger stars, in conflict with the expectations of galactic chemical evolution. This issue was further analysed by Nieva and Przybilla (2012), on the basis of a characterization of the composition of matter in the present solar neighbourhood based on extensive observations of abundances of early B-type stars. They found that, even with the AGSS09 abundances, the Sun was substantially over-abundant compared with the solar neighbourhood and concluded on this basis that the Sun was formed in a region at a Galactocentric distance of 5–6 kpc, where the heavy-element abundance was higher, and has subsequently migrated to its present distance of 8 kpc.

Independent hydrodynamical modelling and abundance analysis is obviously highly desirable. Caffau et al. (2008) used the \(\mathrm{CO^5BOLD}\) code (Freytag et al. 2002; Wedemeyer et al. 2004; see footnote 59) to determine the oxygen abundance, obtaining \(8.76 \pm 0.07\) on the logarithmic scale used in Table 4. It appears that the quite substantial increase relative to AGS05 was caused in part by a different assignment of the continuum in the abundance analysis, resulting in increased equivalent widths of the lines considered. Also, Caffau et al. (2009) similarly determined the nitrogen abundance as \(7.86 \pm 0.12\); from these determinations they obtained \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0213\), relatively close to the old determinations. An overview of the results of these efforts is also included in Table 4, based on Caffau et al. (2011) (in the following C11) and supplemented by Lodders (2010). A more careful comparison between the assumptions and results of these different abundance analyses is certainly needed, to understand the differences between these results and those of Asplund et al. (2009). Interestingly, a comparison carried out by Beeck et al. (2012) of different hydrodynamical simulations of the solar near-surface layers, including the so-called Stagger code (e.g. Collet et al. 2011; see footnote 60), which is closely related to the codes used in the analyses by Asplund et al., and the \(\mathrm{CO^5BOLD}\) code, found good agreement in the mean structure and turbulent behaviour between the codes. This suggests that the differences between the AGSS09 and C11 compositions arise from more subtle aspects of the analysis including, as also hinted above, the basic analysis of the observations.

The noble gases present particular problems for the abundance determination, since they have no lines in the solar photospheric spectrum. Particularly important is neon which, as shown in Fig. 5, makes a substantial contribution to the opacity. Estimates of the abundances can be obtained from the solar wind or solar energetic particles, or from lines formed in the higher layers of the solar atmosphere, including the corona. These determinations suffer from the uncertain effects of element separation in the solar corona which depends on the first ionization potential (the so-called FIP effect, e.g., Marsch et al. 1995; Laming 2015). An alternative technique, used by AGSS09, is to determine the ratio, e.g., Ne/O between a noble gas and oxygen which may be expected to suffer approximately the same separation effect, and hence the neon abundance. Recently Young (2018) provided a re-assessment of data on the transition region in the quiet Sun, on the basis of new atomic data, to obtain a higher Ne/O ratio and a logarithmic neon abundance of 8.08. Given the derivative in Fig. 5 this increase in the neon abundance of around 40% relative to the assumed AGSS09 value would correspond to an increase in the opacity of up to 5%, just below the convection zone.
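The numbers in the last two sentences can be related by simple arithmetic. The sketch below infers the previously assumed logarithmic neon abundance from the quoted new value and the quoted increase of around 40%, and estimates the implied logarithmic sensitivity of the opacity to the neon abundance from the quoted opacity change of up to 5%. Treating both percentages as exact and the response as a single power law is an assumption made here for illustration; the helper names are hypothetical.

```python
import math

LOG_NE_NEW = 8.08         # logarithmic neon abundance from Young (2018), as quoted
NE_FACTOR = 1.40          # increase of around 40% relative to the AGSS09 value
OPACITY_FACTOR = 1.05     # opacity increase of up to 5% just below the convection zone


def previous_log_abundance(log_new, factor):
    """Logarithmic abundance before an increase by the given factor."""
    return log_new - math.log10(factor)


def implied_log_derivative(opacity_factor, abundance_factor):
    """d ln(kappa) / d ln(abundance), assuming a power-law response (ASSUMPTION)."""
    return math.log(opacity_factor) / math.log(abundance_factor)


if __name__ == "__main__":
    print(f"implied earlier log abundance: {previous_log_abundance(LOG_NE_NEW, NE_FACTOR):.2f}")
    print(f"implied d ln kappa / d ln Ne : {implied_log_derivative(OPACITY_FACTOR, NE_FACTOR):.2f}")
```

Since the opacity change is quoted as an upper bound, the derivative obtained in this way should be read as an approximate maximum value near the base of the convection zone, not as a value read off Fig. 5.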

Detailed reviews on the solar and solar-system composition, with emphasis on the abundance determinations of refractory elements in meteorites, were also given by Lodders (2003, 2010) and Lodders et al. (2009). Indeed, Vinyoles et al. (2017) argued that the meteoritic abundances of the refractory elements are likely more accurate, and certainly more precise, than the photospheric abundances and hence should be used in preference, when available. In practice, the resulting differences for the elements listed in Table 4 are very small.

6.2 Effects on solar models of the revised composition

The effect on solar models of the change in the heavy-element abundance arises predominantly from the resulting change in the opacity. From Fig. 5 it is obvious that oxygen makes a large contribution to the opacity in much of the interior. Thus the reduction in the oxygen abundance leads to a decrease in the opacity; this reduces the depth of the convection zone, as well as the temperature gradient in the radiative interior, leading to a reduction in the sound speed in much of the interior. An extensive review of the effects on solar models, and the broader consequences for modelling other stars, was provided by Buldgen et al. (2019a).

To illustrate these effects I consider Models [AGS05] and [AGSS09] together with Models S and [GS98], summarized in Table 5. The effects on the model of the present Sun of the changes in composition, relative to Model S, are shown in Fig. 50 and Table 6. The substantial decrease in the sound speed in the radiative interior is obvious, as is the reduction in the depth of the convection zone. Also, according to Eq. (36) the reduction in the heavy-element abundance must be balanced by a reduction in the mean molecular weight, to keep the luminosity fixed, and hence by an increase in X, as shown in Fig. 50, and a corresponding decrease of the helium abundance in the convection zone (see Table 6).
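The link between the hydrogen abundance and the mean molecular weight can be made concrete with the standard expression for a fully ionized gas, \(\mu \simeq (2X + \tfrac{3}{4} Y + \tfrac{1}{2} Z)^{-1}\). The sketch below evaluates it for two compositions. Both compositions are hypothetical round numbers chosen for illustration; they are not the values in Table 6, and full ionization is itself an approximation.

```python
def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas (ASSUMPTION: full ionization).

    X and Z are the abundances by mass of hydrogen and heavy elements;
    the helium abundance follows from Y = 1 - X - Z.
    """
    Y = 1.0 - X - Z
    return 1.0 / (2.0 * X + 0.75 * Y + 0.5 * Z)


if __name__ == "__main__":
    # HYPOTHETICAL compositions, for illustration only (not taken from Table 6)
    mu_old = mean_molecular_weight(X=0.70, Z=0.020)
    mu_new = mean_molecular_weight(X=0.72, Z=0.013)
    print(f"higher-Z composition: mu = {mu_old:.4f}")
    print(f"lower-Z composition : mu = {mu_new:.4f}")
    print(f"relative change     : {100 * (mu_new / mu_old - 1):+.2f}%")
```

Because hydrogen contributes two particles per unit atomic mass while helium contributes only three quarters, replacing helium by hydrogen lowers \(\mu\), which is the sense of the change described above.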

Table 5 Parameters of solar models. Age, \(R\) and \(L\) are for the model of the present Sun

Table 6 Characteristics of the models in Table 5 with updated abundances. \(Y_0\) and \(Z_0\) are the initial helium and heavy-element abundances, \(T_{\mathrm{c}}\), \(\rho _{\mathrm{c}}\) and \(X_{\mathrm{c}}\) are the central temperature, density and hydrogen abundance of the model of the present Sun, \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) is the present ratio between the surface heavy-element and hydrogen abundances, \(Z_{\mathrm{s}}\) and \(Y_{\mathrm{s}}\) are the surface heavy-element and helium abundances, and \(d_{\mathrm{cz}}\) is the depth of the convective envelope. The last lines give helioseismically inferred solar values of \(Y_{\mathrm{s}}\) (Basu and Antia 2004) and \(d_{\mathrm{cz}}/R\) (Christensen-Dalsgaard et al. 1991; Basu and Antia 1997). For details on the models, see the caption to Table 5
Fig. 50

Model changes at fixed fractional radius resulting from the use of the Asplund et al. (2005a) abundances, relative to Model S, in the sense (Model [AGS05])–(Model S). The line styles are defined in the figure. The thin dotted line marks zero change. The thinner grey and magenta lines show the corresponding differences for Model [AGSS09] using the AGSS09 composition

These changes in the solar model have had a drastic effect on the comparison with the helioseismic results. An extensive review of these consequences was provided by Basu and Antia (2008), while a more recent, but brief, review is in Serenelli (2016). The maximum relative change in \(c^2\) resulting from using AGS05, around 2%, is substantially larger than the difference between the Sun and Model S illustrated in Fig. 39 and of the opposite sign. Thus the new abundances greatly increase the discrepancy between the model and helioseismically inferred solar sound speed. This is illustrated in Fig. 51. As expected from Fig. 50, the effect on the sound speed extends through much of the radiative interior; in particular, it is not only a consequence of the error in the depth of the convection zone of the model (see also Fig. 56). Also, as illustrated in Table 6 the envelope helium abundance and convection-zone depth of the model differ strongly from the helioseismically inferred values. Using the more recent AGSS09 composition reduces the discrepancies with the helioseismic results somewhat (see Fig. 51; Table 6) although they remain substantial.

Fig. 51

Inferred difference in squared sound speed between the Sun and three solar models, in the sense (Sun)–(model). The open circles use Model S (cf. Fig. 39), the filled circles the corresponding Model [AGS05] based on the Asplund et al. (2005a) composition and the stars Model [AGSS09] based on the Asplund et al. (2009) composition. The vertical bars show \(1\,\sigma \) errors in the inferred values, based on the errors, assumed statistically independent, in the observed frequencies. The horizontal bars provide a measure of the resolution of the inversion

It was in fact immediately obvious that the revised composition created problems in matching solar models to the helioseismic inferences. Basu and Antia (2004) considered envelope models, demonstrating that a substantial increase in opacity would be needed to bring the models in accordance with the seismic observations. A similar conclusion was reached by Bahcall et al. (2004), based on the depth of the convection zone. Guzik and Watson (2004), Montalbán et al. (2004), Turck-Chièze et al. (2004), and Bahcall et al. (2005a) showed that the sound speed in models with the revised composition differed much more from the helioseismically determined behaviour than for models with the old composition, as illustrated in Fig. 51. In a detailed analysis based on the convection-zone depth and envelope helium abundance Delahaye and Pinsonneault (2006) concluded that models with the AGS05 composition were inconsistent with the helioseismic inferences to a very high degree of significance, while models with the old composition were essentially consistent with the observations.

Other aspects of the solar oscillation frequencies show similarly large inconsistencies for models computed with the revised abundances. A convenient measure of conditions in the solar core is provided by the small frequency separations \(\delta \nu _{nl} = \nu _{nl} - \nu _{n-1 \, l+2}\), where \(\nu _{nl}\) is the cyclic frequency of a mode of degree l and radial order n; according to asymptotic theory (e.g. Tassoul 1980) these separations are largely determined by the sound-speed gradient in the stellar core (cf. Eq. 60). Basu et al. (2007) considered a broad range of models with varying composition and opacity tables and found that models with the GS98 composition were largely consistent with the observed values, whereas the AGS05 composition resulted in a very significant departure from the observations. A detailed analysis of this nature was presented by Chaplin et al. (2007), who carried out fits to the observations to constrain the heavy-element abundance; this resulted in a lower limit of \(Z = 0.0187\), far higher than in the models computed with AGS05 or AGSS09. Zaatri et al. (2007) also found that the AGS05 composition resulted in small frequency separations that were inconsistent with observations. An illustration of the effect of the composition on the small frequency separations can be obtained by considering the calibration of the solar age, based on fits to the small separations, following Dziembowski et al. (1999) and Bonanno et al. (2002). Here the age is determined from \(\chi ^2\) fits to \(\delta \nu _{n0}\), for models of varying age but calibrated to the correct radius, luminosity and assumed surface composition as characterized by \(Z_{\mathrm{s}}/X_{\mathrm{s}}\). As illustrated in Fig. 52, using the old composition the best-fitting model has an age of 4.57 Gyr, very close to the age of \(4.570 \pm 0.006 \,\mathrm{Gyr}\) obtained from meteorites (Wasserburg, in Bahcall and Pinsonneault 1995), and the minimum \(\chi ^2\) is reasonable. 
On the other hand, with the AGS05 composition the best-fitting model leads to a high \(\chi ^2\) at an age of 4.83 Gyr which is much higher than the proper solar age, while using AGSS09 leads to a best-fitting age of 4.77 Gyr; in both cases the fits are inconsistent with the meteoritic age, at a high level of significance. The inconsistencies involving properties sensitive to the solar core clearly underline that the effects of the revised composition are not confined to the vicinity of the convection zone.
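The quantities entering such a fit are straightforward to compute from a table of mode frequencies. The following minimal sketch forms the small separations \(\delta \nu _{nl} = \nu _{nl} - \nu _{n-1 \, l+2}\) and a simple \(\chi ^2\) between observed and model values. It is not the procedure of the cited papers, which is not specified in detail here: the errors are assumed to be independent, and the function names and the frequencies in the usage example are hypothetical.

```python
def small_separations(freqs, l=0):
    """Return {n: nu[n, l] - nu[n-1, l+2]} for every n where both modes are present.

    freqs maps (radial order n, degree l) to a cyclic frequency.
    """
    seps = {}
    for (n, degree), nu in freqs.items():
        if degree != l:
            continue
        partner = freqs.get((n - 1, l + 2))
        if partner is not None:
            seps[n] = nu - partner
    return seps


def chi_square(observed, model, sigma):
    """Sum of squared, error-normalized differences over the orders common to both sets.

    ASSUMPTION: the errors sigma[n] are statistically independent.
    """
    common = sorted(set(observed) & set(model))
    return sum(((observed[n] - model[n]) / sigma[n]) ** 2 for n in common)


if __name__ == "__main__":
    # HYPOTHETICAL frequencies in microhertz, for illustration only
    obs_freqs = {(21, 0): 3034.0, (20, 2): 3025.0, (22, 0): 3169.0, (21, 2): 3160.5}
    obs = small_separations(obs_freqs, l=0)
    model = {21: 9.5, 22: 8.5}
    errors = {21: 0.25, 22: 0.25}
    print("observed small separations:", obs)
    print("chi-square:", chi_square(obs, model, errors))
```

In an age calibration of the kind described above, `chi_square` would be evaluated for a sequence of calibrated models of different age, and the best-fitting age taken at its minimum.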

Fig. 52

Adopted from Christensen-Dalsgaard (2009)

Goodness of fit for the small frequency separation \(\delta \nu _{n0} = \nu _{n0} - \nu _{n-1\,2}\), fitting solar models of varying age to the observations of Chaplin et al. (2007). All models were calibrated to the observed surface luminosity and radius and a specified value of \(Z_{\mathrm{s}}/X_{\mathrm{s}}\). The solid curve shows results for the GN93 composition, the dashed curve for the AGS05 composition and the dot-dashed curve for the AGSS09 composition. The vertical dotted lines indicate the interval of solar age obtained by Wasserburg, in Bahcall and Pinsonneault (1995).

An early analysis based on the AGSS09 abundances was carried out by Serenelli et al. (2009) who in addition to the purely photospheric AGSS09 composition provided in Table 4 considered the effects of replacing the abundances of certain elements, such as magnesium and iron, with the probably more reliable meteoritic values. This resulted in a slight decrease in the metallicity, relative to AGSS09 as analysed here, and a corresponding small increase in the sound-speed discrepancy. In a broad review of the solar interior Basu et al. (2015) compared models computed with the different abundances listed in Table 4 with the helioseismic results. Interestingly, the C11 abundances gave results very similar to those for GS98, despite the lower CNO abundances; these were compensated by other abundance differences, leading to roughly similar opacities.

A very extensive analysis of the effects of the revised composition on solar models was carried out by Vinyoles et al. (2017). As did Serenelli et al. (2009) they included meteoritic abundances of refractory elements, resulting in what they called the AGSS09met composition. The modelling used up-to-date physics: the FreeEOS equation of state, reaction rates based on an update of those provided by Adelberger et al. (2011), OP opacities (Badnell et al. 2005), and diffusion using the formulation of Thoul et al. (1994). A careful analysis was carried out of the errors in the model, based on the errors in the input parameters, particularly the composition, the nuclear reactions and the opacity; the opacity uncertainty was scaled according to the difference between OP and OPAL opacities, as well as the results of the Bailey et al. (2015) experiments (discussed in Sect. 6.4), and assumed to vary linearly with \(\log T\). Figure 53 shows the resulting relative sound-speed difference using the AGSS09met composition, compared with corresponding results using GS98, obtained from inversion of a combination of BiSON and MDI frequencies. The dominant modelling uncertainty, common to the two sets of results, is shown as the red shaded region for the AGSS09met results. The shaded grey area illustrates what the authors take to be the uncertainty resulting from the inversion; it includes a modest contribution from the choice of reference model which, given that the inversion is carried out directly based on the model and the observed frequencies, is essentially irrelevant (see Sect. 5.1.2). They also compared the envelope helium abundance and the location of the base of the convection zone in the models with the helioseismically inferred values; results for their AGSS09 model are also shown in Table 6 and are clearly similar to those for Model [AGSS09]. The conclusion of the analysis was a statistically significant preference for the GS98 composition; this was particularly strong when excluding from the comparison the bump in \(\delta c/c\) just below the convection zone, which is likely associated with mixing processes missing in the modelling (see Fig. 41). The uncertainty in the inferred sound-speed difference in Fig. 53 is dominated by the abundance uncertainties and the assumed range in the opacity uncertainty. A similar study, although with a more sophisticated treatment of the opacity uncertainty and involving a reconstruction of the solar opacity profile, was carried out by Song et al. (2018). I return to the effects of opacity below.

Fig. 53

Image reproduced with permission from Vinyoles et al. (2017), copyright by AAS

Relative sound-speed differences, in the sense (Sun)–(model), at fixed fractional radius. The red curve is based on a model using the AGSS09 abundances, updated with meteoritic abundances for the refractory elements, while the blue solid curve used the GS98 abundances (the dashed curve corresponds to an older GS98 calculation). The red shaded region shows estimated effects of modelling uncertainties, while the grey shaded band is an estimate of the effects of errors in inferring the sound speed from results of an inversion. (For comparison with, e.g., Fig. 51, note that the latter shows differences in squared sound speed.)

Introducing the AGS05 abundances had a relatively modest effect on the predicted neutrino production compared with the uncertainties in the predictions and observations (Turck-Chièze et al. 2004; Bahcall et al. 2005a, c). This is a consequence of the small reduction in the temperature in the core of the model, around 1% (see Fig. 50), leading to a reduction of around 20% in the flux of \({{}^{8}\mathrm{B}}\) neutrinos, with smaller changes in the other neutrino fluxes. A detailed investigation of the uncertainties in the predicted neutrino flux was carried out by Bahcall and Serenelli (2005); they found that their so-called conservative uncertainties in the surface composition, estimated from the differences between the compositions of individual elements in the emerging revised determinations and GS98, still provided the largest contribution to the total uncertainty in the computed neutrino fluxes. As part of their detailed revised solar modelling, discussed above, Vinyoles et al. (2017) carried out a careful analysis of the impact of the AGSS09 composition on the predicted solar neutrinos, including a determination of the uncertainties in the predictions taking into account other uncertainties in the modelling, and using updated measured neutrino fluxes, including the Borexino \({{}^{7}\mathrm{Be}}\) results. They found reductions, although barely significant compared with the model uncertainties, in the \({{}^{8}\mathrm{B}}\) and \({{}^{7}\mathrm{Be}}\) fluxes, as a result of the reduction in the core temperature (see Fig. 50); comparison with the observations showed a slight but insignificant preference for the GS98 composition, as also hinted by Fig. 48. Agostini et al. (2018) presented the effect of the solar composition on the \({{}^{8}\mathrm{B}}\) and \({{}^{7}\mathrm{Be}}\) neutrino fluxes (cf. Fig. 54) and concluded that the neutrino data provide no significant distinction between the GS98 and AGSS09 compositions. Similarly, Bergström et al. (2016) concluded that current neutrino data have “absolutely no preference for either [the GS98 or the AGSS09] model”.
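The link between a reduction of around 1% in the core temperature and a reduction of around 20% in the \({{}^{8}\mathrm{B}}\) flux can be checked with a simple power-law scaling, \(\varPhi \propto T_{\mathrm{c}}^{\beta }\). The sketch below is only an order-of-magnitude illustration: the exponents (about 24 for \({{}^{8}\mathrm{B}}\) and about 10 for \({{}^{7}\mathrm{Be}}\)) are approximate values commonly quoted in the literature and are assumptions here, not results taken from the analyses cited above.

```python
def flux_change(delta_T_rel, beta):
    """Relative flux change for Phi proportional to T**beta,
    given a relative change delta_T_rel in the core temperature."""
    return (1.0 + delta_T_rel) ** beta - 1.0

BETA_B8 = 24.0    # assumed approximate exponent for the 8B flux
BETA_BE7 = 10.0   # assumed approximate exponent for the 7Be flux

d_b8 = flux_change(-0.01, BETA_B8)     # 1% cooler core
d_be7 = flux_change(-0.01, BETA_BE7)
print(f"8B: {d_b8:+.3f}, 7Be: {d_be7:+.3f}")
```

Under these assumed exponents a 1% cooler core gives reductions of roughly 21% and 10%, respectively, which illustrates why the \({{}^{8}\mathrm{B}}\) flux is the most temperature-sensitive of the measured fluxes.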

Fig. 54

Image reproduced with permission from Agostini et al. (2018), copyright by Springer Nature

Comparison of the observed and computed \({{}^{8}\mathrm{B}}\; (\varPhi _{\mathrm{B}})\) and \({{}^{7}\mathrm{Be}}\; (\varPhi _{\mathrm{Be}})\) neutrino fluxes, indicated by 68% confidence contours. The red and blue areas show model results using the GS98 (SSM-HZ) and AGSS09 (SSM-LZ) compositions (Vinyoles et al. 2017). The green area shows Borexino results, while the grey area was obtained from a combined analysis of all solar, as well as the KamLAND, data.

Buldgen et al. (2017a) applied a very interesting procedure to the analysis of the effects of the composition updates. This is based on inversion for differences in the Ledoux discriminant,

$$\begin{aligned} A = {\mathrm{d}\ln \rho \over \mathrm{d}r} - {1 \over \varGamma _1} {\mathrm{d}\ln p \over \mathrm{d}r} \; \end{aligned}$$
(67)

(see also Gough and Kosovichev 1993), using the structure pair \((A, \varGamma _1)\). They applied the analysis to models computed with the FreeEOS equation of state (developed by A. Irwin, see Cassisi et al. 2003a, and Sect. 2.3.1), with opacities from the OPAL and OPLIB tables (see Sect. 2.3.2) and using both the GN93 and AGSS09 compositions. Some results are shown in Fig. 55. Interestingly, the differences are concentrated just below the convection zone, in the region of the bump in the sound-speed differences in the GN93 models (e.g., Fig. 39). This suggests that the differences in A are sensitive to this feature, even in the AGSS09 models where it is hidden by the larger general sound-speed difference. Just below the convection zone the GN93 models are closer to the Sun, while around \(r = 0.64 R\) the differences are smaller for the AGSS09 models. In a second interesting analysis Buldgen et al. (2017c) carried out inversion for the same four models in terms of \(S_{5/3} = p/\rho ^{5/3}\), which in the ideal-gas approximation is closely related to the specific entropy, using again \(\varGamma _1\) as the second variable. Here substantial differences were found in the convection zone, essentially corresponding to different values of the specific entropy in the adiabatic part of the convection zone resulting from the model calibration, with some preference for the GN93 models. These analyses are potentially very valuable tools, as supplements to the more common sound-speed and density inversion, particularly for the investigation of the lower boundary of the convective envelope which undoubtedly is the site of substantial uncertainties in the modelling, related to possible overshooting or other types of mixing beyond the convection zone.
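To make the two diagnostics concrete, the following minimal sketch evaluates the Ledoux discriminant as written in Eq. (67) and the entropy proxy \(S_{5/3}\) on a tabulated model by finite differences. The toy stratification is hypothetical: it is an adiabatic layer with constant \(\varGamma _1 = 5/3\) and \(p \propto \rho ^{\varGamma _1}\), for which A should vanish and \(S_{5/3}\) should be constant, mimicking the adiabatic part of the convection zone. It is not an inversion, only an evaluation of the quantities for a given model.

```python
import numpy as np

def ledoux_discriminant(r, rho, p, gamma1):
    """A = dln(rho)/dr - (1/Gamma_1) dln(p)/dr, following Eq. (67),
    evaluated with finite differences on the mesh r."""
    dlnrho_dr = np.gradient(np.log(rho), r)
    dlnp_dr = np.gradient(np.log(p), r)
    return dlnrho_dr - dlnp_dr / gamma1

def entropy_proxy(rho, p):
    """S_{5/3} = p / rho**(5/3)."""
    return p / rho ** (5.0 / 3.0)

# Toy adiabatic layer in arbitrary units (hypothetical numbers).
gamma1 = 5.0 / 3.0
r = np.linspace(0.75, 0.95, 201)          # fractional radius
rho = 0.2 * (1.0 - r) ** 1.5              # toy density profile
K = 3.0                                   # toy adiabatic constant
p = K * rho ** gamma1                     # adiabatic relation

A = ledoux_discriminant(r, rho, p, gamma1)
S = entropy_proxy(rho, p)
print(np.max(np.abs(A)), S.min(), S.max())
```

In a real model A departs from zero in the radiative interior, and the value of \(S_{5/3}\) on the adiabatic plateau is set by the model calibration, which is what makes the two quantities complementary to the sound speed.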

Fig. 55

Image reproduced with permission from Buldgen et al. (2017a), copyright by the authors

Inversion for differences, at fixed fractional radius, in the Ledoux discriminant (cf. Eq. 67) between the Sun and models with the OPAL and OPLIB opacities, using the GN93 and AGSS09 compositions (see legend). Horizontal bars show the resolution and the (barely visible) vertical error bars are propagated from the observational frequency errors.

In the following I discuss possible solutions to the problems of solar models with the revised abundances; however, I note already now that these were discussed in more detail by Basu and Antia (2008), based on the AGS05 composition, leading to the general conclusion that no definite satisfactory solution had at that time been found. This still holds.

6.3 Are the revised abundances correct?

Given the difficulty in reconciling the AGS05 and AGSS09 abundances with the helioseismic results, it has been natural to question these abundances. In their favour is the fact that they bring the Sun into closer agreement with the abundances of objects in the solar neighbourhood, as mentioned above, although this is perhaps not decisive.

An independent determination of the envelope heavy-element abundances can in principle be obtained from the effects of the heavy elements on the thermodynamic properties of the gas and the resulting influence on the solar oscillation frequencies or the helioseismically inferred properties (Gong et al. 2001a; Mussack and Gough 2009). This is analogous to the determination of the envelope helium abundance discussed in Sect. 5.1.3, although obviously far more demanding, given the lower abundance and the correspondingly smaller effects. Takata and Shibahashi (2001) carried out an early inverse analysis targeting the heavy-element abundance and found an indication that it was lower by 20–30% than in Model S in the convection zone. Early results by Lin and Däppen (2005) provided slight indications for a decrease in the heavy-element abundance, relative to the Grevesse and Noels (1993) value, while Antia and Basu (2006) and Lin et al. (2007) obtained results consistent with the GN93 abundances (for a review, see also Basu and Antia 2008). A somewhat indirect determination was made by Houdek and Gough (2011) who used low-degree observations from BiSON, combining analysis of the helium glitch with use of the asymptotic behaviour of the acoustic modes, to determine a seismic measure of solar age and the heavy-element abundance through model calibration, focusing on the structure of the core resulting from hydrogen fusion. The age was consistent with the value obtained from radioactive decay, while the inferred heavy-element abundance, \(Z_{\mathrm{s}} = 0.0142\), was intermediate between the values obtained for the GS98 and AGSS09 compositions. A potential problem with the analysis may be indicated by the fact that the model fitting resulted in an envelope helium abundance of \(Y_{\mathrm{s}} = 0.224\), substantially below values obtained from helioseismic analyses of just the effects of the helium glitch (see Sect. 5.1.3). It is evident that the use of \(\varGamma _1\) as a composition diagnostic depends critically on the assumed equation of state, probably even more for the heavy-element abundance than in the case of the determination of the helium abundance. Careful analyses were carried out by Vorontsov et al. (2013, 2014), fitting helioseismic observations to solar convective-envelope models based on a variety of equations of state, including the so-called SAHA-S implementation (Baturin et al. 2013). They found that SAHA-S provided a substantially better fit to the observations than other formulations, with a heavy-element abundance in the range \(Z = 0.008 - 0.013\), i.e., strongly supporting the revised low values of Z, while acknowledging that a complete solar model with this abundance would be inconsistent with seismic inferences of the radiative interior. Buldgen et al. (2017b) carried out numerical inversions based on corrections to the Ledoux discriminant (cf. Eq. 67), \(Y_{\mathrm{s}}\) and \(Z_{\mathrm{s}}\); the analysis was tailored to obtain determinations of \(\delta Z_{\mathrm{s}}\), suppressing the contributions from A and \(Y_{\mathrm{s}}\) (see also Sect. 5.1.2, and Basu 2016). The results showed a substantial scatter, depending on the choice of reference model and inversion details, but with a strong trend towards a heavy-element abundance substantially below GS98, in accordance with the results of Vorontsov et al. (2013). Thus these more recent analyses point towards the lower abundance, in support of AGSS09, although the earlier results noted above were consistent with GN93.

In principle the composition of the solar atmosphere can be directly sampled through analysis of the solar wind. In practice, this is greatly affected by the fractionation of elements taking place in the acceleration of the solar wind, particularly the FIP effect. However, it was argued by von Steiger and Zurbuchen (2016) that this effect is largely absent in polar coronal holes. Thus they used observations of the solar wind from the Ulysses spacecraft, with an orbit passing repeatedly over the solar poles, to estimate the solar photospheric composition. Interestingly, the inferred heavy-element abundance, \(Z = 0.0196 \pm 0.0014\), is consistent with the older and helioseismically preferred composition. On the other hand, these results were criticized by Serenelli et al. (2016) who pointed out that using the detailed composition inferred by von Steiger and Zurbuchen substantially increased the neutrino-flux discrepancy of the models; they furthermore questioned the analysis of the FIP effect. Similar results were reached in a detailed analysis by Vagnozzi et al. (2017).

An interesting connection between the abundance issue and the neutrino observations was noted by Haxton and Serenelli (2008). They pointed out that future development in detector technology may allow measurement of the flux of neutrinos from the \({{}^{13}\mathrm{N}}\) and \({{}^{15}\mathrm{O}}\) decays (see Eq. 26), and hence of the rate of the CNO reactions; given the present well-determined nuclear parameters this would provide an independent determination of the CNO abundances in the solar interior. The resulting numerical relation between the flux of CNO neutrinos and the central heavy-element abundance of the Sun was derived by Gough (2019). Very recent Borexino results (Agostini et al. 2020a) provide a solid detection of the flux that, however, is consistent with both the high- and low-metallicity compositions. New detectors are being developed with the specific goal of reaching a sufficiently low background to detect the CNO signals (for an overview, see Bonventre and Orebi Gann 2018). A detailed analysis was carried out by Cerdeño et al. (2018) of the potential of the Borexino detector and planned new detectors for making a significant determination of the CNO composition of the solar core. One example of a planned detector potentially capable of distinguishing between the low- and high-CNO models is the Jinping detector in China (Wan 2019), with a planned liquid-scintillator detector mass of up to 4 kton.

6.4 Possible corrections to the solar models

The very serious discrepancies between the models with the new composition and the helioseismic results have led to many attempts to find modifications to the models that will improve the agreement. As reviewed by Guzik (2006, 2008) these attempts have met with limited success.

A perhaps not uncommon misconception is that the principal effect of the revised composition is the decrease in the depth of the convection zone. Basu et al. (2015), for example, stated that ‘[t]he most dramatic manifestation of the change of metallicities is the change in the position of the convection-zone base, which changes the sound-speed difference between solar models and the Sun’, implying that the change in the sound speed is caused by the change in the location of the base of the convection zone. To test this I applied a localized change to the opacity in the AGSS09 model in Fig. 51, of the form used in equation (1) of Christensen-Dalsgaard et al. (2018) but calibrated to obtain the same depth of the convection zone as in Model [GS98] (cf. Fig. 40). As indicated by Fig. 25 such a local opacity modification has a local effect on the sound speed; a more detailed discussion of the effects on the model structure was provided by Christensen-Dalsgaard et al. (2018). The helioseismically inferred differences in the squared sound speed between this model and the Sun are compared in Fig. 56 with the corresponding results for Model S, Model [GS98] and Model [AGSS09]. The figure shows a small shift in the sound-speed difference in the opacity-modified model compared with the original Model [AGSS09], corresponding to the shift in the base of the convection zone, and a related modest decrease in the maximum of the sound-speed difference; however, in the bulk of the radiative interior the differences for the original and modified AGSS09 models are very similar. Thus it is clear that the sound-speed difference is not just a consequence of the shift in the base of the convection zone (see also Ayukov and Baturin 2017).
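A localized opacity change of the kind used in this test can be sketched as a bump in \(\log \kappa \) centred at a chosen temperature. The Gaussian shape in \(\log T\), the function name and all parameter values below are assumptions made for illustration; the exact form of the cited equation and the calibrated amplitude are not reproduced in this section.

```python
import numpy as np

def opacity_factor(log_T, amp, log_T0, width):
    """Multiplicative factor kappa_modified / kappa for an assumed
    Gaussian bump of amplitude amp (in log10 kappa) centred at log_T0."""
    dlog_kappa = amp * np.exp(-((log_T - log_T0) / width) ** 2)
    return 10.0 ** dlog_kappa

# Hypothetical parameters: a bump near the temperature at the base of
# the convection zone (log T about 6.35), with an arbitrary width and amplitude.
log_T = np.linspace(6.0, 7.2, 121)
factor = opacity_factor(log_T, amp=0.05, log_T0=6.35, width=0.1)
print(factor.max(), factor[-1])
```

Because the factor returns to unity away from the centre, such a modification changes the depth of the convection zone while leaving the opacity in the deeper radiative interior essentially untouched, which is the property the test relies on.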

Fig. 56

Inferred differences in squared sound speed between the Sun and four solar models, in the sense (Sun)–(model). As in Fig. 51 the open circles are for Model S and the stars (connected by a dotted line) for Model [AGSS09] using the AGSS09 composition; the dashed curve shows the results for Model [GS98] based on the GS98 composition. The solid curve shows results for a model corresponding to Model [AGSS09] but with a localized change in the opacity near the base of the convection zone to bring the depth of the convection zone into agreement with Model [GS98]

As the heavy elements predominantly affect the structure through the opacity, an obvious correction to the model calculations is to increase the opacity. This was noted by Basu and Antia (2004) and Montalbán et al. (2004) who estimated that an opacity increase of 10–20% would be required. Bahcall et al. (2005a) found the opacity difference between models with the old and new composition to be up to around 15%, the largest values being close to the base of the convection zone and reflecting the contribution from the oxygen abundance illustrated in Fig. 5. Christensen-Dalsgaard et al. (2009) evaluated the change in opacity, assumed to be a function of temperature, required to reproduce the structure of Model S with the AGS05 composition. The result is shown in Fig. 57, including also the similar analysis based on the AGSS09 composition (Christensen-Dalsgaard and Houdek 2010); at the base of the convection zone the required increase is around 30% when AGS05 is used, while AGSS09 requires an opacity increase of up to around 23%. The effects of the latter increase on the results of sound-speed inversion and model structure are shown in Fig. 58. It is evident that the opacity modification, applied to the AGSS09 opacities, largely recovers the difference in squared sound speed between the Sun and the model structure found with Model S (see also Fig. 39). Furthermore, comparing panel (b) with Fig. 50 shows that most of the difference in other properties of the model structure is also suppressed. In particular, as shown in Table 6 the model is as successful as Model S in matching the inferred solar envelope helium abundance and depth of the convection zone. A similar estimate of the required opacity change, but based on combining intrinsic changes to the opacity with changes in the composition and taking into account also the constraints of the observed neutrino fluxes, was obtained by Villante et al. (2014).

Fig. 57

Image reproduced with permission from Christensen-Dalsgaard and Houdek (2010), copyright by Springer; see also Christensen-Dalsgaard et al. (2009)

Intrinsic opacity corrections, assumed to be functions of temperature alone, required to bring models with the revised composition into agreement with Model S. The solid curve is for the AGS05 composition and the dashed curve is for the AGSS09 composition.

Fig. 58

a Result of sound-speed inversion using as reference a model based on the AGSS09 opacities, but with the modification shown as a dashed curve in Fig. 57. The dashed curve shows the inversion result against Model S, illustrated in Fig. 39 which also defines the error bars. b Logarithmic differences between the model with the modified AGSS09 opacities and Model S. Line styles are defined in Fig. 21

It is far from clear that such intrinsic increases in the opacity are realistic. A measure of the uncertainty in the opacities is perhaps provided by differences between the totally independent calculations and their effects on the results; as discussed in Sect. 2.3.2 several such calculations are now available. Figure 30 shows that replacing the OPAL tables by OP increases the squared sound speed by up to about 0.7% below the convection zone, resulting in a modest reduction in the difference between the Sun and the model. Analyses using the more recent OPAS and OPLIB tables, with the AGSS09 composition, have been carried out by Buldgen et al. (2019a) and Villante, Serenelli and Vinyoles (in preparation). The resulting sound-speed profiles are compared with the Sun in Fig. 59. While OP and OPLIB yield results rather similar to those for OPAL, the sound-speed difference for OPAS is generally lower than the rest, probably reflecting the somewhat higher opacity just below the convection zone, shown in Fig. 6. Buldgen et al. (2019a) noted that the generally lower OPAS opacity in the bulk of the radiative interior requires a lower helium abundance for luminosity calibration and hence exacerbates the discrepancy between the model and the helioseismically inferred surface helium abundance. Interestingly, using OPLIB with the AGSS09 composition results in small frequency separations \(\delta \nu _{nl}\) in good agreement with the observations, while, as discussed above, using the OPAL opacities and AGSS09 results in very significant differences between model and observations (Buldgen et al. 2017c). On the other hand, the OPLIB opacities result in a substantial reduction in the core temperature and hence in neutrino fluxes that are inconsistent with the observations (A. Serenelli, private communication). This is a strong demonstration of the complementary information available from helioseismic and neutrino data, and makes the OPLIB tables less attractive for solar modelling. In any case, the spread between different current opacity tables and its dependence on temperature in no way justify the opacity correction illustrated in Fig. 57.

Fig. 59

Figure courtesy of Aldo Serenelli

Inferred relative sound-speed differences, at fixed fractional radius, between the Sun and models with the AGSS09 composition and using the OP, OPAL, OPLIB and OPAS opacity tables. The pink shaded region indicates the uncertainty resulting from the inversion procedure, whereas the grey area indicates uncertainties in the modelling (see Vinyoles et al. 2017). From Villante, Serenelli and Vinyoles (in preparation).

It cannot be excluded that effects ignored by current opacity calculations, or contributions from other chemical elements not included in the calculations, could have a substantial effect. Thus it is very interesting that Bailey et al. (2015), in an experiment at conditions close to those corresponding to the base of the solar convection zone obtained using the so-called Z-pinch technique, measured absorption coefficients for iron substantially higher than those resulting from atomic modelling and used in opacity determinations. Further experiments on chromium and nickel by Nagayama et al. (2019), using the same facility, also found substantial discrepancies but of a somewhat different nature, particularly for chromium, indicating sensitivity to the details of atomic structure. The origin of these differences between atomic modelling and experiments is still not clear, and independent experiments now under way or being planned (e.g., Le Pennec et al. 2015a; Perry et al. 2020) will be very valuable. However, the discrepancies indicate that there may be significant deficiencies in our understanding of the physics of the opacity. Trampedach (2018) made an estimate of the consequences for opacity calculations of the Bailey et al. results, indicating that they may correspond to increases not dissimilar to those shown in Fig. 57 to correct for the effects of the AGSS09 composition. Also, Pradhan and Nahar (2018) reviewed issues with current opacity calculations that might account for the experimental and solar discrepancies.

Alternatively, the opacity could be increased by increasing the abundances of other elements to compensate for the decrease in the abundance of oxygen in the AGS05 and AGSS09 composition tables. Figure 5 shows that neon contributes substantially to the opacity. As in the case of helium, the neon abundance cannot be determined directly from photospheric spectral lines, and hence is highly uncertain. The same is true of argon. Antia and Basu (2005) found that an increase by a factor of around 4 in the neon abundance could bring their envelope models in agreement with helioseismology. Bahcall et al. (2005b) considered increases of both neon and argon and found models with abundance increases of around a factor of three that approximately matched the helioseismically inferred sound speed. Similar effects of increases in the neon abundance were found by Zaatri et al. (2007). Possible support for such increases was provided by the determination by Drake and Testa (2005) of similarly high neon abundances in what were claimed to be solar-like stars. However, the relevance of these abundances for the solar case has been seriously questioned (e.g., Schmelz et al. 2005; Robrade et al. 2008). Also, Morel and Butler (2008) found no evidence in the neon abundances of near-by B stars for such a high neon content. On the other hand, Young (2018) obtained an increase by about 40% in the chromospheric Ne/O ratio, increasing the logarithmic normalized abundance (cf. Table 4) from the value 7.93 quoted by AGSS09 to 8.08. As shown by Buldgen et al. (2019b) the resulting increase in the opacity (see also Fig. 5) results in a modest increase in the depth of the convection zone and the envelope helium abundance, although still far from enough to match the observed values.
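The neon numbers quoted above can be related through the usual logarithmic abundance scale. The short sketch below assumes that the values 7.93 and 8.08 are base-10 logarithms of the number abundance on a common normalization, and that the oxygen abundance is held fixed when converting to a change in Ne/O; both are assumptions of this illustration.

```python
import math

def abundance_factor(log_eps_old, log_eps_new):
    """Factor by which a number abundance changes when its logarithmic
    (base-10) abundance changes from log_eps_old to log_eps_new."""
    return 10.0 ** (log_eps_new - log_eps_old)

# Change in the neon abundance from the AGSS09 value to that of Young (2018).
factor_young = abundance_factor(7.93, 8.08)

# For comparison: the change in dex corresponding to a factor of 4.
dex_factor_4 = math.log10(4.0)

print(f"factor = {factor_young:.2f}, factor 4 = {dex_factor_4:.2f} dex")
```

A change of 0.15 dex is a factor of about 1.41, consistent with the quoted increase of about 40%, whereas a factor of 4 would require about 0.60 dex, which shows how far the measured revision falls short of the increases considered in the earlier helioseismic fits.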

An obvious question is the extent to which the observed surface abundance is representative of the abundance of the radiative interior and hence of the opacity. In normal solar models settling causes a significant difference between the surface heavy-element abundance and the abundance beneath the convection zone (see Fig. 18). Increasing the rates of diffusion and settling therefore increases the heavy-element abundance in the interior relative to the surface and hence compensates for the decrease in the surface abundance. This is indeed the case (e.g., Basu and Antia 2004; Montalbán et al. 2004; Guzik et al. 2005; Christensen-Dalsgaard and Di Mauro 2007; Yang and Bi 2007), although to obtain a significant effect a considerable change (by factors of 1.5 or more) has to be made; this may be physically unrealistic. Also, the resulting models typically have an envelope helium abundance substantially below the helioseismically inferred value. The effects of increasing diffusion are illustrated in Fig. 35b which shows that an increase by 20% in the diffusion and settling coefficients for both helium and heavy elements leads to a relative increase in the squared sound speed by about 0.3%, with a similar increase in the surface hydrogen abundance. This is also reflected in the decrease in the envelope helium abundance for Model [DVc] shown in Tables 2 and 3. Compensating for the effect on the interior sound speed of the revised abundances while maintaining the envelope helium abundance would require a strong increase in the heavy-element settling with little change in helium settling; this seems hard to justify physically.

Ayukov and Baturin (2017) carried out an extensive analysis of solar models with the various heavy-element compositions, based on the analysis of solar envelope models by Vorontsov et al. (2013). As a constraint on the properties of the solar convective envelope they used the quantity \(M_{75}\) defined as the mass, in units of \(\,M_\odot \), inside a distance of \(0.75 \,R_\odot \) from the solar centre. This is determined by the density structure in the convection zone and hence essentially characterizes the entropy in the adiabatic part of the convection zone. From the results of Vorontsov et al. (2013) they chose \(M_{\mathrm{75}} = 0.9822\) as a reference and aimed to fit this, together with the radius, luminosity and various seismic parameters of the model. In addition to various forms of opacity changes they also included a possible increase in the \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reaction rate. They did obtain a model providing a generally good fit, with essentially the AGSS09 composition, but requiring an increase in the reaction rate of around 5%, much higher than its estimated uncertainty. They noted that this could be accounted for by a major increase in the electron screening of the reaction, although in fact molecular dynamics calculations have indicated that electron screening could be far less efficient than normally assumed (see also Sect. 2.3.3). Furthermore, the \({{}^{8}\mathrm{B}}\) neutrino flux was substantially lower than observed.
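The quantity \(M_{75}\) is simply the fractional mass enclosed within \(0.75 \,R_\odot \), and can be obtained from a tabulated density profile by integrating \(4 \pi r^2 \rho \). The sketch below does this with a cumulative trapezoidal rule; the function name is hypothetical and the uniform-density profile is a toy case chosen only because its answer, \(0.75^3\), is known analytically.

```python
import numpy as np

def enclosed_mass_fraction(r, rho, r_cut=0.75):
    """Fraction of the total mass inside r_cut, for a mesh r running from
    the centre (0) to the surface (1) in units of the surface radius."""
    integrand = 4.0 * np.pi * r ** 2 * rho
    # Cumulative trapezoidal integral of dm = 4 pi r^2 rho dr.
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)
    m = np.concatenate(([0.0], np.cumsum(steps)))
    return float(np.interp(r_cut, r, m) / m[-1])

# Toy check with uniform density: the result should be 0.75**3.
r = np.linspace(0.0, 1.0, 2001)
rho_uniform = np.ones_like(r)
m75_uniform = enclosed_mass_fraction(r, rho_uniform)
print(m75_uniform)
```

For a centrally condensed solar model the same integral gives a value close to unity, as in the reference value quoted above, so that the diagnostic is sensitive to small changes in the density structure of the convection zone.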

A comprehensive analysis of solar modelling and helioseismic diagnostics was carried out by Buldgen et al. (2019b). The modelling used the GN93, GS98 and AGSS09 compositions, a range of different opacity tables, and different equations of state. In addition, a variety of modifications to the modelling, including opacity modifications and convective overshoot or turbulent diffusion below the convective envelope, were considered. The helioseismic analysis was carried out in terms of the sound speed, the Ledoux discriminant (cf. Eq. 67) and the entropy proxy \(S_{5/3} = p/\rho ^{5/3}\), as well as the envelope helium abundance and the depth of the convective envelope. Buldgen et al. concluded that obtaining a model in agreement with the observations, given the revised surface composition, will require addressing several different aspects of solar modelling. As a very important point they noted that the often subtle issues involved in the analysis of differences between models and observations require improved confidence in the modelling, which can only be achieved by careful comparison of the results of independent modelling codes.

I finally note that the present surface heavy-element abundance could be lower than the interior composition as a result of later accretion of material less rich in heavy elements; also, early solar mass loss has a significant effect on the present internal sound speed (Guzik et al. 2009). I return to the consequences of these effects, in relation to the revised abundances, in the following section.

6.5 Effects of accretion or mass loss

The solar models considered so far have all been evolved at constant mass, neglecting any effects of mass loss or accretion. The present rate of mass loss to the solar wind, around \(2 \times 10^{-14} \,M_\odot \,\mathrm{year}^{-1}\) (e.g., Schrijver et al. 2007), is too low to have a significant effect on solar evolution. The same is true of the loss of mass resulting from the fusion of hydrogen to helium in the solar core (see footnote 61). However, accretion or a much higher mass-loss rate in the past cannot a priori be excluded.
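That both present-day mass-loss channels are negligible can be verified with a rough estimate. The sketch below takes the mass equivalent of the photon luminosity, \(L_\odot /c^2\), as a proxy for the fusion-related mass loss and integrates both rates over an assumed age of 4.6 Gyr at their present values. The physical constants are standard rounded values, and holding the rates constant in time is a simplifying assumption.

```python
# Standard rounded constants (assumed values for this estimate).
L_SUN = 3.828e26        # W
C_LIGHT = 2.998e8       # m / s
M_SUN = 1.989e30        # kg
YEAR = 3.156e7          # s
AGE_YEARS = 4.6e9       # assumed solar age in years

WIND_RATE = 2e-14       # M_sun / year, present solar-wind rate quoted above

# Mass equivalent of the radiated energy, in M_sun / year.
radiation_rate = L_SUN / C_LIGHT ** 2 * YEAR / M_SUN

# Total mass lost if both rates had been constant over the solar age.
total_wind = WIND_RATE * AGE_YEARS
total_radiation = radiation_rate * AGE_YEARS

print(f"radiation: {radiation_rate:.2e} M_sun/yr")
print(f"totals: wind {total_wind:.1e}, radiation {total_radiation:.1e} M_sun")
```

Both integrated losses come out well below \(10^{-3} \,M_\odot \), which is why constant-mass evolution is an adequate approximation unless the rates were much higher in the past.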

A simple way to obtain the observed surface composition, maintaining a higher heavy-element abundance in the radiative interior as apparently required by the helioseismic constraints, is to postulate that the solar convection zone has been affected by the accretion of material low in heavy elements (Guzik et al. 2005); this possibility has also been proposed in connection with detailed comparisons between the surface compositions of the Sun and similar stars (see Sect. 7.1). However, it appears to be difficult to construct such models that satisfy all the helioseismic constraints (Guzik 2006; Castro et al. 2007; Guzik and Mussack 2010). An extensive investigation of models with accretion, varying the timing of the accretion during early solar evolution and the composition and mass of the accreted material, was carried out by Serenelli et al. (2011), comparing the GS98 and AGSS09 compositions; the models were compared both with the helioseismic inferences and the neutrino data. The conclusion was that, over the extended set of parameters considered, accretion was unable to achieve an agreement with the solar data for models using the AGSS09 composition that matched the results for the traditional model using the GS98 composition.

The possible effects of mass loss on the solar abundance problem are less obvious although, as discussed below, significant. An obvious consequence of a higher initial solar mass would be a higher initial solar luminosity, as indicated by the luminosity scaling relation, Eq. (36); this has the potential to alleviate the ‘faint early Sun problem’ (cf. Sect. 3.2). Also, by dragging material originally at greater depth and hence at higher temperature into the convection zone, substantial mass loss would change the composition of the solar surface; in particular, it would lead to increased destruction of lithium (Weymann and Sears 1965) and increase the abundance ratio \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) (see also Sect. 5.3). Guzik et al. (1987) computed evolution sequences with exponentially decreasing mass loss, starting at a mass of \(2 \,M_\odot \) and calibrated to match solar properties at the present age. They found that such high mass loss led to the complete destruction of lithium and beryllium, thus requiring additional processes in the near-surface region to account for the observed abundances. Apart from this, no obvious conflicts with the then known properties of the Sun were identified; Guzik et al. did note that the \({{}^{3}\mathrm{He}}\) abundance on the solar surface was strongly increased in the mass-losing models, but they did not consider the available observations sufficiently secure to rule out such models. Swenson and Faulkner (1992) considered mass loss as an explanation of the observed lithium abundances in the Sun and in the Hyades cluster. In the solar case, they found that the observed present solar lithium abundance could be accounted for with an initial solar mass of \(1.1 \,M_\odot \) and either exponentially decreasing or constant mass loss. A similar conclusion had been reached by Boothroyd et al. (1991).

The availability of detailed helioseismic data obviously provides further constraints on the mass-losing models. Guzik and Cox (1995) compared models with a total loss of \(0.1 \,M_\odot \), to match the lithium destruction, with observed frequencies from Duvall et al. (1988). They concluded that such mass loss extending over a timescale substantially exceeding \(0.2 \,\mathrm{Gyr}\) was ruled out by the observed frequencies. Mass loss on a shorter timescale had little effect on the structure of the present Sun; indeed, it is obvious that early mass loss affects the structure of the present Sun almost entirely through the resulting change in the composition profile, and the evolution of the composition profile during the first \(0.2 \,\mathrm{Gyr}\) is modest. Consequently, the computed frequencies were very similar to those of a model without mass loss. However, such rapid mass loss required an initial mass-loss rate of around \(5 \times 10^{-10} \,M_\odot \,\mathrm{year}^{-1}\), more than four orders of magnitude higher than the present rate.

A detailed analysis of the helioseismic implications of early mass loss was carried out by Sackmann and Boothroyd (2003). This was motivated by the possible problem posed by the low initial luminosity of the Sun, given evidence for liquid water on the Earth and possibly Mars in the early phases of their evolution; they noted that an early higher solar mass would increase the solar luminosity and decrease the distance between the Sun and the planets, both leading to a higher solar flux at the Earth and Mars. They considered three different mass-loss models, all calibrated to correspond to the present solar wind at solar age, and initial masses between 1.01 and \(1.07 \,M_\odot \). The sound speed in the model of the present Sun was compared with the helioseismic inference of Basu et al. (2000). Sackmann and Boothroyd (2003) found that an initial mass of \(1.07 \,M_\odot \) would lead to a flux at Mars high enough \(3.8 \,\mathrm{Gyr}\) ago to be consistent with liquid water. The effects in this case on the present solar sound-speed profile were quite modest; in fact, mass loss slightly decreased the difference between the helioseismic and the model sound speed, although the effect was not significant, given other uncertainties in the modelling. They noted that even with a mass loss of \(0.07 \,M_\odot \) additional mixing would be required to account for the observed lithium depletion; the helium isotope ratio was not discussed but is likely not significantly affected by such a modest early mass loss.
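The combined effect of a higher luminosity and a tighter orbit can be sketched as follows. For slow, isotropic mass loss the orbital angular momentum of a planet is conserved, so that the orbital radius scales as \(a \propto M^{-1}\); with \(L \propto M^{n}\) the flux at the planet then scales as \(L/a^2 \propto M^{n+2}\). The exponent `lum_exponent = 4.75` used below is an assumed illustrative value, not taken from the text; the scaling of Eq. (36) should be substituted where it differs. The comparison is at fixed evolutionary state and ignores the gradual increase of luminosity with age.

```python
def flux_enhancement(mass_ratio, lum_exponent=4.75):
    """Flux at a planet relative to the 1 Msun case.

    mass_ratio   : stellar mass in units of the reference mass
    lum_exponent : assumed exponent n in L proportional to M**n (illustrative)
    """
    luminosity_ratio = mass_ratio ** lum_exponent
    orbit_ratio = 1.0 / mass_ratio          # a proportional to 1/M
    return luminosity_ratio / orbit_ratio ** 2


for m0 in (1.01, 1.026, 1.07):
    print(f"M0 = {m0:5.3f} Msun -> flux ratio {flux_enhancement(m0):.2f}")
```

With these assumptions even a few per cent of additional mass raises the flux at a planet by tens of per cent, which is why modest initial masses are sufficient in the studies discussed here.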

Minton and Malhotra (2007) considered the mass loss required to ensure that the average temperature of the Earth had been above freezing throughout the evolution of the solar system. They found that this could be accomplished with an initial mass as low as \(1.026 \,M_\odot \), with a resulting model at the present age which would likely be consistent with helioseismic inferences. However, they noted that the required mass-loss rate during the early stages of solar evolution would have been substantially higher than the rates observed in sun-like stars at similar stages in their evolution. In addition, they found that solar mass loss would have had some effect on the dynamics of the bodies in the solar system, although none with clear observable consequences at present.

Following Sackmann and Boothroyd (2003), Guzik et al. (2009) and Guzik and Mussack (2010) investigated the effect of mass loss on the comparison with the helioseismic sound-speed inferences, given the revision of the solar composition (see Sect. 6.2). Interestingly, they found that a model with initial mass of \(1.3 \,M_\odot \) and an exponentially decreasing mass-loss rate with an e-folding time of \(0.45 \,\mathrm{Gyr}\), using the AGS05 composition, largely reproduced a model with no mass loss and the GN93 composition. However, such a large amount of mass loss would bring to the solar surface material that had been exposed to temperatures in excess of \(5 \times 10^6 \,\mathrm{K}\), resulting in a complete destruction of lithium. Also, the initial mass-loss rate of \(6.6 \times 10^{-10} \,M_\odot \,\mathrm{year}^{-1}\) may be inconsistent with observations of other similar young stars.
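The quoted initial mass-loss rates follow directly from the assumed exponential form. Integrating \(\dot{M}(t) = \dot{M}_0 \exp (-t/\tau )\) from zero to the present age \(t_\odot \) gives a total loss \(\Delta M = \dot{M}_0 \tau [1 - \exp (-t_\odot /\tau )]\), so that \(\dot{M}_0 \simeq \Delta M/\tau \) when \(\tau \ll t_\odot \). The sketch below evaluates this relation; the final mass of \(1 \,M_\odot \) and the round age of 4.6 Gyr are assumptions, and any residual present-day wind is neglected.

```python
import math


def initial_mass_loss_rate(m_initial, tau_gyr, m_final=1.0, age_gyr=4.6):
    """Initial rate [Msun/yr] for Mdot(t) = Mdot0 * exp(-t / tau).

    m_final and age_gyr are assumed values (present solar mass and age).
    """
    tau_year = tau_gyr * 1e9
    fraction_lost_by_now = 1.0 - math.exp(-age_gyr / tau_gyr)
    return (m_initial - m_final) / (tau_year * fraction_lost_by_now)


def mass_at(t_gyr, m_initial, tau_gyr, m_final=1.0, age_gyr=4.6):
    """Stellar mass [Msun] at age t_gyr for the same exponential law."""
    mdot0 = initial_mass_loss_rate(m_initial, tau_gyr, m_final, age_gyr)
    lost = mdot0 * tau_gyr * 1e9 * (1.0 - math.exp(-t_gyr / tau_gyr))
    return m_initial - lost


rate_13 = initial_mass_loss_rate(1.3, 0.45)   # case with e-folding time 0.45 Gyr
rate_11 = initial_mass_loss_rate(1.1, 0.2)    # 0.1 Msun lost on a 0.2 Gyr timescale
print(f"1.3 Msun, tau = 0.45 Gyr: {rate_13:.2e} Msun/yr")
print(f"1.1 Msun, tau = 0.20 Gyr: {rate_11:.2e} Msun/yr")
print(f"ratio to present wind (2e-14): {rate_11 / 2e-14:.1e}")
```

The first case gives roughly \(6.7 \times 10^{-10} \,M_\odot \,\mathrm{year}^{-1}\), close to the value quoted above, and the second reproduces the rate of around \(5 \times 10^{-10} \,M_\odot \,\mathrm{year}^{-1}\) found for a loss of \(0.1 \,M_\odot \) over \(0.2 \,\mathrm{Gyr}\).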

To illustrate the effects of mass loss on solar evolution and the present solar structure, Fig. 60 shows the evolution in the surface luminosity for a normal model and two mass-losing models of initial mass 1.15 and \(1.3 \,M_\odot \) (Guzik and Mussack 2010). All models were calibrated to match the solar properties at the present age of the Sun. The mass-loss rate was assumed to decrease exponentially with age with an e-folding time of \(0.45 \,\mathrm{Gyr}\). The initial luminosity is evidently well above the present solar luminosity in both mass-losing models; however, with the assumed rapid decrease in the mass loss the minimum luminosity is still only about 80% of the present solar luminosity. The effect on the structure of the model of the present Sun is illustrated in Fig. 61 for a starting mass of \(1.3 \,M_\odot \). Comparison with Fig. 50 confirms that such mass loss to a large extent compensates for the change in solar models caused by the change in the surface composition, from the Grevesse and Noels (1993) to the Asplund et al. (2005a) values.

Fig. 60

Adapted from Guzik and Mussack (2010); data courtesy of Joyce Guzik

Evolution with age in surface luminosity, in units of the present luminosity of the Sun, for a model without mass loss (solid curve) and mass-losing models with an initial mass of \(1.15 \,M_\odot \) (dashed curve) and \(1.3 \,M_\odot \) (dot-dashed curve). The models were calibrated to match solar properties at the present age of the Sun. They were computed with the AGS05 composition (Asplund et al. 2005a).

Fig. 61

Adapted from Guzik and Mussack (2010); data courtesy of Joyce Guzik

Differences, at fixed fractional radius, between models of the present Sun in a mass-losing evolution sequence with initial mass \(1.3 \,M_\odot \) and a normal sequence, in the sense (mass-losing model)–(normal model). The line styles are defined in the figure; the dotted line marks zero difference. The models were calibrated to match solar properties at the present age of the Sun. They were computed with the AGS05 composition (Asplund et al. 2005a).

Lacking direct determinations of the early solar mass loss, constraints can be sought from observations of young solar analogues. Based on radio observations of such stars Fichtinger et al. (2017) concluded that the total amount of mass lost by the Sun in the early phases of main-sequence evolution was likely at most 0.4%. From the results discussed here this would clearly be insufficient to compensate for the low early solar luminosity or the change in the solar surface composition.

A comprehensive effort to match observational data for the Sun, given the revised solar composition, was carried out by Zhang et al. (2019), involving both pre-main-sequence accretion and early mass loss. In addition to the helioseismic data, the models were also fitted to the observed lithium abundance (see also Sect. 5.3) and tested against the observed neutrino data. The models used the AGSS09 composition with the updated neon abundance following Young (2018). Overshoot below the convection zone was treated using a model of the transport of turbulent kinetic energy. The most novel aspects of the model were the inclusion of selective and somewhat heuristic composition effects in the pre-main-sequence accretion and early mass loss, to match the detailed distribution of the helium abundance, as inferred from the helioseismically determined sound speed. Inferred sound-speed and density differences for the resulting so-called Model TWA are illustrated in Fig. 62, compared with the results for Model S, while overall model properties are included in Table 6. Even though largely using the AGSS09 composition, the model clearly provides a better match to solar sound speed and density than does Model S, while the convection-zone depth and envelope helium abundance are in good agreement with the helioseismically inferred values.

Fig. 62

Results of helioseismic inversions, for Model TWA of Zhang et al. (2019), including mixing below the convection zone and chemically differentiated accretion and mass loss in early phases of stellar evolution. The symbols show inferred relative differences in squared sound speed (top) and density (bottom) between the Sun and the model. The vertical bars show \(1\,\sigma \) errors in the inferred values, based on the errors, assumed statistically independent, in the observed frequencies. The horizontal bars extend from the first to the third quartile of the averaging kernels, to provide a measure of the resolution of the inversion (see Basu 2016). For comparison, the dashed curves show results for Model S (see also Fig. 39).

7 Towards the distant stars

Although the main focus here is the Sun, it is of course interesting to consider broader aspects of stars, in relation to those of the Sun. An important question in this regard is whether the Sun is in fact a typical star. Gustafsson (1998) addressed this in a paper with the title “Is the Sun a sun-like star?”. He answered this in the affirmative, finding that the Sun is indeed typical of stars with similar mass and age. One important exception is that the Sun is a single star, setting it apart from the many stars that are in binary systems. A second possible exception concerns the detailed mixture of heavy elements; I return to this below.

To place solar evolution into a broader context, Fig. 63 shows evolutionary tracks for a broad range of stars, on the main sequence and just beyond. To avoid problems with excessively rapid settling for masses only slightly higher than solar (see Sect. 2.3.4), diffusion and settling were neglected in these calculations. Otherwise the physics corresponds essentially to what was used in Sect. 4, and the mixing-length parameter and initial abundance were calibrated to obtain a model at the present age of the Sun matching the observed properties. In accordance with Eq. (36) the luminosity generally increases with evolution during central hydrogen burning. However, it is evident that the qualitative behaviour of the evolution tracks changes at a mass of around \(1.15 \,M_\odot \), with the appearance of a ‘hook’, where the effective temperature increases with age for a brief period. This reflects that such more massive stars, unlike the Sun, have a convective core. The convective instability is a result of the increasing central temperature and hence increasing importance of the highly temperature-sensitive CNO cycle (see Eq. 23). This causes the energy generation to be strongly concentrated near the stellar centre, leading to a high value of L(r)/m(r) near the centre and hence, according to Eqs. (5) and (8), to a tendency for convective instability. Since the convective core is fully mixed, the hydrogen abundance decreases uniformly in the core up to the point where hydrogen disappears in all, or much of, the region where the temperature is high enough to allow nuclear burning. In the last phases of central hydrogen burning this causes a contraction of the entire star to drive up the central temperature in order to maintain the luminosity, hence leading to the increase in the effective temperature. This behaviour stops when the energy generation is taken over by hydrogen burning in a shell around the hydrogen-depleted core; as in the lower-mass stars the surface radius of the star increases with evolution and the effective temperature therefore drops.

Fig. 63

Evolution tracks during and just after central hydrogen burning for stellar masses between 0.8 and \(6 \,M_\odot \). Selected masses are indicated in the figure. The track for \(1 \,M_\odot \) is shown with a bolder curve, and the location of the Sun is marked by the green sun symbol (\(\odot \)). The models are characterized by an initial composition with \(X_0 = 0.7062\), \(Z_0 = 0.01963\) and a mixing-length parameter \(\alpha _{\mathrm{ML}}= 1.8914\). Evolution starts from chemically homogeneous models on the Zero Age Main Sequence (ZAMS), indicated by the dotted curve. The red dashed curve and plusses mark the Terminal Age Main Sequence (TAMS), where the central hydrogen abundance decreases below \(10^{-5}\). The inset shows the evolution track for \(2 \,M_\odot \) on an expanded scale. Here the diamond marks the point where the convective core disappears

It is evident that much of the detailed discussion of solar modelling and evolution presented in this paper is immediately relevant to other stars. Indeed, a key aspect of the helioseismic investigations of the solar interior is the ability to test the theory of stellar structure and evolution in very considerable detail. Also, the Sun is in many ways an ideal case for such tests, even apart from the obvious advantage of its proximity. Compared with most other stars its properties are relatively simple. It has had no convective core during the bulk of its main-sequence evolution.Footnote 62 It is slowly rotating, so that rotation has no obvious immediate consequences for the structure of the present Sun. The physical conditions of matter in the Sun are relatively benign, the departures from the ideal-gas equation of state being modest although still large enough to be investigated with helioseismology. Thus it is perhaps not unreasonable to hope that even our simple models can give a reasonable representation of the properties of the solar interior, and this indeed seemed to be the case, as discussed in Sect. 5.1.2, at least until the revision of solar abundances (see Sect. 6.2).

Such complacency is clearly naive, however, given the potential of the solar interior for complexities far beyond our simple models. As discussed in Sect. 5.1.4 the origin of the present internal solar rotation is not understood. It is very likely that the phenomena leading to the present near-uniform rotation of the solar radiative interior have had some effect also on solar structure, for example through associated mixing processes. Also, it should be kept in mind that even the relatively successful models, such as Model S discussed extensively here, show a highly significant departure from the helioseismic inferences (cf. Fig. 39). However, it is perhaps mainly the consequences for solar models of the revision of the solar composition that have served as a wake-up call for reconsidering the basics of solar modelling. As discussed by Guzik (2006) there seems to be no straightforward way to reconcile normal models computed with this composition with the helioseismic inferences. This should motivate looking for more serious flaws in our understanding of stellar structure and evolution.

Abundances of solar-like stars are often measured relative to those of the Sun. Thus, the modifications to the inferred solar abundances discussed in Sect. 6 affect also the modelling of other stars. As an example, VandenBerg et al. (2007) noted that isochrones for the open cluster M67, computed based on the AGS05 solar composition, provided a worse match to the observed colour-magnitude diagram than did models based on the GS98 composition. Specifically, the best-fitting isochrone lacked the hook near the end of central hydrogen burning. Such a hook is found with the GS98 composition and appears to be reflected in the observations. In this case the dominant consequence of the change in the composition is the decrease in the importance of the CNO cycle in hydrogen burning resulting from the reduced abundances, and hence a reduced tendency for convective instability in the core. It was pointed out, however, by Magic et al. (2010) that favouring GS98 on this basis depended critically on other assumptions in the modelling. Including, for example, diffusion and settling (which was not taken into account by VandenBerg et al.) the GS98 and AGS05 models were equally successful in reproducing the hook, while other aspects of the modelling similarly had substantial effects on the morphology of the isochrones; effects on the properties of convective cores of composition and other aspects of the model physics were also investigated by Christensen-Dalsgaard and Houdek (2010). Thus, although the details of the morphology are an interesting diagnostic of the model physics, they do not provide a definite constraint on any one feature such as the composition.

One obvious failing of standard modelling is that rotation is ignored. The dynamical effect, resulting from the centrifugal force, is relatively straightforward to include, assuming that the rotation rate is given, at least for relatively slow rotation allowing a spherical approximation with a modified equation of hydrostatic equilibrium. For more rapid rotation departures from spherical symmetry must be modelled explicitly. This is the goal of the ESTER project (Evolution STEllaire en Rotation; Espinosa Lara and Rieutord 2013; Rieutord et al. 2016), to carry out fully self-consistent two-dimensional calculations of stellar structure. A recent example is the modelling of the rapidly rotating star Altair (Bouchaud et al. 2020), for which detailed interferometric observations are available on the surface distortion and temperature variations induced by rotation.

Even more difficult is the treatment of circulation and instabilities associated with rotation, and of the evolution of the internal angular velocity and associated transport processes, which is still far from fully understood. Zahn (1992) developed a simplified, if hardly simple, treatment of these processes which has seen extensive use in computations of the evolution of massive stars (for a review, see Maeder and Meynet 2000) and has been further developed by, for example, Maeder and Zahn (1998) and Mathis and Zahn (2004). Effects on these processes from diffusion-induced gradients in the mean molecular weight were considered by Théado and Vauclair (2003a, 2003b), while Talon and Charbonnel (2005) developed a combined treatment of the effects of rotation, internal gravity waves and atomic diffusion. Maeder (2009) provided a comprehensive discussion of the effects of rotation on stellar evolution. Transport by gravity waves was proposed by Schatzman (1993, 1996) and has been extensively discussed in connection with solar internal rotation (cf. Sect. 5.1.4). As discussed there, effects of magnetic fields are also likely to be relevant. Ambitious efforts to include all these effects in stellar modelling were discussed by Mathis et al. (2006) and Palacios et al. (2006) (for a recent overview, see Aerts et al. 2019).

Observational tests of these models obviously require considerations of stars other than the Sun. An important constraint comes from the dependence of stellar surface rotation on the mass and age of the star, which may provide additional constraints on the, so far somewhat uncertain, processes responsible for the evolution of the solar internal rotation (see Sect. 5.1.4). Additional information comes from the stellar surface abundances and their dependence on stellar types which reflect the mixing processes in the stellar interiors, possibly associated with the evolution of rotation. Particularly important are the abundances of lithium and beryllium (see also Sect. 5.3); since these elements are destroyed by nuclear reactions at relatively modest temperature, their abundances provide stringent constraints on the depth to which significant mixing has occurred (see also Sects. 2.25.3). Théado and Vauclair (2003c) showed that the dependence on effective temperature of the lithium and beryllium abundances in stars in the Hyades cluster could be well explained in a model combining rotationally induced mixing with an appropriate treatment of the gradient in the mean molecular weight resulting from helium settling. Also, Charbonnel and Talon (2005) showed that modelling the evolution of rotation by gravity-wave transport could account for the dependence of lithium depletion on stellar age.

Israelian et al. (2009) found an interesting possible relation between enhanced lithium depletion and the presence of planets around Sun-like stars, including the Sun. Bouvier (2008) related this to the rotational history of the stars; he suggested that the planet formation could be related to locking to a long-lived proto-planetary disk which would lead to slow rotation of the outer layers of the star and hence a strong internal rotation gradient, causing mixing and lithium destruction. In a careful study of solar-twin stars, however, Carlos et al. (2016) found a strong correlation between lithium abundance and age but no indication of enhanced depletion in planet hosts. Even so, a close connection was found by Bouvier et al. (2018a) between rotation rates, determined photometrically, of stars in the Pleiades cluster and their Li abundances, with slowly rotating stars showing a stronger Li destruction; this provides some support to the relation inferred by Bouvier (2008) between long-lived disk locking, rotation and the Li destruction and points to the importance of such abundance studies in investigations of stellar evolution. It should be noted that the general issue of lithium destruction has an important relation to cosmology, given the observed nearly uniform deficiency of lithium in halo stars compared with the predictions of Big Bang nucleosynthesis (for a review, see Cyburt et al. 2016), perhaps raising questions about the cosmological models. However, a detailed analysis by Korn et al. (2006, 2007) of abundances in the globular cluster NGC 6397 demonstrated the importance of settling and turbulent mixing for the lithium abundance in old metal-poor stars; they concluded that these processes can account for a previously inferred discrepancy between the observed abundances in such stars and the predictions of Big Bang nucleosynthesis.

7.1 Solar twins

A particularly interesting class of stars is that of the so-called ‘solar twins’, i.e. stars with properties very similar to those of the Sun. Detailed analyses have been carried out comparing the solar surface composition with that of such stars, benefitting from the development of very precise techniques for stellar abundance determinations (see, e.g., Nissen and Gustafsson 2018, for a review). Specifically, a solar twin is defined by requiring that the effective temperature, gravity and metallicity, characterized by [Fe/H], should agree with the Sun to within one standard deviation. Here the logarithmic abundance difference is defined by

$$\begin{aligned} \mathrm{[A/B]} = \log (N_{\mathrm{A}}/N_{\mathrm{B}})_* - \log (N_{\mathrm{A}}/N_{\mathrm{B}})_\odot , \end{aligned}$$
(68)

where \(N_{\mathrm{A}}\) and \(N_{\mathrm{B}}\) are the abundances of elements A and B, \(\log \) is logarithm to base 10 and the difference is between the stellar and solar values. Fixing thus the iron abundance relative to hydrogen to the solar value, for a set of solar twins Meléndez et al. (2009) and Ramírez et al. (2009) compared abundances for other elements, relative to iron, with the corresponding solar abundances. As illustrated in the example in Fig. 64 this showed a highly systematic dependence on the condensation temperature of the element. Meléndez et al. related this to the formation of the solar system. Specifically, if planetary systems are not generally found in the solar twins, condensation leading to the formation of the solar-system planets may have depleted the material accreting on the proto-sun of refractory elements, leading to the observed dependence of the solar abundance deficit on condensation temperature. The effect of accretion on the final solar composition depends critically on the mass contained in the convectively mixed region during the relevant accretion phase. In most models of pre-main-sequence evolution the star goes through a fully convective phase (see also Sect. 3.1) which would require an unrealistically large amount of material condensed in the form of rocky planets or planet cores to account for the observed solar composition depletion. This led Meléndez et al. (2009) to propose a rather late accretion, at a point in the evolution of the proto-sun where the convective envelope had reached approximately the present extent. Alternatively, Nordlund (2010) recalled the detailed modelling of pre-main-sequence evolution by Wuchterl and Tscharnuter (2003) which indicated that the convective envelope did not involve a large fraction of the stellar mass during the accretion phase, as also found by Baraffe and Chabrier (2010) in models with episodic infall (see also Sect. 3.1). This might lead to a sufficient depletion of the convection-zone abundance with realistic condensation. Nordlund also noted that the resulting difference between the solar surface composition and the composition of the radiative interior might resolve the conflict between the effect on solar models of the composition revision by Asplund (2005), Asplund et al. (2009) and the helioseismically inferred solar structure (see Sect. 6.2; I recall, however, that such models apparently do not provide a fully satisfactory solution to the discrepancy). The results of Meléndez et al. (2009) were confirmed by the analysis of a much larger sample of stars by Bedell et al. (2018), who also pointed to a possible connection with the existence of the solar system. I note that this argument is somewhat weakened by the ubiquitous presence of planets inferred by the Kepler mission (e.g., Batalha 2014), although, as pointed out by Bedell et al., a planetary system matching the properties of the solar system has yet to be found. As an alternative explanation Gustafsson (2018) suggested that material accreted in the later phases of solar formation could have been depleted in refractory material through cleansing of dust by the effect of solar radiation on the dust grains. A more detailed discussion of these composition differences and the proposed explanations was provided by Nissen and Gustafsson (2018).
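The bracket notation of Eq. (68) can be illustrated with a minimal sketch. The function name and the abundance values below are hypothetical, chosen only to show how a deficit relative to iron translates into a logarithmic difference; they are not measured abundances.

```python
import math


def bracket(n_a_star, n_b_star, n_a_sun, n_b_sun):
    """[A/B] as in Eq. (68): log10 abundance ratio, star minus Sun."""
    return math.log10(n_a_star / n_b_star) - math.log10(n_a_sun / n_b_sun)


# Hypothetical number abundances on an arbitrary common scale.
sun = {"H": 1.0e12, "Fe": 3.0e7, "A": 1.0e6}
star = {"H": 1.0e12, "Fe": 3.0e7, "A": 0.9e6}   # element A is 10% lower

fe_h = bracket(star["Fe"], star["H"], sun["Fe"], sun["H"])
a_h = bracket(star["A"], star["H"], sun["A"], sun["H"])
a_fe = bracket(star["A"], star["Fe"], sun["A"], sun["Fe"])

print(f"[Fe/H] = {fe_h:+.3f}")
print(f"[A/Fe] = {a_fe:+.3f}")
# The definition implies [A/Fe] = [A/H] - [Fe/H].
print(f"[A/H] - [Fe/H] = {a_h - fe_h:+.3f}")
```

In this example [Fe/H] vanishes, as required for a twin with solar metallicity, while a 10% deficit in element A corresponds to [A/Fe] of about \(-0.05\), which indicates the precision required of such differential abundance analyses.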

Fig. 64

Image reproduced with permission from Meléndez et al. (2009), copyright by AAS

Logarithmic differences (cf. Eq. 68) between the solar surface abundances, normalized to iron, and the averages of stars identified as being ‘solar twins’. The abscissa shows the condensation temperature (Lodders 2003) of the corresponding element in the proto-solar nebula.

7.2 Asteroseismology

Despite the importance of the abundance studies it is evident that observations with more direct sensitivity to stellar interiors would be very valuable. As in the case of helioseismology, the study, known as asteroseismology,Footnote 63 of stellar interiors from observations of oscillations provides such a possibility. Extensive reviews of asteroseismology were provided by, for example, Cunha et al. (2007) and Aerts et al. (2010), while Chaplin and Miglio (2013) discussed solar-like oscillators and Hekker and Christensen-Dalsgaard (2017) considered the very interesting seismology of red giants. A review of the field was provided recently by García and Ballot (2019).

Although stellar properties have been investigated by means of observations of stellar oscillations for several decades (e.g., Petersen 1973; Bradley and Winget 1994), the field has developed rapidly in recent years owing to large-scale observational projects and new observing techniques. Particularly dramatic has been the development of observations of solar-like oscillations. A major breakthrough came with new spectroscopic techniques that enabled the analysis of oscillations in radial velocity with amplitudes of a few \(\,\mathrm{cm}\,\mathrm{s}^{-1}\) (e.g., Kjeldsen et al. 2005). Missions for space-based high-precision photometry combining the search for extra-solar planets (exoplanets) using the transit technique with asteroseismology have revolutionized stellar astrophysics. The CoRoT Footnote 64 satellite (Baglin et al. 2009, 2012), launched in 2006 and operating until 2012, yielded asteroseismic data for a substantial number of stars. Much more extensive data were obtained from the NASA Kepler mission (Gilliland et al. 2010; Borucki 2016) which was launched in March 2009 into an Earth-trailing heliocentric orbit. It operated in the nominal mission observing one field in the Cygnus-Lyra region until May 2013, when the second of four reaction wheels failed; it was subsequently repurposed to the K2 mission (Howell et al. 2014), observing a large number of fields along the ecliptic for around 80 days each. The mission was finally stopped in October 2018, when the spacecraft ran out of fuel. The TESSFootnote 65 mission (Ricker et al. 2014), launched in April 2018, is surveying about 80% of the sky, emphasizing relatively bright and nearby stars in a search for exoplanets and carrying out asteroseismology of a large number of stars. In the slightly more distant future very extensive studies, coordinating investigations of extra-solar planetary systems and asteroseismic studies of stellar properties, will be carried out with the ESA PLATOFootnote 66 mission (Rauer et al. 2014), which was adopted in 2017 for a planned launch in 2026.

Even given the huge advances provided by the space-based photometric observations, ground-based radial-velocity observations still offer important advantages, particularly in terms of the ratio between the oscillation signal and the stellar background noise, which is much higher for radial velocity than for photometric observations of solar-like oscillations (Harvey 1988; Grundahl et al. 2007). Also, with a dedicated network of telescopes extended observations can be obtained for particularly interesting stars. This is the goal of the planned 8-station SONGFootnote 67 global network dedicated to asteroseismology (Grundahl et al. 2014) which is under development. As of 2020, one node of the network, at Observatorio del Teide on Tenerife, in collaboration with Instituto de Astrofísica de Canarias, has been in operation since 2014; one remarkable result is several hundred nights of observations of the subgiant \(\mu \) Her (Grundahl et al. 2017). Two additional nodes are under development in China and at University of Southern Queensland, Australia, while collaboration is sought for additional nodes.

In the foreseeable future the lack of spatial resolution in general limits observations of stellar oscillations to modes of spherical harmonic degree of at most 3. At a very basic level the oscillation frequencies scale as \(t_{\mathrm{dyn}}^{-1}\) (cf. Eq. 40), i.e., as \({\bar{\rho }}^{1/2} \propto M^{1/2} R^{-3/2}\), where \({\bar{\rho }}\) is the mean density of the star. In particular, for solar-like oscillations, i.e., acoustic modes of high radial order, the large frequency separation (cf. Eq. 59) satisfies \(\varDelta \nu \propto {\bar{\rho }}^{1/2}\). Also, Eq. (56) shows that these are the modes which penetrate most deeply and hence provide information about the stellar core. The change in sound speed resulting from the fusion of hydrogen to helium affects the frequencies in a manner that provides information about the evolutionary state of the star and hence its age (Christensen-Dalsgaard 1984b, 1988a; Ulrich 1986); the sensitivity to the central composition is reflected in the dependence of the small frequency separation on an integral of the sound-speed gradient, weighted towards the centre (cf. Eq. 60), although the determination is obviously affected by other uncertainties in the stellar modelling (Gough 1987). These properties make solar-like oscillations powerful tools for determining the global properties of stars, i.e., mass, radius and age, which are very important for the characterization of exoplanetary systems (for recent reviews, see Christensen-Dalsgaard and Silva Aguirre 2018; Lundkvist et al. 2018). Lebreton and Goupil (2014) made a careful analysis of asteroseismic data for a star observed by CoRoT and demonstrated that, combining these with ‘classical’ observations, precise estimates of the mass and age of the star could be obtained. Silva Aguirre et al. (2015) carried out a comprehensive asteroseismic analysis of stars detected as exoplanet hosts by Kepler, demonstrating that precise stellar parameters could be obtained. 
Also, the so-called LEGACY sample of Kepler stars, selected as being particularly well-observed for asteroseismology, was the basis of extensive analyses by Lund et al. (2017) and Silva Aguirre et al. (2017). This sample will undoubtedly form the basis for further investigations of the detailed properties of these stars.
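
The scaling \(\varDelta \nu \propto {\bar{\rho }}^{1/2} \propto M^{1/2} R^{-3/2}\) quoted above can be illustrated with a minimal sketch. The function names and the solar reference value of the large separation (about 135 μHz) are assumptions introduced here for illustration and are not taken from this review; the relation is only a leading-order scaling and not a substitute for detailed model fitting.

```python
# Illustrative sketch of the mean-density scaling of the large frequency separation.
# ASSUMPTION: solar reference value of about 135.1 microHz (not given in the text).
DNU_SUN_MICROHZ = 135.1


def large_separation(mass_msun, radius_rsun, dnu_sun=DNU_SUN_MICROHZ):
    """Scale the solar large separation as sqrt(M / R^3), with M and R in solar units."""
    return dnu_sun * (mass_msun / radius_rsun**3) ** 0.5


def mean_density_ratio(dnu_microhz, dnu_sun=DNU_SUN_MICROHZ):
    """Invert the scaling: mean density of the star relative to the solar value."""
    return (dnu_microhz / dnu_sun) ** 2


if __name__ == "__main__":
    # Hypothetical subgiant-like star: 1.1 solar masses, 2.0 solar radii.
    dnu = large_separation(1.1, 2.0)
    print(f"Delta nu  = {dnu:.1f} microHz")
    print(f"rho / rho_sun = {mean_density_ratio(dnu):.4f}")
```

In this form a measured large separation constrains only the combination \(M R^{-3}\); separating mass and radius requires additional information, such as the other seismic or ‘classical’ observables discussed in the text.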

The sharp gradient in composition and hence sound speed at the edge of a convective core has distinctive effects on the frequencies (e.g., Mazumdar et al. 2006; Cunha and Metcalfe 2007). As has already been found in solar data (see Sect. 5.1.2) sufficiently extensive observations will also be sensitive to effects of other such acoustic glitches, i.e., aspects of the structure of the star which vary on a scale short compared with the wavelength of the oscillations; examples are effects of helium ionization on \(\varGamma _1\) and the change in the sound-speed gradient at the base of a convective envelope (e.g., Pérez Hernández and Christensen-Dalsgaard 1998; Monteiro et al. 2000; Ballot et al. 2004; Verner et al. 2006b). A careful analysis of this type of investigation was provided by Houdek and Gough (2007a), with particular emphasis on the determination of the envelope helium abundance. By constraining other aspects of the star, such analyses may also help to reduce the systematic errors in the determination of stellar age (Monteiro et al. 2002; Mazumdar 2005; Houdek and Gough 2007b, 2011). Mazumdar et al. (2014) identified acoustic glitches associated with both the base of the convective envelope and the second helium ionization zones in a number of stars observed by Kepler, while Verma et al. (2014) used the helium glitch to determine the helium abundance in the solar analog binary 16 Cyg, observed by Kepler. In a remarkable analysis Verma et al. (2017) used the acoustic glitches to determine the depth of the convective envelope and the helium abundance in the Kepler LEGACY stars. Such a largely independent determination is very valuable in breaking the degeneracy in fits to asteroseismic data between the mass and the helium abundance, implicit in the relation (36) for luminosity in terms of mass and mean molecular weight. 
In an interesting application, Verma and Silva Aguirre (2019) used determinations of helium abundances in three stars with masses around \(1.4 \,M_\odot \) to constrain the extent of extra mixing below the convective envelope required to counteract helium settling (see also Sect. 2.3.4).
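
As a rough illustration of how an acoustic glitch shows up in the frequencies, the sketch below uses a commonly adopted leading-order approximation: a glitch at acoustic depth \(\tau \) adds a component to the frequencies that oscillates with frequency as \(\sin (4 \pi \tau \nu + \phi )\). This functional form, the constant amplitude, and all names and numerical values in the code are assumptions made for illustration; they are not taken from this section, and the analyses cited above use more detailed, frequency-dependent amplitudes.

```python
import math


def glitch_period_microhz(acoustic_depth_s):
    """Period (microHz) of the frequency modulation from a glitch at acoustic depth tau (s).

    ASSUMPTION: signature of the form sin(4*pi*tau*nu + phi), so the period is 1/(2*tau).
    """
    return 1.0e6 / (2.0 * acoustic_depth_s)


def glitch_shift(nu_microhz, amplitude_microhz, acoustic_depth_s, phase=0.0):
    """Oscillatory frequency perturbation (microHz), with a constant, hypothetical amplitude."""
    nu_hz = nu_microhz * 1.0e-6
    return amplitude_microhz * math.sin(4.0 * math.pi * acoustic_depth_s * nu_hz + phase)


if __name__ == "__main__":
    # Hypothetical acoustic depths, in seconds, for a shallow and a deep glitch.
    for label, tau in [("shallow glitch", 700.0), ("deep glitch", 2500.0)]:
        print(f"{label}: tau = {tau:.0f} s, period = {glitch_period_microhz(tau):.0f} microHz")
```

The point of the sketch is that a shallow glitch produces a slow modulation of the frequencies and a deep glitch a rapid one, which is what allows the two contributions to be separated in sufficiently extensive data.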

Investigation of internal rotation based on just low-degree modes is restricted by the small number of m values available and the limited sensitivity of the frequencies to rotation in the deep interior (e.g., Lund et al. 2014a). However, determination of the rotational splitting provides an average of the rotation rate of the stellar interior; combined with measurement of the surface rotation rate, e.g., from photometric variations induced by spots, this can give some information on the variation of rotation with position in the star and hence on the effects of the evolution of internal rotation. Also, the relative amplitudes of the different m components provide information about the inclination i of the rotation axis, if the average intrinsic amplitude is assumed to be independent of m (Gizon and Solanki 2003; Ballot et al. 2006). This has been used to study the inclination between the rotation axis and the orbital plane for exoplanets detected using the transit technique (e.g. Huber et al. 2013; Lund et al. 2014b; Campante et al. 2016). Benomar et al. (2015) determined the mean interior rotation rate from observations of rotational splitting and combined that with spectroscopic measurements of \(v \sin i\), for 22 main-sequence stars observed by Kepler, with i determined from the asteroseismic data. In most cases the results were consistent with no variation of angular velocity between the surface and the interior. Interestingly, this is essentially consistent with the properties of solar rotation, as inferred from helioseismology (cf. Sect. 5.1.4). 
For completeness I note that in more evolved stars, such as subgiants and red giants, modes with a mixed character between p and g modes allow detailed determination of the rotation of the deep interiors of the stars, showing an increasing ratio between the core and envelope rotation rate, although far less drastic than predicted by models of the evolution of stellar rotation (see Chaplin and Miglio 2013; Hekker and Christensen-Dalsgaard 2017, for reviews).
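
The dependence of the relative amplitudes of the m components on the inclination can be made concrete with a short sketch. Under the assumption stated above, that the average intrinsic amplitude is independent of m, the relative power in each component is a purely geometric factor, \((l-|m|)!/(l+|m|)! \, [P_l^{|m|}(\cos i)]^2\); the explicit expressions for \(l = 1\) and 2 are written out below. The function name is hypothetical and the code covers only degrees up to 2.

```python
import math


def relative_heights(degree, inclination_deg):
    """Relative power of the m components of a mode of degree l <= 2.

    ASSUMPTION: equal intrinsic amplitude for all m; the factors then depend only
    on the inclination i of the rotation axis and sum to 1 over m.
    """
    i = math.radians(inclination_deg)
    c, s = math.cos(i), math.sin(i)
    if degree == 0:
        return {0: 1.0}
    if degree == 1:
        side = 0.5 * s * s
        return {-1: side, 0: c * c, 1: side}
    if degree == 2:
        m1 = 0.375 * math.sin(2.0 * i) ** 2
        m2 = 0.375 * s**4
        return {-2: m2, -1: m1, 0: 0.25 * (3.0 * c * c - 1.0) ** 2, 1: m1, 2: m2}
    raise ValueError("only degrees 0, 1 and 2 are implemented in this sketch")


if __name__ == "__main__":
    for inc in (0.0, 45.0, 90.0):
        heights = relative_heights(1, inc)
        print(inc, {m: round(h, 3) for m, h in heights.items()})
```

For a star seen pole-on only the \(m = 0\) component is visible and the rotational splitting cannot be measured, whereas for an equator-on star the \(l = 1\), \(m = 0\) component vanishes; intermediate cases are what make the inclination measurable from the observed multiplet structure.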

In some cases the Kepler data were sensitive to the dependence of the rotational splitting on m, leading to constraints on the variation of angular velocity with latitude. In this way Benomar et al. (2018) showed the presence of latitudinal differential rotation in some stars, in the same sense as in the Sun, i.e., with the equator rotating more rapidly than the poles. Combining asteroseismic measurement of the differential rotation with rotation periods from photometric variations induced by starspots Bazot et al. (2018) inferred the presence in a Kepler star of cyclic activity variation, including the shift of the preferred latitude of starspots, qualitatively similar to the solar butterfly diagram (see Sect. 5.1.5), although with a much shorter period. These investigations of stellar rotation are clearly important also for a broader understanding of how the rotation depends on stellar properties, with a mutual interplay between the deep probing of solar rotation and the broad investigation of stars of different mass and evolutionary stage.

More detailed information about the variation of rotation with depth and latitude in distant stars will require observations with spatial resolution. Such observations are planned with the interferometric Stellar Imager (Schrijver et al. 2007; Carpenter et al. 2009), now under concept study as a NASA project. This would allow observation of modes of degree as high as 60 in selected stars and hence inference of the rotation rate in the entire radiative interior of a star like the Sun, including a possible tachocline (see Fig. 44). Such observations are crucial for the study of the effect of the dynamics of the base of the convection zone on the dynamo mechanism likely responsible for stellar cyclic activity. Needless to say, such observations of modes of relatively high degree would also revolutionize investigations of stellar internal structure.

8 Concluding remarks

When I started my PhD studies in 1973 in Cambridge, under the supervision of Douglas Gough, very little was known about the solar interior. The apparent deficit of solar neutrinos in the Davis experiment was a serious concern, leading to a range of proposals for possible changes to solar and stellar modelling, with potentially important consequences for our general understanding of stellar evolution. The initial goal of my project was to carry out more reliable calculations of the stability of the Sun towards g-mode oscillations, which might affect solar structure and decrease the computed neutrino flux. Part of this was to develop a more accurate code to calculate solar evolution.

The direction of the project changed drastically with the first announcements of possible global solar oscillations, into what became part of the early development of helioseismology. As will be clear from this review, the results of this development have fundamentally changed our investigations of the solar interior and, as a result, our general knowledge about stellar evolution. We now know the structure of most of the present Sun, as characterized by, for example, the sound speed and density to a remarkable accuracy. In parallel, the improved understanding of the properties of neutrinos and new experiments to detect them have advanced the study of solar neutrinos to a point where the measurements of the neutrino flux provide additional very valuable information about the properties of the solar core. Strikingly, at the level of precision often applied in astrophysics, the agreement between models and the observationally inferred properties is reasonable, typically within a few per cent. This applies to models where no direct attempts have been made to adjust parameters to match the observations, apart from the classical calibration of initial composition and treatment of convection to obtain the correct radius, luminosity and overall surface composition of the model of the present Sun.

A fascinating aspect of these solar investigations is that the accuracy and information content of the data on solar oscillations are far higher than those of most astrophysical data. This makes it meaningful to use the observations as deep probes of the physics of the solar interior. This sensitivity also makes the Sun a potential detector of more esoteric physical effects, such as the effects of dark matter. In fact, the accuracy and agreement with the observations that have been reached in solar modelling are very far from matching the accuracy of the observations. A striking example is the thermodynamical properties of solar matter, where models based on the current sophisticated treatments still do not match the observations. An additional open issue that has emerged in the last two decades is the revision of determinations of the solar surface composition, which has increased the discrepancy between the models and the helioseismic inferences and cast doubt on the calculation of opacities or other aspects of solar modelling.

Indeed, it should be no surprise that current simple solar models are inadequate; the surprise is perhaps rather that they work comparatively as well as they do. The models neglect a number of physical processes that must have been active in the Sun during its evolution and still affect it. This includes the evolution of solar rotation, involving redistribution of angular momentum and likely flows that would change the composition structure of the Sun. Also, magnetic fields are typically neglected, yet they could have a significant effect on the structure and dynamics of the solar interior. The next steps in the investigation of the Sun will surely involve models that take such effects into account, considering the interplay between the structure and dynamics of the solar interior. Here insight into the relevant physical processes can be sought in increasingly, but far from fully, realistic detailed hydrodynamical simulations. An important issue is to understand the origin and effects of the solar cyclic magnetic activity and the extent to which it involves larger parts of the Sun. Also, the helioseismic data accumulated over more than two decades are very far from having been fully exploited and offer excellent opportunities for tests of such refined modelling. New data-analysis techniques are required to make full use of the data, including understanding their statistical properties and the extent to which the resulting conclusions are significant.

From the understanding of solar oscillations as caused by stochastic excitation by near-surface convection it follows that all stars with outer convection zones are expected to show similar oscillations; the question is whether or not they are detectable. Early detections of such oscillations were made with ground-based spectroscopic observations starting in the 1990s, but the real breakthrough, and a revolution in asteroseismology based on solar-like oscillations, came with the photometric space missions CoRoT and Kepler, starting in 2007. We now have extensive data on stellar oscillations for a very broad range of stars in all evolutionary phases, and hence an excellent possibility to study stellar evolution over a broad range of parameters in mass and chemical composition, as a complement to the detailed investigations in the solar case. This will undoubtedly also improve our understanding of the Sun, perhaps providing pointers towards the resolution of the discrepancies caused by the revised determinations of the solar surface composition. As for the Sun, the detailed exploitation of these data is just beginning, and there is a huge potential for continuing investigations, in parallel with the improvements in the techniques for modelling, leading to a continued development in our understanding of solar and stellar structure and evolution.

The fields of solar and stellar astrophysics are very much alive; and so, therefore, should this review be.