1 Introduction

1.1 The Surface Energy Balance and Its Relevance

The surface energy balance (SEB) is an essential cornerstone of any theoretical description of the Earth’s climate system. It can be assessed locally on the ecosystem scale if the turbulent heat fluxes, the soil heat flux and net radiation are measured independently,

$$ H + \lambda E + G + Imb = R_{n} , $$
(1)

where H is the sensible heat flux, λE is the latent heat flux, both of which are most directly measured using the eddy-covariance (EC) technique, G is the soil heat flux at the surface, which can be quantified by a combination of heat-flux plates, soil temperature, and soil water content sensors (energy storage in the canopy can potentially also be included here), and Rn is net radiation, which is ideally measured by a high-quality four-component net radiometer. The imbalance (Imb) is considered as the total contribution of neglected effects and uncertainties. All terms on the left-hand side of Eq. 1 are defined positive for transferring energy away from the surface, while Rn is traditionally defined positive in the daytime.
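As a minimal sketch of how Eq. 1 can be applied to a single averaging interval, the following Python snippet computes the imbalance and the commonly reported energy balance ratio, (H + λE)/(Rn − G). The function names and the sample values are illustrative assumptions and do not come from any of the sites discussed here.

```python
def imbalance(rn, h, le, g):
    """Imbalance Imb = Rn - H - lambda*E - G (Eq. 1); all terms in W m-2."""
    return rn - h - le - g


def energy_balance_ratio(rn, h, le, g):
    """Sum of the turbulent fluxes divided by the available energy (Rn - G)."""
    available = rn - g
    if available == 0:
        raise ValueError("available energy is zero; the ratio is undefined")
    return (h + le) / available


# Hypothetical midday values in W m-2 (assumed for illustration only)
rn, h, le, g = 500.0, 150.0, 200.0, 50.0
imb = imbalance(rn, h, le, g)             # 100 W m-2 left unexplained
ebr = energy_balance_ratio(rn, h, le, g)  # about 0.78
```

A ratio below one corresponds to a positive Imb, i.e. the turbulent fluxes account for less than the available energy.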

This budget equation is ideally applied to an interface of infinitesimal depth between the atmosphere and the surface. However, constraints on how the different terms of the budget are measured make it more convenient to define the energy budget for a volume surrounding the interface. There is a correspondence between the evolution equation of the temperature of this volume and the SEB equation, as shown for instance in Cuxart et al. (2015). Hence, the term Imb includes at least (1) the tendency and advection terms, related to non-stationarity and surface heterogeneities, respectively; (2) natural and anthropogenic thermal processes, including heat storage, plant metabolism or urban-induced thermal processes; (3) the vertical divergences of the turbulent heat fluxes or the net radiation within the volume of interest; and (4) the uncertainties related to the experimental set-up.

Therefore, strictly speaking, closure of the SEB can only be expected under the assumption of a horizontally uniform two-dimensional exchange surface without a canopy, while the fluxes are perpendicular to this surface. These conditions are assumed to be fulfilled in commonly used numerical land-surface models. In spite of these shortcomings, accurate field measurements of the SEB are essential for the calibration and validation of land-surface models. Nevertheless, when comparing observed and modelled energy fluxes, the uncertainty of the observed values has to be considered, since models have a closed SEB in contrast to the observations.

It is important that these models provide a realistic representation of the underlying processes, especially for seasonal predictions and climate simulations, where land–atmosphere feedbacks play a critical role (Arneth et al. 2012; Green et al. 2017). In order to address the discrepancies in energy balance closure between models and observations, Sellers et al. (1989) already included a correction factor for the evaporation fraction of the turbulent fluxes in their simple biosphere model. Kracher et al. (2009) investigated how different land-surface models compare to field measurements of the energy balance components. They found that, if the model uses the surface temperature to calculate all terms of the energy balance iteratively, the resulting fluxes agree quite well with the EC data corrected under the assumption that the Bowen ratio is preserved.
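The Bowen-ratio-preserving correction mentioned above can be sketched as follows: both turbulent fluxes are scaled by the same factor so that their sum matches the available energy Rn − G, which leaves the ratio H/λE unchanged. The function name and the numbers are assumptions for illustration; this is not the specific procedure of any study cited here.

```python
def close_preserving_bowen_ratio(rn, g, h, le):
    """Scale H and lambda*E by a common factor so that H + lambda*E = Rn - G.

    Because both fluxes are multiplied by the same factor, the Bowen
    ratio H / (lambda*E) of the measurements is preserved.
    """
    turbulent = h + le
    if turbulent == 0:
        raise ValueError("sum of turbulent fluxes is zero; cannot rescale")
    scale = (rn - g) / turbulent
    return h * scale, le * scale


# Hypothetical values in W m-2 (assumed for illustration only)
h_corr, le_corr = close_preserving_bowen_ratio(rn=500.0, g=50.0, h=150.0, le=200.0)
```

Such a correction is only meaningful when the turbulent fluxes are well above their detection limit, since the scaling factor becomes unstable for small H + λE.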

Some land-surface models explicitly compute H and λE at the surface and use Rn provided by the radiation scheme, obtaining G as the residual of those three terms. If λE, H and Rn have realistic values when compared to observations, this approach implies that the residual is added to G, resulting in values of this flux that are too high and in excessive heat transport into the ground (as shown in Cuxart et al. 2015 for the ECMWF model). Similarly, some remote-sensing applications estimate H, Rn and G, and obtain λE as the residual, which very often leads to an overestimation of evapotranspiration (see the review of Liou and Kar 2014).
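The consequence of the residual approach can be illustrated with a short sketch: whichever term is obtained as the residual of Eq. 1 absorbs the whole imbalance. The helper name and the values are hypothetical and chosen only to make the arithmetic visible.

```python
def residual_term(rn, *other_terms):
    """Return the one SEB term obtained as Rn minus all other terms (W m-2)."""
    return rn - sum(other_terms)


# Hypothetical "true" budget in W m-2 with a 100 W m-2 imbalance
rn, h, le, g, imb = 500.0, 150.0, 200.0, 50.0, 100.0

# G as residual of Rn, H and lambda*E: the imbalance ends up in G
g_residual = residual_term(rn, h, le)    # equals g + imb

# lambda*E as residual of Rn, H and G: the imbalance ends up in lambda*E
le_residual = residual_term(rn, h, g)    # equals le + imb
```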

Similar problems occur when EC measurements are used for the validation of biological models, e.g. to calculate the surface conductance of water, which also implicitly assume a closed energy balance (Wohlfahrt et al. 2009). Another important question in biological and ecological research is whether trace gas fluxes measured by eddy covariance are equally affected by a related systematic error (Wang 2009; Foken et al. 2011). A study on a large number of FLUXNET sites suggests that there is a link between the energy imbalance and CO2 fluxes (Wilson et al. 2002). For a given value of the photosynthetically active radiation, the magnitude of CO2 uptake was less when the energy imbalance was greater. Similarly, respiration (estimated by nocturnal CO2 release to the atmosphere) was significantly less when the energy imbalance was greater.

The journal Boundary-Layer Meteorology has played a prominent role in the scientific discussion of the SEB closure problem over the last 25 years. It has provided a platform for highly innovative, pioneering and ground-breaking work, particularly related to atmospheric transport processes and EC methodology: see Table 1 for highly cited articles.

1.2 Historical Development and Description of the Problem

The first indications of a non-closed energy balance became obvious in the comparison of EC measurements with Bowen-ratio measurements; the latter close the energy balance by definition. Similar results were found in experiments in Australia in 1981 (Leuning et al. 1982) and in June of the years 1982, 1983, and 1984 in Germany (Koitzsch et al. 1988). In both experiments net radiometers were used that were highly accurate for the 1980s, the Funk (1959) and the Sonntag (1975) devices, respectively (see also Sect. 2.1). In both studies problems of the EC method were postulated. Desjardins (1985) aimed to measure primarily CO2 fluxes and found an underestimation in the turbulent energy fluxes in comparison to the available energy. During an experiment at Kursk (Russia) in 1988 an unclosed energy balance was found, initially not explicitly published (Tsvang et al. 1991), but then analyzed by Foken (1990) and later by Panin et al. (1998). Nevertheless, an investigation of the SEB closure was formulated as the aim of a field experiment near Tartu (Estonia) in 1990 (Foken et al. 1993) and the follow-up experiment at Kursk 1991 (Panin et al. 1998). G. N. Panin was also involved in the First International Satellite Land Surface Climatology Project (ISLSCP) Field Experiment (FIFE) in Kansas, USA in 1989, and found an unclosed energy balance there too (Kanemasu et al. 1992), while a closed energy balance was reported in other published FIFE studies. Further pioneering work on the SEB closure problem was conducted by Blanford et al. (1991) and Bernhofer (1992a), who were the first to explain the flux underestimation with large-scale non-turbulent transport mechanisms. About ten years later, the problem was addressed through ecological networks (Aubinet et al. 2000; Wilson et al. 2002) and a first correction was proposed for general flux measurements (Twine et al. 2000).

At the beginning of the 1990s, T. Foken submitted an overview article on the problem, but this was rejected because reviewers doubted the accurate calibration of the instruments. An updated version was later published (Foken 1998). A first international workshop addressing the problem was held in Grenoble in 1994 (Foken and Oncley 1995). During this workshop, the idea of the first concerted field experiment to address the problem was born, which then actually took place in California in 2000, in the framework of the Energy-Balance Experiment (EBEX-2000) (Oncley et al. 2007). Based on these results and many other studies, the first peer-reviewed review article on the SEB closure problem was published (Foken 2008), stressing the importance of secondary circulations.

1.3 Analysis of Energy Balance Closure at Multiple Sites

Studies on the energy balance closure are particularly valuable when datasets from multiple sites are analyzed, in order to generalize the problem and to extract common underlying reasons. At a single arbitrary site, a large variety of errors can cause non-closure of the energy balance. Hence, a single-site SEB closure analysis may provide insights on how to optimize the instrumental set-up. However, the phenomenon of a general systematic behaviour becomes only apparent in an analysis of data from many sites with different characteristics, e.g., with respect to instrumentation, canopy structure, and atmospheric conditions.

For example, Panin et al. (1998) investigated several sites in North America and Europe, and found a relation with surface inhomogeneity around the EC measurements. Moreover, Wilson et al. (2002) found a clear relationship between SEB closure and friction velocity \( u_{*} \) for 22 FLUXNET (Baldocchi et al. 2001) sites. An investigation on the SEB closure of eight ChinaFLUX (Li et al. 2005) sites discusses certain factors that contribute to the imbalance of energy, such as systematic errors associated with the mismatch of sampling areas, systematic instrument bias, neglected energy sinks, low and high frequency losses of turbulent fluxes and advection of heat and water vapour. A dependence of SEB closure on atmospheric stability, in particular the flux Richardson number, was found by Stoy et al. (2006) for a successional chronosequence in the south-eastern United States. A similar response of the SEB closure to \( u_{*} \) and atmospheric stability is described by Barr et al. (2006) for several Canadian sites and by Hendricks-Franssen et al. (2010) for several European FLUXNET sites.

Based on multi-site analyses, surface heterogeneity on the landscape scale beyond the actual flux footprint has been found to play an important role (Mauder et al. 2007c; Panin and Bernhofer 2008), because this scale is relevant to the formation of secondary circulations, as shown, e.g., for the LITFASS-2003 (Lindenberg Inhomogeneous Terrain—Fluxes between Atmosphere and Surface: a Long-term Study) multi-site experiment (Foken et al. 2010). This hypothesis is supported by fluxes derived from area-averaging measurement techniques, i.e. scintillometers and aircraft, which measure larger heat fluxes than EC towers in the same area (Meijninger et al. 2006; Xu et al. 2017a). The relationship with the friction velocity \( u_{*} \) and landscape-scale heterogeneity is confirmed by the most comprehensive analysis of SEB closure for 173 FLUXNET sites around the world (Stoy et al. 2013). In addition, it was demonstrated that an accurate measurement of the storage term is important for closing the surface energy balance for tall vegetation canopies, particularly at half-hourly time scales (Leuning et al. 2012), under the condition that all sources of measurement and data processing errors in the EC system are minimized (Xu et al. 2019).

1.4 Overarching Research Question

Given the extensive literature available on the SEB closure problem, we structure our review around the following overarching research questions:

  • What are the reasons/underlying processes for the non-closure?

  • Is the closure problem different in daytime and nighttime conditions?

  • How should the residual be partitioned (including effects on other scalar fluxes)?

  • How can one predict the magnitude of the energy balance residual?

2 Hypotheses for the Underlying Reason

The hypotheses for the underlying reasons of the SEB closure problem can be grouped in four major topics: instrumental errors, data processing errors, usually neglected additional terms of the SEB equation, and sub-mesoscale transport processes, which contradict the theoretical assumptions of tower-based EC measurements, and therefore are inherently not captured (Fig. 1).

Fig. 1 Overview and classification of hypotheses for the underlying reasons of the SEB closure problem

2.1 Instrumental Errors

Instrumental errors must be discussed in the context of the measurement area that is represented by the sensors, ranging from horizontal scales of 0.1 m for the ground heat flux, 10 m for the net radiation, and up to several 100 m for the turbulent fluxes (Schmid and Oke 1990; Schmid 1997; Foken 2008), as illustrated in Fig. 2. To overcome this problem, the surface characteristics below the net radiometer should be similar to the surface in the flux footprint of the EC system, and the ground heat flux should be measured under the same condition; more than one sensor is highly recommended. In addition, the EC footprint should be determined for each averaging interval, since it varies with wind speed, stratification and boundary-layer depth. However, this effect should not lead to a general systematic underclosure but rather can be considered as a random error for most sites or only a small bias (Richardson et al. 2012).

Fig. 2 Schematic view of the measurement area of the different parts of the energy balance. Modified after Foken (2017), first published in 2003, © Springer Nature

2.1.1 Sonic Anemometers

Sonic anemometers are the core instrument for EC measurements. Therefore, researchers intensively investigated their response characteristics to determine whether these instruments explain a larger part of the SEB closure problem. Based on the results of intercomparison experiments, error estimates were assigned to several types of sonic anemometers, which had been classified into two groups during the 1994 SEB closure workshop in Grenoble (Foken and Oncley 1995; Mauder et al. 2006). However, no general systematic biases were found here for the standard deviation of the vertical velocity, which would translate into a bias of energy fluxes. In parallel, several researchers found a larger systematic error for certain wind-tunnel experiments on the order of 10% of the flux, which is comparable to the magnitude of imbalances of many sites (van der Molen et al. 2004; Nakai et al. 2006). The term angle-of-attack correction was coined for the proposed adjustment method. Nevertheless, doubts were raised whether these corrections determined under quasi-laminar wind-tunnel conditions are transferable to real-world turbulence (Högström and Smedman 2004), because the extent of wakes behind an obstacle is much shorter for conditions of high turbulence intensity (Wyngaard 1981; Rodríguez et al. 2015).

This topic was also addressed as part of the EBEX-2000 campaign by a comprehensive intercomparison experiment, with the result that there were some characteristic differences between different sonic anemometers, but none of them was sufficiently large and systematic to explain the magnitude of the energy balance residual (Mauder et al. 2007d). To date, the discussion in the literature about possible systematic errors of flux measurements using sonic anemometers is still ongoing (Kochendorfer et al. 2012; Horst et al. 2015; Frank et al. 2016), indicating that the error might be on the order of 3–5% for the energy fluxes. New experimental designs have recently been explored to employ additional independent reference estimates for sonic anemometers, particularly under turbulent conditions, such as large-eddy simulation (LES, Huq et al. 2017), spectral ratios in the inertial sub-range (Peña et al. 2019), and high-resolution Doppler lidar (Mauder et al. 2020). While the absolute accuracy of sonic anemometers is still not fully quantified, it was at least established that the precision of various types of modern sonic anemometers is very good regarding general flux measurements (Mauder and Zeeman 2018).

2.1.2 Hygrometers

There was a significant change in the application of optical hygrometers around the year 2000; before 2000, most of the instruments were institutional fabrications or from small companies. Foken et al. (1995) gave an overview of the developments of ultraviolet and infrared hygrometers. If the instruments were carefully calibrated, neither a significant influence on the energy balance closure nor a bias relative to recent sensors is to be expected. However, the frequently used Lyman-alpha hygrometers were not very stable. This issue can be controlled by an additional humidity measurement and re-calibration. For field calibrations of these ultraviolet sensors, a method with changing pathlength was developed that can be applied under nearly constant absolute humidity for Lyman-alpha hygrometers (Foken et al. 1998) or under constant pressure conditions for Krypton hygrometers (Foken and Falke 2012), because Krypton hygrometers are affected by an oxygen cross-sensitivity (Tanner and Campbell 1985). A prototype of the latter system was applied during EBEX-2000. Because ultraviolet hygrometers are hardly in use any longer, these calibration techniques are no longer applied. Nevertheless, ultraviolet hygrometers have the advantage that they are very sensitive and therefore only require a very short pathlength on the order of 0.01 m, while the pathlength of an infrared hygrometer is typically on the order of 0.1 m.

The first comparisons of the Krypton hygrometer with the infrared open-path hygrometer LI-7500, which is currently considered the reference sensor, were conducted during EBEX-2000 (Mauder et al. 2007d). Since then, infrared hygrometers have almost completely replaced the ultraviolet hygrometers because they have a number of advantages, such as superior long-term stability and the additional measurement of CO2 concentration. However, a larger pathlength is needed for infrared hygrometers in comparison with ultraviolet hygrometers, which leads to spectral losses. Closed-path hygrometers are afflicted with additional low-pass filtering effects due to tube damping (Haslwanter et al. 2009). These spectral losses can be corrected effectively as a function of relative humidity (Ibrom et al. 2007; Fratini et al. 2012) as long as the error is not too large. Therefore, closed-path gas sampling systems need to be carefully designed and maintained (Aubinet et al. 2016; Metzger et al. 2016).

2.1.3 Net Radiometers

The reason why the non-closed surface energy balance was not already detected in the 1970s and 1980s was the lack of accurate net radiometers, as shown by Halldin and Lindroth (1992). They compared different net radiometers with the Schulze-Däke instrument with polyethylene (Lupolen®) dome (Däke 1972), which was one of the best instruments at that time (the Sonntag (1975) instrument was similarly constructed). It was found that the Funk (1959) net radiometer underestimated net radiation by only a few percent, but the widely used Fritschen (1963) net radiometer underestimated net radiation by an amount of the same order as the residual of the energy balance. Therefore, the SEB closure problem was only found in experiments where the Fritschen instrument was not used.

During the EBEX-2000 experiment several types of instruments were compared, among them also the Schulze-Däke and the Q7 Fritschen type net radiometers (Kohsiek et al. 2007). The Schulze-Däke agreed well with the widely used Kipp&Zonen CNR1, which is now mostly replaced by the ventilated CNR4 with improved accuracy. Again, the Q7 radiometer significantly underestimated net radiation. A similar underestimation was also found for the NR-lite instrument (Brotzge and Duchon 2000). In general, careful and frequent cleaning of the sensors was found to be essential for high-quality data. Nevertheless, the relative uncertainty of net radiation sensors can still be considerable, i.e. 10% already under favourable conditions and larger percentages at night when absolute radiative fluxes are small. However, this should normally not lead to a systematic error.

Since the beginning of the 1990s, much progress was made in radiation measurements (Ohmura et al. 1998), and the measurement of shortwave radiation (pyranometer) is now often separated from longwave radiation (pyrgeometer). The latest state of the art is documented in a recent ISO guideline (ISO 2018).

2.1.4 Ground-Heat-Flux Measurements

In the beginning of the investigations on the lack of energy balance closure, the ground heat flux was discussed as a potential reason for the lack of closure because of its large relative errors (Culf et al. 2004). A frequently-found asymmetry of the closure between the morning and afternoon hours supported this assumption. Kukharets and Tsvang (1999) and Kukharets et al. (2000) investigated possible errors mainly of the heat storage in the upper soil layer, with a special focus on the time shift between the air temperature and the surface temperature changes, and changes of the soil heat capacity due to a changing soil water content (Peters-Lidard et al. 1998).

Liebethal et al. (2005) conducted a sensitivity analysis on different ground-heat-flux measurement techniques and concluded that heat-flux plates should be installed at a depth of about 0.2 m and the heat storage should be calculated from a high-resolution temperature profile with associated soil moisture measurements. Furthermore, Liebethal and Foken (2007) compared different parametrization approaches for the ground heat flux, with the result that a simplified measurement and force-restore method best approximates the measurements. In contrast to these mostly calorimetric approaches, Heusinkveld et al. (2004) developed a harmonic approach for homogeneous soils. Both approaches can reduce the residual of the energy balance, the harmonic approach with a larger effect, but neither can solve the general problem (Jacobs et al. 2008). Nevertheless, for some sites over flat and homogeneous terrain, the energy balance was nearly closed when the ground heat flux was carefully calculated (Kabat et al. 1997; Heusinkveld et al. 2004; Mauder et al. 2007c). For weather conditions that do not evolve drastically, the integration of the ground heat flux over a full diurnal cycle should give a value close to zero. Thus, omitting G and integrating Eq. 1 over one or several days might help to characterize the imbalance term apart from the potential errors related to G.
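A strongly simplified sketch of the calorimetric approach is given below: the flux measured by a plate at depth z is combined with the change of heat storage in the soil layer above it. The single layer-mean temperature (instead of a high-resolution profile), the specific heat of dry soil (840 J kg−1 K−1), the bulk density and all sample values are assumptions made for illustration only.

```python
RHO_W_C_W = 4.18e6  # volumetric heat capacity of liquid water, J m-3 K-1


def volumetric_heat_capacity(bulk_density, theta, c_dry=840.0):
    """Soil volumetric heat capacity in J m-3 K-1.

    bulk_density: dry bulk density of the soil, kg m-3 (assumed known)
    theta: volumetric soil water content, m3 m-3
    c_dry: specific heat of dry soil, J kg-1 K-1 (assumed value)
    """
    return bulk_density * c_dry + theta * RHO_W_C_W


def surface_ground_heat_flux(g_plate, depth, t_prev, t_now, dt,
                             bulk_density, theta):
    """Plate flux at `depth` plus storage change of the layer above it (W m-2).

    t_prev and t_now are layer-mean soil temperatures (K or deg C) at the
    start and end of an interval of length dt (s).
    """
    c_v = volumetric_heat_capacity(bulk_density, theta)
    storage = c_v * depth * (t_now - t_prev) / dt
    return g_plate + storage


# Hypothetical 30-min interval: the layer above the plate warms by 0.5 K
g0 = surface_ground_heat_flux(g_plate=20.0, depth=0.2, t_prev=20.0,
                              t_now=20.5, dt=1800.0,
                              bulk_density=1300.0, theta=0.2)
```

With these assumed numbers the storage change is several times larger than the plate flux, which illustrates why the storage term and the soil water content cannot be neglected for plates installed at depth.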

2.2 Errors Due to the Eddy-Covariance Processing Chain

2.2.1 Averaging Time

It is a fundamental assumption for EC measurements that the mean vertical wind component over the averaging period should be zero. In the past, different methods were applied to achieve this condition: a running mean filter (McMillen 1988), linear detrending (Rannik and Vesala 1999), or block averaging, often over 30 or 60 min (Kaimal and Finnigan 1994; Finnigan et al. 2003; Wang 2010). It is important to note that all of these averaging operators act as high-pass filters. Sometimes a clear spectral gap is found between the frequencies of atmospheric motions, but this gap is often less evident, especially above rough surfaces and in unstable stratification. Some scientists prefer using block averaging by calculating the arithmetic mean over a certain time period, in order to avoid potential conflicts with the Reynolds decomposition rules (Aubinet et al. 2012). Nevertheless, if low-frequency contributions beyond the averaging interval are significant, and if no appropriate high-pass filtering correction is applied, this may produce an underestimation of the total flux.
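The high-pass filtering effect of the averaging operator can be made visible with synthetic data. The sketch below compares the covariance obtained by block averaging with the one obtained after linear detrending for a hypothetical 30-min record sampled at 10 Hz; the signal construction (white-noise "turbulence" plus a linear ramp in both variables) is an assumption chosen purely for illustration.

```python
import numpy as np


def covariance_block(w, s):
    """Covariance of w and s with fluctuations taken about the block mean."""
    w = np.asarray(w, dtype=float)
    s = np.asarray(s, dtype=float)
    return np.mean((w - w.mean()) * (s - s.mean()))


def covariance_detrended(w, s):
    """Covariance of w and s with fluctuations taken about a linear fit."""
    w = np.asarray(w, dtype=float)
    s = np.asarray(s, dtype=float)
    t = np.linspace(0.0, 1.0, w.size)
    w_fluct = w - np.polyval(np.polyfit(t, w, 1), t)
    s_fluct = s - np.polyval(np.polyfit(t, s, 1), t)
    return np.mean(w_fluct * s_fluct)


# Synthetic record: 30 min at 10 Hz (assumed), correlated noise plus a ramp
rng = np.random.default_rng(0)
n = 18000
ramp = np.linspace(-1.0, 1.0, n)
w_turb = rng.normal(0.0, 0.3, n)
s_turb = 0.5 * w_turb + rng.normal(0.0, 0.2, n)
w = w_turb + 0.1 * ramp   # slow drift in the vertical wind
s = s_turb + 1.0 * ramp   # slow drift in the scalar

cov_block = covariance_block(w, s)
cov_detr = covariance_detrended(w, s)
```

In this construction the block-averaged covariance retains the contribution of the slow drift, whereas linear detrending removes it. Whether that low-frequency part is a real flux contribution or an artefact of non-stationarity cannot be decided from the statistics alone.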

Wavelet methods are useful for determining the relevant scales in the covariances and thus for assessing whether a spectral gap is present (Terradellas et al. 2001). The so-called ogive method (Desjardins et al. 1989; Oncley et al. 1990) is often employed to check the low-frequency part of the cospectra for significant flux contributions. This was also done by Foken et al. (2006) to investigate the SEB closure problem, with the conclusion that an extended averaging time would lead to larger fluxes in some cases.
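A minimal sketch of an ogive calculation is given below, assuming evenly sampled, gap-free series. The ogive is formed by accumulating the cospectrum from the highest frequency towards the lowest, so that its low-frequency end equals the total covariance of the record. The function name, the normalisation per frequency bin and the synthetic data are assumptions for illustration.

```python
import numpy as np


def ogive(w, s, dt):
    """Return frequencies (Hz) and the ogive of the w-s cospectrum.

    The cospectrum is expressed as covariance contribution per frequency
    bin, so the cumulative sum from high to low frequency converges to
    the block-averaged covariance of the record.
    """
    w = np.asarray(w, dtype=float)
    s = np.asarray(s, dtype=float)
    n = w.size
    w_hat = np.fft.rfft(w - w.mean())
    s_hat = np.fft.rfft(s - s.mean())
    co = (w_hat * np.conj(s_hat)).real / n**2
    co[1:] *= 2.0          # fold negative frequencies into the one-sided form
    if n % 2 == 0:
        co[-1] /= 2.0      # the Nyquist bin occurs only once for even n
    freq = np.fft.rfftfreq(n, d=dt)
    og = np.cumsum(co[::-1])[::-1]
    return freq[1:], og[1:]  # drop the zero-frequency (mean) bin


# Synthetic 30-min record at 10 Hz (assumed values)
rng = np.random.default_rng(1)
n, dt = 18000, 0.1
w = rng.normal(0.0, 0.3, n)
s = 0.5 * w + rng.normal(0.0, 0.2, n)
freq, og = ogive(w, s, dt)
total_cov = np.mean((w - w.mean()) * (s - s.mean()))
```

In practice, an ogive that has levelled off before reaching the lowest resolved frequency suggests that the averaging interval captures the flux-carrying scales, whereas a curve that is still changing at the low-frequency end points to contributions beyond the averaging time.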

Finnigan et al. (2003) proposed a long-term integration over several days to close the energy balance. This was also shown for another dataset by Mauder et al. (2006) with the result that most of the missing energy would be added to the sensible heat flux. The problem of this method is that the averaging time needs to be on the order of days, and within this period no significant changes of the weather condition should occur, in order to fulfil the stationarity requirement. The study of Mauder et al. (2006) was later extended to other sites of the same field campaign (Charuchittipan et al. 2014) with the result that SEB closure does not improve in the same way for different land-use classes. The observed remaining lack of closure even for very long averaging times can be explained because large-scale eddies often do not propagate with the mean wind; hence they violate Taylor’s frozen turbulence hypothesis, and therefore inherently cannot be fully captured by single-tower measurements (Mahrt 2010).

2.2.2 Flux Corrections

An ideal EC system would be able to measure the vertical wind fluctuations and the scalar to be transported at a single point at the same instant and in units that do not change with changes of atmospheric pressure. In reality, these conditions are not completely fulfilled and therefore a number of so-called flux corrections, which are sometimes merely unit conversions, are required (Foken et al. 2012). The basis for accurate flux estimates is a consensus about a suite of required corrections, which has been laid out in the Handbook of Micrometeorology (Lee et al. 2004). Nevertheless, different choices of methods to correct for the same effect and different implementations in a certain software package can lead to small discrepancies in the resulting fluxes (Fratini and Mauder 2014).

The effect of different flux corrections on the SEB closure was investigated by Mauder et al. (2006) for the LITFASS-2003 field campaign, which was carried out over a heterogeneous but flat area in central Europe. In addition, Mauder et al. (2007d) also compared different EC processing software from different institutions and countries as part of EBEX-2000. These studies conclude that it is important to apply the appropriate corrections in the right way, but the potential discrepancies in correction algorithms cannot explain the observed general lack of SEB closure.

2.3 Additional, Normally Neglected, Terms

There is a series of additional energy balance terms, which are often, but not always, insignificant compared to the error margins of flux measurements (Eq. 1). In the framework of EBEX-2000, the following aspects were investigated (Oncley et al. 2007):

  • Heat storage in the canopy

  • Biochemical storage in the canopy due to photosynthesis and respiration

  • Vertical and horizontal flux divergence

  • Horizontal advection

  • Water pumping by the plants

Accounting for all these additional terms did indeed improve the energy balance closure for EBEX-2000, although they were often difficult to estimate with the available measurements. Despite this effort, the EBEX dataset still shows an unexplained imbalance on the order of 10% of the available energy for this rather low-vegetation measurement site (cotton). In crops, the canopy heat storage and the biochemical processes together typically amount to less than 10 W m−2 (Oke 2002). Water pumping by plants is driven by suction at the stomata in the leaves and osmotic pressure at the root level, and the energy involved in these processes at the cell level must also be accounted for.

For tall vegetation sites, especially forests, it is clear that the above-ground heat storage can play an important role for the SEB closure. This has been demonstrated for a number of sites in different climate zones (Haverd et al. 2007; Moderow et al. 2009; Lindroth et al. 2010). Leuning et al. (2012) showed that including the above-ground heat storage can even lead to a satisfactory energy balance closure for several largely homogeneous forest sites in Australia, where advective fluxes are probably small.

The energy converted into biochemical storage through photosynthesis is typically only a few percent of the incoming shortwave radiation. In order to determine this term, gross primary productivity of the plants needs to be quantified and then converted into energetic units. Oncley et al. (2007) did this for EBEX-2000 and report a daily average of 8 W m−2 for a cotton canopy. Other estimates in the literature are, e.g., daytime values of 10–20 W m−2 for a maize crop and 5–10 W m−2 for a soybean crop under favourable conditions (Meyers and Hollinger 2004). Adding this photosynthesis term in the SEB closure equation normally improves its closure, but a major part of the imbalance remains in most cases.
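The conversion of gross primary productivity into energetic units can be sketched as follows. The conversion factor is an assumption: it uses roughly 467 kJ per mole of CO2 fixed, which follows from the enthalpy of combustion of glucose (about 2803 kJ mol−1 for six carbon atoms). The cited studies may have used a different factor, and the sample GPP value is hypothetical.

```python
# Assumed energy fixed per mole of CO2, from glucose: 2803 kJ mol-1 / 6
ENERGY_PER_MOL_CO2 = 2803.0e3 / 6.0  # J mol-1


def biochemical_storage(gpp_umol):
    """Convert GPP (micromol CO2 m-2 s-1) into an energy flux (W m-2)."""
    return gpp_umol * 1.0e-6 * ENERGY_PER_MOL_CO2


# Hypothetical daytime GPP of a crop canopy, micromol CO2 m-2 s-1
storage_flux = biochemical_storage(20.0)  # roughly 9 W m-2
```

Under this assumption, typical daytime GPP values of crops translate into energy fluxes of a few to about twenty W m−2, i.e. the same order as the estimates quoted above.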

2.4 Effect of the Diurnal Cycle on the SEB

Figure 3a displays the evolution of the terms of the SEB equation obtained at a station on the island of Majorca for a clear-sky summer day with a very dry surface. The shaded area between −10 and 10 W m−2 indicates the range of values where the turbulent fluxes are below the limit of detection, and comprises most of the nocturnal values and some small values of the latent heat flux in the daytime that do not fulfil the quality criteria of Mauder et al. (2013).

Fig. 3 a Surface energy budget for a dry summer day (21 July 2016) at the University of the Balearic Islands, Mallorca, 10 km away from the coast. The black, yellow and green lines join the 30-min values of net radiation (Rn), the ground heat flux at the surface (G) and the imbalance (Imb). The symbols indicate the 30-min values of the turbulent sensible (H, brown squares) and latent (λE, blue triangles) heat fluxes. The rose band between −10 and 10 W m−2 marks the range corresponding to the detection limit of the turbulent fluxes. b The energy imbalance term (green line) compared to the estimated values of the thermal advection using land-surface temperature fields at different spatial resolutions from a remotely-piloted aircraft system and from satellite. The symbols indicate the advection estimates at scales around 10 m (yellow triangles), 50 m (red circles) and 100 m (blue squares). The crosses are the estimates from the satellites Landsat 7 at 30-m resolution (vertical crosses) and ASTER at 90-m resolution (tilted cross)

During daytime, when Rn is positive, this term is the largest and reaches its maximum in the central part of the day. Under these dry conditions, the evapotranspiration is small and most of the energy is used in transferring heat to the atmosphere by turbulent mixing (H), with a substantial part being transported into the ground (G). Since the soil is very dry, most of the heat storage takes place in the upper soil layers. The imbalance (Imb) is of the same order as G and very significant in the morning and evening transitions.

In the night-time, the radiative cooling of the surface is mostly compensated by heat transfer from the ground. The turbulent fluxes are very small, with some occurrences of positive values of λE above the detection limit indicating evaporation from the surface. Small nocturnal values for the turbulent fluxes are common, as shown in Cuxart et al. (2015, Table 1), where two-year averages for an inland site in the Iberian Peninsula indicate nocturnal values below 20 W m−2 for H and below 15 W m−2 for λE, values comparable to those of the Imb term.

Table 1 Overview of the most cited papers in Boundary-Layer Meteorology on the topic of SEB closure in chronological order; citation numbers are according to Web of Science (Clarivate Analytics), accessed on 18 March 2020

Therefore, the usually neglected processes in the SEB, such as storage, soil and plant respiration, lateral transport or significant vertical divergences of radiative or turbulent heat fluxes may have nocturnal contributions of the same order of magnitude as the turbulent fluxes. The result is that these processes should not be neglected and eventually the Imb term could be taken as the surrogate of their combined contribution. Similar arguments may be used in the morning and evening transitions, when the turbulent fluxes are small and the processes at the surface are very relevant. In absolute numbers, however, the Imb term is much larger during daytime.

For a better understanding of the processes causing this behaviour with the diurnal cycle, Fig. 3b compares the imbalance to advection estimates based on remotely-sensed land-surface temperature (LST) fields, following Garcia-Santos et al. (2019). In that work, LST was determined at 1-m resolution from a remotely-piloted aircraft system flying at a height of 200 m above the ground in the area surrounding the SEB station. The advection term is estimated for the volume of interest as

$$ A = \rho c_{p} z_{m} U(\Delta T/\Delta x) $$
(2)

where ρ is the air density, cp is the specific heat capacity of air at constant pressure, U is the wind speed, zm is the measurement height of the atmospheric variables, and ΔT is the temperature difference between two parcels separated by a distance Δx along the wind direction. This advection term is estimated using the wind speed at the SEB station and the LST of the adjacent parcels for different resolutions, obtained by degrading the original high-resolution field.
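The estimate in Eq. 2 can be illustrated with a minimal Python sketch. It is not the processing chain of Garcia-Santos et al. (2019): the block-averaging routine, the function names, the synthetic LST field, the constants for ρ and cp, and the assumption that the wind blows along the x axis are all introduced here for illustration only.

```python
import numpy as np

RHO_AIR = 1.2    # assumed near-surface air density (kg m-3)
CP_AIR = 1005.0  # assumed specific heat of air at constant pressure (J kg-1 K-1)


def block_average(field, factor):
    """Degrade a 2-D field by averaging non-overlapping factor x factor blocks."""
    ny, nx = field.shape
    ny_c, nx_c = ny // factor, nx // factor
    trimmed = field[:ny_c * factor, :nx_c * factor]
    return trimmed.reshape(ny_c, factor, nx_c, factor).mean(axis=(1, 3))


def advection_flux(delta_t, delta_x, wind_speed, z_m, rho=RHO_AIR, cp=CP_AIR):
    """Eq. 2: A = rho * cp * z_m * U * (delta_T / delta_x), in W m-2."""
    return rho * cp * z_m * wind_speed * delta_t / delta_x


# Synthetic 1-m LST field (K) with a uniform gradient of 0.005 K m-1 along x.
x = np.arange(200.0)
lst_1m = 300.0 + 0.005 * np.tile(x, (200, 1))

# Degrade to 50-m parcels and difference two adjacent parcels along x.
lst_50m = block_average(lst_1m, 50)
delta_t = lst_50m[0, 1] - lst_50m[0, 0]

# Hypothetical wind speed of 1 m s-1 and measurement height of 2 m.
adv_50m = advection_flux(delta_t, 50.0, wind_speed=1.0, z_m=2.0)
print(f"Advection estimate at the 50-m scale: {adv_50m:.2f} W m-2")
```

For a real LST field, the same calculation would be repeated for every pair of adjacent parcels and for each degradation factor, which is how the scale dependence of the estimate becomes visible.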

The symbols in Fig. 3b indicate the average values of the advection estimated at scales of 10 m, 50 m and 100 m. In addition, the same computation is made from the LST fields available for that day from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and Landsat 7, which are at resolutions of 90 and 30 m, respectively. It is seen that in the daytime the estimates at 50 and 100 m are comparable to the SEB imbalance, while those at 10 m are much larger and essentially describe small-scale heterogeneities that are expected to be blended by turbulence mixing. These probably arise from the high surface variability of the LST related to the patchiness of the vegetation cover at small scales in the summertime semi-arid conditions. At night, the advection estimates are significantly smaller than in the daytime and there is no evidence that the hectometre scale is the one closest to the imbalance, indicating that other mechanisms besides advection may contribute significantly to the lack of closure.

2.5 Sub-mesoscale Transport Processes and Secondary Circulations

Roll vortices are a very common phenomenon in the daytime convective boundary layer. In the presence of such large-scale organized structures, single-tower measurements must be biased, because the associated vertical energy transport is inherently not captured (Etling and Brown 1993). Under strongly unstable conditions, hexagonal cell-like structures develop instead of rolls, and these also result in a bias of single-tower measurements, if they are not carried along the EC system within a given flux-averaging time by the mean flow (Segal and Arritt 1992; Etling and Brown 1993). Both phenomena generally occur in the convective boundary layer even over homogeneous surfaces, as shown by many LES studies (Kanda et al. 2004; Huang et al. 2008; Patton et al. 2016; De Roo et al. 2018).

Over heterogeneous surfaces, additional flux-sampling errors are to be expected due to non-propagating circulations or standing cells, which are often linked to surface features, e.g., patches of different surface temperature (Blanford et al. 1991). Under such circumstances, ergodicity, and as a result Taylor’s frozen turbulence hypothesis, is violated, which is an important assumption underlying the EC method. As a consequence, “energy transported non-turbulently will not be sensed by EC systems and a bias towards lower energy fluxes will result” (Blanford et al. 1991). Only a spatial EC method, instead of the usual temporal EC method, will result in the correct flux estimates (Mahrt 1998).

After applying Reynolds averaging, the total sensible heat flux in kinematic units can be written as

$$ H_{kin} = \overline{wT} = \bar{w}\overline{{(T - T_{0} )}} + \overline{{ w^{\prime } T^{\prime } }} , $$
(3)

where w is the vertical wind speed, T is the temperature, and T0 is the so-called base temperature (e.g. Webb et al. 1980). It is the temperature from which each parcel of air is warmed (or cooled) during the vertical transfer of the heat supplied (or removed) at the underlying surface (Webb 1982). The overbar denotes a temporal average and the prime denotes a perturbation from a time-averaged quantity. Following Webb et al. (1980) the mean vertical velocity \( \bar{w} \) can be estimated as

$$ \bar{w} = \overline{{w^{\prime } T^{\prime } }} /\bar{T}. $$
(4)

The first term on the right-hand side of Eq. 3 is usually neglected for planar homogeneous flows, since \( \overline{{(T - T_{0} )}} \) is mostly much smaller than \( \bar{T} \). This leads to the conclusion that the transport of sensible heat can be described by the covariance only. However, this simplification may not be justified in some cases, as the first term can be expanded by using a perturbation from spatial averages, resulting in

$$ H_{kin} = \overline{wT} = \langle \bar{w} \rangle \langle \overline{{T - T_{0} }} \rangle + \langle \bar{w}^{*} \bar{T}^{*} \rangle + \overline{{ w^{\prime } T^{\prime } }} , $$
(5)

where angle brackets enclose a spatial average and a star indicates a local perturbation from a spatial average. The important second term on the right refers to the energy transported by standing eddies or spatially organized time-invariant convection cells, which is associated with a dispersive flux (Raupach and Shaw 1982). In contrast, the first term of Eq. 5 can now indeed be neglected in most cases with the assumption \( \bar{w} = 0 \).
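The decomposition in Eq. 5 can be checked numerically on synthetic multi-tower data. The sketch below is only an illustration under stated assumptions: the array layout (time along the first axis, towers along the second), the function name, the synthetic standing circulation and the noise amplitudes are all hypothetical, and the turbulent term is additionally averaged over the towers so that the three terms sum exactly to the space-time mean of the product.

```python
import numpy as np


def decompose_flux(w, theta):
    """Split the space-time mean of w*theta into the three terms of Eq. 5.

    w, theta: arrays of shape (n_time, n_towers); theta stands for T - T0.
    Returns (mean_term, dispersive_term, turbulent_term) in kinematic units.
    """
    w_bar = w.mean(axis=0)          # temporal mean at each tower
    th_bar = theta.mean(axis=0)
    # Temporal covariance at each tower, then averaged over all towers.
    turbulent = ((w - w_bar) * (theta - th_bar)).mean(axis=0).mean()
    # Local deviations of the temporal means from their spatial average.
    w_star = w_bar - w_bar.mean()
    th_star = th_bar - th_bar.mean()
    dispersive = (w_star * th_star).mean()
    mean_term = w_bar.mean() * th_bar.mean()
    return mean_term, dispersive, turbulent


rng = np.random.default_rng(0)
n_time, n_towers = 1000, 9

# Hypothetical standing cell: updraughts coincide with warmer air.
standing_w = np.linspace(-0.2, 0.2, n_towers)    # m s-1
standing_th = np.linspace(-0.5, 0.5, n_towers)   # K

w = standing_w + 0.3 * rng.standard_normal((n_time, n_towers))
theta = 2.0 + standing_th + 0.4 * rng.standard_normal((n_time, n_towers))

mean_term, dispersive, turbulent = decompose_flux(w, theta)
total = (w * theta).mean()
print(mean_term, dispersive, turbulent, total)
```

In this synthetic case the temporal covariance at any single tower contains none of the transport by the standing cell, which appears only in the dispersive term; this is the part that a single-tower EC system cannot sense.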

Such theoretical considerations on the SEB closure problem inspired researchers to apply spatial EC methods in field experiments by calculating spatial fluxes from aircraft and multi-tower data, as soon as such an undertaking became feasible. Indeed, they found additional (sub-)mesoscale fluxes sufficiently large in magnitude to explain a major part of the imbalance (Mauder et al. 2007a, 2008b, 2010).

Large-eddy simulation studies show that the sum of the additional dispersive fluxes is likely to be positive under unstable conditions in the presence of sub-mesoscale secondary circulations (Kanda et al. 2004), and hence the EC measurements systematically underestimate the total surface flux (Fig. 4). An analogous mechanism can be described for the latent heat flux as well (Huang et al. 2008). Hence, sub-mesoscale transport is the most prominent underlying hypothesis for studies on the SEB closure problem today.

Fig. 4

Schematic of the mechanism causing a systematic bias in tower-based EC measurements in the presence of large-scale organized structures in the convective boundary layer. Warm (moist) air near the surface is transported upward by secondary circulations, which leads to a positive sensible (latent) heat flux that cannot be captured by single-tower measurements, if these structures are not carried along the EC system by the mean wind within the flux-averaging time. Virtual control volumes are indicated by a blue box around the EC tower, indicating how secondary circulations (bending grey arrows) lead to advection locally. Please note that the potential temperature θ is generally lower when it enters the control volume than when it leaves the control volume, as long as the surface is heated and the atmosphere is unstably stratified. Also note that a red upward arrow and a blue downward arrow both represent a positive transport of sensible heat, because they correspond to upward and downward vertical motion

Several researchers attempted to quantify advective heat fluxes, either by using single towers (Gay and Bernhofer 1991; Bernhofer 1992b; Lee 1998; Paw et al. 2000), or directly by using multiple towers (Oncley et al. 2007; Moderow et al. 2007, 2009). However, accurate measurements of advection under field conditions remain a methodological challenge (Aubinet et al. 2000).

Moreover, a control volume approach was applied to LES, in order to explain the observed SEB closure at two contrasting sites in a highly heterogeneous setting (Eder et al. 2015a). According to earlier theoretical considerations (Finnigan et al. 2003; Wang 2010), the total surface flux of potential temperature in kinematic units, H0,kin, can be determined at a height zm as

$$ H_{0,\text{kin}} = \left. \overline{w^{\prime } \theta^{\prime } } \right|_{z_{\text{m}}} + \int_{0}^{z_{\text{m}}} \frac{\partial \theta }{\partial t}\,{\text{d}}z + \int_{0}^{z_{\text{m}}} \left( \bar{u}\frac{\partial \theta }{\partial x} + \bar{v}\frac{\partial \theta }{\partial y} \right){\text{d}}z + \int_{0}^{z_{\text{m}}} \bar{w}\frac{\partial \theta }{\partial z}\,{\text{d}}z + \int_{0}^{z_{\text{m}}} \left( \frac{\partial \overline{u^{\prime } \theta^{\prime } }}{\partial x} + \frac{\partial \overline{v^{\prime } \theta^{\prime } }}{\partial y} \right){\text{d}}z, $$
(6)

where the first term represents the usual EC measurement, the second term is the storage in the layer below (including above ground biomass), the third and fourth terms are horizontal and vertical advection respectively, and the fifth term is the horizontal flux divergence. An analogous equation can be written for the latent heat flux.
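As a rough illustration of how the terms of Eq. 6 combine, the following Python sketch integrates vertical profiles between the surface and zm with the trapezoidal rule and sums the five contributions. The function names, the three-level profiles and all numerical values are hypothetical; in an LES, these profiles would be taken from the model output of the control volume.

```python
import numpy as np


def integrate_profile(values, z):
    """Trapezoidal integral of a vertical profile between z[0] and z[-1]."""
    values = np.asarray(values, dtype=float)
    z = np.asarray(z, dtype=float)
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(z)))


def surface_flux_budget(ec_flux, z, dtheta_dt, u, v, w,
                        dtheta_dx, dtheta_dy, dtheta_dz,
                        dflux_dx, dflux_dy):
    """Evaluate the five terms of Eq. 6 (kinematic units, K m s-1)."""
    terms = {
        "eddy_covariance": ec_flux,
        "storage": integrate_profile(dtheta_dt, z),
        "horizontal_advection": integrate_profile(
            u * dtheta_dx + v * dtheta_dy, z),
        "vertical_advection": integrate_profile(w * dtheta_dz, z),
        "horizontal_flux_divergence": integrate_profile(
            dflux_dx + dflux_dy, z),
    }
    terms["total"] = sum(terms.values())
    return terms


# Hypothetical profiles on three levels between the surface and z_m = 2 m.
z = np.array([0.0, 1.0, 2.0])
budget = surface_flux_budget(
    ec_flux=0.10,                          # covariance measured at z_m
    z=z,
    dtheta_dt=np.full(3, 1.0e-4),          # K s-1
    u=np.full(3, 2.0), v=np.zeros(3),      # m s-1
    w=np.array([0.0, 0.005, 0.01]),        # m s-1
    dtheta_dx=np.full(3, 1.0e-3),          # K m-1
    dtheta_dy=np.zeros(3),
    dtheta_dz=np.full(3, -0.5),            # K m-1, unstable stratification
    dflux_dx=np.zeros(3), dflux_dy=np.zeros(3),
)
for name, value in budget.items():
    print(f"{name:28s} {value: .4f}")
```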

3 Investigations on Sub-mesoscale Transport Processes

3.1 Idealized LES Studies

Large-eddy simulation is a powerful tool for investigating the processes behind the energy balance closure problem quantitatively. It can help to explore the effect of large-scale coherent structures on single-tower measurements systematically and quantitatively under controlled conditions, where the true flux is known. As an early example, Lohou et al. (2000) used LES to examine the impact of coherent structures on vertical fluxes. Their finding has an important consequence for traditional flux measurements based on the hypothesis of isotropic and homogeneous turbulence, since it can explain part of the underestimation of the surface fluxes often reported in the literature.

Subsequently, Kanda et al. (2004) showed that secondary circulations lead to local advection by the mean flow, which cannot be captured by single-tower measurements, and that this leads to an underestimation of the EC measurement on average. A few years later, the LES study of Inagaki et al. (2006) distinguished between turbulent organized structures, which develop over homogeneous surfaces in the convective boundary layer, and thermally-induced circulations, which are driven by differential heating. Moreover, LES was used to assess the representativity bias of an EC measurement and to calculate the number of towers needed to capture the total surface flux using spatial eddy covariance (Steinfeld et al. 2007).

Using idealized LES over homogeneous surfaces, Huang et al. (2008) found a relationship of the energy balance residual with \( u_{*} /w_{*} \), where \( w_{*} \) is the convective velocity scale (Deardorff 1970). A dependence on \( u_{*} \) was also found in a year-long simulation by Schalkwijk et al. (2016). The LES study of Zhou et al. (2018) found, for homogeneous surface forcing, that the imbalance depends on atmospheric stability (and hence also on \( u_{*} \)). For heterogeneous surfaces, it additionally depends on the turbulent kinetic energy and the difference between the potential temperature at the measurement height and surface temperature.
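For orientation, the stability parameter \( u_{*} /w_{*} \) can be computed as in the short Python sketch below. It assumes the standard definition of the convective velocity scale, \( w_{*} = [(g/\theta_{\text{v}})\,(\overline{w^{\prime } \theta_{\text{v}}^{\prime } })_{0}\, z_{i}]^{1/3} \), which is not written out in the text; the function name and the input values are hypothetical.

```python
def convective_velocity_scale(surface_buoyancy_flux, z_i,
                              theta_v=300.0, g=9.81):
    """Convective velocity scale w* (m s-1), assuming the standard definition.

    surface_buoyancy_flux: kinematic surface flux of virtual potential
        temperature (K m s-1); must be positive (convective conditions).
    z_i: boundary-layer height (m).
    theta_v: reference virtual potential temperature (K).
    """
    if surface_buoyancy_flux <= 0.0:
        raise ValueError("w* is only defined for a positive buoyancy flux")
    return (g / theta_v * surface_buoyancy_flux * z_i) ** (1.0 / 3.0)


# Hypothetical daytime values.
u_star = 0.3                                        # friction velocity (m s-1)
w_star = convective_velocity_scale(0.2, z_i=1000.0)
print(f"w* = {w_star:.2f} m s-1, u*/w* = {u_star / w_star:.2f}")
```

Small values of \( u_{*} /w_{*} \) correspond to buoyancy-dominated conditions, whereas larger values indicate an increasing influence of shear.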

Recently, De Roo and Mauder (2018) investigated the dependence of the flux underestimation on the scale of heterogeneity and on the location within a heterogeneous landscape. They found that horizontal flux divergence and advection are anti-correlated for heterogeneous heating with patch sizes on the order of kilometres, but advection dominates the overall effect. However, for heterogeneous heating with patch sizes on the order of hectometres, horizontal flux divergence and advection are positively correlated, so that flux divergence amplifies the effect of advection by the mean flow. Moreover, areas with higher than average sensible heat fluxes generally have better closure (Fig. 5).

Fig. 5

Correlation between normalized horizontal flux divergence and normalized advection by the mean flow versus energy balance ratio for kilometre-scale heterogeneity (top row) and hectometre-scale heterogeneity (bottom row). Here, the energy balance ratio is defined as the ratio of the simulated eddy-covariance estimate at the measurement height to the surface flux. Modified after De Roo and Mauder (2018)

3.2 Spatially-Resolving Field Measurements

3.2.1 Multi-tower Experiments

EBEX-2000 was the first multi-tower experiment addressing the SEB closure problem. It comprised 10 sites on an area of 1600 × 800 m2, and at several of the sites more than one EC system was deployed in order to characterize the instrumental uncertainty with side-by-side comparisons and in order to quantify vertical flux divergence with measurements at different heights (Oncley et al. 2007). In addition, the horizontal advection of sensible and latent energy by the mean flow was quantified from this rich dataset, and the energy balance closure gap was reduced by approximately 50%, albeit with quite a large uncertainty.

Inspired by the LES study of Steinfeld et al. (2007), Mauder et al. (2008b) conducted a multi-tower experiment with 25 towers distributed over an area of 4 × 4 km2. A simplified spatio–temporal EC technique was employed, so that only one tower was required to have high-frequency turbulence measurements and the other 24 were equipped with slow-response sensors, which were used to determine a spatio–temporal mean temperature. The goal was to capture the flux due to large-scale organized structures, which was partly successful. This approach helped to improve energy balance closure, but it was not able to measure the entire exchange of energy between the surface and the atmosphere, probably because it still invokes assumptions during the derivation of the method, amongst others horizontal homogeneity, which was obviously not fulfilled (Mauder et al. 2010).

A similar idea was implemented by Engelmann and Bernhofer (2016) in an experiment using nine sonic anemometers in an array covering 10 × 10 m2. The resulting sensible heat flux increased by 9% when applying spatio–temporal instead of purely temporal averaging for the covariance calculation. Other studies also report that accounting for dispersive fluxes yields a better spatial ensemble (Christen and Vogt 2004; Poggi et al. 2004; Klosterhalfen et al. 2019) and highlights the limitations of Taylor’s frozen turbulence hypothesis (Tsvang et al. 1991).

Zhang et al. (2010) analyzed turbulence measurements at different heights during EBEX-2000, and they extracted two important findings. Firstly, they focused on an obvious sawtooth pattern in the latent heat flux. Despite a smooth forcing by Rn, large variations in λE were observed with magnitudes as high as 200 W m−2 (Fig. 6). These variations suggest that other external forcings beyond Rn affect evapotranspiration. Moreover, these variations in λE are more significant at a height of 8.7 m than at 2.7 m and follow a similar pattern in the afternoon. This behaviour can be explained by turbulent organized structures penetrating from above, thereby transporting latent heat away from the surface in those 30-min periods with very low λE. However, they are not fully captured by the EC system, because they are either at time scales larger than the 30-min averaging time or not propagating with the mean flow.

Fig. 6

Diurnal variations in the energy balance components during EBEX-2000 on 7 August 2000. The fluxes were measured at 8.7 and 2.7 m above ground level. Ruw is the correlation coefficient between the u and w velocity components and Res is the energy balance residual (Zhang et al. 2010, (C) Springer Nature)

This interpretation is supported by spectral analysis. When the authors classified the entire dataset into two groups, one with high latent heat fluxes (HF) and one with low latent heat fluxes (LF), they found striking differences in the spectra between the two heights, particularly also in the cospectra. The ensemble cospectra for the HF cases behave as expected in accordance with common cospectral models. However, for the LF cases, cospectra are depressed in the mid-frequency range. This effect is stronger for the measurements at 8.7 m than for those at 2.7 m (Fig. 7). The reduced latent heat fluxes of the LF group are primarily responsible for the concurrently increased residuals of the SEB closure (Fig. 6). It is intriguing that the energy balance was closed for this EBEX-2000 dataset when correcting for the phase shift between vertical velocity and the respective scalar, either temperature or humidity (Gao et al. 2017). Based on this result, the authors conclude that enlarged phase differences of large eddies are linked to entrainment or advection occurrence, which leads to increased residuals of the surface energy balance.

Fig. 7

Cospectra for uw and wq as a function of normalized frequency for HF cases (a, b) and LF cases (c, d) as measured during EBEX-2000. Here, \( u_{*} \) is the friction velocity and \( q_{*} = \overline{{w^{\prime } q^{\prime } }} /u_{*} \). Modified after Zhang et al. (2010), (C) Springer Nature

Eder et al. (2015b) reported a similar sawtooth pattern to that in Zhang et al. (2010); however, it affected both latent and sensible heat fluxes equally at their site, which may be related to the much higher Bowen ratio there. Similar to Zhang et al. (2010), high heat fluxes (and small energy balance residuals) were correlated with high momentum fluxes or friction velocity \( u_{*} \). An anti-correlation between \( u_{*} \) and the energy balance residual was found at many other sites (Hendricks-Franssen et al. 2010; Stoy et al. 2013; Eder et al. 2015a). Also similar to Zhang et al. (2010), low heat fluxes (and large energy balance residuals) were correlated with increased low-frequency motions in the horizontal velocity spectra. While Zhang et al. (2010) based their interpretation only on indirect evidence of large organized eddies, Eder et al. (2015b) actually measured the large organized eddies, either roll-like or cell-like, depending on the background wind speed, by using a triple Doppler lidar configuration, which further substantiates this explanation.

3.2.2 Airborne Measurements

Airborne measurements represent one of the most direct means of obtaining spatially-resolved 3D turbulence data (Mahrt 1998). Therefore, such datasets illustrate the presence of secondary circulations in the surface layer. With the help of wavelet analysis, it is possible to extract the associated flux at longer wavelengths (Mauder et al. 2007a). It was shown for a dataset of 20 flights, at 30-m height over a length of 115 km above heterogeneous boreal forest, that the mesoscale flux is of similar magnitude to the lack of energy balance closure of nearby EC towers (Mauder et al. 2007a). Eder et al. (2014) analyzed the same dataset to investigate the Bowen ratio of the mesoscale fluxes. They found the mesoscale Bowen ratio to be similar to the small-scale turbulent Bowen ratio in most cases.

For a heterogeneous, but flat, region in Central Europe, Foken et al. (2010) found that the energy balance closure generally improved when using the airborne flux measurements instead of a tower-based composite, although the uncertainty was quite large. Despite their advantages, airborne measurements also pose challenges, e.g., the need for extensive flow-distortion corrections (Metzger et al. 2012), the constantly changing flux footprint along the flight track (Schuepp et al. 1990; Mauder et al. 2008a) and the high costs and security requirements of airborne operations.

3.2.3 Scintillometers

Scintillometers also provide spatially-integrated flux measurements, which prompted researchers to use these instruments in order to capture the otherwise neglected flux by non-propagating large eddies. However, these instruments do not measure the heat flux directly and are only sensitive to a part of the turbulence spectrum, i.e. the inertial subrange, and they rely on the validity of Monin–Obukhov similarity theory in order to determine fluxes from the measured structure parameters with all its implicit assumptions (Ward 2017). The underlying similarity functions have been determined from tower-based EC measurements and therefore are not disconnected from the energy balance closure problem. Nevertheless, scintillometers often yield improved SEB closure, in particular large-aperture scintillometers with a measurement path on the order of kilometres and measurement heights on the order of 10 m.

For the LITFASS-2003 field campaign, a combination of an infrared and a microwave scintillometer was deployed together with several SEB stations underneath the measurement path. Results show that the scintillometer-based fluxes are considerably larger than the composite of tower fluxes for the same footprint and that the energy balance closure is improved when the scintillometer-based fluxes are used (Meijninger et al. 2006). According to these measurements, the underestimation of the EC-based latent heat flux was twice as large as the underestimation of the EC-based sensible heat flux. Moreover, Foken et al. (2010) showed that scintillometer-based fluxes are in close agreement with the LES-based spatial EC data for that area.

Similar comparisons between scintillometer and EC measurements were conducted for large-scale field campaigns in China (Liu et al. 2011; Xu et al. 2017b). Here, the SEB closure improved considerably when using the scintillometer-based fluxes, and the flux underestimation was mostly attributed to the sensible heat flux, while the EC-based latent heat flux agreed well with the scintillometer-based latent heat flux.

3.2.4 Lidar Measurements

Lidar technology has developed rapidly in recent years, and these remote sensing instruments represent another means of obtaining spatially-resolved measurements of atmospheric variables. This approach is particularly promising for the future, since a large amount of spatial data can be obtained simultaneously and continuously along the laser path for quite a large spatial extent without interfering with the atmospheric variables to be measured. However, for 3D turbulent wind measurements, a coordinated scan pattern of at least three Doppler lidars is required (Mann et al. 2009) and at the current resolution of Raman and differential absorption lidars it is still challenging to obtain turbulent fluxes directly (Wulfmeyer et al. 2018; Behrendt et al. 2019).

Single lidar measurements by Drobinski et al. (1998) were, more or less, the first to provide observational evidence of organized large eddies in the surface layer. Eder et al. (2015b) provided a more detailed picture of these organised large eddies by using a triple lidar set-up, and related the associated long-wavelength fluxes to the observed pattern in SEB closure at two EC stations for that area. The advective flux of water vapour that results from heterogeneity-induced secondary circulations was quantified by Higgins et al. (2013) using a high-resolution Raman lidar.

3.3 Integrated Studies

Let us first define what we mean by integrated studies in the context of SEB closure: these are experiments that combine multi-tower measurements with airborne and/or lidar measurements and LES in an integrative way to obtain a better understanding of the transport processes in the atmospheric boundary layer leading to the observed lack of SEB closure in EC tower measurements.

The first realistic LES study on energy balance closure for a multi-tower experiment was conducted by Brötz et al. (2014) for the Convective and Orographically induced Precipitation Study (COPS) campaign in summer 2007 (Wulfmeyer et al. 2011). Here, a sodar was used together with several EC towers. The authors found that the energy balance in the surface layer was closed during low wind periods, but was not closed after the onset of a local valley breeze. The associated LES analysis indicates that the missing flux components of sensible heat are the main reason for the unclosed energy balance in the considered situations.

Another example of an integrative study on the SEB closure problem was conducted by Eder et al. (2015a), who deployed two EC towers, one Doppler lidar and a backscatter lidar in and around a solitary forest in the Negev desert in Israel. This pine plantation has an extent of several kilometres and is surrounded by semi-arid shrubland, which constitutes a very strong and distinct heterogeneity with respect to albedo and roughness. Interestingly, the energy balance is closed for the tower inside the forest, while an imbalance of roughly 20% is found at the shrubland site, although both sites are expected to be heavily influenced by advection. This systematic distinction between the two sites was reproduced in the associated LES study, which shows that 3D advection and horizontal flux divergence almost cancel each other out over the forest, which has a high sensible heat flux, but they do not in cooler areas, so that the heat fluxes are underestimated using the EC method. This unexpected finding prompted a systematic LES study by De Roo and Mauder (2018), which showed that this behaviour can be generalized if the scale of heterogeneity is in the kilometre range.

A simplified approach to estimating the impact of surface heterogeneity in the SEB equation was made by Cuxart et al. (2016) using data from the Boundary Layer Late Afternoon and Sunset Turbulence (BLLAST) campaign in the northern foothills of the Pyrenees (Lothon et al. 2014). There, a number of LST fields from different sources and scales were used to estimate the lateral advection in a volume of depth zm, typically taken as the height of the EC system, using the approach described in Sect. 2.4. Land-surface temperature was taken as an approximation of the surface-layer temperature and the wind speed was taken as 1 m s−1, with the intention of comparing the order of magnitude of advection at different scales, using numerical model results, satellite imagery, unmanned aircraft system measurements, and thermal photography. It was found (see their Table 1) that the variability of LST decreases very little with increasing scale and, in consequence, the importance of the advection term increases with resolution.

When compared with the imbalance, it is seen that advection is very small for the kilometre scales, but it is much larger for the decametre and metre scales. The hectometre scales, which are related to the elements of the landscape, have values of advection of the same order of magnitude as the imbalance, and are candidates in generating non-random transport, in good correspondence with the numerical results of De Roo and Mauder (2018).

These results were further explored with measurements taken at a flat site and with heterogeneous semi-rural land use on the island of Mallorca in 2016. A display of nine masts with temperature and moisture at 2 and 0.2 m above the surface, wind speed and soil measurements, separated by about 150 m, was used to estimate advection for a centrally located SEB station (Simó et al. 2019). Air temperature instead of LST and the actual wind speed were used to overcome some limitations of the previous study. The results found in BLLAST were essentially confirmed for the hectometre scales, with values of advection between 10 and 20 W m−2 in the day in 3-h averages (even larger in 30-min averages), and between 1 and 10 W m−2 at night. These values were consistent with the satellite-derived estimations of Garcia-Santos et al. (2019) for the same experiment.

Another kind of integrated experiment was conducted by Cheng et al. (2017), who used high-resolution spatially-distributed temperature data from fibre-optics measurements (distributed temperature sensing) in combination with LES. They found that, due to the violation of Taylor’s hypothesis, energy spectra are underestimated in the inertial subrange by 10–30%, and they proposed a correction to compensate for this underestimation, which can be applied to single-tower EC measurements. It is speculated that the resulting fluxes would lead to an improved SEB closure, although this remains to be confirmed at the time of writing.

4 Partitioning of the Energy Balance Residual

The residual Imb term of the surface energy balance can be determined when all four main terms are measured independently (Eq. 1). In order to use this residual for correcting the measured turbulent heat fluxes, it is critical to know how the otherwise neglected sub-mesoscale transport ought to be partitioned between the sensible and the latent heat fluxes. The most straightforward method preserves the Bowen ratio, i.e., adjusts both the sensible and the latent heat fluxes by the same percentage, based on the 30-min energy balance residual (Twine et al. 2000). At first, this kind of partitioning was chosen in the absence of a better knowledge of the underlying processes and under the assumption of scalar similarity. It was later substantiated by wavelet analysis of the low-wavenumber flux contributions for airborne EC measurements, indicating that the Bowen ratio for mesoscale fluxes is often similar to the small-scale turbulent Bowen ratio (Mauder et al. 2007a; Eder et al. 2014).
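A minimal Python sketch of the Bowen-ratio-preserving adjustment for a single 30-min interval is given below. It follows the sign convention of Eq. 1, with the residual defined as Rn − G − H − λE; the function name and the flux values are hypothetical.

```python
def bowen_ratio_closure(h, le, rn, g):
    """Scale H and LE by the same factor so that H + LE = Rn - G.

    All fluxes in W m-2 for one averaging interval. Because both fluxes
    are multiplied by the same factor, the Bowen ratio H/LE is unchanged.
    """
    available_energy = rn - g
    turbulent_sum = h + le
    if turbulent_sum == 0.0:
        raise ValueError("H + LE is zero; the scaling factor is undefined")
    factor = available_energy / turbulent_sum
    return h * factor, le * factor


# Hypothetical midday values (W m-2).
h, le, rn, g = 200.0, 150.0, 500.0, 50.0
residual = rn - g - h - le
h_corr, le_corr = bowen_ratio_closure(h, le, rn, g)
print(f"residual = {residual:.0f} W m-2")
print(f"H: {h:.0f} -> {h_corr:.1f}, LE: {le:.0f} -> {le_corr:.1f}")
```

The scaling factor becomes ill-conditioned when H + λE is close to zero, for example around the morning and evening transitions, so such intervals would need to be screened out in practice.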

Comparisons of EC-based latent heat flux measurements with lysimeter measurements as a reference also support an SEB closure correction that is roughly Bowen-ratio preserving (Gebler et al. 2015; Hirschi et al. 2017). Widmoser and Wohlfahrt (2018) agree with this kind of partitioning in principle, and they stress the importance of considering the uncertainty in the measurement of the available energy when such a correction is practically applied for non-ideal sites. For practical reasons, Mauder et al. (2013) proposed a variation of the Bowen-ratio-preserving partitioning method. Instead of closing the energy balance for every 30-min interval, they use daily energy balance ratios as the basis for the adjustment in order to reduce the scatter in the resulting heat fluxes. In addition, they apply this method only during daytime based on the argument that large-scale organized structures generally develop only in convective boundary layers.
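The daily variant can be sketched along the following lines. This is a simplified reading of the idea described above and not the implementation of Mauder et al. (2013): the daytime criterion (Rn above a fixed threshold), the function name and the example values are assumptions made for illustration.

```python
import numpy as np


def daily_ebr_correction(h, le, rn, g, daytime_threshold=20.0):
    """Adjust daytime H and LE with a single daily energy balance ratio.

    h, le, rn, g: 30-min fluxes (W m-2) for one day.
    daytime_threshold: assumed Rn limit (W m-2) that defines daytime.
    Returns (ebr, h_corrected, le_corrected); night-time values are unchanged.
    """
    h, le, rn, g = (np.asarray(a, dtype=float) for a in (h, le, rn, g))
    daytime = rn > daytime_threshold
    ebr = (h[daytime] + le[daytime]).sum() / (rn[daytime] - g[daytime]).sum()
    h_corr, le_corr = h.copy(), le.copy()
    h_corr[daytime] = h[daytime] / ebr
    le_corr[daytime] = le[daytime] / ebr
    return ebr, h_corr, le_corr


# Hypothetical series: one night-time and three daytime intervals (W m-2).
rn = [-50.0, 200.0, 500.0, 300.0]
g = [-40.0, 20.0, 50.0, 30.0]
h = [-5.0, 80.0, 200.0, 100.0]
le = [2.0, 60.0, 150.0, 90.0]

ebr, h_corr, le_corr = daily_ebr_correction(h, le, rn, g)
print(f"daily energy balance ratio = {ebr:.3f}")
print("corrected H:", np.round(h_corr, 1))
```

With a single daily ratio, individual 30-min intervals are not forced to close, but the daytime sum of the adjusted turbulent fluxes matches the daytime sum of the available energy.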

There are, however, also studies that point towards a different partitioning of the residual. For example, Wohlfahrt et al. (2010) found that EC measurements at a mountainous grassland site agree best with lysimeters, if the entire residual is attributed to the latent heat flux. In contrast, Ingwersen et al. (2011) compared the measured EC fluxes over an agricultural site with data from a land-surface model, and found best agreement when the entire residual is attributed to the sensible heat flux. Charuchittipan et al. (2014) proposed to attribute the residual mostly, but not entirely, to the sensible heat flux. They relate the correction factor to the ratio between the buoyancy flux and the sensible heat flux, so that the fraction of the residual attributed to the sensible heat flux is calculated as

$$ f_{HB} = \frac{{\overline{{w^{\prime } T^{\prime } }} }}{{\overline{{w^{\prime } T_{\text{v}}^{\prime } }} }} = \left( {1 + 0.61\frac{{\bar{T}c_{\text{p}} }}{\lambda Bo}} \right)^{ - 1} , $$
(7)

where \( \overline{{w^{\prime } T_{\text{v}}^{\prime } }} \) is the buoyancy flux computed from the fluctuation of the virtual temperature Tv, λ is the latent heat of vaporization, and Bo is the Bowen ratio.
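Equation 7 can be evaluated as in the following Python sketch, which also splits a residual between the two fluxes accordingly. The constants for cp and λ, the air temperature, the function names and the flux values are assumptions chosen for illustration.

```python
CP_AIR = 1005.0   # assumed specific heat of air (J kg-1 K-1)
LAMBDA_V = 2.45e6 # assumed latent heat of vaporization (J kg-1)


def sensible_fraction(bowen_ratio, t_mean, cp=CP_AIR, lam=LAMBDA_V):
    """Eq. 7: fraction of the residual attributed to the sensible heat flux."""
    return 1.0 / (1.0 + 0.61 * t_mean * cp / (lam * bowen_ratio))


def buoyancy_flux_closure(h, le, rn, g, t_mean):
    """Distribute the residual Rn - G - H - LE between H and LE using Eq. 7."""
    residual = rn - g - h - le
    f_hb = sensible_fraction(h / le, t_mean)
    return h + f_hb * residual, le + (1.0 - f_hb) * residual


# Fraction for two hypothetical Bowen ratios at a mean temperature of 293 K.
f_bo_1 = sensible_fraction(1.0, 293.0)
f_bo_02 = sensible_fraction(0.2, 293.0)
print(f"f_HB(Bo = 1.0) = {f_bo_1:.3f}, f_HB(Bo = 0.2) = {f_bo_02:.3f}")

# Hypothetical midday fluxes (W m-2).
h_corr, le_corr = buoyancy_flux_closure(200.0, 150.0, 500.0, 50.0, 293.0)
print(f"H -> {h_corr:.1f}, LE -> {le_corr:.1f}")
```

Even for a Bowen ratio as low as 0.2, more than 70% of the residual is assigned to the sensible heat flux in this example, which illustrates why this correction attributes the residual mostly, but not entirely, to H.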

This approach was tested against simulations using a third-order closure model for tall vegetation by Gatzsche et al. (2018), who found good agreement with the buoyancy-flux-based correction for larger Bowen ratios, while the Bowen-ratio-preserving correction shows a better agreement for Bo < 1.5 (Fig. 8). Pointing in a similar direction, Mauder et al. (2018) found, for a comparison with lysimeters at two grassland sites, that evapotranspiration is overestimated using a Bowen-ratio-preserving adjustment method and underestimated using a buoyancy-flux-based correction method.

Fig. 8 Fraction of the residual attributed to the sensible heat flux for the forest site Waldstein–Weidenbrunnen (DE-Bay), under the assumption that the ACASA model calculated the true Bowen ratio, and according to the correction methods with the Bowen ratio (Twine et al. 2000) and the buoyancy flux (Charuchittipan et al. 2014), after Gatzsche et al. (2018)

5 Attempts to Parametrize the Magnitude of the Residual

As soon as researchers recognized that the lack of SEB closure is not only an instrumental problem, attempts were made to find systematic relationships with potential drivers in order to predict the magnitude of the residual. Panin et al. (1998) analyzed data from several micrometeorological experiments and found indications of a relationship between the underestimation of turbulent fluxes and terrain inhomogeneity. In order to systematically correct for this effect, they suggested a scheme that uses the fetch over the different surface types in the environment surrounding the sites. Later, the principal idea behind this correction was refined by introducing a metric for landscape-scale heterogeneity rather than the heterogeneity within the flux footprint (Panin and Bernhofer 2008). However, this correction would only predict the average flux underestimation for a certain site and not its variation with changes in stability.

Huang et al. (2008) present a correction based on an LES parameter study with purely homogeneous surface forcing. They identify the non-local stability parameter \( u_{*} /w_{*} \) and the normalized measurement height \( z_{m} /z_{i} \) as suitable predictors of the energy balance residual. Note that both scaling variables depend on the boundary-layer height zi. Although pioneering, this parametrization has hardly been applied to real-world data, mostly because the correction function is not valid for measurement heights in much of the surface layer, where EC tower measurements are usually conducted (Eder et al. 2014). This limitation was due to a relatively coarse grid resolution, imposed by the computational resources available at the time.

Recently, De Roo et al. (2018) conducted a similar parameter study, this time with almost ten times finer grid resolution, made possible by advances in high-performance computing and a newly developed vertical-grid refinement method. They confirmed the principal functional relationship of the residual with \( u_{*} /w_{*} \) and \( z_{m} /z_{i} \) also for lower measurement heights within the surface layer, and they provide separate correction functions for the sensible and the latent heat fluxes, so that a larger part of the residual is attributed to the sensible heat flux rather than the latent heat flux. This correction is only valid for zi/L < 0 because unstable stratification is a prerequisite for the development of large-scale organized structures in the boundary layer (L is the Obukhov length).
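The two predictors and the applicability condition zi/L < 0 can be computed from standard tower quantities plus an estimate of zi. The Python sketch below is a minimal illustration that assumes the usual textbook definitions of the convective velocity scale, w* = (g/Tv · \( \overline{w' T_{\text{v}}'} \) · zi)^(1/3), and of the Obukhov length; the exact definitions used in the cited LES studies may differ, and the fitted correction functions themselves are not reproduced here. The function name and the returned dictionary keys are hypothetical.

```python
G = 9.81     # m s-2, gravitational acceleration
KAPPA = 0.4  # von Karman constant (assumed value)

def les_predictors(u_star, wtv, t_v, z_m, z_i):
    """Return u*/w*, z_m/z_i and z_i/L, or None if conditions are not unstable.

    u_star : friction velocity (m s-1)
    wtv    : kinematic surface buoyancy flux (K m s-1)
    t_v    : mean virtual temperature (K)
    z_m    : measurement height (m)
    z_i    : boundary-layer height (m)
    """
    if u_star <= 0.0:
        raise ValueError("friction velocity must be positive")
    if wtv <= 0.0:
        return None  # z_i/L >= 0: outside the stated range of validity
    w_star = (G / t_v * wtv * z_i) ** (1.0 / 3.0)
    obukhov = -u_star**3 * t_v / (KAPPA * G * wtv)
    return {
        "u_star_over_w_star": u_star / w_star,
        "z_m_over_z_i": z_m / z_i,
        "z_i_over_L": z_i / obukhov,
    }

# Made-up convective daytime example
print(les_predictors(u_star=0.3, wtv=0.1, t_v=300.0, z_m=3.0, z_i=1000.0))
```

With these definitions, zi/L equals −κ (w*/u*)³, so the stability condition and the first predictor carry the same information; returning both is merely a convenience for screening data.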

6 Summary, Open Questions and Outlook

The enigmatic SEB closure problem has inspired a large number of researchers to revisit questions regarding data quality control, flux corrections, and the theoretical foundations of the EC method. Hence, other related fields of research besides micrometeorology, such as wind-energy research, hydrology and ecology have benefited from the substantial scientific progress that has been triggered by this puzzling finding.

The underlying reasons for the general underestimation of turbulent fluxes are much clearer today than they were 20 years ago when the EBEX-2000 field campaign was organized. We can now rule out instrumental errors as a major contributor to the missing flux; moreover, uncertainties due to the choice of postprocessing software have been minimized. (Sub-)mesoscale transport has been identified as the main reason for the non-closure, which is manifest as a dispersive flux when taking a non-local perspective and as advection by the mean flow and horizontal flux divergence when taking a local perspective.

This improved understanding of the underlying process, together with dedicated comparison experiments with independent estimates of the turbulent heat fluxes, has also led to greater clarity on the partitioning of the residual. If available energy is measured accurately, and if sites are flat and homogeneous within the flux footprint area, the residual can be distributed between the sensible and latent heat fluxes. Recent evidence points towards a partitioning somewhere between a buoyancy-flux-based and a Bowen-ratio-preserving adjustment method, meaning that sensible heat fluxes often ought to be corrected by a larger percentage than latent heat fluxes. High-resolution LES support this notion, and the dissimilarity between the sensible and latent heat fluxes with respect to large-scale organized structures can be explained by the dissimilarity of the vertical flux and concentration profiles, which is often found in the convective boundary layer (Huang et al. 2009). If we follow this line of argument, we would expect that other trace gas fluxes are also affected by the same transport phenomena, but probably in a dissimilar manner.

Systematic high-resolution LES studies appear to be the method of choice for developing a parametrization of the magnitude of the energy balance residual, because the true surface flux is known and instrumental errors can be excluded. The LES-based energy flux correction recently proposed by De Roo et al. (2018) can now be tested and validated for different real-world sites with independent measurements of energy fluxes (tovi.io/collaboration). This method relies mostly on driving variables, which are measured by an EC system anyway, i.e. friction velocity, sensible heat flux, air temperature, and the measurement height. In addition, an estimate for the boundary-layer height is required, which can be obtained either from collocated ground-based remote sensing systems, or by using a slab model (e.g. Tennekes 1973; Batchvarova and Gryning 1990), or from reanalysis data.

Nevertheless, there are still a number of open questions with respect to the SEB closure problem. For example, it can be expected from Patton et al. (2016) that consideration of the feedback between large-scale coherent structures and plant canopies will improve the predictability of the energy balance residual near the surface. Furthermore, it remains to be explored how the impact of surface heterogeneity on the residual can be incorporated in this model (De Roo and Mauder 2018; Zhou et al. 2019).

New projects with relevance to the SEB closure problem are ongoing at the time of writing. The integrated studies of CHEESEHEAD (Chequamegon Heterogeneous Ecosystem Energy-balance Study Enabled by a High-density Extensive Array of Detectors) and IPAQS (Idealized Planar Array experiment for Quantifying Surface heterogeneity) aim at employing a spatial EC method by using data from a large number of fast-response tower measurements, which allow the direct determination of dispersive fluxes (Margairaz et al. 2020). These measurements are combined with LES, ground-based remote sensing and airborne measurements, in order to gather as much information on the 3D transport mechanisms in the boundary layer as possible. Another promising complementary approach is the use of machine-learning techniques for modelling the energy imbalance and the spatial representativeness of single-tower measurements (Xu et al. 2018). Since the lack of SEB closure is not an instrumental problem, its relevance to a possible correction of trace gas fluxes such as carbon dioxide is still an open question. The progress in the investigation of SEB closure over the last decade could provide an impetus to transfer ideas and concepts to the correction of other trace gas fluxes.