Introduction

The distribution of vegetation within canopies varies with tree allometry and competition strategies, leading to variations in canopy structure and ultimately, in environmental conditions (Purves et al. 2007; Thorpe et al. 2010; Pretzsch and Dieler 2012). Species-specific canopy structures create different microhabitats, light conditions, and microclimates, which in turn influence the rates at which stands sequester carbon. Species have different carbon allocation strategies that evolve during the growth season, relative to above and below-ground carbon allocation. Examples include growing fruit, deploying leaves, and growing roots for better access to nutrients and water (De Pury and Farquhar 1997; Lacointe 2000; Stark et al. 2012). These growth and allocation strategies result in distinct species-specific tree shapes.

Traditionally, the characterization of stand vertical distribution of aboveground biomass required either on-site estimation of biomass per vertical layer (often requiring destructive measurements), scaffolding, or tree climbing, all of which are time-consuming and limit surveys to small areas (MacArthur and Horn 1969; Aber 1979; Bassow and Bazzaz 1997; Tackenberg 2007).

The increasing availability of data collected remotely from airborne, spaceborne, and terrestrial sensors has improved our understanding of forest dynamics (White et al. 2013; Wulder et al. 2012; Beland et al. 2019). Among these sensors, light detection and ranging (LiDAR) has the ability to penetrate the canopy and provide information on the spatial location of reflective material. LiDAR is increasingly used to perform both extensive and highly detailed imaging of forest. It uses a laser, at a precisely known location and orientation, to record a 3D scene. A discrete LiDAR pulse is reflected back to the sensor by the vegetation as it penetrates the canopy. The sensor then records the total travel time of the pulse and its energy to deduce the distance from the sensor to the location of reflection. LiDAR can be used from both the ground, and an aircraft to provide different perspectives on the forest. Terrestrial LiDAR provides highly detailed structural information about individual trees, but its extent is limited to a few hundreds of meters (Beland et al. 2019; Crespo-Peremarch et al. 2020). Airborne LiDAR, on the other hand, is typically less detailed than terrestrial LiDAR but can be used for extensive landscape measurements.

Airborne LiDAR has been used to characterize vertical canopy structure, crown shape, and aboveground biomass for entire ecosystems (Lefsky et al. 2002; Parker et al. 2004; Coops et al. 2007; Stark et al. 2012; Harding et al. 2001; Cao et al. 2014; Ellsworth and Reich 1993; Papa, 2020). It has also been used to identify tree species (Heinzel and Koch 2011, 2012; Vaughn et al. 2012; Axelsson et al. 2018; Hovi et al. 2016; Fassnacht et al. 2016; Budei et al. 2018; Fedrigo et al. 2018) and to study stand characteristics such as age, crown cover, and basal area (Korhonen et al. 2011; Racine et al. 2014; White et al. 2013; Karna et al. 2019). Airborne LiDAR is an important tool for biomass quantification and offers the opportunity to explore large-scale phenomena that could only be observed on the field (Vierling et al. 2008, 2010; Seavy et al. 2009; Karna et al. 2020).

One limitation of the airborne LiDAR data is its inability to distinguish differences in foliage condition or in species spectral variation. This is because it lacks the spectral information commonly used to classify species such as variations in red, green, and blue or near-infrared multispectral imagery. In response, the most common strategy for distinguishing species using LiDAR has been to differentiate individual tree shapes and texture (Holmgren and Persson 2004; Kim et al. 2011; Fassnacht et al. 2016), and add spectral information from multispectral imagery. More recently multispectral LiDAR has also been used to distinguish stand species (Budei et al. 2018; Budei and St-Onge 2018).

Most studies on species classification using aerial LiDAR have focused on species identification based on individual tree crowns. However, tree crown delineation requires a large number of LiDAR returns and highly accurate registration of ground observations (Ørka et al. 2009; Muss et al. 2011). Using a small observation area such as an individual tree crown causes a loss in the shape of the vertical distribution of LiDAR returns. This is due to a decrease in the number of LiDAR returns, causing the distribution to become a collection of random variates. Thus, although the accuracy of species identification would be improved by a very high point density in excess of 10 pt/m2 (Fassnacht et al. 2016), it is difficult to apply and validate these approaches for use over large areas and with less-dense LiDAR datasets.

An alternative to individual tree-crown extraction approaches is the application of area-based approaches [see e.g. White et al. (2013)]. This approach generally uses a pixel or a stand on which LiDAR returns are aggregated and predictors are derived before being used in a model to predict forest attributes. Predictors are often derived from the vertical distribution of LiDAR returns: the number of LiDAR returns per height slice. The vertical distribution of LiDAR returns is often presented as quantiles, projections of quantiles (such as principal component analysis), or parametric functions (such as a Fourier, beta or Weibull functions), which are used to predict stand attributes (Mehtätalo 2006; Coops et al. 2007; Racine et al. 2014; Maltamo et al. 2005; Palace et al. 2015; Magnussen et al. 1999; Riggins et al. 2009; Falkowski et al. 2009). The area-based method can be effective with a point density as low as 1 pt/m2, which reduces cost and require less processing compared to tree-crown approaches (White et al. 2013).

Lowering the minimum LiDAR point density threshold for species classification would increase the number of potential surveys where this method could be applied, and at the same time reduce the cost of acquisition. The species information could then be used for extensive forest management, or landscape-scale studies. One way to achieve area-based species mapping is to increase our understanding of the interaction between LiDAR and stand-level vegetation. However, the vertical distribution of LiDAR returns is difficult to analyze without resorting to dimension reduction, a method that generally limits the interpretability of results.

We hypothesize that using functional general linear models (GLM) and a novel non-parametric graphical test of significance (Mrkvička et al. 2019) would allow us to link the forest attributes to the vertical distribution of returns from low-density LiDAR surveys. The test provides a framework to compare a function (i.e. the vertical distribution of LiDAR returns) against categorical and continuous variables, as well as a visual understanding of the effects of the variables on the function. Versions of non-parametric graphical tests have been applied to economical data (Mrkvička et al. 2020), and non-parametric inference is commonly used in neuroimaging (Winkler et al., 2014). However, to our knowledge, our study is the first to use a functional GLM combined with a non-parametric graphical test of significance in the field of ecology or remote sensing.

The information contained in the vertical distribution of LiDAR returns is closely related to the distribution of the vegetation. Reduced crown cover (the proportion of area covered by vegetation; Gonsamo, 2013) typically increased the probability for the LiDAR returns to reach lower vegetation (Hilker et al. 2010), while age influences the vertical position of the crown within a stand (Coops et al. 2009; Racine et al. 2014). Using direct foliage measurements, Aber (1979) observed that the vertical concentration of the foliage evolved with stand age, and that the end point of forest succession seemed to reach an equilibrium where the foliage was relatively evenly distributed within the canopy. Martin-Ducup et al. (2016) noted that crown cover and stand maturity affected the shapes of the crown of sugar maples when measured using terrestrial LiDAR.

The shapes of different species influence the distribution of LiDAR returns, but the interpretations from most studies are limited and hard to generalize across ecosystems or LiDAR surveys (Fassnacht et al. 2016). Some studies explicitly compared the vertical distribution of LiDAR returns between species. For example, Ørka et al. (2009) found that the first and last returns were more dispersed in Birch (Betula spp.) stands than in Norway spruce [Picea abies (L.) Karst.]. The vertical distribution of first returns was also skewed toward the top of the canopy, and increased stand height was shown to influence the overall return distribution (Ørka et al. 2009).

In this study, we verify that we can differentiate the distinct patterns for individual species in the vertical distribution of LiDAR returns from a low-density area-based survey. We hypothesize that the use of a functional GLM combined with a non-parametric graphical test of significance makes it possible to identify these inter-species differences in the vertical distribution patterns after stand crown cover and age effects are accounted for.

Methods

Study area

The study was conducted in Matane Wildlife Reserve (Quebec, Canada, 48° 41′ N, 66°58′ W) and covering 1600 km2 (Fig. 1). The reserve is a mixed forest dominated by balsam fir [Abies balsamea (L.) Mill.], paper birch (Betula papyrifera Marsh), and black spruce [Picea mariana (Mill.) BSP]. Other species in the reserve include white spruce (Picea glauca Moench), aspen (Populus tremuloides Michx.), Norway spruce [Picea abies (L.) Karst.], jack pine (Pinus banksiana Lamb.), and other non-commercial deciduous species. Most of the reserve is subject to active commercial logging. Logging activities in that area have been documented since 1962; 27% of the area has been harvested (mostly total harvesting) and half of that has been replanted. Two percent of the stands originated from natural perturbations (e.g. windthrow, insect epidemic, fire) and the remaining 71% was undocumented.

Fig. 1
figure 1

Location of Matane Wildlife Reserve (left panel) and selected stands within the study area (right panel, dark patches)

Data

Airborne imagery and LiDAR were acquired during the summer of 2007. All data were collected by the Quebec Ministry of Natural Resources as part of their decennial forest mapping program. The LiDAR survey used a nominal point density of 3 points/m2 with an Optech ALTM 2050 sensor that recorded the first and last returns at 40 kHz. The survey was flown at 1200 m above ground, with a flight overlap of 30% and a maximal scan angle of 15° from nadir and a footprint diameter of 25 cm.

Airborne photography was conducted using a Leica ADS-40 pushbroom camera at a resolution of 0.2 m for panchromatic, and 0.5 m for near-infrared, green, and blue bands (NIR, G, B respectively centered on 860, 560, and 460 nm); the side overlap was 40% to ensure complete coverage of the area. This imagery was used to partition the territory into stands following provincial guidelines adapted from MRNQ (2007) by expert photo-interpreters. Using digital stereopsis (virtual 3D vision) from the forward and nadir-facing ADS-40 sensor images, the expert photo-interpreters identified tree species by integrating information on landscape positions, crown shapes, textures, and colors. Every photo was segmented into homogeneous stands using species, crown cover, height, and geomorphic criteria. Each stand was classified based on species composition, crown cover, age, height, and other ecological variables using a combination of photography, ground-control points, and historical data, which was used as a reference. Stand age was estimated using ground control plots where trees were cored. This information was then combined with available archives, height, and ecology to estimate age from aerial photography. Stand age was divided into six regular classes: 10 (0–20), 30 (21–40), 50 (41–60), 70 (61–80), 90 (81–100), and 120 (101, ∞), and irregular age classes (such as uneven-aged and multi-stratum stands). Crown cover was also estimated by visually comparing the proportion of open ground with the space occupied by the mature tree crowns. Crown covert was categorized in 9 classes: 10 (5–14), 20 (15–24), 30 (25–34), 40 (35–44), 50 (45–54), 60 (55–64), 70 (65–74), 80 (75–84), and 90 (85–100). Species were identified by their distinctive features in the composite images of near-infrared, green, and blue (NIR + G + B) mapped onto red, green, and blue (R, G, B) channels, and their frequent associations in forest stands (Table 1). Finally, the interpretation of the aerial images relied on the ground-control points and the ecological knowledge of the photo-interpreters. We used this photo-interpreted forest map as our reference data for species, crown cover, and age.

Table 1 Description of tree shapes, associated species, preferred conditions, and colors

Even-aged stands are less complex than irregular stands when analyzed from a vertical structure point of view. Therefore, we focused our analysis on even-aged stands where clearly dominant species represented at least 50% of the stand cover. Using a low threshold for species dominance increased the number of usable observations for the analysis. This improved the ability of the analysis to detect effects but also increased the noise from other co-dominant species.

LiDAR data processing

The steps required to prepare LiDAR data for processing are summarized in Fig. 2. We excluded stands with average LiDAR sampling rates below 2 pt/m2 or an area of less than 4 ha to ensure a sufficient number of LiDAR returns. From the remaining stands, we kept only those with a dominant species that was observed at least in 100 stands: aspen, balsam fir, black spruce, paper birch, and white spruce. From the 22,365 stands on the original forest map, we reduced our dataset to 5428 stands representing 35% of the study area (Fig. 1). Stand area distributions were similar for all selected species (median of 9 ha, minimum of 4 ha, maximum of 137 ha).

Fig. 2
figure 2

Step-by-step preparation of the LiDAR data

We registered the LiDAR returns by scaling their height between 0 and 1 to allow comparisons across stands of different heights. We subtracted the ground elevation from the absolute point cloud elevation (Muss et al. 2011). For each stand, LiDAR returns were binned into a vertical distribution histogram of 39 height slices from their highest LiDAR return (Coops et al. 2007; Harding et al. 2001). We then divided each slice count by the total number of points in the stand, so the vertical distribution of each stand summed to one, regardless of their inconsistent shapes, areas, and LiDAR point densities.

Statistical analysis

We used a novel method that builds on the functional data analysis field (Ramsay and Silverman 2005) to compare the complete vertical distribution of LiDAR returns. Most studies of species classification have focused on the accuracy of classifiers and the value of predictors for species identification (Fassnacht et al. 2016), often using dimension reduction to decrease and decorrelate the number of predictors. Some of these reduction methods include linear discriminant analysis and principal component analysis (Koenig and Höfle 2016; Räty et al. 2016; Axelsson et al. 2018). However, dimension reduction diminishes the ability to understand the effect of individual variables and interpret the results of the model.

To compare inter- and intraspecific variations in the distribution of LiDAR returns and the effect of three variables, we used a nonparametric graphical test of significance (Mrkvička et al. 2019; Myllymäki and Mrkvička 2020). We modeled the vertical distributions of LiDAR returns as a function of dominant species, crown cover, and age using the general linear model

$$d\left(h\right)={\beta }_{0}\left(h\right)+{X}_{\mathrm{s}\mathrm{p}}\cdot {\beta }_{\mathrm{s}\mathrm{p}}\left(h\right)+{X}_{\mathrm{c}\mathrm{c}}\cdot {\beta }_{\mathrm{c}\mathrm{c}}\left(h\right)+{X}_{\mathrm{a}\mathrm{g}\mathrm{e}}\cdot {\beta }_{\mathrm{a}\mathrm{g}\mathrm{e}}\left(h\right)+\varepsilon \left(h\right),$$
(1)

where, for every scaled height \(h\), \(d\left(h\right)\) is the \(N\times 1\) vector of observed LiDAR distributions at scaled height \(h\), and \({\beta }_{\mathrm{s}\mathrm{p}}\left(h\right)\), \({\beta }_{\mathrm{c}\mathrm{c}}\left(h\right)\), and \({\beta }_{\mathrm{a}\mathrm{g}\mathrm{e}}\left(h\right)\) are the parameter vectors related to species in \({X}_{\mathrm{s}\mathrm{p}}\), crown cover in \({X}_{\mathrm{c}\mathrm{c}}\), and age in \({X}_{\mathrm{a}\mathrm{g}\mathrm{e}}\); \(\varepsilon \left(h\right)\) is the vector of random errors with mean zero and finite variance \({\sigma }^{2}\left(h\right)\). The crown cover was considered a continuous variable, while species and age were considered categorical variables (given that the last age class was open). This model that incorporates all three variables is referred to as the full model.

We studied the effect of each variable after accounting for the other variables (also called the nuisance factors by Freedman and Lane (1983)) using the methodology illustrated in Fig. 3. We tested the effect of each three variables using the following null hypothesis: \({\beta }_{\mathrm{s}\mathrm{p},m}\left(h\right)=0\) for all \(m\) and \(h\), \({\beta }_{\mathrm{c}\mathrm{c}}\left(h\right)=0\) for all \(h\), and \({\beta }_{\mathrm{a}\mathrm{g}\mathrm{e},l}\left(h\right)=0\) for all \(l\) and \(h\), where \(m\) and \(l\) refer to the elements of the parameter vectors \({\beta }_{\mathrm{s}\mathrm{p}}\left(h\right)\) and \({\beta }_{\mathrm{a}\mathrm{g}\mathrm{e}}\left(h\right)\) for each species and age groups, respectively. The \(\beta \) coefficients of the discrete factors were constrained to sum to zero. The corresponding three null models were obtained by removing the studied variable from the full model (Eq. 1):

Fig. 3
figure 3

Description of the steps of the graphical test of significance of a variable (e.g. species) on the vertical distribution of LiDAR returns

$$d\left(h\right)={\beta }_{0}\left(h\right)+{X}_{\mathrm{c}\mathrm{c}}\cdot {\beta }_{\mathrm{c}\mathrm{c}}\left(h\right)+{X}_{\mathrm{a}\mathrm{g}\mathrm{e}}\cdot {\beta }_{\mathrm{a}\mathrm{g}\mathrm{e}}\left(h\right)+\varepsilon \left(h\right),$$
(2)
$$d\left(h\right)={\beta }_{0}\left(h\right)+{X}_{\mathrm{s}\mathrm{p}}\cdot {\beta }_{\mathrm{s}\mathrm{p}}\left(h\right)+{X}_{\mathrm{a}\mathrm{g}\mathrm{e}}\cdot {\beta }_{\mathrm{a}\mathrm{g}\mathrm{e}}\left(h\right)+\varepsilon \left(h\right),$$
(3)
$$d\left(h\right)={\beta }_{0}\left(h\right)+{X}_{\mathrm{s}\mathrm{p}}\cdot {\beta }_{\mathrm{s}\mathrm{p}}\left(h\right)+{X}_{\mathrm{c}\mathrm{c}}\cdot {\beta }_{\mathrm{c}\mathrm{c}}\left(h\right)+\varepsilon \left(h\right).$$
(4)

We used the coefficients of the variable of interest (the left-out \(\beta \) in Eqs. 24) for the statistical test as suggested by Mrkvička et al. (2019). For the continuous crown cover variable, the test statistic we used was the vector \({\beta }_{\mathrm{c}\mathrm{c}}\left(h\right)\) for all \(h\); and for age, the test was based on the values of the effect \({\beta }_{\mathrm{a}\mathrm{g}\mathrm{e},l}\left(h\right)\) of all the age groups \(l\) for all \(h\). To examine species differences, the test was based on all differences \({\beta }_{\mathrm{s}\mathrm{p},m}\left(h\right)-{\beta }_{\mathrm{s}\mathrm{p},n}\left(h\right)\) for species \(m\) and \(n\) with \(1\le m<n\le 5\).

The test we used relies on two procedures: (1) the Freedman-Lane algorithm (described in details by Winkler et al. 2014, p. 385) to permute the residuals of the null model and create the reference distribution of the coefficients under the null hypothesis, and (2) the global extreme rank length envelope test to build a null global envelope for the above-mentioned test statistics and correct for the multiple tests conducted along \(h\) (Myllymäki et al. 2017). The Freedman-Lane algorithm includes all the steps ranging from the simulation under the null hypothesis to the final estimation of the chosen test statistic (Fig. 3). We used 2999 random permutations to build the distributions under the null hypothesis. The global extreme rank length envelope test (Myllymäki et al., 2017; Mrkvička et al., 2020) was then constructed from the test statistics calculated from the empirical vertical distribution of LiDAR returns and the 2999 permuted data sets.

The null hypothesis was rejected if the empirical test vector left the 98% global envelope at any point. To account for the three variables tested (species, crown cover, and age), we used a Bonferroni adjusted significance level \(\alpha =0.05/3\) in addition to the inherent correction applied within the global extreme rank length envelope test to account for multiple \(h\). We observed that the variances of the model residuals were heterogeneous for all the variables (Winkler et al. 2014). Following Mrkvička et al. (2020), we transformed the matrix of vertical LiDAR returns distribution (Fig. 3) by scaling the functions \({d}_{i}\) of each group \(j\) according to their variance dispersion. The initial \({d}_{i,j}\) function was transformed into a \({S}_{i,j}\) (scaled) function by

$${S}_{i,j}\left(h\right)=\frac{{d}_{i,j}\left(h\right)-\bar{{d}_{j}}\left(h\right)}{\sqrt{\mathrm{V}\mathrm{a}\mathrm{r}\left({d}_{j}\left(h\right)\right)}}\cdot \sqrt{\mathrm{V}\mathrm{a}\mathrm{r}\left(d\left(h\right)\right)}+\bar{{d}_{j}}\left(h\right)$$
(5)

where the group sample variance \(\mathrm{V}\mathrm{a}\mathrm{r}\left({d}_{j}\left(h\right)\right)\) is used to correct for unequal variance among groups. The group sample mean \(\bar{{d}_{j}}\left(h\right)\) and overall variance \(\mathrm{V}\mathrm{a}\mathrm{r}\left(d\left(h\right)\right)\) are used to preserve the original scale of the mean and variability of the functions.

Our experiments with simulated data showed that transforming all groups at once (by combining all group levels) removed the heterogeneity of the variance. However, this required there to be sufficient observations in all categories, which was not the case for our data. We, therefore, relied on the successive application of the transformation (Eq. 5) for each of the three variables, and we found that the order in which the transformation is applied can reintroduce heterogeneity of variance. We settled on the successive transformation of crown cover, age, and species which provided the best results and reduced heterogeneity. We verified the importance of the correlation by running Breusch–Pagan test of heteroscedasticity (Breusch and Pagan 1979) using the squared residuals \( \left( {d\left( h \right) - \hat{d}\left( h \right)} \right)^{2}\) on the left-hand side of the reduced Eqs. 24. While the test indicated significant heteroscedasticity in some areas of the curves, the coefficient of determination was less than 4% for all variables and all \(h\), which confirmed that no further adjustments were required.

We performed the analysis using R version 3.6.3 (R Core Team 2020), the GET package (Mrkvička et al. 2019; Myllymäki and Mrkvička 2020), and Lastools (Isenburg 2012) to correct and extract the LiDAR data.

Results

The comparisons from the nonparametric graphical test of significance confirm that the differences between all species, after accounting for age and crown cover, were significant (\(p\) < 0.001) (Fig. 4). We observed two groups of species where differences were significant but small: balsam fir–paper birch, and black spruce–white spruce. Differences between balsam fir and paper birch were small and localized at 65–70% and below 12% of stand height. White spruce and black spruce also had small localized differences at 22–30% and 5–8% of stand height. However, some differences between groups of species were larger (indicated by the bold lines in Fig. 4): black and white spruces had the lowest distribution of LiDAR returns when compared to other species. Black spruce had fewer returns than balsam fir between 38 and 62% of stand height (40–57% for paper birch), while it had more returns below 28% of stand height (30% for paper birch). The vertical distribution of LiDAR returns for aspens had a distinctive shape when compared to every other species: it did not display a prominent peak, which made the distribution more uniform than for other species (Figs. 5, 6). Aspen was significantly different from other species (\(p\)<0.001) except in the areas between 57 and 60% of height, and 8–18% for balsam fir and paper birch specifically. Overall, the areas with the largest differences between species were located at around 5, 25, 50, and 70% of stand height.

Fig. 4
figure 4

Nonparametric graphical tests of significance comparing species using contrasts: the observed difference between the coefficients of two species (black curve), where the species in the rows are subtracted from the species in the columns (e.g. first column, second row is the aspen − balsam fir contrast). The 98% global envelope (grey bands), shows the area of acceptance of the null hypothesis (no effect, \(p\)<0.001) obtained from the permutations of the residuals of the null model (Eq. 2). The observed curve that is outside the envelope is in bold. Panels above the diagonal are the reflection of the observed functions and the global envelope from the lower part (the tests were performed only once)

Fig. 5
figure 5

Vertical distribution of LiDAR returns as a function of crown cover (columns) and species (rows). Black lines represent the median distribution; shaded areas represent 95% and 50% local variation envelopes

Fig. 6
figure 6

Vertical distribution of LiDAR returns as a function of age (columns) and species (rows). Black lines represent the median distribution; shaded areas represent 95% and 50% local variation envelopes

An increase in crown cover was associated with a decrease in the density of LiDAR returns below 33% of stand height, while the density of LiDAR returns above 38% increased (Fig. 7). The largest increase was concentrated around the middle height (50%), while the largest reduction effect occurred closer to the ground (around 3% height). Figure 5 displays the vertical distribution of LiDAR returns for each species per crown cover (including the effect of age). The increased crown cover concentrated the returns at around 50% of the stand height, except in the case of aspen, where the peak was higher in the stand (around 80% of stand height), and white spruce, where the peak was lower (around 20%). Although the black spruce peak was centered at 50% of the stand height for high values of crown cover, the observed variability was skewed toward the lower stand heights. Some species were more widely represented in younger or older age classes, which inflated the variation envelopes. The model from Fig. 7 accounted for this effect.

Fig. 7
figure 7

Nonparametric graphical tests of significance for crown cover. The observed coefficient (black curve) and the 98% global envelope (grey band) that shows the area of acceptance of the null hypothesis (no effect, \(p\)<0.001) obtained from permutations of the residuals of the null model (Eq. 3). The observed curve that is outside the envelope is in bold

Increased age was associated with a gradual upwards shift in LiDAR return distribution (Fig. 8). This shift occurred until 70 years of age, when a plateau was reached. The effect of age was significant at most stand height between 10 and 70 years of age.

Fig. 8
figure 8

Nonparametric graphical tests of significance for age. The observed coefficients of the six age groups (black curves), and the 98% global envelope (grey bands below the diagonal) that shows the area of acceptance of the null hypothesis (no effect, \(p\)<0.001) obtained from the permutations of the residuals of the null model (Eq. 4). The observed curve outside the envelope is in bold

Stands that were in the 10- and 30-year age groups had an abundance of returns below 28% and 41% of stand height, respectively, and fewer returns above 33% and 49%, respectively. However, the effect is reversed in stands in the 50- and 70-year groups, where age inflated the distribution between 38 and 97% of stand height (44–100% for 70 years), and deflated it below 31% of stand height (38% for 70 year). The significant age effects for stands in the 90- and 120-year age groups were smaller, and occurred below 69% and 44% of the stand height, respectively (\(p\)<0.001). These stands also had an abundance of returns in the middle and upper part of the stand (38–68% for 90 year; 33–44% for 120 year). For stands that were in the 90- and 120-year groups, there were reduced numbers of returns below 28% and 23% of stand height, respectively.

Overall, the rate of change in the vertical distribution of LiDAR returns gradually decreased as the age increased. For example, changes in the distribution between 10 and 30 years were larger than the changes between 70 and 90 years. Areas of higher variability for age groups were located around 5%, 20%, and 50% of stand height. Figure 6 displays the median vertical distribution of LiDAR returns for each species, by age class (including the crown cover effect). The young white spruce and black spruce stands displayed a high degree of asymmetry toward the lower part of the stand that disappeared in older stands.

The coefficient of determination for the models varied across stand heights (Fig. 9). The most correlated areas for every model were around 5–15% and 46–54% of stand height. All models displayed a sharp decline between 28 and 40%. The highest R2 was for the full model at 0.47 (at 10–13% of stand height) and 0.42 (at 49–51% of stand height). The model that excluded the species variable (Eq. 2) produced a difference in R2 of more than 0.10 for 72–92% stand height compared to the full model. When the crown cover variable was excluded from the model (Eq. 3) for between 0–21% and 46–67% of stand height, there was a difference in R2 of more than 0.10 with the full model. The model that excluded the age variable (Eq. 4) produced a difference in R2 of more than 0.10 with the full model at 18–38% and 62–64% of stand height.

Fig. 9
figure 9

Comparison of \( R^{2}\left( h \right)\) of the full model (Eq. 1), and the three reduced models (Eqs. 24) where one variable (either Species, Crown Cover, or Age) was excluded

Discussion

Our objective was to study the influence of species, crown cover, and age on the vertical distribution of LiDAR returns. We found that even-aged stands exhibit species-specific patterns that predictably evolve with crown cover and age. We observed two groups of species: the first, balsam fir and paper birch had more symmetrical vertical distributions of LiDAR returns that were centered between 40 and 60% of stand height, and the second, white spruce and black spruce, had distributions of LiDAR returns that were generally skewed lower in the stand (below 30% of the stand height). Aspen displayed a more even distribution of LiDAR returns with a higher proportion of returns higher in the stand compared to other species.

While individual trees are plastic and can adapt to various light and environmental conditions, we found that stands of the same species share similar vertical characteristics. These characteristics distinguish them from other species. This observation is consistent with previous field observations that were conducted over a smaller area using different measurement methods (Purves et al. 2007). However, we expected clearer vertical distribution patterns along shade-tolerance gradients or between conifer and deciduous trees; we found that patterns of paper birch were more similar to those of balsam fir (a shade-tolerant) than aspen, another shade-intolerant deciduous. Since we used the dominant species to classify each stand, the similar vertical distribution patterns could be the result of the frequent occurrence of balsam fir and paper birch within the same stand. The effect of species associations could be controlled by using a higher dominance threshold than the 50% we used. However, we found that using a 60% dominance made little difference to the conclusions, whereas a higher threshold removed too much data for some species.

The results in this paper demonstrate that vertical LiDAR return distribution can be useful for species classification. Based on these findings, models could be trained to link these vertical distributions to species and then used to estimate species occurrence over the landscape. Many studies performing species classification with LiDAR use the vertical distribution of LiDAR returns (Fassnacht et al. 2016), however, the approach developed here allows the drivers of cover and age to also be integrated into the estimation, thereby making the species prediction more robust when applied more broadly.

Additional evidence is required to establish whether the relationships between vertical density of LiDAR returns and the species, crown cover, and age are specific to our study area. Including other variables, such as LiDAR scan angle and abiotic factors, might improve the model, making it more applicable to other study areas or complex stand structures.

The increased crown cover led to an increased concentration of the vertical distribution of LiDAR returns to higher in the stand. In turn, stands with decreased crown cover exhibited a vertical distribution of LiDAR returns that was more dispersed and lower to the ground, with fewer returns higher in the stand. The gradual displacement of LiDAR returns from below 33% to above 38% associated with crown cover was also apparent when we compared the variance explained by the models in Fig. 9 where crown cover appeared to explain at least 25% of the variance of the full model below 21% and between 46 and 67% of stand height.

Increased age was associated with the vertical distribution of LiDAR returns being displace higher in the stand for up to 70 years, followed by a plateau or decline in the 90- and 120-year-old groups. The rate of change between age groups decreased as the age increased. Vertical distribution of LiDAR returns from younger stands appeared to undergo a rapid transformation in the 10-year group, which gradually decelerated and stabilized in the 70-year-old group. The 90- and 120-year-old groups displayed the least amount of change. While it is well established that absolute stand height varies predictably with age, the changes we observed here are relative to the maximal height of the stand.

The effect of age was akin to that of crown cover: as the age or crown cover increased, there was an increased concentration of points in the upper part of the stand. However, age explained the variation slightly higher in the stand than the crown cover (between 18–38% and 62–64% of stand height). Unlike Aber (1979) who observed a stable distribution of foliage at the end point of forest succession, the vertical distribution of LiDAR returns of the 120-year group displayed a small concentration of returns at 33–44% of stand height and a small decrease below 23% of stand height.

The most pronounced differences in vertical distributions of LiDAR returns occurred below 80% of stand height. The upper sections of the stand often yielded very small differences in distribution while the lower section yielded large variations. The distribution of LiDAR returns in the upper and lower sections of the stand can heavily dependent on the LiDAR survey parameters and species crown shape (Roussel et al. 2017, 2018). This might make these sections more variable across different surveys. Nonetheless, the full model (Eq. 1) still explained 20% of the variability from the ground up to 90% of the stand height. The difference in variability explained by the species model compared to the full model was greatest at heights above 72%, and peaked at around 80% of the stand height (Fig. 9). One possible explanation for this is that the section encompassing 74–90% of stand height exhibited considerable differences between aspen and all other species.

A common challenge associated with functional data analyses is making functions comparable, a process called registration (Ramsay et al. 2009). We registered the vertical distributions of LiDAR returns by using the highest measured LiDAR return of each stand. However, this registration makes the distributions potentially more affected by extremely high LiDAR returns and could explain, in part, why the upper section of the stands had a lower R2. For example, in young stands where there might only be a single remaining mature tree, the vertical distribution of LiDAR returns could be compressed. Using a height quantile as a registration point, such as 95% height rather than the highest return, could reduce the chances of compression and improve the vertical distributions. In addition to compression, the vertical distribution of LiDAR returns can be distorted by strong slopes (Liu et al. 2017). Furthermore, the occlusion of vegetation in lower stand parts underestimates the density of vegetation. Some of the noise in the vertical distribution can be mitigated by using correction methods such as voxels, or partly accounting for laser incidence angle, footprint size, and pulse density (Wilkes et al. 2016; Roussel et al. 2017, 2018).

There was a decline in R2 between 31 and 46% of the stand height, where age explained most of the variance of the full model (Fig. 9). While the reasons for the decline are unknown, this section of stand height is associated with an increase in LiDAR returns in the 120-year-old group. This vertical section of the stands also seems to be a transition for the spruce group and the balsam fir–paper birch group between lower distributions and distributions centered around 50% of the stand height. The decrease in R2 might be attributed to the intersection of these two distributions.

The nonparametric graphical test of significance that we used is slightly liberal because is based on the Freedman-Lane algorithm which uses an approximation from permutations (Mrkvička et al. 2019). However, this permutation method is regarded in the literature as one of the best methods when there are confounding variables (Anderson and Robinson 2001; Winkler et al. 2014). The graphical output from our analyses is advantageous, especially for identifying the sections of rejection. This graphical interpretation allowed us to determine the relative heights at which the differences in species, crown cover, and age occurred. When interpretability is desired, we think that this method could complement, and in some cases replace other methods of analysis for the vertical distribution of LiDAR returns, such as the principal component analysis or linear discriminant analysis (Fedrigo et al. 2019).

In a review of tree species classification using remote sensing, Fassnacht et al. (2016) noted that most studies pursued the optimization of classification accuracy and provided little information on the causal understanding of species discrimination. Using nonparametric graphical tests of significance can help us understand the evolution of forest structure by highlighting the association of variables with the distribution of reflective material. In our study, we applied this method to a selection of simple stands (that are even-aged with clear species dominance) to visually understand the effect of each of these factors. Further work is necessary in order to apply this method to multi-species or irregular stands.

The median stand area in our study area was 9 ha, which is large compared to other area-based approaches. This allowed for the aggregation of LiDAR returns to identify specific patterns. Stands are by design the most homogeneous unit of the forest landscape, and vertical LiDAR distributions are more reliable for identifying species when there is a sufficient number of aggregated returns grouped together. To determine the optimal area for observing these patterns will require additional research.

Conclusion

LiDAR vertical distributions can provide an understanding of the interplay between tree species, crown cover, and age. The use of functional generalized linear models combined with graphical tests of significance enabled us to interpret the differences in distributions caused by multiple variables. Using airborne LiDAR surveys makes it is possible to identify ecosystems that are at multiple evolutionary stages. Vertical LiDAR distribution patterns variations can be observed and eventually linked to structural and functional dynamics. Our results show that individual species feature distinctive vertical distributions of LiDAR returns that concentrate with crown cover and rise with age. Balsam fir and paper birch had similar vertical distributions of LiDAR returns, as did white spruce and black spruce. Aspen was the most unique species, yielding a more uniform distribution of LiDAR returns and a peak in the upper part of the stand. The balsam fir and paper birch group exhibited a peak centered at around 50% of stand height, while the distributions from the white spruce and black spruces groups were skewed to below 30% of the stand height. Increases in crown cover concentrated the distributions of all species at around 50% of the stand height and deflated the distribution below 33%. The effect of age was more diffuse across the whole stand height. Age increase was associated with a gradual displacement of the vertical distribution higher in the stands up to 70 years. The distribution of the 90– and 120-year-old groups then plateaued and slowly declined. These results could improve our understanding of the evolution of the forest structure in changing conditions and could be used for LiDAR stand-level species classification.

Author contributions

ER is the primary author of the manuscript. ER, NCC and JB contributed to the design of the study; the analysis was conducted by ER and MM; and the redaction was conducted by ER and revised by other authors. All authors read and approved this version of the manuscript.