Introduction

Human activities and globalisation in the last 100 years have led to unprecedented exchanges of flora and fauna between continents, countries, and regions, resulting in near homogenisation of many natural ecosystems (Mack et al. 2000; Pyšek et al. 2011; Seebens et al. 2018). The exotic species, once in their new environment, go through sequential periods of adjustments involving colonisation/extinction/recolonization, establishment, and naturalisation (Pyšek and Hulme 2005; Aikio et al. 2010). If biotic, environmental, and landscape conditions continue to be favourable, a proportion (~ 10%—see Kowarik 1995) of the exotic species will increase dramatically in abundance and geographical distribution to become invaders with significant impact on nature conservation and agriculture (Mack et al. 2000; Cook and Dias 2006; Clements and Ditommaso 2011; van Kleunen et al. 2018).

While the effects and proliferation of some exotics are immediately obvious (e.g. Osunkoya and Perrett 2011; Perrett et al. 2012), others may take a considerable time to manifest, resulting in lag times (real or perceived) between introduction and the species becoming invasive. Whether most invasions endure lag phases and why they occur remains controversial (Mack et al. 2000; Aikio et al. 2010; Larkin 2012; Antunes and Schamp 2017; Coutts et al. 2017; van Kleunen et al. 2018). A lag phase may result from several factors and forces acting singly or in combination. These factors include: (i) the frequency and spatial arrangement of infestations of the immigrants (e.g., widely separated, small infestation foci may be less effective for rapid population growth than numerous, nearby infestation foci), (ii) the time required for natural selection to produce new genotypes within the immigrant populations that are adapted to the novel environment, (iii) adjustments to the vagaries of environmental conditions, and (iv) a human construct arising from our limited ability to detect, and make appropriate inferences about, the population growth of the invaders (Mack et al. 2000; Sakai et al. 2001; Coutts et al. 2017; van Kleunen et al. 2018).

Lags in plant invaders can last for decades, with a mean time of ~ 50 years (Kowarik 1995; Mack et al. 2000; van Kleunen et al. 2018), making it difficult to predict if an exotic plant species will remain low in abundance (i.e., naturalised) and non-threatening or is a “sleeper weed” (sensu Cunningham et al. 2004; Crook 2005; Coutts et al. 2017) with dire consequences in the future. Sleeper weeds are a subset of invasive plants that have been introduced into a new area and remain low in abundance and local distribution for a period before rapidly increasing in population size. Because the best opportunity to control an invader comes when its population size is small, the cryptic nature of lag phases and/or sleeper weeds is an unfortunate paradox (Larkin 2012) that must be taken into consideration during risk assessments and prioritization of exotic species. Some invasion ecologists regard lag phases as artefacts that arise from erroneously distinguishing two stages of biological invasion when a single process would suffice to describe the dynamics (Mack et al. 2000; Williamson et al. 2005). Others suggest that lags may result from changes in sampling effort with time, as is often the case for the herbarium records from which some of the invasion trends are retrospectively inferred (Aikio et al. 2010; Antunes and Schamp 2017).

Statistical methods to discriminate between single and two-stage population dynamics of invaders have been developed. One method uses a piecewise model fitting approach (with at least two separate growth functions) that objectively quantifies lag phase, rate of increase, and asymptote value of species records after the lag phase. The increase phase period is modelled either as a logistic function for accelerating and sigmoidal relationship or as a von Bertalanffy growth function which fits linear and decelerating relationships (Pyšek and Prach 1993; Williamson et al. 2005; Aikio et al. 2010). The lag phase is then determined by statistically varying lag time in sequential steps from 0 (no lag) to the maximum time (years) of records of the invader species, and finding an estimated (true) lag time that minimises the total least square error (the sum of the least square errors for the linear and non-linear parts of the model). Hyndman et al. (2015) criticized this method and presented an alternative statistical approach that estimates the lag phase based on annual rather than cumulative data using a generalized piecewise linear splines model that incorporates a log link function for overall collection effort. In recent time, an algorithm for determining these indices simultaneously (called segmented regression) has been developed and has gained currency in the scientific literature, especially in the medical field (e.g., Muggeo 2008; Kazemnejad et al. 2014). The segmented regression approach has a higher statistical power than the previous methods as it allows different slopes and inflection (turning) point/s for specific values of a continuous predictor (e.g. time) to be generated simultaneously (see Muggeo 2008).

Other factors affecting invaders’ success (occupancy, abundance, and impact) in a novel range are their intrinsic (inherent) traits (e.g., life form, life cycle etc.) and historical factors (e.g., time since introduction, pathway of introduction, nature of habitat invaded etc.) (Mack et al. 2000; Sutherland 2004; Castro et al. 2005; Osunkoya et al. 2019a; 2020). However, there is little agreement amongst researchers on the generality and predictive power of these traits (Goodwin et al. 1999; Castro et al. 2005). In particular, time since introduction (henceforth, residence time) is controversial and there are few empirical studies on its role in influencing an invader’s range. Residence time is important for invaders as it underpins greater propagule pressure and drives seasonal or human-related spread events. These “time required” spread events allow the operation of micro-evolutionary processes, offering chances for the exotic invaders to: (i) escape from demographic bottlenecks via genetic mixing, thus enabling the invaders to cope with varying landscape conditions in their novel ranges, (ii) bond with mutualists, and (iii) explore potential routes for propagule dispersal between focal patches (Mack et al. 2000; Brandle and Brandl 2012; Mao et al. 2019). Analyses of several pools of invader species have shown that the more time invaders spend in their introduced ranges, the more likely they are to become widespread (Castro et al. 2005; Pyšek and Hulme 2005; Pyšek and Jarošík 2005; Osunkoya et al. 2019a; 2020).

Herbarium records provide some of the most comprehensive information on plant distribution available and are critical repository sources for the construction of invasion curves (i.e., changes in abundance and distribution with time), and in estimating speeds (rates) and patterns of range expansion (Aikio et al. 2010; Antunes and Schamp 2017). Herbaria offer exceptionally large datasets over broad geographic areas, often dating back centuries. However, collection biases over time due to inconsistent collection intensities often limit the utility of such herbarium records (Aikio et al. 2010; Lavoie et al. 2013; Mosena et al. 2018). Hence it has been suggested that for construction of robust invasion curves, herbarium records of invader species should be standardised by the collection records of native species from the same locality. Delisle et al. (2003) developed the “proportion curve” to address inconsistent collection intensities (bias) by comparing the recorded distribution of introduced and native species in the same locality over time with the expectation that records of both groups will be equally impacted by variable collection intensity and hence should address the bias. However, the correction procedure highlighted above may not always suffice to reduce collection biases (Fuentes et al. 2013; Daru et al. 2018; Lang et al. 2019; Aiello-Lammens 2020).

For most invaded ecosystems, there are only anecdotal invasion curves and lag time estimates available for established and emerging weeds (Larkin 2012; Antunes and Schamp 2017; Sindel 2009; Victoria Government 2010; Fleming et al. 2018). These time-line invasion indices are also often lacking at the regional and continental scales (e.g., Australia) despite the usefulness of such information for policy planning and management. To fill this knowledge gap and to compare invasiveness indices of weeds in the State of Queensland, Australia with those elsewhere around the globe, we herein use herbarium records to develop invasion and (standardised) proportion curves of changes in distribution with time for ~ 100 established and emerging exotic weeds of the State. The aims of this paper are to:

  1. Construct invasion curves for established and emerging weeds of the State of Queensland, Australia based on herbarium records;

  2. Explore cross-species variation in lag time, spread rate (expansion), and range occupancy. Lag time especially has rarely been determined for many weed species as the trait is thought to be unpredictable and hence cannot be anticipated or screened for (Coutts et al. 2017); and

  3. Determine the independent and/or interaction effects, as well as the relative importance of species-specific traits (e.g. plant life form, biogeographic origin) and historical factors (e.g. residence time, invasion pathway) on observed invasion patterns.

Methods

Study area

The study location (the State of Queensland) lies in the north-eastern part of Australia. The average minimum annual temperature ranges from −10.6 to 5.4 °C, and the average maximum annual temperature ranges from 36.0 to 49.7 °C; mean precipitation ranges from 600 to 780 mm per year (Australian Bureau of Meteorology, http://www.bom.gov.au/). Spanning an area of almost two million km², Queensland encompasses significant climatic and environmental gradients. Consequently, Queensland’s invasive flora, just like its native flora (Neldner 2014), varies considerably between its regions (Osunkoya et al. 2019a).

Data compilation from the herbarium

We initially selected 108 established and emerging weeds of Queensland, Australia. The majority of these species are identified as priority species (either prohibited or restricted matter) under the Queensland Biosecurity Act 2014 (https://www.legislation.qld.gov.au/view/pdf/inforce/current/act-2014-007, accessed Jan. 13 2021). As a result, these species are targeted for active management at the State, regional and local government area levels (Osunkoya et al. 2019a, b, 2020). Data on the distribution (i.e., presence) of these species from 1850–2010 were extracted from the Australasian Virtual Herbarium (AVH). Data conversion and cleaning, including removal of duplicate recordings of a species specimen at a given spatial point/grid, were conducted in ArcMap (Ver. 10.7.1). To allow for consistent comparisons across species, it was necessary to convert the point-based herbarium records into a grid format. To achieve this objective, the herbarium records were overlaid with a 0.5 × 0.5 degree grid system, in which each cell is roughly 50 km × 50 km and which totals 664 grid cells across Queensland. For each species, we recorded the first mention of a herbarium specimen in each grid cell; thus, grid cells were assigned to a species based on the earliest year that the species was recorded within that cell. The number of grid cells occupied by each species per decade was then calculated using the ArcMap summary statistics tool and exported to Microsoft Excel and SPSS (Ver. 25) for further analysis. Across Queensland, for each focal weed species and at 10-year intervals, the following details were collated: (i) the year of the first mention (presence) of a herbarium specimen in each grid cell, (ii) the number of records per grid cell, and (iii) the total number of occupied grid cells.
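The gridding step described above can be sketched in a few lines. The following is a minimal Python illustration, not the ArcMap workflow used in the study: the function names and the example coordinates are hypothetical, and the cell index is assumed to be obtained simply by flooring latitude and longitude onto a 0.5 degree lattice.

```python
import math
from collections import defaultdict

CELL = 0.5  # grid resolution in degrees, as stated in the text


def grid_cell(lat, lon, cell=CELL):
    """Map a point to the (row, col) index of the grid cell containing it."""
    return (math.floor(lat / cell), math.floor(lon / cell))


def first_record_year_per_cell(records):
    """records: iterable of (year, lat, lon) tuples for ONE species.

    Returns {cell: earliest year the species was recorded in that cell},
    so duplicate records within a cell collapse to the first mention.
    """
    first = {}
    for year, lat, lon in records:
        cell = grid_cell(lat, lon)
        if cell not in first or year < first[cell]:
            first[cell] = year
    return first


def cumulative_cells_by_decade(first_year, start=1850, end=2010):
    """Cumulative number of occupied cells per decade (decade 1900 = 1900-1909).

    Cells first recorded before `start` are ignored in this sketch.
    """
    new_per_decade = defaultdict(int)
    for year in first_year.values():
        new_per_decade[(year // 10) * 10] += 1
    curve, total = {}, 0
    for decade in range(start, end + 1, 10):
        total += new_per_decade[decade]
        curve[decade] = total
    return curve


# Hypothetical records (year, latitude, longitude); the first two share a cell.
example_records = [
    (1901, -27.47, 153.02),
    (1905, -27.30, 153.10),
    (1934, -19.26, 146.82),
    (1988, -16.92, 145.77),
]
first_years = first_record_year_per_cell(example_records)
curve = cumulative_cells_by_decade(first_years)
```

Here `curve` maps each decade to the cumulative number of occupied cells, which is the quantity plotted on the y-axis of an invasion curve.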

Species-specific traits, including accepted taxonomic nomenclature, growth form, life cycle, habitat invaded and introduction pathway, were compiled through reviews of online sources and of the grey and scientific literature (see https://www.ipni.org/index.html, accessed Jan. 13 2021; Osunkoya et al. 2019a), and through consultation with botanists. Some invasive plant taxa within a given genus are complex (species delineation is uncertain, and/or species are known to hybridise easily, e.g., many invasive Sporobolus grass species), and as such were treated as a group. We present data for both individual species and, wherever possible, species-complex groups.

Derivation of simple and standardised (proportion) invasion curve

From the data aggregated, four graphs were plotted for each species: the number of records against each decadal time interval from 1850–2010, the cumulative number of records against decadal time, the time-specific and the cumulative proportion of records of invasive to native species against decadal time (i.e., proportion curves—see below for justification and further explanation).

We used the cumulative number of records (y-axis) vs. decadal time (x-axis) to estimate various indices relating to invasiveness: (a) speed (log10) of spread for overall dataset (slopelog-normal), (b) speed of spread during the lag phase (slopelag), if any, (c) speed of spread at the exponential (slopeexpo) phase, (d) lag time period, (e) inflection point (year when spread is accelerated), (f) decelerating (slowing down and/or reaching an asymptote) rate after the exponential phase, if any, and (g) time period at which the decelerating phase is attained, if any. The difference between the earliest record and first inflection point was estimated as the lag time (in years). In the past, these indices of invasiveness have been estimated by simple eye-balling of the resultant graphs (e.g. Williamson and Brown 1986), or by breaking the dataset into subsets, each reflecting the different stages of a sigmoidal curve, and estimating the indices independently (Pyšek and Prach 1993; Williamson et al. 2005; Aikio et al. 2010; Larkin 2012). More recently, an algorithm for determining these indices simultaneously (the segmented regression approach) has been developed and has gained currency in the scientific literature; such analyses are often referred to as “break-point, change-point, transition-point, threshold or switch-point” analyses (Muggeo 2008; Kazemnejad et al. 2014). At each step in the procedure, every breakpoint estimate is updated through the relevant “gap” and “difference in slope” coefficients using a permutation procedure based on a likelihood ratio test. Model fit between linear and non-linear trends is evaluated by comparing the sum of squares of residuals and/or changes in coefficient of determination (R²) values (Muggeo 2008).
Given apparent nonlinearity in many invasion curve relationships, we used segmented linear regression to quantify any abrupt change in the response variable (grid cell occupancy), identifying specific breakpoints and/or thresholds beyond which the slope of the relationship significantly changes (for example, the first inflection year gives the break point and hence the time interval between arrival and population explosion [lag time]). The analyses were performed using the segmented library in R, a package that has been designed to fit regression models with broken-line relationships (Muggeo 2008; R Core Team 2019).
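As an illustration of the breakpoint idea, the sketch below finds a single break year by brute force: it tries every admissible split of the cumulative curve, fits an ordinary least-squares line to each side, and keeps the split with the smallest total squared error. This is a simplified stand-in written in Python, not the iterative algorithm of Muggeo (2008) implemented in the R package, and the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def best_single_break(years, cum_cells, min_pts=3):
    """Exhaustive search for one breakpoint.

    Each candidate split gives a 'lag' segment and an 'expansion' segment,
    each fitted with its own line; the split minimising the summed squared
    error wins. The break year is taken as the last year of the lag segment.
    """
    best = None
    for k in range(min_pts, len(years) - min_pts + 1):
        _, slope_lag, sse_lag = fit_line(years[:k], cum_cells[:k])
        _, slope_expo, sse_expo = fit_line(years[k:], cum_cells[k:])
        total = sse_lag + sse_expo
        if best is None or total < best["sse"]:
            best = {
                "break_year": years[k - 1],
                "slope_lag": slope_lag,
                "slope_expo": slope_expo,
                "sse": total,
            }
    return best


# Hypothetical cumulative grid-cell counts per decade for one species.
years = list(range(1900, 2001, 10))
cum_cells = [1, 2, 3, 4, 5, 6, 26, 40, 54, 68, 82]

result = best_single_break(years, cum_cells)
lag_time = result["break_year"] - years[0]  # earliest record to first break
```

Unlike a true broken-line model, the two fitted lines here are not forced to meet at the breakpoint, and no significance test is applied to the change in slope.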

The rate of collection in the wild for herbarium records is not random and changes over time (Aikio et al. 2010; Hyndman et al. 2015). We corrected for these underlying sampling artefacts by recording, as a baseline, the accumulation rate of native records in herbaria from where the invasive records were derived (in our case, Queensland). This was then followed by scaling the number of invasive records by the number of native records at each decadal time (Delisle et al. 2003; Aikio et al. 2010; Antunes and Schamp 2017; Mosena et al. 2018; Pili et al. 2019). The resulting detrended (standardised) data when plotted against time is called “the proportion curve” (Delisle et al. 2003). In addition, in the derivation of our proportion curve, we standardised each invasive species’ time-specific distribution records by number of time-specific records of the most common native species of similar plant life form (Antunes and Schamp 2017) [see Supplementary Data S1 for list of native species used]. In this respect, a total of 67 common native species were used, comprising grasses (11 species), herbs (24 species), shrubs (14 species), trees (6 species) and vines (12 species). Delisle et al. (2003), the originator of the methodology, stated: “If the proportion (of exotic vs. native species) is increasing for a particular time period, this strongly suggests that the area occupied by the exotic species is really expanding, because it is expanding faster than if it was strictly the result of better spatial coverage of the sampling for herbarium specimens”. Where the proportion remains stable, the distribution of the exotic species may indeed be increasing; however, this increase may also have resulted from a better spatial coverage of the sampling effort and hence neither hypothesis can be rejected. 
If the proportion is declining, the area occupied by the exotic species may still be expanding, but at a very slow rate: in such a case, although the knowledge of the spatial distribution of plant species is improving, additional unit grid areas occupied by the exotic species are rarely found. The construction of proportion curves from the detrended (standardised) data enabled us to derive two additional invasiveness indices apart from the ones mentioned earlier (i.e. a–g). These indices are: (h) slope(proportion)—a measure of spread rate following data correction, and (i) invasion wave—a measure of the frequency of occurrence of significant increase (spike) in spread rate at each decadal time interval during which the proportion of the invaders relative to natives increased more than 5% (see also Mosena et al. 2018; Pili et al. 2019). We normalised the invasion wave frequency by residence time (year) to derive the invasion wave probability per year.
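The proportion curve and the invasion wave indices can be illustrated with a short Python sketch. It rests on an assumption that is not spelled out in the text: an increase of "more than 5%" is read here as a rise of more than 0.05 in the invasive-to-native proportion between consecutive decades. The function names and the counts are hypothetical.

```python
def proportion_curve(invasive_counts, native_counts):
    """Per-decade ratio of invasive to native records.

    Decades with no native records yield None, since no baseline exists.
    """
    return [
        inv / nat if nat > 0 else None
        for inv, nat in zip(invasive_counts, native_counts)
    ]


def count_invasion_waves(proportions, threshold=0.05):
    """Count decades whose proportion exceeds the previous available
    decade's proportion by more than `threshold` (assumed: absolute rise)."""
    waves = 0
    previous = None
    for p in proportions:
        if p is None:
            continue
        if previous is not None and p - previous > threshold:
            waves += 1
        previous = p
    return waves


def wave_probability_per_year(waves, residence_time_years):
    """Invasion wave frequency normalised by residence time (years)."""
    return waves / residence_time_years


# Hypothetical decadal record counts for one invader and its native baseline.
invasive = [1, 2, 10, 12, 30]
native = [20, 40, 50, 60, 100]

proportions = proportion_curve(invasive, native)
waves = count_invasion_waves(proportions)
probability = wave_probability_per_year(waves, residence_time_years=50)
```

In this toy series the proportion rises sharply twice, so two waves are counted over a 50-year residence time. If the 5% criterion were instead meant as a relative change, only the comparison inside `count_invasion_waves` would need to change.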

Note that because of sparse collection rates from 1850 to the early 1900s (see Results), we initially explored differences in invasion indices between the full datasets and the post-1930 datasets. Differences between the two datasets were minimal for many of the indices, especially for invasion wave frequency, and hence the entire datasets spanning 1850–2010 were used in all our analyses.

Drivers of weed invasiveness

Information on species-specific traits (life form, biogeographic origin, and life cycle) and historical variables (residence time, invasion pathway, type of introduction, and habitat known to be invaded) was compiled for all focal species (see Osunkoya et al. 2019a for details). The influence of these factors on rates and patterns of invasion (i.e. on distribution) at the species level was examined using Generalised Linear Model (GLM) and Classification And Regression Tree (CART) analyses. Residence time was based on the date of the first appearance of the species in the herbarium records. In Australia, the date of first collection in herbaria has been shown to be significantly (P = 0.0001) correlated with observed and documented introduction date (Hamilton et al. 2005), and hence we are confident in the approach we have taken. Nonetheless, to reduce historical bias, we grouped residence time into 40-year intervals (see also Castro et al. 2005; Pyšek and Jarošík 2005; Osunkoya et al. 2019a). To explore determinants of spread of plant invaders (a count response variable) across the State of Queensland, the species-specific and historical variables were entered as predictors in a GLM-ANOVA (using a negative binomial distribution with a log link, as the count data exhibited over-dispersion). A series of bivariate (normal and partial) correlations was also carried out between the traits to explore relationships between them.

CART, an exploratory statistical technique, is flexible, robust, and distribution-free, has the capacity to deal with both categorical and numeric variables, and is invariant to monotonic transformations (Breiman et al. 1984; De’ath and Fabricius 2000). CART is useful where there are many independent variables with complex interactions (e.g., in our case, species-specific traits in interaction with invasion history attributes etc.) that may influence a response variable (in our case, the total number of grid cells occupied by 2010 or species lag time). The technique provides a hierarchical dichotomous classification of the data set into smaller groups in which the within-group variation has been minimized with respect to the response variable. Regression trees can make better predictions than GLMs (De’ath and Fabricius 2000). Consequently, CART analysis (using SPSS version 25) was used to examine the relative role of invasiveness traits, species-specific and historical variables on lag time and final distribution (i.e. total number of grid cells occupied) of our focal invader species. The total number of grid cells occupied, or lag time, was the response variable, while the species-specific traits, historical factors, and indices of spread derived from the constructed invasion curves were the predictors.

In the SPSS software, the CART tree-growing function of CHAID (chi-squared automatic interaction detector) automatically performs cross-validation (using the tenfold method; Breiman et al. 1984), and calculates the cross-validation error rate (the expected error rate when the regression tree is applied to new data). This parameter is important as it evaluates the performance of the resulting regression tree with changing tree size. The optimal regression tree is the one that minimises the cross-validation error rate relative to the expected error rate. We constrained CART splits to stop when the minimum numbers of cases (species) in parent and child nodes were 12 and 6, respectively, and at a tree depth of four, as further splitting no longer added value to the prediction. CART also generates the importance value of the independent variables, reflecting the contribution of each variable stemming from both its role as a splitter and as a surrogate across all nodes of the tree. In the process, traits (variables) of little explanatory power are excluded from the tree.
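To make the splitting and stopping rules concrete, the following Python sketch finds the best binary split of one numeric predictor by minimising the within-group sum of squares of the response, subject to the parent and child node sizes quoted above. It is a generic variance-reduction split in the sense of Breiman et al. (1984), offered only as an illustration: it is not the SPSS procedure, it handles a single predictor and a single level of the tree, and the function names and data are hypothetical.

```python
def within_group_ss(values):
    """Sum of squared deviations from the group mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y, min_parent=12, min_child=6):
    """Best threshold on predictor x for response y.

    Returns (threshold, total_within_group_ss), or None when the node is
    smaller than min_parent or no split leaves min_child cases per side.
    """
    n = len(x)
    if n < min_parent:
        return None
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    best = None
    for k in range(min_child, n - min_child + 1):
        if xs[k - 1] == xs[k]:
            continue  # tied predictor values cannot be separated
        total = within_group_ss(ys[:k]) + within_group_ss(ys[k:])
        if best is None or total < best[1]:
            best = ((xs[k - 1] + xs[k]) / 2, total)
    return best


# Hypothetical data: residence time (years) and grid cells occupied
# for 14 species.
residence_time = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140]
cells_occupied = [5, 6, 4, 5, 6, 4, 5, 50, 52, 48, 51, 49, 50, 50]

split = best_split(residence_time, cells_occupied)
too_small = best_split(residence_time[:10], cells_occupied[:10])
```

A full tree would apply `best_split` recursively to each child node over all predictors until the size or depth limits are reached.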

Results

General trend

All life forms were fairly represented in our dataset of plant invaders, consisting of grass (N = 16), herb (N = 28), shrub (N = 20), succulent (N = 12), tree (N = 23) and vine (N = 9) species. There were insufficient data for 17 of our 108 (15.7%) focal species to fit spread trends with decadal time. The final database of 91 species has a diverse phylogeny. However, members of the families Poaceae (16), Fabaceae (15), Asteraceae (14) and Cactaceae (9) made up most of the species on the list. The historical pattern of sampling effort for our focal weed species was similar to that of the native species considered (Fig. 1). Few herbarium specimens were collected up until the 1930s. Collection efforts accelerated thereafter and reached their peak in 1990–2000, followed by a precipitous drop for both native and invasive species. The proportion (detrended) data gave a somewhat different trend (Fig. 2a), especially in the early collection decade (1850s), where there was a disproportionately higher collection effort for weeds relative to native species; after that early period, differences in collection efforts (invasive: native) were still apparent but no longer as dramatic (Fig. 2a).

Fig. 1

Time specific collection records for (a) all plant invader species (N = 108), and (b) common native species used (N = 67) of Queensland, Australia. Note differences in the values for each y-axis

Fig. 2

Time specific trends in collection records of plant invaders relative to native species. Trends are expressed as proportions for (a) all species pooled, and (b–g) for each plant life form. Bars above the dashed horizontal lines signify periods of significant spikes (indicated by asterisk, *) in collection rates of invasive species

Standard invasion and proportion (detrended) curves

We use the standard invasion and the proportion curves to infer speeds and periods of spread for our focal species. Standard and proportion curves at individual species level can be found in Supplementary Data files S2 and S3, while derived indices from the curves are summarised in Table 1.

Table 1 Summary statistics for invasiveness indices of established and emerging weeds of Queensland, Australia

We defined an invasion wave as a phase of expansion or a period of invasiveness (seen as a spike in proportion curve) at a time interval of 5–10 years during which the proportion (of invader relative to native) is increasing more than 5% (see also Delisle et al. 2003; Mosena et al. 2018; Pili et al. 2019). In general, four major periods of weed spread (spikes) can be distinguished in the proportion curves or frequencies: 1850s, 1900–1920, 1950 and 2000–2010 (Fig. 2a). Invasive grasses, shrubs/trees, and vines mirrored this general trend (Fig. 2b, d, f, g). However, herbs and succulents did not conform to the general trend, as these life forms only showed broad and diffuse invasiveness patterns (Fig. 2c, e). Across species, the general trends (using whole or post 1930 datasets) that can be inferred of periods of invasiveness (seen as spikes in spread rates) from these proportion curves (Fig. 3, Table 1, and [Supplementary Data Files S3-S4]) are:

  (i) Close to half of the weed species examined (42.3%) exhibited no spike in spread pattern with time (Fig. 3); and

  (ii) Many of those with spikes in spread showed them in only 1–2 decadal intervals (mean spike: 1.82 ± 2.54 [SD]), except for a few species with up to 6 spikes (American rat’s tail grass [Sporobolus jacquemontii], giant Parramatta grass [S. fertilis], grader grass [Themeda quadrivalvis], lantana [Lantana camara], leucaena [Leucaena leucocephala], and mesquite [Prosopis pallida]), and 7–13 spikes (calotrope [Calotropis procera], chinee apple [Ziziphus mauritiana], parkinsonia [Parkinsonia aculeata], prickly acacia [Vachellia nilotica], and rubber vine [Cryptostegia grandiflora]) (see also Supplementary Data File S4). This trend suggests that species with tendencies for significant increase in spread rates with time are mainly trees/shrubs and grasses (Table 1).

Fig. 3

Frequency distribution of invasion waves inferred from proportion curves. An invasion wave was defined as a phase of expansion or a period of invasiveness (seen as a spike in proportion curve) at a time interval of 5–10 years during which the proportion (of invader relative to native) is increasing more than 5%

Spread rates and patterns

Using segmented regression analysis, 54 of the 91 species (59%) with sufficient time-recorded datasets have at least one inflection point. Thus, these weeds showed evidence of non-linear increase in cumulative distribution with decadal time and hence exhibited lag phases (Table 1, [Supplementary Data File S2]). 41% (37 of 91 species) indicated no lag phase, and hence straight lines on log-arithmetic plots are apparent. As expected, for both species with linear (N = 37) and non-linear (N = 54) spread patterns, invasion indices derived from the herbarium records differ significantly amongst species (Table 1).

The identity of species with linear (no lag) and non-linear (lag) spread patterns was significantly (P < 0.05) influenced by residence time (many invaders with no lag phase are of recent origin—Tables 1 and 2), by species-specific traits of life cycle (species lacking a lag phase are more likely to be perennials rather than annuals) and introduction pathway (non-lag phase species are more likely to come into the novel range via waterways of aquaculture and ballast ships) (Osunkoya OO, unpublished data). In contrast, life form, biogeographic origin and the nature of habitat invaded played no significant role in the dichotomy between species with and without a lag phase period (Table 2).

Table 2 Summary results of species-specific traits and historical factors on invasiveness traits of Queensland weeds

Typical examples of species with linear (no lag) and non-linear (lag) spread patterns (with inflection years) are shown in Fig. 4. Of the 37 species with no lag phase, the majority are of recent introduction, with a mean year of arrival of 1965.28 ± 27.10 (SD), compared with a mean year of arrival of 1928.42 ± 29.14 (SD) for species with a lag phase. The top ten species in this group of no lag phase have representatives of all the plant life forms, except the vine group (Table 1). Species exhibiting a linear (no lag) spread pattern have a significantly (P < 0.05) lower spread rate (0.43 ± 0.08 [SE] grid cells of 50 × 50 km per year) than species with non-linear (lag) spread patterns at their exponential stage (1.94 ± 0.11 [SE] grid cells of 50 × 50 km per year). Spread rates of linear (no-lag) and non-linear (lag) expansion species are both influenced by life form (grass ≥ tree and succulent > herb and vine > shrub) (Table 2; Fig. 5a).

Fig. 4

Examples of fits of segmented regression lines to invasion curves for different species. These trends characterized the variations observed in the datasets: a, b: species with no lag phase; c, d: species with a single inflection (break) point; and e–h: species with multiple inflection (break) points

Fig. 5

Effects of plant life form of invader species on (a) spread rates of species exhibiting linear (non-lag phase) (N = 37) and non-linear (lag phase) (N = 54) expansion patterns, and (b) lag phase period of invader species exhibiting non-linear expansion patterns

Lag phase versus exponential phase

Very few species (6/91 = 7%) showed evidence of two inflection years (Table 1, Fig. 4). After the expansion phase, two of these six species (water hyacinth [Eichhornia crassipes] and annual ragweed [Ambrosia artemisiifolia]) exhibited asymptotes with time in their accumulated spread patterns, while four species indicated evidence of further increase (Bathurst burr [Xanthium spinosum], African love grass [Eragrostis curvula], rubber vine [Cryptostegia grandiflora], and lantana [Lantana camara]) (Table 1, Fig. 4; Supplementary Data File S2).

Within the lag phase period, spread rate as indicated by slope values differed amongst species (Table 1). The lag phase ranged between 12 and 126 years, with a mean of 45.9 ± 22.0 yr (SD). The lag phase period of species of different life forms differed significantly (P < 0.05) and was of the order: tree (54.3 yrs) ≥ shrub (52.3 yrs) ≥ grass (51.6 yrs) > vine (40.8 yrs) > herb (35.2 yrs) > succulent (22.0 yrs) (Fig. 5b; Table 2). Lag phase was also influenced by residence time (Table 2), with later arrival (i.e., shorter residence time) species showing shorter lag phases (Fig. 6a). Of the six intrinsic and extrinsic traits explored, the spread rate during the lag phase (slopelag) was influenced only by residence time (recent arrival > earlier arrival) and life cycle (spread rate of perennial > annual) (Table 2). In contrast, spread rate at the exponential phase (slopeexpo) differed between life forms (grass > vine = tree > succulent = herb > shrub) (P = 0.03; Fig. 5a), by biogeographic origin of the weed (Africa > Asia > America > Europe) (P = 0.001), by habitat invaded (multiple habitats > riparian = agricultural lands > grasslands > woodland/forests > wetlands) (P = 0.03) and by residence time (P = 0.001) (Table 2). Lag time itself appeared to have a non-linear, inverse relationship with the slope of spread during this quiescent period (Fig. 6b). In other words, propensity to spread decreased non-linearly and precipitously to a minimum with increasing lag time up until 50 years and changed little thereafter (R² = 0.12, P = 0.008). In contrast, at the expansion phase, no significant trend (R² = 0.01, P = 0.25) was detected between lag time and spread rate.

Fig. 6. Bivariate relationships between (a) lag time and residence time, and (b) lag time and spread rates at the lag phase.

We define sleeper weeds as species with lag phase periods of more than 50 years (sensu Cunningham 2004; Groves 2006). This threshold is close to the mean lag phase period of 45.9 ± 22.0 (SD) years inferred from this study. Spread rate for lag phase species was also minimal beyond this threshold in our data set (Fig. 6b). On this basis, 21/54 (39%) of identified lag phase species (or 21/91 = 23% of all species considered) fall into this category. The majority of the categorised sleeper weeds are trees and shrubs (60%), with fewer grasses (20%) and herbs (10%) (Table 1); vines and succulents are absent. Within species exhibiting lag phases, sleeper weed species do not differ significantly from their non-sleeper counterparts in spread rate (using raw data, slopelag: 0.29 ± 0.08 vs. 0.34 ± 0.06; P = 0.12; slopeexpo: 1.17 ± 0.18 vs. 0.97 ± 0.14; P = 0.37), nor in species-specific and historical traits (Table 2). In summary, CART analyses exploring the relative influence of the traits considered showed that the main determinants of variation in lag phase period are, in decreasing order, residence time, overall spread rate (slopelog-normal), and plant life form (Fig. 7a; see Supplementary Data File S6 for the regression tree).
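The 50-year rule and the two reported proportions can be checked mechanically. The following is a minimal sketch in Python, not the authors' code: the species labels and lag values in `example_lags` are hypothetical placeholders, and only the threshold and the counts 21, 54, and 91 are taken from the text.

```python
SLEEPER_THRESHOLD_YEARS = 50  # threshold stated in the text


def is_sleeper(lag_years):
    """Return True when a lag phase is longer than the 50-year threshold."""
    return lag_years > SLEEPER_THRESHOLD_YEARS


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Hypothetical lag phase durations (years), for illustration only.
example_lags = {"species_a": 12, "species_b": 50, "species_c": 51, "species_d": 126}
sleepers = sorted(name for name, lag in example_lags.items() if is_sleeper(lag))

# Counts reported in the text: 21 sleeper weeds, 54 lag phase species, 91 species overall.
share_of_lag_species = percent(21, 54)
share_of_all_species = percent(21, 91)
```

Note that a lag of exactly 50 years is not classed as a sleeper here, because the definition is "more than 50 years".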

Fig. 7. The relative contribution of factors to the optimal regression tree generated via CHAID for (a) lag time of 54 invader species exhibiting non-linear spread pattern and (b) final number of occupied grid cells (i.e., occupancy or range size) for all species (N = 91) studied. The most important variable always has a relative importance of 100%, and other traits are ranked in relation to this most important trait.

Relative contribution of species-specific and historical factors on weed spread

CART analyses indicated that the optimal regression tree for weed spread (as modelled by total number of grid cells infested, i.e. range size) had six terminal nodes and four depths (Fig. 8), and explained 94% of the variation in the dataset. Invasion wave frequency was responsible for the main split (depth I). Expansion rate of linear spread (i.e. non-lag phase) species and, again, invasion wave frequency were the main discriminators for depth II, from which three independent nodes (groupings) were produced. Depth III of the regression tree was driven primarily by the expansion rate of linear spread (non-lag phase) species and generated one independent node. The lowest tree branch (i.e. depth IV, with two independent nodes) had species’ residence time as the main discriminator. The relative influence of the 12 traits (both as splitter and surrogate variables) used in building the regression tree is shown in Fig. 7b. Overall, the relative roles of the factors considered as main determinants of final weed distribution (grid occupancy) are of the order: invasion wave frequency > spread rate of non-lag phase (linear-spread) species > spread rate of lag phase species at their exponential period (slopeexpo) > spread rate from proportion data (slopeproportion) ≥ residence time. Other traits of moderate importance are introduction pathway > spread rate at the lag phase period (slopelag) of non-linear spread species ≥ plant life form > break point (inflection) year. Habitat invaded, lag time, biogeographic origin, and life cycle played little or no role in the total number of grid cells infested (i.e., range size), and hence in defining invasiveness (spread).
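For readers unfamiliar with how a regression tree chooses a split, the sketch below shows the standard CART criterion for a single predictor: pick the threshold that minimises the summed within-group squared deviation of the response. It is an illustration in plain Python under stated assumptions, not the software or settings used in this study; the function names and the eight data points are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Find the threshold on x that minimises the summed SSE of y in the two groups.

    Returns (threshold, fraction_of_variation_explained), or (None, 0.0)
    when no split is possible (fewer than two distinct x values).
    """
    total = sse(y)
    pairs = sorted(zip(x, y))
    best_threshold, best_sse = None, total
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [yy for xx, yy in pairs if xx <= threshold]
        right = [yy for xx, yy in pairs if xx > threshold]
        split_sse = sse(left) + sse(right)
        if split_sse < best_sse:
            best_threshold, best_sse = threshold, split_sse
    explained = 0.0 if total == 0 else 1 - best_sse / total
    return best_threshold, explained


# Hypothetical data: invasion wave frequency (x) and range size in grid cells (y).
wave_frequency = [1, 1, 2, 2, 5, 6, 6, 7]
range_size = [10, 12, 11, 13, 80, 85, 90, 95]
threshold, explained = best_split(wave_frequency, range_size)
```

A full tree repeats this search over every predictor at every node, which is how a variable such as invasion wave frequency can appear at more than one depth.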

Fig. 8. The optimal regression tree model with four depths (I–IV) and six terminal nodes (labelled nodes 4, 5, 6, 8, 9, and 10) for 91 weed species of Queensland, Australia. Range size (total number of grid cells occupied) was the dependent variable. The species-specific traits and historical factors, as well as indices of spread (invasiveness) derived from the constructed proportion and invasion curves, were the predictors. Boxes at the nodes and leaves show mean range size (average number of grid cells occupied) and n, the number of species in each group formed. Each split is labelled with the invasiveness trait and the values that determine the split. The histogram within a node box shows the frequency distribution of total grid cells occupied by species within the group. Species membership of each node can be found in the last column of Table 1.

Discussion

Exploring and scrutinising the phases of biological invasions is necessary to tease apart the factors and processes driving colonisation, naturalisation and/or spread of introduced species in novel ranges, and to better predict and reduce the possible negative impacts of the phenomenon. For a significant number of our study species (41%), the rate of spread was constant, implying expansion in range size immediately after introduction and establishment. It appears that this group of species adapts quickly to the novel environment owing to habitat/climate similarity to that of the native range (Mack et al. 2000; Lavoie et al. 2013) and/or rapid adaptive evolution (Crooks 2005; Brandle and Brandl 2012; Winkler et al. 2019). Multiple introductions can also be a contributing factor, as they help to reduce genetic bottlenecks and offer opportunities for simultaneous and random meta-population explosion at numerous infestation foci (Crooks 2005; Mack et al. 2000; Winkler et al. 2019). An inverse relationship between lag phase period and number of introduced populations and/or polyploidization has often been reported (Brandle and Brandl 2012; Clements and Ditommaso 2011). Thus, it would be instructive to examine, for example, the level of polyploidy and spatial genetic differentiation in non-lag phase (linear spread) versus lag phase species.

Often, and as found in this study for the greater proportion of our focal species (59%), lags in population growth occur prior to range expansion (Pyšek and Prach 1993; Hobbs and Humphries 1995; Hastings 1996; Williamson et al. 2005; Wangen and Webster 2006; Coutts et al. 2017). The slow population growth rates that define the lag phase varied among species, but all were positive. The observed lag times with low but positive growth rates can be explained by purely spatial dynamics such as radial expansion and/or simple logistic population growth from a single point (Hastings 1996; Sakai et al. 2001; Crooks 2005), and by demographic stochasticity (Parker 2004; van Kleunen et al. 2018). Other factors contributing to the occurrence of a lag phase are ecological, including negative density dependence (e.g. Allee effects and evolutionary consequences of small population sizes resulting initially in low genetic diversity, genetic drift and bottlenecks) (Mack et al. 2000; Sakai et al. 2001; Clements and Ditommaso 2011), lack of mutualists (Parker 2004), long periods between reproductive events (Wangen and Webster 2006), spatial heterogeneity (Hastings 1996) and variable habitat connectivity (Mack et al. 2000). The observed lag time range of 12–126 yrs and mean of 45.9 yrs, as well as the proportion of weeds classified as having a lag phase (59%), are similar to values reported in the literature (Daehler 2009; Larkin 2012; Crooks 2005; Aikio et al. 2010; Hyndman et al. 2015). However, it should be noted that longer lag times (up to 300 years) have also been reported for weeds of temperate, Mediterranean, and other colder (e.g., subantarctic) regions (Kowarik 1995; Weber 1998; Brandle and Brandl 2012).

For species with lag phases, we detected a positive but non-significant relationship between spread rates at the lag phase and at the exponential (expansion) phase (Supplementary Data File S5). From this trend, we argue that the past or present performance of a weed is a poor predictor of its potential future population growth and range expansion. Consequently, the use of lag phase growth to predict future weeds will have low explanatory power (Crooks 2005). Better knowledge of the drivers of lag phases, such as a set of intrinsic traits that can reliably predict the phenomenon, could inform on the risks posed by potential invaders. However, as seen in this study, apart from residence time, many of the species’ intrinsic traits examined did not differentiate lag phase from non-lag phase species. This finding echoes a paradox noted by many previous workers: the best chances of success in control or eradication of pests occur when they are low in abundance and in their lag phases, but lag phase characteristics offer little information for predicting which exotic species will eventually become the weeds of tomorrow (Crooks 2005; Aikio et al. 2010; Larkin 2012; Coutts et al. 2017).

A positive relationship between lag time and residence time was detected. This suggests that new introductions are spreading faster and earlier than longer-established invaders (Daehler 2009; Aikio et al. 2010). This finding supports the parsimonious assertion that Queensland regions, like many landscapes around the globe, may have become more susceptible to invasion in recent times (Seebens et al. 2018). The susceptibility is exacerbated by increases in population density and economic activities, including land clearing, which make the landscape more interconnected and disturbed, and hence more receptive to a greater number of invader species and foci (William and West 2000; Seebens et al. 2018).

Lag times were related to species’ life forms: shrubs, trees and grasses had longer lag phases than vines, succulents, and herbs. Longer lag periods are expected for trees and shrubs because of their inherently long generation times; grasses have shorter generation times. The similarity of long lag phases in trees and grasses, despite differences in life form and generation time, is thus perplexing (see also Sutherland 2004). It is plausible that because members of both groups were introduced intensively and simultaneously in the late nineteenth and early twentieth centuries (Cook and Dias 2006; van Klinken and Friedel 2018), the subset of introduced grass species that evolved with time to become high impact invaders (e.g. grader grass [Themeda quadrivalvis], hymenachne [Hymenachne amplexicaulis] and Aleman grass [Echinochloa polystachia]) has followed high spread trajectories similar to those of the trees, seemingly encouraged by ecological novelty (unprecedented human-mediated changes at different ecological levels), high propagule pressure and the ability to respond to and even alter natural disturbances of fire and inundation (van Klinken and Friedel 2018). The subset of introduced grasses with longer lag phases (> 50 years), which are hence categorised as sleeper weeds (especially thatch grass [Hyparrhenia rufa], Parramatta grasses [Sporobolus fertilis and S. africanus], and fountain grass [Cenchrus setaceus]), may have a lower ability to cope with the recurring fires common in Queensland’s dry tropics, experience reduced seed bank populations, lack multiple introductions, be pollen limited, have reduced hybridisation potential with conspecifics or congeners, and have been prone to greater use as fodder compared to other life forms or weedy grasses lacking a lag phase (Parker 2004; Poulin et al. 2005; van Klinken and Friedel 2018).

The observed precipitous decline in herbarium records in recent times (i.e. this century) for both native and exotic species (Fig. 1) has also been documented in Europe and North America, and has been attributed to declines in the number of funded floristic projects, dwindling numbers of trained and amateur plant collectors, or a societal perception that herbarium collection is no longer necessary (Prather et al. 2004; Renner and Rockinger 2020). Overall, we found four periods of significant spread (spikes) of weed species (mainly grasses and trees): the 1850s, early 1900s, 1950s, and early 2000s, with increasing intensity of these spikes from 1950 onward (Fig. 2). Note that the spikes in weed spread observed before 1930 may not be a robust finding owing to sparse collection effort in those periods, and therefore need further verification (Aiello-Lammens 2020). Spikes observed in the mid-1950s are in line with the major introduction period (1930–1960) of many exotic grasses and trees (62 and 137 species, respectively) into Australia for agricultural and livestock production, including into Queensland, chiefly through the work of the Commonwealth Scientific and Industrial Research Organisation (CSIRO) and the State’s Department of Agriculture (Cook and Dias 2006). Land clearing from 1990 onward has encouraged the proliferation of weed species seen in 2000–2010 (Neldner 2014). Spikes in weed spread at the turn of this century have also been linked to increasing anthropogenic disturbance resulting from human population growth and commerce (Seebens et al. 2018; Lang et al. 2019).

The role of residence time in the ecology of invaders has been widely debated but is now recognised (e.g., Mack et al. 2000; Castro et al. 2005; Pyšek and Jarosik 2005; Wilson et al. 2007; Brandle and Brandl 2012; Schmidt et al. 2017), and was apparent in this work, especially for lag phase species. Residence time on its own correlated positively with lag time for the obvious reason that plants introduced very recently can show only short time lags, if any, while those introduced longer ago can show both short and long time lags. Consequently, one expects a decrease in time lag with decreasing time since introduction (Brandle and Brandl 2012; Larkin 2012; Aikio et al. 2010). Hence some scholars have downplayed the role of residence time as a determinant of the spread of invaders (Aikio et al. 2010; Larkin 2012). However, the CART analyses of lag phase period and range size indicated the prominent role residence time plays in the classification of our focal species (see also Wilson et al. 2007; Schmidt et al. 2017). Nonetheless, it should be noted that residence time was less important than spread rates (a proxy for the complex effect of all factors related to invasions) in determining the range size of weed species in our study (see also Pyšek and Jarosik 2005). Lastly, apart from plant life form, many species-specific and historical traits (life cycle, biogeographic origin, introduction pathway, and habitat invaded) played minor or no roles in the spread patterns and rates of invaders in our study. This observation is in line with previous studies (e.g. Goodwin et al. 1999; Castro et al. 2005; Sakai et al. 2001; Osunkoya et al. 2019a).
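The argument earlier in this paragraph, that a lag can never exceed residence time and that this ceiling alone produces a positive correlation, can be illustrated with a small simulation. This is a hedged sketch in Python with made-up distributions, not an analysis of the study data: the uniform range of 10 to 170 years and the assumption that lag is a random fraction of residence time are both hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumption: residence times spread evenly between 10 and 170 years.
residence = [rng.uniform(10, 170) for _ in range(2000)]
# Assumption: the lag is a random fraction of residence time, drawn independently,
# so the only link between the two is the ceiling lag <= residence.
lag = [rng.random() * r for r in residence]

r_constraint_only = pearson(residence, lag)
```

Under these assumptions the correlation is clearly positive even though no biological mechanism connects the two variables, which is why a raw lag time and residence time correlation needs cautious interpretation.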

Conclusions

We have demonstrated that herbarium records can provide valuable information on patterns and spread rates of introduced species. We are aware of the limitations of such repository records, including biases arising from opportunistic collection. As recommended (Delisle et al. 2003; Antunes and Schamp 2017), we standardised the dataset with native species data collected over the same period in Queensland and used the ensuing proportion curve to validate many of the findings. However, collection biases between invaders and native species in the same locality cannot be completely dismissed (Fuentes et al. 2013; Lang et al. 2019). It is heartening to note that robust methods (e.g. simulated vs. real data, and rarefaction analyses) are being developed to address the challenge (Lavoie 2013; Aiello-Lammens 2020).

As in many studies, we showed that a lag phase is a common phenomenon for many (but not all) weeds, though our reported values are not as high as those of some previous work (e.g., Kowarik 1995; Weber 1998). Our use of the segmented regression method (see Muggeo 2008) to estimate spread rates and inflection years is an improvement over techniques used in the past (i.e., eye-balling and/or piecewise model fittings). The segmented regression procedure has higher statistical power, generating and evaluating different slopes simultaneously for specific values of our continuous predictor (i.e., time). We advocate the use of such techniques in future biological invasion work.
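To make the idea of a broken-line fit concrete, the sketch below estimates one inflection point and the slopes on either side of it. It is a simplified illustration in plain Python, not the procedure of Muggeo (2008): it swaps in an exhaustive grid search over candidate break years with ordinary least squares in place of that method's iterative estimation, and the function names and the 13 data points are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a small dense system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_broken_line(times, values, knot):
    """Least-squares fit of value = b0 + b1*t + b2*max(t - knot, 0).

    Returns (residual sum of squares, slope before knot, slope after knot).
    """
    rows = [[1.0, float(t), max(t - knot, 0.0)] for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    b0, b1, b2 = solve_linear_system(xtx, xty)
    rss = sum((v - (b0 + b1 * r[1] + b2 * r[2])) ** 2 for r, v in zip(rows, values))
    return rss, b1, b1 + b2


def estimate_inflection(times, values):
    """Try every interior time as the knot and keep the fit with the smallest error.

    Needs at least three distinct time points.
    """
    candidates = sorted(set(times))[1:-1]
    fits = [(fit_broken_line(times, values, c), c) for c in candidates]
    (rss, slope_before, slope_after), knot = min(fits)
    return knot, slope_before, slope_after


# Hypothetical cumulative grid cells: slow spread for 30 years, then rapid spread.
years_since_first_record = list(range(0, 65, 5))
cells = [0.2 * t if t <= 30 else 6.0 + 2.0 * (t - 30) for t in years_since_first_record]
knot, slope_lag, slope_expo = estimate_inflection(years_since_first_record, cells)
```

Here the estimated knot corresponds to the inflection year and the two slopes to the lag phase and expansion phase spread rates. Measuring time from the first record, rather than using calendar years, keeps the least-squares system numerically well behaved.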

The low predictive power of lag time on range size, and our inability to identify a set of species-specific and historical factors to link with lag time, suggest that retrospective analyses like the one done here offer little hope for developing robust generalisations to identify the weeds of tomorrow. Lag time reflects genetic, demographic, habitat and climatic challenges. Hence, disentangling the relative power of these non-mutually exclusive processes on lag time is fraught with difficulty (Brandle and Brandl 2012; Coutts et al. 2017). Nonetheless, we have identified a group of species that exhibit lag periods > 50 years prior to exponential population growth (sleeper weeds; see Table 1). While many of these lag phase weeds are listed in the State of Queensland Biosecurity Act 2014 and are being proactively managed for eradication and/or integrated weed management (e.g. lantana [Lantana camara], rat’s tail grasses [Sporobolus complex], and water hyacinth [Eichhornia crassipes]), others in the same category are not formally listed (specifically, lippia [Phyla canescens], calotrope [Calotropis procera], thatch grass [Hyparrhenia rufa], and Chinese violet [Asystasia gangetica]). We hope the aforementioned unlisted weeds attract attention for policy and necessary management actions, especially because many are still confined to a few regions or local government areas (see Osunkoya et al. 2020).