Introduction

Fagopyrum L. (buckwheat) is a pseudocereal grain crop belonging to the family Polygonaceae. Some 20 species are known in the genus (Zhou et al. 2018) of which two are cultivated: F. esculentum Moench. (common buckwheat) and F. tataricum Gaertn. (Tartary buckwheat). Taxonomic nomenclature follows World checklist of vascular plants (2021) (https://wcvp.science.kew.org/). Both cultivated species have potential to meet global food security challenges because of their high nutritional values (Klykov 2018; Sytar et al. 2018). Net global production of buckwheat in 2018 was over 1.8 million tonnes, of which China is the world’s second top-producing country at 0.43 million tonnes in 2018 (FAO 2021). The vast majority of the global buckwheat grown is F. esculentum, while F. tataricum is mainly grown only in high altitude regions of India, Nepal and the Himalayan fringe of China.

As a minor crop with low research investment, the factors setting the limits of buckwheat cultivation are poorly understood. The major climatic factor usually cited which limits distribution of F. esculentum is its frost intolerance. Biotic factors may also be important. Successful seed set in F. esculentum is sensitive and frequently limits yield, as this is an insect pollinated, obligate outcrossing species (Jacquemart et al. 2012). Study of the variables that limit the cultivable range of buckwheat is important for crop improvement against a background of climate change, and also for our understanding of the history of this crop that has been cultivated in China for at least 6,000 years (Hunt et al. 2018). The spread of buckwheat in the past is of particular interest because it appears to have been domesticated relatively late, following a different geographical model from that of any of the other early Chinese grain crops. Most Fagopyrum taxa have a restricted distribution in southwest China and neighbouring countries in the southeastern Himalaya region, including the wild ancestral forms of F. esculentum (F. esculentum ssp. ancestrale) and F. tataricum (F. tataricum ssp. potanini) (Fig. 1, Ohnishi and Yasuo 1998). Genetic analysis has, unsurprisingly, indicated this region as the centre of origin of the two domesticated species; specifically, the Sanjiang (Three Rivers) region of northern Yunnan/western Sichuan/eastern Tibet has been suggested as the region of origin of Fagopyrum esculentum (Ohnishi 2009). However, the macro- and microfossil archaeobotanical evidence places the earliest buckwheat cultivation in semi-arid northern China, which remains the part of the country where most of it is grown today. Factors that might explain this discrepancy with the archaeological data include the possibilities that the range of F. esculentum ssp. ancestrale in the mid Holocene extended further north, or that it was originally domesticated at the northern margins of the range of its progenitor in southwestern China, but was not substantially cultivated until it reached further north. The latter scenario could be linked to a significant change in ecological adaptations following domestication (Hunt et al. 2018).

Fig. 1
figure 1

Approximate distributions of wild Fagopyrum species, based on Ohnishi (2018, Figs. 12.5 and 12.6) and Hunt et al. (2018, Fig. 3). Land cover data from Natural Earth 1 (naturalearthdata.com)

The routes and chronology of buckwheat dispersal remain elusive, largely owing to the scarcity of Fagopyrum macrofossils in the archaeobotanical record, and issues related to its identification (Hunt et al. 2018). The reasons for the strikingly low number of macrofossils in archaeobotanical assemblages are uncertain. It may be the result of the smaller scale of cultivation due to the lower yield of buckwheat, as compared to other staple crops grown in China at the time. However, the scarcity of Fagopyrum macrofossils might also relate to taphonomic processes such as the distance of crop processing activities from fires. In the context of legumes, Fuller and Harvey (2006) make the interesting suggestion that taphonomic pathways to carbonization are greater when grinding becomes prominent in crop processing, in contrast to a lesser signal in the carbonised record from a whole grain tradition. We may speculate that something similar might apply in the case of the buckwheat record.

Fagopyrum more frequently occurs in the fossil record as pollen, but is still sparse because of its limited production and dispersal capacity as is typical of an insect pollinated plant. Although charred buckwheat achenes (seeds) retain species-specific features, notably in the ridging and surface texture of the pericarp (Sharma 1986), the two cultivated species are not always clearly differentiated in publications (Hunt et al. 2018). Fagopyrum pollen is also usually identified only to the genus level, which also includes several wild species.

In the absence of species identifications, estimation of cultivated Fagopyrum has been made on the basis of the geographical distribution of records and of abrupt rises in its pollen percentages. The modern distribution of virtually all wild Fagopyrum species is restricted to southwestern China, so records from outside the region have been usually interpreted as evidence for cultivation. Hunt et al. (2018) used as a criterion the abrupt increase in Fagopyrum pollen percentages as an indication of its cultivation, following common practice in the primary Chinese palynological literature. Owing to its limited pollen dispersal, such a pattern is unlikely to be observed unless crop processing activities such as threshing or sieving took place in the vicinity (Liu et al. 2018), although it could also result from the deposition of whole flowers in the forming sediment (Hunt et al. 2018). Based on this criterion they collated a dataset with 14 records of pollen, ten of macrofossils and two of starch granules. Of these, the macrofossils have been identified as F. tataricum only from Kyung-Lung Mesa/Kaerdong, whereas those from other sites have been described either as F. esculentum or identified only to genus. In the current study, we discuss F. esculentum as the geographically and ecologically more widespread species. We make the assumption that genus records interpreted as cultivated and from outside the narrow high altitude range of F. tataricum also represent F. esculentum.

There are substantial climatic and ecological differences both within southwest China and with other parts of the country. Such an ecological setting is particularly sensitive to both past and future climate change, hence understanding the limitations to buckwheat cultivation is a powerful approach to studying the past spread of crops and peoples across prehistoric China. In this paper, we use species distribution modelling (SDM), reviewed in Franklin (2010) and Peterson et al. (2011), to generate ecological suitability maps of the study area for the past 8,000 years, using modern buckwheat productivity data. This builds on previous work in China applying thermal niche modelling to study the spread of the major Triticeae and Paniceae cereal crops to the Tibetan plateau and rice in the Shandong peninsula (d’Alpoim Guedes et al. 2015a, b, 2016; d’Alpoim Guedes and Bocinsky 2018), as a specific application of niche construction theory (d’Alpoim Guedes 2016).

Dispersal of economic plants resulted from the interactions between humans creating their cultural niches, and in doing so expanding ecological niches for particular plant taxa (Odling-Smee et al. 2003; Rowley-Conwy and Layton 2011; O’Brien and Laland 2012; Piperno 2017). Crops can be dispersed by movements of people, or by interactions between peoples such as trade or exchange, taking them into new habitats with suitable ecological conditions.

This niche following process might occur between adjacent geographical regions, but also by bypassing unsuitable areas or natural barriers as a form of relocational niche construction (Odling-Smee et al. 2003) and the process can be further facilitated by the adaptive plasticity of the plant (cf. Donohue 2005; Piperno 2017). Alternatively, plant dispersal can need more intensive human intervention in the ecosystem and changes to the landscape, so that the ecophysiological requirements of particular plants are mainly met by human actions such as pronounced modification of water and soil environments. These human actions can either mitigate changes in the existing conditions, such as climate deterioration (counteractive perturbation), or facilitate relocation into new environments (inceptive relocation) (Odling-Smee et al. 2003). This niche forming process can cause further evolutionary responses resulting in genetic and cultural adaptations of the plant and its cultivation practices to the new environments and the stabilisation of desired characteristics of the plant (Piperno 2017).

In the context of Fagopyrum, the distinction between niche following and niche forming expansion can be illustrated by reference to modern cultivation in Nepal (niche following) and the Russian Altai region (niche forming). In Nepal, Fagopyrum is currently the sixth most important staple after Oryza (rice), Triticum (wheat), Zea (maize), Eleusine (finger millet) and Hordeum (barley) (Luitel et al. 2017). The first three of these are niche forming crops, for which the lower altitude farming landscapes are significantly changed to accommodate the crops. Along the margins of these crops, and on poorer soils up to altitudes of 4,500 m a.s.l., plots are sown with several landraces of the two species of domesticated Fagopyrum (19 recorded landraces of F. esculentum and 37 of F. tataricum), chosen according to their varied resilience to water stress, low soil fertility and short seasons (Luitel et al. 2017). A contrasting practice can be found in Russia, currently the world’s largest producer of buckwheat (Sen Nag 2017). In the Altai region around half the total area under arable cultivation is sown with Fagopyrum. It is cultivated in almost all agroclimatic zones, where it is sustained by a system of intensive sowing and staged fertilizer management (Vazhov et al. 2013).

In this paper we examine whether the initial spread of F. esculentum mainly represents a niche following or a niche forming model. We emphasise here that the two processes are not mutually exclusive, but we expect the suitability of environmental conditions in places where buckwheat is grown to remain largely unchanged through time under the niche following scenario. A predominantly niche forming process should become obvious where environmental conditions deteriorate, as less suitable conditions are encountered and overcome by human action. We investigate this by (1) employing SDM, using the current distribution of the crop to determine the environmental conditions suitable for growing F. esculentum; (2) extrapolating the model backwards to past climatic conditions to estimate changes in the potential ecological niche of buckwheat over time; (3) extrapolating environmental suitability estimates from sites with archaeobotanical records of Fagopyrum; (4) comparing these estimates through time to the overall distribution of suitable conditions within the study area.

Materials and methods

To investigate the dispersal of F. esculentum in relation to its ecological niche, we estimated the relative suitability of environmental conditions for its cultivation at sites with evidence for buckwheat cultivation over the past 8,000 years. Following Hunt et al. (2018), we consider an abrupt increase in Fagopyrum pollen as evidence for nearby cultivation, as well as the presence of starch grains or charred seeds identified as either Fagopyrum or F. esculentum (including synonyms). In addition to those included in Hunt et al. (2018, Table 1), we included five additional records meeting the same criteria that have subsequently come to light. We added the earliest record of charred buckwheat seeds, dating from 3,500 bp, from the site of Donghuishan (Wei 2019) in northwestern Gansu and others dating from 1,000 bp from the site of Bangga in Tibet (Xizang) (Tang et al. 2021). We have also identified three more sites with sharp increases in Fagopyrum pollen records from the database in Cao et al. (2013). At Heshui (Shaanxi), Fagopyrum pollen was present in the palaeosol that formed in the early to mid Holocene (Jiang et al. 2013), indicating the presence of buckwheat in the area by 4,000 bp. At Charisu (Nei Menggou, Inner Mongolia), Fagopyrum pollen appears in significant quantities from 900 bp onwards, and at Wangjiadian (Henan) there is a sudden increase around 8,000 bp (Cao et al. 2013), making it the earliest possible record (ESM 1). In addition to sites with pronounced sharp increases or peaks, smaller quantities of Fagopyrum pollen were found in lake sediments dated to 700 bp and 5,600 bp from the lakes Shayema (Sichuan) and Sihailongwan (Jilin), respectively, as well as in river sediment from the Changjiang (Lower Yangtze) (HQ98) dated to around 5,000 bp (ESM 1; Cao et al. 2013). As vertical movement of pollen through sediments of this kind is unlikely, these results were also included in the analysis. Of the macrofossil records, the sites of Kyung-lung mesa/Kaerdong and Qugong, where the buckwheat was explicitly identified as F. tataricum (d'Alpoim Guedes et al. 2014; Song et al. 2018; Gao et al. 2021) were excluded. Therefore, the total number of occurrence records comprises 34 sites, two with starch grains, 11 with charred seeds and 21 with pollen records (Fig. 2, ESM 1, 2, 3).

Fig. 2
figure 2

Archaeobotanical records of Fagopyrum in China. Land cover data from Natural Earth 1, naturalearthdata.com). Borders of China and its administrative divisions from DIVA-GIS (2017). 1, Xindian; 2, Beizhuangcun; 3, Wang Xianggou; 4, Xishanping; 5, CM97; 6, Jingbian; 7, Fuxian; 8, Canduntou; 9, Wenhai lake; 10, H9602; 11, Jinchuan; 12, Muchang; 13, GH09B; 14, Lucheng; 15, Chenqi mogou; 16, Changning; 17, Dingjiawa; 18, Haimenkou; 19, Xueshan; 20, Yingpandi; 21, Yangjiawan; 22, Maquan; 23, Mozuizi; 24, Kyung-lung mesa; 25, Bayantala; 26, Sunchangqing; 27, Donghuishan; 28, Qugong; 29, Bangga; 30, Heshui; 31, HQ98; 32, Charisu; 33, Luochuan; 34, Shayema lake; 35, Wangjiadian; 36, Sihailongwan lake; 37, Xindian

The relative suitability for buckwheat growing of sites in China in the past was inferred from data on its present distribution using SDM, an established methodology with a wide range of ecological applications (Franklin 2010; Peterson et al. 2011; Booth et al. 2014), including the estimations of past and predictions of future distributions of plants and animals in connection to climate change (for example, Varela et al. 2011; Sillero and Carretero 2013). SDMs model the relationship between observed distributions and the environmental conditions to estimate the probability of occurrence of a taxon or its potential abundance. The output of SDMs can be interpreted along a spectrum between potential or fundamental niche (where species could grow in the absence of competition) and realised niche (where species actually grow) of the particular taxa, depending on features of the distribution data and the modelling approach (Jiménez-Valverde et al. 2008; Austin 1999). In this study, we use SDM to estimate the part of the potential niche with environmental conditions that were suitable for growing buckwheat by past societies. The potential niche occupied by Fagopyrum in the past must have been narrower than it is now, owing both to modern agricultural technology and to adaptation of buckwheat to the local environmental conditions since its domestication, for example in flowering time (Romanova et al. 2018). However, there is currently insufficient information on the evolution of these traits to incorporate modelling of the effects which these adaptations had in the past on the suitable niche. Predictions based on the present distribution of buckwheat will therefore be presented together with evidence for its past cultivation to estimate its greatest possible extent in the past. As such, the model will represent the absolute limits of the dispersal of buckwheat, rather than the accurate reconstruction of its past distribution.

Present day data on the global distribution of F. esculentum (Monfreda et al. 2008) were obtained from EarthStat (2021). This includes data on harvested area, yield and total production of different administrative regions, and information about data quality with regard to their geographical resolution, provided at the level of administrative units (AU) for which the data were collected. To minimize the effects of variability of data quality, we matched the geographical scope of our analyses to the geopolitical boundaries of China, where most of the data were collected at the xiàn 县 level, which is the third administrative unit level below provinces and prefectures in most of the country. This ensures comparability with our search area for archaeological evidence of buckwheat cultivation, as well as using units of area with higher consistency and detail compared with neighbouring countries (ESM 4). Although data on harvested area, yield and total production of buckwheat in China display very similar geographical patterns (ESM 5), we have selected the harvested area as a proxy for buckwheat abundance, because data were directly collected in the majority (2,152 out of 2,409) of AU rather than interpolated. More specifically, we used the harvested area fraction to account for differences in the size of the AU and log-transformed to remove the positive skew from the data distribution.

We used as predictors a total of five climatic variables related to temperature and precipitation selected from an initial larger pool of 17 bioclimatic variables, and net primary productivity (NPP), the mass of carbon accumulated per unit area per year. Estimates were obtained from scaled down simulation data based on the HadCM3 general circulation model and corrected for bias (Beyer et al. 2020), with a spatial resolution of 0.5° and a time resolution of 1,000 years for the last 21,000 years. To avoid multicollinearity affecting the results, we identified five clusters of highly correlated variables (r > 0.75) and selected one predictor from each cluster (ESM 6), based on their relevance for buckwheat cultivation and interpretability. Thus, temperature seasonality (BIO4) (the standard deviation of 12 mean monthly temperature values, multiplied by 100), mean temperatures of the warmest and driest quarters (BIO10 and BIO9) and precipitation of the driest quarter (BIO17) (which overlaps with the winter months in the study area, because the summer monsoon brings most of the precipitation), were included to represent the climatic extremes. Net primary productivity (NPP) was included as a suitable measure of the net carbon gain by vegetation; in the context of the study this represents the cluster of variables that measure annual and seasonal trends in precipitation (see ESM 6 for the correlations between the selected variables and the other bioclimatic variables). Before the analysis, the values of predictor variables were averaged for each administrative unit and standardised with the data on buckwheat production.

Various algorithms have been developed for presence or presence–absence records and quantified distribution data (Qiao et al. 2015). While many of these are applicable to the present day abundance data available for buckwheat, recent studies have emphasised the merits of Bayesian approaches which model the parameters as random variables, enabling a more accurate representation of uncertainty (Martínez-Minaya et al. 2018). Within a hierarchical framework they also allow the explicit modelling of spatial dependencies, improving the predictions (Golding and Purse 2016; Hefley and Hooten 2016; Redding et al. 2017; Domisch et al. 2019). We used a hierarchical Bayesian intrinsic conditional autoregressive model (ICAR), which incorporates information about the spatial structure of the administrative divisions of China. A quadratic equation was used to capture the expected non linear response of buckwheat productivity to environmental variables (as suggested by empirical observations, cf. Farooq et al. 2016) in the form of parabolic curves that allow the productivity to peak at a certain value and decrease towards either one or both climatic extremes.

The model estimates the log area under buckwheat cultivation for each AU (administrative unit) i, expressed as the proportion of the total area of that AU (\({y}_{i}\)), using the formula:

$${y}_{i} \sim Normal \,\left({\mu }_{i},\sigma \right)$$
$${\mu }_{i}=\alpha +{\beta }_{1}{x}_{i,1}+{\beta }_{2}{{x}_{i,1}}^{2}+\dots +{\beta }_{k-1}{x}_{i,l}+{\beta }_{k}{{x}_{i,l}}^{2}+{\varphi }_{i}$$

where \({x}_{i,1},\dots ,{x}_{i,l}\) are the values of environmental predictors for the focal AU, \({\beta }_{i1},\dots ,{\beta }_{k}\) are the parameters that define the relationship between the outcome \({y}_{i}\) and the predictor variables and \({\varphi }_{i}\) represents the spatial random effect for each AU derived from the values of its neighbours. The model estimates the values of \(\beta\) and \(\sigma\) using the following weakly informative priors:

$${\beta }_{i1},\dots ,{\beta }_{k} \sim Normal \,\left(\mathrm{0,\,1}\right)$$
$$\sigma \sim Exponential \,\left(1\right)$$

The vector of spatial random effects for all AUs in the study area: \(\varphi =\left[{\varphi }_{i},\dots ,{\varphi }_{n}\right]\) models the spatial interactions between the pairs of AUs, assuming that the interaction \(\left(ij\right)\) exists if AUs \(i\) and \(j\) are neighbours and \(i\ne j\). In the ICAR model the log-likelihood for each \({\varphi }_{i}\) has a normal distribution with a mean equal to the average of its neighbours, and the variance inversely proportional to their number. The log probability density of vector is estimated from the pairwise difference formulation, in which the difference between the values of neighbouring AUs penalize the likelihood of \(\varphi\), resulting in spatial smoothing (Morris et al. 2019):

$$p\left(\varphi \right)\propto exp$$

Because in the hierarchical Bayesian intrinsic conditional autoregressive (ICAR) model the prior for \(\varphi\) is improper and the pairwise difference formula is non-identifiable, the sum of \(\varphi\) is constrained to 0 to enable the estimation of the log probability density and to centre the model. For efficiency reasons, we have imposed a soft version of this constraint recommended by (Mitzi 2019), which tightly restricts the mean of \(\varphi\) to approximate 0, by setting the prior:

$$\sum \left(\varphi \right) \sim Normal \,\left(\mathrm{0,\,0.001}n\right)$$

where \(n\) is the number of AUs.

Model fitting was carried out using Stan and the R interface (Stan Development Team 2018; R Core Team 2020) using five chains, each composed of 2,000 warm-up samples and 8,000 subsequent iterations. The convergence was verified through visual inspection (ESM 7) and using the Gelman-Rubin convergence diagnostic (Rhat) (ESM 8) and the model fit was compared with the observed values. Subsequently, the model was extrapolated over the environmental conditions in the study area for the past 8,000 years, at intervals of 1,000 years. The complete workflow used for the analysis is included in the supplementary repository archived in Zenodo (https://doi.org/10.5281/zenodo.5226104).

The usefulness of the model depends on how well it predicts the fraction of the area under cultivation and how appropriate it is as proxy evidence for the suitability of environmental conditions. The model is necessarily a simplification and as such cannot account for the full range of factors affecting the amount of land under buckwheat cultivation, which include environmental, cultural, economic, historical and other factors. This can lead to some discrepancy between the predicted and observed values and affects the level of uncertainty about the response to environmental variables. However, the uncertainty is explicitly quantified and explained by the model and can be accounted for in the interpretation.

As the log area under buckwheat cultivation for each AU in China reflects its realized distribution under the present farming system, the value of the log area estimated by the species distribution modelling (SDM) can be directly interpreted as the expected amount of arable land with the crop. In the context of modern farming, we can expect this to represent less than the total area where environmental conditions allow economically viable buckwheat cultivation. However, the threshold for economic viability as well as the intensity and extent of buckwheat growing would have been more limited in pre-industrial societies. Therefore, when the predictions of the model are extrapolated to the past, they reflect the absolute limits of possible buckwheat growing, rather than the actual distribution and cannot be directly interpreted as estimates of the area under cultivation. Reliance on modern production data as a proxy for environmental suitability in turn rests on the assumptions that 1, two variables are correlated with one another and that 2, between 8,000 bp and the present there were negligible changes in the response of buckwheat to the predictor variables, so that it became better adapted to previously less suitable environments than those it had previously occupied.

Results

Model evaluation

To evaluate the performance of our model, we compared the predicted and the observed values of the size (natural logarithm) of the present-day cultivated area fraction in China. The differences between the mean predictions and the observations mostly fall between − 2 and 4 log area fraction, and all observations are within the estimated 90% confidence interval, which for most AU ranges between 1 and 5 log area fraction (ESM 9). However, there are a number of outliers with high overestimates and large uncertainty about the predictions, especially among AUs with lower observed values (ESM 10). Geographically, the largest differences between the predicted and observed values are observed in the larger AUs of western and northern China, as well as on Hainan island and the southernmost coastal areas facing it, where the uncertainty is also the highest. On the other hand, the model slightly underestimates (up to 3.94 log area fraction) the predicted values in the narrow corridor of AUs stretching from the southern fringes of the Himalaya into northeastern China and in a few places along the western border (ESM 11).

Response to environmental variables

To clarify the impact of individual predictors on the outcomes of the model, we constructed counterfactual plots (Fig. 3) showing the response variable in relation to changes in each of the environmental covariates, when all the other ones are kept constant at their average value. For further validation, we compared modelled responses with information derived from experimental studies and empirical observations.

Fig. 3
figure 3

Responses of the predicted log area under cultivation to changes in each of the predictor variables, while all the other ones are kept constant, at their mean values

Within the range of values observed in the study area over the past 8,000 years, the variation of net primary productivity (NPP) results in the greatest difference (10–20 log area fraction) between predictions for its lowest and highest values. This variable is a composite of the water availability and solar energy, required for photosynthesis, which agrees with the positive impact on buckwheat cultivation predicted by the model. As NPP rises to about 700 g of carbon, the log area fraction increases within the very narrow confidence interval in an almost linear fashion. Above this level, the benefits of a further rise in NPP are uncertain, as the credible interval extends between almost completely flat and steeply rising curves. Given the sharp increase of NPP towards the coastal areas of southern China with a tropical climate (ESM 12a, e) this variable seems to account for most of the high uncertainty about the predictions for the region. This suggests that the amount of data on buckwheat abundance in this area might have been insufficient for reliable estimates, qualifying the higher predicted estimates in tropical climate areas. The predictions are also strongly responsive to the mean temperature of the driest quarter (BIO9) and can vary by as much as 6–8 log area fraction across the spectrum of observed values. The proportion of the area under cultivation peaks between − 8 and 2 °C and decreases on either side of those values, with some uncertainty about the slope of the curve towards the minimum and the maximum. Additionally, the precipitation during the driest quarter (BIO17) negatively correlates with the response variable in a roughly linear fashion.

The effects of the temperature of the warmest quarter (BIO10) and the temperature seasonality (BIO04) are less clear owing to the ribbon-shaped confidence interval that includes a range of different shapes the response curves may take. BIO04 has weak negative correlation with the response variable, while the confidence interval for BIO10 includes both weak positive and weak negative correlations. These two variables have a relatively small impact on the model, with almost flat mean predictions and comparatively little variation between the estimates produced for their highest and lowest observable values.

Predictions based on past climatic conditions

Model predictions show an overall small increase in the suitability of the environmental conditions for buckwheat cultivation during the last 8,000 years. This is evident both in the small upwards movement of the overall distribution of predicted values through time (Fig. 4) and in the geographical enlargement of the potential niche (Fig. 5), defined as the areas that have conditions that are as good, or better than those estimated for the AU where buckwheat was cultivated in the past. We have shown the potential niche based on the minimum value predicted for an AU where buckwheat remains were found and for the time period to which they were dated. Specifically, this value was defined using the Fagopyrum pollen record from Sihailongwan lake in northeast China, dated to 5,600 bp, considered one of the earliest cases of secure evidence for buckwheat cultivation. The only possible earlier site, Wangjiadian, also located in northeast China, had a much higher predicted suitability. All the later sites with buckwheat micro- and macrofossils were located in areas predicted to have either similar or better conditions than at Sihailongwan (Fig. 4). In contrast, the predictions for the site of Kyung-lung mesa/Kaerdong from which the seeds of F. tataricum from the early 2nd millennium bp were recovered, are well below this threshold, emphasising the different tolerances of the two species to environmental stress.

Fig. 4
figure 4

Distribution of predicted values of log area under cultivation for administrative units in China (each weighted by its area) over the past 8,000 years, with 15,000 bp also included for contrast. Grey shadows represent violin-like plots that show the shape of the distribution; box plots represent the interquartile range for the same values. They are overlaid with a scatter of points representing values predicted for administrative units with buckwheat finds (pollen, charred seeds and starch grains). Sites mentioned in the text are labelled

Fig. 5
figure 5

Predicted log area under cultivation for that past 8,000 years, overlaid with locations containing past buckwheat records. Sites and areas mentioned in the text are labelled. The dashed black line indicates a minimum threshold at which buckwheat is present in the archaeological record and approximates the potential niche

Sihailongwan is located in the Changbai mountains region, which lies to the northeast of the Liaoning and Liao river basins, where there are archaeological sites of the Xinlongwa culture which is associated with early evidence for plant cultivation. The region was generally assumed to have had little agricultural activity up to the beginning of the 20th century, as natural vegetation covered the area during the Qing Dynasty (1616–1912 ce), when it was largely used for hunting. However, there is evidence for crop cultivation before that period, including charred seeds of Panicum miliaceum and Setaria italica, Triticum and Cannabis (Jia 2005; Stebich et al. 2015) as well evidence for forest clearing related to farming activity (Jiang et al. 2008), while Makohonienko et al. (2008) argue for Fagopyrum cultivation in the area since the 9th century ce, based on the pollen records from the Jinchuan site. Therefore the earlier small scale cultivation suggested by the pollen records from Sihailongwan is also possible.

The area of the estimated ecological niche was similar throughout the eight time periods, and stretched from southwestern to northeastern China, reaching a latitude of approximately 50°N in the far northeast (Fig. 5). There is a sizeable geographical gap in the potential niche around 6,000 bp in inland southeast China, but this gap became smaller over time and had almost completely disappeared by 1,000 bp. This reduction seems to have been largely driven by the decrease in the mean temperatures of the driest quarter. This area has the highest mean temperatures of the driest quarter (BIO9) in all time periods (ESM 12a, b). Variation in the precipitation during the driest quarter (BIO17) also contributed to the reduction of the geographical gap (ESM 12f). The best conditions for buckwheat cultivation in all time periods were predicted in northeastern China and in the southernmost coastal areas, although with considerable uncertainty about the estimates for the latter region (ESM 11). The extent of the niche to the north and northwest was relatively stable over time, however with some variation in how far it extended northwards and westwards, especially along the edges of the Himalaya. In the north, the mean temperature of the driest quarter seems to have been the major limiting factor, as the boundary of the niche runs along the lowest values of this variable (ESM 12a, b).

Although the overall conditions across the whole study area improved over time (Fig. 4), none of the recorded evidence for past buckwheat cultivation came from the regions where the niche expanded. Until 3,000 bp, buckwheat finds appear only in the northern parts of the niche as we have defined it, specifically northeastern China (Fig. 5, area B), north of the inferred centre of Fagopyrum domestication in Yunnan province (Fig. 5, area A), and the mouth of the Yangtze river. After this date, buckwheat also appears from the southwest, as well as from sites at the mouth of the Huang He (Yellow river), the area with the highest predicted values.

Discussion

Modelled response to environmental variables

The response of buckwheat to environmental variables predicted by our model is consistent with experimental studies and empirical observations. The positive correlation with the net primary productivity (NPP) represents water and solar energy requirements, while the parabolic shape of the mean temperature of the driest quarter (BIO9) reflects the thermal tolerance, such as the lack of frost resistance and aversion to high temperatures during the flowering period (Jacquemart et al. 2012). Specifically, the model suggests that suitability of environmental conditions starts to decrease as the average temperature of the driest quarter falls below about −8 °C (Fig. 3). As buckwheat has a relatively short growing season of 70–100 days (Li et al. 2016), this is unlikely to directly overlap with the driest quarter, which in most of the study area falls between October/November and January/February. However, cold temperatures in the driest quarter are also associated with an increased likelihood of frost in the months before and after, which may fall within the range of sowing times. As buckwheat can grow in different rotation systems throughout the year, the timing of the growing season varies in different parts of China. In the highlands in the northern part of the country where it is one of the principal crops, sowing starts from late May to early June. In the plains around the Yellow river valley, it is a fill-in crop, sown in July or August, after wheat is harvested. In the south, buckwheat is grown as an autumn and winter crop and can be sown until early to mid November, although the sowing dates vary considerably between different regions (Lin et al. 1992; Lin 1994, 2006). The temperature of the driest quarter (BIO9) is therefore likely to differ in relevance between regions, but it is generally an important proxy for the response of buckwheat to temperature. Finally, the negative correlation between the precipitation in the driest quarter (BIO17) and the area under cultivation may in turn reflect an overall better adaptation to temperate ecozones than to tropical and subtropical ones, as the gradient of change in BIO17 across China closely follows these climate zones (ESM 12a, c).

These predictions mainly suggest that Fagopyrum has a very broad temperature tolerance, as indicated by the wide peak of the BIO9 response curve, as well as a moderate and gradual response to changes in precipitation, as indicated by the positive correlation with the NPP response curve (Fig. 3), which correlates with a cluster of bioclimatic variables (BIO12,13,16,18) related to precipitation (ESM 6). As a result, changes in the values of both variables over the time period considered in this study are not sufficient to significantly alter the size and shape of the predicted ecological niche. This can be contrasted with the noticeably smaller size of the niche predicted for 15,000 bp (Fig. 4), when the climatic conditions differed more substantially. It is possible that a climatic deterioration over shorter time periods could have resulted in temporary localised gaps in the potential niche, as well as a shorter phase of niche expansion not detectable at the scale of analysis. However, the time resolution of the climatic reconstructions is too low to capture potential short term fluctuations and therefore we focus on the long term changes in the climatic suitability for buckwheat cultivation.

Implications of the model for the spread of buckwheat

As noted previously, there is a discrepancy between the areas with Fagopyrum growing shown either by archaeobotanical evidence or the genetic estimates of the geographical origins of domesticated F. esculentum. The former shows that earliest buckwheat macrofossils from the 8th millennium cal bp came from sites in northern China, the latter suggests that buckwheat domestication occurred in the Sanjiang (Three Rivers) region in southwest China, from where, however, there are no archaeobotanical records pre-dating the early 3rd millennium cal bp. This discrepancy could be explained either by a historically far more widespread distribution of the progenitor taxon F. esculentum ssp. ancestrale, or by a ‘proto-domestication’ and limited cultivation at the margins of the F. esculentum ssp. ancestrale range in the southwest, followed by its being taken northwards and cultivated more intensively there (Hunt et al. 2018). The second scenario could also be associated with the ‘edge effect’ which describes the idea that the investment of farming efforts leading to domestication would be cost effective only outside the core distribution area of the wild ancestors. This has been proposed to explain similar patterns observed in other crops, including emmer and einkorn wheats in southwest and rice in southeast Asia, whose centres of domestication are beyond or on the edges of their wild ancestral ranges (Jones and Brown 2007).

Our analysis suggests that southwest China has relatively high suitability for growing buckwheat (Fig. 5, area A), but the area with the most suitability is instead around the lower reaches of the Yellow river and the coast of the Bohai sea in northeast China (Fig. 5, area B). Area B includes the site of Wangjiadian, where the earliest recorded spike in pollen percentage evidence of buckwheat is dated to the 8th millennium bp (Fig. 5). Higher suitability of this region than southwest China could have encouraged more growing of buckwheat in this area, supporting the second of the two hypotheses above. Archaeobotanical records and historical documents confirm the importance of buckwheat cultivation in area B in later periods (Li et al. 2016), which is in agreement with the consistently high suitability for buckwheat cultivation suggested by our model throughout eight millennia.

In inland southeast China, we identified an area below our suitability threshold, which could have acted as a barrier to the dispersal of buckwheat into southeast Asia, where it is now cultivated at very low levels. This may differentiate buckwheat from other cereals; Deng et al. (2020) suggest that an inland route was important for the spread of rice and millet into the southeastern coastal regions of China and then into southeast Asia. The expansion of rice and millet appears to date back to ca. 4,850 bp, although southeastern China is poorly explored from an archaeobotanical standpoint (Deng et al. 2020). Buckwheat cultivation, however, would not have been viable in large parts of this region (Jiangxi, Hunan and Hubei provinces) until around 2,000 bp, according to our model. This geographical and chronological pattern is largely caused by the high mean temperatures of the driest quarter (BIO9, ESM 12a, b). The high temperatures in this region in the early to mid Holocene correspond to the Holocene Climatic Optimum inferred for southeast China from other proxy evidence, in which the vegetation was dense subtropical evergreen and broadleaved forests (Zhou et al. 2004; Yue et al. 2015). Compared with other regions in China, buckwheat is not much grown in the southeast today (ESM 5, 11) and this region contributes a low proportion of buckwheat diversity as measured by germplasm resources.

In some parts of our study area, the extent of the potential niche of buckwheat suggested by our model stops short of the modern geopolitical boundary of China, so the boundary of the niche is clear. The presence of archaeobotanical buckwheat records from the northwestern (Xindian, Xishanping and Beizhuangcun) and northeastern (Chongming island) margins of the estimated niche between 5,500 and 4,500 cal bp, as well as from the entrance to the Hexi Zhoulang (corridor) in Donghuishan by the 4th millennium bp, suggests that it might have spread into all suitable areas in the northern part of China by this time. It is notable that in western China, the boundary runs in an approximately North–South direction through western Gansu, eastern Qinghai, western Sichuan and southeastern Tibet (Xizang), thus excluding the cold and/or arid regions of the main Tibetan plateau and Xinjiang. This is significant in terms of the prehistoric spread of agriculture, as it implies that, in contrast to other short season grain crops which originated within China (in particular, millets), there was no period when the environmental niche permitted the westward spread of buckwheat via the inner Asian mountain corridor route (cf. Liu et al. 2019). This is consistent with the absence of buckwheat in the archaeobotanical record from Central Asia.

In other parts of China, the potential niche that we modelled reaches the modern national border, and as such is expected to extend beyond it, though to an unknown extent. This is true for a relatively short stretch around 42°N along the border between the Inner Mongolia region of China (Nei Menggou) and Mongolia, and along the entire southern and western boundary of Yunnan province and neighbouring parts of Tibet, bordering Myanmar, northwest India, Bhutan and eastern Nepal. Thus, from the environmental standpoint, southern and westward routes beside the Himalaya would have been possible for westward expansion. From the 3rd millennium bp, there is direct evidence for the presence of buckwheat from the site of Haimenkou in Yunnan province. It is suggested that cultivated F. esculentum may have spread westwards from eastern Tibet or Yunnan along the southern side of the Himalaya. Fagopyrum pollen was found at Tso Mori lake in Ladakh, India, in a layer dated to 1,100 bp (Cao et al. 2013). Macrofossils of both F. esculentum and F. tataricum were also found from sites in the Jhong valley in contexts dated to the early 1st millennium bce (Knörzer 2000), though recent new analyses have suggested that these buckwheat finds may be modern intrusions (Meurers-Balke et al. 2020). If the samples were ancient, this chronology would fit both with the confirmed macrofossil dates for Yunnan and with the more favourable environment for buckwheat in southwest China from 3,000 bp as shown by our model. Buckwheat is grown today on the southern side of the Himalaya in Jammu and Kashmir and in the states Himachal Pradesh and Uttarakand in northwestern India, in West Bengal through to Arunachal Pradesh in northeastern India (Rana et al. 2016) and in many regions of Nepal (Luitel et al. 2017).

Niche following pattern

Our species distribution modelling (SDM) suggests a slow and steady improvement in the suitability of environmental conditions for buckwheat cultivation in the study area over the past eight millennia (Figs. 4, 5). However, the spatial extent of the potential ecological niche remained relatively stable over time, with limited expansion into a few inland regions in southeast China. This suggests that climate change in the study area did not generate opportunities for dispersal to new places, as these were already available at least from 8,000 bp. It follows that the dispersal of buckwheat did not require human modifications of the physical environment (counteractive niche construction) beyond those involved in the initial domestication process. The range of suitability estimates for sites with fossil evidence of buckwheat remains consistent over time (Fig. 4). The lack of expansion of buckwheat into areas with lower suitability, and therefore requiring human modification, appears to favour a niche following process. However, this inference depends on the small set of past occurrence records, whose identification relies on the series of assumptions discussed earlier. As such, the overall scarcity of buckwheat in the archaeological record limits our ability to distinguish between the absence of buckwheat cultivation from less suitable environments and the absence of evidence for expansion resulting from a sampling bias. Therefore, further archaeobotanical research in the areas without records, particularly in southeast China, is needed to verify the niche following hypothesis.

Conclusions

In this paper, we have used species distribution modelling (SDM) to estimate the potential past ecological niches of Fagopyrum esculentum (buckwheat) in China. We compared this with the archaeobotanical record to find whether these data suggested that buckwheat growing was more consistent with a niche following or a niche forming process. While the small sample size of our fossil record does not allow for a direct assessment of the actual past niche of buckwheat and its changes through time, our analysis has revealed how the potential niche of buckwheat has remained almost unchanged during the Holocene and we therefore suggest that the dispersal of buckwheat followed a niche following process.

This approach is particularly well suited for studying taxa with low occurrence records that have inevitably received less attention. It offers an opportunity to compare the dispersal of major and minor crops on an equal basis and examine the impact of changing ecological opportunities for them. As such, our approach also provides a basis for evaluating the direct competition between domesticates.