Extrapolation in species distribution modelling. Application to Southern Ocean marine species
Introduction
Among the broad array of analytical tools developed for marine ecology studies over the last two decades, Species Distribution Modelling (SDM) has been increasingly used (Peterson, 2001, Elith et al., 2006, Austin, 2007, Gobeyn et al., 2019) and applied to Southern Ocean pelagic organisms (Pinkerton et al., 2010, Freer et al., 2019), benthic organisms (Loots et al., 2007, Pierrat et al., 2012, Basher and Costello, 2016, Xavier et al., 2016, Gallego et al., 2017, Guillaumot et al., 2018a, Guillaumot et al., 2018b, Fabri-Ruiz et al., 2019, Jerosch et al., 2019) and even marine mammals (Nachtsheim et al. 2017). SDM represents a complementary approach to individual-based modelling and eco-physiological experiments, quickly and synthetically identifying environmental correlates of species distribution (Brotons et al., 2012, Feng and Papeş, 2017, Feng et al., 2020). SDM is also used to define the spatial range of species distributions (Nori et al., 2011, Walsh and Hudiburg, 2018) and can provide decision criteria for conservation purposes (Guisan et al., 2013, Marshall et al., 2014). For instance, it is currently used in proposals developed by national committees of the CCAMLR (Commission for the Conservation of Antarctic Marine Living Resources) to support the definition and delineation of marine protected areas (Ballard et al., 2012, CCAMLR report WG-FSA-15/64, 2020, Arthur et al., 2018).
Applying SDM to Southern Ocean case studies is particularly challenging due to major constraints and biases that may reduce modelling performance. As for many oceanographic studies, access to environmental data with high temporal and spatial resolutions is difficult (Davies et al., 2008, Robinson et al., 2011). Antarctic coastal areas, in particular, are rarely accessed and documented due to logistical constraints, access being for example impossible during the austral winter due to sea ice cover (De Broyer et al. 2014). The availability of species absence records is also a limiting factor for model performance and calibration (Brotons et al., 2004, Wisz and Guisan, 2009). Models are usually based on a limited number of presence-only records and a limited number of sampling sites, both of which are spatially aggregated in the vicinity of scientific stations, where access is frequent and where datasets from different seasons have been compiled over decades or longer (De Broyer et al., 2014, Guillaumot et al., 2018a, Fabri-Ruiz et al., 2019, Guillaumot et al., 2019).
When generating an SDM, the model is fitted to data covering a given range of values for each environmental descriptor (i.e. the calibration range). When model predictions are transferred, part of the projection area may present additional conditions that fall outside this calibration range: these are non-analog conditions, and there the model extrapolates (Randin et al., 2006, Williams and Jackson, 2007, Williams et al., 2007, Fitzpatrick and Hargrove, 2009, Owens et al., 2013, Yates et al., 2018). Considering the limited number of presence-only records available for each marine benthic species, and the poor quality and precision of environmental descriptors available for modelling Southern Ocean species distributions (Guillaumot et al., 2018a, Fabri-Ruiz et al., 2019), a large proportion of cells might be expected to be extrapolations beyond the calibration range of the model.
The Multivariate Environmental Similarity Surface (MESS) approach analyses spatial extrapolation by extracting the environmental values covered by presence-only records and identifying areas where environmental conditions are outside the range of conditions contained in the calibration area (Elith et al. 2010). The method considers that extrapolation occurs when at least one environmental descriptor value is outside the range of the environmental envelope used for model calibration (more details given in Appendix 4).
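The extrapolation criterion described above can be illustrated with a minimal Python sketch. It is not the MESS implementation of Elith et al. (2010) or of MaxEnt, only the simplified rule stated in the text: a cell is flagged as extrapolation when at least one descriptor value falls outside the range covered by the presence-only records. The function names, the descriptor names (`depth`, `poc_min`) and all numerical values are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Range = Tuple[float, float]


def calibration_ranges(presence_env: Sequence[Dict[str, float]]) -> Dict[str, Range]:
    """Return the (min, max) of each descriptor over the presence-only records."""
    names = presence_env[0].keys()
    return {
        n: (min(rec[n] for rec in presence_env), max(rec[n] for rec in presence_env))
        for n in names
    }


def is_extrapolation(cell: Dict[str, float], ranges: Dict[str, Range]) -> bool:
    """True if at least one descriptor value lies outside its calibration range."""
    return any(not (lo <= cell[n] <= hi) for n, (lo, hi) in ranges.items())


def extrapolated_fraction(cells: Sequence[Dict[str, float]], ranges: Dict[str, Range]) -> float:
    """Proportion of projection cells flagged as extrapolation."""
    return sum(is_extrapolation(c, ranges) for c in cells) / len(cells)


# Hypothetical environmental values at presence-only records (illustrative only).
presence = [
    {"depth": -120.0, "poc_min": 0.8},
    {"depth": -450.0, "poc_min": 1.5},
    {"depth": -300.0, "poc_min": 1.1},
]

# Hypothetical cells of the projection area.
cells = [
    {"depth": -200.0, "poc_min": 1.0},   # inside both ranges
    {"depth": -3000.0, "poc_min": 1.2},  # depth outside the calibration range
    {"depth": -400.0, "poc_min": 0.2},   # poc_min outside the calibration range
    {"depth": -130.0, "poc_min": 1.5},   # inside (range boundaries are inclusive)
]

ranges = calibration_ranges(presence)
flags: List[bool] = [is_extrapolation(c, ranges) for c in cells]
fraction = extrapolated_fraction(cells, ranges)
print(flags, fraction)
```

In this sketch a single out-of-range descriptor is enough to flag a cell, so adding descriptors or widening the projection area can only keep or increase the proportion of flagged cells.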
The MESS approach was initially used to determine the environmental barriers to the invasion of the cane toad in Australia, when facing new environments and under future conditions (Elith et al. 2010). Implemented in MaxEnt (Elith et al. 2011), MESS was subsequently used by several authors for defining the climatic limits to the colonisation of new environments by non-native species, such as the American bullfrog in Argentina (Nori et al. 2011), for studying contrasts between native and potential ecological niches, as in the study of the spotted knapweed (Centaurea stoebe) (Broennimann et al. 2014), or for defining the limits to model transferability and predicting the distribution of trees under future environmental conditions (Walsh and Hudiburg 2018).
More recently, the MESS approach was used to define model uncertainties related to extrapolation (Escobar et al., 2015, Li et al., 2015, Cardador et al., 2016, Luizza et al., 2016, Iannella et al., 2017, Milanesi et al., 2017, Silva et al., 2019) and to identify extrapolation areas where environmental conditions are non-analog to the conditions of model calibration (Fitzpatrick and Hargrove, 2009, Anderson, 2013). Associating uncertainty information with model predictions has been acknowledged as a necessity for reliable interpretations of model predictions (Grimm and Berger, 2016, Yates et al., 2018). It is also a requirement for specifying the level of risk associated with predictions and evaluating whether uncertainty can be mitigated to improve model outcomes (Guisan et al. 2013).
This study addresses the importance of extrapolation and associated uncertainties in SDMs generated at broad spatial scale for Southern Ocean species, an analysis that is seldom performed although it is important for characterising model reliability. Using the case study of six sea star species that are abundant and common in marine benthic communities, the objectives of this work are to evaluate the proportion of extrapolation in wide projection areas and to provide methodological guidance to mitigate the effects of extrapolation and improve model accuracy.
Studied species and environmental descriptors
The distribution of six sea star species (Asteroidea: Echinodermata) was studied (Table 1). The six species, Acodontaster hodgsoni (Bell, 1908), Bathybiaster loripes (Sladen, 1889), Glabraster antarctica (Smith, 1876), Labidiaster annulatus Sladen, 1889, Odontaster validus Koehler, 1906 and Psilaster charcoti (Koehler, 1906) are abundant and common in benthic communities in the Southern Ocean. The biology, ecology and distribution of these species have been extensively studied and are
Extrapolation and the extent of projection areas
All generated SDMs perform well, with high AUC (AUC > 0.91), TSS (TSS > 0.559) and COR (COR > 0.68) values, low standard deviations and good percentages of correctly classified presence-only test data (77–90%) (Table 2). Descriptors that contribute the most to SDMs are depth (22–34%), minimum POC (6–21%), POC standard deviation (8–20%), mean ice cover depth (7–17%) and mixed layer depth (3–10%). Contrasts between species are in the respective percentage of contribution of these
Modelling performances and extrapolation
SDMs were generated for Southern Ocean sea star species, with contrasting distributions and different numbers of presence-only records available (Table 1, Appendix 1). Overall, species presence-only records are spatially concentrated in the most accessible and visited areas of the Southern Ocean. Most of the sea star samples were collected close to the coasts of the Western Antarctic Peninsula, the Ross Sea and sub-Antarctic Islands such as the Kerguelen Islands. Consequently, high spatial
Conclusions
This study shows that when species distributions are modelled over broad-scale areas, such as the Southern Ocean, a large proportion of predicted distribution probabilities (suitable or not) are model extrapolations. This extrapolation uncertainty depends on the completeness of species sampling and on the definition of the occupied space used to calibrate the model. Extrapolation occurs in areas where habitat suitability is unknown, as no information on species presence or absence is available.
Reducing
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work was supported by “Fonds pour la formation à la Recherche dans l’Industrie et l’Agriculture” (FRIA) and “Bourse fondation de la mer” grants to C. Guillaumot.
This is contribution no. 46 to the vERSO project (www.versoproject.be), funded by the Belgian Science Policy Office (BELSPO, contract n°BR/132/A1/vERSO). Research was also financed by the “Refugia and Ecosystem Tolerance in the Southern Ocean” project (RECTO; BR/154/A1/RECTO) funded by the Belgian Science Policy Office (BELSPO),
References (109)
- et al.
Genetic differentiation in the circum-Antarctic sea spider Nymphon australe (Pycnogonida; Nymphonidae)
Deep Sea Res. Part II (2011)
- et al.
Managing for change: Using vertebrate at sea habitat use to direct management efforts
Ecol. Ind. (2018)
Species distribution models and ecological theory: a critical assessment and some possible new approaches
Ecol. Model. (2007)
- et al.
Coexistence of mesopredators in an intact polar ocean ecosystem: the basis for defining a Ross Sea marine protected area
Biol. Conserv. (2012)
- et al.
Predicting suitable habitat for the cold-water coral Lophelia pertusa (Scleractinia)
Deep Sea Res. Part I (2008)
- et al.
How many species in the Southern Ocean? Towards a dynamic inventory of the Antarctic marine species
Deep Sea Res. Part II (2011)
- et al.
A global map of suitability for coastal Vibrio cholerae under current and future climate conditions
Acta Trop. (2015)
Diversity in deep-sea benthic macrofauna: the importance of local ecology, the larger scale, history and the Antarctic
Deep Sea Res. Part II (2004)
- et al.
Evolutionary algorithms for species distribution modelling: A review in the context of machine learning
Ecol. Model. (2019)
- et al.
Robustness analysis: Deconstructing computational models for ecological theory and applications
Ecol. Model. (2016)
Broad-scale species distribution models applied to data-poor areas
Prog. Oceanogr.
DNA barcoding reveals new insights into the diversity of Antarctic species of Orchomene sensu lato (Crustacea: Amphipoda: Lysianassoidea)
Deep Sea Res. Part II
Overcoming the rare species modelling paradox: a novel hierarchical framework applied to an Iberian endemic plant
Biol. Conserv.
The performance of state-of-the-art modelling techniques depends on geographical distribution of species
Ecol. Model.
Species distribution modelling to support marine conservation planning: the next steps
Mar. Policy
Constraints on interpretation of ecological niche models by limited environmental ranges on calibration areas
Ecol. Model.
Spatial and seasonal distribution of adult Oithona similis in the Southern Ocean: predictions using boosted regression trees
Deep Sea Res. Part I
Effects of sample size on accuracy of species distribution models
Ecol. Model.
Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS)
J. Appl. Ecol.
The effect of the extent of the study region on GIS models of species geographic distributions and estimates of niche evolution: preliminary tests with montane rodents (genus Nephelomys) in Venezuela
J. Biogeogr.
A framework for using niche models to estimate impacts of climate change on species distributions
Ann. N. Y. Acad. Sci.
Standards for distribution models in biodiversity assessments
Sci. Adv.
Selecting pseudo-absences for species distribution models: how, where and how many?
Methods Ecol. Evol.
The past, present and future distribution of a deep-sea shrimp in the Southern Ocean
PeerJ
Incorporating uncertainty in predictive species distribution modelling
Philos. Trans. R. Soc. B: Biol. Sci.
Overcoming limitations of modelling rare species by using ensembles of small models
Methods Ecol. Evol.
Contrasting spatio-temporal climatic niche dynamics during the eastern and western invasions of spotted knapweed in North America
J. Biogeogr.
Presence-absence versus presence-only modelling methods for predicting bird habitat suitability
Ecography
Modeling bird species distribution change in fire prone Mediterranean landscapes: incorporating species dispersal and landscape dynamics
Ecography
Combining trade data and niche modelling improves predictions of the origin and distribution of non-native European populations of a globally invasive species
J. Biogeogr.
A new method for dealing with residual spatial autocorrelation in species distribution models
Ecography
Biogeographic atlas of the Southern Ocean
Combining field phenological observations with distribution data to model the potential distribution of the fruit fly Ceratitis rosa Karsch (Diptera: Tephritidae)
Bull. Entomol. Res.
Global mapping of highly pathogenic avian influenza H5N1 and H5Nx clade 2.3.4.4 viruses with spatial cross-validation
Elife
Wrong, but useful: regional species distribution models may not be improved by range-wide data under biased sampling
Ecol. Evol.
Novel methods improve prediction of species’ distributions from occurrence data
Ecography
A working guide to boosted regression trees
J. Anim. Ecol.
The art of modelling range-shifting species
Methods Ecol. Evol.
A statistical explanation of MaxEnt for ecologists
Divers. Distrib.
Can we generate robust species distribution models at the scale of the Southern Ocean?
Divers. Distrib.
Benthic ecoregionalization based on echinoid fauna of the Southern Ocean supports current proposals of Antarctic Marine Protected Areas under IPCC scenarios of climate change
Glob. Change Biol.
Keep collecting: accurate species distribution modelling requires more collections than previously thought
Divers. Distrib.
Can incomplete knowledge of species’ physiology facilitate ecological niche modelling? A case study with virtual species
Divers. Distrib.
Physiology in ecological niche modeling: using zebra mussel's upper thermal tolerance to refine model predictions through Bayesian analysis
Ecography
A review of methods for the assessment of prediction errors in conservation presence/absence models
Environ. Conserv.
The projection of species distribution models and the problem of non-analog climate
Biodivers. Conserv.
Predicting future distributions of lanternfish, a significant ecological resource within the Southern Ocean
Divers. Distrib.
On the need to consider multiphasic sensitivity of marine organisms to climate change: A case study of the Antarctic acorn barnacle
J. Biogeogr.