Elsevier

Ecological Modelling

Volume 447, 1 May 2021, 109526
Submitted to: Ecological Modelling: Original research paper
Population projections from holey matrices: Using prior information to estimate rare transition events

https://doi.org/10.1016/j.ecolmodel.2021.109526

Highlights

  • The multinomial Dirichlet distribution allows estimation of transition parameters.

  • Estimates can be calculated even in the case of unobserved transitions.

  • This Bayesian approach provides measures of uncertainty throughout the PVA.

  • Credible intervals are bounded by 0 and 1 avoiding biologically unrealistic values.

  • Explicitly reporting observed transitions and priors results in reproducibility.

Abstract

Population projection matrices are a common means for predicting short- and long-term population persistence for rare, threatened and endangered species. Data from such species can suffer from small sample sizes and consequently miss rare demographic events resulting in incomplete or biologically unrealistic life cycle trajectories. Matrices with missing values (zeros; e.g., no observation of seeds transitioning to seedlings) are often patched using prior information from the literature, other populations, time periods, other species, best guess estimates, or are sometimes even ignored. To alleviate this problem, we propose using a multinomial-Dirichlet model for parameterizing transitions and a Gamma for reproduction to patch missing values in these holey matrices. This formally integrates prior information within a Bayesian framework and explicitly includes the weight of the prior information on the posterior distributions. We show using two real data sets that the weight assigned to the prior information mainly influences the dispersion of the posteriors, the inclusion of priors results in irreducible and ergodic matrices, and more biologically realistic inferences can be made on the transition probabilities. Because the priors are explicitly stated, the results are reproducible and can be re-evaluated if alternative priors are available in the future.

Introduction

Population projection matrices are a common means for predicting long-term population persistence, especially for rare, threatened and endangered species (Crone et al., 2013; Stott et al., 2010a). For matrix population models to be practical for ecologists and conservation biologists, it is assumed that sample size is sufficient to estimate the “true” vital rates and transition probabilities of a population in the period and site of interest (Caswell, 2001), or at least to approximate them well. Matrix entries are transition probabilities and recruitment rates. We define transitions as all arrows in the t-matrix (transition matrix), which include stasis and change of stage, while the recruitment rates are defined in the f-matrix (fecundity matrix) (Fig. 1). In reality, however, the amount of data that can be obtained from population surveys to estimate transition probabilities and recruitment rates is often limited (Morris et al., 2002; Crone et al., 2011). The challenge is to gather sufficient data to accurately represent the central tendencies and variance of life history parameters (Caswell, 2001; Coulson et al., 2001). Inference about transition probabilities and recruitment requires data from at least two time periods, because the parameters included in the model are the transition/survival rates of individuals at different ages/stages, including survival, death and some measure of offspring production. Some transition events may be naturally rare, resulting in infrequently observed or even unobserved events in some time periods or populations (e.g. Enright & Ogden, 1979; Pinero et al., 1984; Calvo, 1993; Kaneko & Kawano, 2002; Schödelbauerová et al., 2010). In all of these cases the authors made educated guesses about values that were missing in their matrices.
While such substitutions are common in the literature when incomplete or biologically unrealistic values are obtained, the ad hoc nature of the process is frequently not reproducible, and rarely is the variance in the estimated quantities provided. What is the best way to explicitly take advantage of prior information to resolve issues of missing data or rare vital rates that result in biologically unrealistic estimates?

Confidence in the outputs of Population Projection Matrix (PPM) analyses is highest when population sizes are large because data collected to estimate the parameters of the matrix will more likely be representative of the life history characteristics of the species. But when sample sizes are small, as when evaluating naturally rare and/or endangered species, parameter estimates, and the consequent inferences based on them, could be misleading. It is generally recognized that small sample sizes will at a minimum result in high variance of the parameter estimates, which may have large impacts on estimates of population growth rates and extinction probabilities, impairing management decisions (Crone et al., 2011). Yet even when sample sizes are large, naturally rare events may still be missed or their estimates may not be accurate for the population, species or experimental group under study.

Ecologists have used various approaches to fill in missing data. These include using means of multiple populations, time periods, aggregation among years, values from the literature for the same or different populations, or from congeneric species, and best-guess estimates from specialists (Menges & Dolan, 1998; Kendall & Nichols, 2002; Menges et al., 2006; Kéry & Schaub, 2012). Bayesian approaches have also been used for estimating parameters in Population Viability Analyses (PVA) (McCarthy et al., 2001; McCarthy & Masters, 2005). Lee and Rieman (1997) used a Bayesian belief network in a PVA for salmonids, and Wade (2002) applied a Bayesian approach to estimate life history parameters. However, Wade's approach was limited to individual parameters; it did not address the estimation of multinomial parameters or the use of the beta distribution. Tremblay et al. (2009a, 2009b) and Tremblay and McCarthy (2014) used Bayesian hierarchical models with multinomial distributions to estimate multiple transition parameters and vital rates simultaneously for rare or unobserved events, and Ackerman et al. (2020) used the Dirichlet distribution for transition probabilities.

Tremblay and McCarthy (2014) used a Bayesian approach to estimate transition probabilities and recruitment rates for missing values or low sample sizes with data from conspecific populations, thus allowing the statistical model to “borrow strength” from other populations and generate estimated probabilities for all transitions. These models used data from nearby conspecific or related populations collected in the same time period. Unfortunately, many demography studies are carried out with single populations (44% for plants, Salguero-Gómez et al., 2015; 57% for animals, Salguero-Gómez et al., 2016), limiting the usefulness of the Tremblay and McCarthy (2014) approach. Moreover, Che-Castaldo et al. (2018) recently modelled variation in life history parameters of plants and found little evidence of consistent phylogenetic correlations, and large variation even between different populations of the same species. Similar problems are also noted for PPMs of animals, where a limited amount of data is available for model construction. For example, Eberhardt et al. (1986) had a small sample size of grizzlies in some age classes. Morris and Doak (2002, p. 197) recognized the potential problem with small sample sizes for rare transitions, but their recommendation was to use a mathematically complex approach similar to that developed by Tremblay and McCarthy (2014).

King et al. (2010, chapter 8) demonstrate a Bayesian approach to fill in missing covariate values in mark-recapture and other complex ecological models of individual survival. These are different and more complex problems than we address here. In our case, missing values are unobserved transitions. In the more labor-intensive, data-rich mark-recapture analysis, these unobserved transitions are modelled directly. Our goal is to show that a simple method exists for dealing with unobserved transitions in the case of sparse data.

In all cases the best substitute for missing or inaccurate values will depend on the knowledge of the life history of the species under study and the ability of the researcher to choose the best alternative considering a plethora of biological variables. However, it is not always clear how much weight to place on prior biological knowledge compared to the data, and how to explicitly include this in the projection matrix. An observation of 0 for a rare transition is still data. While the maximum likelihood estimate would be 0, calculating an upper confidence limit of that estimate is not straightforward. However, calculation of credible intervals, bounded by 0 and 1, is straightforward with the Dirichlet distribution, which simultaneously considers the values of all other transitions for that specific stage. In the examples mentioned below we compute matrix elements by using the direct counts for each stage and the change or stasis of these among time periods. The IPM (Integrated Population Model) approach may also be analyzed using Bayesian inference methods (Schaub & Abadi, 2011).
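
The zero-count case can be illustrated with a minimal sketch in Python with NumPy. This is not the authors' code: the counts, the uniform prior of 1 per fate, and the function name `transition_interval` are all illustrative assumptions. The sketch relies on the standard result that each marginal of a Dirichlet posterior is a Beta distribution, and it approximates the credible interval by Monte Carlo draws.

```python
import numpy as np

def transition_interval(counts, prior, index, level=0.95, n_draws=100_000, seed=1):
    """Posterior mean and credible interval for one fate of a single stage.

    counts: observed numbers of individuals per fate (e.g. stay, grow, die).
    prior:  Dirichlet prior parameters, one per fate (an assumption here).
    index:  position of the fate of interest.
    """
    post = np.asarray(counts, dtype=float) + np.asarray(prior, dtype=float)
    a = post[index]          # Beta shape for the fate of interest
    b = post.sum() - a       # Beta shape for all other fates combined
    rng = np.random.default_rng(seed)
    draws = rng.beta(a, b, size=n_draws)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return a / (a + b), lower, upper

# Hypothetical data: 20 individuals, none observed growing to the next stage.
counts = [15, 0, 5]          # stay, grow, die
prior = [1.0, 1.0, 1.0]      # uniform prior, chosen only for illustration
mean, lower, upper = transition_interval(counts, prior, index=1)
print(f"mean={mean:.3f}, 95% interval=({lower:.4f}, {upper:.3f})")
```

With these made-up numbers the posterior mean for the unobserved transition is 1/23, and the upper limit is roughly 0.15, so the estimate is small but not zero and the interval stays inside 0 and 1.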

Biologically nonsensical results may arise from PPMs when problems of ergodicity (the condition that regardless of the initial stage structure, the asymptotic growth rate will be the same) and irreducibility (the condition that transitions from every stage to every other stage are possible, directly or through intermediate stages) are ignored; both may severely impact matrix structure and analyses (Stott et al., 2010a). Unfortunately, these issues are relatively common. Stott et al. (2010b) surveyed the literature and estimated that 24.7% of population projection matrices were reducible and 15.6% were non-ergodic. Matrices which include zeros due to rare events or low sample size in key life history rates are likely to be either non-ergodic or reducible. While irreducibility has a specific mathematical definition (Caswell, 2001), biologically a reducible matrix often corresponds to a life cycle model which is incomplete. Some of the reducible matrices surveyed had incomplete life cycles, which Stott et al. (2010b) regarded as defying “biological rationale”. In many cases, authors recognized the limitations of their models and used one or more of the methods mentioned above to substitute for missing or unrealistic values.
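
As an illustration of irreducibility only (ergodicity is not checked here), the following Python sketch uses the standard test that a non-negative n × n matrix A is irreducible exactly when (I + A)^(n-1) has no zero entries. The two example matrices and the function name `is_irreducible` are hypothetical and are not taken from the paper's data.

```python
import numpy as np

def is_irreducible(matrix):
    """Return True if every stage can reach every other stage."""
    a = np.asarray(matrix, dtype=float)
    n = a.shape[0]
    # Work with the 0/1 pattern of the matrix so entries cannot underflow.
    pattern = np.eye(n) + (a > 0)
    reach = np.linalg.matrix_power(pattern, n - 1)
    return bool(np.all(reach > 0))

# Columns are the starting stage, rows the stage one time step later
# (seedling, juvenile, adult). No recruitment was "observed", so the
# adult-to-seedling entry in the top right corner is a hole.
holey = [[0.5, 0.0, 0.0],
         [0.3, 0.6, 0.0],
         [0.0, 0.3, 0.9]]

# The same matrix with the hole patched by a small recruitment value.
patched = [[0.5, 0.0, 0.2],
           [0.3, 0.6, 0.0],
           [0.0, 0.3, 0.9]]

print(is_irreducible(holey))    # False
print(is_irreducible(patched))  # True
```

In the holey matrix adults never contribute to seedlings, so the life cycle does not close; a single non-zero recruitment entry is enough to make it irreducible.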

Here we describe the use of a Bayesian approach which allows incorporating explicit prior information into the analysis to “fill in the holes” created by small sample sizes and rare transitions. We define holey matrices as those which have zeros (“holes”) for parameters that are expected to have non-zero values in critical life history transitions, without which the life history is biologically nonsensical. A simple example is when no recruitment or death of a certain age/stage was observed during a specific time period. Our approach estimates values to replace those zeros, thereby eliminating unrealistic life history transitions/stasis and fertility rates. We use the multinomial Dirichlet distribution to estimate transition rates between stages and the Gamma distribution to estimate recruitment rates, and we contrast the results with standard PPM analyses. This approach differs in that it is reproducible because all priors are explicitly stated, parameter estimation considers the sample size of the survey and the weight of the priors, and posteriors have a beta distribution, thereby resolving issues of nonsensical confidence intervals. Furthermore, it is applicable to a single population.
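
A minimal sketch of how such posteriors can be computed is given here in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the prior is expressed as prior probabilities times a prior weight (an equivalent sample size), recruitment counts are assumed to be Poisson so that the Gamma prior is conjugate, and the function names and all numbers are hypothetical.

```python
import numpy as np

def posterior_transitions(counts, prior_probs, prior_weight):
    """Posterior mean fates for one stage under a multinomial-Dirichlet model.

    The Dirichlet prior parameters are prior_weight * prior_probs, so
    prior_weight acts like a number of imaginary prior individuals.
    """
    counts = np.asarray(counts, dtype=float)
    alpha = prior_weight * np.asarray(prior_probs, dtype=float)
    post = counts + alpha
    return post / post.sum()

def posterior_fecundity(offspring, adults, prior_shape, prior_rate):
    """Posterior mean recruits per adult under an assumed Poisson-Gamma model."""
    return (prior_shape + offspring) / (prior_rate + adults)

# Hypothetical fates of 20 juveniles: seedling, juvenile, adult, dead.
# No juvenile was seen becoming an adult, which leaves a hole.
juvenile_counts = [0, 18, 0, 2]
# Prior guess; the 0.0 keeps juvenile-to-seedling as a structural zero.
juvenile_prior = [0.0, 0.7, 0.1, 0.2]
juvenile_post = posterior_transitions(juvenile_counts, juvenile_prior, prior_weight=1.0)

# Hypothetical recruitment: no recruits seen from 10 adults, with a prior
# worth one adult producing 0.5 recruits on average.
fecundity_post = posterior_fecundity(offspring=0, adults=10, prior_shape=0.5, prior_rate=1.0)

print(np.round(juvenile_post, 4), round(fecundity_post, 4))
```

With a prior weight of 1 against 20 observed individuals the data dominate: the unobserved juvenile-to-adult transition receives a small positive value, while transitions given zero prior probability and zero counts remain exactly zero. Increasing `prior_weight` pulls the estimates toward the prior.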

Model species

We illustrate our method using two data sets, but the method can apply to any populations with transitions between stages or ages of animals or plants. In the first example the data are from the endangered orchid, Lepanthes eltoroensis Stimson, an epiphyte endemic to Puerto Rico (Tremblay & Hutchings, 2003). Data were collected from a total of 1085 individuals on 23 trees which were permanently identified and tracked over a 6-year period at intervals of 6 months. Each tree is an independent

Results

Lepanthes example

Discussion

Ideally, uncertainty in vital rate parameter estimates should be small for the species and conditions under consideration. Unfortunately, on many occasions transition events may simply fail to occur during the survey period. The population projection matrices constructed from these data are often reducible (Stott et al., 2010b). When we used the Dirichlet distribution to parameterize the matrices, all of them were irreducible and ergodic, regardless of the choice of prior.
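
Uncertainty in the patched entries can also be carried through to derived quantities such as the asymptotic growth rate. The Python sketch here shows one way this might be done, assuming hypothetical counts for a two-stage life cycle, a Dirichlet prior of 1/3 per fate, a Poisson-Gamma model for recruitment, and that only the last stage reproduces with recruits entering the first stage. The function name `growth_rate_draws` is made up for this example.

```python
import numpy as np

def growth_rate_draws(counts, prior, offspring, adults, shape, rate,
                      n_draws=2000, seed=1):
    """Posterior draws of the dominant eigenvalue of the projection matrix.

    counts, prior: arrays with one row per fate (k stages, then death) and
    one column per starting stage. All posterior Dirichlet parameters must
    be positive. shape, rate: Gamma prior for recruits per adult.
    """
    post = np.asarray(counts, dtype=float) + np.asarray(prior, dtype=float)
    k = post.shape[1]
    rng = np.random.default_rng(seed)
    lambdas = np.empty(n_draws)
    for d in range(n_draws):
        a = np.zeros((k, k))
        for j in range(k):
            # Draw all fates of stage j, then drop the death row.
            a[:, j] = rng.dirichlet(post[:, j])[:k]
        # Add a draw of recruitment from the last stage into the first.
        a[0, k - 1] += rng.gamma(shape + offspring, 1.0 / (rate + adults))
        lambdas[d] = np.max(np.abs(np.linalg.eigvals(a)))
    return lambdas

# Hypothetical counts: rows are juvenile, adult, dead; columns are the
# starting stage. No juvenile growth and no recruitment were observed.
counts = [[14, 1],
          [0, 17],
          [6, 2]]
prior = np.full((3, 2), 1.0 / 3.0)
lambdas = growth_rate_draws(counts, prior, offspring=0, adults=20,
                            shape=0.5, rate=1.0)
lower, upper = np.quantile(lambdas, [0.025, 0.975])
print(f"median lambda={np.median(lambdas):.3f}, 95% interval=({lower:.3f}, {upper:.3f})")
```

Each draw yields a complete matrix with no holes, so the spread of the resulting growth rates reflects both the sample sizes and the stated priors.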

Conclusions

The multinomial Dirichlet approach adds to the tools for estimating transition probabilities and should be implemented for estimating central tendencies and credible intervals of parameters, and for evaluating transitions of rare and unobserved events. Calculation of parameter estimates should be explicit and observed transitions and priors should be shown in publications. Access to the raw data (at least the number of observed transitions) would open up the opportunity for data to be

CRediT author statement

RLT, MEP and AJT were responsible for developing the main ideas. RLT, JDA and AJT developed the biological relevance of the methods. RLT provided the data, and AJT and RLT ran the analyses. RLT initiated the writing of the manuscript. All authors contributed to subsequent revisions.

Data accessibility

For our data analyses we developed an R package “raretrans” (https://atyre2.github.io/raretrans/) that replicates the analysis in this paper and provides functions for evaluating the posterior distribution of matrices from data on individual transitions and populations.

ORCID

Raymond L. Tremblay https://orcid.org/0000-0002-8588-4372

Maria-Eglée Pérez https://orcid.org/0000-0001-8641-8405

Andrew J. Tyre https://orcid.org/0000-0001-9736-641X

James D. Ackerman https://orcid.org/0000-0002-8928-4374

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Financial support for RLT to attend the symposium on “Demography beyond the Population”, in Sheffield, UK by UPR-Humacao is appreciated. We thank the participants at the meeting who suggested that we submit this paper. RLT and AJT also thank the 2016 Global Analysis of Plant and Animal Demography Workshop funded by the International Opportunity Development Fund, Natural Environment Research Council. Supported by NSF: HRD 0734826. AJT was supported by the Nebraska Agricultural Experiment

References (55)

  • H. Caswell

    Matrix Population Models

    (2001)
  • J. Che-Castaldo et al.

    Predictability of demographic rates based on phylogeny and biological similarity

    Conserv. Biol.

    (2018)
  • S.L. Choy et al.

    Elicitation by design in ecology: using expert opinion to inform priors for Bayesian statistical models

    Ecology

    (2009)
  • P. Congdon

    Bayesian Models for Categorical Data

    (2005)
  • E.E. Crone et al.

    Ability of matrix models to explain the past and predict the future of plant populations

    Conserv. Biol.

    (2013)
  • E.E. Crone et al.

    How do plant ecologists use matrix population models?

    Ecol. Lett.

    (2011)
  • P. Diaconis et al.

    Conjugate priors for Exponential Families

    Ann. Stat.

    (1979)
  • P.G. Lejeune Dirichlet

    Sur une nouvelle méthode pour la détermination des intégrales multiples

    Liouville, J. de Mathématiques, Ser. I,

    (1839)
  • L.L. Eberhardt et al.

    Monitoring grizzly bear population trends

    J. Wildlife Manage.

    (1986)
  • S.P. Ellner et al.

    Integral projection models for species with complex demography

    Am. Nat.

    (2006)
  • N. Enright et al.

    Applications of transition matrix models in forest dynamics: Araucaria in Papua New Guinea and Nothofagus in New Zealand

    Aust. J. Ecol.

    (1979)
  • B.A. Frigyik et al.

    Introduction to the Dirichlet Distribution and Related Processes

    (2010)
  • Goodman, D., 2002. Bayesian population viability analysis. In: Population Viability Analysis. Editors: Beissinger, S.R,...
  • R.D. Gupta et al.

    The history of the Dirichlet and Liouville distributions

    Int. Stat. Rev.

    (2001)
  • C.C. Horvitz et al.

    Spatiotemporal variation in demographic transitions of a tropical understory herb: projection matrix analysis

    Ecol. Monogr.

    (1995)
  • Y. Kaneko et al.

    Demography and matrix analysis on a natural Pterocarya rhoifolia population developed along a mountain stream

    J. Plant Res.

    (2002)
  • W.L. Kendall et al.

    Estimating state-transition probabilities for unobservable states using capture-recapture/resighting data

    Ecology

    (2002)