A modified common factor model for modelling mortality jointly for both sexes

Wong, Kenneth; Li, Jackie; Tang, Sixian

doi:10.1007/s12546-020-09243-z

A modified common factor model for modelling mortality jointly for both sexes

Original Research
Published: 17 March 2020

Volume 37, pages 181–212, (2020)
Cite this article

Download PDF

Journal of Population Research Aims and scope Submit manuscript

A modified common factor model for modelling mortality jointly for both sexes

Download PDF

Kenneth Wong¹,
Jackie Li¹ &
Sixian Tang¹

264 Accesses
4 Citations
Explore all metrics

Abstract

There is an increasing attention on the joint modelling of multiple populations. Populations are related in several ways, such as neighbouring countries, females and males, and socioeconomic subgroups within a population. They are associated due to certain common driving forces, and mortality projection models should be constructed to allow for such underlying relationships. One such example is the Poisson common factor model. In this paper, we consider some extensions of the original Poisson common factor model. The first is to use a different number of additional factors for each sex. With this, the potential trend differences between the two sexes can be captured. The second is to incorporate a common age sensitivity effect into the additional factors. It may help improve parameter parsimony. A hybrid version between these two extensions is also considered. Overall, the modified versions deliver better fitting and projection results than the original model, using mortality data from eight developed countries. Qualitatively speaking, the new models provide much more flexibility in modelling populations with different mortality patterns. Empirically, they are shown to be able to produce improved performance in fitting and projection. The models are selected as the optimal choices based on information criteria statistics, and they tend to produce more accurate forecasts of the male-to-female ratios of death rates.

A multi-population evaluation of the Poisson common factor model for projecting mortality jointly for both sexes

Article 17 June 2016

Jackie Li, Leonie Tickle & Nick Parr

Bayesian Poisson log-bilinear models for mortality projections with multiple populations

Article 19 September 2015

Katrien Antonio, Anastasios Bardoutsos & Wilbert Ouburg

Modelling Nigerian Female Mortality: An Application of Four Stochastic Mortality Models

Introduction

In a world that is increasingly connected, population demographics and trends and their potential changes have significant implications on future economic and environmental planning. In particular, the continual improvement in life expectancy over the recent decades presents a serious challenge for government pension schemes, superannuation funds, and other retirement income providers, as these entities need to ensure that the funds available will be sufficient to pay off future benefits. It is hence very important to develop suitable mortality projection models for measuring the past trends and producing mortality forecasts.

There has been much development in the field of mortality modelling and projection in the last two and a half decades. Ever since Lee and Carter (1992) laid the foundation for stochastic mortality modelling, a large amount of interest from both academics and practitioners has been spurred up, from which there are hundreds of research papers and reports published to date on this area. The initial Lee–Carter model is elegant and straightforward, but it also suffers from various limitations, such as the restriction of constant age sensitivity and the inadequate allowance for potential volatility. Numerous modifications, extensions, and alternatives to the Lee–Carter model have been proposed and tested in the literature—see Booth and Tickle (2008), Cairns et al. (2008), and Cairns et al. (2009) for a comprehensive review and comparison of single-population mortality projection models.

A research area that has recently gained much attention is the co-modelling of multiple populations. Some examples of related populations include geographically and socioeconomically close countries, female and male populations of the same region, and regional compared to national mortality of the same country. It can be argued that these subdivided groups or neighbouring populations are linked by certain common driving forces, and mortality projection models should be developed to capture the underlying relationships properly. One of the main criteria of a good multi-population mortality projection model is biological consistency in forecasting female and male mortality (Li 2013). For instance, it is a well-known feature that females have a higher life expectancy than that of males. A model that forecasts the opposite situation would be difficult to justify due to incompatibility with historical data and current biological knowledge (e.g. Li et al. 2016).

The earliest applications of the Lee–Carter model have treated related populations separately or simply in aggregate. As noted in Li (2013), however, treating two populations as independent when applying the Lee–Carter model would result in divergent mortality projections or possibly a mortality crossover. Without considering related populations jointly, long-term mortality forecasts could become nonsensical and unusable. In order to produce coherent or non-divergent mortality forecasts, Li and Lee (2005) proposed an extension of the Lee–Carter model, known as the augmented common factor model. This multi-population model allows for a common population-wide factor for the main long-term trend as well as an additional factor for short-term deviations of each subpopulation from the main trend. The specific time series modelling of the additional factor ensures convergence in the projected death rate ratios at each age in the long term. Li (2013) then modified the augmented common factor model by using a Poisson assumption to cater for the number of deaths and generalising the structure to incorporate multiple additional factors for each sex. This Poisson common factor model (PCFM) maintains the projected death rate ratio’s convergence at each age, offers a formal model framework for statistical analysis, and provides more flexibility in capturing higher order effects in the data. Some examples of its applications to demographic and insurance problems can be found in Li and Haberman (2015), Li et al. (2016), Parr et al. (2016), and Yang et al. (2016).

The PCFM can be further improved in a number of aspects. Yang et al. (2016) extended the PCFM by incorporating the cohort effect via six different structures, and found that the new structures improve the model fitting, reduce the optimal number of additional factors, and maintain coherent mortality forecasts. This paper, on the other hand, seeks to modify the PCFM in a different fashion. First, the original PCFM does not allow for a different number of male- and female-specific factors. In principle, relaxing this limitation should allow the model to capture more different features or trends between both sexes, resulting in a more flexible model to cope with different situations. Second, a common age sensitivity effect can be imposed on the additional factors. It can potentially improve the parsimony of the model, in terms of reducing the number of parameters required. These two modifications work in an opposite way to some extent—one allows for differences in the period effect whereas the other exploits similar age sensitivity patterns (if any) between females and males. But the main purpose is the same, which is to develop more ways to adapt the PCFM to mortality data of different populations with diverse features.

In short, this paper aims to test the two above-mentioned extensions of the PCFM and examine whether these alternatives show an improvement in performance over the baseline model. The assessment is based on the quality of model fitting as well as forecast accuracy. Section 2 reviews the PCFM and provides details of the two proposed model variations. Section 3 applies the models to datasets of eight populations and analyses the fitting results. Section 4 compares the models in terms of forecasting performance and conducts out-of-sample testings. Section 5 further compares the forecasting performance of the models against those proposed by Hyndman et al. (2013) and Bergeron-Boucher et al. (2018). Finally, Sect. 6 sets forth our concluding remarks and suggestions on future research.

Two modifications of PCFM

The Lee and Carter (1992) model specifies the log central death rate at age x in year t as $\ln{m}_{x,t}={a}_{x}+{b}_{x} {k}_{t}+{e}_{t}$,where ${a}_{x}$ is the age-specific average log death rate, ${k}_{t}$ is the time-varying mortality index with ${b}_{x}$ as the age-specific sensitivity, and ${e}_{t}\sim \mathrm{\rm N}(0,{\sigma }_{e}^{2})$. The estimated trend of ${k}_{t}$ is usually downward and fairly linear, and so it is often modelled as a random walk with drift. The Li and Lee (2005) model and the PCFM are its extensions to cater for multiple populations.

Under the PCFM (Li 2013), the force of mortality ${\mu }_{x,t,i}$ at age x in year t for sex i (i = 1 for females and i = 2 for males) is assumed to be constant over each integer age-period interval, so the central death rate ${m}_{x,t,i}={\mu }_{x,t,i}$. The number of deaths is modelled as a Poisson random variable, i.e. ${D}_{x,t,i}\sim {\text{Poisson}}({E}_{x,t,i} {m}_{x,t,i})$, where ${D}_{x,t,i}$ is the number of deaths and ${E}_{x,t,i}$ is the corresponding exposure to risk. While ${E}_{x,t,i}$ is a known quantity, ${m}_{x,t,i}$ is an unknown parameter that requires estimation from the data.

The Poisson assumption has quite a few advantages. As argued in Li (2013), this assumption leads to a rigorous statistical framework for analysing mortality data. Also, treating the number of deaths ${D}_{x,t,i}$ as a counting random variable is a more natural choice compared to modelling the death rate with a homoscedastic error term in earlier work such as Lee and Carter (1992) and Li and Lee (2005). This Poisson framework is widely used in the literature—see Brouhns et al. (2002) and Cairns et al. (2009) for previous applications. When assessing uncertainty in mortality changes, however, observed death counts appear to be over-dispersed for many countries (e.g. Cairns et al. 2009). For such cases, the Poisson assumption above can readily be modified as over-dispersed Poisson, in which the mean ${\rm E}\left({D}_{x,t,i}\right)={E}_{x,t,i} {m}_{x,t,i}$ remains the same while the variance is defined as ${\text{Var}}\left({D}_{x,t,i}\right)=\phi {E}_{x,t,i} {m}_{x,t,i}$ instead, with $\phi >1$ as the dispersion parameter (Renshaw and Haberman, 2006). In general, there is no change needed for the computation algorithm and the parameter estimates would stay the same, except that the extra dispersion parameter has to be calculated separately from the deviance function.

Extended from the augmented common factor model in Li and Lee (2005), the log central death rate under the PCFM is modelled as:

$$\ln{m}_{x,t,i}={a}_{x,i}+{B}_{x} {K}_{t}+\sum \limits_{j=1}^{n}{b}_{x,i,j} {k}_{t,i,j}$$

where ${a}_{x,i}$ represents the overall age effect, ${B}_{x} {K}_{t}$ is the common factor for both sexes, and ${b}_{x,i,j} {k}_{t,i,j}$ is the jth additional sex-specific factor (a total of n factors) for sex i. In more detail, ${K}_{t}$ is the mortality index of the common factor, and ${B}_{x}$ measures the sensitivity of the log central death rate to changes in ${K}_{t}$ at each age. Similarly, ${k}_{t,i,j}$ is the time-varying component of the jth additional factor with age sensitivity measure ${b}_{x,i,j}$. Compared to Li and Lee (2005), the PCFM allows for the incorporation of multiple additional factors where necessary, resulting in improved modelling results (e.g. Li et al. 2016).

The (conditional) maximum likelihood parameter estimates of the PCFM are calculated via an iterative updating scheme (see “Appendix” for details). In order to ensure model identification, the model is subject to (4n + 2) constraints, including ${\sum }_{x}{B}_{x}=1$, ${\sum }_{t}{K}_{t}=0$, ${\sum }_{x}{b}_{x,i,j}=1$, and ${\sum }_{t}{k}_{t,i,j}=0$. To determine the optimal number of additional factors, the Bayesian Information Criterion (BIC)^{Footnote 1} is used as the main statistical measure to strike a balance between goodness-of-fit and over-parameterisation. Other considerations include the patterns in the residual plots, the trends of the additional parameters, and the volume of data being under investigation.

After calibration of the model, the time-varying components ${K}_{t}$ and ${k}_{t,i,j}$ need to be projected into the future. Previous studies have shown that the common mortality index ${K}_{t}$ tends to be linear and decreasing for various countries (e.g. Li et al. 2016). Hence, ${K}_{t}$ can be modelled as a random walk with drift:

$${K}_{t}=\theta +{K}_{t-1}+{\delta }_{t}$$

where $\theta$ is the drift term and ${\delta }_{t}$ is a normally distributed random variable with mean 0 and variance ${\sigma }^{2}$. On the other hand, the time-varying terms ${k}_{t,i,j}$ are used to represent short-term deviations from the main trend for each sex, so a mean-reverting process would be preferred. Each ${k}_{t,i,j}$ is then assumed to follow a weakly stationary autoregressive AR(p) process:

$${k}_{t,i,j}={\alpha }_{0,i,j}+\sum \limits_{l=1}^{p}{\alpha }_{l,i,j} {k}_{t-l,i,j}+{\varepsilon }_{t,i,j}$$

where ${\alpha }_{0,i,j}$ and ${\alpha }_{l,i,j}$ are autoregressive parameters and ${\varepsilon }_{t,i,j}$ is a normal error term with mean 0 and variance ${\omega }_{i,j}^{2}$. The order p is chosen based on the partial autocorrelation functions (PACFs) of the time-varying components and also the autocorrelations of the residuals. Additionally, each ${k}_{t,i,j}$ is assumed to be independent of the others.^{Footnote 2} Under these conditions, future death rates (in year t > T) can be projected as:

$${\hat{m}}_{x,t,i}={\dot{m}}_{x,T,i}\exp\left({B}_{x}\left({\hat{K}}_{t}-{K}_{T}\right)+\sum \limits_{j=1}^{n}{b}_{x,i,j}\left({\hat{k}}_{t,i,j}-{k}_{T,i,j}\right)\right)$$

where ${\hat{K}}_{t}$ and ${\hat{k}}_{t,i,j}$ are projected values from their respective time series processes setting the error terms to zero, and the starting point of projection, ${\dot{m}}_{x,T,i}$, is taken from the latest set of data observed in year T. This arrangement helps to avoid significant bias in the beginning part of the projection period (Lee and Miller 2001). Using these projections, the projected male-to-female ratio of death rates can be expressed as:

$$\frac{{\hat{m}}_{x,t,2}}{{\hat{m}}_{x,t,1}}=\frac{{\dot{m}}_{x,T,2}}{{\dot{m}}_{x,T,1}}\exp\left(\sum \limits_{j=1}^{n}\left({b}_{x,2,j}\left({\hat{k}}_{t,2,j}-{k}_{T,2,j}\right)-{b}_{x,1,j}\left({\hat{k}}_{t,1,j}-{k}_{T,1,j}\right)\right)\right)$$

This projected ratio converges to a constant over time at each age if ${\hat{k}}_{t,1,j}$ and ${\hat{k}}_{t,2,j}$ also converge across time. This ‘coherence’ property (Li and Lee 2005) holds as long as the fitted time series processes of ${k}_{t,i,j}$ are all weakly stationary. In cases where a fitted process is not weakly stationary, an alternative model such as an AR(1) with $\left|{\alpha }_{1,i,j}\right|<1$ or a random walk without drift may also be used arbitrarily to preserve the coherence property.

We now propose the first modified version of the PCFM with variable sex-specific factors (PCFM-VSF), which relaxes the initial assumption that the number of additional factors is the same between both sexes. This approach can be interpreted as allowing for the existence of potential factors or trends that affect only one sex or impact each sex differently. For example, in recent decades, while life expectancies of both sexes have continually improved, male life expectancy has increased faster than female life expectancy. Li (2013) demonstrated that for Australian data, the observed difference in life expectancy at birth between both sexes was approximately 7 years in 1968, reduced to 4.5 years in 2007, and the gap can be projected to narrow to 3.1 years in 2050. There seem to be some underlying factors that drive male mortality to improve faster than female mortality. Utilising the same notation as above, the log central death rate is then modelled as:

$$\ln{m}_{x,t,i}={a}_{x,i}+{B}_{x} {K}_{t}+\sum \limits_{j=1}^{{n}_{i}}{b}_{x,i,j} {k}_{t,i,j}$$

in which there are n_i additional factors specifically for sex i. The parameter estimation procedure remains about the same with some modifications (see “Appendix”). Future death rates under this modified model are projected as:

$${\hat{m}}_{x,t,i}={\dot{m}}_{x,T,i}\exp\left({B}_{x}\left({\hat{K}}_{t}-{K}_{T}\right)+\sum \limits_{j=1}^{{n}_{i}}{b}_{x,i,j}\left({\hat{k}}_{t,i,j}-{k}_{T,i,j}\right)\right)$$

and the projected male-to-female ratio of death rates becomes:

$$\frac{{\hat{m}}_{x,t,2}}{{\hat{m}}_{x,t,1}}=\frac{{\dot{m}}_{x,T,2}}{{\dot{m}}_{x,T,1}}\exp\left(\sum \limits_{j=1}^{{n}_{2}}{b}_{x,2,j}\left({\hat{k}}_{t,2,j}-{k}_{T,2,j}\right)-\sum \limits_{j=1}^{{n}_{1}}{b}_{x,1,j}\left({\hat{k}}_{t,1,j}-{k}_{T,1,j}\right)\right)$$

Hyndman and Ullah (2007) proposed a generalised version of the Lee–Carter model using a functional data analysis approach. One of the extensions for modelling multiple groups therein considered a common age effect (CAE), where different populations share the same age-period sensitivity. Kleinow (2015) also proposed a CAE extension of the Lee–Carter model, and found that compared to a separate age effect for each country, the CAE resulted in a better model fit, based on the use of ten countries’ data. This common age effect concept can readily be adapted to the PCFM. Figure 1 shows the computed ${b}_{x,i,1}$ values, which are rescaled to be more comparable between both sexes. They represent the age sensitivity to the sex-specific short-term deviation from the main trend. It is interesting to see that the ${b}_{x,i,1}$ values from the original PCFM demonstrate peaks and troughs at fairly similar ages between females and males for England and Wales, Denmark, Canada, and the United States.^{Footnote 3} For instance, for the United States, the highest sensitivities can be found at around ages 35 and 70. This observation provides us an incentive to seek a more efficient use of model parameters. Accordingly, we propose another modification of the PCFM with a common age effect (PCFM-CAE), in which the age sensitivity measures in each additional factor are assumed to be equal between both sexes, that is, ${b}_{x,1,j}={b}_{x,2,j}$. This assumption allows a more parsimonious use of parameters and may lead to a lower BIC value in certain cases. As a result, the log central death rate is modelled as:

$$\ln{m}_{x,t,i}={a}_{x,i}+{B}_{x} {K}_{t}+\sum \limits_{j=1}^{n}{b}_{x,j} {k}_{t,i,j}$$

Again, some changes are needed for the parameter estimation procedure (see “Appendix”). Future death rates under the PCFM-CAE are projected using the same equation as the baseline PCFM, with ${b}_{x,i,j}$ replaced by ${b}_{x,j}$:

$${\hat{m}}_{x,t,i}={\dot{m}}_{x,T,i}\exp\left({B}_{x}\left({\hat{K}}_{t}-{K}_{T}\right)+\sum \limits_{j=1}^{n}{b}_{x,j}\left({\hat{k}}_{t,i,j}-{k}_{T,i,j}\right)\right)$$

and the projected male-to-female ratio of death rates is equal to:

$$\frac{{\hat{m}}_{x,t,2}}{{\hat{m}}_{x,t,1}}=\frac{{\dot{m}}_{x,T,2}}{{\dot{m}}_{x,T,1}}\exp\left(\sum \limits_{j=1}^{n}{b}_{x,j}(({\hat{k}}_{t,2,j}-{k}_{T,2,j})-({\hat{k}}_{t,1,j}-{k}_{T,1,j}))\right)$$

Note that since ${b}_{x,j}$ of both sexes are equal, there are two alternative conditions to make this projected ratio converge over time. The first is the same as before—both ${\hat{k}}_{t,1,j}$ and ${\hat{k}}_{t,2,j}$ need to converge. The second (less stringent) condition is that only convergence in ${\hat{k}}_{t,2,j}-{\hat{k}}_{t,1,j}$ is required, but not ${\hat{k}}_{t,1,j}$ and ${\hat{k}}_{t,2,j}$ themselves. For instance, ${\hat{k}}_{t,2,j}-{\hat{k}}_{t,1,j}$ is modelled by a weakly stationary time series process, while the time series process for ${\hat{k}}_{t,1,j}$ is not necessarily stationary.

Fitting results

The mortality data of Australia, England and Wales, France, West Germany, Denmark, Sweden, Canada, and the United States from 1970 to 2011 are obtained from the Human Mortality Database (HMD 2019). The eight developed countries are selected on the basis that they are good representatives of some major continents including Australasia, Europe, and North America. As shown in the following analysis, the data of these countries call for different model choices amongst the alternatives considered, highlighting the importance to have more flexibility in the modelling approach. These datasets are sorted by sex, calendar year, and single year of age. As the exposure to risk and death counts at advanced ages are generally too volatile,^{Footnote 4} the age range 0–89 is chosen to perform a more precise analysis. Moreover, in line with previous studies (Li 2013; Yang et al., 2016), the year 1970 is chosen as the start of the sample period in order to avoid the structural changes in mortality improvement^{Footnote 5} that occurred around that time. This choice ensures that the data used are relevant and helps to make projections more straightforward than otherwise.

The optimal model fit^{Footnote 6} is decided by analysing the BIC values and standardised residual plots. Table 1 shows the BIC results of the PCFM-VSF and PCFM-CAE for the eight selected populations. Note that the case of n = 0 refers to having no additional factors for both sexes. The baseline PCFM results are indeed included in the diagonal of the table, as the PCFM-VSF is effectively an expansion of the original model. For Australia, the baseline model results agree with those from Li (2013) and Yang et al. (2016) that a use of two additional factors is the optimal choice, leading to the lowest BIC value (72,718) on the diagonal. The PCFM-CAE results suggest a further improvement in the BIC (71,838), where there is no change in the optimal number of additional factors. By contrast, the PCFM-VSF results indicate that the optimal choice is to use two male-specific factors but only one female-specific factor (72,674). For the Australian data, both new models produce a better fit than the baseline model, in terms of the BIC, with the PCFM-CAE as the most optimal choice amongst all.

Table 1 BIC values of fitted PCFM (diagonal), PCFM-VSF (matrix), and PCFM-CAE (right column), Australia, England & Wales, France, West Germany, Denmark, Sweden, Canada, United States

Full size table

For England and Wales, there is no improvement shown by the PCFM-VSF over the baseline PCFM—both versions indicate n = 3 for both male- and female-specific factors (90,810). In comparison, the PCFM-CAE suggests one more additional factor for each sex (n = 4) which results in a better model fit under the common age effect (89,174).

For France, the baseline PCFM points to the use of four additional factors (94,176). Similar to the Australian results, the PCFM-VSF suggests that the optimal model choice is to remove one factor from the optimal baseline model, resulting in three female-specific factors and four male-specific factors (93,743). On the other hand, the PCFM-CAE leads to a different conclusion of putting more factors instead, in which the optimal choice is using five additional factors for each sex (92,980). For the French data, both new models deliver a better fit than the original model, again with the PCFM-CAE being the most optimal version.

For West Germany, the PCFM-VSF shows no improvement over the baseline PCFM here at n = 4 (101,580), but the PCFM-CAE suggests using two extra additional factors for each sex (n = 6), rather than just putting one more as in the previous cases, and leads to a lower BIC value (99,922).

For Denmark, the baseline PCFM suggests n = 1 (63,148), while the PCFM-VSF adds one more female-specific factor and generates a lower BIC value (63,109). On the other hand, the PCFM-CAE recommends using two additional factors for each sex (n = 2) and produces the lowest BIC value (62,848) amongst all the options.

For Sweden, the PCFM-VSF demonstrates no improvement over the baseline PCFM at n = 1 (64,927), while the PCFM-CAE gives a lower BIC value (64,301) via using n = 2. For Canada, the results are similar to those of England and Wales, in which the PCFM-VSF makes no improvement over the baseline PCFM at n = 2 (77,159), whereas the PCFM-CAE requires one more additional factor for each sex (n = 3) and produces a better model fit (76,381). Lastly, for the United States, there are some obvious differences in the model fitting results here compared to the other populations. While the baseline PCFM recommends n = 5 (114,904), the PCFM-VSF suggests placing an extra male-specific factor, leading to five female-specific and six male-specific factors (114,788). As before, the PCFM-CAE supports using more factors, and takes six additional factors for each sex as the optimal choice (115,012). But for the United States data, the PCFM-VSF shows the best model fit, whereas the PCFM-CAE is the least optimal among the three candidates.

Overall, in terms of the BIC, the PCFM-CAE leads to the best model fit for seven of the countries, the PCFM-VSF is the best choice in one case, and the baseline PCFM is the least optimal amongst the three options for seven countries. These results clearly show that the two proposed extensions outperform the original model in our fitting exercise. The underlying age sensitivity patterns may be quite similar between both sexes for certain countries and so setting a common age sensitivity measure may make the model more parsimonious. However, there may also be a resulting offsetting effect from the increase in the number of additional factors required. Moreover, a different number of additional factors for each sex can provide more flexibility for modelling mortality data. Regardless of whether it leads to the best model fit, it is interesting to see that there are more male-specific factors required than female-specific factors in the optimal PCFM-VSF for Australia, France, and the United States. It reflects that these male populations may have more complex mortality patterns or trends which demand a use of more model parameters.

Next, the standardised deviance residuals^{Footnote 7} from fitting the three optimal models to the eight datasets are examined. For instance, Fig. 2 displays the residuals for females in Australia, France, and the United States. Overall, the residuals look largely random when plotted against age or calendar year.^{Footnote 8} However, the residuals plotted against cohort year are relatively less random. Particularly, there are differences in the magnitude of these cohort patterns between the models. For Australia, although the PCFM-VSF improves the BIC value over the baseline PCFM, it comes at the cost of more obvious patterns in the residuals for females. This issue has also been found for France, though to a lesser extent. These observations are likely due to the optimal PCFM-VSF model having one less female-specific factor compared to the baseline PCFM. However, this difference is not evident for the United States, owing to the large number of factors being used in both models. For the PCFM-CAE, there are no noticeable differences in cohort residual randomness for Australia, France, and the United States when compared to the baseline PCFM. For England and Wales, as well as Canada (not shown here), the PCFM-CAE shows a slight improvement in the residuals over the baseline PCFM. In contrast, for West Germany (not shown here), the PCFM-CAE leads to slightly more pronounced patterns in the residuals. In fact, all these cohort patterns (mainly for some older individual cohorts) may be addressed by modifying the PCFM-VSF and PCFM-CAE to include a cohort factor in a similar approach to Yang et al. (2016).^{Footnote 9}

To conduct another check on the suitability of the model fit, the mean absolute percentage error (MAPE) values of the fitted log central death rates are displayed in Table 2. The MAPE is defined as:

Table 2 MAPE values of fitted log central death rates (and original scale)

Full size table

$$\frac{1}{{n}_{d}}\sum \limits_{x,t,i}\left|\frac{\ln{\tilde{m}}_{x,t,i}-\ln({d}_{x,t,i}/{E}_{x,t,i})}{\ln({d}_{x,t,i}/{E}_{x,t,i})}\right|$$

where ${n}_{d}$ is the number of data points, ${\tilde{m}}_{x,t,i}$ is the fitted log central death rate, and ${d}_{x,t,i}$ and ${E}_{x,t,i}$ are the observed number of deaths and exposure to risk respectively. The MAPE values are expressed in percentages below. In addition, the MAPE values on the fitted central death rates (original scale) are also provided in brackets. Note that no MAPE values of the PCFM-VSF are provided for those countries where the PCFM-VSF makes no improvement over the baseline PCFM.

All the MAPE values of the fitted log death rates are very small, largely being under 2%. The MAPE values of the fitted death rates are mostly around 5% or less, except for Denmark and Sweden which have more volatile data due to their smaller population sizes. The differences between the three models look rather immaterial, and none of the models has a clear advantage over the others. In terms of only the goodness-of-fit (MAPE), the performances of all the three models are satisfactory.

As mentioned earlier, there seems to be some trade-off between setting a common age sensitivity measure (fewer parameters) and placing more additional factors (more parameters). For seven of the countries being considered, the PCFM-CAE applications require incorporating one or two extra additional factors compared to the baseline PCFM. Consequently, there are more time-varying components in the structure and so more time series processes are needed to perform future projections and simulations. Despite a better model fit, using more time series assumptions may complicate the projection and simulation exercise, and caution must be taken on the stationarity and other properties of the time series processes adopted.

To further test the flexibility offered by the PCFM-VSF and PCFM-CAE, we consider a ‘hybrid’ between the two new models, in which the PCFM-CAE structure is extended to include one extra (either) male- or female-specific factor. Table 12 in the “Appendix” gives the BIC values of fitting this hybrid model to the eight datasets. The hybrid model turns out to be optimal for France, Denmark, and the United States. This model variation can help address the potential limitation of the PCFM-CAE in assuming all the same ${b}_{x,j}$ between females and males.

Projection results

This section examines the accuracy and coherence of the mortality forecasts produced by the optimal model under each of the three variations. As noted previously, the common mortality index ${K}_{t}$ is modelled as a random walk with drift, while the time-varying terms ${k}_{t,i,j}$ are modelled as AR(p) processes. The selected order p is based on the PACFs of the time-varying components, the autocorrelations of the residuals, and whether the fitted process is weakly stationary. When the fitted AR(p) process is not weakly stationary, a random walk without drift is used as a substitute in this section. The purpose is to ensure that the projected male-to-female ratio of death rates at each age eventually tends to a constant, thus avoiding the potential problems of unrealistic projected paths and mortality crossover between both sexes. An example requiring this arbitrary adjustment in our analysis is the second female- and male-specific factors of the PCFM-CAE for England and Wales, where the autoregressive parameters are greater than one, and if unadjusted, the corresponding projected death rate ratio does not converge, and the model forecasts are then not coherent.

Figure 3 illustrates the projected male-to-female ratios of death rates under the three models for four of the countries considered. In general, the three models show similar projection properties. There are some slight differences in the direction and speed of convergence in the projected ratios over time across the various age groups, but all the projected ratios converge ultimately (though very slowly in some cases) and appear roughly in line with the past trends. One potential issue arises with the PCFM-CAE projections for West Germany—because the optimal model has two more additional factors than the baseline PCFM, there are four extra time-varying components being projected. Using more time series processes due to the extra components results in the projected ratios having much stronger autoregressive effects, being most evident in the 0–9 and 10–29 age groups.^{Footnote 10} Although these projected ratios still converge gradually over time and the projected levels reflect the past data reasonably well, the fluctuations may nevertheless be undesirable for forecasting purposes. This problem could be mitigated by using AR(1) models as opposed to AR(p) models for the time-varying components, but this alternative method is rather arbitrary.

The accuracy of the mortality forecasts is assessed via out-of-sample testing, also known as backtesting, in which the PCFM, PCFM-VSF, and PCFM-CAE mortality projections are compared against the actual observations. The data period is split into two parts—the first 30 years’ data are used to fit each model and the remaining 12 years are used to evaluate the forecasting performance of that model. The projection accuracy is measured by the MAPE of the projected log central death rates for each sex, defined in a similar manner to the previous section.

Table 3 presents the MAPE values of the projected estimates of the eight countries under the three models. The MAPE values of the projected male and female log death rates are mostly less than 4%, and the differences between the models are very small. It is promising to see that with fewer parameters (roughly about 100 fewer on average), the new models can still produce a similar forecasting performance. When considering the projected male-to-female ratios, the PCFM-CAE noticeably outperforms the other two models for most of the countries considered. The more obvious cases are France (5.38%), West Germany (5.01%), and the United States (3.84%). For France, age groups 0–9 and 30–49 are the main contributors to the difference, while for West Germany, age group 10–29 is where the key difference lies.^{Footnote 11} These results suggest that the PCFM-CAE may be a more appropriate model in terms of forecasting the male-to-female death rate ratios.

Table 3 MAPE values of projected log central death rates and projected male-to-female ratios

Full size table

When the PCFM-VSF is compared with the baseline PCFM, the differences are not as clear. Regarding the projected male-to-female ratios, the PCFM-VSF outperforms the baseline PCFM for West Germany, slightly underperforms instead for France, and the difference between the two models for the United States is negligible. For France, the PCFM-VSF is slightly worse at the very end of the projection in age groups 0–9 and 10–29. For West Germany, the PCFM-VSF is better at ages 10–29. With fewer (only three) populations to compare, it is difficult to come to any conclusion as to whether the PCFM-VSF would outperform the baseline PCFM.

It is important to understand that the results of out-of-sample testing may change considerably if a different split between the sample and testing periods is used. To further investigate the forecast accuracy of the three models, the analysis is repeated with a sample period of 25 years and a testing period of 17 years. The results of this second out-of-sample test are also presented in Table 3. Regarding the MAPE values of the projected log death rates, the second test tends to produce larger errors than the first test, but there is again not much difference between the models. For the MAPE values of the projected male-to-female ratios, the longer testing period in the second test clearly results in lower accuracy, particularly for Canada and the United States (around 10% to 13%). The projected male-to-female ratios depart more from the actual values at age groups 10–29 and 30–49 for France, Canada, and the United States in the second test. For these three countries, there is an unexpected sudden change in direction in the trends of the observed ratios at ages 10–49. In this test, the PCFM-VSF and PCFM-CAE largely outperform the baseline PCFM in projecting the male-to-female ratios of death rates.

The out-of-sample testing is further conducted on a rolling-window basis, in which successive splits between the sample and testing periods are used (the last 5 years as the testing period and all the previous years as the sample period, then the last 6 years as the testing period and the previous years as the sample period, and so on until the last 20 years as the testing period). Table 4 gives the MAPE values averaged over all the splits. While the differences in the projected log death rates are minimal between the three models, the PCFM-CAE still tends to outperform the baseline PCFM regarding the projected death rate ratios.

Table 4 Average MAPE values of projected log central death rates and projected male-to-female ratios (rolling-window basis)

Full size table

We also project life expectancies by extrapolating death rates to age 109, assuming that $m_x=0.7$ eventually (Gampe, 2010) and using the Coale and Guo (1989) method to close out the life table. Table 5 shows the MAPE values of the projected life expectancies at ages 0 and 60 using the rolling-window basis. Again, the differences in accuracy between the three models are negligible. Though the new models use much fewer parameters than the baseline model in general, they can still make life expectancy forecasts of highly comparable quality. It is clear to see the efficiency of the new modifications in forecasting future mortality.

Table 5 Average MAPE values of projected life expectancy values (rolling-window basis)

Full size table

As discussed above, the models sometimes produce death rate ratio forecasts that are quite different to what has actually happened, e.g. ages 10–29 for Australia and West Germany, and ages 30–69 for France. While it would be difficult for the models to accurately predict a sudden reversal in the trends, it is still useful if the models can produce prediction intervals which capture the possibility of these abrupt changes. Accordingly, we use the semi-parametric bootstrapping process (Brouhns et al. 2005) to generate the 95% prediction intervals for each age group under the three models, and calculate the actual proportions that the observed ratios move outside the intervals. Table 6 lists the percentage of the observed ratios not being captured by the prediction intervals under each model, using a sample period of 35 years and a testing period of 7 years. The PCFM-CAE prediction intervals work fairly well for Australia, England and Wales, Denmark, Canada, and the United States, in which the percentages are 0% to 11% (quite close to the theoretical value of 5%). The PCFM-CAE outperforms the baseline PCFM in this analysis. However, if a longer testing period is used, the proportions of getting outside the prediction intervals may increase. Table 7 gives the corresponding percentage of the observed life expectancy values (at birth) falling outside the prediction intervals under each model. Figure 4 plots the projected and observed values in the testing period with the prediction intervals for females in West Germany and the United States. As shown, the observed values move well within the prediction intervals in many of the cases. For France and West Germany, though the observed values go outside the prediction intervals for a number of times, the extent is mild and the observed values still move close to the bounds. There is also a tendency for the PCFM-VSF and PCFM-CAE to generate slightly wider prediction intervals than the baseline PCFM. It may be caused by the fact that the PCFM-VSF allows a different number of additional factors between both sexes, and that the PCFM-CAE fitting above often requires using more additional factors.

Table 6 Proportions of observed male-to-female ratios of death rates falling outside 95% prediction intervals

Full size table

Table 7 Proportions of observed life expectancy values at birth falling outside 95% prediction intervals

Full size table

Note that although the semi-parametric bootstrapping already takes parameter error into account, the resulting variability in the simulations may still not capture all the potential abrupt changes, as the patterns in the past data are rather steady. Other methods such as Bayesian Markov chain Monte Carlo simulation (Czado et al. 2005) may also be sought.

In summary, the results in these two sections clearly suggest that the PCFM-CAE and PCFM-VSF have potential advantages over the baseline PCFM. Qualitatively, the new models provide more flexibility; quantitatively, they are shown to be able to deliver a potentially better performance in model fitting and future projections and simulations. In particular, the new modifications can produce a comparable forecasting performance on the death rates using fewer parameters, and at the same time generate better forecasts for the death rate ratios. As demonstrated in Yang et al. (2016), the way to project the death rate ratios can have a significant impact on the pricing of pensions and annuities between the two sexes for the sponsoring entities and insurers. It is very important to have a proper modelling of the death rate ratios.

Comparison with other models

This section further compares the forecasting performances between the proposed models, the product-ratio method proposed by Hyndman et al. (2013), and the sex-ratio approach suggested by Bergeron-Boucher et al. (2018). Suppose ${p}_{x,t}=\sqrt{{m}_{x,t,1} {m}_{x,t,2}}$ and ${r}_{x,t}=\sqrt{{m}_{x,t,1}/{m}_{x,t,2}}$ are the square roots of the products and ratios of central death rates. We apply the product-ratio method as:

$$\begin{aligned} & \ln p_{x,t} = \eta_{x} + \phi_{x,1} \lambda_{t,1} + \phi_{x,2} \lambda_{t,2} + \cdots + \phi_{x,g} \lambda_{t,g} \\ & \ln r_{x,t} = \xi_{x} + \psi_{x,1} \gamma_{t,1} + \psi_{x,2} \gamma_{t,2} + \cdots + \psi_{x,h} \gamma_{t,h} \\ \end{aligned}$$

The parameters ${\eta }_{x}$ and ${\xi }_{x}$ are the average of $\ln{p}_{x,t}$ and $\ln{r}_{x,t}$ over time, and ${\phi }_{x,j}$, ${\lambda }_{t,j}$, ${\psi }_{x,j}$, and ${\gamma }_{t,j}$ are the principal components of $\ln{p}_{x,t}$ and $\ln{r}_{x,t}$. As females and males have about the same data size, their log death rates have approximately equal variances, and so $\ln{p}_{x,t}$ and $\ln{r}_{x,t}$ are largely uncorrelated. The principal component analysis can then be applied separately to each term to compute the model parameters. Following Hyndman et al. (2013), we set $g=h=6$. We assume that ${\lambda }_{t,1}$ follows a random walk with drift while all the other time-varying parameters follow an AR(p) process. Particularly, the time series processes for $\ln{r}_{x,t}$ have to be weakly stationary such that the projected death rate ratio at each age converges to a constant over time. On the other hand, the sex-ratio approach proposed by Bergeron-Boucher et al. (2018) assumes a Lee–Carter kind of structure for the log death rate ratio:

$$\ln({m}_{x,t,2}/{m}_{x,t,1})={\alpha }_{x}+I(x<45){\beta }_{x}^{(1)} {\kappa }_{t}^{(1)}+I(x\ge 45){\beta }_{x}^{(2)} {\kappa }_{t}^{(2)}$$

The term ${\alpha }_{x}$ is the age effect and ${\beta }_{x}^{(1)}$ is the age sensitivity to the time-varying parameter ${\kappa }_{t}^{(1)}$ for ages below 45. For ages 45 and over, there is another set of parameters ${\beta }_{x}^{(2)}$ and ${\kappa }_{t}^{(2)}$ to separate the male excess cancer mortality from the male excess accident mortality. Here we assume that the female death rates follow the Lee–Carter model, and that the time-varying parameters ${\kappa }_{t}^{(1)}$ and ${\kappa }_{t}^{(2)}$ follow a weakly stationary AR(p) process. The female death rates are projected by the equation ${\hat{m}}_{x,t,1}={\dot{m}}_{x,T,1}\exp({b}_{x,1}({\hat{k}}_{t,1}-{k}_{T,1}))$, and then the male death rates are projected as ${\hat{m}}_{x,t,2}={\hat{m}}_{x,t,1}\frac{{\dot{m}}_{x,T,2}}{{\dot{m}}_{x,T,1}}\exp(I(x<45){\beta }_{x}^{(1)}({\hat{\kappa }}_{t}^{(1)}-{\kappa }_{T}^{(1)})+I(x\ge 45){\beta }_{x}^{(2)}({\hat{\kappa }}_{t}^{(2)}-{\kappa }_{T}^{(2)}))$.

Table 8 compares the MAPE values computed from the product-ratio method, the sex-ratio approach, and the PCFM-CAE for each country using the rolling-window basis. For the MAPE values of the projected death rates, the performances of the three approaches are quite close. The product-ratio method tends to perform better for female death rates, while the PCFM-CAE and the sex-ratio approach tend to perform better for male death rates. But for the MAPE values of the projected male-to-female ratios, the PCFM-CAE and the sex-ratio approach generally outperform the product-ratio method. The PCFM-CAE performs slightly better than the sex-ratio approach for West Germany, Denmark, Sweden, and the United States, while it is reversed for the other four countries. Moreover, Table 9 shows the corresponding MAPE values of the projected life expectancies at ages 0 and 60. The sex-ratio approach tends to produce lower MAPE values for male life expectancy, while the PCFM-CAE and the product-ratio method tend to produce lower MAPE values for female life expectancy.

Table 8 Average MAPE values of projected log central death rates and projected male-to-female ratios (rolling-window basis)

Full size table

Table 9 Average MAPE values of projected life expectancy values (rolling-window basis)

Full size table

Furthermore, Tables 10 and 11 provides the percentages of the observed death rate ratios and the observed life expectancy values falling outside their prediction intervals under each approach (generated from semi-parametric bootstrapping). Figure 5 plots the observed and projected life expectancy values with the prediction intervals for the product-ratio method and the sex-ratio approach. It can be seen that the prediction intervals generated by the sex-ratio approach appear to be too narrow, and so they cannot capture the observed values at a sufficient level compared to the PCFM-CAE and the product-ratio method. The underlying reason may be the use of only three time-varying parameters under the sex-ratio approach.

Table 10 Proportions of observed male-to-female ratios of death rates falling outside 95% prediction intervals

Full size table

Table 11 Proportions of observed life expectancy values at birth falling outside 95% prediction intervals

Full size table

Overall, it appears that the PCFM-CAE can provide both a relatively accurate way to forecast the death rates and their ratios and also a more proper allowance for the potential variability, when compared to the other two approaches. Note that the sex-ratio approach adopted here makes use of the Lee–Carter model for the female death rates. Its performance may be significantly different if another mortality model is used.

Concluding remarks

In this paper, we have explored a number of modifications to the PCFM proposed by Li (2013) (which is an extension of Li and Lee, 2005) and evaluated the data fitting and forecasting performance using mortality data from eight populations. The new modifications involve allowing for a different number of additional factors for each sex and incorporating a common age effect in the additional factors between both sexes. Our analysis shows that on the whole the modified PCFM-CAE and PCFM-VSF produce better fitting and projection results than the baseline PCFM. Firstly, the BIC results clearly indicate that the two proposed models lead to a better fitting performance for the data of several countries. The benefit of using fewer parameters more than offsets the corresponding impact on the likelihood value. Secondly, despite the use of much fewer parameters, the forecasting performance is maintained overall, and can even be improved when dealing with the death rate ratios. It evidently represents a much more efficient use of model parameters. Moreover, the two new models tend to generate wider prediction intervals than those produced from the baseline PCFM. The wider intervals would then be able to capture the future actual values more satisfactorily. In addition, the PCFM-CAE produces better death rate ratio projections than the product-ratio method, and it also yields much wider life expectancy prediction intervals than the sex-ratio approach.

It is clear that these new modifications can offer more flexibility to deal with the varying features and patterns in different populations. The hybrid model in the “Appendix”, combining the PCFM-CAE and PCFM-VSF, can further improve the fitting performance for some countries. Note that the projected death rate ratios can have a significant effect on the pricing of pensions and annuities between both sexes. Yang et al. (2016) showed that an incorrect projection of the ratios could lead to an inappropriate cross-subsidisation between female and male annuitants. From the insurer’s or sponsor’s perspective, it is of critical importance to model the relationship between female and male mortality properly.

There are some possible areas for future research. It would be interesting to apply these models to more populations’ datasets and compare the results, to further explore the level of flexibility they would offer. Moreover, it would be useful to implement these models in practical scenarios such as government policy planning and insurance pricing, since a realistic joint projection of female and male mortality is very important in valuing annuities, pensions, and social benefits. Furthermore, a more detailed investigation on the use of time series models would be warranted. The choice of time series modelling affects not only the central projection but also the level of variability in the simulation. For demographic or insurance studies requiring probabilistic calculations, one should be cautious about the specific effects of assuming a particular time series process. As a quick comparison, we have tried using an AR(1) process arbitrarily for all the computed time series. Figure 7 in the “Appendix” shows that the projected male-to-female ratios of death rates then converge faster than what have been shown previously. It means that the pace of reaching coherence would be negatively related to the order of the AR(p) process. Despite the various statistical tests to determine the order of the time series process, it remains partially a matter of judgement on how fast the projected ratios should converge over time, if one decides to make coherent mortality forecasts.

Notes

The BIC is calculated as $-2\hat{l}+{n}_{p}\ln({n}_{d})$, where $\hat{l}$ is the computed log-likelihood, ${n}_{p}$ is the effective number of parameters being estimated, and ${n}_{d}$ is the number of observations.
An alternative is to model the ${k}_{t,i,j}$ terms as multivariate time series when assessing uncertainty of mortality changes, but this approach would complicate the projection exercise here with a rather short data period. It can be explored in future research.
For France, West Germany, and Sweden, the differences between both sexes are more obvious at some age ranges, and for Australia, females and males show rather opposite sensitivities at ages 30 to 60. These differences may arise from genuine differences, or merely sampling errors as the sex-specific factors are rather ‘secondary effects’ in the model when compared to the common factor. The former may be captured by using a hybrid model as discussed in the “Appendix”. If it is the latter, using a common age sensitivity may help reduce the effect of sampling errors.
The volatile patterns at advanced ages tend to distort the model fitting and require a further separate analysis, e.g. see Thatcher (1999).
Booth et al. (2002) developed a statistical measure for selecting the optimal fitting period for the original Lee-Carter model. Li et al. (2011) investigated the detection of structural changes and their impact on forecasting.
The R codes for fitting the PCFM, PCFM-VSF, and PCFM-CAE can be provided upon request.
To account for possible over-dispersion in the data, the residuals are standardised with respect to the dispersion parameter (Renshaw and Haberman 2006). The equation is provided in the “Appendix”. For some countries, the extent of dispersion appears to be greater at older ages, which call for expressing the dispersion as a function of age. We leave this possible treatment for future research.
Besides graphical examination, we have also performed the signs test on the residuals over each age, each calendar year, and each cohort year, and examined whether there are ‘too many’ or ‘too few’ positive residuals in each case. On average, only a few percent of the ages considered have significantly too many or too few positive residuals, about 10% of the years are significant, and around 20% of the cohorts are significant. If the number of additional parameters is increased to, say, six in all the models, the last two values drop progressively to less than 10% and 20% respectively, but the BIC values do not support incorporating too many parameters. More details are given in the “Appendix”.
We have tested an addition of a cohort parameter ${g}_{t-x}$ to the PCFM-VSF and PCFM-CAE. The cohort patterns are then largely removed, and the BIC values are further decreased in some (but not all) cases. These results are given in the “Appendix”.
It is worth noting that the fitted AR(p) processes for three of the four extra time-varying components in the PCFM-CAE for West Germany are of order 3 or higher.
Again, it is worth noting that the optimal models for West Germany contain some high order AR(p) processes, resulting in some highly fluctuating projections that may overly reflect the noise in the past data. Additionally, in this out-of-sample testing, the optimal PCFM-CAE has six additional factors, whereas the baseline PCFM has three.

References

Bergeron-Boucher, M. P., Canudas-Romo, V., Pascariu, M., & Lindahl-Jacobsen, R. (2018). Modeling and forecasting sex differences in mortality: A sex-ratio approach. Genus,74(20), 1–28.
Google Scholar
Booth, H., Maindonald, J., & Smith, L. (2002). Applying Lee–Carter under conditions of variable mortality decline. Population Studies,56(3), 325–336.
Article Google Scholar
Booth, H., & Tickle, L. (2008). Mortality modelling and forecasting: A review of methods. Annals of Actuarial Science,3(I/II), 3–43.
Article Google Scholar
Brouhns, N., Denuit, M., & van Keilegom, I. (2005). Bootstrapping the Poisson log-bilinear model for mortality forecasting. Scandinavian Actuarial Journal,2005(3), 212–224.
Article Google Scholar
Brouhns, N., Denuit, M., & Vermunt, J. K. (2002). A Poisson log-bilinear regression approach to the construction of projected lifetables. Insurance: Mathematics and Economics,31(3), 373–393.
Google Scholar
Cairns, A. J., Blake, D., & Dowd, K. (2008). Modelling and management of mortality risk: A review. Scandinavian Actuarial Journal,2008(2–3), 79–113.
Article Google Scholar
Cairns, A. J., Blake, D., Dowd, K., Coughlan, G. D., Epstein, D., Ong, A., et al. (2009). A quantitative comparison of stochastic mortality models using data from England and Wales and the United States. North American Actuarial Journal,13(1), 1–35.
Article Google Scholar
Coale, A., & Guo, G. (1989). Revised regional model life tables at very low levels of mortality. Population Index,55(4), 613–643.
Article Google Scholar
Czado, C., Delwarde, A., & Denuit, M. (2005). Bayesian Poisson log-bilinear mortality projections. Insurance: Mathematics and Economics,36, 260–284.
Google Scholar
Gampe, J. (2010). Human mortality beyond age 110. In H. Maier, J. Gampe, B. Jeune, J. W. Vaupel, & J.-M. Robine (Eds.), Supercentenarians (pp. 219–230). Springer Berlin: Demographic research monographs.
Chapter Google Scholar
Human Mortality Database. (2019). University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). Retrieved July 1, 2019 from www.mortality.org.
Hyndman, R. J., & Ullah, M. S. (2007). Robust forecasting of mortality and fertility rates: A functional data approach. Computational Statistics & Data Analysis,51(10), 4942–4956.
Article Google Scholar
Hyndman, R. J., Booth, H., & Yasmeen, F. (2013). Coherent mortality forecasting: The product-ratio method with functional time series models. Demography,50(1), 261–283.
Article Google Scholar
Kleinow, T. (2015). A common age effect model for the mortality of multiple populations. Insurance: Mathematics and Economics,63, 147–152.
Google Scholar
Lee, R. D., & Carter, L. R. (1992). Modeling and forecasting US mortality. Journal of the American Statistical Association,87(419), 659–671.
Google Scholar
Lee, R., & Miller, T. (2001). Evaluating the performance of the Lee–Carter method for forecasting mortality. Demography,38(4), 537–549.
Article Google Scholar
Li, J. (2013). A Poisson common factor model for projecting mortality and life expectancy jointly for females and males. Population Studies,67(1), 111–126.
Article Google Scholar
Li, J., & Haberman, S. (2015). On the effectiveness of natural hedging for insurance companies and pension plans. Insurance: Mathematics and Economics,61, 286–297.
Google Scholar
Li, J., Tickle, L., & Parr, N. (2016). A multi-population evaluation of the Poisson common factor model for projecting mortality jointly for both sexes. Journal of Population Research,33(4), 333–360.
Article Google Scholar
Li, J. S. H., Chan, W. S., & Cheung, S. H. (2011). Structural changes in the Lee–Carter mortality indexes: Detection and implications. North American Actuarial Journal,15(1), 13–31.
Article Google Scholar
Li, N., & Lee, R. (2005). Coherent mortality forecasts for a group of populations: An extension of the Lee–Carter method. Demography,42(3), 575–594.
Article Google Scholar
Parr, N., Li, J., & Tickle, L. (2016). A cost of living longer: Projections of the effects of prospective mortality improvement on economic support ratios for 14 advanced economies. Population Studies,70(2), 181–200.
Article Google Scholar
Renshaw, A. E., & Haberman, S. (2006). A cohort-based extension to the Lee-Carter model for mortality reduction factors. Insurance: Mathematics and Economics,38(3), 556–570.
Google Scholar
Thatcher, A. R. (1999). The long-term pattern of adult mortality and the highest attained age. Journal of the Royal Statistical Society: Series A (Statistics in Society),162(1), 5–43.
Article Google Scholar
Yang, B., Li, J., & Balasooriya, U. (2016). Cohort extensions of the Poisson common factor model for modelling both genders jointly. Scandinavian Actuarial Journal,2016(2), 93–112.
Article Google Scholar

Download references

Acknowledgments

The authors would like to thank the editor and the referees for their valuable comments and suggestions which greatly enhance the presentation of the paper.

Author information

Authors and Affiliations

Department of Actuarial Studies and Business Analytics, Macquarie University, Sydney, NSW, 2109, Australia
Kenneth Wong, Jackie Li & Sixian Tang

Authors

Kenneth Wong
View author publications
You can also search for this author in PubMed Google Scholar
Jackie Li
View author publications
You can also search for this author in PubMed Google Scholar
Sixian Tang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kenneth Wong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The iterative updating scheme used to estimate the parameters of the PCFM, PCFM-VSF, PCFM-CAE, and the hybrid model is designed to minimise the deviance function:

$${\text{deviance}}=\sum \limits_{x,t,i}2\left[{d}_{x,t,i}\ln\left(\frac{{d}_{x,t,i}}{{\hat{d}}_{x,t,i}}\right)-{d}_{x,t,i}+{\hat{d}}_{x,t,i}\right]$$

or equivalently, maximise the log likelihood function:

$$l=\ln L=\sum \limits_{x,t,i}\left[{d}_{x,t,i}\ln({\hat{d}}_{x,t,i})-{\hat{d}}_{x,t,i}-\ln({d}_{x,t,i}!)\right]$$

where ${d}_{x,t,i}$ is the observed number of deaths at age x in year t for sex i, and ${\hat{d}}_{x,t,i}$ is the corresponding fitted number of deaths. For example, under the PCFM-CAE, the fitted number of deaths is calculated as:

$${\hat{d}}_{x,t,i}={E}_{x,t,i} {\hat{m}}_{x,t,i}={E}_{x,t,i}\exp\left({\hat{a}}_{x,i}+{\hat{B}}_{x} {\hat{K}}_{t}+\sum \limits_{j=1}^{n}{\hat{b}}_{x,j} {\hat{k}}_{t,i,j}\right)$$

The parameters are updated using the Newton–Raphson method:

$${\theta }^{*}=\theta -\frac{\frac{\partial l}{\partial \theta }}{\frac{{\partial }^{2}l}{\partial {\theta }^{2}}}$$

However, previous studies (e.g. Li 2013) have shown that updating all parameters simultaneously does not necessarily converge to any solution or lead to an optimal solution, as the likelihood function may be approximately flat or the iterations may be trapped in some local maxima. For practical purposes, a multi-stage estimation method (with a conditional maximum likelihood) is used here instead, as similarly suggested in Booth et al. (2002).

Below is an illustration of the updating procedure for the PCFM-CAE. First, the model is fitted with the common factor only, i.e. n = 0. Assuming an age range of 0–89:

(1)
Set up initial parameter values (${\hat{a}}_{x,i}$ = mean of log death rates at age x of sex i over time, ${\hat{B}}_{x}={\hat{b}}_{x,j}$ = 1/90, ${\hat{K}}_{t}={\hat{k}}_{t,i,j}$ = 0), and calculate ${\hat{d}}_{x,t,i}$.
(2)
Update ${\hat{a}}_{x,i}^{*}={\hat{a}}_{x,i}+{\sum }_{t}({d}_{x,t,i}-{\hat{d}}_{x,t,i})/{\sum }_{t}{\hat{d}}_{x,t,i}$ for all x and i, and then recalculate all ${\hat{d}}_{x,t,i}$.
(3)
Update ${\hat{K}}_{t}^{*}={\hat{K}}_{t}+{\sum }_{x,i}({d}_{x,t,i}-{\hat{d}}_{x,t,i}){\hat{B}}_{x}/{\sum }_{x,i}{\hat{d}}_{x,t,i} {\hat{B}}_{x}^{2}$ for all t, adjusted with ${\sum }_{t}{\hat{K}}_{t}^{*}=0$, and then recalculate all ${\hat{d}}_{x,t,i}$.
(4)
Update ${\hat{B}}_{x}^{*}={\hat{B}}_{x}+{\sum }_{t,i}({d}_{x,t,i}-{\hat{d}}_{x,t,i}){\hat{K}}_{t}/{\sum }_{t,i}{\hat{d}}_{x,t,i} {\hat{K}}_{t}^{2}$ for all x, and then recalculate all ${\hat{d}}_{x,t,i}$.
(5)
Divide ${\hat{B}}_{x}$ by ${\sum }_{x}{\hat{B}}_{x}$ and multiply ${\hat{K}}_{t}$ by ${\sum }_{x}{\hat{B}}_{x}$.
(6)
Compute the log likelihood function or deviance function.
(7)
Repeat (2) to (6) until the log likelihood or deviance function converges.
Now, treating the estimated parameters above as given, the additional sex-specific factors are incorporated one at a time and the new parameters are estimated as below:
(8)
Update ${\hat{k}}_{t,i,j}^{*}={\hat{k}}_{t,i,j}+{\sum }_{x}({d}_{x,t,i}-{\hat{d}}_{x,t,i}){\hat{b}}_{x,j}/{\sum }_{x}{\hat{d}}_{x,t,i} {\hat{b}}_{x,j}^{2}$ for all t, all i, and j = 1, adjusted with ${\sum }_{t}{\hat{k}}_{t,i,j}^{*}=0$, and then recalculate all ${\hat{d}}_{x,t,i}$.
(9)
Update ${\hat{b}}_{x,j}^{*}={\hat{b}}_{x,j}+{\sum }_{t,i}({d}_{x,t,i}-{\hat{d}}_{x,t,i}){\hat{k}}_{t,i,j}/{\sum }_{t,i}{\hat{d}}_{x,t,i} {\hat{k}}_{t,i,j}^{2}$ for all x and j = 1, and then recalculate all ${\hat{d}}_{x,t,i}$.
(10)
Divide ${\hat{b}}_{x,j}$ by ${\sum }_{x}{\hat{b}}_{x,j}$ and multiply ${\hat{k}}_{t,i,j}$ by ${\sum }_{x}{\hat{b}}_{x,j}$.
(11)
Compute the log likelihood function or deviance function.
(12)
Repeat (8) to (11) until the log likelihood or deviance function converges.
(13)
Treating the newly estimated ${\hat{b}}_{x,j}$ and ${\hat{k}}_{t,i,j}$ as given as well, repeat (8) to (12) for the other values of j.

This conditional maximum likelihood approach offers a more convenient fitting strategy and does not have any problem regarding the convergence in the parameters or the log likelihood / deviance function. The iterative procedure can readily be adopted for the PCFM-VSF and the hybrid model.

The standardised deviance residuals are calculated as:

$${r}_{x,t,i}={\text{sign}}({d}_{x,t,i}-{\hat{d}}_{x,t,i})\sqrt{\frac{2({d}_{x,t,i}\ln({d}_{x,t,i}/{\hat{d}}_{x,t,i})-{d}_{x,t,i}+{\hat{d}}_{x,t,i})}{\hat{\phi }}}$$

where ${n}_{p}$ is the effective number of parameters being estimated, ${n}_{d}$ is the number of observations, and the dispersion parameter is calculated as $\hat{\phi }=\frac{\text{deviance}}{{n}_{d}-{n}_{p}}$. In addition to graphical examination, we conduct the signs test on the residuals over each age, calendar year, and cohort year. For instance, if the goodness-of-fit is sufficient, the number of positive residuals would be distributed as ${\text{Binomial}}\left(\mathrm{42,0.5}\right)$ at each age, ${\text{Binomial}}\left(\mathrm{90,0.5}\right)$ in each year, and ${\text{Binomial}}\left(\mathrm{42,0.5}\right)$ for each of those cohorts who are covered in every year in the data. Note that incorporating more additional factors can improve the model fit and so the randomness of the residuals, but the corresponding BIC values may suggest against doing so.

We also consider a ‘hybrid’ model, in which the PCFM-CAE structure is modified with one extra (either) male- or female-specific factor. Table 12 gives the BIC values of fitting this hybrid model to the eight datasets. While the PCFM-CAE stays as the optimal model for Australia, England and Wales, West Germany, Sweden, and Canada, the hybrid model becomes the optimal choice for France, Denmark, and the United States.

Table 12 BIC values of fitted PCFM-CAE and hybrid between PCFM-CAE and PCFM-VSF, Australia, England & Wales, France, West Germany, Denmark, Sweden, Canada, and United States

Full size table

We also try adding a cohort parameter ${g}_{t-x}$ to both the PCFM-VSF and PCFM-CAE. The lowest BIC values are achieved by the baseline PCFM for the United States (108,611) and still by the PCFM-CAE for the others (Australia 71,878; England and Wales 84,342; France 86,699; West Germany 88,238; Canada 76,452). These values are lower than those before adding the cohort parameter for England and Wales, France, West Germany, and the United States, but not for Australia and Canada. The corresponding residual plots against cohort year are given in Fig. 6. It can be seen that most of the earlier cohort patterns are eliminated after incorporating the cohort parameter. The signs test also suggests that there is an improvement in the randomness of the residuals, in which the percentage of cohorts with too many or too few positive residuals decreases to less than 10% on average.

Furthermore, we experiment with using simply an AR(1) process for all the time-varying components of the additional factors. As shown in Fig. 7 , the projected male-to-female ratios of death rates generally converge over time faster than those in Fig. 3. The earlier problem of fluctuating PCFM-CAE projections for West Germany no longer exists. It may then be deduced that the speed of reaching coherence is negatively associated with the order of the AR(p) process. Besides conducting standard statistical tests to select the order of the process, based on a limited length of data points, one must also examine how fast the projected ratios converge over time, which could have a significant impact on demographic projections and insurance pricing.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Wong, K., Li, J. & Tang, S. A modified common factor model for modelling mortality jointly for both sexes. J Pop Research 37, 181–212 (2020). https://doi.org/10.1007/s12546-020-09243-z

Download citation

Published: 17 March 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s12546-020-09243-z

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A modified common factor model for modelling mortality jointly for both sexes

Abstract

Similar content being viewed by others

A multi-population evaluation of the Poisson common factor model for projecting mortality jointly for both sexes

Bayesian Poisson log-bilinear models for mortality projections with multiple populations

Modelling Nigerian Female Mortality: An Application of Four Stochastic Mortality Models

Introduction

Two modifications of PCFM

Fitting results

Projection results

Comparison with other models

Concluding remarks

Notes

References

Acknowledgments