1 Summary

Hawkes Processes

Hawkes processes form a specific category of point processes. Their main characteristic is that each event “self-excites” the process, which means that an occurrence increases the probability to have another arrival in a short time period. They were first introduced by Hawkes (1971) and are now commonly used for several applications: earthquake modelling Bacry and Muzy (2014), criminology Mohler (2011), finance Bacry et al. (2015), etc.

Hawkes Processes for Insurance

Since recently, insurance companies are developing an interest for Hawkes processes. They are used for calculating Solvency Capital Requirements and modelling different indicators of risks, such as ruin (improvement of Cramer-Lundberg model, see Cheng and Seol (2020)) or cyber-attacks (see Bessy-Roland et al. (2020)). Hawkes processes model claims arrival, considered to follow a Poisson process in classic approaches.

Article Purpose

Up to now, models using Hawkes processes in the literature mainly use an exponential excitation function, whose theory was developed in Hawkes’ original papers Hawkes (1971). For this excitation function, the self-exciting effect of each occurrence is maximal shortly after the event. However, for most insurance use cases, we need to take into account a delay between the event and its exciting effect on the process. The main originality of this paper is to propose the Gamma density as an excitation function suited to this context and to test on real data that this approach is a better choice in insurance situations. We introduce and discuss some practical situations on which the Gamma density fits better than exponential decay. The Gamma density excitation function allows us to deal with this delay and could be seen as a generalization of the exponential excitation function, since both are equivalent for a particular set of parameters.

At our best knowledge in the literature, there only exists limit results on first and second order statistics about this category of Hawkes processes. In Bacry and Muzy (2014), the authors state the Wiener-Hopf equations followed by the kernel matrix functions for all t. They only compute the expectation of a Hawkes process in a stationary state. In Bacry et al. (2012), the focus is on the asymptotic behaviour of multivariate Hawkes processes for a large time, by stating a law of large numbers and a functional central limit theorem. In Bordenave and Torrisi, Large deviations of Poisson cluster processes. Bordenave and Torrisi (2007) (Gao and Wang (2020) resp.), the authors prove a so-called large (moderate resp.) deviations principle on Hawkes processes. Such limit results established on the stationary regime are not suitable to deal with practical cases in the insurance context, where the time horizon is of several months. In this paper, we develop a framework for Hawkes processes with Gamma density excitation function regarding estimation and simulation. A new result in this context is the calculus of the mean and the variance of the process in the transient regime. To illustrate our work, we develop the use case of natural disasters in Luxembourg.

Main Contributions

The major contributions of this paper are to:

  1. 1.

    Define Hawkes processes and the elements needed for a complete use case study (estimation, goodness-of-fit, simulation, etc.);

  2. 2.

    Study mathematical expressions and properties of Hawkes processes (i.e. expectation, variance and central limit theorem) with a Gamma density as excitation function;

  3. 3.

    Model an insurance use case with real data by this specific form of Hawkes processes as an illustration of the mathematical tools.

Plan

The remainder of this paper is organized as follows. The second section defines a Hawkes process and introduces all useful notations. The third section contains the development of our results on mathematical properties of Hawkes processes. We also study a use case applied to insurance in the fourth section (natural disasters in Luxembourg).

2 Hawkes Processes Definition

In this section, we define the notion of Hawkes process, and we introduce useful notations for mathematical properties and algorithms (simulation, estimation).

2.1 Point Process

Hawkes processes are a particular class of point processes. We first propose a definition of a point process, inspired by Daley and Vere-Jones (2003).

Definition 1

(Point Process). A point process \(\mathbf {T}\) is an increasing sequence of random variables \(\mathbf {T} = \{ T_{1}, T_{2}, ...\}\) called occurrences which takes values in \([0, +\infty [\), such that \(\mathbb {P}(0 \le T_{1} \le T_{2} \le ...) = 1\).

Definition 2

(History of Occurrences). We define (\(\mathcal {H}(t), t \ge 0\)) the filtration and the history of the occurrences until time t.

Another useful representation of a sequence of occurrences is the counting notation.

Definition 3

(Counting Process). We say that N is a counting process if it is an almost surely finite stochastic process and a right-continuous step function with +1 increments after each step, taking values in \(\mathbb {N}\), and \(N(0)=0\).

The two notions of point process and counting process are interchangeable. Indeed, a counting process could be seen as the cumulative count of a point process. The link between \(\mathbf {T}\) and N is:

$$\begin{aligned} \forall t \in \mathbb {R}_{+}, N(t) = \# \big \{ T_{i} \in \mathbf {T} \big | T_{i} \le t \big \}, i \ge 1, \end{aligned}$$

where \(\#\) denotes the cardinality of a set and \(T_{i}\) is the time of the \(i^{th}\) occurrence in the time interval [0, T], where Tis the final time of observation.

2.2 Intensity Function

The most frequently used characterization of a Hawkes process is by means of the conditional intensity function, \(\lambda (t|\mathcal {H}(t))\). It is defined below.

Definition 4

(Conditional Intensity Function: Expected Rate of Occurrences Conditioned on \(\mathcal {H}(t)\)). Let us consider a counting process \(N(\cdot )\). If \(\lambda (t|\mathcal {H}(t))\) exists such that:

$$\begin{aligned} \lambda (t|{H}(t)) = \lim \limits _{h \rightarrow 0^{+}} \frac{\mathbb {E}[N(t+h) - N(t)|\mathcal {H}(t)]}{h}, \end{aligned}$$
(1)

then \(\lambda (t|\mathcal {H}(t))\) is the conditional intensity function of \(N(\cdot )\). We denote \(\lambda (t|\mathcal {H}(t))\) by \(\lambda ^{*}(t)\) for the remainder of the paper.

The following proposition is inspired from Proposition 2.2 of Rasmussen (2018) and states that the conditional intensity function uniquely defines the point process. Thus in the following we use this characterization.

Proposition 1

Let us consider a conditional intensity function \(\lambda ^{*}(\cdot )\), defined for any time period [ut], \(0 \le u \le t\), such that:

  • \(\lambda ^{*}(\cdot )\) non-negative and integrable on [ut],

  • \(\lim \limits _{t \rightarrow +\infty }\) \(\int _{u}^{t} \lambda ^{*}(s)ds = +\infty\).

Then there exists an unique point process with \(\lambda ^{*}(t)\) conditional intensity function.

2.3 Hawkes Process Definition

We introduce now the definition of Hawkes processes.

Definition 5

(Hawkes Process). Let us consider \(\lambda > 0\), the background intensity, and \(\mu : [0,+\infty [ \rightarrow [0, +\infty [\), the excitation function. We denote \(\{ t_{1},..,t_{N(t)}\}\) the sequence of past occurrences until time t. A point process is a Hawkes process if its conditional intensity function is of the form:

$$\begin{aligned} \lambda ^{*}(t) = \lambda + \int _{0}^{t} \mu (t-u) dN(u) = \lambda + \sum _{k = 1}^{N(t)} \mu (t-t_{k}), \end{aligned}$$
(2)

where \(N(\cdot )\) is defined in Definition 3.

Hawkes processes are point processes who have a so-called “self-exciting” property. It means that each occurrence increases the intensity, according to the excitation function. A higher intensity leads to more occurrences, leading to a higher frequency, etc. A Hawkes process with a null excitation function is a homogeneous Poisson process of rate \(\lambda\).

Definition 6

(\(\Gamma\)-Hawkes Processes). We define the \(\Gamma\)-Hawkes processes, a Hawkes process whose excitation function is of form:

$$\begin{aligned} \mu (t) = \alpha \frac{t^{k_{1}-1} \exp (- \frac{t}{k_{2}})}{k_{2}^{k_{1}} \Gamma (k_{1})}, t \ge 0, \alpha \in \mathbb {R}_{+}^{*}, k_{1}, k_{2} \in \mathbb {R}_{+}^{*}, \end{aligned}$$
(3)

where \(\Gamma (\cdot )\) is the Gamma function. \(k_{1}\) is the scale parameter and \(k_{2}\) is the shape parameter.

Remark 1

For \(k_{1} = 1\), we obtain the exponential excitation function:

$$\begin{aligned} \mu (t) = \alpha \frac{\exp (- \frac{t}{k_{2}})}{k_{2}}, t \ge 0, \alpha \in \mathbb {R}_{+}^{*}, k_{2} \in \mathbb {R}_{+}^{*}, \end{aligned}$$
(4)

used for instance originally by Hawkes (1971). In this situation:

$$\begin{aligned} \lambda ^{*}(t) = \lambda + \sum _{k = 1}^{N(t)} \alpha \frac{\exp (- \frac{t-t_{k}}{k_{2}})}{k_{2}}. \end{aligned}$$
(5)

For \(k_{1} > 1\), this excitation function allows to avoid a discontinuity in the intensity function when an event occurs, and introduce a delay between an event and its impact on the process. Indeed, in that case, \(\mu (0) = 0\). This property is very helpful when considering events for which an occurrence and the resulting self-excitation are delayed.

3 Mathematical Properties

In this section, we are going to explicit some mathematical properties of the \(\Gamma\)-Hawkes processes (defined by Eq. (3)): we calculate the expectation and the variance for any time \(t \ge 0\) and we state a central limit theorem.

Assumption 1

For the whole Section 3, we consider a \(\Gamma\)-Hawkes process of conditional intensity function:

$$\begin{aligned} \lambda ^{*}(t) = \lambda + \int _{0}^{t} \mu (t-u) dN(u) = \lambda + \alpha \sum _{k=1}^{N(t)} \frac{ (t - t_{k})^{k_{1}-1} \exp (\frac{-(t - t_{k})}{k_{2}})}{k_{2}^{k_{1}}\Gamma (k_{1})}, \end{aligned}$$
(6)

where \(\alpha \in \mathbb {R}_{+}^{*}, k_{1}, k_{2} \in \mathbb {R}_{+}^{*}\), \(N(t) \ge 0\) (number of events occurred between 0 and t), \(t_{k}\), \(k \in \{ 1, ..., N(t) \}\) for which we are going to explicit results for \(k_{1} = 1\) and \(k_{1} = 2\).

This study aims to understand the behaviour of the Hawkes processes in function of parameters introduced previously.

3.1 Behaviour in Function of Values of \(\alpha\)

In this section, we are going to see that the Hawkes process dynamics depend on values of \(\alpha\) (defined in Assumption 1). To do so, let us focus on \(g(t) = \mathbb {E}[\lambda ^{*}(t)]\), which will be used to calculate the expectation of the Hawkes process later.

We could show that g follows a renewal equation:

$$\begin{aligned} g(t) = \lambda + \int _{0}^{t} \mu (t-u) g(u) du. \end{aligned}$$
(7)

According to Asmussen (2003), the behaviour of g(t), when t goes to infinity, depends on the value of:

$$\begin{aligned} \int _{0}^{+\infty } \mu (u) du = \int _{0}^{+\infty } \alpha \frac{\beta ^{k_{1}} u^{k_{1}-1} \exp (-\beta u)}{\Gamma (k_{1})} du = \alpha . \end{aligned}$$
(8)

We distinguish, in the classic literature about point processes:

  • The defective case: \(\alpha < 1\);

  • The proper case: \(\alpha = 1\);

  • The excessive case: \(\alpha > 1\).

The Hawkes process intensity goes to infinity in the excessive case, leading to an explosion of the counting process. This situation does not suit to any real event we would like to model. The proper case is not studied here, because it does not correspond to a real case. We concentrate our attention in the defective case. In this case, g admits a finite limit in \(+\infty\), which will be calculated later.

Assumption 2

For the rest of the Section 3, we consider a Hawkes process of conditional intensity function described in Assumption 1, with \(0< \alpha < 1\) (defective case).

3.2 Expectation: \(\mathbb {E}(N(t))\)

The first indicator that seems relevant for our study was the evaluation of the average number of events through time. We propose a calculation method which could be applied for any value of \(k_{1} \in \mathbb {N}^{*}\). We will develop the calculations for two values of the shape parameter: \(k_{1} = 1\) (exponential excitation function), \(k_{1} = 2\) and \(k_{1} = 3\). Since a Hawkes process is the sum of a background part of intensity \(\lambda\) and an self-exciting part, we expect that the expectation is higher than for a Poisson process of parameter \(\lambda\), which is \(\lambda t\) for all time \(t > 0\).

Proposition 2

The expectation of the \(\Gamma\)-Hawkes process (i.e. conditional intensity function given by Eq. (6)) is:

  1. 1.

    For \(k_{1} = 1\) (i.e. \(\mu (t) = \alpha \frac{\exp (-\frac{t}{k_{2}})}{k_{2}}\), exponential decay):

    $$\begin{aligned} \mathbb {E}(N(t)) = \frac{\lambda }{1-\alpha }t - \frac{\alpha \lambda k_{2}}{(1 - \alpha )^{2}} \Big [ 1 - \exp \big ( - \frac{1 - \alpha }{k_{2}} t \big ) \Big ]. \end{aligned}$$
    (9)
  2. 2.

    For \(k_{1} = 2\) (i.e. \(\mu (t) = \alpha \frac{t \exp (-\frac{t}{k_{2}})}{k_{2}^{2}}\)):

    $$\begin{aligned} \mathbb {E}(N(t)) = \frac{\lambda }{1-\alpha }t + \frac{\lambda \sqrt{\alpha } k_{2}}{2(1 + \sqrt{\alpha })^{2}} \Big [ 1 - \exp \big ( - \frac{1 + \sqrt{\alpha }}{k_{2}} t \big ) \Big ] \nonumber \\ - \frac{\lambda \sqrt{\alpha } k_{2}}{2(1 - \sqrt{\alpha })^{2}} \Big [ 1 - \exp \big ( - \frac{1 - \sqrt{\alpha }}{k_{2}} t \big ) \Big ]. \end{aligned}$$
    (10)
  3. 3.

    For \(k_{1} = 3\) (i.e. \(\mu (t) = \alpha \frac{t^{2} \exp (- \frac{t}{k_{2}})}{k_{2}^{3} \Gamma (3)}\)):

    $$\begin{aligned} \mathbb {E}(N(t)) = \frac{\lambda }{1-\alpha }t + \frac{\lambda \alpha B}{k_{2}^{3}} \exp (-r_1 t) + \frac{\lambda \alpha C}{k_{2}^{3}} \cos (\omega t)\exp (-a t) + \frac{\lambda \alpha (D-Ca)}{k_{2}^{3}\omega } \sin (\omega t)\exp (-a t), \end{aligned}$$
    (11)

    where

    $$\begin{aligned} B&=\ \frac{1}{r_1(r_1 a_1 - r_1^2 - a_2)}, C = \frac{r_1 - a_1}{a_2(r_1 a_1 - r_1^2 - a_2)}, D = \frac{r_1 a_1 - a_1^{2} + a_2}{a_2(r_1 a_1 - r_1^2 - a_2)}, \\ r_{1}&=\ \frac{1-\alpha ^{1/3}}{k_{2}}, a_{1} = \frac{(2 + \alpha ^{1/3})}{k_{2}}, a_{2} = \frac{(1 - \alpha )}{(1 - \alpha ^{1/3})k_{2}^{2}}, a = \frac{(2 + \alpha ^{1/3})}{2k_{2}}, \omega = \frac{\sqrt{3}\alpha ^{1/3}}{2k_{2}}. \end{aligned}$$

See Appendix 1 for the proof, which is based on Laplace transform of \(g(t) = \mathbb {E}(\lambda ^{*}(t))\) and could be applied for any value of \(k_1\). We could notice that despite the formula complexity increases with \(k_{1}\), we could observe several common trends. We first notice that for \(\alpha \rightarrow 0\) (null excitation function), we obtain \(\mathbb {E}(N(t)) \underset{\alpha \rightarrow 0}{\longrightarrow } \lambda t\). This result is natural since we already mentioned that a Hawkes process with a null excitation function is equivalent to a Poisson process of parameter \(\lambda\). For all values of the shape parameter, the expectation reveals three regimes for the \(\Gamma\)-Hawkes process. As an example, we represent the expectation for \(k_{1} = 1, k_{2} = 9, \alpha = 0.9, \lambda = 0.3\), which follows Eq. (9), in Fig. 1. It contains four plots:

  • Top-left: expectation from Eq. (9) in blue and the so-called stationary regime \(t \rightarrow \frac{\lambda }{1-\alpha }t\) in red, from t = 0 to 3000;

  • Bottom-left: the relative difference between the expectation and the stationary regime on this time period;

  • Top-right: focus on the expectation from t = 0 to 500;

  • Bottom-right: plot of the exponential term from Eq. (9), \(t \rightarrow \exp \big ( - \frac{1 - \alpha }{k_{2}} t \big )\).

Fig. 1
figure 1

Number of claims per day processed by Foyer Assurances and labelled as natural disasters since 2015

We distinguish three regimes in this example:

  • The transient regime (from t = 0 to 500 approximately): it is the time period necessary for the exponential term in Eq. (9) to become null;

  • The intermediary regime (from t = 500 to 2000 approximately): the exponential term is null but the constant term could not be neglected before the stationary regime. On this period, the relative difference between the expectation and the stationary regime is above 5\(\%\);

  • The stationary regime: for any value of the parameter scale, and for any Hawkes process (not only for \(\Gamma\)-Hawkes processes), \(\mathbb {E}(N(t)) \underset{t \rightarrow +\infty }{\sim } \frac{\lambda }{1-\alpha }t\) (see Gao (2018)). The constant term in Eq. (9) becomes negligible before the linear term. After the first two regimes, the average number of occurrences of the Hawkes process is the same as for a Poisson process of parameter \(\frac{\lambda }{1-\alpha }\).

We conclude in this discussion that the expectation could not be approximated correctly by the stationary regime from t = 0 to 2000. We will see that this point will have an impact on our use case study.

3.2.1 Variance: \(\mathbb {V}(N(t))\)

We study now the variance of the Hawkes process, in order to see how the occurrences are spread out from the expectation calculated previously. We define \(\widetilde{N}(\cdot )\) as the stationary Hawkes process.

Proposition 3

The variance of the stationary \(\Gamma\)-Hawkes process (i.e. the conditional intensity function is given by Eq. (6)) is:

  1. 1.

    For \(k_{1} = 1\) (i.e. \(\mu (t) = \alpha \frac{\exp (-\frac{t}{k_{2}})}{k_{2}}\)):

    $$\begin{aligned} \mathbb {V}(\widetilde{N}(t))&=\ \frac{\lambda }{(1 - \alpha )^{3}}t - \frac{\lambda \alpha (2 - \alpha ) k_{2}}{(1 - \alpha )^{4}} \Big [ 1 - \exp \big (- \frac{1 - \alpha }{k_{2}} t \big ) \Big ]. \end{aligned}$$
    (12)
  2. 2.

    For \(k_{1} = 2\) (i.e. \(\mu (t) = \alpha \frac{t \exp (-\frac{t}{k_{2}})}{k_{2}^{2}}\)):

    $$\begin{aligned} \mathbb {V}(\widetilde{N}(t))&=\ C_{1} + C_{2}t + C_{3} \exp (-\omega _{1} t) + C_{4} \exp (-\omega _{2} t) \text {, where:} \\ \omega _{1}&=\ \frac{1 + \sqrt{\alpha }}{k_{2}}, \\ \omega _{2}&=\ \frac{1 - \sqrt{\alpha }}{k_{2}}, \\ C_{1}&=\ \frac{-\alpha k_{2} \lambda [8 - 5 \alpha + \alpha ^{2}]}{2 (1 - \alpha )^{4}}, \\ C_{2}&=\ \frac{\lambda }{(1 - \alpha )^{3}}, \\ C_{3}&=\ \frac{-\sqrt{\alpha } \lambda k_{2}[4 -3\alpha - \sqrt{\alpha }\alpha ]}{4 (1-\alpha )^{2}(1+\sqrt{\alpha })^{2}}, \\ C_{4}&=\ \frac{\sqrt{\alpha } \lambda k_{2}[4 -3\alpha + \sqrt{\alpha }\alpha ]}{4 (1-\alpha )^{2}(1-\sqrt{\alpha })^{2}}. \end{aligned}$$
    (13)

See Appendix 2 for the proof, in which we use the classic assumption that covariance density does not depend on time, assumption valid only if the process is stationary. However, we remark that theoretical variance is close to the results from simulations (we propose a method to simulate Hawkes processes in Appendix 3), see Section 4.5 for a comparison between simulated and actual variance. A similar comparison in Bacry et al. (2015) shows that theoretical variance based on stationary assumptions is a fine approximation as well. Therefore, we consider that \(\mathbb {V}(N(t))\) approximately equals \(\mathbb {V}(\widetilde{N}(t))\) from now on.

Let us observe the behaviour of the variance. We first notice that for \(\alpha \rightarrow 0\) (null excitation function), we obtain \(\mathbb {V}(N(t)) \underset{\alpha \rightarrow 0}{\longrightarrow } \lambda t\). This result is natural since we already mentioned that a Hawkes process with a null excitation function is equivalent to a Poisson process of parameter \(\lambda\). Secondly, we remark that \(\mathbb {V}(N(t)) \underset{t \rightarrow +\infty }{\sim } \frac{\lambda }{(1-\alpha )^{3}}t\). We have seen previously that \(\mathbb {E}(N(t)) \underset{t \rightarrow +\infty }{\sim } \frac{\lambda }{1-\alpha }t\): it is natural that we obtain as a limit a variance greater than \(\frac{\lambda }{1-\alpha }t\) since a Hawkes process is more volatile than a Poisson process. The limit of the variance is calculated in Proposition 3 from Gao (2018), which gives the same result. It is also possible to do an analysis in terms of regimes similar to the expectation.

3.3 Central Limit Theorem for Hawkes Processes

From Theorem 2 of Bacry et al. (2012), we state a central limit theorem for Hawkes process, when \(t \rightarrow +\infty\). The following proposition is true for any \(k_{1} \ge 1\):

Proposition 4

(Central Limit Theorem for Hawkes Processes)

$$\begin{aligned} \underset{t \rightarrow +\infty }{\lim } \frac{N(t) - \bar{\lambda }t}{\sqrt{t}} {\mathop {=}\limits ^{d}} \sigma B(1), \end{aligned}$$
(14)

where:

  • \(\bar{\lambda } = \frac{\lambda }{1 - \alpha }\);

  • \(\sigma ^{2} = \frac{\lambda }{(1 - \alpha )^{3}}\);

  • B(1) is a standard Brownian motion;

  • \({\mathop {=}\limits ^{d}}\) means equality in distribution.

We recognize in this central limit theorem the limits of expectation and variance calculated previously. The central limit theorem gives the distribution of the number of events at a large time horizon.

4 Use Case: Natural Disasters in Luxembourg

In this section, we present an insurance use case. Natural disasters are one of the types of rare events who could occur in any insurance setting. We apply it in the case of Luxembourg, with data from insurance. They are mainly storms and gusts of wind, whose frequency seems to increase for these last five years. This trend may be explained by climate imbalance (see public data about climate change in Luxembourg Online (2019)).

After introducing the data provided by Foyer Assurances in Section 4.1, we present the parameters estimated on these data in Section 4.2. Then we justify the choice of using a \(\Gamma\)-Hawkes process and its study on a transient regime by first comparing the goodness-of-fit of our approach with several models by a statistical test, then simulating data and observing the number of events from these simulations to check whether our \(\Gamma\)-Hawkes process fits with the data.

4.1 Presentation of the Data

Foyer Assurances has provided data about natural disasters and home insurance. Figure 2 shows the number of home insurance claims per day processed by Foyer Assurances for which the natural disasters cover was used, from January 2015 to December 2019.

Fig. 2
figure 2

Counting process and intensity since 2015 / Cumulated number of natural disasters events in Luxembourg since 2015 (in blue) and the intensity of the estimated Hawkes process (in red)

This timeline shows the low frequency and the high severity of this kind of event. The peaks correspond to specific weather conditions:

  • 31/03/15: “Niklas” storm;

  • 16/09/15: “Henri” ex-tropical storm;

  • 09/02/16: gusts of wind in East France and Luxembourg;

  • 13/01/17: “Egon” cyclone: storm, snow;

  • 03/01/18: “Eleanor” storm;

  • 10/03/19: gusts of wind in East France and Luxembourg;

  • 09/08/19: exceptional tornado in Luxembourg.

We model the occurrence of claims thanks to a Hawkes process. The branching structure introduced in Appendix 4 may suggest that one claim must be triggered by one another specific claim, which is not the case in reality. But Hawkes processes are well suited for modelling high-frequency periods of occurrence where the causality between events is not trivial as well and where there is a clustering effect due to an underlying cause (i.e. a natural disaster). For instance, Hawkes processes are well established to model high-frequency trades in finance, while one trade is not directly triggered by another (see Bacry et al. (2015)).

4.2 Parameters Estimation

To model the type of event introduced previously, we propose a \(\Gamma\)-Hawkes process:

$$\begin{aligned} \lambda ^{*\text {ND}}(t) = \lambda ^{\text {ND}} + \sum _{k = 1}^{n} \alpha ^{\text {ND}} \frac{(t-t_{k})^{k_{1,\text {ND}}-1}}{\Gamma (k_{1,\text {ND}})k_{2,\text {ND}}^{k_{1,\text {ND}}}}\exp \Big ( -\frac{(t-t_{k})}{k_{2,\text {ND}}} \Big ), \end{aligned}$$
(15)

where ND is an abbreviation for natural disasters and with \(k_{1,\text {ND}} = 2\). This choice of model could be justified a priori as follows:

  • Self-exciting property: a unique background event which occurs rarely (e.g. a storm) and is taken into account by the background intensity in the model, triggers a sequence of numerous claims around the country. The more claims there is, the more serious the event is, the more likely new claims are: this dynamic is modelled by the self-exciting part of the intensity. Thus, we are expecting that the exciting part of the process has more importance than the background part, which should lead to a \(\alpha ^{\text {ND}}\) value greater than \(\lambda ^{\text {ND}}\).

  • Gamma density intensity function of parameter \(k_{1,\text {ND}} = 2\): there is a delay between the occurrence of claims on the different Luxembourgish cities. The storm reaches the Luxembourg regions at different times. This function allows us to model this delay. Also, even if we check that closed numerical values for \(k_{1,\text {ND}}\) performed similarly, we select \(k_{1,\text {ND}} = 2\) to develop an interpretation of results based on closed-form formulae calculated above.

We estimate parameters thanks to EM algorithm, introduced in Appendix 4. Estimation leads to the following parameters in Table 1. We used the R library hawkes, developed by the authors of Halpin and De Boeck (2013), to implement the estimation. We also propose the standard deviation of the estimations, from the inverse of the Fisher information matrix that we compiled numerically (see Ly et al. (2017)).

Table 1 Estimated values of the HP parameters by the EM algorithm

We see that \(\alpha ^{\text {ND}} > \lambda ^{\text {ND}}\), as expected, and that \(\alpha ^{\text {ND}}\) is close to 1. Since the expectation is proportional to \(\frac{1}{{1-\alpha ^{\text {ND}}}}\) (we recall that \(\mathbb {E}(N(t)) \underset{t \rightarrow +\infty }{\sim } \frac{\lambda ^{\text {ND}}}{1-\alpha ^{\text {ND}}}t\)), this value of \(\alpha ^{\text {ND}}\) should lead to a high average number of events.

Figure 3 presents the evolution of the cumulative number of events (in blue) and the estimated intensity of the \(\Gamma\)-Hawkes process (in red). We see that intensity peaks correspond to the meteorological events described previously.

Fig. 3
figure 3

QQ-plot and Kolmogorov-Smirnov test for set of parameters estimated by EM algorithm

4.3 Goodness-of-Fit and Comparison with Other Models

After the parameters estimation, we need to check if this model makes sense. The following proposition from Daley and Vere-Jones (2003), called residual analysis, allows us to test the quality of the model.

Proposition 5

(Residual Analysis) We consider a sequence of occurrence times \(\{ t_{1}, t_{2}, ... \}\) and a monotonic, continuous compensator \(\Lambda (t) = \int _{0}^{t} \lambda ^{*}(s)ds\) such that \(\lim \limits _{t \rightarrow +\infty } \Lambda (t) = +\infty\) almost surely. The sequence \(\{ \Lambda (t_{1}), \Lambda (t_{2}), ... \}\) is a Poisson process with an unit rate if and only if \(\{ t_{1}, t_{2}, ... \}\) is a realisation from the point process defined by \(\lambda ^{*}(\cdot )\).

Thanks to this proposition, we could test with many procedures whether our data fit with a Hawkes process. The previous proposition shows us that testing whether \(\{ t_{1}, t_{2}, ... \}\) follows a Hawkes process with intensity function \(\lambda\) is equivalent to test whether \(\{ \Lambda (t_{1}), \Lambda (t_{2}), ... \}\) form a Process process of parameter 1. It is also equivalent to test whether every interarrival time \(\{ \Lambda (t_{1}), \Lambda (t_{2}) - \Lambda (t_{1}), \Lambda (t_{3}) - \Lambda (t_{2}), ... \}\) are independent and identically distributed and follow an Exponential law of parameter 1. This could be done by:

  • A Quantile-Quantile plot (QQ-plot), which represents quantiles of both distributions;

  • Performing a Kolmogorov-Smirnov test, whose statistic D is the maximum absolute difference between the two distributions.

Figure 4 is a QQ-plot which allows us to compare the distribution of our data and the distribution of the \(\Gamma\)-Hawkes process with the estimated parameters. We also perform a Kolmogorov-Smirnov test.

Fig. 4
figure 4

QQ-plot and Kolmogorov-Smirnov test for an homogeneous Poisson process (left) and a Hawkes process with an exponential (center) and a power-law excitation function (right)

The p-value of the statistical test is greater than \(5\%\), which means that we cannot reject the hypothesis that the data follow the estimated \(\Gamma\)-Hawkes process.

We compare the results with three other models:

  • An homogeneous Poisson process, since it is the simplest point process and is equivalent to a Hawkes process with no self-excitation, whose rate is the average number of events per day;

  • A Random Forest Regressor, a time series approach in order to take into account temporal dependencies;

  • A Hawkes process with an exponential excitation function and a power-law kernel, in order to check whether another excitation function would have better performances than the Gamma density, whose parameters are estimated by the EM algorithm (see Table 2). The exponential excitation function is in Eq. (4). We propose for a power-law excitation function the following equation:

    $$\begin{aligned} \mu (t) = \frac{\alpha }{(t+\nu )^{\kappa }} \end{aligned}$$
    (16)
    Table 2 Estimated value of the HP parameters by the EM algorithm for an exponential and a power-law excitation function

The QQ-plots for the two models are represented on Fig. 5:

Fig. 5
figure 5

Excitation functions with respect to time : exponential (black), Gamma (blue), power-law (red)

As expected, an homogeneous Poisson process is not enough to model this type of event. We need to take into account the self-excitation property of the process. We could also see that the fitting is better with a Gamma excitation function than an exponential or a power-law decay, which validates our study a posteriori. We remark that the exponential and the power-law functions provide the same type of self-excitation with no delay, as we plot the fitted functions in Fig. 6.

Fig. 6
figure 6

Parameters used for Random Forest Regressor

4.3.1 Random Forest Regressor

In order to catch the temporal dependency of the natural disasters, we compare with a time series approach. We present here the results given by Random Forest Regressor (see Breiman and Forests (2001) for details about Random Forest), which fitted better that other types of time series models in terms of Kolmogorov-Smirnov test (ARIMA, ARMA, GARCH, Prophet). We used the scikit-learn estimator sklearn.ensemble.RandomForestRegressor and the parameters presented in Fig. 6.

Fig. 7
figure 7

Prediction of the time series by a Random Forest Regressor

We learned the time series from January 2015 to June 2019 and we predicted the number of events over 100 days from July 2019. Figures 7 and 8 shows the prediction (in blue) versus the real data (in orange, the peak corresponds to the tornado in August 2019).

Fig. 8
figure 8

Prediction of the time series by a Random Forest Regressor

We then compared the distribution of the interarrival times of the prediction with the distribution of the interarrival times of the actual time series with a QQ-plot in Fig. 9.

Fig. 9
figure 9

Simulation of 20 trajectories over five years for set of parameters estimated by EM algorithm

We could see that this approach does not allow to catch the severe peaks which characterize this time series. This kind of approach is more efficient on data with seasonality and a regular trend.

4.4 Simulations

Moreover, we must make sure that our model is able to replicate severe events like those observed in our data. We verify that simulated trajectories present the same structure than the actual time series: few severe events between very calm periods. Figure 10 shows the simulation of 20 trajectories over five years according to our estimated Hawkes process.

Fig. 10
figure 10

Simulation of 20 trajectories over five years for four set of parameters

Most of the simulated trajectories do not present the expected step function structure, observed with real insurance data. The evolution of the process counting is too smooth for these simulations, if we compare the dynamics between Figs. 3 and 6.

Let us see what happens with another set of parameters. Since the expectation of the process assuming stationarity is proportional to \(\frac{\lambda ^{\text {ND}}}{1 - \alpha ^{\text {ND}}}\), we slighty change \(\lambda ^{\text {ND}}\) and \(\alpha ^{\text {ND}}\) so that stationary expectation remains the same. Moreover, as we would like to have larger jumps in the trajectory, it means that the importance of the self-exciting part of the process should be higher. Thus, we will increase the value of \(\alpha ^{\text {ND}}\). We propose the same type of simulation over five years for different sets of \((\lambda ^{\text {ND}},\alpha ^{\text {ND}})\) in Fig. 11.

Fig. 11
figure 11

Average number of times the process reaches a threshold of 500 events in a period of time of 1825 days

Results show that increasing \(\alpha ^{\text {ND}}\) allows indeed the process to be more volatile and with more severe events. Simulated trajectories are more alike actual data for the different sets of parameters. The higher is \(\alpha ^{\text {ND}}\), the larger are the jumps. In order to quantify the capacity to generate scenarios with severe events, we evaluate by a Monte-Carlo method the average number of times the process reaches a threshold of 500 events in a period of time of 1825 days (five years), by performing 1000 simulations. We present the results on Fig. 12.

Fig. 12
figure 12

QQ-plot and Kolmogorov-Smirnov test for four set of parameters

The Monte-Carlo estimation shows that the higher is \(\alpha\), the higher is the average value of severe events. Therefore it confirms that another set of parameters could be more appropriate in terms of ability to generate disasters observed on real data.

In terms of goodness-of-fit, Fig. 13 shows QQ-plot for each set of parameters.

Fig. 13
figure 13

Distribution of errors at the end of the five years

Figure 13 illustrates that goodness-of-fit is worse for each set of parameters, when comparing to the first set of parameters we estimated by our approach (see Fig. 4). Indeed, for each set of parameters we could reject the hypothesis that the Hawkes process fits the data. We eventually have a set of parameters which maximizes the goodness-of-fit (the one estimated by the EM algorithm), but another which seems to catch better the dynamics of the time series (a set with a higher \(\alpha ^{\text {ND}}\)). It is explained by the fact that the goodness-of-fit is evaluated on the entire distribution, by computing the distance on all the quantiles of the distribution. Whereas the dynamics of the natural disasters rely on the accumulation of grouped events, which are the smallest quantiles of the distribution. By the way we could see on Fig. 13 that the distribution gets worse on medium quantiles. There is an imbalance between the quality of simulations and the statistical tests, which illustrates the difficulty to estimate the right set of parameters for a Hawkes process.

4.5 Distribution of the Number of Events

In this subsection, we observe the distribution of N(t), where t = five years. We perform 1000 simulations of the Hawkes processes, with the first set of parameters estimated, for a time period of five years. We observe the errors distribution in Fig. 14, \(\frac{N(t) - \bar{\lambda }t}{\sqrt{t}}\), and we compare with a Normal distribution of variance \(\sigma ^{2} = \frac{\lambda }{(1 - \alpha )^{3}} t\).

Fig. 14
figure 14

A realisation of a Hawkes process and its branching structure

We see that the mean of errors is different from zero. The distribution of errors does not follow the Normal distribution, as the process is not observed long enough to be close to its limit. This observation justifies our mathematical study a posteriori, since the study of limits is not enough. We need the transitory values of the expectation and the variance, even for a long five years period.

Numerically, from Eqs. (10) and (13) we obtain:

$$\begin{aligned} \mathbb {E}(N(t))&=\ -{655.55} + {3.25}t - {0.28} \exp (-{0.22}t) + {655.84} \exp (-{0.004}t), \\ \mathbb {V}(N(t))&=\ -{82154.94} + {383.66}t - {6.01} \exp (-{0.22}t) + {82160.95} \exp (-{0.004}t), \end{aligned}$$
(17)

with t in days.

The following table presents the observed, calculated and limit value of expectation and variance at the final time of simulations.

Table 3 Observed, calculated and limit value of expectation and variance

This table confirms that calculated expectation and variance give the correct values and that focusing on limits is not sufficient to study a process over several years, which is a quite long period for insurance use cases. Thanks to numeric values provided by Eqs. (23) and (24), we could see that we are still in either the transient state or the intermediary state for \(t = 5\) years \(= 1825\) days: while exponential terms are close to zero for both expectation and variance, the difference between the actual value (third column of Table 3) and the limit value (fourth column) is explained by the constant terms. The order of magnitude of these constant terms shows that the limit values are not a good approximation for a period of several years. That justifies our study on the different regimes.

5 Conclusion and Future Work

In this paper, we introduced a complete framework for the study of Hawkes process: definition, estimation and simulation. We presented a specific form of Hawkes processes, the \(\Gamma\)-Hawkes process, and we developed mathematical properties for this category of process. We applied this model to an insurance use case, i.e. natural disasters, and we observed that the considered Hawkes process is well fitted with historical data.

For future work, we intend to apply this model to other insurance use cases, such that:

  • Life events prediction: births, marriage, job change, etc. We could use a multi-variate version of Hawkes processes (see Laub et al. (2015)), since one type of event could trigger another (e.g. a marriage could lead to a birth);

  • Epidemic: we already studied the dynamics of Covid-19 pandemic in Lesage (2020). We intend to adapt the study to a long class of diseases;

  • Workload prediction: we would like to anticipate peaks of workload for Foyer Assurances employees in order to optimize staffing.