1 Introduction

Hall (2003) studied the limiting distribution of the maximum term \(M_n=\max (X_1,\cdots ,X_n)\) of stationary sequences \(\{X_j\}\) defined by non-negative integer-valued moving average (INMA) sequences of the form

$$X_j=\sum _{i=-\infty }^{+\infty } \alpha _{i}\circ V_{j-i},$$

where the innovation sequence \(\{V_i\}\) is an iid sequence of non-negative integer-valued random variables (rvs) with exponential type tails of the form

$$\begin{aligned} 1-F_V(n) \sim n^\xi L(n)(1+\lambda )^{-n},\, n \rightarrow + \infty , \end{aligned}$$
(1)

where \(\xi \in \mathbb {R},\, \lambda >0,\, L(n)\) is slowly varying at \(+\infty\) and \(\alpha_i \circ\) denotes binomial thinning with probability \(\alpha _i \in [0,1]\). Hall (2003) proved that \(\{X_j\}\) satisfies Leadbetter’s conditions \(D(x+b_n)\) and \(D'(x+b_n)\), for a suitable real sequence \(b_n\), and then

$$\left\{ \begin{array}{l} \limsup _{n\rightarrow + \infty } P(M_n\le x+b_{n}) \le \exp (-(1+\lambda /{\alpha _{\max }})^{-x}) \\ \liminf _{n\rightarrow + \infty } P(M_n\le x+b_{n}) \ge \exp (-(1+\lambda /{\alpha _{\max }})^{-(x-1)}),\\ \end{array} \right.$$

for all real x and \(\alpha _{\max }:=\max \{\alpha _i,\, i \in \mathbb{Z}\}\). Note that \(\alpha _{\max }\) plays an important role in this result. This is an extension of Theorem 2 of Anderson (1970), where it is proved that for sequences of iid rvs with an integer-valued distribution function (df) F with infinite right endpoint, the limit

$$\begin{aligned} \displaystyle {\lim _{n\rightarrow +\infty }} \frac{1-F(n-1)}{1-F(n)}=r>1, \end{aligned}$$
(2)

is equivalent to

$$\exp (-r^{-(x-1)}) \le \liminf _{n\rightarrow + \infty } F^n(x+b_{n}) \le \limsup _{n\rightarrow + \infty } F^n(x+b_{n}) \le \exp (-r^{-x}),$$

for all real x.

The class of dfs satisfying (1), which is a particular case of (2) (see, e.g., Hall and Temido (2007)), is called Anderson’s class.

In this paper we extend the result of Hall (2003) to the bivariate case of an INMA model. Concretely, we study the limiting distribution of the maximum term of stationary sequences \(\{(X_{j},Y_{j})\}\) where the two marginals are defined by non-negative integer-valued moving average sequences of the general form

$$\begin{aligned} (X_j, Y_j) = \left( \sum _{i=-\infty }^{+ \infty } \alpha _{i}\circ V_{j-i}, \sum _{i=-\infty }^{+\infty } \beta _{i}\circ W_{j-i}\right) , \end{aligned}$$

where \(X_j\) and \(Y_j\) are defined as above with respect to a two-dimensional iid innovation sequence \(\{(V_i, W_i)\}\). The binomial thinning operator \(\beta \circ\), due to Steutel and van Harn (1979), is defined by \(\beta \circ Z=\sum _{s=1}^{Z} B_s(\beta ),\,\, \beta \in [0,1],\) where \(\{B_s(\beta )\}\) is an iid sequence of Bernoulli rvs independent of the non-negative integer-valued rv Z. The possible class of bivariate discrete distributions \(F_{V,W}\) (see (4)) also includes the bivariate geometric models.

We assume that \(X=\alpha \circ V\) and \(Y=\beta \circ W\) are conditionally independent given \((V, W)\): since the thinnings \(\alpha \circ\) and \(\beta \circ\) are performed independently, X and Y are conditionally binomial rvs with parameters \((V, \alpha )\) and \((W, \beta )\), respectively, i.e.

$$\begin{array}{ll} P\left( X \in A,Y \in B |{V=v,W=w}\right) &{} = \, P\left( X \in A|{V=v,W=w}\right) P\left( Y \in B|{V=v,W=w}\right) \\ &{} = \, P\left( X \in A|V=v \right) P\left( Y \in B |W=w\right) ,\\ \end{array}$$

for all events A and B and for all possible values of v and w. We assume that \(\alpha _i,\beta _i\in [0,1]\) and

$$\begin{aligned} \alpha _i,\beta _i= O \left( |i|^{-\delta }\right) , |i| \rightarrow +\infty , \end{aligned}$$
(3)

for some \(\delta >2\).
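
For intuition, the thinning mechanism and the componentwise conditional independence described above are straightforward to simulate: given \((V,W)=(v,w)\), the pair \((\alpha \circ v, \beta \circ w)\) consists of independent Binomial\((v,\alpha )\) and Binomial\((w,\beta )\) draws. The following minimal Python sketch (the innovation margins and all numerical values are purely illustrative stand-ins, not the model of this paper) checks, e.g., that \(E(\alpha \circ V)=\alpha E(V)\).

```python
import numpy as np

rng = np.random.default_rng(0)

def thin(prob, counts, rng):
    """Binomial thinning prob∘counts: given counts = z, return a Binomial(z, prob) draw."""
    return rng.binomial(counts, prob)

# illustrative innovation pair (V, W); Poisson margins are only a stand-in here
V = rng.poisson(3.0, size=100_000)
W = rng.poisson(2.0, size=100_000)
alpha, beta = 0.6, 0.4

X = thin(alpha, V, rng)      # alpha∘V
Y = thin(beta, W, rng)       # beta∘W, thinned independently of X given (V, W)

print(X.mean(), alpha * V.mean())   # E(alpha∘V) = alpha E(V)
print(Y.mean(), beta * W.mean())    # E(beta∘W)  = beta E(W)
```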

We investigate the limiting behaviour of \((M_n^{(1)}, M_n^{(2)})=(\max _{1\le j\le n} X_j, \max _{1\le j\le n} Y_j)\) and want to find out whether the two maxima components are asymptotically dependent, given the dependence of the innovations \((V_i,W_i)\). However, we will show that, because of the independent thinning, such asymptotic dependence does not occur. We also investigate the impact of the dependence of \((V_i,W_i)\) on the limiting distribution and on the convergence rate.

Following ideas similar to those of Hall (2003) for the univariate case, we:

  • Define a bivariate model \(F_{V,W}\) which contains the bivariate geometric model;

  • Characterize the tail of \((\alpha \circ V,\beta \circ W)\) and the tail of \((X_j,Y_j)\), in terms of the model \(F_{V,W}\);

  • Establish the limiting behaviour of the bivariate maximum \((M_n^{(1)}, M_n^{(2)})\) of the stationary sequence \(\{(X_{j},Y_{j})\}\) which is defined componentwise; and

  • Investigate the convergence of the joint distribution of the bivariate maximum to the limiting distribution by simulations.

Examples: 1.) We may consider \(\{(V_i,W_i)\}\) as the numbers of persons newly infected by virus 1 (say the COVID-19 virus) and by virus 2 (say the usual seasonal virus) at time i. It is possible that a person is infected by both viruses at the same time point. We count by \(\{(X_{j},Y_{j})\}\) the total number of infected and still contagious persons at time j, adding up all persons infected before and at time j. After some time these persons are cured (or die) and are no longer counted among the infected but still contagious persons. Hence the random numbers \((V_i,W_i)\) are thinned at each time point, so the contribution to \(X_j\) is \(\alpha _{j-i}\circ V_i\), and to \(Y_j\) is \(\beta _{j-i}\circ W_i\), for \(i\le j\).

2.) Another example of a bivariate integer-valued time series is presented in Pedeli and Karlis (2011), who discuss the bivariate INAR(1) model with negative binomial innovations for an application to road accidents in two different time intervals in the Schiphol area. However, in the case of geometric innovations, their bivariate negative binomial innovations are of a different type than the one considered here. The situation is similar in the paper of Silva et al. (2020), who discuss inference for such a bivariate time series with a different distribution of the innovations; their bivariate negative binomial distribution is also not of our type.

3.) A further application of a bivariate time series for count data in finance is given by Quoreshi (2006). He did not specify the bivariate distribution, but derived the mean and the variances/covariances of this time series.

2 Preliminary Results for Bivariate Innovations

Let \((V, W)\) be a non-negative random vector with bivariate df \(F_{V, W}\) satisfying

$$\begin{aligned} \begin{array}{lll} 1-F_{V, W}(v,w) &{}=&{} \left( 1+\lambda _1\right) ^{-[v]}\left[ v\right] ^{\xi _1}L_1\left( v\right) +\left( 1+\lambda _2\right) ^{-[w]} \left[ w\right] ^{\xi _2}L_{2}\left( w\right) \\ &{} &{} \, - (1+\lambda _1)^{-[v]}(1+\lambda _2)^{-[w]}\theta ^{\min ([v],[w])}L_3(v)L_4(w) v^{\xi _3}w^{\xi _4}\ell (v,w),\\ \end{array} \end{aligned}$$
(4)

as \(v,w \rightarrow +\infty\), for positive real constants \(\lambda _i>0\), \(i=1,2\), \(\theta >0\) such that \(\theta < \min \{1+\lambda _1, 1+\lambda _2\}\) and \(\theta >1-\lambda _1\lambda _2\), some real constants \(\xi _i\) and slowly varying functions \(L_i\), \(i=1,2,3,4\), and where \(\ell (v,w)\) is a positive bounded (say by \(\vartheta\)) function which converges to a positive constant L as \(v,w \rightarrow \infty\). The assumption that \(\ell (v,w)\) converges to L is made for simplicity; it would have no impact on the results if the limit L depended on whether \(v<w\), \(v=w\) or \(v>w\). By [x] we denote the greatest integer not greater than x.

Remark 2.1

The marginal tails of \(F_{V,W}\) are of the form:

$$\begin{aligned} 1-F_V(v) = [v]^{\xi _1}(1+\lambda _1)^{-[v]}L_1(v) \quad \text{ and } \quad 1-F_W(w) = [w]^{\xi _2}(1+\lambda _2)^{-[w]}L_2(w), \end{aligned}$$
(5)

for \(v, w \rightarrow +\infty\). Hence, both marginal dfs belong to Anderson’s class with

$$\begin{aligned} \lim _{v\rightarrow + \infty } \frac{1-F_V(v)}{1-F_V(v+1)}= 1+\lambda _1 \quad \text{ and } \quad \lim _{w\rightarrow + \infty } \frac{1-F_W(w)}{1-F_W(w+1)}= 1+\lambda _2 . \end{aligned}$$

From (4), we can derive the probability function (pf) of \((V, W)\). Because the proofs of the following propositions are technical, we defer them to the Appendix.

Proposition 2.1

The pf of the random vector \((V, W)\) with df (4) is given by

$$\begin{aligned}&P(V=v,W=w)&= (1+\lambda _{1})^{-v}(1+\lambda _2)^{-w}\theta ^{\min ([v],[w])-1}L_3(v)L_4(w)v^{\xi _3}w^{\xi _4}\ell (v,w)\ell ^*(v,w), \end{aligned}$$

for v, w large integers, where

$$\begin{aligned} \lim _{v,w\rightarrow + \infty }{\ell ^*}(v,w) = \left\{ \begin{array}{lc} \lambda _2\left( 1+ \lambda _1 -\theta \right) \ , &{} v<w, \\ \lambda _1 \lambda _2 +\theta -1\ , &{} w = v, \\ \lambda _1\left( 1+ \lambda _2 -\theta \right) \ , &{} w < v,\end{array}\right. \end{aligned}$$
(6)

and \(\ell (v,w)\ell ^*(v,w)\) is bounded and converges to positive constants.

Example 2.1

The Bivariate Geometric (BG) distribution is a particular case of the model (4) with margins (5). Consider the bivariate Bernoulli random vector \((B_1,B_2)\) with \(P(B_1=k,B_2=\ell )=p_{k \ell }\), \((k,\ell ) \in \{0,1\}^2,\) and success marginal probabilities \(p_{+1}=p_{01}+p_{11}\) and \(p_{1+}=p_{10}+p_{11}\). Following Mitov and Nadarajah (2005) and their construction of a BG random vector, the pf and the df of a random vector \((V, W)\) with BG distribution are given, respectively, by

$$\begin{aligned} f_{V, W}(v,w)=P(V= v,W= w) =\left\{ \begin{array}{lc}p_{00}^{v}p_{10}p_{+0}^{w-v-1}p_{+1}\ , &{} 0\le v< w, \\ \ \\ p_{00}^{v}p_{11}\ , &{} v = w,\\ \\ p_{00}^{w}p_{01}p_{0+}^{v-w-1}p_{1+}\ , &{} 0\le w < v,\end{array}\right. \end{aligned}$$
(7)

for \(v,w \in \mathbb {N}_0\), and

$$\begin{aligned} F_{V, W}(v,w)= & {} P(V\le v,W\le w)\nonumber \\= & {} 1-p_{0+}^{[v]+1}-p_{+0}^{[w]+1}+\left\{ \begin{array}{lc}p_{00}^{[v]+1}p_{+0}^{[w]-[v]}\ , &{} 0\le v \le w, \\ \ \\ p_{00}^{[w]+1}p_{0+}^{[v]-[w]}\ , &{} 0\le w < v,\end{array}\right. \end{aligned}$$
(8)

for \(v,w \in \mathbb {R}_0^{+}\), assuming that \(0< p_{0+}, p_{+0} <1\). Hence, this df satisfies (4) with the constants \(\lambda _1\), \(\lambda _2\) given by

$$\begin{aligned} 1 + \lambda _1 = \frac{1}{p_{0+}}> 1 \quad \text{ and } \quad 1 + \lambda _2 = \frac{1}{p_{+0}} > 1 \end{aligned}$$

and the index \(\theta\) associated to the dependence structure of \((B_1,B_2)\) is

$$\begin{aligned} \theta =\frac{p_{00}}{p_{0+}p_{+0}}. \end{aligned}$$

The slowly varying functions are constants and \(\xi _i=0\), for \(i=1,2,3,4\). The independence case occurs when \(\theta =1\); for dependence cases, we can have \(0< \theta < 1\) or \(\theta >1\). Finally, we note that \(\ell (v,w)\) is a constant. For instance, taking \(L_1(v)=L_3(v)= 1/(1+\lambda _1)\) and \(L_2(v)=L_4(v)= 1/(1+\lambda _2)\), we have \(\ell (v,w)=\theta\) with \(\ell ^*(v,w)\) as in (6).

The marginal dfs of V and W are obviously

$$\begin{aligned} P(V\le v)=1-p_{0+}^{[v]+1}\ \text{ and } \ P(W\le w)=1-p_{+0}^{[w]+1} \ , \ \text{ for } {v,w\ge 0}, \end{aligned}$$

which means that V and W are geometrically distributed rvs with parameters \(p_{1+}\) and \(p_{+1}\), respectively. \(\Box\)
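
For concreteness, one common construction consistent with the pf (7) runs iid bivariate Bernoulli trials \((B_1,B_2)\) and lets V and W count the trials before the first success of each component. The following Python sketch (the cell probabilities are illustrative and the function names are ours) checks the geometric marginals above and reports the dependence index \(\theta\).

```python
import numpy as np

rng = np.random.default_rng(1)

# cell probabilities of the bivariate Bernoulli trial (B1, B2); illustrative values
p11, p10, p01 = 0.2, 0.3, 0.1
p00 = 1.0 - p11 - p10 - p01
p0p, pp0 = p00 + p01, p00 + p10          # p_{0+} and p_{+0}

def sample_vw(rng):
    """One draw of (V, W): numbers of trials before the first success of B1 and of B2."""
    v = w = -1
    t = 0
    while v < 0 or w < 0:
        b1 = rng.random() < p10 + p11                      # P(B1 = 1) = p_{1+}
        if b1:
            b2 = rng.random() < p11 / (p10 + p11)          # P(B2 = 1 | B1 = 1)
        else:
            b2 = rng.random() < p01 / (p00 + p01)          # P(B2 = 1 | B1 = 0)
        if v < 0 and b1:
            v = t
        if w < 0 and b2:
            w = t
        t += 1
    return v, w

draws = np.array([sample_vw(rng) for _ in range(100_000)])
V, W = draws[:, 0], draws[:, 1]
# geometric marginals: P(V <= 2) = 1 - p_{0+}^3 and P(W <= 2) = 1 - p_{+0}^3
print(np.mean(V <= 2), 1 - p0p**3)
print(np.mean(W <= 2), 1 - pp0**3)
print("theta =", p00 / (p0p * pp0))       # dependence index of the df (4)
```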

In order to characterize the df of \((X,Y)=(\alpha \circ V, \beta \circ W)\), we start by establishing the relationship between the probability generating functions (pgfs) of \((V, W)\) and \((X, Y)\), defined e.g. for \((V, W)\) as

$$G_{V,W}(s_1,s_2):=\sum _{k_1=0}^{+ \infty }\sum _{k_2=0}^{+\infty } P(V=k_1,W=k_2)s_1^{k_1}s_2^{k_2},$$

which exists for \((s_1, s_2)\) in the following region \(\mathcal {R}\) (given in Lemma 2.1).

Taking into account Proposition 2.1, the series \(G_{V,W}( s_1 ,s_2 )\) obviously converges for any \(s_i\le 1\). Even for some \(s_i>1\) the series converges, because of the assumption (4): by this assumption, we have \(E(s_1^V)< + \infty\) if \(s_1< 1+\lambda _1\) and \(E(s_2^W)< + \infty\) if \(s_2< 1+\lambda _2\). The following lemma gives a condition such that the series \(G_{V,W}(s_1,s_2)\) exists.

Lemma 2.1

The pgf \(G_{V,W}(s_1 ,s_2)= E(s_1^V s_2^W)\) exists for \((s_1, s_2)\) in

$$\begin{aligned} \mathcal {R}= \left\{ (s_1, s_2)\in \mathbb {R}_+^2: \ s_1s_2< \frac{(1+\lambda _1)(1+\lambda _2)}{\theta }, \, s_1<1+\lambda _1, \, s_2<1+\lambda _2\right\} . \end{aligned}$$

Its more technical proof is also given in the appendix. As a consequence of this lemma, the pgf \(G_{V,W}(s_1 ,s_2)\) exists for \(s_1, s_2>1\), if \(s_i\le 1+\lambda _i\), \(i=1,2\), in the case \(\theta \le 1\), and if \(s_1\le 1+\lambda _1\) and \(s_2\theta \le 1+\lambda _2\) in the case \(\theta >1\). In the following, we use these convenient conditions for the convergence of \(G_{V,W}\).
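
For instance, for the BG distribution of Example 2.1 (our computation from the constants identified there), \((1+\lambda _1)(1+\lambda _2)/\theta =1/p_{00}\), so the region takes the simple form

$$\mathcal {R}= \left\{ (s_1, s_2)\in \mathbb {R}_+^2: \ s_1s_2< \frac{1}{p_{00}}, \ s_1<\frac{1}{p_{0+}}, \ s_2<\frac{1}{p_{+0}}\right\} .$$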

The relationship between the two pgfs is now the following; it holds as long as the pgfs exist. For our derivations it is convenient to use the given domain \(\mathcal {R}\) in the following. The proof of this relationship is also given in the appendix.

Proposition 2.2

The pgf of \((X,Y)=(\alpha \circ V, \beta \circ W)\) is given in terms of the pgf of (VW):

$$G_{X,Y}(s_1,s_2)=G_{V,W}(\alpha s_1 +1- \alpha ,\beta s_2+1-\beta ),$$

for all \((s_1,s_2)\) such that \((\alpha s_1 +1- \alpha ,\beta s_2+1-\beta ) \in \mathcal {R}\).
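
This relation reflects the conditional binomial structure, since \(E(s_1^{\alpha \circ V}\mid V)=(\alpha s_1+1-\alpha )^{V}\), and it is easy to check numerically. The following Monte Carlo sketch uses an independent geometric test pair \((V,W)\) purely for convenience (independence plays no role in the identity); all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000
# simple test pair (V, W): independent geometric rvs counting failures before the
# first success; independence is only a simplification for this numerical check
V = rng.geometric(0.4, n) - 1
W = rng.geometric(0.3, n) - 1
alpha, beta = 0.6, 0.5
X = rng.binomial(V, alpha)            # alpha∘V
Y = rng.binomial(W, beta)             # beta∘W
s1, s2 = 1.2, 1.1                     # values for which both expectations are finite here
lhs = np.mean(s1**X * s2**Y)                                        # G_{X,Y}(s1, s2)
rhs = np.mean((alpha*s1 + 1 - alpha)**V * (beta*s2 + 1 - beta)**W)  # G_{V,W}(alpha s1+1-alpha, ...)
print(lhs, rhs)   # the two Monte Carlo estimates should agree up to noise
```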

We want to derive an exact relationship between the two distributions \(F_{V,W}\) and \(F_{X,Y}\) with the help of a suitable transformation, such as a modified pgf or a Mellin transform. We define the (bivariate) modified pgf or tail generating function (Sagitov (2017))

$$Q_{V,W}(s_1,s_2)=\sum _{k_1=0}^{+ \infty } \sum _{k_2=0}^{+ \infty } \left( 1-F_{(V,W)}(k_1,k_2)\right) {s_1^{k_1}s_2^{k_2}},$$

and analogously for \((X, Y)\). The relationship between \(Q_{V,W}\) and \(G_{V,W}\) is given in the following proposition.

Proposition 2.3

For \((s_1,s_2) \in \mathcal {R}\), we have

$$\begin{aligned} (1-s_1)(1-s_2)Q_{V,W}(s_1,s_2)=1-G_{V,W}(s_1,s_2). \end{aligned}$$
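
A quick heuristic for this identity (the detailed proof is in the appendix): writing \(1-F_{V,W}(k_1,k_2)=P(V>k_1)+P(W>k_2)-P(V>k_1,W>k_2)\), using the univariate relation \((1-s)\sum _{k\ge 0}P(V>k)s^{k}=1-G_V(s)\) and \(\sum _{k_1<V}\sum _{k_2<W}s_1^{k_1}s_2^{k_2}=\frac{(1-s_1^{V})(1-s_2^{W})}{(1-s_1)(1-s_2)}\), one obtains

$$\begin{aligned} (1-s_1)(1-s_2)Q_{V,W}(s_1,s_2)&= \big (1-G_V(s_1)\big )+\big (1-G_W(s_2)\big )-E\big [(1-s_1^{V})(1-s_2^{W})\big ]\\ &=1-G_{V,W}(s_1,s_2). \end{aligned}$$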

Proposition 2.4

The modified pgfs of \((X, Y)\) and \((V, W)\) satisfy

$$Q_{X,Y}(s_1,s_2)=\alpha \beta Q_{V,W}(\alpha s_1+1-\alpha ,\beta s_2+1-\beta ),$$

if the series converge, i.e. \((\alpha s_1+1-\alpha ,\beta s_2+1-\beta ) \in \mathcal {R}\).

From Propositions 2.2 and 2.4, we can now derive the tail \(1- F_{X,Y}\) in terms of \(1- F_{V,W}\).

Proposition 2.5

The df \(F_{X,Y}\) is given in terms of the df \(F_{V,W}\), for \(x, y \in \mathbb {Z}^+\), by:

$$1-F_{X,Y}(x,y)=\sum _{k=x}^{+ \infty } \sum _{\ell =y}^{+ \infty } {{k}\atopwithdelims (){x}} {{\ell }\atopwithdelims (){y}}(1-\alpha )^{k-x}(1-\beta )^{\ell -y} \alpha ^{x+1}\beta ^{y+1}\left( 1-F_{V,W}(k,\ell )\right) .$$

Hence the tail of \(F_{X,Y}\) can be estimated by the assumption (4).
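
As a sanity check (again with an independent geometric test pair \((V,W)\), used only for convenience, and illustrative parameter values), the double sum of Proposition 2.5, truncated at a large index, can be compared with a direct Monte Carlo estimate of \(1-F_{X,Y}(x,y)=P(X>x \ \text{or}\ Y>y)\):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(3)
pv, pw = 0.4, 0.3                 # V, W independent geometric: P(V = v) = pv (1 - pv)^v
alpha, beta = 0.6, 0.5
x, y = 2, 3

def tail_vw(k, l):
    """1 - F_{V,W}(k, l) for this independent test pair."""
    return 1 - (1 - (1 - pv) ** (k + 1)) * (1 - (1 - pw) ** (l + 1))

K = 200                           # truncation point; the summands decay geometrically
rhs = sum(
    comb(k, x) * comb(l, y) * (1 - alpha) ** (k - x) * (1 - beta) ** (l - y)
    * alpha ** (x + 1) * beta ** (y + 1) * tail_vw(k, l)
    for k in range(x, K) for l in range(y, K)
)

n = 500_000
V = rng.geometric(pv, n) - 1      # numpy's geometric counts trials, so subtract 1
W = rng.geometric(pw, n) - 1
X, Y = rng.binomial(V, alpha), rng.binomial(W, beta)
lhs = np.mean((X > x) | (Y > y))  # Monte Carlo estimate of 1 - F_{X,Y}(x, y)
print(lhs, rhs)                   # should agree up to Monte Carlo error
```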

Proposition 2.6

If the joint df of \((V, W)\) satisfies (4), then for large integers x and y

$$\begin{aligned} \begin{aligned} 1-F_{X,Y}(x,y)&=\left( 1+\frac{\lambda _1}{\alpha }\right) ^{-x}x^{\xi _1}L_1^*\left( x\right) +\left( 1+\frac{\lambda _2}{\beta }\right) ^{-y}y^{\xi _2}L_2^*(y) -H(x,y)\end{aligned} \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} 0\le H(x,y)&\le \vartheta L_3^*(x)x^{\xi _3}\left( 1+\frac{\lambda _{1}}{\alpha }\right) ^ {-x}L_4^*(y)y^{\xi _4}\left( 1+\frac{\lambda _{2\theta }}{\beta }\right) ^{-y},\\ \end{aligned} \end{aligned}$$

where \(L_i^*\) are slowly varying functions, being

$$\begin{aligned} L_1^*(x)\sim \alpha \left( \frac{1+\lambda _1}{\lambda _1+\alpha }\right) ^{{\xi _{1}}+1}L_1(x), \;L_2^*(y)\sim \beta \left( \frac{1+\lambda _2}{\lambda _2+\beta }\right) ^{{\xi _{2}}+1}L_2(y), \end{aligned}$$
$$\begin{aligned} L_3^*(x)\sim \alpha \left( \frac{1+{\lambda _{1}}}{{\lambda _{1}}+\alpha }\right) ^{\xi _{3}+1} {L_3(x)}, \;L_4^*(y)\sim \beta \left( \frac{1+{\lambda _{2\theta }}}{{\lambda _{2\theta }}+\beta }\right) ^{{\xi _4}+1} {L_4(y)}, \end{aligned}$$

with

$$\begin{aligned} \lambda _{2\theta } = \left\{ \begin{array}{lc} \lambda _2\ , &{} \theta \le 1 \\ \\ \frac{1+\lambda _2}{\theta }-1\ , &{} \theta > 1,\end{array}\right. \end{aligned}$$
(9)

and \(\vartheta\) the bound of \(\ell (v,w)\).

Note that \(0<\lambda _{2\theta }\le \lambda _2\).
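
For instance, in the BG Example 2.1, where \(1+\lambda _2=1/p_{+0}\) and \(\theta =p_{00}/(p_{0+}p_{+0})\), (9) gives (our computation)

$$\lambda _{2\theta }= \left\{ \begin{array}{ll} \lambda _2=p_{+1}/p_{+0}\ , & \theta \le 1, \\ \dfrac{1+\lambda _2}{\theta }-1=\dfrac{p_{0+}}{p_{00}}-1=\dfrac{p_{01}}{p_{00}}\ , & \theta > 1. \end{array}\right.$$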

We observe that the stationary bivariate INMA model \(\left( X_j, Y_j\right)\) introduced in our work is an extension of the BINAR model of Pedeli and Karlis (2011) defined by

$$\left( \widetilde{X}_j, \widetilde{Y}_j\right) = \left( \alpha \circ \widetilde{X}_{j-1}+ R_{1j}, \beta \circ \widetilde{Y}_{j-1}+ R_{2j} \right)$$

with an iid innovation sequence \(\{ (R_{1j}, R_{2j})\}\). In their paper it is stated that it also has the representation

$$\begin{aligned} \left( \widetilde{X}_j, \widetilde{Y}_j\right) {\mathop {=}\limits ^{d}} \left( \sum _{i=0}^{+ \infty } \alpha ^{i}\circ R_{1,j-i}, \sum _{i=0}^{+\infty } \beta ^{i}\circ R_{2,j-i}\right) .\end{aligned}$$
(10)

Hence, considering \(\left( X_j, Y_j\right)\) with \(\alpha _i=\beta _i=0\) for \(i < 0\), \(\alpha _i=\alpha ^i\) and \(\beta _i=\beta ^i\) for \(i \ge 0\) we obtain \(\left( \widetilde{X}_j, \widetilde{Y}_j\right)\).
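
A small simulation can illustrate this embedding for one component: the INAR(1) recursion and the truncated moving-average representation (10) have the same stationary marginal distribution, so their empirical moments should agree. The sketch below is only an illustration; Poisson innovations and all numerical values are stand-ins, not the innovation model of this paper.

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, mu = 0.5, 1.0            # thinning parameter and innovation mean (illustrative)
n, burn, L = 50_000, 200, 40

# one component of the BINAR(1) recursion; Poisson innovations are only a stand-in
R = rng.poisson(mu, n + burn)
X = np.zeros(n + burn, dtype=np.int64)
for j in range(1, n + burn):
    X[j] = rng.binomial(X[j - 1], alpha) + R[j]
X = X[burn:]

# independent draws from the moving-average representation (10), truncated at lag L
Rma = rng.poisson(mu, (n, L))
Xma = sum(rng.binomial(Rma[:, i], alpha ** i) for i in range(L))

print(X.mean(), Xma.mean(), mu / (1 - alpha))   # stationary mean E(R)/(1 - alpha)
print(X.var(), Xma.var())                       # the stationary variances should also agree
```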

3 The Bivariate Stationary Sequence

We consider now the stationary bivariate INMA model \(\{\left( X_j, Y_j\right) \}\) with iid innovations \(\{(V_i,W_i)\}\) whose df satisfies (4). We establish first the tail behaviour of \(\left( X_j, Y_j\right)\). The maximal values of the \(\alpha _i\) and \(\beta _i\) are the most important, as in the univariate case. Therefore we write \(\alpha _{\max }=\max \{\alpha _i : i \in \mathbb {Z}\}\) and \(\beta _{\max }=\max \{\beta _i : i \in \mathbb {Z}\}\), and we assume that they are unique. It may happen in the bivariate case that \(\alpha _{\max }\) and \(\beta _{\max }\) occur at the same index or at different ones; we consider both cases. Furthermore, we use that

$$\begin{aligned} \displaystyle {\sum _{i=-\infty }^{+\infty }}\alpha _i< + \infty ,\,\displaystyle {\sum _{i=-\infty }^{+\infty }\beta _i <+ \infty }. \end{aligned}$$
(11)

which holds because of (3).

Suppose first that \(\alpha _{\max }\) and \(\beta _{\max }\) occur at different indices \(i_0\) and \(i_1\), respectively. We write for any j

$$\begin{aligned} X_j=\alpha _{\max } \circ V_{j-i_0}+\alpha _{i_1}\circ V_{j-i_1}+\sum _{i \ne i_0, i_1}\alpha _i \circ V_{j-i} \end{aligned}$$

and

$$\begin{aligned} Y_j=\beta _{\max } \circ W_{j-i_1}+\beta _{i_0}\circ W_{j-i_0}+\sum _{i \ne i_0, i_1}\beta _i \circ W_{j-i} \, . \end{aligned}$$

Denote \(S_1=\alpha _{\max }\circ V_{j-i_0}\), \(S_2=\alpha _{i_1} \circ V_{j-i_1}\), \(\displaystyle S_3=\sum _{i \ne i_0,i_1} \alpha _i \circ V_{j-i}\), \(S=S_2+S_3\), \(T_1=\beta _{\max } \circ W_{j-i_1}\), \(T_2=\beta _{i_0}\circ W_{j-i_0}\) and \(\displaystyle T_3=\sum _{i \ne i_0, i_1}\beta _i \circ W_{j-i}\), \(T=T_2+T_3\). Hence, \(X_j =S_1+S_2+S_3=S_1 + S\) and \(Y_j =T_1+T_2+T_3=T_1 + T\). Note that \(S,S_i, T\) and \(T_i\) depend on j.

For the proof of the main proposition of this section we need the following lemma.

Lemma 3.1

  1. a)

    If the rv V belongs to the Anderson’s class, then \(E(1+h)^V= 1+hE(V)(1+o_h(1)),\quad \mathrm{as} \;\; h\rightarrow 0^+.\)

  2. b)

    For any set I of integers with \(\alpha _I=\max \{\alpha _i, i \in I \}\), consider the rv \(Z=\sum _{i \in I} \alpha _i \circ V_{-i}.\) Then \(E(1+h)^Z\) is finite for any \(0< h <\frac{\lambda _1}{\alpha _I}\).

The proof of this lemma is given in the appendix. We deal now with the limiting behaviour of the tail of \((X_j, Y_j)\). Besides the univariate tail distributions, we derive only an appropriate positive upper bound \(H^*(x,y)\) for the joint tail, which is sufficient for the asymptotic limit distribution of the maxima. We will see that we get asymptotic independence of the components of the bivariate maxima \((M_n^{(1)}, M_n^{(2)})\), since this normalized \(H^*(x,y)\) vanishes and does not contribute to the limit.

For the asymptotic behaviour of the tail of the stationary distribution of the sequence \(\{(X_{j},Y_{j})\}\), we write simply \((X, Y)\) for any \((X_j,Y_j)\). As mentioned, we deal with the two cases that \(\alpha _{\max }\) and \(\beta _{\max }\) occur at different indices or at the same one. We start with the first case and the above defined \(S, S_i, T, T_i\).

For this derivation, we use \(\psi , \rho \in (0,1)\) and \(\lambda >0\) such that \(\frac{\lambda _1}{\alpha _{\max }}<\lambda < \frac{\lambda _1}{\alpha ^*}\), with \(\alpha ^*=\max \{ \alpha _i, i\ne i_0\}\), and \(\lambda _{2\theta }\) given in (9),

$$\begin{aligned} 1+\frac{\lambda _1}{\alpha _{\max }}< (1+\lambda )^\psi< 1+\lambda < 1+\frac{\lambda _1}{\alpha ^*} \end{aligned}$$
(12)

and

$$\begin{aligned} \rho < B= \log \left( 1+ \frac{\lambda _{2}}{\beta _{\max }}\right) / \log \left( 1+ \frac{\lambda _{2\theta }}{\beta _{i_0}}\right) . \end{aligned}$$
(13)

Proposition 3.1

If \(\left( V, W\right)\) satisfies (4) and \(\alpha _{\max }\) and \(\beta _{\max }\) are unique and taken at different indexes, then

  1. (i)

    for the marginal dfs

    $$\begin{aligned} 1- F_{X}(x) \sim x^{\xi _1}\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-\left[ x\right] }L_1^{**}(x),\,x \rightarrow +\infty , \end{aligned}$$

    and

    $$\begin{aligned} 1- F_{Y}(y) \sim y^{\xi _2}\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-\left[ y\right] }L_2^{**}(y),\,y \rightarrow +\infty , \end{aligned}$$
  2. (ii)

    for the joint df with \(\psi , \rho , \lambda\) satisfying (12) and (13)

$$\begin{aligned} \begin{aligned} 1-F_{X,Y}(x,y)&=x^{\xi _1}\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-\left[ x\right] }L_1^{**}(x)\left( 1+o_{x}(1)\right) \\&+y^{\xi _2} \left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-[y]} L_2^{**}(y)(1+o_y(1)) -H^*(x,y), \\ \end{aligned} \end{aligned}$$

as \(\,x,y \rightarrow +\infty\), where

$$\begin{aligned} L_1^{**}(x)=L_1^*(x) E\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{S}, \quad L_2^{**}(y)=L_2^*(y) E\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{T} \end{aligned}$$
(14)

and

$$\begin{aligned} \begin{aligned} 0\,\le \, H^*(x,y) \le&o_y(1) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- x}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-\rho y}x^{\xi _3}L_3^{*}(x)+Cx^{\xi _1+1} y^{\xi _2}L^*_1(x) L^*_2(y) \times \\ {}&\times \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-(1-\psi )x} \left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{-y+ (\log y)^2} +O_x\left( P(S>\psi x) \right) ,\\ \end{aligned} \end{aligned}$$
(15)

for some constant \(C>0\).

We show also that \(P(S>\psi x)=o_x(P(S_1 > x))\).

Proof

In fact

$$\begin{aligned} 1- F_{\left( X, Y\right) }(x,y)=1- F_{X}(x) + 1- F_{Y}(y)-P(X>x,Y>y).\end{aligned}$$
(16)

We deal with the three terms in (16), separately.

  1. (i)

    Since \(\frac{\lambda _1}{\alpha _{\max }} < \frac{\lambda _1}{\alpha ^*}\), taking the sum \(S=Z\) in Lemma 3.1, we conclude that \(E\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{S}\) is finite. Similarly \(E\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{T}\) is finite since \(\frac{\lambda _2}{\beta _{\max }} < \frac{\lambda _2}{\beta ^*}\) with \(\beta ^*=\max \{\beta _i, {i\ne i_1}\}\). The tail function of X is given, with \(\psi _x=[\psi x]\), by

    $$\begin{aligned} \begin{array}{lll} 1- F_{X}(x) &{} = &{} P\left( S_1 + S> x\right) =\displaystyle {\sum _{k=0}^{+ \infty }P(S_1> x-k)P(S=k)}\\ &{} &{} \\ &{} = &{} P\left( S_1> x\right) \displaystyle { \sum _{k=0}^{\psi _x}} \,\,\frac{P(S_1> x-k)}{P(S_1>x)}P(S=k)+ \\ &{}&{} \\ &{} &{} \quad +\displaystyle { \sum _{k=\psi _x+1}^{+ \infty } P(S_1> x-k)P(S=k)}.\\ \end{array} \end{aligned}$$
    (17)

    For the first sum of (17), we get by applying Proposition 2.6 with \(\alpha =\alpha _{\max }\) for the marginal distribution

    $$\begin{aligned} \begin{aligned}&\sum _{k=0}^{\psi _x} \frac{P(S_1> x-k)}{P(S_1>x)}P(S=k) = \sum _{k=0}^{\psi _x}\left( 1+ \frac{\lambda _1}{\alpha _{\max }}\right) ^k (1+o_x(1))P(S=k)\\&\quad \rightarrow \displaystyle {\sum _{k=0}^{+ \infty }}\left( 1+ \frac{\lambda _1}{\alpha _{\max }}\right) ^k P(S=k)\\&\quad = E\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{S}, \, x \rightarrow +\infty ,\\ \end{aligned} \end{aligned}$$

    by dominated convergence. For the second sum in (17), we get for x large

    $$\begin{aligned} \begin{aligned}&\sum _{k=\psi _x+1}^{+ \infty } P(S_1> x-k)P(S=k) \le P(S>\psi _x)\\&= P\left( (1+\lambda )^{S} > (1+ \lambda )^{\psi _x}\right) \le \frac{E\left( 1+\lambda \right) ^{S}}{(1+ \lambda )^{\psi _x}},\\ \end{aligned} \end{aligned}$$
    (18)

    using the Markov inequality, since \(E\left( 1+\lambda \right) ^{S}\) is finite for \(\lambda <\lambda _1/\alpha ^*\). Since \((1+\lambda )^\psi > 1+\frac{\lambda _1}{\alpha _{\max }},\) we get by Theorem 4 of Hall (2003)

    $$\begin{aligned} \begin{aligned} \frac{(1+ \lambda )^{-\psi _x}}{P(S_1>x)} \rightarrow 0,\, x \rightarrow + \infty ,\end{aligned} \end{aligned}$$
    (19)

    and thus together

    $$\begin{aligned} \begin{aligned} 1- F_{X}(x)&= P(S_1>x)\left[ E\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{S} + O_x\left( \frac{(1+\lambda )^{-\psi _x}}{P(S_1>x)}\right) \right] \\&= P(S_1>x)E\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{S}(1+o_x(1)).\\ \end{aligned} \end{aligned}$$

    With the same arguments we characterize the tail \(1- F_{Y}\). Hence, the statements on the marginal dfs are shown.

  2. (ii)

    Now we deal with the third term in (16). Note that \((S_1,T_2)\), \((S_2,T_1)\) and \((S_3, T_3)\) in the representation of X and Y are independent. For any \(\psi \in (0,1)\) and \(\lambda >0\) satisfying (12), we use that (18) and (19) imply

    $$\begin{aligned} \begin{aligned}&P(S_2 + S_3>\psi x) = P\left( S > \psi x\right) = O_x((1+ \lambda )^{-\psi x})\ \end{aligned} \end{aligned}$$
    (20)

    and

    $$\begin{aligned} \begin{aligned}&P(S> \psi x) = o_x (P\left( S_1 > x\right) ).\ \end{aligned} \end{aligned}$$
    (21)

The probability in the third term of (16) is split, in order to simplify the proof, into four summands, with \(\psi <1\) satisfying (12), \(\psi _x=[\psi x]\) and \(\delta _y=[y-(\log y)^2]\). We get, for x and y large,

    $$\begin{aligned} \begin{aligned}&P(X>x,\ Y>y)=P(S_1+S_2+S_3> x,\ T_1+T_2+T_3>y)\\&= \sum _{k=0}^{\psi _x}\sum _{\ell =0}^{ \delta _y} P(S_1> x-k, T_2> y-\ell ) P(S_2+S_3=k,T_1+ T_3=\ell )+ \\&{+\sum _{k=0}^{\psi _x}\sum _{\ell =\delta _y+1}^{+ \infty } P(S_1> x-k, T_2> y-\ell ) P(S_2+S_3=k,T_1+ T_3=\ell ) +} \\&{+ \sum _{k=\psi _x+1}^{+ \infty }\sum _{\ell =0}^{ \delta _y} P(S_1> x-k, T_2> y-\ell ) P(S_2+S_3=k,T_1+ T_3=\ell ) +}\\&{+ \sum _{k=\psi _x+1}^{+ \infty }\sum _{\ell =\delta _y+1}^{+ \infty } P(S_1> x-k, T_2 > y-\ell ) P(S_2+S_3=k,T_1 +T_3=\ell )}\\&=: \sum _{m=1}^4 S_m(\psi _x,\delta _y)\\ \end{aligned} \end{aligned}$$
    (22)

The last sum \(S_4(\psi _x,\delta _y)\) is bounded by \(P(S_2+S_3> \psi _x, T_1+T_3>\delta _y)\le P(S_2+S_3 > \psi _x)=O_x((1+\lambda )^{-\psi _x})\) by (20). For the first sum \(S_1(\psi _x,\delta _y)\) of (22) we use Proposition 2.6 and obtain, with \(\rho <1\) such that (13) holds,

    $$\begin{aligned} \begin{aligned}&\sum _{k=0}^{\psi _x}\sum _{\ell =0}^{\delta _y} P(S_1> x-k, T_2 > y-\ell ) P(S_2+S_3=k,T_1 +T_3=\ell )\\&\le \vartheta \sum _{k=0}^{\psi _x}\sum _{\ell =0}^{\delta _y} \left( [x]-k\right) ^{\xi _{3}}\left( [y]-\ell \right) ^{\xi _{4}}\left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- ([x]- k)}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-([y]-\ell )}\times \\& \times L_3^{*}([x]-k) \ L_4^{*}( [y]- \ell ) P\left( S_2+S_3=k, T_1+ T_3=\ell \right) \\&\le \vartheta \sum _{k=0}^{\psi _x}\sum _{\ell =0}^{\delta _y} \left( [x]-k\right) ^{\xi _{3}}\left( [y]-\ell \right) ^{\xi _{4}}\left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- ([x]- k)}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-((1-\rho )+\rho )([y]-\ell )}\times \\& \times L_3^{*}([x]-k) L_4^{ *}( [y]- \ell ) P\left( S_2+S_3=k, T_1+ T_3=\ell \right) . \end{aligned} \end{aligned}$$

Note that \(\left( [y]-\ell \right) ^{\xi _{4}}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-(1-\rho )([y]-\ell )} L_4^{ *}([y]- \ell )=o_y(1)\) uniformly for \(\ell \le \delta _y\), i.e. \(y-\ell > (\log y)^2\rightarrow \infty\). Hence the sum is bounded above by

    $$\begin{aligned} \begin{aligned}&o_y(1) x^{\xi _{3}}L_3^{ *}(x) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- x}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-\rho y}\times \\&\times \sum _{k=0}^{\psi _x}\sum _{\ell =0}^{\delta _y} \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{k} \left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{\rho \ell } P\left( S_2+S_3=k, T_1+ T_3=\ell \right) \end{aligned} \end{aligned}$$
    $$\begin{aligned} \begin{aligned}&\le o_y(1) x^{\xi _{3}}L_3^{ *}(x) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- x}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-\rho y} \times \\& \times E\left( \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{ (S_2+ S_3)}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{\rho (T_1 + T_3)}\right) \\&\le o_y(1) x^{\xi _{3}}L_3^{*}(x) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- x}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-\rho y}, \\ \end{aligned} \end{aligned}$$

since the last pgf exists due to Lemma 3.1 and (13). Note that

    $$\begin{aligned}&E\left( \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{ (S_2+ S_3)}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{\rho (T_1 + T_3)}\right) \\&= E\left( \prod _{i\neq i_0}(1+\frac{\lambda _{1}}{\alpha _{\max }})^{\alpha _i \circ V_{-i}}\left( (1+\frac{\lambda _{2\theta }}{\beta _{i_0}})^{\rho }\right) ^{\beta _i \circ W_{-i}}\right) \\&= \prod _{i\neq i_0} E\left( \left(1+ \frac{\alpha _i \lambda _{1}}{\alpha _{\max }}\right)^ {V_{-i}} \left(1+\beta _i([1+\frac{\lambda _{2\theta }}{\beta _{i_0}}]^\rho -1)\right)^ {W_{-i}}\right) . \end{aligned}$$

    The expectations exist by assumption (4) since \(1 + \frac{\alpha _i\lambda _{1}}{\alpha _{\max }} < 1 + \lambda _1\), and also \(1+\beta _i([1+\frac{\lambda _{2\theta }}{\beta _{i_0}}]^\rho -1) \le 1+\beta _{\max }([1+\frac{\lambda _{2\theta }}{\beta _{i_0}}]^\rho -1) < 1 + \lambda _2\), for all i, by the choice of \(\rho\) in (13), by using the arguments of Lemma 3.1. We consider now the approximation of the second sum \(S_2(\psi _x,\delta _y)\) in (22). We have with some positive constant C

    $$\begin{aligned} S_2(\psi _x,\delta _y)\le & {} \displaystyle { \sum _{k=0}^{\psi _x}\sum _{\ell =\delta _y+1}^{+ \infty }} P(S_1> x-k) P(T_1+ T_3=\ell )\nonumber \\\le & {} C x^{\xi _{1}+1}L_1^{*}(x) \left( \displaystyle 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- (1-\psi ) x} P(T_1+ T_3>\delta _y).\end{aligned}$$
    (23)

    By the arguments used to approximate \(P(X>x)=P(S_1+S_2+S_3>x)\) in (i), we also obtain

    $$P(T_1+ T_3>\delta _y) \sim C y^{\xi _{2}} \ L_2^{*}( y)E\left( 1+ \frac{\lambda _2}{\beta _{\max }}\right) ^{T_3}\left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{-\delta _y},\, y \rightarrow +\infty ,$$

with some generic constant C. Hence, together with (23), this implies

    $$\begin{aligned} \begin{aligned}&\sum _{k=0}^{\psi _x}\sum _{\ell =\delta _y+1}^{+ \infty } P(S_1> x-k, T_2 > y-\ell ) P(S_2+S_3=k,T_1+ T_3=\ell ) \\&\le C x^{\xi _{1}+1}y^{\xi _{2}}L_1^{*}(x) \ L_2^{*}( y) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{-(1-\psi ) x}\left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{-y+ (\log y)^2},\, y \rightarrow + \infty .\\ \end{aligned} \end{aligned}$$

    For the third sum \(S_3(\psi _x,\delta _y)\) in (22), we get analogously to the derivation of the second sum

    $$\begin{aligned} \begin{aligned} S_3(\psi _x,\delta _y)&\le \delta _y P(T_2> y-\delta _y) P(S_2+S_3> \psi _x)\\&\le y (\log y)^{2\xi _2} L_2^*((\log y)^2)(1+\frac{\lambda _2}{\beta _{i_0}})^{-(\log y)^2}P(S_2+S_3> \psi _x) \\&=o_y(1)P(S_2+S_3> \psi _x) = o_x(P(S>\psi x)). \\ \end{aligned} \end{aligned}$$

Combining now the bounds of the four terms \(S_i(\psi _x,\delta _y)\), we get the upper bound for \(H^*(x,y)\) which shows our statement. \(\square\)

Suppose now that the unique \(\alpha _{\max }\) and \(\beta _{\max }\) are taken at the same index, \(i_0\) say. Write for any j

$$\begin{aligned} X_j=\alpha _{\max } \circ V_{j-i_0}+\sum _{i \ne i_0}\alpha _i \circ V_{j-i} \end{aligned}$$

and

$$\begin{aligned} Y_j=\beta _{\max } \circ W_{j-i_0}+\sum _{i \ne i_0}\beta _i \circ W_{j-i}. \end{aligned}$$

Denote \(S_1=\alpha _{\max }\circ V_{j-i_0}\), \(\displaystyle S=\sum _{i \ne i_0} \alpha _i \circ V_{j-i}\), \(T_1=\beta _{\max } \circ W_{j-i_0}\), and \(\displaystyle T=\sum _{i \ne i_0}\beta _i \circ W_{j-i}\), as used for Proposition 3.1. Observe that \((S_1,T_1)\) and \((S,T)\) are independent. Then the corresponding statement of Proposition 3.1 holds for this case (setting \(\beta _{i_0}=\beta _{\max }\)), as given in Proposition 3.2. We omit the proof since it is very similar to the one given above, with a few obvious changes.

Proposition 3.2

If \(\left( V, W\right)\) satisfies (4) and \(\alpha _{\max }\) and \(\beta _{\max }\) are unique, occurring at the same index, then the stationary distribution satisfies

$$\begin{aligned} \begin{aligned} 1- F_{X, Y}(x,y)&= x^{\xi _1}\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-\left[ x\right] }L_1^{**}(x)\left( 1+o_{x}(1)\right) \\&+y^{\xi _2}\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-\left[ y\right] }L_2^{**}(y)\left( 1+o_{y}(1)\right) \\&- H^*(x,y), \end{aligned} \end{aligned}$$

as \(x,y \,\rightarrow +\infty\), where

$$\begin{aligned} L_1^{**}(x)=L_1^*(x) E\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{S}, \quad L_2^{**}(y)=L_2^*(y) E\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{T} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} 0\,\le \, H^*(x,y)&\le o_y(1) x^{\xi _3} L_3^{*} \left( x\right) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{-x} \left( 1+\frac{\lambda _{2\theta }}{\beta _{\max }}\right) ^{- y}+ \\ {}& + C y^{\xi _2} L_2^{*}(y) x^{\xi _1+1}L_1^*(x) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{-(1-\psi )x} \left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{-y+ (\log y)^2} \\& + O_x\left( P\left( S> \psi x\right) \right) , \end{aligned} \end{aligned}$$

for some constant \(C>0\) and \(\psi \in (0,1)\) satisfying (12).

Now we investigate the limiting behaviour of the bivariate maxima, in the case of an iid sequence \(\{(X_j, Y_j)\}\).

Theorem 3.1

Let \(\left( V, W\right)\) be such that (4) holds and \(\alpha _{\max }\) and \(\beta _{\max }\) are unique, occurring either at the same index or at different ones. Let

$$d_1=1/\log (1+\frac{\lambda _1}{\alpha _{\max }}), \quad d_2= 1/\log (1+\frac{\lambda _2}{\beta _{\max }}).$$

Define the normalizations

$$\begin{aligned} u_n(x) = x+d_1[\log n + \xi _1\log \log n +\log L_1^{**}(\log n) +\xi _1\log d_1] \end{aligned}$$
(24)

and

$$\begin{aligned} v_n(y)=y+d_2[\log n + \xi _2\log \log n +\log L_2^{**}(\log n) +\xi _2\log d_2]. \end{aligned}$$
(25)

Then, for x, y real,

$$\begin{aligned}&\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-x } + \,\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-y }\le \liminf _{n\rightarrow \infty }\, n(1- F_{\left( X, Y\right) })(u_n(x),v_n(y))\\ \\ \le&\limsup _{n\rightarrow \infty } \, n(1- F_{\left( X, Y\right) })(u_n(x),v_n(y))\le \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-x +1} + \,\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-y +1}. \end{aligned}$$

Proof

The convergence for the marginal distributions holds by applying Proposition 3.1 or 3.2 with the chosen normalization sequences. Since \(u_n(x)\) and \(v_n(y)\) are similar in type, we only show the derivation of the first marginal. Because the normalization \(u_n(x)\) is not always an integer, we have to consider \(\limsup\) and \(\liminf\). Let us deal with the \(\limsup\) case. Note that

$$\begin{aligned} \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-d_1\log n}=\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-d_2 \log n}=\frac{1}{n} \end{aligned}$$

and

$$\begin{aligned} \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-d_1( \xi _1 \log \log n + \log L_1^{**}(\log n) + \xi _1\log d_1)} = \frac{(d_1\log n)^{-\xi _1} }{L_1^{**}(\log n)}. \end{aligned}$$

For the normalization we get

$$\begin{aligned} \begin{aligned}&[u_n(x)] \ge x-1 + d_1(\log n + \xi _1 \log \log n + \log L_1^{**}(\log n)+\xi _1\log d_1) \sim d_1\log n. \end{aligned} \end{aligned}$$

So

$$\begin{aligned} \begin{aligned}&n\times [u_n(x)]^{\xi _1}\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-[u_n(x)]}L_1^{**}([u_n(x)]) \\&\qquad \lesssim n \times (d_1\log n)^{\xi _1}\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-x+1-d_1(\log n + \xi _1 \log \log n + \log L_1^{**}(\log n)+\xi _1 \log d_1)}L_1^{**}(\log n)\\&\qquad = \left( 1+\lambda _1/\alpha _{\max }\right) ^{-(x-1)}. \end{aligned} \end{aligned}$$

The derivation of the \(\liminf\) is similar using \([u_n(x)]\le u_n(x)\).

Now for the joint distribution we use the bounds on \(H^*(u_n,v_n)\) of the two propositions. First we consider the case of Proposition 3.1, with \(\alpha _{\max }\) and \(\beta _{\max }\) at different indices. We have to derive the limits of the three boundary terms of \(H^*(u_n,v_n)\) given in Proposition 3.1, multiplied by n. The last of these terms tends to 0 because (21) holds and because, from (14), we get

$$\begin{aligned} \begin{aligned}&n P(S_1 > u_n(x)) = n\times (u_n(x))^{\xi _1}\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-u_n(x)}L_1^{*}(u_n(x))\\&\qquad \sim \left( 1+\lambda _1/\alpha _{\max }\right) ^{-x} L_1^{*}(\log n)/ L_1^{**}(\log n)\\&\qquad \sim \left( 1+\lambda _1/\alpha _{\max }\right) ^{-x} / E\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{S}, \end{aligned} \end{aligned}$$

which is bounded.

The first of the three boundary terms of \(H^*(u_n,v_n)\) is smaller than

$$\begin{aligned}\begin{aligned}&n o_n(1) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{-u_n(x)}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-\rho v_n(y)}(u_n(x))^{\xi _3}L_3^{*}(u_n(x))\\ &= n o_n(1) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{-d_1\log n + o(\log n)}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0}}\right) ^{-\rho d_2 \log n + o (\log n)}(d_1 \log n)^{\xi _3}L_3^{*}(\log n)\\ &= o_n(1)( \log n)^{\xi _3} L^*_3(\log n) \exp \left( -\rho d_2\log n \log \left( 1+ \frac{\lambda _{2\theta }}{\beta _{i_0 }}\right) + o(\log n) \right) \\ &= o_n(1)( \log n)^{\xi _3} L^*_3(\log n) \exp \left( -(\rho /B) \log n (1+ o_n(1)) \right) \\&=o_n(1), \end{aligned} \end{aligned}$$

because \(\rho /B>0\) with B given by (13).

The second boundary term of \(H^*(u_n,v_n)\) is smaller than

$$\begin{aligned}&n C_1(d_1\log n)^{\xi _1+1} (d_2\log n)^{\xi _2}L^*_1(\log n) L^*_2(\log n)\\& \times \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-(1-\psi )d_1\log n + o(\log n)} \left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{- d_2\log n +(\log (d_2 \log n))^2}\\\le & {} C_1 (\log n)^{\xi _1+\xi _2+1}L^*_1(\log n) L^*_2(\log n) \exp \left( \log n -(1-\psi )\log n-\log n +o(\log n)\right) \\= & {} o_n(1), \end{aligned}$$

since \(1-\psi >0\) and where \(C_1\) represents a generic positive constant.

Thus the limiting distribution is proved in case of Proposition 3.1.

Now let us consider the changes of the proof for the case of Proposition 3.2. Again we have to deal with the three boundary terms of \(H^*(u_n,v_n)\) where the last two are as in Proposition 3.1. In the first of these terms we have similarly

$$\begin{aligned}\begin{aligned}&n o_n(1) (u_n(x))^{\xi _3}L_3^{*}(u_n(x)) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- u_n(x)}\left( 1+\frac{\lambda _{2\theta }}{\beta _{\max }}\right) ^{- v_n(y)}\\&= o_n(1)( \log n)^{\xi _3} L^*_3(\log n) \exp \left( - d_2\log n \log \left( 1+ \frac{\lambda _{2\theta }}{\beta _{\max }} \right) +o(\log n)\right) \\&=o_n(1), \end{aligned} \end{aligned}$$

since \(d_2\log \left( 1+ \frac{\lambda _{2\theta }}{\beta _{\max }} \right) >0\). Thus the statements are shown. \(\square\)
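
Before moving to the stationary case, the bounds of Theorem 3.1 can be illustrated by a small Monte Carlo experiment in the simplest admissible special case: a single nonzero coefficient pair \(\alpha _0=\alpha _{\max }\), \(\beta _0=\beta _{\max }\) (so that \(S=T=0\)) and BG innovations as in Example 2.1, for which \(\xi _1=\xi _2=0\) and, combining Proposition 2.6 with the constants of Example 2.1, \(L_1^{**}=\alpha _{\max }/(\lambda _1+\alpha _{\max })\) and \(L_2^{**}=\beta _{\max }/(\lambda _2+\beta _{\max })\). The Python sketch below (our own illustration; all numerical values are arbitrary) estimates \(P(M_n^{(1)}\le u_n(0), M_n^{(2)}\le v_n(0))\) for n iid copies of (X, Y) and compares it with \(\exp (-(1+\lambda _1/\alpha _{\max })^{-x}-(1+\lambda _2/\beta _{\max })^{-y})\) and \(\exp (-(1+\lambda _1/\alpha _{\max })^{-(x-1)}-(1+\lambda _2/\beta _{\max })^{-(y-1)})\) at \(x=y=0\).

```python
import numpy as np

rng = np.random.default_rng(5)

# BG innovations as in Example 2.1; all cell probabilities are illustrative
p00, p10, p01, p11 = 0.4, 0.3, 0.1, 0.2
p0p, pp0 = p00 + p01, p00 + p10                  # p_{0+}, p_{+0}
lam1, lam2 = 1 / p0p - 1, 1 / pp0 - 1
alpha, beta = 0.6, 0.5                           # alpha_0 = alpha_max, beta_0 = beta_max

def sample_bg(size, rng):
    """Vectorized draws of (V, W), equivalent to the pf (7)."""
    m = rng.geometric(1 - p00, size) - 1         # initial run of (0,0) trials
    u = rng.random(size)
    case10 = u < p10 / (1 - p00)                 # B1 succeeds strictly first
    case01 = (u >= p10 / (1 - p00)) & (u < (p10 + p01) / (1 - p00))
    g2 = rng.geometric(1 - pp0, size) - 1        # extra wait of W when B1 was first
    g1 = rng.geometric(1 - p0p, size) - 1        # extra wait of V when B2 was first
    v = np.where(case01, m + 1 + g1, m)
    w = np.where(case10, m + 1 + g2, m)
    return v, w

# normalizations (24)-(25): here xi_1 = xi_2 = 0, S = T = 0, and
# L_1** = alpha/(lam1 + alpha), L_2** = beta/(lam2 + beta)
n = 10_000
d1, d2 = 1 / np.log(1 + lam1 / alpha), 1 / np.log(1 + lam2 / beta)
u_n = 0 + d1 * (np.log(n) + np.log(alpha / (lam1 + alpha)))      # u_n(x) with x = 0
v_n = 0 + d2 * (np.log(n) + np.log(beta / (lam2 + beta)))        # v_n(y) with y = 0

hits, N = 0, 1000
for _ in range(N):
    V, W = sample_bg(n, rng)
    X, Y = rng.binomial(V, alpha), rng.binomial(W, beta)         # n iid copies of (X, Y)
    hits += (X.max() <= u_n) and (Y.max() <= v_n)

low = np.exp(-(1 + lam1 / alpha) - (1 + lam2 / beta))   # bound with x - 1 = y - 1 = -1
high = np.exp(-1.0 - 1.0)                               # bound with x = y = 0
print(low, hits / N, high)   # the empirical probability should lie between the two bounds
```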

4 Main result

We consider now the stationary sequence \(\{(X_{j},Y_{j})\}\). From extreme value theory it is known that the behaviour of its extremes is as in the case of an iid sequence \(\{(X_{j},Y_{j})\}\) if the following two conditions hold: a mixing condition, called \(D(u_n,v_n)\), and a local dependence condition, called \(D'(u_n,v_n)\). In our bivariate extreme value case we consider the conditions \(D(u_n,v_n)\) and \(D'(u_n,v_n)\) of Hüsler (1990) (see also Hsing (1989) and Falk et al. (1990)). The condition \(D(u_n,v_n)\) is a long-range mixing condition for extremes and means that extreme values occurring in intervals of positive integers that are widely separated (by \(\ell _n\)) are asymptotically independent. The condition \(D'(u_n,v_n)\) concerns the local dependence of extremes and asymptotically excludes the occurrence of local clusters of extreme or large values in each individual margin of \(\{(X_{j},Y_{j})\}\), as well as jointly in the two components. We write \(u_n, v_n\) for short, because \(x, y\) do not play a role in the following proofs.

Definition 4.1

The sequence \(\{(X_{j},Y_{j})\}\) satisfies the condition \(D(u_{n},v_n)\) if for any integers \(1\le i_{1}<...<i_{p}<j_{1}<...<j_{q}\le n,\) for which \(j_{1}-i_{p}>\ell _{n},\) we have

$$\begin{aligned}& \displaystyle { \big | P\big (\bigcap _{s=1}^{p}\{X_{i_{s}}\le u_{n},Y_{i_{s}}\le v_{n}\}, \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n\} \big )} \\&\displaystyle { -P\big (\bigcap _{s=1}^{p}\{X_{i_{s}}\le u_{n},Y_{i_{s}}\le v_{n}\}\big ) P\big (\bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n\} \big )} \big | \le \alpha _{n,{\ell _{n}}},\\ \end{aligned}$$

for some \(\alpha _{n,{\ell _{n}}}\) with \(\displaystyle {\lim _{n\rightarrow +\infty } \alpha _{n,{\ell _{n}}} =0}\), for some integer sequence \(\ell _{n}=o(n)\).

We use the following \(D'(u_n,v_n)\) condition.

Definition 4.2

Let \(\{s_n\}\) be a sequence of positive integers such that \(s_n \rightarrow + \infty\). The sequence \(\{(X_{j},Y_{j})\}\) satisfies the condition \(D'(u_n,v_n)\) if

$$\begin{aligned}& n \displaystyle {\sum _{j=1}^{[n/s_n]}\biggl \{P\left( X_{0}>u_n,X_{j}>u_n\right) + P\left( X_{0}>u_n,Y_{j}>v_n\right) }\\&+P\left( Y_{0}>v_n,Y_{j}>v_n\right) +P\left( Y_{0}>v_n,X_{j}>u_n\right) \biggr \} \rightarrow 0, \, n \rightarrow +\infty . \end{aligned}$$

In the following we use the sequences \(\{s_n\}, \{\ell _n\}\) and \(\alpha _{n,\ell _{n}}\) such that

$$\begin{aligned} \displaystyle {\lim _{n\rightarrow + \infty }s_{n}^{-1}}= \displaystyle {\lim _{n\rightarrow + \infty }\frac{s_{n}\ell _{n}}{n}=} \displaystyle {\lim _{n\rightarrow + \infty }s_{n}\alpha _{n,\ell _{n}}=0}.\end{aligned}$$
(26)

Such a sequence \(\{s_n\}\) satisfying (26) always exists: take, e.g., for the given \(\ell _n\) and \(\alpha _{n,\ell _n}\) in condition \(D(u_n,v_n)\), the sequence \(s_n=\min (\sqrt{n/\ell _n}, 1/\sqrt{\alpha _{n,\ell _n}})\rightarrow +\infty\). In our proof we use simpler sequences.

Write \(M_{n}^{(1)}= \max \{X_1,\cdots ,X_{n}\}\) and \(M_{n}^{(2)}= \max \{Y_1,\cdots ,Y_{n}\}\). For the stationary sequence \(\{(X_{j},Y_{j})\}\) satisfying \(D(u_{n},v_n)\) and \(D'(u_{n},v_n)\), the limiting behaviour of the bivariate maxima \(\left( M_{n}^{(1)}, M_{n}^{(2)}\right)\), under linear normalization, is given in Theorem 3.1, as if the sequence \(\{(X_{j},Y_{j})\}\) would be a sequence of independent \((X_j,Y_j)\).

In Theorem 3.1 we derived upper and lower bounds for the limiting distribution of the maximum term of non-negative integer-valued moving average sequences, which leads to a “quasi max-stable” limiting behaviour of the bivariate maximum in the sense of Anderson’s type. So the main result on the maximum of this bivariate discrete random sequence is the following.

Theorem 4.1

Consider the stationary sequences \(\{(X_{j},Y_{j})\}\) defined by

$$\begin{aligned} (X_j, Y_j) = \left( \sum _{i=-\infty }^{+ \infty } \alpha _{i}\circ V_{j-i}, \sum _{i=-\infty }^{+ \infty } \beta _{i}\circ W_{j-i}\right) . \end{aligned}$$

Suppose that the innovation sequence \(\{(V_i,W_i)\}\) is an iid sequence of non-negative integer-valued random vectors with df of the form (4), the sequences of \(\{\alpha _i\}\) and \(\{\beta _i\}\) satisfy (3) and \(\alpha _{\max }\) and \(\beta _{\max }\) are unique. Then,

$$\begin{aligned}&\limsup _{n\rightarrow +\infty } P\left( M_n^{(1)}\le u_n(x), M_n^{(2)}\le v_n(y) \right) \le \exp \left( -\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-x} -\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-y}\right) ,\\ &\liminf _{n\rightarrow +\infty } P\left( M_n^{(1)}\le u_n(x), M_n^{(2)}\le v_n(y) \right) \ge \exp \left( -\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-(x-1)} -\left( 1+\frac{\lambda _2}{\beta _{\max }}\right) ^{-(y-1)}\right) , \end{aligned}$$

for all real x and y and where \(u_n(x)\) and \(v_n(y)\) are defined by (24) and (25).

To prove this theorem, it remains to show that the conditions \(D(u_n, v_n)\) and \(D'(u_n, v_n)\) hold with \(u_n\) and \(v_n\) given by (24) and (25).

Proof of \(D(u_n, v_n)\):

Let \(1\le i_1 \le \cdots \le i_p < j_1 \le \cdots \le j_q \le n\) with \(j_1-i_p> 2 \ell _n\), with separation \(\ell _n=n^\phi\), where \(\phi <1\). We select \(\phi\) later. We use the following notation:

$$\begin{aligned} X_j^\star =\sum _{k=\ell _n}^{+\infty } \alpha _k\circ V_{j-k} \ , \quad X_j^{\star \star }=\sum _{k=-\infty }^{-\ell _n} \alpha _{k}\circ V_{j-k} \end{aligned}$$

and

$$\begin{aligned} Y_j^\star =\sum _{k=-\infty }^{-\ell _n} \beta _{k}\circ W_{j-k} \ , \quad Y_j^{\star \star }=\sum _{k=\ell _n}^{+ \infty } \beta _{k}\circ W_{j-k}. \end{aligned}$$

Note that

$$\begin{aligned} \left\{ X_{i} -X_i^{\star \star }, Y_{i} -Y_i^\star \, , i \le i_p\right\} = \left\{ \sum _{k=-\ell _n+1}^{+ \infty } \alpha _k\circ V_{i-k}, \sum _{k=-\ell _n+1}^{+ \infty } \beta _k\circ W_{i-k} \, , i \le i_p \right\} \end{aligned}$$

and

$$\begin{aligned} \left\{ X_{j} -X_j^{\star }, Y_{j} -Y_j^{\star \star } \, , j \ge j_1 \right\} =\left\{ \sum _{k=-\infty }^{\ell _n-1} \alpha _k\circ V_{j-k}, \sum _{k=-\infty }^{\ell _n-1} \beta _k\circ W_{j-k} \, , j \ge j_1 \right\} \end{aligned}$$
(27)

are independent.

a) We have as upper bound

$$\begin{aligned}& \displaystyle {P\biggl (\bigcap _{s=1}^{p}\{X_{i_{s}}\le u_{n},Y_{i_{s}}\le v_{n}\}, \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n\} \biggr )} \nonumber \\&\le \displaystyle {P\biggl (\bigcap _{s=1}^{p}\{X_{i_{s}}- X_{i_s}^{\star \star } \le u_{n}, Y_{i_{s}} - Y_{i_s}^\star \le v_{n}\}\biggr )} \displaystyle {P\biggl (\bigcap _{t=1}^{q}\{X_{j_{t}}- X_{j_t}^{\star } \le u_n, Y_{j_t}-Y_{j_t}^{\star \star } \le v_n\}\biggr )}\nonumber \\&\le \displaystyle {P\biggl (\bigcap _{s=1}^{p}\{X_{i_{s}} \le u_{n} + M^{(1,1)}_{n}, Y_{i_s} \le v_n + M^{(1,2)}_n\}\biggr )\times }\nonumber \\&\times \displaystyle {P\biggl ( \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n + M^{(2,1)}_n, Y_{j_{t}} \le v_n + M^{(2,2)}_n\} \biggr )}, \end{aligned}$$
(28)

where \(M^{(1,1)}_n=\max _{0\le j\le n}X_j^{\star \star }\), \(\,\,M^{(1,2)}_n=\max _{0\le j\le n}Y_j^\star\), \(M^{(2,1)}_n= \max _{0\le j\le n}X_j^{\star }\), and \(M^{(2,2)}_n=\max _{0\le j\le n}Y_j^{\star \star }\).

We split this upper bound further.

$$\begin{aligned}& \displaystyle {P\biggl (\bigcap _{s=1}^{p}\{X_{i_{s}}\le u_{n},Y_{i_{s}}\le v_{n}\}, \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n\} \biggr )} \nonumber \\&\le \displaystyle {\biggl [P\biggl (\bigcap _{s=1}^{p}\{X_{i_{s}} \le u_{n} + M^{(1,1)}_{n}, Y_{i_{s}} \le v_{n} + M^{(1,2)}_{n}\}, M^{(1,1)}_{n}={0}, M^{(1,2)}_{n}={0}\biggr )}\nonumber \\& \displaystyle { + \ P\biggl (M^{(1,1)}_n\ge 1 \vee M^{(1,2)}_{n}\ge 1\biggr )\biggr ]} \nonumber \\& \times \displaystyle {\biggl [P\biggl ( \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n + M^{(2,1)}_{n}, Y_{j_{t}} \le v_n + M^{(2,2)}_{n}\}, M^{(2,1)}_{n}={0}, M^{(2,2)}_{n}={0}\biggr )} \nonumber \\& \displaystyle { + \ P\biggl (M^{(2,1)}_{n}\ge 1 \vee M^{(2,2)}_{n}\ge 1\biggr )\biggr ]} \nonumber \\&\le \displaystyle {\biggl [P\big (\bigcap _{s=1}^{p}\{X_{i_{s}} \le u_{n} , Y_{i_{s}} \le v_{n} \}\big ) + P\big (M^{(1,1)}_{n}\ge 1 \vee M^{(1,2)}_{n}\ge 1\big )\biggr ]} \nonumber \\&\times \displaystyle {\biggl [P\big ( \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n \}\big ) + P\big (M^{(2,1)}_{n}\ge 1 \vee M^{(2,2)}_{n}\ge 1 \big )\biggr ]} \nonumber \\&\le \displaystyle {P\biggl (\bigcap _{s=1}^{p}\{X_{i_{s}} \le u_{n} , Y_{i_{s}} \le v_{n} \}\biggr )} \times \displaystyle {P\biggl ( \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n \}\biggr )}\nonumber \\& + 2 P\big (M^{(1,1)}_{n}\ge 1\big ) + 2 P\big ( M^{(1,2)}_{n}\ge 1\big ) + 2 P\big (M^{(2,1)}_{n}\ge 1\big ) + 2 P\big ( M^{(2,2)}_{n}\ge 1\big ). \end{aligned}$$
(29)

The last four terms in (29) tend to 0, as is proved in Hall (2003), at a rate depending on \(\ell _n\). We show it for one term.

$$\begin{aligned} P\left( M_{n}^{(1,1)} \ge 1 \right)\le & {} (n+1) P \left( \sum _{k=-\infty }^{-\ell _n} \alpha _k \circ V_{-k} \ge 1 \right) \nonumber \\\le & {} (n+1) \sum _{k=-\infty }^{-\ell _n} E( \alpha _k \circ V_{-k}) = (n+1) \sum _{k=-\infty }^{-\ell _n} \alpha _k E(V_{-k}) \nonumber \\\le & {} C {n}\sum _{k=\ell _n}^{+ \infty } \frac{1}{k^\delta } \nonumber \\\le & {} C {n} \ell _n^{1-\delta }, \end{aligned}$$
(30)

for some generic constant C and \(\{\alpha _k\}\) satisfying (3) with \(\delta >2\). Selecting \(\phi >1/(\delta -1)\), this bound tends to 0. The sum of the bounds of the last four terms in (29) gives the bound \(\alpha _{n,\ell _n}=Cn\ell _n^{1-\delta }\), which tends to 0.

b) In the same way we establish the lower bound of (28). In fact, using again the independence mentioned in (27), we get

$$\begin{aligned} & \displaystyle {P\big (\bigcap _{s=1}^{p}\{X_{i_{s}}\le u_{n},Y_{i_{s}}\le v_{n}\} \big )P\big ( \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n\} \big )} \\\le & {} \displaystyle {P\big (\bigcap _{s=1}^{p}\{X_{i_{s}}- X_{i_s}^{\star \star } \le u_{n}, Y_{i_{s}} - Y_{i_s}^\star \le v_{n}\}\big )}\displaystyle {P\big (\bigcap _{t=1}^{q}\{X_{j_t}- X_{j_t}^{\star } \le u_n, Y_{j_t}-Y_{j_t}^{\star \star } \le v_n\}\big )}\\= & {} \displaystyle {P\big (\bigcap _{s=1}^{p}\{X_{i_s}- X_{i_s}^{\star \star }\le u_{n},Y_{i_s} - Y_{i_s}^\star \le v_{n}\},}\displaystyle {\bigcap _{t=1}^{q}\{X_{j_t}- X_{j_t}^{\star } \le u_n, Y_{j_t}-Y_{j_t}^{\star \star } \le v_n\}\big )}\\\le & {} \displaystyle {P\left( \bigcap _{s=1}^{p}\{X_{i_{s}} \le u_{n} + M^{(1,1)}_{n}, Y_{i_{s}} \le v_{n} + M^{(1,2)}_{n}\}\right. ,}\\& \displaystyle { \left. \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n + M^{(2,1)}_{n}, Y_{j_{t}} \le v_n + M^{(2,2)}_{n}\}\right) }\\\le & {} \displaystyle {P\biggl (\bigcap _{s=1}^{p}\{X_{i_{s}}\le u_{n},Y_{i_{s}}\le v_{n}\}, \bigcap _{t=1}^{q}\{X_{j_{t}} \le u_n, Y_{j_{t}} \le v_n\} \biggr )} + Cn \ell _n^{1-\delta } \end{aligned}$$

using (29) and (30). Hence the condition \(D(u_n,v_n)\) holds.

In the proof of \(D^\prime (u_n,v_n)\), we need also that \(s_n\alpha _{n,\ell _n}\rightarrow 0\). With \(s_n=n^\zeta\) we select \(\zeta\) such that \(s_n n \ell _n^{1-\delta }= n^{1+\zeta - \phi (\delta -1) }\rightarrow 0\), which holds for \(1+\zeta < \phi (\delta -1)\).

Proof of \(D^\prime (u_n,v_n)\):

We first have to consider the sums of the terms \(P\left( X_0>u_n,Y_j>v_n\right)\) and of the terms \(P\left( Y_0>v_n,X_j>u_n\right)\).

We show it for the sum of the first terms, since for the second one the proof follows in the same way. Let \(\gamma _n=n^{\nu }\) with \(\nu < 1-\zeta\), which implies that \(\gamma _n =o(n/s_n) =o(n^{1-\zeta })\). For \(j<2\gamma _n\), we write

$$\begin{aligned} (X_0, Y_j)&=\left( \sum _{i=-\infty }^{+ \infty } \alpha _i \circ V_{-i} \,,\, \sum _{i=-\infty }^{+ \infty } \beta _{i+j} \circ W_{-i}\right) . \end{aligned}$$

Note that \(\alpha _{i_0}=\alpha _{\max }\) for some \(i_0\) and \(\beta _{j_0}=\beta _{\max }\) for some \(j_0\). For one j we have \(i_0+j=j_0\), i.e. \(j= j_0-i_0\). Hence the maximum terms occur at the same index for \(V_{-i_0}\) and \(W_{-i_0}\) if \(j=j_0-i_0\). If \(j_0=i_0\), then \(j=0\), but this case does not occur in the sum. For all other j’s the maxima occur at different indices. We consider the bounds established in Propositions 3.1 and 3.2 for \(H^{*}\).

For \(j=j_0-i_0\), we showed in the proof of Theorem 3.1 that \(nH^*(u_n,v_n) \rightarrow 0.\)

For \(j \ne j_0-i_0\), we have \(\beta _{i_0+j}< \beta _{\max }\) for the terms \(P(X_0>u_n, Y_j> v_n)\) and deduce from Proposition 3.1 the following upper bound for \(H^*(u_n, v_n)\)

$$\begin{aligned} \begin{aligned}&o_n(1) \left( 1+\frac{\lambda _{1}}{\alpha _{\max }}\right) ^{- u_n}\left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0+j}}\right) ^{-\rho v_n}u_n^{\xi _3}L_3^{*}(u_n)+Cu_n^{\xi _1+1} v_n^{\xi _2}L^*_1(u_n) L^*_2(v_n) \times \\&\times \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-(1-\psi )u_n} \left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{-v_n+ (\log v_n)^2} +O\left( P(S>\psi u_n) \right) , \end{aligned} \end{aligned}$$
(31)

with \(\rho ,\psi \in (0,1)\) defined in (12) and (13). Note that \(\rho =\rho (j)\) should be such that \(\left( 1+ \frac{\lambda _{2\theta }}{\beta _{i_0+j}}\right) ^{\rho (j)} < 1+ \frac{\lambda _{2}}{\beta _{\max }}\), for all \(j\ne j_0-i_0\), so that (13) is satisfied. This means that the term B in (13) depends on j, i.e. \(B=B_j\). Note that \(B_j\) may be larger or smaller than 1, but it is bounded above by \(\log (1+\lambda _2/\beta _{\max })/\log (1+\lambda _{2\theta }/\beta _{\max })=B^*\). For \(B_j>1\), we select \(\epsilon <1\) large enough such that \((1-\epsilon )B^*<1\), which implies that \((1-\epsilon )B_j <1\), and we then select \(\rho (j)> (1-\epsilon )B_j\). In the case \(B_j \le 1\), we also select \(\rho (j)> (1-\epsilon )B_j\).

This implies that there exists an \(\epsilon >0\) such that, for every \(j\ne j_0-i_0\), we can select \(\rho (j)\) with

$$\log \left( 1+ \frac{\lambda _{2}}{\beta _{\max }}\right)>\rho (j) \log \left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0+j}}\right) >(1-\epsilon ) \log \left( 1+ \frac{\lambda _{2}}{\beta _{\max }}\right) .$$

a) Now the sum of the first term in the bound (31) of \(H^*(u_n,v_n)\), multiplied by n, over \(\{j\le 2\gamma _n, j \ne j_0-i_0\}\), is bounded by

$$\begin{aligned}&o_n(1) n^{1+\nu } \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-d_1\log n+o(\log n)} \left( 1+\frac{\lambda _{2\theta }}{\beta _{i_0+j}}\right) ^{-\rho (j)d_2\log n + o(\log n)} (d_1 \log n)^{\xi _3} L_3^*(\log n)\nonumber \\& < o_n(1)\exp \left\{ (\log n ) \left( 1+\nu - d_1\log \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) - d_2 (1-\epsilon ) \log \left( 1+ \frac{\lambda _{2}}{\beta _{\max }}\right) \right) + o(\log n)\right\} \\&= o_n(1) \exp \left\{ -(\log n )(1-\epsilon -\nu + o_n(1)) \right\} \rightarrow 0, \, n \rightarrow + \infty , \end{aligned}$$

if also \(\nu\) is such that \(\nu < 1-\epsilon\).

The sum of the second term in (31), multiplied by n, over \(\{j\le 2\gamma _n, j\ne j_0-i_0\}\), tends to 0 because

$$\begin{aligned}&n^{1+\nu }\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-(1-\psi )u_n} \left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{-v_n+ (\log v_n)^2} \exp (o(\log n))\\&= n^{1+\nu }\left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-(1-\psi )d_1\log n + o(\log n)} \left( 1+\frac{\lambda _{2}}{\beta _{\max }}\right) ^{-d_2\log n+ o(\log n)}\exp (o(\log n))\\&= \exp \left\{ (\log n) \left[ 1+\nu - (1-\psi ) - 1\right] + o(\log n)\right\} \\&= \exp \left\{ (\log n ) [\nu + \psi - 1 +o_n(1)]\right\} \rightarrow 0, \;\; \mathrm{as } \; n \rightarrow +\infty , \end{aligned}$$

if also \(\nu < 1-\psi\). Hence we choose \(\nu < \min \{1-\epsilon , 1-\zeta , 1-\psi \}\).

It remains to deal with the sum of the third term in (31) over \(\{j\le 2\gamma _n, j\ne j_0-i_0\}\). We showed in (20) that \(P(S>\psi x)= O(( 1+\lambda )^{-\psi x})\), with \((1+\lambda )^\psi > 1+\frac{\lambda _1}{\alpha _{\max }}\) by (12). Let \(\tilde{\psi }>1\) be such that \((1+\lambda )^{\psi /\tilde{\psi }} = 1+\frac{\lambda _1}{\alpha _{\max }}\). This sum over \(\{j\le 2\gamma _n, j\ne j_0-i_0\}\), multiplied by n, is bounded by

$$C n^{1+\nu } \left( 1+\frac{\lambda _1}{\alpha _{\max }}\right) ^{-\tilde{\psi }u_n}=C \exp \left\{ (\log n ) \left[ 1 + \nu -\tilde{\psi }+ o_n(1)\right] \right\} \rightarrow 0, \, n \rightarrow +\infty ,$$

if also \(\nu < \tilde{\psi }-1\) and C is a generic positive constant.

Thus, combining these three bounds shows that

$$n\sum _{j\le 2\gamma _n} P\left( X_0>u_n,Y_j>v_n\right) \rightarrow 0,\, n \rightarrow +\infty ,$$

if \(\nu < \min \{1-\epsilon , 1-\zeta , 1-\psi , \tilde{\psi }-1\}\).

b) We consider now the sum on j with \(2\gamma _n<j \le n/s_n\) and write

$$\begin{aligned} X_0^\prime =\sum _{i=-\gamma _n}^{+ \infty } \alpha _{i}\circ V_{-i} \ , \quad X_0^{\prime \prime }=\sum _{i=-\infty }^{-\gamma _n-1} \alpha _{i}\circ V_{-i}, \end{aligned}$$
$$\begin{aligned} X_j^\prime =\sum _{i=-\infty }^{\gamma _n} \alpha _{i}\circ V_{j-i} \ , \quad X_j^{\prime \prime }=\sum _{i=\gamma _n+1}^{+ \infty } \alpha _{i}\circ V_{j-i} \end{aligned}$$

and

$$\begin{aligned} Y_j^\prime =\sum _{i=-\infty }^{\gamma _n} \beta _{i}\circ W_{j-i} \ , \quad Y_j^{\prime \prime }=\sum _{i=\gamma _n+1}^{+ \infty } \beta _{i}\circ W_{j-i}. \end{aligned}$$

Note that \(X_0^\prime\) and \(Y_j^\prime\) are independent. We have, for \(j>2\gamma _n\) and some \(k>1\) (chosen later, not depending on n),

$$\begin{aligned}& P\left( X_0>u_n,Y_j>v_n\right) = P\left( X_0^\prime +X_0^{\prime \prime }>u_n, Y_j^\prime +Y_j^{\prime \prime }>v_n\right) \\\le & {} P\left( X_0^\prime>u_n-X_0^{\prime \prime }, Y_j^\prime>v_n-Y_j^{\prime \prime }, X_0^{\prime \prime }<k, Y_j^{\prime \prime }<k\right) \\&+ P\left( X_0^{\prime \prime }\ge k\right) + P\left( Y_j^{\prime \prime }\ge k\right) \\\le & {} P\left( X_0^\prime>u_n-k, Y_j^\prime>v_n-k\right) + P\left( X_0^{\prime \prime }\ge k\right) + P\left( Y_j^{\prime \prime }\ge k\right) \\\le & {} P\left( X_0>u_n-k\right) P\left( Y_j>v_n-k\right) + P\left( X_0^{\prime \prime }\ge k\right) + P\left( Y_j^{\prime \prime }\ge k\right) \\= & {} O\left( \frac{1}{n}\right) O\left( \frac{1}{n}\right) + P\left( X_0^{\prime \prime }\ge k\right) + P\left( Y_j^{\prime \prime }\ge k\right) . \end{aligned}$$

As in Hall (2003), the last two probabilities tend to 0 sufficiently fast. Indeed, we have

$$\begin{aligned} P\left( X_0^{\prime \prime }\ge k\right)= & {} P\left( \sum _{i=-\infty }^{-\gamma _n-1} \alpha _i \circ V_{-i} \ge k \right) \\= & {} P\left( (1+h_n)^{\sum _{i=- \infty }^{-\gamma _n-1}\alpha _i \circ V_{-i}} > (1+ h_n)^k\right) \\\le & {} \frac{E\left( (1+h_n)^{\sum _{i=-\infty }^{-\gamma _n-1}\alpha _i \circ V_{-i}}\right) }{(1+h_n)^k}.\\ \end{aligned}$$

We select \(h_n\) such that \(h_n\gamma _n^{1-\delta } =C>0,\) for some constant C. For \(i\le -\gamma _n -1\), with \(\delta >2\) and some positive constant \(C^*\), it follows that

$$\begin{aligned} 0 <\alpha _ih_n \le C^* |i|^{-\delta }h_n \le C^* (\gamma _n +1 )^{-\delta } h_n =O(1/{\gamma _n}) \rightarrow 0, n \rightarrow + \infty , \end{aligned}$$

by the assumption (3) on the sequence \(\{\alpha _i\}\). This implies, as before, that

$$E\left( (1+h_n)^{\sum _{i=-\infty }^{-\gamma _n-1}\alpha _i \circ V_{-i}}\right) = \prod _{i=-\infty }^{-\gamma _n-1} E\left( (1+\alpha _ih_n)^{V_{-i}}\right)$$

where the expectations exist, and, due to Lemma 3.1,

$$\begin{aligned} \begin{aligned} E&\left( (1+h_n)^{\sum _{i=-\infty }^{-\gamma _n-1}\alpha _i \circ V_{-i}} \right) \le \prod _{i=-\infty }^{-\gamma _n-1} \left( 1+\alpha _i h_nE( V_{0}) (1 + o_n(1))\right) \\&= \exp \left( E(V_0) h_n O(1)\sum _{i=-\infty }^{-\gamma _n-1}|i|^{-\delta }\right) \\&= \exp \left( O(1) h_n \gamma _n^{1-\delta }\right) = O(1), \, n \rightarrow +\infty , \end{aligned} \end{aligned}$$

by the choice of \(h_n\). Note that \(h_n=C\gamma _n^{\delta -1}=Cn^{\nu (\delta -1)}\rightarrow +\infty\). Now select k, depending on \(\delta\), \(\nu\) and \(\zeta\), such that \(n^2/((1+h_n)^{k}s_n)\sim n^2/(C^kn^{k\nu (\delta -1)}n^\zeta )=o(1)\), which holds for \(k> (2-\zeta )/(\nu (\delta -1))\). This choice implies that \((n^2/s_n) P\left( X_0^{\prime \prime }\ge k\right) \rightarrow 0\). In the same way one shows that also \(n \sum _{j\le n/s_n} P\left( Y_j^{\prime \prime }\ge k\right) \rightarrow 0\) for such a k, since also \(\beta _i\le C\,|i|^{-\delta }\) for \(|i|\ge \gamma _n\) and some constant \(C>0\).
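Putting these estimates together (recall that \(\gamma _n= n^{\nu }\) and \(s_n\sim n^{\zeta }\)), the tail bound reads

$$\frac{n^2}{s_n}\, P\left( X_0^{\prime \prime }\ge k\right) \le \frac{n^2}{s_n}\, \frac{E\left( (1+h_n)^{\sum _{i=-\infty }^{-\gamma _n-1}\alpha _i \circ V_{-i}}\right) }{(1+h_n)^{k}} = O\left( n^{2-\zeta -k\nu (\delta -1)}\right) \rightarrow 0, \, n \rightarrow +\infty ,$$

as soon as \(k>(2-\zeta )/(\nu (\delta -1))\).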

c) In order to deduce

$$\begin{aligned} n \displaystyle {\sum _{j=1}^{[n/s_n]}P\left( X_0>u_n,X_j>u_n\right) }\rightarrow 0,\quad n \rightarrow + \infty , \end{aligned}$$

we use the same arguments as for \(P\left( X_0> u_n, Y_j>v_n\right)\). In this case, since \(X_0^{\prime }\) and \(X_j^{\prime }\) are independent, we get for some positive k

$$\begin{aligned}& P\left( X_0>u_n,X_j>u_n\right) \\\le & {} P\left( X_0^\prime>u_n-k, X_j^\prime >u_n-k\right) + P\left( X_0^{\prime \prime }\ge k\right) + P\left( X_j^{\prime \prime }\ge k\right) \\= & {} O\left( \frac{1}{n^2}\right) + P\left( X_0^{\prime \prime }\ge k\right) + P\left( X_j^{\prime \prime }\ge k\right) . \end{aligned}$$

As above we can show that \(n \sum _{j\le n/s_n} P\left( X_j^{\prime \prime }\ge k\right) \rightarrow 0\) and \((n^2/s_n) P\left( X_0^{\prime \prime }\ge k\right) \rightarrow 0\). In the same way it follows also that

$$\begin{aligned} n \displaystyle {\sum _{j=1}^{[n/s_n]}P\left( Y_0>v_n,Y_j>v_n\right) }\rightarrow 0, \, n \rightarrow + \infty . \end{aligned}$$

Hence condition \(D'(u_n, v_n)\) holds.

5 Simulations

We investigate the convergence of the distribution of the bivariate maxima \((M_n^{(1)}, M_n^{(2)})\) to the limiting distribution given in Theorem 4.1. Note that the thinning coefficients \(\alpha _i\) and \(\beta _i\) have an impact on the norming values of the bivariate maxima, besides the distribution of the \((V_i,W_i)\).

Let us consider the bivariate geometric distribution for \((V_i,W_i)\) mentioned in Example 2.1 and a finite number of positive values \(\alpha _i\) and \(\beta _i\). As mentioned, the bivariate geometric distribution satisfies the general assumptions on the joint distribution of \((V_i,W_i)\). We assume strong dependence, with \(p_{00}=0.85, p_{01}=0.03, p_{10}=0.02\) and \(p_{11}=0.1\).
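For illustration, a minimal sampler for such a pair \((V,W)\) is sketched below in Python. It assumes the common shock-model construction of the bivariate geometric distribution: at each trial an indicator pair \((A,B)\) is drawn with cell probabilities \(p_{11},p_{10},p_{01},p_{00}\), and V (resp. W) counts the trials before the first success of the first (resp. second) component. If Example 2.1 uses a different parametrisation, the sampler has to be adapted accordingly; the function and variable names are only illustrative.

```python
import numpy as np

# Cell probabilities of the indicator pair (A, B) drawn at every trial
# (strong dependence, as in the text); shock-model construction assumed.
P = {"p00": 0.85, "p01": 0.03, "p10": 0.02, "p11": 0.10}

def sample_bivariate_geometric(size, p, rng=None):
    """Draw `size` pairs (V, W): V (resp. W) counts the trials strictly
    before the first A = 1 (resp. B = 1) in an iid sequence of indicator
    pairs (A, B) with the given cell probabilities."""
    if rng is None:
        rng = np.random.default_rng()
    cuts = np.cumsum([p["p11"], p["p10"], p["p01"], p["p00"]])
    out = np.empty((size, 2), dtype=int)
    for s in range(size):
        v = w = 0
        a_done = b_done = False
        while not (a_done and b_done):
            u = rng.random()
            a = u < cuts[1]                                # cells p11, p10
            b = (u < cuts[0]) or (cuts[1] <= u < cuts[2])  # cells p11, p01
            if not a_done and not a:
                v += 1
            if not b_done and not b:
                w += 1
            a_done, b_done = a_done or a, b_done or b
        out[s] = v, w
    return out

VW = sample_bivariate_geometric(5, P)   # five sample pairs (V, W)
```

Under this construction the marginals of V and W are geometric with success probabilities \(p_{10}+p_{11}=0.12\) and \(p_{01}+p_{11}=0.13\), so they have exponential type tails as required for the innovations.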

To investigate the convergence rate we consider models with quite different \(\alpha _i\) and \(\beta _i\). In the first case let \(\alpha _1=0.8, \alpha _2=0.6, \alpha _3=0.4, \beta _1=0.6, \beta _2=0.45, \beta _3=0.3\) and \(\alpha _i=0=\beta _i\) for \(i>3,\) and in the second case \(\alpha _1=0.6, \alpha _2=0.35, \alpha _3=0.1, \beta _1=0.5, \beta _2=0.3, \beta _3=0.1\) and \(\alpha _i=0=\beta _i\) for \(i>3.\)

For each of these first two models we simulated 10'000 time series, selected \(n=100\) and 500, and derived the bivariate maxima \((M_n^{(1)},M_n^{(2)})\). We then compared the empirical (simulated) distribution functions (cdf) with the asymptotic cdf.
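A minimal sketch of this Monte Carlo step is given below, reusing `sample_bivariate_geometric` and `P` from the previous sketch. The centring values from (24) and (25) are not reproduced here, so the sketch only returns the simulated maxima and evaluates the empirical joint cdf at an illustrative grid point; the function names and the chosen grid point are assumptions, not the authors' code.

```python
import numpy as np
# sample_bivariate_geometric and P are taken from the sketch above.

alpha = np.array([0.8, 0.6, 0.4])     # thinning coefficients, first case
beta  = np.array([0.6, 0.45, 0.3])

def simulate_bivariate_maxima(n, n_rep, alpha, beta, seed=0):
    """Simulate n_rep replications of (M_n^(1), M_n^(2)) for the INMA model
    X_j = sum_{i=1}^q alpha_i o V_{j-i}, Y_j = sum_{i=1}^q beta_i o W_{j-i},
    with q = len(alpha) non-zero thinning coefficients."""
    rng = np.random.default_rng(seed)
    q = len(alpha)
    maxima = np.empty((n_rep, 2), dtype=int)
    for r in range(n_rep):
        # innovations V_{1-q}, ..., V_{n-1} (and likewise W): enough lags
        # for the finite moving-average window
        VW = sample_bivariate_geometric(n + q - 1, P, rng)
        V, W = VW[:, 0], VW[:, 1]
        X = np.zeros(n, dtype=int)
        Y = np.zeros(n, dtype=int)
        for i in range(1, q + 1):
            # alpha_i o V_{j-i} for j = 1, ..., n: independent binomial
            # thinnings with counts V_{j-i} and probability alpha_i
            X += rng.binomial(V[q - i : q - i + n], alpha[i - 1])
            Y += rng.binomial(W[q - i : q - i + n], beta[i - 1])
        maxima[r] = X.max(), Y.max()
    return maxima

# 10'000 replications as in the text (pure Python, so this takes a while)
maxima = simulate_bivariate_maxima(n=100, n_rep=10_000, alpha=alpha, beta=beta)
x, y = 10, 8                          # illustrative grid point
F_emp = np.mean((maxima[:, 0] <= x) & (maxima[:, 1] <= y))
```

The asymptotic cdf from Theorem 4.1 can then be evaluated at the corresponding centred points and compared with `F_emp` over a grid of \((x,y)\) values, as in Figs. 1 and 2.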

We plotted two cases with \(P(M_n^{(1)}-\tilde{u}_n\le x ,M_n^{(2)}-\tilde{v}_n\le x+\delta )\) where \(\tilde{u}_n= u_n-x\) and \(\tilde{v}_n=v_n -y\) with \(u_n, v_n\) given in (24) and (25), respectively, using \(\delta =0\) and 2 (see Figs. 1 and 2).

We notice from these simulations that the convergence is quite good, but that its rate depends on the dependence induced by the thinning factors \(\alpha _i\) and \(\beta _i\). The convergence is slower for the more dependent time series (the first case, Fig. 1), and the value of \(\delta\) has a negligible impact. This is even clearer in the second case, shown in Fig. 2.

Fig. 1 Simulated cdf with upper and lower asymptotic cdf, first case, where \(ai=\alpha _i, bi=\beta _i\), \(i=1,2,3\)

Fig. 2 Simulated cdf with upper and lower asymptotic cdf, second case, where \(ai=\alpha _i, bi=\beta _i\), \(i=1,2,3\)

In some additional models we considered a larger number of thinning factors different from 0. We show the simulations of the cases with \(\alpha _i=(0.7)^i, \beta _i=(0.6)^i\), for \(i\le 25\), and with \(\alpha _i=(0.9)^i, \beta _i=(0.8)^i\), for \(i\le 40\). These cases are close to an infinite MA series, since \(\alpha _i,\beta _i\) are very small for \(i> 26\) or \(i>41\), respectively, so that such small values have hardly any impact on the maxima. We found that the number of positive values is not so important. However, in these cases the second largest value of \(\alpha _i\) or \(\beta _i\) is closer to the maximal value, in particular in the second of these additional models. Considering the results of again 10'000 simulations (Fig. 3), the convergence rates are considerably slower than in the first two models (Figs. 1 and 2). We show the results of the two cases with \(n=100\) and 500 and \(\delta =0\) only. We also found from the simulations of other models and distributions that the stronger the correlation between the two components of the sequence, the slower the convergence to the limiting distribution (with asymptotic independence).

Fig. 3 Simulated cdf with upper and lower asymptotic cdf, third and fourth model, where \(ai=\alpha _i, bi=\beta _i\) for \(i\le 25\) (third model) and \(i\le 40\) (fourth model), respectively, with \(n=100\) and 500 and \(\delta =0\)