Mallows’ models for imperfect ranking in ranked set sampling

Abstract

In this paper, we consider some statistical measures of deviation from perfect ranking in the framework of ranked set sampling. We use a nonparametric approach for testing the null hypothesis of perfect ranking. Distance-based Mallows’ models with an appropriate distance on permutations are suggested for the case of imperfect ranking. Some asymptotic results for the corresponding error probability matrix are derived for the models based on Spearman’s footrule and Spearman’s rho. We propose an EM algorithm for estimating the unknown parameter in the Mallows’ models in order to compare the power of the presented test statistics.

References

  • Aragon, M.E.D., Patil, G.P., Taillie, C.: A performance indicator for ranked set sampling using ranking error probability matrix. Environ. Ecol. Stat. 6, 75–89 (1999)

  • Balakrishnan, N., Li, T.: Ordered ranked set samples and applications to inference. J. Stat. Plan. Inference 138, 3512–3524 (2008)

  • Chen, Z., Bai, Z., Sinha, B.K.: Ranked Set Sampling: Theory and Applications. Lecture Notes in Statistics 176. Springer, New York (2004)

  • Dell, T.R., Clutter, J.L.: Ranked set sampling theory with order statistics background. Biometrics 28, 545–555 (1972)

  • Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39, 1–38 (1977)

  • Fligner, M.A., Verducci, J.S.: Distance based ranking models. J. R. Stat. Soc. Ser. B 48, 359–369 (1986)

  • Frey, J.: New imperfect rankings models for ranked set sampling. J. Stat. Plan. Inference 137, 1433–1445 (2007)

  • Frey, J.: Nonparametric mean estimation using partially ordered sets. Environ. Ecol. Stat. 19, 309–326 (2012)

  • Frey, J., Ozturk, O., Deshpande, J.V.: Nonparametric tests for perfect judgment rankings. J. Am. Stat. Assoc. 102, 708–717 (2007)

  • Frey, J., Wang, L.: Most powerful rank tests for perfect rankings. Comput. Stat. Data Anal. 60, 157–168 (2013)

  • Hoeffding, W.: A combinatorial central limit theorem. Ann. Math. Stat. 22, 558–566 (1951)

  • Li, T., Balakrishnan, N.: Some simple nonparametric methods to test for perfect ranking in ranked set sampling. J. Stat. Plan. Inference 138, 1325–1338 (2008)

  • Mallows, C.L.: Non-null ranking models. I. Biometrika 44, 114–130 (1957)

  • Marden, J.I.: Analyzing and Modeling Rank Data. Monographs on Statistics and Applied Probability 64. Chapman & Hall, London (1995)

  • McIntyre, G.A.: A method for unbiased selective sampling, using ranked sets. Aust. J. Agric. Res. 3, 385–390 (1952)

  • McLachlan, G., Krishnan, T.: The EM Algorithm and Extensions, 2nd edn. Wiley, New York (2008)

  • Murray, R.A., Ridout, M.S., Cross, J.V.: The use of ranked set sampling in spray deposit assessment. Asp. Appl. Biol. 57, 141–146 (2000)

  • Nikolov, N.I., Stoimenova, E.: Asymptotic properties of Lee distance. Metrika 82, 385–408 (2019)

  • Ozturk, O.: Statistical inference under a stochastic ordering constraint in ranked set sampling. J. Nonparametr. Stat. 19, 131–144 (2007)

  • Ozturk, O.: Nonparametric maximum-likelihood estimation of within-set ranking errors in ranked set sampling. J. Nonparametr. Stat. 22, 823–840 (2010)

  • Ozturk, O.: Sampling from partially rank-ordered sets. Environ. Ecol. Stat. 18, 757–779 (2011)

  • Pesarin, F., Salmaso, L.: Permutation Tests for Complex Data: Theory, Applications and Software. Wiley, Hoboken (2010)

  • Vock, M., Balakrishnan, N.: A Jonckheere–Terpstra-type test for perfect ranking in balanced ranked set sampling. J. Stat. Plan. Inference 141, 624–630 (2011)

  • Wolfe, D.A.: Ranked set sampling: its relevance and impact on statistical inference. ISRN Probab. Stat., Article ID 568385 (2012)

  • Zamanzade, E., Arghami, N.R., Vock, M.: Permutation-based tests of perfect ranking. Stat. Probab. Lett. 82, 2213–2220 (2012)

  • Zamanzade, E., Vock, M.: Some nonparametric tests of perfect judgment ranking for judgment post stratification. Stat. Pap. 59, 1085–1100 (2016)

Acknowledgements

This work was supported by the Bulgarian Ministry of Education and Science under the National Research Programme “Young scientists and postdoctoral students” approved by DCM #577 / 17.08.2018 and by the National Science Fund of Bulgaria under Grant DH02-13.

Author information

Correspondence to Nikolay I. Nikolov.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The proofs given below are mainly based on the following combinatorial central limit theorem (CCLT), formulated and proved by Hoeffding (1951).

Theorem 4

(Hoeffding’s CCLT) Let \(\pi \sim Uniform(\mathbf {S_{k}})\) and \(D(\pi )=\sum \limits _{s=1}^{k}a_{k}\big (\pi (s),s\big )\), where \(a_{k}(r,s)\in {\mathbf {R}}\) for \(r,s=1,2,\ldots ,k\). Then the mean and variance of D are

$$\begin{aligned} {\mathbf {E}}\left( D\right)&=\frac{1}{k}\sum _{r=1}^{k}\sum _{s=1}^{k}a_{k}(r,s) \end{aligned}$$
(15)
$$\begin{aligned} \mathbf {Var} \left( D\right)&=\frac{1}{k-1}\sum _{r=1}^{k}\sum _{s=1}^{k}b_{k}^{2}(r,s) \, , \end{aligned}$$
(16)

where

$$\begin{aligned} b_{k}(r,s)=a_{k}(r,s)- \frac{1}{k} \sum _{l=1}^{k}a_{k}(l,s)-\frac{1}{k}\sum _{m=1}^ {k}a_{k}(r,m)+\frac{1}{k^{2}}\sum _{l=1}^{k} \sum _{m=1}^{k}a_{k}(l,m) \end{aligned}$$

for \(r,s=1,2,\ldots ,k\). Furthermore, the distribution of D is asymptotically normal if

$$\begin{aligned} \lim _{k \rightarrow \infty } \displaystyle \frac{\displaystyle \max \nolimits _{1 \le r,s \le k}b_{k}^{2}(r,s)}{\displaystyle \frac{1}{k}\sum \nolimits _{r=1}^{k}\sum \nolimits _{s=1}^{k}b_{k}^{2}(r,s)} = 0 \,. \end{aligned}$$
(17)
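
The moment formulas (15) and (16) are straightforward to evaluate numerically. The following minimal Python sketch (illustrative only, not part of the derivations) computes the exact mean and variance via the double centring \(b_{k}(r,s)\) for the footrule scores \(a_{k}(r,s)=\mid r-s\mid \) and compares them with a Monte Carlo estimate over uniform permutations.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 10

# Illustrative choice of scores: a_k(r, s) = |r - s| (footrule-type).
a = np.abs(np.subtract.outer(np.arange(k), np.arange(k)))

# Exact mean (15) and variance (16) via the double centring b_k(r, s).
mean_exact = a.sum() / k
b = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
var_exact = (b**2).sum() / (k - 1)

# Monte Carlo: D(pi) = sum_s a_k(pi(s), s) for pi uniform on S_k.
D = np.array([a[rng.permutation(k), np.arange(k)].sum() for _ in range(100_000)])
print(mean_exact, D.mean())      # both close to (k^2 - 1)/3 = 33 for k = 10
print(var_exact, D.var(ddof=1))  # both close to (k+1)(2k^2+7)/45 = 50.6
```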

Consider the random variables based on Spearman’s footrule and Spearman’s rho:

$$\begin{aligned} D_{F}\left( \pi \right) =\sum \limits _{s=1}^{k}\mid \pi (s)-s\mid \quad \hbox {and} \quad D_{R}\left( \pi \right) =\sum \limits _{s=1}^{k}\left( \pi (s)-s\right) ^{2}, \end{aligned}$$

where \(\pi \sim Uniform(\mathbf {S_{k}})\). By applying Theorem 4 to \(D_{F}\) and \(D_{R}\) (see, e.g., Marden 1995, p. 83), it can be shown that \(D_{F}\) and \(D_{R}\) are asymptotically normal with means and variances:

$$\begin{aligned} {\mathbf {E}}\left( D_{F}\right)&=\frac{1}{k}\sum _{r=1}^{k}\sum _{s=1}^{k}\mid r-s\mid =\frac{k^2-1}{3},\nonumber \\ \mathbf {Var}\left( D_{F}\right)&=\frac{(k+1)\left( 2k^2+7\right) }{45}, \end{aligned}$$
(18)
$$\begin{aligned} {\mathbf {E}}\left( D_{R}\right)&=\frac{1}{k}\sum _{r=1}^{k}\sum _{s=1}^{k}\left( r-s\right) ^{2}=\frac{k\left( k^2-1\right) }{6},\nonumber \\ \mathbf {Var}\left( D_{R}\right)&=\frac{k^2(k-1)\left( k+1\right) ^{2}}{36}. \end{aligned}$$
(19)
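
The closed-form moments in (18) and (19) can be checked by simulation; a minimal sketch (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 20, 100_000
s = np.arange(1, k + 1)
perms = np.array([rng.permutation(s) for _ in range(n)])

DF = np.abs(perms - s).sum(axis=1)   # Spearman's footrule statistic D_F(pi)
DR = ((perms - s)**2).sum(axis=1)    # Spearman's rho statistic D_R(pi)

print(DF.mean(), (k**2 - 1) / 3)                         # mean in (18)
print(DF.var(ddof=1), (k + 1) * (2 * k**2 + 7) / 45)     # variance in (18)
print(DR.mean(), k * (k**2 - 1) / 6)                     # mean in (19)
print(DR.var(ddof=1), k**2 * (k - 1) * (k + 1)**2 / 36)  # variance in (19)
```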

In order to prove Theorem 2, let us define the random variables \(D_{F}^{(i,j)}=d_{F}\left( \pi ,e_{k}\right) \) for \(i,j=1,2,\ldots ,k\), where \(d_{F}(\cdot ,\cdot )\) is Spearman’s footrule, and \(\pi \) is uniformly and randomly selected from \({\mathbf {S}}_{{\mathbf {k}}}^{(i,j)}=\left\{ \sigma \in \mathbf {S_{k}}: \sigma (j)=i\right\} \), i.e., \(\pi \sim Uniform\left( {\mathbf {S}}_{{\mathbf {k}}}^{(i,j)}\right) \). Then, for a fixed pair \((i,j)\),

$$\begin{aligned} D_{F}^{(i,j)}=\sum \limits _{s=1}^{k}\mid \pi (s)-s\mid =\sum \limits _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}\mid \pi (s)-s\mid + \mid i-j\mid =\sum \limits _{s=1}^{k-1}{\tilde{a}}_{k}(\sigma (s),s) + \mid i-j\mid \,, \end{aligned}$$

where

$$\begin{aligned} \sigma (s)= {\left\{ \begin{array}{ll} \pi (s), &{}\quad \hbox {if}\quad s<j \quad \hbox {and}\quad \pi (s)<i,\\ \pi (s)-1, &{}\quad \hbox {if}\quad s<j \quad \hbox { and }\quad \pi (s)>i,\\ \pi (s+1), &{}\quad \hbox {if}\quad s\ge j \quad \hbox { and }\quad \pi (s+1)<i,\\ \pi (s+1)-1, &{}\quad \hbox {if}\quad s\ge j \quad \hbox { and }\quad \pi (s+1)>i, \end{array}\right. } \end{aligned}$$
(20)

and

$$\begin{aligned} {\tilde{a}}_{k}(r,s)= {\left\{ \begin{array}{ll} \mid r-s \mid , &{}\quad \hbox {if }\quad s<j \quad \hbox { and }\quad r<i,\\ \mid r+1-s \mid , &{}\quad \hbox {if }\quad s<j \quad \hbox { and }\quad r\ge i,\\ \mid r-s-1 \mid , &{}\quad \hbox {if }\quad s\ge j \quad \hbox { and }\quad r<i,\\ \mid r+1-s-1 \mid , &{}\quad \hbox {if }\quad s\ge j \quad \hbox { and }\quad r\ge i, \end{array}\right. } \end{aligned}$$
(21)

for \(r,s=1,2,\ldots ,k-1\) and \(\pi \sim Uniform\left( {\mathbf {S}}_{{\mathbf {k}}}^{(i,j)}\right) \).
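The mapping (20) and the scores (21) are easy to implement directly; the sketch below (the helper names reduce_pi and a_tilde are ours, for illustration) verifies that \(\sigma \) is indeed a permutation in \(\mathbf {S_{k-1}}\) and that the displayed decomposition of \(D_{F}^{(i,j)}\) holds.

```python
import numpy as np

def reduce_pi(pi, i, j):
    """Map pi in S_k^{(i,j)} (pi(j) = i, everything 1-based) to sigma in S_{k-1}, as in (20)."""
    k = len(pi)
    sigma = []
    for s in range(1, k):                  # s = 1, ..., k - 1
        v = pi[s - 1] if s < j else pi[s]  # pi(s) for s < j, pi(s+1) for s >= j
        sigma.append(int(v) if v < i else int(v) - 1)  # close the gap left by the value i
    return sigma

def a_tilde(r, s, i, j):
    """The relabelled scores in (21)."""
    rr = r if r < i else r + 1
    ss = s if s < j else s + 1
    return abs(rr - ss)

rng = np.random.default_rng(2)
k, i, j = 8, 3, 5
pi = rng.permutation(np.arange(1, k + 1))
pos = int(np.where(pi == i)[0][0])
pi[pos], pi[j - 1] = pi[j - 1], i          # force pi(j) = i, so pi is in S_k^{(i,j)}
sigma = reduce_pi(pi, i, j)

assert sorted(sigma) == list(range(1, k))  # sigma is a permutation in S_{k-1}
# The decomposition D_F^{(i,j)} = sum_s a_tilde(sigma(s), s) + |i - j| from the display above:
DF = int(np.abs(pi - np.arange(1, k + 1)).sum())
assert DF == sum(a_tilde(sigma[s - 1], s, i, j) for s in range(1, k)) + abs(i - j)
```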

Lemma 1

Let \(\displaystyle {\tilde{D}}_{F}\left( \sigma \right) =\sum \nolimits _{s=1}^{k-1}{\tilde{a}}_{k}(\sigma (s),s)\), where \(\sigma (\cdot )\) and \({\tilde{a}}_{k}(\cdot ,\cdot )\) are given in (20) and (21), respectively. Then the distribution of \({\tilde{D}}_{F}\) is asymptotically normal with mean and variance

$$\begin{aligned} {\mathbf {E}}\left( {\tilde{D}}_{F}\right)&= \displaystyle \frac{k(k+1)}{3}-\frac{f(i)+f(j)-\mid i-j\mid }{k-1}, \end{aligned}$$
(22)
$$\begin{aligned} \mathbf {Var} \left( {\tilde{D}}_{F}\right)&= \displaystyle \frac{1}{k-2}\left\{ \sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k}\sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k} \left[ \mid r-s\mid +\frac{k(k+1)}{3(k-1)}\right. \right. \nonumber \\&\qquad \qquad \qquad -\frac{f(r)+f(s)-\mid i-s\mid -\mid r-j\mid }{k-1}\nonumber \\&\qquad \qquad \qquad \left. \left. -\frac{f(i)+f(j)-\mid i-j\mid }{(k-1)^{2}}\right] ^{2}\right\} , \end{aligned}$$
(23)

where

$$\begin{aligned} f(x)=\frac{x(x-1)+(k-x)(k-x+1)}{2}. \end{aligned}$$

Proof

From the definition of \(\sigma \) in (20) it is easy to check that \(\sigma \sim Uniform(\mathbf {S_{k-1}})\) for \(\pi \sim Uniform\left( {\mathbf {S}}_{{\mathbf {k}}}^{(i,j)}\right) \). Therefore, Theorem 4 can be applied to the random variable \({\tilde{D}}_{F}\). By using (15), (21) and the expectation in (18), it follows that

$$\begin{aligned}&{\mathbf {E}}\left( {\tilde{D}}_{F}\right) {\mathop {=}\limits ^{(15)}}\frac{1}{k-1}\sum _{r=1}^{k-1}\sum _{s=1}^{k-1}{\tilde{a}}_{k}(r,s){\mathop {=}\limits ^{(21)}}\frac{1}{k-1}\sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k}\sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}\mid r-s\mid \\&\quad =\frac{1}{k-1}\sum _{r=1}^{k}\sum _{s=1}^{k}\mid r-s\mid -\frac{1}{k-1}\sum _{r=1}^{k}\mid r-j\mid \\&\qquad -\frac{1}{k-1}\sum _{s=1}^{k}\mid i-s\mid +\frac{\mid i-j\mid }{k-1}\\&\quad {\mathop {=}\limits ^{(18)}}\frac{k(k+1)}{3}-\frac{f(i)+f(j)-\mid i-j\mid }{k-1}, \end{aligned}$$

where

$$\begin{aligned} f(x)=\sum _{r=1}^{k}\mid r-x\mid =\frac{x(x-1)+(k-x)(k-x+1)}{2}, \end{aligned}$$
(24)

for \(x=1,2,\ldots ,k\). Using (16) of Theorem 4,

$$\begin{aligned} \mathbf {Var} \left( {\tilde{D}}_{F}\right) =\frac{1}{k-2}\sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k}\sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}{\tilde{b}}_{k}^{2}(r,s), \end{aligned}$$
(25)

where

$$\begin{aligned} {\tilde{b}}_{k}(r,s)=\mid r-s\mid - \sum _{\begin{array}{c} l=1 \\ l\ne i \end{array}}^{k}\frac{\mid l-s\mid }{k-1}-\sum _{\begin{array}{c} m=1 \\ m\ne j \end{array}}^{k}\frac{\mid r-m\mid }{k-1}+\frac{1}{\left( k-1\right) ^{2}} \sum _{\begin{array}{c} l=1 \\ l\ne i \end{array}}^{k}\sum _{\begin{array}{c} m=1 \\ m\ne j \end{array}}^{k}\mid l-m\mid , \end{aligned}$$

for \(r,s=1,2,\ldots ,k\). Simplifying this expression gives

$$\begin{aligned} {\tilde{b}}_{k}(r,s)= & {} \mid r-s\mid -\frac{f(r)+f(s)-\mid i-s\mid -\mid r-j\mid }{k-1} \nonumber \\&+\frac{k(k+1)}{3(k-1)}-\frac{f(i)+f(j)-\mid i-j\mid }{(k-1)^{2}}. \end{aligned}$$
(26)

The variance of \({\tilde{D}}_{F}\) given in (23) is obtained by substituting (26) in formula (25).

From (24) it is easy to check that

$$\begin{aligned} \frac{k^{2}-1}{4} \le f(x)\le \frac{k(k-1)}{2} \quad \hbox {for } 1\le x \le k. \end{aligned}$$
(27)

Combining

$$\begin{aligned} 1\le \mid x-y\mid \le k-1 \quad \hbox {for } 1\le x,y \le k, \end{aligned}$$

with (26) and (27), it follows that

$$\begin{aligned} \mid r-s \mid - \frac{2k}{3}+\epsilon _{1} \le {\tilde{b}}_{k}(r,s) \le \mid r-s \mid - \frac{k}{6}+\epsilon _{2}, \end{aligned}$$

where \(r,s=1,2,\ldots ,k\), \(\displaystyle \lim _{k \rightarrow \infty }\frac{\epsilon _{1}}{k}=0\) and \(\displaystyle \lim _{k \rightarrow \infty }\frac{\epsilon _{2}}{k}=0\). Therefore, there exists a constant \(c_{1}>0\) such that

$$\begin{aligned} \max _{1 \le r,s \le k}{\tilde{b}}_{k}^{2}(r,s) \le c_{1}k^{2}, \end{aligned}$$
(28)

and a number \(N>0\) such that for \(k\ge N\)

$$\begin{aligned} \mid r-s \mid - k \le {\tilde{b}}_{k}(r,s) \le \mid r-s \mid - \frac{k}{7} \; . \end{aligned}$$

Suppose that r is a fixed index from the set \(\left\{ 1,2,\ldots ,k\right\} \). Then for \(k\ge N\)

$$\begin{aligned} \sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}{\tilde{b}}_{k}^{2}(r,s)= & {} \sum _{s=1}^{k}{\tilde{b}}_{k}^{2}(r,s)-{\tilde{b}}_{k}^{2}(r,j) \ge \sum _{\mid r-s\mid =0}^{k/7}{\tilde{b}}_{k}^{2}(r,s)-{\tilde{b}}_{k}^{2}(r,j)\\\ge & {} \sum _{v=0}^{k/7}\left( v-\frac{k}{7}\right) ^{2}-{\tilde{b}}_{k}^{2}(r,j), \end{aligned}$$

where \(\displaystyle \sum _{\mid r-s\mid =0}^{k/7}\) is a summation over all values of s such that \(\displaystyle 0\le \mid r-s\mid \le \frac{k}{7}\). Thus, for \(k\ge N\) there exists a constant \(c_{2}>0\) such that

$$\begin{aligned} \sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}{\tilde{b}}_{k}^{2}(r,s) \ge c_{2}k^{3} \; . \end{aligned}$$
(29)

By using (28) and (29),

$$\begin{aligned} \lim _{k \rightarrow \infty } \displaystyle \frac{\displaystyle \max \nolimits _{1 \le r,s \le k}{\tilde{b}}_{k}^{2}(r,s)}{\displaystyle \frac{1}{k-1}\sum \nolimits _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k}\sum \nolimits _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}{\tilde{b}}_{k}^{2}(r,s)} \le \lim _{k \rightarrow \infty } \frac{c_{1}k^2}{\displaystyle \frac{1}{k-1}\sum \nolimits _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k} c_{2}k^3} = 0, \end{aligned}$$

i.e., the condition (17) of Theorem 4 is fulfilled and the distribution of \({\tilde{D}}_{F}\) is asymptotically normal. \(\square \)
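
A Monte Carlo illustration of Lemma 1 (a sketch, not part of the proof): sampling \(\pi \) uniformly from \({\mathbf {S}}_{{\mathbf {k}}}^{(i,j)}\) amounts to permuting the values other than i over the positions other than j, and the sample moments of \({\tilde{D}}_{F}\) can then be compared with (22) and (23).

```python
import numpy as np

rng = np.random.default_rng(3)
k, i, j, n = 12, 4, 9, 100_000

def f(x):  # formula (24)
    return (x * (x - 1) + (k - x) * (k - x + 1)) / 2

mean_exact = k * (k + 1) / 3 - (f(i) + f(j) - abs(i - j)) / (k - 1)  # (22)
var_exact = sum(                                                      # (23)
    (abs(r - s) + k * (k + 1) / (3 * (k - 1))
     - (f(r) + f(s) - abs(i - s) - abs(r - j)) / (k - 1)
     - (f(i) + f(j) - abs(i - j)) / (k - 1) ** 2) ** 2
    for r in range(1, k + 1) if r != i
    for s in range(1, k + 1) if s != j
) / (k - 2)

# tilde{D}_F = sum over s != j of |pi(s) - s| for pi uniform on S_k^{(i,j)}.
vals = np.array([v for v in range(1, k + 1) if v != i])
pos = np.array([s for s in range(1, k + 1) if s != j])
sims = np.array([np.abs(rng.permutation(vals) - pos).sum() for _ in range(n)])

print(mean_exact, sims.mean())  # should agree closely
print(var_exact, sims.var(ddof=1))
```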

Similarly to \(\left\{ D_{F}^{(i,j)}\right\} _{i,j=1}^{k}\), consider the random variables \(D_{R}^{(i,j)}=d_{R}\big (\pi ,e_{k}\big )\) based on Spearman’s rho. For a fixed pair \((i,j)\),

$$\begin{aligned} D_{R}^{(i,j)}=\sum \limits _{s=1}^{k}\big (\pi (s)-s\big )^{2}=\sum \limits _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}\big (\pi (s)-s\big )^{2} + \left( i-j\right) ^{2}=\sum \limits _{s=1}^{k-1}{\bar{a}}_{k}(\sigma (s),s) + \left( i-j\right) ^{2} \,, \end{aligned}$$

where

$$\begin{aligned} {\bar{a}}_{k}(r,s)= {\left\{ \begin{array}{ll} \left( r-s \right) ^{2}, &{}\quad \hbox {if }\quad s<j \quad \hbox { and }\quad r<i,\\ \left( r+1-s \right) ^{2}, &{}\quad \hbox {if }\quad s<j \quad \hbox { and }\quad r\ge i,\\ \left( r-s-1 \right) ^{2}, &{}\quad \hbox {if }\quad s\ge j \quad \hbox { and }\quad r<i,\\ \left( r+1-s-1 \right) ^{2}, &{}\quad \hbox {if }\quad s\ge j \quad \hbox { and }\quad r\ge i, \end{array}\right. } \end{aligned}$$
(30)

for \(r,s=1,2,\ldots ,k-1\), \(\pi \sim Uniform\left( {\mathbf {S}}_{{\mathbf {k}}}^{(i,j)}\right) \) and \(\sigma (s)\) is defined as in (20).

Lemma 2

Let \(\displaystyle {\bar{D}}_{R}\left( \sigma \right) =\sum \nolimits _{s=1}^{k-1}{\bar{a}}_{k}(\sigma (s),s)\), where \(\sigma (\cdot )\) and \({\bar{a}}_{k}(\cdot ,\cdot )\) are given in (20) and (30), respectively. Then the distribution of \({\bar{D}}_{R}\) is asymptotically normal with mean and variance

$$\begin{aligned} {\mathbf {E}}\left( {\bar{D}}_{R}\right)&= \displaystyle \frac{k^{2}(k+1)}{6}-\frac{h(i)+h(j)-\left( i-j\right) ^{2} }{k-1},\\ \mathbf {Var} \left( {\bar{D}}_{R}\right)&= \displaystyle \frac{1}{k-2}\left\{ \sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k}\sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k} \left[ \left( r-s \right) ^{2} +\frac{k^{2}(k+1)}{6(k-1)}\right. \right. \\&\qquad \qquad \qquad -\frac{h(r)+h(s)-\left( i-s\right) ^{2} -\left( r-j\right) ^{2} }{k-1}\\&\qquad \qquad \qquad \left. \left. -\frac{h(i)+h(j)-\left( i-j\right) ^{2} }{(k-1)^{2}}\right] ^{2}\right\} , \end{aligned}$$

where

$$\begin{aligned} h(x)=\frac{x(x-1)(2x-1)+(k-x)(k-x+1)(2k-2x+1)}{6}. \end{aligned}$$

Proof

By using (19) and the fact that for \(x\in \left\{ 1,2,\ldots ,k\right\} \)

$$\begin{aligned} h(x)= & {} \sum _{r=1}^{k}\left( r-x\right) ^{2}\nonumber \\= & {} \frac{x(x-1)(2x-1)+(k-x)(k-x+1)(2k-2x+1)}{6}, \end{aligned}$$
(31)

\({\mathbf {E}}\left( {\bar{D}}_{R}\right) \) and \(\mathbf {Var} \left( {\bar{D}}_{R}\right) \) can be evaluated in the same way as in the proof of Lemma 1.

Now, consider the quantities

$$\begin{aligned} {\bar{b}}_{k}(r,s)= & {} \left( r-s\right) ^{2} +\frac{k^{2}(k+1)}{6(k-1)}-\frac{h(r)+h(s)-\left( i-s\right) ^{2} -\left( r-j\right) ^{2} }{k-1} \\&-\frac{h(i)+h(j)-\left( i-j\right) ^{2} }{(k-1)^{2}} \end{aligned}$$

for \(r,s=1,2,\ldots ,k\). Since, by (31),

$$\begin{aligned} \frac{k(k^{2}+2)}{12} \le h(x)\le \frac{k(k-1)(2k-1)}{6} \quad \hbox {for } 1\le x \le k, \end{aligned}$$

it follows that

$$\begin{aligned} \left( r-s \right) ^{2} -\frac{k^{2}}{2}+\epsilon _{1} \le {\bar{b}}_{k}(r,s) \le \left( r-s \right) ^{2} +\epsilon _{2}, \end{aligned}$$

where \(r,s=1,2,\ldots ,k\), \(\displaystyle \lim _{k \rightarrow \infty }\frac{\epsilon _{1}}{k^{2}}=0\) and \(\displaystyle \lim _{k \rightarrow \infty }\frac{\epsilon _{2}}{k^{2}}=0\). Hence, there exists a constant \(c_{1}>0\) such that

$$\begin{aligned} \max _{1 \le r,s \le k}{\bar{b}}_{k}^{2}(r,s) \le c_{1}k^{4}. \end{aligned}$$
(32)

Further, fix the indices \(\displaystyle 1\le r,s \le \frac{k}{4}\). Then, since

$$\begin{aligned} \frac{k(7k^{2}+12k+8)}{48} \le h(x)\le \frac{k(k-1)(2k-1)}{6} \quad \hbox {for } 1\le x \le \frac{k}{4}, \end{aligned}$$

it follows that

$$\begin{aligned} \left( r-s \right) ^{2} -\frac{k^{2}}{2}+\epsilon _{3} \le {\bar{b}}_{k}(r,s) \le \left( r-s \right) ^{2} -\frac{k^{2}}{8}+\epsilon _{4}, \end{aligned}$$

where \(\displaystyle \lim _{k \rightarrow \infty }\frac{\epsilon _{3}}{k^{2}}=0\) and \(\displaystyle \lim _{k \rightarrow \infty }\frac{\epsilon _{4}}{k^{2}}=0\). Thus, there exists a number \(N>0\) such that for \(k\ge N\)

$$\begin{aligned} \left( r-s \right) ^{2} - k^{2} \le {\bar{b}}_{k}(r,s) \le \left( r-s \right) ^{2} -\frac{k^{2}}{9}. \end{aligned}$$

Hence, for \(k\ge N\)

$$\begin{aligned}&\sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k} \sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}{\bar{b}}_{k}^{2}(r,s)\ge \sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k/4} \sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k/4}{\bar{b}}_{k}^{2}(r,s)=\sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k/4} \left\{ \sum _{s=1}^{k/4}{\bar{b}}_{k}^{2}(r,s)-{\bar{b}}_{k}^{2}(r,j)\right\} \\&\quad \ge \sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k/4} \left\{ \sum _{\left( r-s\right) ^{2}=0}^{k^{2}/9}{\bar{b}}_{k}^{2}(r,s)-{\bar{b}}_{k}^{2}(r,j)\right\} \ge \sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k/4} \left\{ \sum _{v=0}^{k/3}\left( v^{2}-\frac{k^{2}}{9}\right) ^{2}-{\bar{b}}_{k}^{2}(r,j)\right\} , \end{aligned}$$

where \( \sum _{\left( r-s\right) ^{2}=0}^{k^{2}/9}\) is a summation over all values of s such that \(\displaystyle 0\le \left( r-s\right) ^{2} \le \frac{k^{2}}{9}\). Thus, for \(k\ge N\) there exists a constant \(c_{2}>0\) such that

$$\begin{aligned} \sum _{\begin{array}{c} r=1 \\ r\ne i \end{array}}^{k} \sum _{\begin{array}{c} s=1 \\ s\ne j \end{array}}^{k}{\bar{b}}_{k}^{2}(r,s) \ge c_{2}k^{6} \; . \end{aligned}$$
(33)

From (32) and (33), it is easy to check that the condition (17) of Theorem 4 is fulfilled and the distribution of \({\bar{D}}_{R}\) is asymptotically normal. \(\square \)
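
The moments in Lemma 2 can be verified in the same style as for Lemma 1; a minimal sketch for the mean (the variance check is analogous, with squared displacements throughout):

```python
import numpy as np

rng = np.random.default_rng(4)
k, i, j, n = 12, 4, 9, 100_000

def h(x):  # formula (31)
    return (x * (x - 1) * (2 * x - 1)
            + (k - x) * (k - x + 1) * (2 * k - 2 * x + 1)) / 6

# Exact mean of bar{D}_R from Lemma 2.
mean_exact = k**2 * (k + 1) / 6 - (h(i) + h(j) - (i - j)**2) / (k - 1)

# bar{D}_R = sum over s != j of (pi(s) - s)^2 for pi uniform on S_k^{(i,j)}.
vals = np.array([v for v in range(1, k + 1) if v != i])
pos = np.array([s for s in range(1, k + 1) if s != j])
sims = np.array([((rng.permutation(vals) - pos)**2).sum() for _ in range(n)])
print(mean_exact, sims.mean())  # should agree closely
```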

Proof of Theorem 2

From (5), (6) and (7), it follows that

$$\begin{aligned} q(i,j,k,\theta )=\sum _{\pi (j)=i} \exp \left( \theta d(\pi ,e_{k})-\psi _{k}(\theta )\right) =\frac{(k-1)!{\tilde{m}}_{k-1}(\theta )}{k!m_{k}(\theta )}=\frac{1}{k}\frac{{\tilde{m}}_{k-1}(\theta )}{m_{k}(\theta )}, \end{aligned}$$

where \(m_{k}(\cdot )\) and \({\tilde{m}}_{k-1}(\cdot )\) are the moment generating functions of \(D_{F}(\pi )\) and \(D_{F}^{(i,j)}(\sigma )\) for \(\pi \sim Uniform(\mathbf {S_{k}})\) and \(\sigma \sim Uniform\left( {\mathbf {S}}_{{\mathbf {k}}}^{(i,j)}\right) \). Since \(D_{F}^{(i,j)}={\tilde{D}}_{F}+\mid i-j\mid \) and, according to Lemma 1, \({\tilde{D}}_{F}\) is asymptotically normal, it follows that \(D_{F}^{(i,j)}\) is asymptotically normal. Therefore, \(m_{k}(\cdot )\) and \({\tilde{m}}_{k-1}(\cdot )\) can be approximated by the moment generating function of the normal distribution, and

$$\begin{aligned} q(i,j,k,\theta ) \, \displaystyle \frac{k}{ \exp \left( \theta \mu + \frac{\theta ^{2}\nu ^{2}}{2}\right) } \xrightarrow [k \rightarrow \infty ]{} 1, \end{aligned}$$

where \(\mu ={\mathbf {E}}\left( D_{F}^{(i,j)}\right) -{\mathbf {E}}\left( D_{F}\right) \) and \(\nu ^{2}=\mathbf {Var}\left( D_{F}^{(i,j)}\right) -\mathbf {Var}\left( D_{F}\right) \).

The values of \(\mu \) and \(\nu ^{2}\) given in Theorem 2 are obtained by combining formulas (18), (22) and (23) with

$$\begin{aligned} {\mathbf {E}}\left( D_{F}^{(i,j)}\right) ={\mathbf {E}}\left( {\tilde{D}}_{F}\right) +\mid i-j\mid \quad \hbox {and} \quad \mathbf {Var}\left( D_{F}^{(i,j)}\right) =\mathbf {Var}\left( {\tilde{D}}_{F}\right) . \end{aligned}$$

\(\square \)
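
For small k, the limit above can be inspected directly: enumerate \(\mathbf {S_{k}}\) to compute \(q(i,j,k,\theta )\) exactly and compare it with the approximation \(\exp \left( \theta \mu +\theta ^{2}\nu ^{2}/2\right) /k\). The sketch below uses illustrative values of \((i,j,k,\theta )\); since the result is asymptotic, the agreement at \(k=7\) is only rough and improves as k grows.

```python
from itertools import permutations
from math import exp

k, i, j, theta = 7, 2, 5, -0.3  # illustrative values; theta < 0 concentrates mass near e_k

def f(x):  # formula (24)
    return (x * (x - 1) + (k - x) * (k - x + 1)) / 2

# Exact q(i, j, k, theta): total model probability of {pi : pi(j) = i}.
num = den = 0.0
for pi in permutations(range(1, k + 1)):
    w = exp(theta * sum(abs(pi[s] - (s + 1)) for s in range(k)))  # exp(theta * d_F(pi, e_k))
    den += w
    if pi[j - 1] == i:
        num += w
q_exact = num / den

# Normal approximation with mu, nu^2 assembled from (18), (22) and (23).
mean_DFij = k * (k + 1) / 3 - (f(i) + f(j) - abs(i - j)) / (k - 1) + abs(i - j)
var_DFij = sum(
    (abs(r - s) + k * (k + 1) / (3 * (k - 1))
     - (f(r) + f(s) - abs(i - s) - abs(r - j)) / (k - 1)
     - (f(i) + f(j) - abs(i - j)) / (k - 1) ** 2) ** 2
    for r in range(1, k + 1) if r != i
    for s in range(1, k + 1) if s != j
) / (k - 2)
mu = mean_DFij - (k**2 - 1) / 3
nu2 = var_DFij - (k + 1) * (2 * k**2 + 7) / 45
print(q_exact, exp(theta * mu + theta**2 * nu2 / 2) / k)  # rough agreement at k = 7
```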

Proof of Theorem 3

The proof is similar to that of Theorem 2, with Lemma 2 used in place of Lemma 1.

\(\square \)

About this article

Cite this article

Nikolov, N.I., Stoimenova, E. Mallows’ models for imperfect ranking in ranked set sampling. AStA Adv Stat Anal 104, 459–484 (2020). https://doi.org/10.1007/s10182-019-00354-4
