Estimation in Complex Sampling Designs Based on Resampling Methods

Panahbehagh, Bardia

doi:10.1007/s13253-020-00390-7


Abstract

Generally, to select a representative sample of a population, we use a combination of several probabilistic sampling methods, which is called a complex sampling design. A complex sampling design usually needs very sophisticated mathematical calculations to provide unbiased estimators of the population parameters. Therefore, only a limited number of sampling designs are commonly used in practice. In the present study, to overcome this complexity, we propose a general method of estimation based on resampling that is suitable for all standard designs, either conventional or adaptive. In this method, we calculate the Murthy estimator as an unbiased estimator of the population mean, together with its variance estimator, without intensive mathematical calculations. Using this method, researchers can implement any probability design with the guarantee that the estimator is unbiased. To show this ability, and as an application of the method, we introduce Adaptive Random Walk Sampling as a complex and efficient sampling design suited to quadrat-based environmental populations. Despite the complexity of this design, the method proposed in this paper provides an unbiased estimator of the population mean under the design, which makes it a practical design. Simulations confirm the expected performance of the method.


References

Brown, J.A. and Manly, B.J.F. (1998), Restricted adaptive cluster sampling. Environmental and Ecological Statistics, 5, 49–63.
Chao, C.T. and Thompson, S.K. (1999), Incomplete adaptive cluster sampling designs. In: Proceedings of the Section on Survey Research Methods of the American Statistical Association, 345–350.
Fattorini, L. (2006), Applying the Horvitz–Thompson criterion in complex designs: a computer-intensive perspective for estimating inclusion probabilities. Biometrika, 93(2), 269–278.
Horvitz, D.G. and Thompson, D.J. (1952), A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663–685.
Karr, A.F. (1993), Probability. Springer-Verlag, New York.
Kruskal, W. and Mosteller, F. (1997a), Representative sampling, I: non-scientific literature. International Statistical Review, 47, 13–24.
— (1997b), Representative sampling, II: scientific literature, excluding statistics. International Statistical Review, 47, 111–127.
— (1997c), Representative sampling, III: the current statistical literature. International Statistical Review, 47, 245–265.
Murthy, M.N. (1957), Ordered and unordered estimators in sampling without replacement. Sankhya: Indian Journal of Statistics, 18, 379–390.
Narain, R. (1951), On sampling without replacement with varying probabilities. Journal of the Indian Society of Agricultural Statistics, 3, 169–175.
Panahbehagh, B. (2016), Adaptive rectangular sampling: an easy, incomplete, neighborhood-free adaptive cluster sampling design. Survey Methodology, 42(2), 263–281.
Panahbehagh, B. and Brown, J. (2016), Gap based inverse sampling. Communications in Statistics: Theory and Methods, https://doi.org/10.1080/03610926.2016.1217022.
Ross, S.M. (2006), A first course in probability. Pearson Prentice Hall, Upper Saddle River, N.J.
Salehi, M.M. and Seber, G.A.F. (1997), Two-stage adaptive cluster sampling. Biometrics, 53, 959–970.
— (2001), A new proof of Murthy’s estimator which applies to sequential sampling. Australian & New Zealand Journal of Statistics, 43(3), 281–286.
Salehi, M.M. and Smith, D.R. (2005), Two-stage sequential sampling: a neighborhood-free adaptive sampling procedure. Journal of Agricultural, Biological, and Environmental Statistics, 10, 84–103.
Särndal, C.E., Swensson, B. and Wretman, J. (1992), Model assisted survey sampling. Springer Series in Statistics, Springer-Verlag.
Smith, D.R., Conroy, M.J. and Brakhage, H. (1995), Efficiency of adaptive cluster sampling for estimating density of wintering waterfowl. Biometrics, 51, 777–788.
Su, Z. and Quinn II, T.J. (2003), Estimator bias and efficiency for adaptive cluster sampling with order statistics and a stopping rule. Environmental and Ecological Statistics, 10, 17–41.
Szwarcwald, C.L. and Damacena, G.N. (2008), Complex sampling design in population surveys: planning and effects on statistical data analysis. Rev Bras Epidemiol, 11, 38–45.
Thompson, S.K. (1990), Adaptive cluster sampling. Journal of the American Statistical Association, 85, 1050–1059.
Thompson, S.K. and Seber, G.A.F. (1996), Adaptive Sampling. Wiley, New York.
Yang, H., Kleinn, C., Fehrmann, L., Tang, S. and Magnussen, S. (2011), A new design for sampling with adaptive sample plots. Environmental and Ecological Statistics, 8, 223–237.

Author information

Authors and Affiliations

Department of Mathematics, Faculty of Mathematical Sciences and Computer, Kharazmi University, Tehran, Iran
Bardia Panahbehagh


Corresponding author

Correspondence to Bardia Panahbehagh.


Appendix

Proof 1

For ${\hat{p}}_s$ we have

$$\begin{aligned} {\hat{p}}_s=\frac{n(s)+1}{K+1};\;\;\; n(s)=\sum \limits _{k= 1}^KI_s(k);\;\;\; I_s(1),I_s(2),\ldots ,I_s(K)\sim ^{iid}\hbox {Bernoulli}(p_s) \end{aligned}$$

where $I_s(k)=1$ if $s_k=s$ and $I_s(k)=0$ otherwise, and iid indicates that the indicators are independent and identically distributed.

Then, by the Strong Law of Large Numbers and since $E(|I_s(k)|)<\infty $ (Karr 1993, p. 188), as $K\rightarrow \infty $ we have

$$\begin{aligned} {\bar{I}}_s=\frac{n(s)}{K}\xrightarrow {a.s.}p_s \end{aligned}$$

and then

$$\begin{aligned} {\hat{p}}_s=\frac{n(s)+1}{K+1}=\frac{{\bar{I}}_s+1/K}{1+1/K}\xrightarrow {a.s.}p_s. \end{aligned}$$

The proofs for ${\hat{p}}_{s,i},\; {\hat{p}}_i,\; {\hat{p}}_{s,i,j}$ and ${\hat{p}}_{i,j}$ follow the same argument as the proof for ${\hat{p}}_s$.

Also

$$\begin{aligned} {\hat{\mu }}_{\text{ K }}\xrightarrow {a.s.}{\hat{\mu }}; \;\;\; {\hat{V}}_{1\text{ K }}\xrightarrow {a.s.}{\hat{V}}_1;\;\;\; {\hat{V}}_{2\text{ K }}\xrightarrow {a.s.}{\hat{V}}_2 \end{aligned}$$

hold because the estimators are continuous functions of their arguments and those arguments converge a.s. (for more details see Karr 1993, p. 150). $\square $
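The convergence in Proof 1 can be illustrated numerically. The following is a minimal sketch rather than the paper's implementation: it assumes a hypothetical value `p_s = 0.3` and models the indicators $I_s(k)$ directly as Bernoulli draws instead of re-executing a real sampling design. The function name `p_hat` and the seed are ours.

```python
import random

def p_hat(p_s, K, rng):
    """Estimate p_s from K simulated resamples as (n(s) + 1) / (K + 1).

    Each resample is assumed to reproduce the sample s with probability
    p_s, so the indicators I_s(k) are modelled as iid Bernoulli(p_s).
    """
    n_s = sum(1 for _ in range(K) if rng.random() < p_s)
    return (n_s + 1) / (K + 1)

rng = random.Random(2020)   # fixed seed so the run is reproducible
p_s = 0.3                   # hypothetical probability of the sample s
estimates = {K: p_hat(p_s, K, rng) for K in (10, 1_000, 100_000)}
for K, est in estimates.items():
    print(K, round(est, 4))
```

As $K$ grows, the printed estimates should settle near the assumed `p_s`; adding 1 to the numerator and denominator keeps the estimate strictly positive even when no resample reproduces $s$.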

Proof 2

First, let $E_d$ and $E_r$ denote expectations with respect to the design and the resampling, respectively. In MBR it is easy to show that:

$$\begin{aligned} n(s,i)|n(s)\sim B(n(s),\frac{p_{s,i}}{p_s}). \end{aligned}$$

Also, if $X\sim B(K,p)$,

$$\begin{aligned} E_r\left( \frac{X}{X+1}\right) =1-\frac{1-(1-p)^{(K+1)}}{(K+1)p}, \end{aligned}$$

then

$$\begin{aligned} E_r\left( \frac{n(s,i)}{n(s)+1}\right) = \frac{p_{s,i}}{p_s}-\left[ \frac{1-(1-p_s)^{K+1}}{(K+1)p_s}\frac{p_{s,i}}{p_s}\right] , \end{aligned}$$

and then

$$\begin{aligned} \frac{|E({\hat{\mu }}_{\text{ K }})-\mu |}{\mu }= & {} \left| \frac{1}{\mu }E_d\left( \frac{1-(1-p_s)^{K+1}}{(K+1)p_s}\frac{1}{N}\sum \limits _{i\in s}\frac{p_{s,i}}{p_sp_i}y_i\right) \right| \\\le & {} \max \limits _{s}{\frac{1-(1-p_s)^{K+1}}{(K+1)p_s}} \le {\frac{1}{(K+1)p_{s^*}}}. \end{aligned}$$
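The binomial identity used above, and the resulting $1/((K+1)p)$ bound, can be checked by summing over the exact binomial probabilities. This is a small sketch under our own assumptions: the grid of `K` and `p` values is hypothetical, and the function names are not from the paper.

```python
from math import comb

def exact_expectation(K, p):
    """E[X / (X + 1)] for X ~ Binomial(K, p), summed over the exact pmf."""
    return sum(
        comb(K, x) * p**x * (1 - p) ** (K - x) * x / (x + 1)
        for x in range(K + 1)
    )

def shortfall(K, p):
    """The factor (1 - (1-p)^(K+1)) / ((K+1) p) that drives the relative bias."""
    return (1 - (1 - p) ** (K + 1)) / ((K + 1) * p)

def closed_form(K, p):
    """Closed form quoted in Proof 2: 1 - shortfall(K, p)."""
    return 1 - shortfall(K, p)

checks = []
for K in (5, 50, 200):            # hypothetical numbers of resamples
    for p in (0.05, 0.3, 0.9):    # hypothetical success probabilities
        checks.append((K, p, exact_expectation(K, p), closed_form(K, p),
                       shortfall(K, p), 1 / ((K + 1) * p)))

for K, p, exact, closed, gap, bound in checks:
    print(K, p, round(exact, 6), round(closed, 6), round(gap, 6), round(bound, 6))
```

In each row the first two printed values should agree, and the shortfall should never exceed the bound $1/((K+1)p)$, which shrinks at rate $1/K$ but is loose when $p$ is small.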

For variance, as

$$\begin{aligned} n(s,i,j)|n(s)\sim B(n(s),\frac{p_{s,i,j}}{p_s}), \end{aligned}$$

and since $(h+1)x(1-x)^{h}\le 1$ for any positive integer h and $x\in [0,1]$, we have

$$\begin{aligned} E_r\left( \frac{n(s,i,j)}{n(s)+1}\right) -\frac{p_{s,i,j}}{p_s}= \left[ \frac{(1-p_s)^{K+1}-1}{(K+1)p_s}\frac{p_{s,i,j}}{p_s}\right] \le \frac{1}{(K+1)(K+2)p^2_s}\frac{p_{s,i,j}}{p_s}, \end{aligned}$$

and

$$\begin{aligned} \frac{p_{s,i,j}}{p_s}-E_r\left( \frac{n(s,i,j)}{n^*(s)}\right) = \left[ \frac{1-(1-p_s)^{K+1}}{(K+1)p_s}\frac{p_{s,i,j}}{p_s}\right] \le \frac{1}{(K+1)p_s}\frac{p_{s,i,j}}{p_s}, \end{aligned}$$

then

$$\begin{aligned} |E_r\left( \frac{n(s,i,j)}{n^*(s)}\right) -\frac{p_{s,i,j}}{p_s}|\le \frac{1}{(K+1)p^2_s}\frac{p_{s,i,j}}{p_s}. \end{aligned}$$

Now we have

$$\begin{aligned} \frac{|E({\hat{V}}_{1\text{ K }})-V_1|}{V_1}= & {} \frac{|E_d(E_r({\hat{V}}_{1\text{ K }})-{\hat{V}}_1)|}{V_1}\\\le & {} \frac{E_d\left( \frac{1}{N^2}\sum \limits _{i\in s}\sum \limits _{j<i\in s}\frac{1}{{p}_{i,j}}\left| E_r\left( \frac{{\hat{p}}_{s,i,j}}{{\hat{p}}_s}\right) -\frac{p_{s,i,j}}{p_s}\right| \left( \frac{y_i}{{p}_i}-\frac{y_j}{{p}_{j}}\right) ^2{p}_i{p}_{j}\right) }{V_1}\\\le & {} \frac{\frac{1}{(K+1)p^2_{s^*}}E_d\left( \frac{1}{N^2}\sum \limits _{i\in s}\sum \limits _{j<i\in s}\frac{p_{s,i,j}}{p_{i,j}p_s}\left( \frac{y_i}{{p}_i} -\frac{y_j}{{p}_{j}}\right) ^2{p}_i{p}_{j}\right) }{V_1}\\= & {} \frac{1}{(K+1)p^2_{s^*}}\frac{E_d({\hat{V}}_1)}{V_1}=\frac{1}{(K+1)p^2_{s^*}}, \end{aligned}$$

and for $V_2$ we have

$$\begin{aligned} n(s,i)n(s,j)|n(s)\sim MB\left( n(s),\frac{p_{s,i}}{p_s},\frac{p_{s,j}}{p_s},1 -\frac{p_{s,i}}{p_s}-\frac{p_{s,j}}{p_s}\right) , \end{aligned}$$

where MB denotes the Multinomial distribution. Then

$$\begin{aligned} E_r\left( \frac{n(s,i)n(s,j)}{(n(s)+1)^2}\right)= & {} E_r\left[ E_r\left( \frac{n(s,i)}{n(s)+1}\frac{n(s,j)}{n(s)+1}|n(s)\right) \right] \\= & {} E_r\left[ E_r\left( \frac{n(s,i)}{n(s)+1}|n(s)\right) E_r\left( \frac{n(s,j)}{n(s)+1}|n(s)\right) \right. \\&\left. + Cov_r\left( \frac{n(s,i)}{n(s)+1},\frac{n(s,j)}{n(s)+1}|n(s)\right) \right] \\= & {} \frac{p_{s,i}p_{s,j}}{p^2_s}E_r\left[ \frac{n(s)^2}{(n(s)+1)^2} -\frac{n(s)}{(n(s)+1)^2}\right] \\\le & {} \frac{p_{s,i}p_{s,j}}{p_s^2} E_r\left[ \frac{n(s)^2}{(n(s)+1)^2}\right] \le \frac{p_{s,i}p_{s,j}}{p_s^2}E_r\left[ \frac{n(s)}{n(s)+1}\right] \\= & {} \frac{p_{s,i}p_{s,j}}{p_s^2}\left( 1+\frac{(1-p_s)^{K+1}-1}{(K+1)p_s}\right) , \end{aligned}$$

and therefore

$$\begin{aligned} E_r\left( \frac{n(s,i)n(s,j)}{(n(s)+1)^2}\right) -\frac{p_{s,i}p_{s,j}}{p_s^2}\le & {} \frac{p_{s,i}p_{s,j}}{p_s^2}\left( \frac{(1-p_s)^{K+1}-1}{(K+1)p_s}\right) \\\le & {} \frac{p_{s,i}p_{s,j}}{p_s^2}\left( \frac{1}{(K+1)(K+2)p_s^2}-\frac{1}{(K+1)p_s}\right) \\\le & {} \frac{p_{s,i}p_{s,j}}{p_s^2}\left( \frac{1}{(K+1)(K+2)p_s^2}\right) \\\le & {} \frac{p_{s,i}p_{s,j}}{p_s^2}\left( \frac{1}{(K+1)p_s^2}\right) . \end{aligned}$$

Now as

$$\begin{aligned}&E_r\left( \frac{n(s,i)n(s,j)}{(n(s)+1)^2}\right) \\&\quad = \frac{p_{s,i}p_{s,j}}{p_s^2} E_r\left[ \frac{n(s)(n(s)-1)}{(n(s)+1)^2}\right] \ge \frac{p_{s,i}p_{s,j}}{p_s^2} E_r\left[ \frac{(n(s)-1)^2}{(n(s)+1)^2}\right] \\&\quad \ge \frac{p_{s,i}p_{s,j}}{p_s^2} E_r\left[ \frac{(n(s)-1)^2}{(n(s)+1)^2}\right] \ge \frac{p_{s,i}p_{s,j}}{p_s^2} E_r\left[ \frac{(n(s)-1)^2}{(n(s)+1)(n(s)+2)}\right] \\&\quad =\frac{p_{s,i}p_{s,j}}{p_s^2}\\&\qquad \times \frac{-9(1-p_s)^{K+2}-4(K+2)p_s(1-p_s)^{K+1} +(K+2)p_s(1-p_s)+((K+2)p_s-3)^2}{(K+1)(K+2)p_s^2}, \end{aligned}$$

therefore

$$\begin{aligned}&\frac{p_{s,i}p_{s,j}}{p_s^2}-E_r\left( \frac{n(s,i)n(s,j)}{(n(s)+1)^2}\right) \le \frac{p_{s,i}p_{s,j}}{p_s^2}\\&\qquad \times \left( 1-\frac{-9(1-p_s)^{K+2}-4(K+2)p_s(1-p_s)^{K+1}+(K+2)p_s(1-p_s) +((K+2)p_s-3)^2}{(K+1)(K+2)p_s^2}\right) \\&\quad \le \frac{p_{s,i}p_{s,j}}{p_s^2}\left( 1+\frac{9}{(K+1)(K+2)p_s^2} +\frac{4}{(K+2)(K+1)p^2_s}-\frac{K+2}{(K+1)}\right. \\&\qquad \left. -\frac{9}{(K+1)(K+2)p_s^2}+\frac{6}{(K+1)p_s}-\frac{1-p_s}{(K+1)p_s}\right) \\&\quad =\frac{p_{s,i}p_{s,j}}{p_s^2}\left( \frac{4}{(K+2)(K+1)p^2_s} -\frac{1}{(K+1)}+\frac{6}{(K+1)p_s}\right) \le \frac{p_{s,i}p_{s,j}}{p_s^2}\left( \frac{10}{(K+1)p^2_s}\right) , \end{aligned}$$

and then

$$\begin{aligned} \left| E_r\left( \frac{n(s,i)n(s,j)}{(n(s)+1)^2}\right) -\frac{p_{s,i}p_{s,j}}{p_s^2}\right| \le \frac{10}{(K+1)p_s^2}\frac{p_{s,i}p_{s,j}}{p_s^2}, \end{aligned}$$

and similar to $V_1$ we have

$$\begin{aligned} \frac{|E({\hat{V}}_{2\text{ K }})-V_2|}{V_2}= & {} \frac{|E_d(E_r({\hat{V}}_{2\text{ K }})-{\hat{V}}_2)|}{V_2}\\\le & {} \frac{E_d\left( \frac{1}{N^2}\sum \limits _{i\in s}\sum \limits _{j<i\in s}\left| E_r\left( \frac{{\hat{p}}_{s,i}{\hat{p}}_{s,j}}{{\hat{p}}^2_s}\right) -\frac{p_{s,i}p_{s,j}}{p^2_s}\right| \left( \frac{y_i}{{p}_i}-\frac{y_j}{{p}_{j}}\right) ^2\right) }{V_2}\\\le & {} \frac{\frac{10}{(K+1)p^2_{s^*}}E_d\left( \frac{1}{N^2}\sum \limits _{i\in s}\sum \limits _{j<i\in s}\frac{p_{s,i}p_{s,j}}{p^2_s}\left( \frac{y_i}{{p}_i}-\frac{y_j}{{p}_{j}}\right) ^2\right) }{V_2}\\= & {} \frac{10}{(K+1)p^2_{s^*}}\frac{E_d({\hat{V}}_2)}{V_2}=\frac{10}{(K+1)p^2_{s^*}}. \end{aligned}$$

$\square $
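The multinomial step in Proof 2 can be checked by brute force. The sketch below is ours and rests on hypothetical values: `q_i` and `q_j` stand for $p_{s,i}/p_s$ and $p_{s,j}/p_s$, and `K`, `p_s` are chosen only to keep the enumeration small. It verifies that the conditional cross moment equals $n(s)(n(s)-1)$ times `q_i * q_j`, then compares the unconditional expectation with the stated bound.

```python
from math import comb

def binom_pmf(K, n, p):
    """P(n(s) = n) for n(s) ~ Binomial(K, p)."""
    return comb(K, n) * p**n * (1 - p) ** (K - n)

def conditional_cross_moment(n, q_i, q_j):
    """E[n_i * n_j | n] for a Multinomial(n; q_i, q_j, 1 - q_i - q_j)
    vector, computed by enumerating every cell configuration."""
    q_rest = 1 - q_i - q_j
    total = 0.0
    for a in range(n + 1):
        for b in range(n - a + 1):
            c = n - a - b
            prob = comb(n, a) * comb(n - a, b) * q_i**a * q_j**b * q_rest**c
            total += a * b * prob
    return total

def cross_term_expectation(K, p_s, q_i, q_j):
    """E_r[n(s,i) n(s,j) / (n(s) + 1)^2] with n(s) ~ Binomial(K, p_s)."""
    return sum(
        binom_pmf(K, n, p_s) * conditional_cross_moment(n, q_i, q_j) / (n + 1) ** 2
        for n in range(K + 1)
    )

K, p_s, q_i, q_j = 60, 0.5, 0.5, 0.3        # hypothetical values
target = q_i * q_j                           # p_{s,i} p_{s,j} / p_s^2
expected = cross_term_expectation(K, p_s, q_i, q_j)
bound = 10 * target / ((K + 1) * p_s**2)     # bound derived in Proof 2
print(round(expected, 6), round(target, 6), round(bound, 6))
```

For these values the expectation falls slightly below the target, and the gap is well inside the bound, which is consistent with the constant 10 being conservative.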

Proof 3

According to condition (3) of Theorem 3,

$$\begin{aligned} E_r\left( \frac{n(s,i)}{n(s)}\right) =\frac{p_{s,i}}{p_s} \end{aligned}$$

and then

$$\begin{aligned} E({\hat{\mu }}_{\text{ K }})=E_d\left( \frac{1}{N}\sum \limits _{i\in s}\frac{p_{s,i}}{p_sp_i}y_i\right) =\mu . \end{aligned}$$

For variance, as

$$\begin{aligned} E_r\left( \frac{n(s,i,j)}{n(s)}\right) =\frac{p_{s,i,j}}{p_s} \end{aligned}$$

then

$$\begin{aligned} E({\hat{V}}_{1\text{ K }})=V_1 \end{aligned}$$

and for $V_2$ we have

$$\begin{aligned} E_r\left( \frac{n(s,i)n(s,j)}{n(s)^2}\right) =\frac{p_{s,i}p_{s,j}}{p_s^2} \left( 1-\frac{1}{K}\right) \end{aligned}$$

then we have

$$\begin{aligned} \frac{|E({\hat{V}}_{2\text{ K }})-V_2|}{V_2}= & {} \frac{|E_d\left( E_r({\hat{V}}_{2\text{ K }})-{\hat{V}}_2\right) |}{V_2}\\= & {} \frac{E_d\left( \frac{1}{N^2}\sum \limits _{i\in s} \sum \limits _{j<i\in s}\left| E_r\left( \frac{{\hat{p}}_{s,i}{\hat{p}}_{s,j}}{{\hat{p}}^2_s}\right) -\frac{p_{s,i}p_{s,j}}{p^2_s}\right| \left( \frac{y_i}{{p}_i}-\frac{y_j}{{p}_{j}}\right) ^2\right) }{V_2} \\= & {} \frac{\frac{1}{K}E_d\left( \frac{1}{N^2}\sum \limits _{i\in s} \sum \limits _{j<i\in s}\frac{p_{s,i}p_{s,j}}{p^2_s}\left( \frac{y_i}{{p}_i} -\frac{y_j}{{p}_{j}}\right) ^2\right) }{V_2}=\frac{1}{K}. \end{aligned}$$

$\square $
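The factor $1-1/K$ in Proof 3 can be recovered from standard multinomial moments. The following sketch assumes, as that factor suggests, that under condition (3) every one of the $K$ resamples reproduces $s$, so that $n(s)=K$. It writes $q_i=p_{s,i}/p_s$ and $q_j=p_{s,j}/p_s$ (our shorthand) and treats $n(s,i)$ and $n(s,j)$ as two cells of a multinomial vector with $K$ trials:

$$\begin{aligned} E_r\left( n(s,i)n(s,j)\right)= & {} Cov_r\left( n(s,i),n(s,j)\right) +E_r\left( n(s,i)\right) E_r\left( n(s,j)\right) \\= & {} -Kq_iq_j+K^2q_iq_j=K(K-1)q_iq_j. \end{aligned}$$

Dividing by $n(s)^2=K^2$ gives $q_iq_j(1-1/K)$, so the absolute difference from $q_iq_j$ is $q_iq_j/K$, which is the source of the relative bias $1/K$ for ${\hat{V}}_{2\text{ K }}$.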

Proof 4

Consider $s$ as the result of a standard sampling design with an equally likely sample space $L=\{s(1),s(2),\ldots ,s(M)\}$. Then

$$\begin{aligned} p_s=\frac{N_L(s)}{M};\;\;\;p_{s,i}=\frac{N_L(s,i)}{M}, \end{aligned}$$

where $N_L(s)$ and $N_L(s,i)$ are the numbers of outcomes in $L$ that lead to $s$ and to $s$ with $i$ as the first unit, respectively. Therefore for $p_{i|s}$ we have

$$\begin{aligned} p_{i|s}=\frac{p_{s,i}}{p_s}=\frac{N_L(s,i)}{N_L(s)}. \end{aligned}$$

Now executing the design on s will lead to

$$\begin{aligned} L^*=\{s^*(1),s^*(2),\ldots ,s^*(M^*)\}, \end{aligned}$$

and as the design is standard ($P(s|{\mathbf {y}})$ does not depend on the $y$ values of $U-s$) with an equally likely sample space, we have $N_L(s)=N_{L^*}(s)$ and $N_L(s,i)=N_{L^*}(s,i)$. Therefore

$$\begin{aligned} p^*_s=\frac{N_{L^*}(s)}{M^*}=\frac{N_L(s)}{M^*};\;\;\;p^*_{s,i}=\frac{N_{L^*}(s,i)}{M^*}=\frac{N_L(s,i)}{M^*}, \end{aligned}$$

and then

$$\begin{aligned} p^*_{i|s}=\frac{p^*_{s,i}}{p^*_s}=\frac{N_L(s,i)}{N_L(s)}=p_{i|s}. \end{aligned}$$

The same holds for $p_{i,j|s}$. $\square $
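The counting argument of Proof 4 can be made concrete for the simplest conventional design. The sketch below is an illustration under our own assumptions, not the paper's code: it takes ordered simple random sampling without replacement as the design, uses a hypothetical five-unit population, and enumerates the equally likely outcomes once on the full population $U$ and once on the sample $s$ alone.

```python
from itertools import permutations

def conditional_first_unit_probs(population, n, sample):
    """p_{i|s} = N_L(s, i) / N_L(s) for ordered draws without replacement.

    The equally likely sample space L is the set of all ordered n-tuples
    drawn from `population`; an outcome leads to s when its unordered
    content equals `sample`.
    """
    target = frozenset(sample)
    outcomes = [o for o in permutations(population, n) if frozenset(o) == target]
    n_s = len(outcomes)  # N_L(s)
    return {i: sum(1 for o in outcomes if o[0] == i) / n_s for i in sorted(target)}

U = ["a", "b", "c", "d", "e"]   # hypothetical population labels
s = ["b", "d", "e"]             # hypothetical observed sample

p_original = conditional_first_unit_probs(U, 3, s)   # design executed on U
p_resample = conditional_first_unit_probs(s, 3, s)   # design re-executed on s
print(p_original)
print(p_resample)
```

Both dictionaries assign probability 1/3 to each unit of $s$, so the conditional probabilities can be obtained from $s$ alone. An adaptive design would need its own enumeration or simulation step in place of `permutations`.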


About this article

Cite this article

Panahbehagh, B. Estimation in Complex Sampling Designs Based on Resampling Methods. JABES 25, 206–228 (2020). https://doi.org/10.1007/s13253-020-00390-7


Received: 25 June 2019
Accepted: 02 March 2020
Published: 12 March 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s13253-020-00390-7
