Abstract
In practice, a representative sample of a population is usually selected by combining several probability sampling methods; such a combination is called a complex sampling design. Deriving unbiased estimators of the population parameters under a complex design usually requires sophisticated mathematical calculations, so only a limited number of sampling designs are commonly used. To overcome this complexity, we propose a general resampling-based estimation method that is suitable for all standard designs, whether conventional or adaptive. The method computes the Murthy estimator, an unbiased estimator of the population mean, together with its variance estimator, without intensive mathematical calculations. Using this method, researchers can apply any probability design with the guarantee that the estimator is unbiased. To demonstrate this, and as an application of the method, we introduce Adaptive Random Walk Sampling, a complex and efficient sampling design suited to quadrat-based environmental populations. Despite the complexity of this design, the proposed method provides an unbiased estimator of the population mean under the design, which makes the design practical. Simulations confirm the expected performance of the method.
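As an illustration of the estimator named in the abstract, the following is a minimal sketch of Murthy's estimator of the population mean for a toy design, computed by exact enumeration rather than by resampling. It is not the paper's procedure: the design (unequal-probability sampling without replacement), the population values `y`, the draw probabilities `p`, and all function names are assumptions chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations


def ordered_prob(order, p):
    """Probability of drawing the units in exactly this order
    (unequal-probability sampling without replacement; assumed toy design)."""
    prob, remaining = Fraction(1), Fraction(1)
    for i in order:
        prob *= p[i] / remaining
        remaining -= p[i]
    return prob


def sample_prob(s, p):
    """P(s): probability of the unordered sample s."""
    return sum(ordered_prob(o, p) for o in permutations(s))


def murthy_mean(s, y, p):
    """Murthy's estimator of the mean: (1/N) * sum_{i in s} y_i * P(s|i) / P(s),
    where P(s|i) is the probability of s given that unit i was drawn first."""
    P_s = sample_prob(s, p)
    total = Fraction(0)
    for i in s:
        P_s_given_i = sum(
            ordered_prob(o, p) for o in permutations(s) if o[0] == i
        ) / p[i]
        total += y[i] * P_s_given_i / P_s
    return total / len(y)


def expected_murthy_mean(y, p, n):
    """Design expectation of the estimator over all samples of size n."""
    return sum(
        sample_prob(s, p) * murthy_mean(s, y, p)
        for s in combinations(range(len(y)), n)
    )


# Hypothetical population of N = 4 units and first-draw probabilities.
y = [Fraction(3), Fraction(7), Fraction(1), Fraction(9)]
p = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
print(expected_murthy_mean(y, p, 2), sum(y) / len(y))
```

Exact rational arithmetic is used so that the design expectation can be compared with the population mean without rounding error. In a complex design the probabilities P(s) and P(s|i) are generally not available in closed form, which is the gap a resampling approach is meant to fill.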
References
Brown, J.A. and Manly, B.J.F. (1998), Restricted adaptive cluster sampling. Environmental and Ecological Statistics, 5, 49–63.
Chao, C.T. and Thompson, S.K. (1999), Incomplete adaptive cluster sampling designs. In: Proceedings of the section on survey research methods of the American statistical association, 345–350.
Fattorini, L. (2006), Applying the Horvitz–Thompson criterion in complex designs: a computer-intensive perspective for estimating inclusion probabilities, Biometrika, 93(2), 269–278.
Horvitz, D.G. and Thompson, D.J. (1952), A generalization of sampling without replacement from a finite universe, Journal of the American Statistical Association, 47, 663–685.
Karr, A.F. (1993), Probability, Springer-Verlag, New York.
Kruskal, W. and Mosteller, F. (1997a), Representative sampling, I: non-scientific literature, International Statistical Review, 47, 13–24.
— (1997b), Representative sampling, II: scientific literature, excluding statistics, International Statistical Review, 47, 111–127.
— (1997c), Representative sampling, III: the current statistical literature, International Statistical Review, 47, 245–265.
Murthy, M.N. (1957), Ordered and unordered estimators in sampling without replacement. Sankhya: Indian Journal of Statistics, 18, 379–390.
Narain, R. (1951), On sampling without replacement with varying probabilities. Journal of the Indian Society of Agricultural Statistics, 3, 169–175.
Panahbehagh, B. (2016), Adaptive rectangular sampling: an easy, incomplete, neighborhood-free adaptive cluster sampling design. Survey Methodology, 42(2), 263–281.
Panahbehagh, B. and Brown, J. (2016), Gap-based inverse sampling. Communications in Statistics - Theory and Methods, https://doi.org/10.1080/03610926.2016.1217022.
Ross, S.M. (2006), A first course in probability. Upper Saddle River, N.J., Pearson Prentice Hall.
Salehi, M.M. and Seber, G.A.F. (1997), Two-stage adaptive cluster sampling. Biometrics, 53, 959–970.
— (2001), A new proof of Murthy’s estimator which applies to sequential sampling, Australian & New Zealand Journal of Statistics, 43(3), 281–286.
Salehi, M.M. and Smith, D.R. (2005), Two-stage sequential sampling: a neighborhood-free adaptive sampling procedure. Journal of Agricultural, Biological, and Environmental Statistics, 10, 84–103.
Särndal, C.E., Swensson, B. and Wretman, J. (1992), Model assisted survey sampling. Springer series in statistics, Springer-Verlag Publishing.
Smith, D.R., Conroy, M.J. and Brakhage, H. (1995), Efficiency of adaptive cluster sampling for estimating density of wintering waterfowl. Biometrics, 51, 777–788.
Su, Z. and Quinn II, T.J. (2003), Estimator bias and efficiency for adaptive cluster sampling with order statistics and a stopping rule. Environmental and Ecological Statistics, 10, 17–41.
Szwarcwald, C.L. and Damacena, G.N. (2008), Complex sampling design in population surveys: planning and effects on statistical data analysis. Rev Bras Epidemiol, 11, 38–45.
Thompson, S.K. (1990), Adaptive cluster sampling. Journal of the American Statistical Association, 85, 1050–1059.
Thompson, S.K. and Seber, G.A.F. (1996), Adaptive Sampling, Wiley, New York.
Yang, H., Kleinn, C., Fehrmann, L., Tang, S. and Magnussen, S. (2011), A new design for sampling with adaptive sample plots. Environmental and Ecological Statistics, 18, 223–237.
Appendix
Proof 1
For \({\hat{p}}_s\) we have
where \(I_s(k)=1\) if \(s_k=s\) (and 0 otherwise), and iid indicates that the sample units are independent and identically distributed.
Then, by the Strong Law of Large Numbers and since \(E(|I_s(k)|)<\infty \) (Karr 1993, p. 188), as \(K\rightarrow \infty \) we have
and then
The proofs for \({\hat{p}}_{s,i},\; {\hat{p}}_i,\; {\hat{p}}_{s,i,j}\) and \({\hat{p}}_{i,j}\) are the same as the proof for \({\hat{p}}_s\).
Also
are satisfied because of the continuity of the corresponding functions and the a.s. convergence of their arguments (for more details see Karr 1993, p. 150). \(\square \)
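The convergence argued in Proof 1 can be illustrated numerically. The sketch below estimates the probability of a sample s by its relative frequency over K independent executions of a design and compares it with the exact value. The toy design (unequal-probability sampling without replacement), the draw probabilities, the seed, and the function names are assumptions made for illustration only, not the design studied in the paper.

```python
import random
from collections import Counter


def draw_sample(p, n, rng):
    """Draw n distinct units one at a time, with probability proportional
    to the weights p among the units not yet selected (assumed toy design)."""
    units = list(range(len(p)))
    weights = list(p)
    chosen = []
    for _ in range(n):
        i = rng.choices(units, weights=weights, k=1)[0]
        chosen.append(i)
        weights[i] = 0.0  # unit i cannot be drawn again
    return frozenset(chosen)


def estimate_sample_probs(p, n, K, seed=0):
    """Relative frequency of each unordered sample over K independent runs."""
    rng = random.Random(seed)
    counts = Counter(draw_sample(p, n, rng) for _ in range(K))
    return {s: c / K for s, c in counts.items()}


p = [0.1, 0.2, 0.3, 0.4]
p_hat = estimate_sample_probs(p, n=2, K=100_000)

# Exact probability of s = {0, 1}: both draw orders are added up.
p_true = p[0] * p[1] / (1 - p[0]) + p[1] * p[0] / (1 - p[1])
print(p_hat[frozenset({0, 1})], p_true)
```

Each run contributes an indicator that is 1 when the realised sample equals s, so the estimate is a mean of iid bounded variables, which is the setting in which the Strong Law of Large Numbers applies.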
Proof 2
First, let \(E_d\) and \(E_r\) denote expectations with respect to the design and the resampling, respectively. In MBR it is easy to show that:
Also, if \(X\sim B(K,p)\),
then
and then
For variance, as
and since \((h+1)x(1-x)^{h}\le 1\) for any positive integer h and \(x\in [0,1]\), we have
and
then
Now we have
and for \(V_2\) we have
where MB denotes the Multinomial distribution. Then
and therefore
Now as
therefore
and then
and similar to \(V_1\) we have
\(\square \)
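The bound \((h+1)x(1-x)^{h}\le 1\) used in Proof 2 is stated there without justification. One way to verify it, given here as a short calculus sketch with the auxiliary function \(f\) introduced only for this purpose, is to maximise the left-hand side over \(x\in [0,1]\):

\[
f(x)=(h+1)\,x\,(1-x)^{h},\qquad
f'(x)=(h+1)\,(1-x)^{h-1}\bigl(1-(h+1)x\bigr).
\]

Since \(f(0)=f(1)=0\) and \(f'\) vanishes in \((0,1)\) only at \(x=1/(h+1)\), the maximum is

\[
f\!\left(\frac{1}{h+1}\right)=\left(\frac{h}{h+1}\right)^{h}\le 1 .
\]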
Proof 3
According to condition (3) of Theorem 3
and then
For variance, as
then
and for \(V_2\) we have
then we have
\(\square \)
Proof 4
Let s be the result of a standard sampling design with an equally likely sample space \(L=\{s(1),s(2),\ldots ,s(M)\}\). Then
where \(N_L(s)\) and \(N_L(s,i)\) are the number of outcomes in L that lead to s and s with i as the first unit, respectively. Therefore for \(p_{i|s}\) we have
Now, executing the design on s leads to
and, since the design is standard (\(P(s|{\mathbf {y}})\) does not depend on the y values of \(U-s\)) with an equally likely sample space, \(N_L(s)=N_{L^*}(s)\) and \(N_L(s,i)=N_{L^*}(s,i)\). Therefore
and then
The same holds for \(p_{i,j|s}\). \(\square \)
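The counting argument of Proof 4 can be made concrete on a small case. The sketch below enumerates an equally likely sample space L, counts the outcomes that lead to s and those that lead to s with unit i drawn first, and takes their ratio as the conditional probability that i is the first unit given s. The ratio form, the toy design (ordered simple random sampling without replacement), and the function name are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations


def first_unit_probs_given_sample(N, n, s):
    """Return {i: N_L(s, i) / N_L(s)} for each unit i in s, where L is the
    set of all equally likely ordered draws of n units out of N."""
    L = list(permutations(range(N), n))  # equally likely ordered outcomes
    target = frozenset(s)
    outcomes_for_s = [o for o in L if frozenset(o) == target]
    n_s = len(outcomes_for_s)  # N_L(s)
    return {
        i: Fraction(sum(1 for o in outcomes_for_s if o[0] == i), n_s)
        for i in sorted(target)  # numerator is N_L(s, i)
    }


print(first_unit_probs_given_sample(N=5, n=3, s={0, 2, 4}))
```

For this symmetric toy design every unit of s is equally likely to have been drawn first, so each ratio equals 1/n; in an adaptive design the counts would generally differ between units.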
Cite this article
Panahbehagh, B. Estimation in Complex Sampling Designs Based on Resampling Methods. JABES 25, 206–228 (2020). https://doi.org/10.1007/s13253-020-00390-7