Nonparametric Interval Estimators for the Coefficient of Variation

Dongliang Wang; Margaret K. Formica; Song Liu

doi:10.1515/ijb-2017-0041

Published by De Gruyter April 19, 2018

From the journal The International Journal of Biostatistics

Abstract

The coefficient of variation (CV) is a widely used scaleless measure of variability in many disciplines. However, inference for the CV is limited to parametric methods or the standard bootstrap. In this paper, we propose two nonparametric methods for constructing confidence intervals for the CV. The first applies the empirical likelihood after transforming the original data. The second is a modified jackknife empirical likelihood method. We also propose bootstrap procedures for calibrating the test statistics. Results from our simulation studies suggest that the proposed methods, particularly the empirical likelihood method with bootstrap calibration, are comparable to existing methods for normal data and yield better coverage probabilities for nonnormal data. We illustrate our methods by applying them to two real-life datasets.

Keywords: coefficient of variation; empirical likelihood; jackknife empirical likelihood; bootstrap; Wilks’ theorem

Acknowledgements

We are grateful for the constructive comments from the Associate Editor and the three anonymous referees, which notably improved the quality of our manuscript.

Appendix

5.1 Proof of Theorem 2.2

Proof

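Firstly we show that at $\tau=\tau_0$, $U_n(\tau)$ can be written as a U-statistic

$$U_n(\tau)=\binom{n}{2}^{-1}\sum_{1\le i_1<i_2\le n} h(X_{i_1},X_{i_2};\tau),$$

where $h(X_1,X_2;\tau)=\frac{1}{2}\tau^2(X_1^2+X_2^2)-\frac{1}{2}(\tau^2+1)(X_1-X_2)^2$, since

$$s^2=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2=\binom{n}{2}^{-1}\sum_{1\le i_1<i_2\le n}\frac{1}{2}(X_{i_1}-X_{i_2})^2.$$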
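As a quick numerical sanity check of the two identities above, the following base-R sketch compares the pairwise (U-statistic) forms with their closed forms on a small simulated sample. The sample, the seed, and the value of `tau` are arbitrary assumptions made for illustration; the closed form `tau^2 * mean(x^2) - (tau^2 + 1) * var(x)` is the expression computed by `U_n.f` in the R script of Section 5.2.

```r
# Numerical check of the U-statistic representation (base R only).
set.seed(1)                      # arbitrary seed, for reproducibility
x   <- rexp(12, rate = 0.5)      # illustrative sample (assumption)
tau <- 0.8                       # arbitrary candidate CV value (assumption)

# Kernel h(x1, x2; tau) from the proof
h <- function(x1, x2, tau) {
  0.5 * tau^2 * (x1^2 + x2^2) - 0.5 * (tau^2 + 1) * (x1 - x2)^2
}

pairs <- combn(length(x), 2)     # all index pairs i1 < i2, one per column
x1 <- x[pairs[1, ]]
x2 <- x[pairs[2, ]]

s2_pairs  <- mean(0.5 * (x1 - x2)^2)                    # pairwise form of s^2
Un_pairs  <- mean(h(x1, x2, tau))                       # U-statistic form of U_n(tau)
Un_closed <- tau^2 * mean(x^2) - (tau^2 + 1) * var(x)   # closed form

print(all.equal(s2_pairs, var(x)))     # pairwise s^2 vs. usual sample variance
print(all.equal(Un_pairs, Un_closed))  # U-statistic vs. closed form
```

Averaging over all pairs costs O(n^2) operations, which is presumably why the script works with the closed form instead.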

We then show that $Eh(X_1,X_2;\tau)=0$, which can be easily derived by noting the facts that

$$\tau^2=\frac{\sigma^2}{\mu^2}=\frac{\sigma^2}{EX^2-\sigma^2},\qquad \sigma^2=\frac{1}{2}E(X_1-X_2)^2$$

and

$$EX^2=\frac{1}{2}E(X_1^2+X_2^2).$$

Thus Theorem 2.2 follows directly from Theorem 2.1 in Jing et al. (2009).
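The expectation step in the proof can be written out term by term. The following worked version uses only the three facts listed above, with $\mu = EX$; it is offered as a reading aid rather than as part of the original argument.

```latex
\begin{aligned}
E\,h(X_1,X_2;\tau)
  &= \tau^2\cdot\tfrac{1}{2}E(X_1^2+X_2^2) - (\tau^2+1)\cdot\tfrac{1}{2}E(X_1-X_2)^2 \\
  &= \tau^2\,EX^2 - (\tau^2+1)\,\sigma^2 \\
  &= \tau^2\,(EX^2-\sigma^2) - \sigma^2 \\
  &= \tau^2\mu^2 - \sigma^2 .
\end{aligned}
```

The last line vanishes exactly when $\tau^2=\sigma^2/\mu^2$, that is, at $\tau=\tau_0$.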

5.2 R script for the proposed methods

    if (FALSE) {
      install.packages("emplik")
    }
    library(emplik)

    ######################
    ## EL functions
    ######################
    cvhat4el.f = function(x){
      ## CV estimate for EL, based on the paired observations (x[i], x[m+i])
      n = length(x)
      m = floor(n/2)
      y = z = rep(NA,m)
      for (i in 1:m){
        y[i] = (x[i]-x[m+i])^2/2
        z[i] = (x[i]^2+x[m+i]^2)/2
      }
      cvhat = sqrt(mean(y)/mean(z-y))
      out = list(cvhat=cvhat,x=x, y=y, z=z)
      return(out)
    }

    el.f = function(y,z,tau){
      ## -2 log EL ratio for the hypothesis CV = tau
      zvals = y-tau^2*(z-y)
      tt = el.test(x=zvals,mu=0)
      ll = tt$"-2LLR"
      ## weights that do not sum to the sample size: set a large value
      if(abs(sum(tt$wts)-length(y))>1) ll = 300
      out = list("-2LLR" = ll,zvals = zvals,tau = tau,n = length(y))
      return(out)
    }

    ci.el.f = function(x,avals = c(0.10,0.05),B = 1000,step = 0.01){
      nalpha = length(avals)
      ci.alpha = array(NA,dim = c(nalpha,3,2))
      cx.alpha = round(qchisq(1-avals,1),2) ## cut-off for chi-square
      n = length(x)
      m = floor(n/2)
      ttt = cvhat4el.f(x)
      cvhat = ttt$cvhat
      y = ttt$y
      z = ttt$z
      llboot = rep(NA,B)
      for (i in 1:B){
        idx = sample(1:m,replace=TRUE)
        ystar = y[idx]
        zstar = z[idx]
        llboot[i] = el.f(tau=cvhat, y=ystar, z=zstar)$"-2LLR"
      }
      cboot.alpha = round(quantile(llboot,probs=1-avals,na.rm=TRUE),2) ## bootstrap cutoff
      for (i in 1:nalpha){
        ## slice 1: chi-square calibration
        if (el.f(tau = 1000,y = y, z = z)$"-2LLR" >= cx.alpha[i]){
          ci = findUL(step = step, fun = el.f, MLE = cvhat, y = y, z = z,
                      level = cx.alpha[i])
          ci.alpha[i,1,] = c(ci$Low,ci$Up)
        }
        ## slice 3: bootstrap calibration
        ci.alpha[i,3,] = ci.alpha[i,1,]
        if (el.f(tau = 1000,y = y, z = z)$"-2LLR" >= cboot.alpha[i]){
          ci = findUL(step = step, fun = el.f, MLE = cvhat, y = y, z = z,
                      level = cboot.alpha[i])
          ci.alpha[i,3,] = c(ci$Low,ci$Up)
        }
      }
      out = list(ci.alpha = ci.alpha,x = x,avals = avals, B = B,
                 cx.alpha = cx.alpha, cboot.alpha = cboot.alpha)
      return(out)
    }

    ######################
    ## JEL
    ######################
    cvhat4jel.f=function(x){
      ## CV estimate for JEL
      n=length(x)
      y=x^2
      tt1 = mean(y)
      tt2 = sd(x)^2
      cvhat = sqrt(tt2/(tt1-tt2))
      out=list(cvhat=cvhat,x=x)
      return(out)
    }

    U_n.f=function(x,tau){
      ## U-statistic U_n(tau) in closed form
      n=length(x)
      y=x^2
      tt1 = mean(y)
      tt2 = sd(x)^2
      tt=tt1*tau^2-(tau^2+1)*tt2
      out=list(U=tt, tt1=tt1, tt2=tt2,x=x)
      return(out)
    }

    jkkf.f=function(x,tau){
      ## jackknife pseudo-values of U_n(tau)
      n=length(x)
      Un=U_n.f(x,tau)$U
      Un1=rep(NA,n)
      for (i in 1:n){
        Un1[i]=U_n.f(x[-i],tau)$U
      }
      vjack=n*Un-(n-1)*Un1
      out=list(vjack=vjack,n=n,tau=tau,x=x,Un=Un)
      return(out)
    }

    jel.f=function(tau,x){
      ## -2 log JEL ratio for the hypothesis CV = tau
      n=length(x)
      vjack=jkkf.f(x=x,tau=tau)$vjack
      tt = el.test(x=vjack,mu=0)
      ll=tt$"-2LLR"
      if(abs(sum(tt$wts)-length(x))>1) ll=300
      out=list("-2LLR"=ll,vjack=vjack,tau=tau,n=n)
      return(out)
    }

    ci.jel.f=function(x,avals=c(0.10,0.05),B=1000,step=0.01){
      n=length(x)
      cvhat=cvhat4jel.f(x)$cvhat
      nalpha=length(avals)
      ci.alpha=array(NA,dim=c(nalpha,3,2))
      llboot=rep(NA,B)
      for (i in 1:B){
        xstar=sample(x,n,replace=TRUE)
        llboot[i]=jel.f(tau=cvhat,x=xstar)$"-2LLR"
      }
      cx.alpha=qchisq(1-avals,1) ## cut-off for chi-square
      cboot.alpha=round(quantile(llboot,probs=1-avals,na.rm=TRUE),2) ## bootstrap cutoff
      for (i in 1:nalpha){
        ci.alpha[i,1,]=rep(NA,2)
        if (jel.f(tau=1000,x=x)$"-2LLR">=cx.alpha[i]){
          ci=findUL(step=step, fun=jel.f, MLE=cvhat, x=x,level=cx.alpha[i])
          ci.alpha[i,1,]=c(ci$Low,ci$Up)
        }
        ci.alpha[i,3,] = ci.alpha[i,1,]
        if (jel.f(tau=1000,x=x)$"-2LLR">=cboot.alpha[i]){
          ci=findUL(step=step, fun=jel.f, MLE=cvhat, x=x,level=cboot.alpha[i])
          ci.alpha[i,3,]=c(ci$Low,ci$Up)
        }
      }
      out=list(ci.alpha=ci.alpha,x=x,avals=avals,B=B,cx.alpha=cx.alpha,
               cboot.alpha=cboot.alpha)
      return(out)
    }
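To make the two building blocks of the script easier to follow, here is a minimal base-R sketch that does not require `emplik`. It reproduces (1) the paired transformation behind `cvhat4el.f` and `el.f`, and (2) the jackknife pseudo-values behind `jkkf.f`. The simulated gamma sample, the seed, and the helper name `U_n` are assumptions made for illustration only; the sketch stops short of the empirical likelihood step itself.

```r
# Base-R sketch of the two building blocks used by the script above.
set.seed(2)                           # arbitrary seed (assumption)
x <- rgamma(40, shape = 4, rate = 2)  # illustrative data; true CV = 1/sqrt(4) = 0.5
n <- length(x)

## (1) EL: transform the data in pairs, as in cvhat4el.f
m  <- floor(n / 2)
xa <- x[1:m]
xb <- x[(m + 1):(2 * m)]
y  <- (xa - xb)^2 / 2                 # each y_i has mean sigma^2
z  <- (xa^2 + xb^2) / 2               # each z_i has mean E[X^2]
cv_el    <- sqrt(mean(y) / mean(z - y))     # root of mean(y - tau^2 * (z - y)) = 0
ee_at_cv <- mean(y - cv_el^2 * (z - y))     # estimating equation at the estimate

## (2) JEL: jackknife pseudo-values of U_n(tau), as in jkkf.f
U_n <- function(x, tau) tau^2 * mean(x^2) - (tau^2 + 1) * var(x)
tau    <- 0.5                         # candidate CV value (assumption)
U_full <- U_n(x, tau)
U_loo  <- vapply(seq_len(n), function(i) U_n(x[-i], tau), numeric(1))
pseudo <- n * U_full - (n - 1) * U_loo

cv_jel <- sqrt(var(x) / (mean(x^2) - var(x)))  # root of U_n(tau) = 0, as in cvhat4jel.f

print(c(cv_el = cv_el, cv_jel = cv_jel, sd_over_mean = sd(x) / mean(x)))
print(all.equal(mean(pseudo), U_full))  # pseudo-values average back to U_n
```

In the script, the values `y - tau^2 * (z - y)` and the pseudo-values are what get passed to `el.test` with `mu = 0`. Note that `z - y` is the product of the two paired observations, so the EL estimator is well defined whenever the mean of those products is positive.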

References

[1] Pearson K. Mathematical contributions to the theory of evolution. III. Regression, heredity and panmixia. Philos Trans R Soc A. 1896;187:253–318. doi:10.1098/rsta.1896.0007

[2] Reed GF, Lynn F, Meade BD. Use of coefficient of variation in assessing variability of quantitative assays. Clin Diagn Lab Immunol. 2002;9(6):1235–1239. doi:10.1128/CDLI.9.6.1235-1239.2002

[3] Chow SC, Wang H. On sample size calculation in bioequivalence trials. J Pharmacok Pharmacod. 2001;28:155–169. doi:10.1023/A:1011503032353

[4] Lehmann EL. Testing statistical hypotheses, 2nd ed. New York: Wiley, 1996.

[5] McKay AT. Distribution of the coefficient of variation and the extended t distribution. J Roy Statist Soc B. 1932;95:695–698. doi:10.2307/2342041

[6] David FN. Note on the application of Fisher’s k-statistics. Biometrika. 1949;36:383–393. doi:10.1093/biomet/36.3-4.383

[7] Reh W, Scheffler B. Significance tests and confidence intervals for coefficients of variation. Comput Stat Data Anal. 1996;22(4):449–452. doi:10.1016/0167-9473(96)83707-8

[8] Vangel MG. Confidence interval for a normal coefficient of variation. The Am Stat. 1996;50:21–26.

[9] Wong ACM, Wu J. Small sample asymptotic inference for the coefficient of variation: normal and nonnormal models. J Stat Plann Inference. 2002;104:73–82. doi:10.1016/S0378-3758(01)00241-5

[10] Verrill S, Johnson RA. Confidence bounds and hypothesis tests for normal distribution coefficients of variation. Commun Stat Theory Methods. 2007;36(12):2187–2206. doi:10.2737/FPL-RP-638

[11] Mahmoudvand R, Hassani H. Two new confidence intervals for the coefficient of variation in a normal distribution. J Appl Stat. 2009;36(4):429–442. doi:10.1080/02664760802474249

[12] Panichkitkosolkul W. Confidence intervals for the coefficient of variation in a normal distribution with a known population mean. Probab Stat J. 2013;Article ID 324940. doi:10.1155/2013/324940

[13] Sharma KK, Krishna H. Asymptotic sampling distribution of inverse coefficient of variation and its applications. IEEE Trans Reliab. 1994;43(4):630–633. doi:10.1109/24.370217

[14] Banik S, Kibria BMG. Estimating the population coefficient of variation by confidence intervals. Commun Stat Simul Comput. 2011;40:1236–1261. doi:10.1080/03610918.2011.568151

[15] Monika G, Kibria BMG, Albatineh AN, Ahmed NU. A comparison of some confidence intervals for estimating the population coefficient of variation: a simulation study. SORT. 2012;36:45–68.

[16] Albatineh AN, Boubakari I, Kibria BMG. New confidence interval estimator of the signal-to-noise ratio based on asymptotic sampling distribution. Commun Stat Theory Meth. 2015. doi:10.1080/03610926.2014.1000498

[17] Albatineh AN, Kibria BMG, Wilcox ML, Zogheib B. Confidence interval estimation for the population coefficient of variation using ranked set sampling: a simulation study. J Appl Stat. 2014;41:733–751. doi:10.1080/02664763.2013.847405

[18] Owen A. Empirical likelihood ratio confidences for single functional. Biometrika. 1988;75:237–249. doi:10.1093/biomet/75.2.237

[19] Owen AB. Empirical likelihood. Boca Raton, FL: Chapman and Hall/CRC Press, 2001.

[20] Jing B, Yuan J, Zhou W. Jackknife empirical likelihood. J Am Stat Assoc. 2009;104(487):1224–1232. doi:10.1198/jasa.2009.tm08260

[21] Peng L, Qi Y. Smoothed jackknife empirical likelihood method for tail copulas. TEST. 2010;19(3):514–536. doi:10.1007/s11749-010-0184-4

[22] Adimari G, Chiogna M. Jackknife empirical likelihood based confidence intervals for partial areas under ROC curves. Stat Sin. 2012;22:1457–1477. doi:10.5705/ss.2011.088

[23] Yang H, Zhao Y. Smoothed jackknife empirical likelihood inference for the difference of ROC curves. J Multivariate Anal. 2013;115:270–284. doi:10.1016/j.jmva.2012.10.010

[24] Yang H, Zhao Y. Smoothed jackknife empirical likelihood inference for ROC curves with missing data. J Multivariate Anal. 2015;140:123–138. doi:10.1016/j.jmva.2015.05.002

[25] Wang D, Zhao Y, Gilmore DW. Jackknife empirical likelihood confidence interval for the Gini index. Stat Probab Lett. 2016;110:289–295. doi:10.1016/j.spl.2015.09.026

[26] Wang D, Zhao Y. Jackknife empirical likelihood for comparing two Gini indices. Canadian J Stat. 2016;44(1):102–119. doi:10.1002/cjs.11275

[27] Canty A, Ripley B. Boot: Bootstrap R (S-Plus) Functions. R package version 1.3-18, 2016. doi:10.1002/9781118445112.stat06177.pub2

[28] Davison AC, Hinkley DV. Bootstrap methods and their applications. Cambridge: Cambridge University Press, 1997. doi:10.1017/CBO9780511802843

[29] Zhou M. emplik: Empirical likelihood ratio for censored/truncated data. R package version 1.0-3, 2016.

[30] Wood RJ, Durham TM. Reproducibility of serological titers. J Clin Microbiol. 1980;11:541–545. doi:10.1128/jcm.11.6.541-545.1980

[31] Proschan F. Theoretical explanation of observed decreasing failure rate. Technometrics. 1963;5:375–383. doi:10.1080/00401706.1963.10490105

[32] Gail MH, Gastwirth JL. A scale-free goodness-of-fit test for the exponential distribution based on the Gini statistic. J R Stat Soc Ser B. 1978;40:350–357. doi:10.1111/j.2517-6161.1978.tb01048.x

[33] Eisenberg DT. Telomere length measurement validity: the coefficient of variation is invalid and cannot be used to compare quantitative polymerase chain reaction and Southern blot telomere length measurement techniques. Int J Epidemiol. 2016;45:1295–1298. doi:10.1093/ije/dyw191

Received: 2017-06-05

Revised: 2018-02-15

Accepted: 2018-03-16

Published Online: 2018-04-19
