Efficient odds ratio estimation under two-phase sampling using error-prone data from a multi-national HIV research cohort
Biometrics (IF 1.9), Pub Date: 2021-07-02, DOI: 10.1111/biom.13512
Sarah C Lotspeich, Bryan E Shepherd, Gustavo G C Amorim, Pamela A Shaw, Ran Tao

Persons living with HIV engage in routine clinical care, generating large amounts of data in observational HIV cohorts. These data are often error-prone, and directly using them in biomedical research could bias estimation and give misleading results. A cost-effective solution is the two-phase design, under which the error-prone variables are observed for all patients during Phase I, and that information is used to select patients for data auditing during Phase II. For example, the Caribbean, Central, and South America network for HIV epidemiology (CCASAnet) selected a random sample from each site for data auditing. Herein, we consider efficient odds ratio estimation with partially audited, error-prone data. We propose a semiparametric approach that uses all information from both phases and accommodates a number of error mechanisms. We allow both the outcome and covariates to be error-prone and these errors to be correlated, and selection of the Phase II sample can depend on Phase I data in an arbitrary manner. We devise a computationally efficient, numerically stable EM algorithm to obtain estimators that are consistent, asymptotically normal, and asymptotically efficient. We demonstrate the advantages of the proposed methods over existing ones through extensive simulations. Finally, we provide applications to the CCASAnet cohort.
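
The two-phase setup described above can be made concrete with a small simulation. The sketch below is not the semiparametric EM estimator proposed in the paper; it only illustrates the data structure and the problem that motivates it. Every number and name in it is an assumption chosen for illustration: a single binary covariate, a true log odds ratio of 1.0, a 20% covariate misclassification rate, a 10% outcome misclassification rate, independent errors, and a simple random Phase II audit. It compares the naive log odds ratio computed from the error-prone Phase I values with the one computed from the audited subsample alone.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds_ratio(pairs):
    """Log odds ratio from binary (x, y) pairs, with a 0.5 continuity correction."""
    counts = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}
    for x, y in pairs:
        counts[(x, y)] += 1
    return math.log(
        counts[(1, 1)] * counts[(0, 0)] / (counts[(1, 0)] * counts[(0, 1)])
    )


def simulate(n=20000, n_audit=2000, beta=1.0, seed=1):
    """Simulate a two-phase study (all settings are illustrative assumptions)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        # True (validated) covariate and outcome.
        x = int(rng.random() < 0.5)
        y = int(rng.random() < expit(-1.0 + beta * x))
        # Error-prone Phase I versions: independent misclassification.
        x_star = 1 - x if rng.random() < 0.2 else x
        y_star = 1 - y if rng.random() < 0.1 else y
        records.append((x, y, x_star, y_star))

    # Naive analysis: treat the error-prone Phase I values as if they were true.
    naive = log_odds_ratio([(xs, ys) for _, _, xs, ys in records])

    # Phase II: audit a simple random sample and use only its validated values.
    audited_rows = rng.sample(records, n_audit)
    audit_only = log_odds_ratio([(x, y) for x, y, _, _ in audited_rows])

    return {"true": beta, "naive": naive, "audit_only": audit_only}


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed settings the naive estimate is pulled toward zero, while the audit-only estimate is centered near the true value but discards the unaudited Phase I records. An estimator that combines both phases, as the abstract describes, aims to recover that lost information.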

Updated: 2021-07-02