Ridge rerandomization: An experimental design strategy in the presence of covariate collinearity
Journal of Statistical Planning and Inference (IF 0.9), Pub Date: 2021-03-01, DOI: 10.1016/j.jspi.2020.07.002
Zach Branson, Stephane Shao

Abstract: Randomization ensures that observed and unobserved covariates are balanced, on average. However, randomizing units to treatment and control often leads to covariate imbalances in realization, and such imbalances can inflate the variance of estimators of the treatment effect. One solution to this problem is rerandomization – an experimental design strategy that randomizes units until some balance criterion is fulfilled – which yields more precise estimators of the treatment effect if covariates are correlated with the outcome. Most rerandomization schemes in the literature utilize the Mahalanobis distance, which may not be preferable when covariates are high-dimensional or highly correlated with each other. As an alternative, we introduce an experimental design strategy called ridge rerandomization, which utilizes a modified Mahalanobis distance that addresses collinearities among covariates. This modified Mahalanobis distance has connections to principal components and the Euclidean distance, and – to our knowledge – has remained unexplored. We establish several theoretical properties of this modified Mahalanobis distance and our ridge rerandomization scheme. These results guarantee that ridge rerandomization is preferable over randomization and suggest when ridge rerandomization is preferable over standard rerandomization schemes. We also provide simulation evidence that suggests that ridge rerandomization is particularly preferable over typical rerandomization schemes in high-dimensional or high-collinearity settings.
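
The abstract does not state the modified distance explicitly. As a rough illustration only, the sketch below assumes the ridge form M_λ = (x̄_T − x̄_C)ᵀ (Σ + λI)⁻¹ (x̄_T − x̄_C), where Σ is the covariance of the mean difference under complete randomization and λ ≥ 0 is a ridge parameter; setting λ = 0 gives the usual Mahalanobis distance. The function names, the acceptance threshold, and the choice of λ are all hypothetical and are not taken from the paper, which may select them differently.

```python
import numpy as np


def ridge_mahalanobis(x, w, lam):
    """Assumed ridge-modified Mahalanobis distance between group covariate means.

    x: (n, k) covariate matrix; w: length-n 0/1 assignment; lam: ridge parameter >= 0.
    """
    n = x.shape[0]
    n_t = int(w.sum())
    n_c = n - n_t
    # Difference in covariate means between treated and control units.
    diff = x[w == 1].mean(axis=0) - x[w == 0].mean(axis=0)
    # Covariance of the mean difference under complete randomization.
    sigma = np.atleast_2d((n / (n_t * n_c)) * np.cov(x, rowvar=False))
    # Adding lam * I stabilizes the inverse when covariates are collinear.
    ridge = sigma + lam * np.eye(sigma.shape[0])
    return float(diff @ np.linalg.solve(ridge, diff))


def ridge_rerandomize(x, n_treated, lam, threshold, rng, max_draws=10_000):
    """Redraw assignments until the assumed ridge distance is at most `threshold`."""
    n = x.shape[0]
    for draw in range(1, max_draws + 1):
        w = np.zeros(n, dtype=int)
        w[rng.choice(n, size=n_treated, replace=False)] = 1
        dist = ridge_mahalanobis(x, w, lam)
        if dist <= threshold:
            return w, dist, draw
    raise RuntimeError("no acceptable assignment found; loosen the threshold")


# Illustrative data: five nearly collinear covariates built from one latent factor.
rng = np.random.default_rng(0)
z = rng.normal(size=(100, 1))
x = np.hstack([z + 0.01 * rng.normal(size=(100, 1)) for _ in range(5)])
w, dist, draws = ridge_rerandomize(x, n_treated=50, lam=0.01, threshold=1.0, rng=rng)
print(f"accepted after {draws} draw(s), ridge distance = {dist:.4f}")
```

In this sketch the ridge term shrinks the contribution of low-variance directions of Σ, so the criterion concentrates on the dominant directions of covariate variation; how λ and the threshold should be chosen in practice is a question for the paper itself.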

Updated: 2021-03-01