Bias and high-dimensional adjustment in observational studies of peer effects
Journal of the American Statistical Association (IF 3.0), Pub Date: 2020-09-01, DOI: 10.1080/01621459.2020.1796393
Dean Eckles, Eytan Bakshy

Peer effects, in which the behavior of an individual is affected by the behavior of their peers, are posited by multiple theories in the social sciences. Other processes can also produce behaviors that are correlated in networks and groups, thereby generating debate about the credibility of observational (i.e. nonexperimental) studies of peer effects. Randomized field experiments that identify peer effects, however, are often expensive or infeasible. Thus, many studies of peer effects use observational data, and prior evaluations of causal inference methods for adjusting observational data to estimate peer effects have lacked an experimental "gold standard" for comparison. Here we show, in the context of information and media diffusion on Facebook, that high-dimensional adjustment of a nonexperimental control group (677 million observations) using propensity score models produces estimates of peer effects statistically indistinguishable from those from using a large randomized experiment (220 million observations). Naive observational estimators overstate peer effects by 320% and commonly used variables (e.g., demographics) offer little bias reduction, but adjusting for a measure of prior behaviors closely related to the focal behavior reduces bias by 91%. High-dimensional models adjusting for over 3,700 past behaviors provide additional bias reduction, such that the full model reduces bias by over 97%. This experimental evaluation demonstrates that detailed records of individuals' past behavior can improve studies of social influence, information diffusion, and imitation; these results are encouraging for the credibility of some studies but also cautionary for studies of rare or new behaviors. More generally, these results show how large, high-dimensional data sets and statistical learning techniques can be used to improve causal inference in the behavioral sciences.
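
The adjustment idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes a single binary covariate (a hypothetical "prior behavior" flag), estimates the propensity score as the exposed fraction within each covariate stratum, and reweights the unexposed group by the odds p/(1-p) so that it resembles the exposed group. All counts, function names, and the benchmark value are invented for illustration; in the paper the benchmark comes from a randomized experiment, and the model uses thousands of covariates.

```python
from collections import defaultdict

# Synthetic aggregated data (hypothetical, NOT from the paper).
# Each row: (prior_behavior, exposed_to_peer_share, n_users, n_users_who_shared)
ROWS = [
    (0, True, 1000, 20),
    (0, False, 9000, 90),
    (1, True, 4000, 800),
    (1, False, 1000, 100),
]


def naive_risk_ratio(rows):
    """Compare sharing rates of exposed and unexposed users with no adjustment."""
    n = {True: 0, False: 0}
    y = {True: 0, False: 0}
    for _, exposed, count, sharers in rows:
        n[exposed] += count
        y[exposed] += sharers
    return (y[True] / n[True]) / (y[False] / n[False])


def propensity_by_stratum(rows):
    """Estimate P(exposed | prior_behavior) as the exposed fraction per stratum."""
    exposed_n = defaultdict(int)
    total_n = defaultdict(int)
    for x, exposed, count, _ in rows:
        total_n[x] += count
        if exposed:
            exposed_n[x] += count
    return {x: exposed_n[x] / total_n[x] for x in total_n}


def adjusted_risk_ratio(rows):
    """Reweight unexposed users by the propensity odds p / (1 - p)."""
    p = propensity_by_stratum(rows)
    n_exposed = 0
    y_exposed = 0
    weighted_n = 0.0
    weighted_y = 0.0
    for x, exposed, count, sharers in rows:
        if exposed:
            n_exposed += count
            y_exposed += sharers
        else:
            weight = p[x] / (1.0 - p[x])
            weighted_n += weight * count
            weighted_y += weight * sharers
    return (y_exposed / n_exposed) / (weighted_y / weighted_n)


def bias_reduction(naive, adjusted, benchmark):
    """Fraction of the naive estimator's bias removed, relative to a benchmark."""
    return 1.0 - abs(adjusted - benchmark) / abs(naive - benchmark)


if __name__ == "__main__":
    # Hypothetical benchmark: the synthetic data were built so that exposure
    # doubles the sharing rate within each stratum.
    benchmark = 2.0
    naive = naive_risk_ratio(ROWS)
    adjusted = adjusted_risk_ratio(ROWS)
    print(naive, adjusted, bias_reduction(naive, adjusted, benchmark))
```

In this toy data the exposed group is dominated by users with prior behavior, so the naive comparison mixes the peer effect with the effect of prior behavior; reweighting the unexposed group removes that imbalance. With one binary covariate the propensity score carries no more information than the covariate itself, which is why the adjustment here is exact; with many covariates the score would instead come from a fitted model.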

Updated: 2020-09-01