Uncovering Sociological Effect Heterogeneity Using Tree-Based Machine Learning
Sociological Methodology (IF 2.4), Pub Date: 2021-03-04, DOI: 10.1177/0081175021993503
Jennie E. Brand, Jiahui Xu, Bernard Koch, Pablo Geraldo

Individuals do not respond uniformly to treatments, such as events or interventions. Sociologists routinely partition samples into subgroups to explore how the effects of treatments vary by selected covariates, such as race and gender, on the basis of theoretical priors. Data-driven discoveries are also routine, yet the analyses by which sociologists typically go about them are often problematic and seldom move us beyond our biases to explore new meaningful subgroups. Emerging machine learning methods based on decision trees allow researchers to explore sources of variation that they may not have previously considered or envisaged. In this article, the authors use tree-based machine learning, that is, causal trees, to recursively partition the sample to uncover sources of effect heterogeneity. Assessing a central topic in social inequality, college effects on wages, the authors compare what is learned from covariate and propensity score–based partitioning approaches with recursive partitioning based on causal trees. Decision trees, although superseded by forests for estimation, can be used to uncover subpopulations responsive to treatments. Using observational data, the authors expand on the existing causal tree literature by applying leaf-specific effect estimation strategies to adjust for observed confounding, including inverse propensity weighting, nearest neighbor matching, and doubly robust causal forests. The authors also assess localized balance metrics and conduct sensitivity analyses to address the possibility of differential imbalance and unobserved confounding. The authors encourage researchers to follow similar data exploration practices in their work on variation in sociological effects and offer a straightforward framework by which to do so.
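
As a rough illustration of what leaf-specific inverse propensity weighting can look like, the sketch below computes a normalized (Hajek-style) weighted difference in mean outcomes within each leaf. It is not the authors' implementation: the function name `leaf_ipw_effects`, its inputs, and the toy data are all hypothetical, and it assumes that leaf assignments (for example, from a fitted causal tree) and propensity scores have already been estimated elsewhere.

```python
from collections import defaultdict


def leaf_ipw_effects(leaves, treated, outcomes, propensities):
    """Normalized IPW estimate of the treatment effect within each leaf.

    Hypothetical helper for illustration only. Each unit contributes
    weight 1/e if treated and 1/(1 - e) if untreated, where e is its
    estimated propensity score.
    """
    # Per leaf: [sum w*y (treated), sum w (treated),
    #            sum w*y (control), sum w (control)]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for leaf, d, y, e in zip(leaves, treated, outcomes, propensities):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        acc = sums[leaf]
        if d:
            w = 1.0 / e
            acc[0] += w * y
            acc[1] += w
        else:
            w = 1.0 / (1.0 - e)
            acc[2] += w * y
            acc[3] += w

    effects = {}
    for leaf, (ty, tw, cy, cw) in sums.items():
        if tw == 0.0 or cw == 0.0:
            # No treated or no control units: the effect is not estimable here.
            effects[leaf] = None
        else:
            effects[leaf] = ty / tw - cy / cw
    return effects


# Made-up toy data: two leaves, constant propensity of 0.5.
toy_effects = leaf_ipw_effects(
    leaves=["a", "a", "a", "a", "b", "b"],
    treated=[1, 1, 0, 0, 1, 0],
    outcomes=[3.0, 5.0, 1.0, 2.0, 10.0, 4.0],
    propensities=[0.5] * 6,
)
print(toy_effects)  # {'a': 2.5, 'b': 6.0}
```

With a constant propensity score the weights cancel, so each leaf's estimate reduces to the simple difference in mean outcomes between treated and untreated units; with varying scores, units that are unlikely to be in their observed treatment group count for more.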




Updated: 2021-03-04