Exploring diversity through machine learning: a case for the use of decision trees in social science research
International Journal of Social Research Methodology (IF 3.0), Pub Date: 2021-06-05, DOI: 10.1080/13645579.2021.1933064
F. Jordan Srour, Silva Karkoulian

ABSTRACT

The literature provides multiple measures of diversity along a single demographic dimension, but when it comes to studying the interaction of multiple diversity types (e.g. age, gender, and race), the field of useable measures diminishes. We present the use of decision trees as a machine learning technique to automatically identify the interactions across diversity types to predict different levels of a dependent variable. In order to demonstrate the power of decision trees, we use five types of surface-level diversity (age, gender, education level, religion, and region of origin) measured via the standardized Blau index as independent variables and knowledge sharing as the dependent variable. The results of our decision tree approach relative to linear regression show that decision trees serve as a powerful tool to identify key demographic faultlines without a priori specification of a model structure.
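As a rough illustration of the independent variables described above, the following sketch computes a Blau index and a standardized variant for one team and one demographic attribute. The abstract does not spell out the standardization, so this assumes the common normalization that divides the raw index by its maximum, (k - 1) / k, for k possible categories. The function names and the example team data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def blau_index(values):
    """Raw Blau index: 1 minus the sum of squared category proportions."""
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def standardized_blau(values, n_categories):
    """Blau index divided by its maximum, (k - 1) / k, for k possible categories.

    Assumption: this normalization is one common way to standardize the
    index; the paper's exact definition may differ.
    """
    if n_categories < 2:
        raise ValueError("n_categories must be at least 2")
    return blau_index(values) / (1.0 - 1.0 / n_categories)


# Hypothetical four-person team, one attribute at a time.
team_gender = ["F", "M", "F", "M"]                    # evenly split over 2 categories
team_region = ["North", "North", "North", "South"]    # 4 possible regions assumed

print(standardized_blau(team_gender, 2))
print(standardized_blau(team_region, 4))
```

Computing one such value per team and per diversity type would give the kind of feature table that a decision tree or a linear regression could then be fitted to, with knowledge sharing as the target.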




Updated: 2021-06-05