Evaluation of Random Forest in Crime Prediction: Comparing Three-Layered Random Forest and Logistic Regression
Deviant Behavior (IF 1.716), Pub Date: 2021-08-18, DOI: 10.1080/01639625.2021.1953360
Gyeongseok Oh, Juyoung Song, Hyoungah Park, Chongmin Na

ABSTRACT

This study evaluated random forest’s accuracy in predicting violent or criminal behavior of juveniles compared to that of conventional logistic regression using different sets of risk factors. Drawing on the National Longitudinal Study of Adolescent Health (Add Health), we predicted three outcomes – arrests, convictions, and incarcerations – using three sets of predictors, starting with sociodemographic variables only (Model 1) and incrementally adding behavioral/situational (Model 2) and emotional/environmental risk factors (Model 3). Although both prediction methods yielded similar levels of “overall” predictive accuracy (measured by the area under the receiver operating characteristic curve), our balanced random forest model, with a cost ratio of 10 (false negatives) to 1 (false positives), substantially improved prediction of who will be arrested, convicted, and incarcerated, which is of paramount importance for many researchers and practitioners. In addition to its capability to enhance sensitivity (prediction of “true positives”), random forest is more effective in forecasting juvenile criminal behavior than is conventional logistic regression in that the former is less susceptible to the influences of added predictors than is the latter.
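
The contrast the abstract draws between "overall" accuracy (AUC) and sensitivity can be illustrated with a small sketch. It is not the authors' balanced random forest procedure: as a simpler stand-in, it assumes a model that already outputs calibrated probabilities and applies the 10:1 cost ratio by shifting the decision cutoff. The toy scores and labels, and the helper names `auc`, `cost_threshold`, and `sensitivity_specificity`, are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: toy data, not Add Health; threshold shifting,
# not the balanced random forest used in the study.

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cost_threshold(cost_fn, cost_fp):
    """Probability cutoff that minimizes expected cost for calibrated scores."""
    return cost_fp / (cost_fp + cost_fn)

def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when predicting 1 for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical outcome labels (1 = arrested) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.60, 0.30, 0.12, 0.40, 0.20, 0.08, 0.05, 0.05, 0.03, 0.02]

default_cut = 0.5
costly_fn_cut = cost_threshold(cost_fn=10, cost_fp=1)  # 1/11, about 0.09

print("AUC:", round(auc(scores, labels), 3))
print("cutoff 0.5 :", sensitivity_specificity(scores, labels, default_cut))
print("cutoff 1/11:", sensitivity_specificity(scores, labels, costly_fn_cut))
```

On this toy data the AUC is the same under both cutoffs, because it depends only on how the scores rank cases. Lowering the cutoff to reflect the 10:1 cost ratio raises sensitivity and lowers specificity. This is the kind of trade-off the abstract describes, though the study obtains it through a balanced random forest rather than a shifted cutoff.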




Updated: 2021-08-18