Big data, traditional data and the tradeoffs between prediction and causality in highway-safety analysis
Analytic Methods in Accident Research (IF 12.5), Pub Date: 2020-01-25, DOI: 10.1016/j.amar.2020.100113
Fred Mannering, Chandra R. Bhat, Venky Shankar, Mohamed Abdel-Aty

The analysis of highway accident data is largely dominated by traditional statistical methods (standard regression-based approaches), advanced statistical methods (such as models that account for unobserved heterogeneity), and data-driven methods (artificial intelligence, neural networks, machine learning, and so on). These methods have been applied mostly to data from observed crashes, but this can create a problem in uncovering causality, since individuals who are inherently riskier than the population as a whole may be over-represented in the data. In addition, when and where individuals choose to drive could affect analyses that use real-time data, since the population of observed drivers could change over time. This issue, the nature of the data, and the implementation target of the analysis imply that analysts must often trade off the predictive capability of the resulting analysis against its ability to uncover the underlying causal nature of crash-contributing factors. The selection of the data-analysis method is often made without full consideration of this tradeoff, even though there are potentially important implications for the development of safety countermeasures and policies. This paper discusses the issues involved in this tradeoff with regard to specific methodological alternatives and gives researchers a better understanding of the tradeoffs often inherently made in their analyses.




Updated: 2020-01-25