Sample selection bias in evaluation of prediction performance of causal models, Statistical Analysis and Data Mining


Statistical Analysis and Data Mining (IF 2.1), Pub Date: 2021-10-20, DOI: 10.1002/sam.11559
James P Long, Min Jin Ha


Causal models are notoriously difficult to validate because they make untestable assumptions regarding confounding. New scientific experiments offer the possibility of evaluating causal models using prediction performance. Prediction performance measures are typically robust to violations of causal assumptions. However, prediction performance does depend on the selection of training and test sets, and biased training sets can lead to optimistic assessments of model performance. In this work, we revisit the prediction performance of several recently proposed causal models tested on the Kemmeren genetic perturbation data set. We find that sample selection bias is likely a key driver of model performance. We propose a less-biased evaluation set for assessing prediction performance and compare models on this new set. In this setting, the causal models perform similarly to or worse than standard association-based estimators such as Lasso. Finally, we compare the performance of causal estimators in simulation studies that reproduce the Kemmeren structure of genetic knockout experiments but without any sample selection bias. These results provide an improved understanding of the performance of several causal models and offer guidance on how future studies should use the Kemmeren data.
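The effect of sample selection on measured performance can be illustrated with a toy simulation. The sketch below is not the authors' procedure and uses no Kemmeren data: the pair counts, effect rate, and inclusion probabilities are all hypothetical values chosen for illustration. It scores candidate (cause, target) pairs with a predictor that carries no information at all, then measures precision among the top-ranked pairs on a uniformly sampled evaluation set and on one in which true effects are more likely to be included.

```python
import random


def precision_at_k(scores, labels, indices, k):
    """Fraction of true effects among the k highest-scoring pairs in `indices`."""
    ranked = sorted(indices, key=lambda i: scores[i], reverse=True)
    top = ranked[:k]
    return sum(labels[i] for i in top) / len(top)


def simulate(n_pairs=20000, effect_rate=0.02, k=50, seed=0):
    """Compare an uninformative predictor on unbiased vs. biased evaluation sets.

    All parameter values are hypothetical and only serve the illustration.
    """
    rng = random.Random(seed)
    # Ground truth: 1 if the pair has a real effect, 0 otherwise.
    labels = [1 if rng.random() < effect_rate else 0 for _ in range(n_pairs)]
    # Uninformative predictor: scores are drawn independently of the labels.
    scores = [rng.random() for _ in range(n_pairs)]

    # Less-biased evaluation set: a uniform random sample of all pairs.
    unbiased = rng.sample(range(n_pairs), 2000)
    # Biased evaluation set: pairs with a true effect are 20 times more
    # likely to be included (0.8 vs. 0.04 inclusion probability).
    biased = [
        i for i in range(n_pairs)
        if rng.random() < (0.8 if labels[i] else 0.04)
    ]

    return {
        "unbiased": precision_at_k(scores, labels, unbiased, k),
        "biased": precision_at_k(scores, labels, biased, k),
    }


if __name__ == "__main__":
    result = simulate()
    print("precision@50, uniform evaluation set:", result["unbiased"])
    print("precision@50, selected evaluation set:", result["biased"])
```

Because the scores are independent of the labels, the precision in each case simply tracks the share of true effects in the evaluation set, so the selected set makes a useless predictor look far better than it is. A comparison against simple baselines on a uniformly sampled set is one way to separate that artifact from real predictive signal.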


Updated: 2021-10-20
