Anchor regression: Heterogeneous data meet causality
The Journal of the Royal Statistical Society, Series B (Statistical Methodology) (IF 3.1), Pub Date: 2021-01-25, DOI: 10.1111/rssb.12398
Dominik Rothenhäusler 1 , Nicolai Meinshausen 2 , Peter Bühlmann 2 , Jonas Peters 3

We consider the problem of predicting a response variable from a set of covariates on a data set that differs in distribution from the training data. Causal parameters are optimal in terms of predictive accuracy if in the new distribution either many variables are affected by interventions or only some variables are affected, but the perturbations are strong. If the training and test distributions differ by a shift, causal parameters might be too conservative to perform well on the above task. This motivates anchor regression, a method that makes use of exogenous variables to solve a relaxation of the ‘causal’ minimax problem by considering a modification of the least‐squares loss. The procedure naturally provides an interpolation between the solutions of ordinary least squares (OLS) and two‐stage least squares. We prove that the estimator satisfies predictive guarantees in terms of distributional robustness against shifts in a linear class; these guarantees are valid even if the instrumental variable assumptions are violated. If anchor regression and least squares provide the same answer (‘anchor stability’), we establish that OLS parameters are invariant under certain distributional changes. Anchor regression is shown empirically to improve replicability and protect against distributional shifts.
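
The "modification of the least‐squares loss" mentioned above can be illustrated with a minimal sketch. The abstract does not state the objective, so the form used here is an assumption taken from the usual presentation of anchor regression: minimise ||(I − P_A)(y − Xb)||² + γ·||P_A(y − Xb)||², where P_A projects onto the column space of the anchor (exogenous) variables A. The function name `anchor_regression`, the argument names, and the choice to omit an intercept (data are assumed centred) are illustrative, not the authors' code.

```python
import numpy as np


def anchor_regression(X, y, A, gamma):
    """Sketch of an anchor-regression fit (assumed objective, no intercept).

    Minimises ||(I - P_A)(y - X b)||^2 + gamma * ||P_A (y - X b)||^2 over b,
    where P_A is the orthogonal projection onto the columns of A.
    """
    def project(M):
        # Project M onto the column space of A via a least-squares fit.
        coef, *_ = np.linalg.lstsq(A, M, rcond=None)
        return A @ coef

    # Applying W = I - (1 - sqrt(gamma)) * P_A to X and y turns the
    # penalised objective into an ordinary least-squares problem.
    scale = 1.0 - np.sqrt(gamma)
    X_t = X - scale * project(X)
    y_t = y - scale * project(y)
    b, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
    return b
```

Under this assumed objective, `gamma = 1` leaves the data untransformed and reproduces OLS, while letting `gamma` grow puts all weight on the projected residual and approaches the two‐stage least squares solution, which is the interpolation the abstract describes.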

Updated: 2021-01-25