Variable Importance Without Impossible Data
Annual Review of Statistics and Its Application (IF 7.9) Pub Date: 2023-08-25, DOI: 10.1146/annurev-statistics-040722-045325
Masayoshi Mase 1 , Art B. Owen 2 , Benjamin B. Seiler 2

The most popular methods for measuring importance of the variables in a black-box prediction algorithm make use of synthetic inputs that combine predictor variables from multiple observations. These inputs can be unlikely, physically impossible, or even logically impossible. As a result, the predictions for such cases can be based on data very unlike any the black box was trained on. We think that users cannot trust an explanation of the decision of a prediction algorithm when the explanation uses such values. Instead, we advocate a method called cohort Shapley, which is grounded in economic game theory and uses only actually observed data to quantify variable importance. Cohort Shapley works by narrowing the cohort of observations judged to be similar to a target observation on one or more features. We illustrate it on an algorithmic fairness problem where it is essential to attribute importance to protected variables that the model was not trained on.

Expected final online publication date for the Annual Review of Statistics and Its Application, Volume 11 is March 2024. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
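
The cohort-narrowing idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "similar" means exact equality on a feature and that the value of a feature subset is the mean prediction over the cohort matching the target on that subset. The function names `cohort_mean` and `cohort_shapley` and the toy data are hypothetical.

```python
from itertools import combinations
from math import factorial


def cohort_mean(X, y, target, features):
    """Mean of y over observed rows that match row `target` on every feature in `features`.

    Assumption: similarity is exact equality; other similarity rules are possible.
    """
    cohort = [i for i, row in enumerate(X)
              if all(row[j] == X[target][j] for j in features)]
    # The target row always matches itself, so the cohort is never empty.
    return sum(y[i] for i in cohort) / len(cohort)


def cohort_shapley(X, y, target):
    """Shapley attribution, per feature, of (target's value - overall mean)."""
    d = len(X[0])
    phi = [0.0] * d
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            # Standard Shapley weight for a subset of this size.
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for u in combinations(others, size):
                # Change in cohort mean when the cohort is also narrowed on feature j.
                gain = (cohort_mean(X, y, target, u + (j,))
                        - cohort_mean(X, y, target, u))
                phi[j] += weight * gain
    return phi


# Toy data, made up for illustration: two binary features, y holds predictions.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [1.0, 2.0, 3.0, 6.0]
phi = cohort_shapley(X, y, target=3)
```

Only rows that actually occur in `X` are ever averaged, so no synthetic input is passed to the model. In this toy case the attributions sum to the target's prediction minus the overall mean, because the fully narrowed cohort contains only the target row.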

Updated: 2023-08-25