


A Proposal for Informative Default Priors Scaled by the Standard Error of Estimates
The American Statistician (IF 1.8), Pub Date: 2021-07-06, DOI: 10.1080/00031305.2021.1938225
Erik van Zwet, Andrew Gelman

Abstract

If we have an unbiased estimate of some parameter of interest, then its absolute value is positively biased for the absolute value of the parameter. This bias is large when the signal-to-noise ratio (SNR) is small, and it becomes even larger when we condition on statistical significance; the winner’s curse. This is a frequentist motivation for regularization or “shrinkage.” To determine a suitable amount of shrinkage, we propose to estimate the distribution of the SNR from a large collection or “corpus” of similar studies and use this as a prior distribution. The wider the scope of the corpus, the less informative the prior, but a wider scope does not necessarily result in a more diffuse prior. We show that the estimation of the prior simplifies if we require that posterior inference is equivariant under linear transformations of the data. We demonstrate our approach with corpora of 86 replication studies from psychology and 178 phase 3 clinical trials. Our suggestion is not intended to be a replacement for a prior based on full information about a particular problem; rather, it represents a familywise choice that should yield better long-term properties than the current default uniform prior, which has led to systematic overestimates of effect sizes and a replication crisis when these inflated estimates have not shown up in later studies.
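The first two claims of the abstract can be checked numerically. The sketch below is not taken from the paper; it assumes a normally distributed unbiased estimate with standard error 1, so that the estimate in standard-error units is z ~ Normal(SNR, 1), and it uses the conventional 1.96 cutoff as a stand-in for "statistical significance". The function names `expected_abs` and `simulate` are hypothetical, introduced only for this illustration.

```python
import math
import random


def expected_abs(snr: float) -> float:
    """Exact mean of |z| for z ~ Normal(snr, 1) (the folded-normal mean).

    Assumes snr >= 0, so that |true effect| in standard-error units equals snr.
    """
    return (math.sqrt(2.0 / math.pi) * math.exp(-0.5 * snr * snr)
            + snr * math.erf(snr / math.sqrt(2.0)))


def simulate(snr: float, n: int = 200_000, seed: int = 1, cutoff: float = 1.96):
    """Monte Carlo estimate of E|z| and of E[|z| given |z| > cutoff].

    The normal model, the unit standard error, and the 1.96 cutoff are
    assumptions of this sketch, not details taken from the paper.
    """
    rng = random.Random(seed)
    total = 0.0
    sig_total = 0.0
    sig_count = 0
    for _ in range(n):
        a = abs(rng.gauss(snr, 1.0))
        total += a
        if a > cutoff:  # keep only the "significant" draws
            sig_total += a
            sig_count += 1
    mean_abs = total / n
    mean_abs_sig = sig_total / sig_count if sig_count else float("nan")
    return mean_abs, mean_abs_sig


if __name__ == "__main__":
    for snr in (0.5, 1.0, 2.0, 4.0):
        mean_abs, mean_abs_sig = simulate(snr)
        print(f"SNR={snr:3.1f}  exact E|z|={expected_abs(snr):.3f}  "
              f"simulated E|z|={mean_abs:.3f}  "
              f"E[|z| given significance]={mean_abs_sig:.3f}")
```

Under these assumptions, the mean of |z| exceeds the SNR at every value, the relative excess is largest when the SNR is small, and the mean among significant draws is larger still because it can never fall below the cutoff.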


Updated: 2021-07-06
