Reconceptualizing the p-value from a likelihood ratio test: a probabilistic pairwise comparison of models based on Kullback-Leibler discrepancy measures

Journal of Applied Statistics (IF 1.2), Pub Date: 2020-04-23, DOI: 10.1080/02664763.2020.1754360
Benjamin Riedle, Andrew A. Neath, Joseph E. Cavanaugh

Discrepancy measures are often employed in problems involving the selection and assessment of statistical models. A discrepancy gauges the separation between a fitted candidate model and the underlying generating model. In this work, we consider pairwise comparisons of fitted models based on a probabilistic evaluation of the ordering of the constituent discrepancies. An estimator of the probability is derived using the bootstrap. In the framework of hypothesis testing, nested models are often compared on the basis of the p-value. Specifically, the simpler null model is favored unless the p-value is sufficiently small, in which case the null model is rejected and the more general alternative model is retained. Using suitably defined discrepancy measures, we mathematically show that, in general settings, the likelihood ratio test p-value is approximated by the bootstrapped discrepancy comparison probability (BDCP). We argue that the connection between the p-value and the BDCP leads to potentially new insights regarding the utility and limitations of the p-value. The BDCP framework also facilitates discrepancy-based inferences in settings beyond the limited confines of nested model hypothesis testing.
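
The link between the BDCP and the likelihood ratio test p-value can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the toy setting (unit-variance normal data, null model with mean fixed at 0, alternative with a free mean), the nonparametric resampling scheme, the choice to refit on each resample and evaluate -2 log-likelihood on the original data, and the function names `bdcp_normal_mean` and `lrt_p_value_normal_mean`. It should not be read as the authors' implementation.

```python
import math
import numpy as np


def bdcp_normal_mean(y, n_boot=5000, seed=0):
    """Bootstrap estimate of P*(d_null < d_alt) for H0: mu = 0 vs. a free mean,
    assuming unit-variance normal data (an illustrative setup)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    # Null model has no free parameters, so -2 log L(0 | y), up to a constant
    # shared by both models, is identical for every resample.
    d_null = np.sum(y ** 2)
    # Alternative model: refit the mean on each bootstrap resample, then
    # evaluate -2 log L(mu* | y) on the ORIGINAL data (same constant dropped).
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = y[idx].mean(axis=1)
    d_alt = np.sum((y[None, :] - boot_means[:, None]) ** 2, axis=1)
    # Fraction of resamples in which the null model has the smaller discrepancy.
    return float(np.mean(d_null < d_alt))


def lrt_p_value_normal_mean(y):
    """Likelihood ratio test p-value for H0: mu = 0 with known unit variance."""
    y = np.asarray(y, dtype=float)
    stat = y.size * y.mean() ** 2          # LRT statistic, chi-square(1) under H0
    return math.erfc(math.sqrt(stat / 2))  # P(chi-square(1) > stat)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    sample = rng.normal(loc=0.15, scale=1.0, size=200)
    print("BDCP estimate:", bdcp_normal_mean(sample))
    print("LRT p-value:  ", lrt_p_value_normal_mean(sample))
```

In this toy case the two numbers should be close for moderately large samples, because the bootstrap distribution of the refitted mean mimics the sampling distribution that underlies the chi-square reference. The agreement degrades if the sample variance is far from the assumed value of 1.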

Updated: 2020-04-23
