Comparative Probability Metrics: Using Posterior Probabilities to Account for Practical Equivalence in A/B tests, The American Statistician

Comparative Probability Metrics: Using Posterior Probabilities to Account for Practical Equivalence in A/B tests
The American Statistician (IF 1.8), Pub Date: 2022-01-04, DOI: 10.1080/00031305.2021.2000495
Nathaniel T. Stevens, Luke Hagar

ABSTRACT

Online controlled experiments (i.e., A/B tests) have become an extremely valuable tool used by internet and technology companies for advertising, product development, product improvement, customer acquisition, and customer retention, to name a few purposes. The data-driven decisions that result from these experiments have traditionally been informed by null hypothesis significance tests and analyses based on p-values. Recently, however, attention has been drawn to the shortcomings of hypothesis testing, and emphasis has been placed on developing new methodologies that overcome these shortcomings. We propose the use of posterior probabilities to facilitate comparisons that account for practical equivalence and that quantify the likelihood that a result is practically meaningful, as opposed to statistically significant. We call these posterior probabilities comparative probability metrics (CPMs). This Bayesian methodology provides a flexible and intuitive means of making meaningful comparisons by directly calculating, for example, the probability that two groups are practically equivalent, or the probability that one group is practically superior to another. In this article, we describe a unified framework for constructing and estimating such probabilities, and we develop a sample size determination methodology that may be used to determine how much data are required to calculate trustworthy CPMs.
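
To make the idea of a comparative probability concrete, the following is a minimal Python sketch. It is not the authors' framework. It assumes binary (conversion) outcomes, independent Beta(1, 1) priors on each group's rate, and a hypothetical practical-equivalence margin `delta` on the difference in rates; the function name `estimate_cpms` and all numbers are illustrative. It estimates, by Monte Carlo sampling from the two posteriors, the probability that the groups are practically equivalent and the probability that group B is practically superior to group A.

```python
import random


def estimate_cpms(successes_a, trials_a, successes_b, trials_b,
                  delta=0.01, draws=20000, seed=0):
    """Monte Carlo estimate of two illustrative comparative probabilities.

    Assumptions (not taken from the paper): binary outcomes, independent
    Beta(1, 1) priors on each conversion rate, and a hypothetical
    practical-equivalence margin `delta` on the difference in rates.
    """
    rng = random.Random(seed)
    equivalent = 0
    b_superior = 0
    for _ in range(draws):
        # Beta-Binomial conjugacy: posterior is Beta(1 + successes, 1 + failures).
        theta_a = rng.betavariate(1 + successes_a, 1 + trials_a - successes_a)
        theta_b = rng.betavariate(1 + successes_b, 1 + trials_b - successes_b)
        diff = theta_b - theta_a
        if abs(diff) < delta:
            equivalent += 1      # rates differ by less than the margin
        elif diff >= delta:
            b_superior += 1      # B beats A by at least the margin
    return {
        "p_equivalent": equivalent / draws,
        "p_b_superior": b_superior / draws,
    }


if __name__ == "__main__":
    # Hypothetical data: 120/1000 conversions in A, 150/1000 in B.
    print(estimate_cpms(120, 1000, 150, 1000, delta=0.01))
```

Under these assumptions the two estimates, together with the probability that A is practically superior, sum to one, so a decision rule can be stated directly in terms of whichever probability is of interest. The choice of `delta` is a business judgment rather than a statistical one.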

Updated: 2022-01-04