Priyantha Wijayatunga’s contribution to the Discussion of ‘Testing by betting: A strategy for statistical and scientific communication’ by Glenn Shafer
The Journal of the Royal Statistical Society, Series A (Statistics in Society) (IF 1.5), Pub Date: 2021-05-05, DOI: 10.1111/rssa.12670
Priyantha Wijayatunga

Here I make two points on hypothesis testing. First, the p-value is the conditional probability, given the null hypothesis $H_0$, that the test statistic takes a value the same as or more extreme than the one the available data have shown. Since the word 'conditional' is often omitted in textbooks, many practitioners believe that the p-value is just an unconditional probability and thus tend to misuse it. One may also ask whether we often obtain the correct p-value. Suppose we test whether the mean $\mu$ of a normal population is positive, using a sample of $n$ observations from the population. For unknown population variance, the test statistic $T$ calculated from the sample has a t-distribution with $n-1$ degrees of freedom under $H_0$. Therefore, the p-value is $p = P\{T \geq t_o \mid H_0 \text{ is true}\}$, where $t_o$ is the observed value of the test statistic. One can make a reasonable argument that the p-value we calculate is often smaller than its true value for the application, since, for example, we assume that our data are a random sample; our observed p-value is
$$\hat{p} = P\{T \geq t_o \text{ and data are a random sample} \mid H_0 \text{ is true}\} = P\{\text{data are a random sample}\}\, P\{T \geq t_o \mid H_0 \text{ is true}\} \leq p.$$
So, in practice, we may be rejecting the null hypothesis more often than we should. This means that we should inflate our calculated p-value to a certain degree.
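
As a rough illustration of what "inflating" could mean under the factorization above, the following sketch divides a calculated p-value by an assumed probability that the data are a random sample. The function name `inflate_p_value`, the parameter `q_random_sample`, and the numerical values are hypothetical and do not come from the discussion itself; the probability that the data are a random sample is rarely known in practice.

```python
def inflate_p_value(p_hat, q_random_sample):
    """Recover p from p_hat = q * p, capping the result at 1.

    p_hat           -- the p-value calculated from the data
    q_random_sample -- assumed P{data are a random sample} (hypothetical input)
    """
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    if not 0.0 < q_random_sample <= 1.0:
        raise ValueError("q_random_sample must lie in (0, 1]")
    # Dividing by q undoes the multiplication in p_hat = q * p.
    return min(1.0, p_hat / q_random_sample)


# Hypothetical calculated p-value, inflated under three assumed values of q.
calculated = 0.045
for q in (1.0, 0.95, 0.9):
    print(q, round(inflate_p_value(calculated, q), 4))
```

With q = 1 the p-value is unchanged; the smaller the assumed q, the larger the adjusted p-value, so a result that is significant at 0.05 before adjustment may no longer be significant afterwards.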

Second, consider a test (see Sprenger, 2013): out of 104,490,000 Bernoulli trials, 52,263,471 are successes and 52,226,529 are failures, so the observed proportion of successes is 0.5001768. Testing whether the true probability of success is 0.5, we get a p-value lower than 0.01, so the null hypothesis is rejected at the 0.01 level. The standard error of the estimate of the probability of success is 0.00004891394, which is almost equal to its value under the null hypothesis. For the purpose of deciding whether the true probability of success is 0.5, do we need a hypothesis test at all, given that the empirical estimate is almost the same as the test value and the standard error of the estimate is practically zero? What is the purpose of doing a test under these circumstances? If we take the standard error to be zero, then we should accept that the true value is the estimate, which is practically 0.5. We do not need hypothesis tests to communicate the statistical result in this case. Hypothesis tests are only mathematically objective procedures with no subjective opinions embedded in them, whereas the use of any statistical result is often subjective or contextual!
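
The figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes a two-sided test based on the normal approximation to the binomial; the text does not say which test was used, so that choice, like the function name `binomial_z_test`, is an assumption made for illustration.

```python
import math


def binomial_z_test(successes, trials, p0=0.5):
    """Two-sided z-test of H0: P(success) == p0 (normal approximation, assumed)."""
    p_hat = successes / trials
    # Standard error under the null value p0 and at the estimate p_hat.
    se_null = math.sqrt(p0 * (1.0 - p0) / trials)
    se_hat = math.sqrt(p_hat * (1.0 - p_hat) / trials)
    z = (p_hat - p0) / se_null
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p_hat, se_null, se_hat, z, p_value


p_hat, se_null, se_hat, z, p_value = binomial_z_test(52_263_471, 104_490_000)
print(f"estimate        = {p_hat:.7f}")
print(f"SE under H0     = {se_null:.11f}")
print(f"SE at estimate  = {se_hat:.11f}")
print(f"z statistic     = {z:.3f}")
print(f"p-value < 0.01? {p_value < 0.01}")
```

The two standard errors agree to many decimal places because the estimate is so close to 0.5, yet the estimate still lies several standard errors away from 0.5, which is why the test rejects despite the tiny absolute difference.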




Updated: 2021-05-05