Discussion
Journal of the American Statistical Association (IF 3.7), Pub Date: 2020-04-02, DOI: 10.1080/01621459.2020.1762616
A. C. Davison

I congratulate Professor Efron on receiving the International Prize in Statistics. Although given primarily for his invention of the bootstrap, this and his many other contributions add up to a hugely influential body of work, and the award is richly deserved. As a graduate student I had the privilege of hearing him speak in London, and as we left the seminar someone quietly said to me that this bootstrap thing could not possibly work. Many (too many!) years have passed since then, but what seemed visionary around 1980 has now come to pass: despite the initial doubts of some, resampling does indeed provide a very powerful approach to uncertainty assessment in a huge variety of settings, effectively replacing the analytical derivation of standard errors and suchlike with large-scale computations. There are caveats, of course, and during the 1980s and 1990s powerful theoreticians such as Peter Hall, sorely missed, shed light on how and when the bootstrap would work successfully and how to fix it when it failed. But the basic vision, that computer power could replace mathematical derivations, has come true. At the time some colleagues feared that computing power would replace statisticians, but this (to us) disastrous scenario has not materialized. The last time I looked, the top three “best jobs” in a standard US listing were “data scientist,” “statistician,” and “university professor” (very gratifying for those wearing all three hats), so I do not think we need fear the disappearance of “traditional” statistics any time soon. One reason for this is that despite all the hype about scraping information from the web and other sources of big data, the vast majority of datasets remain small or moderate in size. Quite apart from issues of sampling bias and representativity, there is typically a quality/quantity tradeoff in data-gathering. 
One of many possible examples arises in studying the ages of those humans who live the longest, to see if there is an upper limit to the human lifespan. Individuals who live over 100 years form a tiny proportion of the general population, and even official documentation can be lacunary, so that a considerable effort is needed to verify their identities and thus their ages. One can have lots of data, mostly reliable, about the bulk of the population, but the information about the small group of most interest must be carefully verified. Moreover one runs up against the extrapolation problem mentioned in the paper—since the aspect of most interest lies outside the data, some sort of theory seems essential, be it mathematical, biological or both. Many problems are of this general sort, involving data that are of

Updated: 2020-04-02