-
The Application of the Likelihood Ratio Test and the Cochran-Mantel-Haenszel Test to Discrimination Cases Am. Stat. (IF 1.8) Pub Date : 2023-09-15 Weiwen Miao, Joseph L. Gastwirth
ABSTRACT In practice, the ultimate outcome of many important discrimination cases, e.g. the Wal-Mart, Nike and Goldman-Sachs equal pay cases, is determined at the stage when the plaintiffs request that the case be certified as a class action. The primary statistical issue at this time is whether the employment practice in question leads to a common pattern of outcomes disadvantaging most plaintiffs
-
Melded Confidence Intervals Do Not Provide Guaranteed Coverage Am. Stat. (IF 1.8) Pub Date : 2023-09-08 Jesse Frey, Yimin Zhang
Melded confidence intervals were proposed as a way to combine two independent one-sample confidence intervals to obtain a two-sample confidence interval for a quantity like a difference or a ratio....
-
Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology Am. Stat. (IF 1.8) Pub Date : 2023-09-08 Nicholas Larsen, Jonathan Stallrich, Srijan Sengupta, Alex Deng, Ron Kohavi, Nathaniel T. Stevens
Abstract The rise of internet-based services and products in the late 1990’s brought about an unprecedented opportunity for online businesses to engage in large scale data-driven decision making. Over the past two decades, organizations such as Airbnb, Alibaba, Amazon, Baidu, Booking.com, Alphabet’s Google, LinkedIn, Lyft, Meta’s Facebook, Microsoft, Netflix, Twitter, Uber, and Yandex have invested
-
Differentially Private Methods for Releasing Results of Stability Analyses Am. Stat. (IF 1.8) Pub Date : 2023-08-29 Chengxin Yang, Jerome P. Reiter
Abstract Data stewards and analysts can promote transparent and trustworthy science and policy-making by facilitating assessments of the sensitivity of published results to alternate analysis choices. For example, researchers may want to assess whether the results change substantially when different subsets of data points (e.g., sets formed by demographic characteristics) are used in the analysis,
-
Multiple-model-based robust estimation of causal treatment effect on a binary outcome with integrated information from secondary outcomes Am. Stat. (IF 1.8) Pub Date : 2023-08-21 Chixiang Chen, Shuo Chen, Qi Long, Sudeshna Das, Ming Wang
Abstract An assessment of the causal treatment effect in the development and progression of certain diseases is important in clinical trials and biomedical studies. However, it is not possible to infer a causal relationship when the treatment assignment is imbalanced and confounded by other mechanisms. Specifically, when the treatment assignment is not randomized and the primary outcome is binary,
-
Enhanced Inference for Finite Population Sampling-Based Prevalence Estimation with Misclassification Errors Am. Stat. (IF 1.8) Pub Date : 2023-08-23 Lin Ge, Yuzi Zhang, Lance A. Waller, Robert H. Lyles
Epidemiologic screening programs often make use of tests with small, but non-zero probabilities of misdiagnosis. In this article, we assume the target population is finite with a fixed number of true cases, and that we apply an imperfect test with known sensitivity and specificity to a sample of individuals from the population. In this setting, we propose an enhanced inferential approach for use in
-
Bayesian Detection of Bias in Peremptory Challenges Using Historical Strike Data Am. Stat. (IF 1.8) Pub Date : 2023-08-21 Sachin S. Pandya, Xiaomeng Li, Eric Barón, Timothy E. Moore
Abstract United States law bars using peremptory strikes during jury selection because of prospective juror race, ethnicity, sex, or membership in certain other cognizable classes. Here, we extend a Bayesian approach for detecting such illegal strike bias by showing how to incorporate historical data on an attorney’s use of peremptory strikes in past cases. In so doing, we use the power prior to adjust
-
Bivariate Analysis of Distribution Functions Under Biased Sampling Am. Stat. (IF 1.8) Pub Date : 2023-08-21 Hsin-wen Chang, Shu-Hsiang Wang
This paper compares distribution functions among pairs of locations in their domains, in contrast to the typical approach of univariate comparison across individual locations. This bivariate approa...
-
Counting the unseen: Estimation of susceptibility proportions in zero-inflated models using a conditional likelihood approach Am. Stat. (IF 1.8) Pub Date : 2023-08-18 Wen-Han Hwang, Lu-Fang Chen, Jakub Stoklosa
Abstract Zero-inflated count data models are widely used in various fields such as ecology, epidemiology, and transportation, where count data with a large proportion of zeros is prevalent. Despite their widespread use, their theoretical properties have not been extensively studied. This study aims to investigate the impact of ignoring heterogeneity in event count intensity and susceptibility probability
-
Likelihood-Free Parameter Estimation with Neural Bayes Estimators Am. Stat. (IF 1.8) Pub Date : 2023-08-17 Matthew Sainsbury-Dale, Andrew Zammit-Mangion, Raphaël Huser
Abstract Neural Bayes estimators are neural networks that approximate Bayes estimators. They are fast, likelihood-free, and amenable to rapid bootstrap-based uncertainty quantification. In this paper, we aim to increase the awareness of statisticians to this relatively new inferential tool, and to facilitate its adoption by providing user-friendly open-source software. We also give attention to the
-
First-passage times for random partial sums: Yadrenko’s model for e and beyond Am. Stat. (IF 1.8) Pub Date : 2023-08-10 Joel E. Cohen
Abstract M. I. Yadrenko discovered that the expectation of the minimum number $N_1$ of independent and identically distributed uniform random variables on (0, 1) that have to be added to exceed 1 is e. For any threshold $a>0$, K. G. Russell (1983) found the distribution, mean, and variance of the minimum number $N_a$ of independent and identically distributed uniform random summands required to exceed
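The result quoted in this abstract can be checked numerically. The following is a minimal Python sketch, not taken from the paper. It relies on the standard fact that the sum of n independent Uniform(0, 1) variables is at most 1 with probability 1/n!, so the expected count equals the series 1/0! + 1/1! + 1/2! + ... = e. The function names and default arguments are hypothetical.

```python
import math
import random


def expected_count_exact(terms: int = 30) -> float:
    """Series form of the expectation: sum over n >= 0 of 1/n!, truncated."""
    # P(count > n) = P(U_1 + ... + U_n <= 1) = 1/n!, and the expectation
    # of a positive integer variable is the sum of these tail probabilities.
    return sum(1.0 / math.factorial(n) for n in range(terms))


def expected_count_simulated(trials: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate: average number of uniforms needed to exceed 1."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        running_sum, count = 0.0, 0
        # Keep adding uniforms until the running sum strictly exceeds 1.
        while running_sum <= 1.0:
            running_sum += rng.random()
            count += 1
        total += count
    return total / trials


if __name__ == "__main__":
    print("series    :", expected_count_exact())
    print("simulated :", expected_count_simulated())
    print("math.e    :", math.e)
```

The series version is exact up to floating-point error, while the simulated version only approximates e and depends on the number of trials.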
-
Event History Analysis with R, 2nd ed. Am. Stat. (IF 1.8) Pub Date : 2023-07-31 Ding-Geng Chen
Published in The American Statistician (Vol. 77, No. 3, 2023)
-
Here Comes the STRAIN: Analyzing Defensive Pass Rush in American Football with Player Tracking Data Am. Stat. (IF 1.8) Pub Date : 2023-07-28 Quang Nguyen, Ronald Yurko, Gregory J. Matthews
Abstract In American football, a pass rush is an attempt by the defensive team to disrupt the offense and prevent the quarterback (QB) from completing a pass. Existing metrics for assessing pass rush performance are either discrete-time quantities or based on subjective judgment. Using player tracking data, we propose STRAIN, a novel metric for evaluating pass rushers in the National Football League
-
Confidence Distributions for the Autoregressive Parameter Am. Stat. (IF 1.8) Pub Date : 2023-07-12 Rolf Larsson
Abstract The notion of confidence distributions is applied to inference about the parameter in a simple autoregressive model, allowing the parameter to take the value one. This makes it possible to compare to asymptotic approximations in both the stationary and the nonstationary cases at the same time. The main point, however, is to compare to a Bayesian analysis of the same problem. A noninformative
-
Introducing Variational Inference in Statistics and Data Science Curriculum Am. Stat. (IF 1.8) Pub Date : 2023-06-30 Vojtech Kejzlar, Jingchen Hu
Abstract Probabilistic models such as logistic regression, Bayesian classification, neural networks, and models for natural language processing, are increasingly more present in both undergraduate and graduate statistics and data science curricula due to their wide range of applications. In this paper, we present a one-week course module for students in advanced undergraduate and applied graduate courses
-
Out-of-Sample R2: Estimation and Inference Am. Stat. (IF 1.8) Pub Date : 2023-06-30 Stijn Hawinkel, Willem Waegeman, Steven Maere
Abstract Out-of-sample prediction is the acid test of predictive models, yet an independent test dataset is often not available for assessment of the prediction error. For this reason, out-of-sample performance is commonly estimated using data splitting algorithms such as cross-validation or the bootstrap. For quantitative outcomes, the ratio of variance explained to total variance can be summarized
-
Sensitivity Analyses of Clinical Trial Designs: Selecting Scenarios and Summarizing Operating Characteristics Am. Stat. (IF 1.8) Pub Date : 2023-06-26 Larry Han, Andrea Arfè, Lorenzo Trippa
Abstract The use of simulation-based sensitivity analyses is fundamental for evaluating and comparing candidate designs of future clinical trials. In this context, sensitivity analyses are especially useful to assess the dependence of important design operating characteristics with respect to various unknown parameters. Typical examples of operating characteristics include the likelihood of detecting
-
Inverse Probability Weighting Estimation in Completely Randomized Experiments Am. Stat. (IF 1.8) Pub Date : 2023-06-26 Biao Zhang
Abstract In addition to treatment assignments and observed outcomes, covariate information is often available prior to randomization in completely randomized experiments that compare an active treatment versus control. The analysis of covariance (ANCOVA) method is commonly applied to adjust for baseline covariates in order to improve precision. We focus on making propensity score-based adjustment to
-
Evidential Calibration of Confidence Intervals Am. Stat. (IF 1.8) Pub Date : 2023-06-26 Samuel Pawel, Alexander Ly, Eric-Jan Wagenmakers
Abstract We present a novel and easy-to-use method for calibrating error-rate based confidence intervals to evidence-based support intervals. Support intervals are obtained from inverting Bayes factors based on a parameter estimate and its standard error. A k support interval can be interpreted as “the observed data are at least k times more likely under the included parameter values than under a specified
-
Play Call Strategies and Modeling for Target Outcomes in Football Am. Stat. (IF 1.8) Pub Date : 2023-06-09 Preston Biro, Stephen G. Walker
Abstract This paper considers one-off actions for a football coach who is asking for a specific outcome from a play. This will be in the form of a minimum gain in yards, usually in order to gain a first down. Using a random utility model approach we propose the play to be called is the one which maximizes the probability of the desired outcome. We specifically focus on pass plays, which requires the
-
Response to Comment by Schilling Am. Stat. (IF 1.8) Pub Date : 2023-05-23 Jay Bartroff, Gary Lorden, Lijia Wang
Published in The American Statistician (Vol. 77, No. 3, 2023)
-
Hypothesis testing for matched pairs with missing data by maximum mean discrepancy: An application to continuous glucose monitoring Am. Stat. (IF 1.8) Pub Date : 2023-04-27 Marcos Matabuena, Paulo Félix, Marc Ditzhaus, Juan Vidal, Francisco Gude
Abstract A frequent problem in statistical science is how to properly handle missing data in matched paired observations. There is a large body of literature coping with the univariate case. Yet, the ongoing technological progress in measuring biological systems raises the need for addressing more complex data, e.g., graphs, strings, and probability distributions. To fill this gap, this paper proposes
-
Comment on “A Case for Nonparametrics” by Bower et al. Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Kenneth Rice, Thomas Lumley
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
A Response to Rice and Lumley Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Roy Bower, William Cipolli III
We recognize the careful reading of and thought-provoking commentary on our work by Rice and Lumley. Further, we appreciate the opportunity to respond and clarify our position regarding the three presented concerns. We address these points in three sections below and conclude with final remarks in Section 4.
-
Graph Sampling Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Jae-Kwang Kim
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
Handbook of Multiple Comparisons Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Junyong Park
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
Bartroff, J., Lorden, G. and Wang, L. (2022), “Optimal and Fast Confidence Intervals for Hypergeometric Successes,” The American Statistician: Comment by Schilling Am. Stat. (IF 1.8) Pub Date : 2023-04-25 Mark F. Schilling
Published in The American Statistician (Vol. 77, No. 3, 2023)
-
Distribution-Free Location-Scale Regression Am. Stat. (IF 1.8) Pub Date : 2023-04-19 Sandra Siegfried, Lucas Kook, Torsten Hothorn
Abstract We introduce a generalized additive model for location, scale, and shape (GAMLSS) next of kin aiming at distribution-free and parsimonious regression modelling for arbitrary outcomes. We replace the strict parametric distribution formulating such a model by a transformation function, which in turn is estimated from data. Doing so not only makes the model distribution-free but also allows to
-
Hierarchical Spatio-Temporal Change-Point Detection Am. Stat. (IF 1.8) Pub Date : 2023-04-11 Mehdi Moradi, Ottmar Cronie, Unai Pérez-Goya, Jorge Mateu
Abstract Detecting change-points in multivariate settings is usually carried out by analyzing all marginals either independently, via univariate methods, or jointly, through multivariate approaches. The former discards any inherent dependencies between different marginals and the latter may suffer from domination/masking among different change-points of distinct marginals. As a remedy, we propose an
-
Learning to forecast: The probabilistic time series forecasting challenge Am. Stat. (IF 1.8) Pub Date : 2023-04-05 Johannes Bracher, Nils Koster, Fabian Krüger, Sebastian Lerch
Abstract We report on a course project in which students submit weekly probabilistic forecasts of two weather variables and one financial variable. This real-time format allows students to engage in practical forecasting, which requires a diverse set of skills in data science and applied statistics. We describe the context and aims of the course, and discuss design parameters like the selection of
-
Mapping life expectancy loss in Barcelona in 2020 Am. Stat. (IF 1.8) Pub Date : 2023-04-03 Xavier Puig, Josep Ginebra
We use a Bayesian spatio-temporal model, first to smooth small-area initial life expectancy estimates in Barcelona for 2020, and second to predict what small-area life expectancy would have been in 2020 in absence of covid-19 using mortality data from 2007 to 2019. This allows us to estimate and map the small-area life expectancy loss, which can be used to assess how the impact of covid-19 varies spatially
-
Correction: Linearity of Unbiased Linear Model Estimators Am. Stat. (IF 1.8) Pub Date : 2023-03-27
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
The Wald Confidence Interval for a Binomial p as an Illuminating “Bad” Example Am. Stat. (IF 1.8) Pub Date : 2023-03-21 Per Gösta Andersson
Abstract When teaching we usually not only demonstrate/discuss how a certain method works, but, not less important, why it works. In contrast, the Wald confidence interval for a binomial p constitutes an excellent example of a case where we might be interested in why a method does not work. It has been in use for many years and, sadly enough, it is still to be found in many textbooks in mathematical
-
A Characterization of Most(More) Powerful Test Statistics with Simple Nonparametric Applications Am. Stat. (IF 1.8) Pub Date : 2023-03-21 Albert Vexler, Alan D. Hutson
Abstract Data-driven most powerful tests are statistical hypothesis decision-making tools that deliver the greatest power against a fixed null hypothesis among all corresponding data-based tests of a given size. When the underlying data distributions are known, the likelihood ratio principle can be applied to conduct most powerful tests. Reversing this notion, we consider the following questions. (a)
-
Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org Am. Stat. (IF 1.8) Pub Date : 2023-03-15 Olivier Binette, Sokhna A York, Emma Hickerson, Youngsoo Baek, Sarvo Madhavan, Christina Jones
Abstract This paper introduces a novel evaluation methodology for entity resolution algorithms. It is motivated by PatentsView.org, a public-use patent data exploration platform that disambiguates patent inventors using an entity resolution algorithm. We provide a data collection methodology and tailored performance estimators that account for sampling biases. Our approach is simple, practical and
-
Hitting a Prime in 2.43 Dice Rolls (On Average) Am. Stat. (IF 1.8) Pub Date : 2023-03-15 Noga Alon, Yaakov Malinovsky
Abstract What is the number of rolls of fair six-sided dice until the first time the total sum of all rolls is a prime? We compute the expectation and the variance of this random variable up to an additive error of less than $10^{-4}$. This is a solution to a puzzle suggested by DasGupta in the Bulletin of the Institute of Mathematical Statistics, where the published solution is incomplete. The proof
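The expectation mentioned in this abstract (about 2.43 rolls, per the title) can be approximated with a short dynamic program. The Python sketch below is an illustration only and is not the authors' proof technique. It tracks the probability of each running total that has not yet been prime and sums the survival probabilities, truncating after a fixed number of rolls. The truncation point `max_rolls` and the function names are assumptions, so the result is a numerical approximation rather than a bound with a proven error.

```python
def prime_sieve(limit: int) -> list:
    """Return a list where entry n is True exactly when n is prime (0 <= n <= limit)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime


def expected_rolls_until_prime(max_rolls: int = 300) -> float:
    """Approximate E[T], where T is the first roll at which the running total is prime.

    Uses E[T] = sum over k >= 0 of P(T > k), truncated at k < max_rolls
    (an assumed cutoff; the neglected tail is treated as negligible).
    """
    is_prime = prime_sieve(6 * max_rolls)
    # alive[s] = probability that the total is s and no prime total has occurred yet.
    alive = {0: 1.0}
    expectation = 0.0
    for _ in range(max_rolls):
        expectation += sum(alive.values())  # adds P(T > k) for the current k
        next_alive = {}
        for total, prob in alive.items():
            for face in range(1, 7):
                new_total = total + face
                if not is_prime[new_total]:
                    next_alive[new_total] = next_alive.get(new_total, 0.0) + prob / 6.0
        alive = next_alive
    return expectation


if __name__ == "__main__":
    print(expected_rolls_until_prime())
```

Working with survival probabilities avoids simulation noise; the only approximation is the cutoff on the number of rolls.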
-
Improved approximation and visualization of the correlation matrix Am. Stat. (IF 1.8) Pub Date : 2023-03-02 Jan Graffelman, Jan de Leeuw
Abstract The graphical representation of the correlation matrix by means of different multivariate statistical methods is reviewed, a comparison of the different procedures is presented with the use of an example data set, and an improved representation with better fit is proposed. Principal component analysis is widely used for making pictures of correlation structure, though as shown a weighted alternating
-
Consultancy Style Dissertations in Statistics and Data Science: Why and How Am. Stat. (IF 1.8) Pub Date : 2023-02-02 Serveh Sharifi Far, Vanda Inácio, Daniel Paulin, Miguel de Carvalho, Nicole H. Augustin, Mike Allerhand, Gail Robertson
Abstract In this article, we chronicle the development of the consultancy style dissertations of the MSc program in Statistics with Data Science at the University of Edinburgh. These dissertations are based on real-world data problems, in joint supervision with industrial and academic partners, and aim to get all students in the cohort together to develop consultancy skills and best practices, and
-
Object Oriented Data Analysis Am. Stat. (IF 1.8) Pub Date : 2023-01-30 James O. Ramsay
Published in The American Statistician (Vol. 77, No. 1, 2023)
-
Quantitative Drug Safety and Benefit-Risk Evaluation: Practical and Cross-Disciplinary Approaches Am. Stat. (IF 1.8) Pub Date : 2023-01-30 Huan Wang
Published in The American Statistician (Vol. 77, No. 1, 2023)
-
MOVER-R and Penalized MOVER-R Confidence Intervals for the Ratio of Two Quantities Am. Stat. (IF 1.8) Pub Date : 2023-01-30 Peng Wang, Yilei Ma, Siqi Xu, Yi-Xin Wang, Yu Zhang, Xiangyang Lou, Ming Li, Baolin Wu, Guimin Gao, Ping Yin, Nianjun Liu
Abstract Developing a confidence interval for the ratio of two quantities is an important task in statistics because of its omnipresence in real world applications. For such a problem, the MOVER-R (method of variance recovery for the ratio) technique, which is based on the recovery of variance estimates from confidence limits of the numerator and the denominator separately, was proposed as a useful
-
Revisiting the name variant of the two-children problem Am. Stat. (IF 1.8) Pub Date : 2023-01-25 Davy Paindaveine, Philippe Spindel
Abstract Initially proposed by Martin Gardner in the 1950s, the famous two-children problem is often presented as a paradox in probability theory. A relatively recent variant of this paradox states that, while in a two-children family for which at least one child is a girl, the probability that the other child is a boy is 2/3, this probability becomes 1/2 if the first name of the girl is disclosed
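The classical 2/3 figure in this abstract follows from a direct enumeration. The Python sketch below is a minimal illustration under the usual textbook assumptions (each child is independently a boy or a girl with probability 1/2, and the family is known only to contain at least one girl). It does not model the name variant, whose precise conditions are not stated in the text above. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_other_is_boy_given_a_girl() -> Fraction:
    """P(family has a boy | family has at least one girl) for two children."""
    # Four equally likely ordered families: (older, younger) in {B, G} x {B, G}.
    families = list(product("BG", repeat=2))
    with_girl = [family for family in families if "G" in family]
    with_girl_and_boy = [family for family in with_girl if "B" in family]
    return Fraction(len(with_girl_and_boy), len(with_girl))


if __name__ == "__main__":
    print(prob_other_is_boy_given_a_girl())
```

Exact fractions are used so the answer is not blurred by floating-point rounding.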
-
Bayesian Log-Rank Test Am. Stat. (IF 1.8) Pub Date : 2023-01-06 Jiaqi Gu, Yan Zhang, Guosheng Yin
Abstract Comparison of two survival curves is a fundamental problem in survival analysis. Although abundant frequentist methods have been developed for comparing survival functions, inference procedures from the Bayesian perspective are rather limited. In this article, we extract the quantity of interest from the classic log-rank test and propose its Bayesian counterpart. Monte Carlo methods, including
-
Selection Criterion of Working Correlation Structure for Spatially Correlated Data Am. Stat. (IF 1.8) Pub Date : 2023-01-06 Marcelo dos Santos, Fernanda De Bastiani, Miguel A. Uribe-Opazo, Manuel Galea
Abstract To obtain regression parameter estimates in generalized estimation equation modeling, whether in longitudinal or spatially correlated data, it is necessary to specify the structure of the working correlation matrix. The regression parameter estimates can be affected by the choice of this matrix. Within spatial statistics, the correlation matrix also influences how spatial variability is modeled
-
Integrating Ethics into the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Am. Stat. (IF 1.8) Pub Date : 2023-01-04 Rameela Raman, Jessica Utts, Andrew I. Cohen, Matthew J. Hayat
Abstract Statistics education at all levels includes data collected on human subjects. Thus, statistics educators have a responsibility to educate their students about the ethical aspects related to the collection of those data. The changing statistics education landscape has seen instruction moving from being formula-based to being focused on statistical reasoning. The widely implemented Guidelines
-
Semi-Structured Distributional Regression Am. Stat. (IF 1.8) Pub Date : 2023-01-03 David Rügamer, Chris Kolb, Nadja Klein
Abstract Combining additive models and neural networks allows to broaden the scope of statistical regression and extend deep learning-based approaches by interpretable structured additive predictors at the same time. Existing attempts uniting the two modeling approaches are, however, limited to very specific combinations and, more importantly, involve an identifiability issue. As a consequence, interpretability
-
Quantifying the Inspection Paradox with Random Time Am. Stat. (IF 1.8) Pub Date : 2022-12-19 Diana Rauwolf, Udo Kamps
Abstract The well-known inspection paradox of renewal theory states that, in expectation, the inspection interval is larger than a common renewal interval, in general. For a random inspection time, which includes the deterministic case, and a delayed renewal process, representations of the expected length of an inspection interval and related inequalities in terms of covariances are shown. Datasets
-
A Case for Nonparametrics Am. Stat. (IF 1.8) Pub Date : 2022-12-13 Roy Bower, Justin Hager, Chris Cherniakov, Samay Gupta, William Cipolli III
ABSTRACT We provide a case study for motivating and teaching nonparametric statistical inference alongside traditional parametric approaches. The case consists of analyses by Bracht et al. who use analysis of variance (ANOVA) to assess the applicability of the human microfibrillar-associated protein 4 (MFAP4) as a biomarker for hepatic fibrosis in hepatitis C patients. We revisit their analyses and
-
Bayes Factors and Posterior Estimation: Two Sides of the Very Same Coin Am. Stat. (IF 1.8) Pub Date : 2022-12-02 Harlan Campbell, Paul Gustafson
Abstract Recently, several researchers have claimed that conclusions obtained from a Bayes factor (or the posterior odds) may contradict those obtained from Bayesian posterior estimation. In this article, we wish to point out that no such “contradiction” exists if one is willing to consistently define one’s priors and posteriors. The key for congruence is that the (implied) prior model odds used for
-
Statistical Guidance to Authors at Top-Ranked Journals across Scientific Disciplines Am. Stat. (IF 1.8) Pub Date : 2022-12-01 Tom E. Hardwicke, Maia Salholz-Hillel, Mario Malički, Dénes Szűcs, Theiss Bendixen, John P. A. Ioannidis
Abstract Scientific journals may counter the misuse, misreporting, and misinterpretation of statistics by providing guidance to authors. We described the nature and prevalence of statistical guidance at 15 journals (top-ranked by Impact Factor) in each of 22 scientific disciplines across five high-level domains (N = 330 journals). The frequency of statistical guidance varied across domains (Health
-
A Look into the Problem of Preferential Sampling through the Lens of Survey Statistics Am. Stat. (IF 1.8) Pub Date : 2022-11-28 Daniel Vedensky, Paul A. Parker, Scott H. Holan
Abstract An evolving problem in the field of spatial and ecological statistics is that of preferential sampling, where biases may be present due to a relationship between sample data locations and a response of interest. This field of research bears a striking resemblance to the longstanding problem of informative sampling within survey methodology, although with some important distinctions. With the
-
Mixture of Networks for Clustering Categorical Data: A Penalized Composite Likelihood Approach Am. Stat. (IF 1.8) Pub Date : 2022-11-17 Jangsun Baek, Jeong-Soo Park
Abstract One of the challenges in clustering categorical data is the curse of dimensionality caused by the inherent sparsity of high-dimensional data, the records of which include a large number of attributes. The latent class model (LCM) assumes local independence between the variables in clusters, and is a parsimonious model-based clustering approach that has been used to circumvent the problem.
-
Comment on “On Optimal Correlation-Based Prediction”, By Bottai et al. (2022) Am. Stat. (IF 1.8) Pub Date : 2022-11-15 Stan Lipovetsky
Published in The American Statistician (Vol. 77, No. 1, 2023)
-
Probability, Statistics, and Data: A Fresh Approach Using R Am. Stat. (IF 1.8) Pub Date : 2022-11-07 Scott A. Roths
Published in The American Statistician (Vol. 76, No. 4, 2022)
-
Statistical Issues in Drug Development, 3rd ed. Am. Stat. (IF 1.8) Pub Date : 2022-11-07 Jie Cui, Haoda Fu
Published in The American Statistician (Vol. 76, No. 4, 2022)
-
RafterNet: Probabilistic Predictions in Multi-Response Regression Am. Stat. (IF 1.8) Pub Date : 2022-10-31 Marius Hofert, Avinash Prasad, Mu Zhu
Abstract A fully nonparametric approach for making probabilistic predictions in multi-response regression problems is introduced. Random forests are used as marginal models for each response variable and, as novel contribution of the present work, the dependence between the multiple response variables is modeled by a generative neural network. This combined modeling approach of random forests, corresponding
-
The State of Play of Reproducibility in Statistics: An Empirical Analysis Am. Stat. (IF 1.8) Pub Date : 2022-10-31 Xin Xiong, Ivor Cribben
Abstract Reproducibility, the ability to reproduce the results of published papers or studies using their computer code and data, is a cornerstone of reliable scientific methodology. Studies where results cannot be reproduced by the scientific community should be treated with caution. Over the past decade, the importance of reproducible research has been frequently stressed in a wide range of scientific
-
A Comparative Tutorial of Bayesian Sequential Design and Reinforcement Learning Am. Stat. (IF 1.8) Pub Date : 2022-10-31 Mauricio Tec, Yunshan Duan, Peter Müller
Abstract Reinforcement learning (RL) is a computational approach to reward-driven learning in sequential decision problems. It implements the discovery of optimal actions by learning from an agent interacting with an environment rather than from supervised data. We contrast and compare RL with traditional sequential design, focusing on simulation-based Bayesian sequential design (BSD). Recently, there
-
Athlete Recruitment and the Myth of the Sophomore Peak Am. Stat. (IF 1.8) Pub Date : 2022-10-28 Monnie McGee, Benjamin Williams, Jacy Sparks
Abstract Conventional wisdom dispersed by fans and coaches in the stands at almost any high school track meet suggests female athletes typically peak around 10th grade or earlier (15 years of age), particularly for distance runners, and male athletes continuously improve. Given that universities in the United States typically recruit track and field athletes from high school teams, it is important
-
Optimal and Fast Confidence Intervals for Hypergeometric Successes Am. Stat. (IF 1.8) Pub Date : 2022-10-28 Jay Bartroff, Gary Lorden, Lijia Wang
Abstract We present an efficient method of calculating exact confidence intervals for the hypergeometric parameter representing the number of “successes,” or “special items,” in the population. The method inverts minimum-width acceptance intervals after shifting them to make their endpoints nondecreasing while preserving their level. The resulting set of confidence intervals achieves minimum possible