-
Covariance Matrix Estimation for High-Throughput Biomedical Data with Interconnected Communities Am. Stat. (IF 1.8) Pub Date : 2024-03-11 Yifan Yang, Chixiang Chen, Shuo Chen
Estimating a covariance matrix is central to high-dimensional data analysis. Empirical analyses of high-dimensional biomedical data, including genomics, proteomics, microbiome, and neuroimaging, am...
-
On the term “randomization test” Am. Stat. (IF 1.8) Pub Date : 2024-02-21 Jesse Hemerik
There is no consensus on the meaning of the term “randomization test”. Contradictory uses of the term are leading to confusion, misunderstandings and indeed invalid data analyses. A main source of ...
-
Fitting log-Gaussian Cox processes using generalized additive model software Am. Stat. (IF 1.8) Pub Date : 2024-02-08 Elliot Dovers, Jakub Stoklosa, David I. Warton
While log-Gaussian Cox process regression models are useful tools for modeling point patterns, they can be technically difficult to fit and require users to learn/adopt bespoke software. We show th...
-
Hidden Markov Models for Low-Frequency Earthquake Recurrence Am. Stat. (IF 1.8) Pub Date : 2024-02-05 Jessica Allen, Ting Wang
Low-frequency earthquakes (LFEs) are small magnitude earthquakes with frequencies of 1–10 Hertz which often occur in overlapping sequence forming persistent seismic tremors. They provide insights i...
-
Applied Linear Regression for Longitudinal Data: With an Emphasis on Missing Observations Am. Stat. (IF 1.8) Pub Date : 2024-02-05 Maria Francesca Marino
Published in The American Statistician (Vol. 78, No. 1, 2024)
-
Proximal MCMC for Bayesian Inference of Constrained and Regularized Estimation Am. Stat. (IF 1.8) Pub Date : 2024-01-23 Xinkai Zhou, Qiang Heng, Eric C. Chi, Hua Zhou
This paper advocates proximal Markov Chain Monte Carlo (ProxMCMC) as a flexible and general Bayesian inference framework for constrained or regularized estimation. Originally introduced in the Baye...
-
Parole Board Decision-Making using Adversarial Risk Analysis Am. Stat. (IF 1.8) Pub Date : 2024-01-22 Chaitanya Joshi, Charné Nel, Javier Cano, Devon L.L. Polaschek
Adversarial Risk Analysis (ARA) allows for much more realistic modeling of game theoretic decision problems than Bayesian game theory. While ARA solutions for various applications have been discuss...
-
Prioritizing Variables for Observational Study Design using the Joint Variable Importance Plot Am. Stat. (IF 1.8) Pub Date : 2024-01-10 Lauren D. Liao, Yeyi Zhu, Amanda L. Ngo, Rana F. Chehab, Samuel D. Pimentel
Observational studies of treatment effects require adjustment for confounding variables. However, causal inference methods typically cannot deliver perfect adjustment on all measured baseline varia...
-
Hitting a prime by rolling a die with infinitely many faces Am. Stat. (IF 1.8) Pub Date : 2023-12-01 Shane Chern
Alon and Malinovsky recently proved that it takes on average 2.42849… rolls of fair six-sided dice until the first time the total sum of all rolls arrives at a prime. Naturally, one may extend the...
-
Using Conformal Win Probability to Predict the Winners of the Canceled 2020 NCAA Basketball Tournaments Am. Stat. (IF 1.8) Pub Date : 2023-11-17 Chancellor Johnstone, Dan Nettleton
The COVID-19 pandemic was responsible for the cancellation of both the men’s and women’s 2020 National Collegiate Athletic Association (NCAA) Division I basketball tournaments. Starting from the po...
-
Understanding the implications of a complete case analysis for regression models with a right-censored covariate Am. Stat. (IF 1.8) Pub Date : 2023-11-13 Marissa C. Ashner, Tanya P. Garcia
Despite its drawbacks, the complete case analysis is commonly used in regression models with incomplete covariates. Understanding when the complete case analysis will lead to consistent parameter e...
-
Lessons from a Discussion-based Course on the History of Statistics Am. Stat. (IF 1.8) Pub Date : 2023-11-08 David B. Hitchcock
A special-topics undergraduate course about the history of statistics which was taught in Spring 2023 at the University of South Carolina is described. We review other similar courses (past and cur...
-
Bayesian Modeling and Computation in Python Am. Stat. (IF 1.8) Pub Date : 2023-10-31 P. Richard Hahn
Published in The American Statistician (Vol. 77, No. 4, 2023)
-
A First Course in Linear Model Theory, 2nd ed. Am. Stat. (IF 1.8) Pub Date : 2023-10-31 Carlos Cinelli
Published in The American Statistician (Vol. 77, No. 4, 2023)
-
ANOVA and Mixed Models: A Short Introduction Using R Am. Stat. (IF 1.8) Pub Date : 2023-10-31 Brady T. West
Published in The American Statistician (Vol. 77, No. 4, 2023)
-
Comment on “Forbidden knowledge and specialized training: A versatile solution for the two main sources of overfitting in linear regression,” by Rohlfs (2023) Am. Stat. (IF 1.8) Pub Date : 2023-10-30 Ronald Christensen
Published in The American Statistician (Just accepted, 2023)
-
Technical Validation of Plot Designs by Use of Deep Learning Am. Stat. (IF 1.8) Pub Date : 2023-10-13 Anne Helby Petersen, Claus Ekstrøm
When does inspecting a certain graphical plot allow for an investigator to reach the right statistical conclusion? Visualizations are commonly used for various tasks in statistics – including model...
-
One-step weighting to generalize and transport treatment effect estimates to a target population Am. Stat. (IF 1.8) Pub Date : 2023-10-09 Ambarish Chattopadhyay, Eric R. Cohn, José R. Zubizarreta
The problems of generalization and transportation of treatment effect estimates from a study sample to a target population are central to empirical research and statistical methodology. In both ran...
-
The Phistogram Am. Stat. (IF 1.8) Pub Date : 2023-10-09 Adriana Verónica Blanc
This article introduces a new kind of histogram-based representation for univariate random variables, named the phistogram because of its perceptual qualities. The technique relies on shifted group...
-
A Note on Monte Carlo Integration in High Dimensions Am. Stat. (IF 1.8) Pub Date : 2023-10-09 Yanbo Tang
Monte Carlo integration is a commonly used technique to compute intractable integrals and is typically thought to perform poorly for very high-dimensional integrals. To show that this is not always...
-
Causal quartets: Different ways to attain the same average treatment effect Am. Stat. (IF 1.8) Pub Date : 2023-10-05 Andrew Gelman, Jessica Hullman, Lauren Kennedy
The average causal effect can often be best understood in the context of its variation. We demonstrate with two sets of four graphs, all of which represent the same average effect but with much dif...
-
Missing data imputation with high-dimensional data Am. Stat. (IF 1.8) Pub Date : 2023-10-02 Alberto Brini, Edwin R. van den Heuvel
Imputation of missing data in high-dimensional datasets with more variables P than samples N, P≫N, is hampered by the data dimensionality. For multivariate imputation, the covariance matrix is ill...
-
The Application of the Likelihood Ratio Test and the Cochran-Mantel-Haenszel Test to Discrimination Cases Am. Stat. (IF 1.8) Pub Date : 2023-09-15 Weiwen Miao, Joseph L. Gastwirth
In practice, the ultimate outcome of many important discrimination cases, e.g. the Wal-Mart, Nike and Goldman-Sachs equal pay cases, is determined at the stage when the plaintiffs request that the case be certified as a class action. The primary statistical issue at this time is whether the employment practice in question leads to a common pattern of outcomes disadvantaging most plaintiffs...
-
Melded Confidence Intervals Do Not Provide Guaranteed Coverage Am. Stat. (IF 1.8) Pub Date : 2023-09-08 Jesse Frey, Yimin Zhang
Melded confidence intervals were proposed as a way to combine two independent one-sample confidence intervals to obtain a two-sample confidence interval for a quantity like a difference or a ratio....
-
Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology Am. Stat. (IF 1.8) Pub Date : 2023-09-08 Nicholas Larsen, Jonathan Stallrich, Srijan Sengupta, Alex Deng, Ron Kohavi, Nathaniel T. Stevens
The rise of internet-based services and products in the late 1990s brought about an unprecedented opportunity for online businesses to engage in large scale data-driven decision making. Over the past two decades, organizations such as Airbnb, Alibaba, Amazon, Baidu, Booking.com, Alphabet’s Google, LinkedIn, Lyft, Meta’s Facebook, Microsoft, Netflix, Twitter, Uber, and Yandex have invested...
-
Differentially Private Methods for Releasing Results of Stability Analyses Am. Stat. (IF 1.8) Pub Date : 2023-08-29 Chengxin Yang, Jerome P. Reiter
Data stewards and analysts can promote transparent and trustworthy science and policy-making by facilitating assessments of the sensitivity of published results to alternate analysis choices. For example, researchers may want to assess whether the results change substantially when different subsets of data points (e.g., sets formed by demographic characteristics) are used in the analysis,...
-
Multiple-model-based robust estimation of causal treatment effect on a binary outcome with integrated information from secondary outcomes Am. Stat. (IF 1.8) Pub Date : 2023-08-21 Chixiang Chen, Shuo Chen, Qi Long, Sudeshna Das, Ming Wang
An assessment of the causal treatment effect in the development and progression of certain diseases is important in clinical trials and biomedical studies. However, it is not possible to infer a causal relationship when the treatment assignment is imbalanced and confounded by other mechanisms. Specifically, when the treatment assignment is not randomized and the primary outcome is binary,...
-
Enhanced Inference for Finite Population Sampling-Based Prevalence Estimation with Misclassification Errors Am. Stat. (IF 1.8) Pub Date : 2023-08-23 Lin Ge, Yuzi Zhang, Lance A. Waller, Robert H. Lyles
Epidemiologic screening programs often make use of tests with small, but non-zero probabilities of misdiagnosis. In this article, we assume the target population is finite with a fixed number of true cases, and that we apply an imperfect test with known sensitivity and specificity to a sample of individuals from the population. In this setting, we propose an enhanced inferential approach for use in...
-
Bayesian Detection of Bias in Peremptory Challenges Using Historical Strike Data Am. Stat. (IF 1.8) Pub Date : 2023-08-21 Sachin S. Pandya, Xiaomeng Li, Eric Barón, Timothy E. Moore
United States law bars using peremptory strikes during jury selection because of prospective juror race, ethnicity, sex, or membership in certain other cognizable classes. Here, we extend a Bayesian approach for detecting such illegal strike bias by showing how to incorporate historical data on an attorney’s use of peremptory strikes in past cases. In so doing, we use the power prior to adjust...
-
Bivariate Analysis of Distribution Functions Under Biased Sampling Am. Stat. (IF 1.8) Pub Date : 2023-08-21 Hsin-wen Chang, Shu-Hsiang Wang
This paper compares distribution functions among pairs of locations in their domains, in contrast to the typical approach of univariate comparison across individual locations. This bivariate approa...
-
Counting the unseen: Estimation of susceptibility proportions in zero-inflated models using a conditional likelihood approach Am. Stat. (IF 1.8) Pub Date : 2023-08-18 Wen-Han Hwang, Lu-Fang Chen, Jakub Stoklosa
Zero-inflated count data models are widely used in various fields such as ecology, epidemiology, and transportation, where count data with a large proportion of zeros is prevalent. Despite their widespread use, their theoretical properties have not been extensively studied. This study aims to investigate the impact of ignoring heterogeneity in event count intensity and susceptibility probability...
-
Likelihood-Free Parameter Estimation with Neural Bayes Estimators Am. Stat. (IF 1.8) Pub Date : 2023-08-17 Matthew Sainsbury-Dale, Andrew Zammit-Mangion, Raphaël Huser
Neural Bayes estimators are neural networks that approximate Bayes estimators. They are fast, likelihood-free, and amenable to rapid bootstrap-based uncertainty quantification. In this article, we ...
-
First-Passage Times for Random Partial Sums: Yadrenko’s Model for e and Beyond Am. Stat. (IF 1.8) Pub Date : 2023-08-10 Joel E. Cohen
M. I. Yadrenko discovered that the expectation of the minimum number N1 of independent and identically distributed uniform random variables on (0, 1) that have to be added to exceed 1 is e. For any...
-
Event History Analysis with R, 2nd ed. Am. Stat. (IF 1.8) Pub Date : 2023-07-31 Ding-Geng Chen
Published in The American Statistician (Vol. 77, No. 3, 2023)
-
Here Comes the STRAIN: Analyzing Defensive Pass Rush in American Football with Player Tracking Data Am. Stat. (IF 1.8) Pub Date : 2023-07-28 Quang Nguyen, Ronald Yurko, Gregory J. Matthews
In American football, a pass rush is an attempt by the defensive team to disrupt the offense and prevent the quarterback (QB) from completing a pass. Existing metrics for assessing pass rush performance are either discrete-time quantities or based on subjective judgment. Using player tracking data, we propose STRAIN, a novel metric for evaluating pass rushers in the National Football League...
-
Confidence Distributions for the Autoregressive Parameter Am. Stat. (IF 1.8) Pub Date : 2023-07-12 Rolf Larsson
The notion of confidence distributions is applied to inference about the parameter in a simple autoregressive model, allowing the parameter to take the value one. This makes it possible to compare ...
-
Play Call Strategies and Modeling for Target Outcomes in Football Am. Stat. (IF 1.8) Pub Date : 2023-07-12 Preston Biro, Stephen G. Walker
This article considers one-off actions for a football coach who is asking for a specific outcome from a play. This will be in the form of a minimum gain in yards, usually in order to gain a first d...
-
Introducing Variational Inference in Statistics and Data Science Curriculum Am. Stat. (IF 1.8) Pub Date : 2023-06-30 Vojtech Kejzlar, Jingchen Hu
Probabilistic models such as logistic regression, Bayesian classification, neural networks, and models for natural language processing are increasingly present in both undergraduate and graduate statistics and data science curricula due to their wide range of applications. In this paper, we present a one-week course module for students in advanced undergraduate and applied graduate courses...
-
Out-of-Sample R2: Estimation and Inference Am. Stat. (IF 1.8) Pub Date : 2023-06-30 Stijn Hawinkel, Willem Waegeman, Steven Maere
Out-of-sample prediction is the acid test of predictive models, yet an independent test dataset is often not available for assessment of the prediction error. For this reason, out-of-sample perform...
-
Sensitivity Analyses of Clinical Trial Designs: Selecting Scenarios and Summarizing Operating Characteristics Am. Stat. (IF 1.8) Pub Date : 2023-06-26 Larry Han, Andrea Arfè, Lorenzo Trippa
The use of simulation-based sensitivity analyses is fundamental for evaluating and comparing candidate designs of future clinical trials. In this context, sensitivity analyses are especially useful...
-
Inverse Probability Weighting Estimation in Completely Randomized Experiments Am. Stat. (IF 1.8) Pub Date : 2023-06-26 Biao Zhang
In addition to treatment assignments and observed outcomes, covariate information is often available prior to randomization in completely randomized experiments that compare an active treatment ver...
-
Evidential Calibration of Confidence Intervals Am. Stat. (IF 1.8) Pub Date : 2023-06-26 Samuel Pawel, Alexander Ly, Eric-Jan Wagenmakers
We present a novel and easy-to-use method for calibrating error-rate based confidence intervals to evidence-based support intervals. Support intervals are obtained from inverting Bayes factors base...
-
Distribution-Free Location-Scale Regression Am. Stat. (IF 1.8) Pub Date : 2023-06-01 Sandra Siegfried, Lucas Kook, Torsten Hothorn
We introduce a generalized additive model for location, scale, and shape (GAMLSS) next of kin aiming at distribution-free and parsimonious regression modeling for arbitrary outcomes. We replace the...
-
Hypothesis Testing for Matched Pairs with Missing Data by Maximum Mean Discrepancy: An Application to Continuous Glucose Monitoring Am. Stat. (IF 1.8) Pub Date : 2023-05-30 Marcos Matabuena, Paulo Félix, Marc Ditzhaus, Juan Vidal, Francisco Gude
A frequent problem in statistical science is how to properly handle missing data in matched paired observations. There is a large body of literature coping with the univariate case. Yet, the ongoin...
-
Response to Comment by Schilling Am. Stat. (IF 1.8) Pub Date : 2023-05-23 Jay Bartroff, Gary Lorden, Lijia Wang
Published in The American Statistician (Vol. 77, No. 3, 2023)
-
Mapping Life Expectancy Loss in Barcelona in 2020 Am. Stat. (IF 1.8) Pub Date : 2023-05-04 Xavier Puig, Josep Ginebra
We use a Bayesian spatio-temporal model, first to smooth small-area initial life expectancy estimates in Barcelona for 2020, and second to predict what small-area life expectancy would have been in...
-
Comment on “A Case for Nonparametrics” by Bower et al. Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Kenneth Rice, Thomas Lumley
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
A Response to Rice and Lumley Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Roy Bower, William Cipolli III
We recognize the careful reading of and thought-provoking commentary on our work by Rice and Lumley. Further, we appreciate the opportunity to respond and clarify our position regarding the three presented concerns. We address these points in three sections below and conclude with final remarks in Section 4.
-
Graph Sampling Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Jae-Kwang Kim
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
Handbook of Multiple Comparisons Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Junyong Park
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
Bartroff, J., Lorden, G. and Wang, L. (2022), “Optimal and Fast Confidence Intervals for Hypergeometric Successes,” The American Statistician: Comment by Schilling Am. Stat. (IF 1.8) Pub Date : 2023-04-25 Mark F. Schilling
Published in The American Statistician (Vol. 77, No. 3, 2023)
-
Learning to Forecast: The Probabilistic Time Series Forecasting Challenge Am. Stat. (IF 1.8) Pub Date : 2023-04-24 Johannes Bracher, Nils Koster, Fabian Krüger, Sebastian Lerch
We report on a course project in which students submit weekly probabilistic forecasts of two weather variables and one financial variable. This real-time format allows students to engage in practic...
-
A Characterization of Most(More) Powerful Test Statistics with Simple Nonparametric Applications Am. Stat. (IF 1.8) Pub Date : 2023-04-18 Albert Vexler, Alan D. Hutson
Data-driven most powerful tests are statistical hypothesis decision-making tools that deliver the greatest power against a fixed null hypothesis among all corresponding data-based tests of a given ...
-
Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org Am. Stat. (IF 1.8) Pub Date : 2023-04-13 Olivier Binette, Sokhna A York, Emma Hickerson, Youngsoo Baek, Sarvo Madhavan, Christina Jones
This article introduces a novel evaluation methodology for entity resolution algorithms. It is motivated by PatentsView.org, a public-use patent data exploration platform that disambiguates patent ...
-
Hierarchical Spatio-Temporal Change-Point Detection Am. Stat. (IF 1.8) Pub Date : 2023-04-11 Mehdi Moradi, Ottmar Cronie, Unai Pérez-Goya, Jorge Mateu
Detecting change-points in multivariate settings is usually carried out by analyzing all marginals either independently, via univariate methods, or jointly, through multivariate approaches. The for...
-
Improved Approximation and Visualization of the Correlation Matrix Am. Stat. (IF 1.8) Pub Date : 2023-04-11 Jan Graffelman, Jan de Leeuw
The graphical representation of the correlation matrix by means of different multivariate statistical methods is reviewed, a comparison of the different procedures is presented with the use of an e...
-
The Wald Confidence Interval for a Binomial p as an Illuminating “Bad” Example Am. Stat. (IF 1.8) Pub Date : 2023-04-05 Per Gösta Andersson
When teaching we usually not only demonstrate/discuss how a certain method works, but, not less important, why it works. In contrast, the Wald confidence interval for a binomial p constitutes an ex...
-
Correction: Linearity of Unbiased Linear Model Estimators Am. Stat. (IF 1.8) Pub Date : 2023-03-27
Published in The American Statistician (Vol. 77, No. 2, 2023)
-
Hitting a Prime in 2.43 Dice Rolls (On Average) Am. Stat. (IF 1.8) Pub Date : 2023-03-15 Noga Alon, Yaakov Malinovsky
What is the number of rolls of fair six-sided dice until the first time the total sum of all rolls is a prime? We compute the expectation and the variance of this random variable up to an additive error of less than 10⁻⁴. This is a solution to a puzzle suggested by DasGupta in the Bulletin of the Institute of Mathematical Statistics, where the published solution is incomplete. The proof...
-
Revisiting the Name Variant of the Two-Children Problem Am. Stat. (IF 1.8) Pub Date : 2023-02-23 Davy Paindaveine, Philippe Spindel
Initially proposed by Martin Gardner in the 1950s, the famous two-children problem is often presented as a paradox in probability theory. A relatively recent variant of this paradox states that, wh...