-
Illustrating Randomness in Statistics Courses with Spatial Experiments Am. Stat. (IF 1.83) Pub Date : 2021-01-06 Amanda S. Hering; Luke Durell; Grant Morgan
Abstract Understanding the concept of randomness is fundamental for students in introductory statistics courses, but the notion of randomness is deceivingly complex, so it is often emphasized less than the mechanics of probability and inference. The most commonly used classroom tools to assess students’ production or perception of randomness are binary choices, such as coin tosses, and number sequences
-
The Impact of Application of the Jackknife to the Sample Median* Am. Stat. (IF 1.83) Pub Date : 2020-12-31 Jianning Yang; John E. Kolassa
Abstract The jackknife is a reliable tool for reducing the bias of a wide range of estimators. This note demonstrates that even such versatile tools have regularity conditions that can be violated even in relatively simple cases, and that caution needs to be exercised in their use. In particular, we show that the jackknife does not provide the expected reliability for bias-reduction for the sample
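For readers unfamiliar with the bias correction this abstract refers to, here is a minimal sketch of the textbook leave-one-out jackknife. It is not code from the note itself, and the names `plug_in_variance` and `jackknife_bias_corrected` are illustrative assumptions. The example uses the plug-in variance, a smooth estimator for which the correction is known to recover the unbiased sample variance; the sample median is deliberately left out.

```python
from statistics import fmean, variance


def plug_in_variance(xs):
    """Biased (divide-by-n) variance estimator."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def jackknife_bias_corrected(xs, estimator):
    """Leave-one-out jackknife: n * theta_hat - (n - 1) * mean(theta_hat_(-i))."""
    n = len(xs)
    theta_hat = estimator(xs)
    # Recompute the estimator with each observation removed in turn.
    leave_one_out = [estimator(xs[:i] + xs[i + 1:]) for i in range(n)]
    return n * theta_hat - (n - 1) * fmean(leave_one_out)


data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # made-up illustrative sample
corrected = jackknife_bias_corrected(data, plug_in_variance)
```

Here `corrected` agrees with `statistics.variance(data)` up to rounding, which is the well-behaved case that the note contrasts with the sample median.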
-
Bayesian Inference Is Unaffected by Selection: Fact or Fiction? Am. Stat. (IF 1.83) Pub Date : 2020-12-23 David A. Harville
Abstract The problem considered is that of making inferences about the value of a parameter vector θ based on the value of an observable random vector y that is subject to selection of the form y ∈ S (for a known subset S). According to conventional wisdom, a Bayesian approach (unlike a frequentist approach) requires no adjustment for selection, which is generally regarded as counterintuitive and
-
Comparing Covariate Prioritization via Matching to Machine Learning Methods for Causal Inference using Five Empirical Applications* Am. Stat. (IF 1.83) Pub Date : 2020-12-23 Luke Keele; Dylan S. Small
Abstract When investigators seek to estimate causal effects, they often assume that selection into treatment is based only on observed covariates. Under this identification strategy, analysts must adjust for observed confounders. While basic regression models have long been the dominant method of statistical adjustment, methods based on matching or weighting have become more common. Of late, methods
-
Hurdle blockmodels for sparse network modeling Am. Stat. (IF 1.83) Pub Date : 2020-12-21 Narges Motalebi; Nathaniel T. Stevens; Stefan H. Steiner
Abstract A variety of random graph models have been proposed in the literature to model the associations within an interconnected system and to realistically account for various structures and attributes of such systems. In particular, much research has been devoted to modeling the interaction of humans within social networks. However, such networks in real life tend to be extremely sparse and existing
-
Learning Hamiltonian Monte Carlo in R Am. Stat. (IF 1.83) Pub Date : 2020-12-21 Samuel Thomas; Wanzhu Tu
Abstract Hamiltonian Monte Carlo (HMC) is a powerful tool for Bayesian computation. In comparison with the traditional Metropolis-Hastings algorithm, HMC offers greater computational efficiency, especially in higher dimensional or more complex modeling situations. To most statisticians, however, the idea of HMC comes from a less familiar origin, one that is based on the theory of classical mechanics
-
Facilitating Authentic Practice for Early Undergraduate Statistics Students Am. Stat. (IF 1.83) Pub Date : 2020-10-30 Peter E. Freeman
Abstract In current curricula, authentic statistical practice generally only occurs in capstone projects undertaken by advanced undergraduate and Master’s students. We argue that deferring practice is a mistake: undergraduate students should achieve experience via repeated practice from their first years onward, to achieve heightened levels of confidence and competence prior to graduation. However
-
Mathematical and Statistical Skills in the Biopharmaceutical Industry: A Pragmatic Approach. Am. Stat. (IF 1.83) Pub Date : 2020-10-28 Wen Li; Thomas O. Jemielita
(2020). Mathematical and Statistical Skills in the Biopharmaceutical Industry: A Pragmatic Approach. The American Statistician: Vol. 74, No. 4, pp. 416-417.
-
Multiple Imputation in Practice: With Examples Using IVEware. Am. Stat. (IF 1.83) Pub Date : 2020-10-28 Qixuan Chen
(2020). Multiple Imputation in Practice: With Examples Using IVEware. The American Statistician: Vol. 74, No. 4, pp. 417-417.
-
Xinjie Hu, Aekyung Jung, and Gengsheng Qin (2020), “Interval Estimation for the Correlation Coefficient,” The American Statistician, 74:1, 29–36: Comment by Krishnamoorthy and Xia Am. Stat. (IF 1.83) Pub Date : 2020-10-28 Kalimuthu Krishnamoorthy; Yanping Xia
(2020). Xinjie Hu, Aekyung Jung, and Gengsheng Qin (2020), “Interval Estimation for the Correlation Coefficient,” The American Statistician, 74:1, 29–36: Comment by Krishnamoorthy and Xia. The American Statistician: Vol. 74, No. 4, pp. 418-418.
-
A Response to the Letter to the Editor on “Interval Estimation for the Correlation Coefficient,” The American Statistician, 74:1, 29–36: Comment by Krishnamoorthy and Xia Am. Stat. (IF 1.83) Pub Date : 2020-10-28 Xinjie Hu; Aekyung Jung; Gengsheng Qin
(2020). A Response to the Letter to the Editor on “Interval Estimation for the Correlation Coefficient,” The American Statistician, 74:1, 29–36: Comment by Krishnamoorthy and Xia. The American Statistician: Vol. 74, No. 4, pp. 419-419.
-
Editorial Collaborators Am. Stat. (IF 1.83) Pub Date : 2020-10-28
(2020). Editorial Collaborators. The American Statistician: Vol. 74, No. 4, pp. 420-421.
-
Further examples related to correlations between variables and ranks Am. Stat. (IF 1.83) Pub Date : 2020-10-06 Chunming Zhang
Abstract Rank statistics $\{R_1,\dots,R_n\}$ of actual variates $\{X_1,\dots,X_n\}$ play an important role in university undergraduate nonparametric statistics courses. This paper derives explicit expressions of the correlation coefficients between $X_i$ and $R_j$ for not only $i = j$ but also $i \ne j$, for i.i.d. continuous variables $X_1,\dots,X_n$ with a distribution function $F_X(\cdot)$ of $X$ and $n \ge 2$: (a) $\rho_{X_i,R_i} = \sqrt{\frac{n-1}{n+1}}\,\rho_{X,F_X(X)} \in \left(0, \sqrt{\frac{n-1}{n+1}}\right]$ for
-
From one environment to many: The problem of replicability of statistical inferences Am. Stat. (IF 1.83) Pub Date : 2020-09-28 James J. Higgins; Michael J. Higgins; Jinguang Lin
Abstract Among plausible causes for replicability failure, one that has not received sufficient attention is the environment in which the research is conducted. Consisting of the population, equipment, personnel, and various conditions such as location, time, and weather, the research environment can affect treatments and outcomes, and changes in the research environment that occur when an experiment
-
The Lorenz Curve in the Classroom Am. Stat. (IF 1.83) Pub Date : 2020-09-14 Roberta La Haye; Petr Zizler
The Lorenz curve and Gini index have great social relevance due to concerns regarding income inequality. However, their discussion is limited in the undergraduate statistics and mathematics curriculum. This paper outlines how to increase the educational potential of Lorenz curves as an application in both the calculus class and introductory probability classroom. We show how calculus and probability
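As an illustration of the two quantities named in this abstract, here is a minimal sketch that builds the empirical Lorenz curve from a list of incomes and computes the Gini index as one minus twice the area under that curve, using the trapezoid rule. The function names and the sample incomes are assumptions made for illustration, not material from the paper. The sketch assumes non-negative incomes with a positive total.

```python
def lorenz_points(incomes):
    """Return (population share, cumulative income share) points of the Lorenz curve."""
    xs = sorted(incomes)
    total = sum(xs)
    n = len(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points


def gini(incomes):
    """Gini index = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule."""
    pts = lorenz_points(incomes)
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1.0 - 2.0 * area


equal = gini([1, 1, 1, 1])    # everyone earns the same
unequal = gini([0, 0, 0, 1])  # one person earns everything
```

With perfectly equal incomes the Lorenz curve is the diagonal and the index is 0. With four people and one earning everything, this empirical version gives 0.75.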
-
A set of efficient methods to generate high-dimensional binary data with specified correlation structures Am. Stat. (IF 1.83) Pub Date : 2020-08-31 Wei Jiang; Shuang Song; Lin Hou; Hongyu Zhao
High dimensional correlated binary data arise in many areas, such as observed genetic variations in biomedical research. Data simulation can help researchers evaluate efficiency and explore properties of different computational and statistical methods. Also, some statistical methods, such as Monte-Carlo methods, rely on data simulation. Lunn and Davies (1998) proposed linear time complexity methods
-
Null hypothesis significance testing interpreted and calibrated by estimating probabilities of sign errors: A Bayes-frequentist continuum Am. Stat. (IF 1.83) Pub Date : 2020-08-31 David R. Bickel
Hypothesis tests are conducted not only to determine whether a null hypothesis (H0) is true but also to determine the direction or sign of an effect. A simple estimate of the posterior probability of a sign error is PSE = (1 - PH0) p/2 + PH0, depending only on a two-sided p value and PH0, an estimate of the posterior probability of H0. A convenient option for PH0 is the posterior probability derived
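The estimate quoted in this abstract, PSE = (1 - PH0) p/2 + PH0, is simple enough to evaluate directly. The sketch below is a literal transcription of that formula; the function name `sign_error_probability` and the input values are illustrative assumptions, and how PH0 should be obtained is left to the article.

```python
def sign_error_probability(p_value, p_h0):
    """PSE = (1 - PH0) * p / 2 + PH0, as stated in the abstract.

    p_value: two-sided p value; p_h0: estimated posterior probability of H0.
    """
    if not (0.0 <= p_value <= 1.0 and 0.0 <= p_h0 <= 1.0):
        raise ValueError("p_value and p_h0 must both lie in [0, 1]")
    return (1.0 - p_h0) * p_value / 2.0 + p_h0


pse_no_null_mass = sign_error_probability(0.05, 0.0)  # reduces to p / 2
pse_half_null = sign_error_probability(0.04, 0.5)
```

When PH0 is 0 the estimate reduces to half the two-sided p value, and it rises toward 1 as PH0 approaches 1.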
-
The 9 Pitfalls of Data Science Am. Stat. (IF 1.83) Pub Date : 2020-08-10 Yongdai Kim
(2020). The 9 Pitfalls of Data Science. The American Statistician: Vol. 74, No. 3, pp. 307-307.
-
Feature Engineering and Selection: A Practical Approach for Predictive Models Am. Stat. (IF 1.83) Pub Date : 2020-08-10 Brandon Butcher; Brian J. Smith
(2020). Feature Engineering and Selection: A Practical Approach for Predictive Models. The American Statistician: Vol. 74, No. 3, pp. 308-309.
-
Modern Statistics for Modern Biology Am. Stat. (IF 1.83) Pub Date : 2020-08-10 Bailey K. Fosdick; G. Brooke Anderson
(2020). Modern Statistics for Modern Biology. The American Statistician: Vol. 74, No. 3, pp. 309-311.
-
Surprises in Probability: Seventeen Short Stories Am. Stat. (IF 1.83) Pub Date : 2020-08-10 Jonathan M. Wells
(2020). Surprises in Probability: Seventeen Short Stories. The American Statistician: Vol. 74, No. 3, pp. 311-311.
-
Time Series: A Data Analysis Approach Using R Am. Stat. (IF 1.83) Pub Date : 2020-08-10 Robert B. Lund
(2020). Time Series: A Data Analysis Approach Using R. The American Statistician: Vol. 74, No. 3, pp. 312-312.
-
Comment on “Test for Trend With a Multinomial Outcome” by Szabo (2019) Am. Stat. (IF 1.83) Pub Date : 2020-06-02 Ronald Christensen
(2020). Comment on “Test for Trend With a Multinomial Outcome” by Szabo (2019). The American Statistician: Vol. 74, No. 3, pp. 313-314.
-
Micha Mandel (2020), “The Scaled Uniform Model Revisited,” The American Statistician, 74:1, 98–100: Comment Am. Stat. (IF 1.83) Pub Date : 2020-06-12 Gunnar Taraldsen
(2020). Micha Mandel (2020), “The Scaled Uniform Model Revisited,” The American Statistician, 74:1, 98–100: Comment. The American Statistician: Vol. 74, No. 3, pp. 315-315.
-
Calculating Sample Size for Follmann’s Simple Multivariate Test for One-Sided Alternatives Am. Stat. (IF 1.83) Pub Date : 2020-08-03 Matthew J. McIntosh
Follmann developed a multivariate test, when X ∼ MVN(μ, Σ), to test H0 versus H1 − H0, where H0: μ = 0 and H1: μ ≥ 0. Follmann provided strict lower bounds on the power function when an orthogonal mapping requirement was satisfied, the use of which requires knowledge about the unknown population covariance matrix. In this article, we show that the orthogonal mapping requirement for his theorem is equivalent
-
A Generalization of the Savage-Dickey Density Ratio for Testing Equality and Order Constrained Hypotheses Am. Stat. (IF 1.83) Pub Date : 2020-07-23 J. Mulder; E.-J. Wagenmakers; M. Marsman
The Savage-Dickey density ratio is a specific expression of the Bayes factor when testing a precise (equality constrained) hypothesis against an unrestricted alternative. The expression greatly simplifies the computation of the Bayes factor at the cost of assuming a specific form of the prior under the precise hypothesis as a function of the unrestricted prior. A generalization was proposed by Verdinelli
-
Going Viral, Binge-Watching, and Attention Cannibalism Am. Stat. (IF 1.83) Pub Date : 2020-07-09 Scott D. Grimshaw; Natalie J. Blades; Candace Berrett
Abstract Binge-watching behavior is modeled for a single season of an original program from a streaming service to understand and make predictions about how individuals watch newly released content. Viewers make two choices in binge watching. First, the onset when individuals begin viewing the program is modeled using a change point between epidemic viewing with a nonconstant hazard rate and endemic
-
Random number generators produce collisions: Why, how many and more Am. Stat. (IF 1.83) Pub Date : 2020-06-23 Marius Hofert
It seems surprising that when applying widely used random number generators to generate one million random numbers on modern architectures, one obtains, on average, about 116 collisions. This article explains why, how to mathematically compute such a number, why they often cannot be obtained in a straightforward way, how to numerically compute them in a robust way and, among other things, what would
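The figure of about 116 can be reproduced with a short calculation under two assumptions that the truncated abstract does not state: that each draw is one of m = 2^32 equally likely values, and that a collision is a draw that repeats an earlier value. Under those assumptions the expected number of collisions among n draws is n - m(1 - (1 - 1/m)^n). The sketch below evaluates this with `math.log1p` and `math.expm1` to avoid cancellation; the function name is hypothetical.

```python
import math


def expected_collisions(n, m):
    """Expected number of draws that repeat an earlier value,
    for n independent uniform draws from m equally likely values."""
    # Expected number of distinct values: m * (1 - (1 - 1/m)**n),
    # computed stably as -m * expm1(n * log1p(-1/m)).
    expected_distinct = -m * math.expm1(n * math.log1p(-1.0 / m))
    return n - expected_distinct


# Assumption: 32-bit resolution, i.e. 2**32 possible outputs.
value = expected_collisions(10**6, 2**32)
```

Under these assumptions `value` lies between 116 and 117, close to the leading-order approximation n^2 / (2m).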
-
Learning Temporal Structures of Random Patterns by Generating Functions Am. Stat. (IF 1.83) Pub Date : 2020-06-08 Yanlong Sun; Hongbin Wang
We present a method of generating functions to compute the distributions of the first-arrival and inter-arrival times of random patterns in independent Bernoulli trials and first-order Markov trials. We use segmentation of pattern events and diagrams of Markov chains to illustrate the recursive structures represented by generating functions. We then relate the results of pattern time to the probability
-
Adjusting Published Estimates for Exploratory Biases Using the Truncated Normal Distribution Am. Stat. (IF 1.83) Pub Date : 2020-05-29 Travis Loux; Orlando Davy
Publication bias can occur for many reasons, including the perceived need to present statistically significant results. We propose and compare methods for adjusting a single published estimate for possible publication bias using a truncated normal distribution. We attempt to estimate the mean of the underlying normal sampling distribution using only summary data readily available in most published
-
Decision-Theoretic Hypothesis Testing: A Primer With R Package OptSig Am. Stat. (IF 1.83) Pub Date : 2020-05-06 Jae H. Kim
Abstract This article is a primer for a decision-theoretic approach to hypothesis testing for students and teachers of basic statistics. Using three examples at an introductory level, this article demonstrates how decision-theoretic hypothesis testing can be taught to the students of basic statistics. It also demonstrates that students and researchers can make more sensible and unambiguous decisions
-
The exact form of the ‘Ockham factor’ in model selection Am. Stat. (IF 1.83) Pub Date : 2020-05-05 Jonathan Rougier; Carey E. Priebe
We explore the arguments for maximizing the ‘evidence’ as an algorithm for model selection. We show, using a new definition of model complexity which we term ‘flexibility’, that maximizing the evidence should appeal to both Bayesian and Frequentist statisticians. This is due to flexibility’s unique position in the exact decomposition of log-evidence into log-fit minus flexibility. In the Gaussian linear
-
Computing (Bivariate) Poisson Moments using Stein–Chen Identities Am. Stat. (IF 1.83) Pub Date : 2020-05-04 Christian H. Weiß; Boris Aleksandrov
The (bivariate) Poisson distribution is the most common distribution for (bivariate) count random variables. The univariate Poisson distribution is characterized by the famous Stein–Chen identity. We demonstrate that this identity allows one to derive even sophisticated moment expressions in such a simple way that the corresponding computations can be presented in an introductory Statistics class. Then
-
On Being an Ethical Statistical Expert in a Legal Case Am. Stat. (IF 1.83) Pub Date : 2020-05-04 William B. Fairley; William A. Huber
In the Anglo-American legal system, courts rely heavily on experts who perform an essential social function in supplying information to resolve disputes. Experts are the vehicles through which facts of any technical complexity are brought out. The adversarial nature of this legal system places expert witnesses in a quandary. Enjoined to serve the court and their profession with unbiased, independent
-
R Markdown: The Definitive Guide Am. Stat. (IF 1.83) Pub Date : 2020-04-27 Paul Johnson
(2020). R Markdown: The Definitive Guide. The American Statistician: Vol. 74, No. 2, pp. 209-210.
-
Model-Based Clustering and Classification for Data Science: With Applications in R Am. Stat. (IF 1.83) Pub Date : 2020-04-27 Seung Jun Shin
(2020). Model-Based Clustering and Classification for Data Science: With Applications in R. The American Statistician: Vol. 74, No. 2, pp. 208-209.
-
Capture-Recapture Methods for the Social and Medical Sciences Am. Stat. (IF 1.83) Pub Date : 2020-04-27 Daniel Manrique-Vallier
(2020). Capture-Recapture Methods for the Social and Medical Sciences. The American Statistician: Vol. 74, No. 2, pp. 207-208.
-
The Art of Statistics: How to Learn From Data Am. Stat. (IF 1.83) Pub Date : 2020-04-27 Jong Hee Park
(2020). The Art of Statistics: How to Learn From Data. The American Statistician: Vol. 74, No. 2, pp. 207-207.
-
Benjamin, D. J., and Berger, J. O. (2019), “Three Recommendations for Improving the Use of p-Values”, The American Statistician, 73, 186–191: Comment by Foulley Am. Stat. (IF 1.83) Pub Date : 2019-10-16 Jean-Louis Foulley
(2020). Benjamin, D. J., and Berger, J. O. (2019), “Three Recommendations for Improving the Use of p-Values”, The American Statistician, 73, 186–191: Comment by Foulley. The American Statistician: Vol. 74, No. 1, pp. 101-102.
-
The Scaled Uniform Model Revisited Am. Stat. (IF 1.83) Pub Date : 2019-05-30 Micha Mandel
Sufficiency, conditionality, and invariance are basic principles of statistical inference. Current mathematical statistics courses do not devote much teaching time to these classical principles, and even ignore the latter two, in order to teach modern methods. However, being the philosophical cornerstones of statistical inference, a minimal understanding of these principles should be part of any curriculum
-
Further Examples Related to the Identical Distribution of X/(X+Y) and Y/(X+Y) Am. Stat. (IF 1.83) Pub Date : 2019-05-20 Barry C. Arnold
The study of conditions under which a two-dimensional random variable (X, Y) will have the property that $X/(X+Y) \overset{d}{=} Y/(X+Y)$ was initiated by Bhattacharjee and Dhar. Some additional, perhaps unexpected, examples related to this phenomenon are provided. Discrete and absolutely continuous cases are discussed in detail. Singular continuous cases are briefly mentioned.
-
A Shiny Update to an Old Experiment Game Am. Stat. (IF 1.83) Pub Date : 2018-11-13 Robert B. Gramacy
Games can be a powerful tool for learning about statistical methodology. Effective game design involves a fine balance between caricature and realism, to simultaneously illustrate salient concepts in a controlled setting and serve as a testament to real-world applicability. Striking that balance is particularly challenging in response surface and design domains, where real-world scenarios often play
-
Two-Tailed p-Values and Coherent Measures of Evidence Am. Stat. (IF 1.83) Pub Date : 2018-08-06 Peter H. Peskun
In a test of significance, it is common practice to report the p-value as one way of summarizing the incompatibility between a set of data and a proposed model for the data constructed under a set of assumptions together with a null hypothesis. However, the p-value does have some flaws, one being in general its definition for two-sided tests and a related serious logical one of incoherence, in its
-
Models for Geostatistical Binary Data: Properties and Connections Am. Stat. (IF 1.83) Pub Date : 2018-07-09 Victor De Oliveira
This article explores models for geostatistical data for situations in which the region where the phenomenon of interest varies is partitioned into two disjoint subregions. This is called a binary map. The goals of the article are 3-fold. First, a review is provided of the classes of models that have been proposed so far in the literature for geostatistical binary data as well as a description of their
-
Comment on “A Note on Collinearity Diagnostics and Centering” by Velilla (2018) Am. Stat. (IF 1.83) Pub Date : 2019-07-22 Román Salmerón Gómez; Catalina García García; Jose García Pérez
(2020). Comment on “A Note on Collinearity Diagnostics and Centering” by Velilla (2018). The American Statistician: Vol. 74, No. 1, pp. 68-71.
-
On the Loss Robustness of Least-Square Estimators Am. Stat. (IF 1.83) Pub Date : 2019-05-09 Tamal Ghosh; Malay Ghosh; Tatsuya Kubokawa
The article revisits univariate and multivariate linear regression models. It is shown that least-square estimators (LSEs) are minimum risk estimators in a general class of linear unbiased estimators under some general divergence loss. This amounts to the loss robustness of LSEs.
-
A Note on Item Response Theory Modeling for Online Customer Ratings Am. Stat. (IF 1.83) Pub Date : 2018-06-04 Chien-Lang Su; Sun-Hao Chang; Ruby Chiu-Hsing Weng
Online consumer product ratings data are increasing rapidly. While most of the current graphical displays mainly represent the average ratings, Ho and Quinn proposed an easily interpretable graphical display based on an ordinal item response theory (IRT) model, which successfully accounts for systematic interrater differences. Conventionally, the discrimination parameters in IRT models are constrained
-
The Johnson System of Frequency Curves—Historical, Graphical, and Limiting Perspectives Am. Stat. (IF 1.83) Pub Date : 2019-08-19 Johan René van Dorp; M. C. Jones
The idea of transforming one random variate to another with a more convenient density was developed in the first half of the 20th century. In his thesis, Norman L. Johnson (1917–2004) developed a pioneering system of transformations of the standard normal distribution which gained substantial popularity in the second half of the 20th century and beyond. In Johnson’s 1949
-
Interval Estimation for the Correlation Coefficient Am. Stat. (IF 1.83) Pub Date : 2018-07-09 Xinjie Hu; Aekyung Jung; Gengsheng Qin
The correlation coefficient (CC) is a standard measure of a possible linear association between two continuous random variables. The CC plays a significant role in many scientific disciplines. For a bivariate normal distribution, there are many types of confidence intervals for the CC, such as z-transformation and maximum likelihood-based intervals. However, when the underlying bivariate distribution
-
Generating Correlation Matrices With Specified Eigenvalues Using the Method of Alternating Projections Am. Stat. (IF 1.83) Pub Date : 2018-07-17 Niels G. Waller
This article describes a new algorithm for generating correlation matrices with specified eigenvalues. The algorithm uses the method of alternating projections (MAP) that was first described by Neumann. The MAP algorithm for generating correlation matrices is both easy to understand and to program in higher-level computer languages, making this method accessible to applied researchers with no formal
-
A Short Note on Almost Sure Convergence of Bayes Factors in the General Set-Up Am. Stat. (IF 1.83) Pub Date : 2018-06-11 Debashis Chatterjee; Trisha Maitra; Sourabh Bhattacharya
Although there is a significant literature on the asymptotic theory of the Bayes factor, the set-ups considered are usually specialized and often involve independent and identically distributed data. Even in such specialized cases, mostly weak consistency results are available. In this article, for the first time ever, we derive the almost sure convergence theory of the Bayes factor in the general set-up
-
Fostering Undergraduate Data Science Am. Stat. (IF 1.83) Pub Date : 2018-06-05 Fulya Gokalp Yavuz; Mark Daniel Ward
Abstract Data Science is one of the newest interdisciplinary areas. It is transforming our lives unexpectedly fast. This transformation is also happening in our learning styles and practicing habits. We advocate an approach to data science training that uses several types of computational tools, including R, bash, awk, regular expressions, SQL, and XPath, often used in tandem. We discuss ways for undergraduate
-
The Democratization of Data Science Education Am. Stat. (IF 1.83) Pub Date : 2019-10-28 Sean Kross; Roger D. Peng; Brian S. Caffo; Ira Gooding; Jeffrey T. Leek
Abstract Over the last three decades, data have become ubiquitous and cheap. This transition has accelerated over the last five years and training in statistics, machine learning, and data analysis has struggled to keep up. In April 2014, we launched a program of nine courses, the Johns Hopkins Data Science Specialization, which has now had more than 4 million enrollments over the past five years.
-
Applications of the Fractional-Random-Weight Bootstrap Am. Stat. (IF 1.83) Pub Date : 2020-04-17 Li Xu; Chris Gotwalt; Yili Hong; Caleb B. King; William Q. Meeker
Abstract For several decades, the resampling based bootstrap has been widely used for computing confidence intervals (CIs) for applications where no exact method is available. However, there are many applications where the resampling bootstrap method cannot be used. These include situations where the data are heavily censored due to the success response being a rare event, situations where there is
-
Improving Effect Estimates by Limiting the Variability in Inverse Propensity Score Weights Am. Stat. (IF 1.83) Pub Date : 2020-04-14 Keith Kranker; Laura Blue; Lauren Vollmer Forrow
Abstract This study describes a novel method to reweight a comparison group used for causal inference, so the group is similar to a treatment group on observable characteristics yet avoids highly variable weights that would limit statistical power. The proposed method generalizes the covariate-balancing propensity score (CBPS) methodology developed by Imai and Ratkovic (2014) to enable researchers
-
The Short-Term and Long-Term Hazard Ratio Model: Parameterization Inconsistency Am. Stat. (IF 1.83) Pub Date : 2020-04-08 Philippe Flandre; John O’Quigley
The test of Yang and Prentice, based on the short-term and long-term hazard ratio model for the presence of a regression effect, appears to be an attractive one, being able to detect departures from a null hypothesis of no effect against quite broad alternatives. We recall the model on which this test is based and the test itself. In simulations, the test has shown good performance and is judged to
-
Reconnecting p-Value and Posterior Probability Under One- and Two-Sided Tests Am. Stat. (IF 1.83) Pub Date : 2020-02-25 Haolun Shi; Guosheng Yin
As a convention, the p-value is often computed in frequentist hypothesis testing and compared with the nominal significance level of 0.05 to determine whether or not to reject the null hypothesis. The smaller the p-value, the more significant the statistical test. Under noninformative prior distributions, we establish the equivalence relationship between the p-value and Bayesian posterior probability of
-
Visually Communicating and Teaching Intuition for Influence Functions Am. Stat. (IF 1.83) Pub Date : 2020-02-25 Aaron Fisher; Edward H. Kennedy
Abstract Estimators based on influence functions (IFs) have been shown to be effective in many settings, especially when combined with machine learning techniques. By focusing on estimating a specific target of interest (e.g., the average effect of a treatment), rather than on estimating the full underlying data generating distribution, IF-based estimators are often able to achieve asymptotically optimal
-
March Madness “Anomalies”: Are They Real, and If So, Can They Be Explained? Am. Stat. (IF 1.83) Pub Date : 2020-02-21 Dale L. Zimmerman; Nathan D. Zimmerman; Joshua T. Zimmerman
Previously published statistical analyses of NCAA Division I Men’s Basketball Tournament (“March Madness”) game outcomes since the 64-team format for its main draw began in 1985 have uncovered some apparent anomalies, such as 12-seeds upsetting 5-seeds more often than might be expected, and seeds 10 through 12 advancing to the Sweet Sixteen much more often than 8-seeds and 9-seeds—the so-called middle-seed
-
Harmonizing Optimized Designs With Classic Randomization in Experiments Am. Stat. (IF 1.83) Pub Date : 2020-02-21 Adam Kapelner; Abba M. Krieger; Michael Sklar; Uri Shalit; David Azriel
There is a long debate in experimental design between the classic randomization design of Fisher, Yates, Kempthorne, Cochran, and those who advocate deterministic assignments based on notions of optimality. In nonsequential trials comparing treatment and control, covariate measurements for each subject are known in advance, and subjects can be divided into two groups based on a criterion of imbalance