-
Measuring the Functioning Human Brain Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2024-09-11 Martin A. Lindquist, Bonnie B. Smith, Arunkumar Kannan, Angela Zhao, Brian Caffo
The emergence of functional magnetic resonance imaging (fMRI) marked a significant technological breakthrough in the real-time measurement of the functioning human brain in vivo. In part because of their 4D nature (three spatial dimensions and time), fMRI data have inspired a great deal of statistical development in the past couple of decades to address their unique spatiotemporal properties. This
-
High-Dimensional Gene–Environment Interaction Analysis Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2024-09-11 Mengyun Wu, Yingmeng Li, Shuangge Ma
Beyond the main genetic and environmental effects, gene–environment (G–E) interactions have been demonstrated to significantly contribute to the development and progression of complex diseases. Published analyses of G–E interactions have primarily used a supervised framework to model both low-dimensional environmental factors and high-dimensional genetic factors in relation to disease outcomes. In
-
A Theoretical Review of Modern Robust Statistics Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2024-08-21 Po-Ling Loh
Robust statistics is a fairly mature field that dates back to the early 1960s, with many foundational concepts having been developed in the ensuing decades. However, the field has drawn a new surge of attention in the past decade, largely due to a desire to recast robust statistical principles in the context of high-dimensional statistics. In this article, we begin by reviewing some of the central
-
Crafting 10 Years of Statistics Explanations: Points of Significance Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2024-08-21 Naomi Altman, Martin Krzywinski
Points of Significance is an ongoing series of short articles about statistics in Nature Methods that started in 2013. Its aim is to provide clear explanations of essential concepts in statistics for a nonspecialist audience. The articles favor heuristic explanations and make extensive use of simulated examples and graphical explanations, while maintaining mathematical rigor. Topics range from basic
-
Statistical Data Integration for Health Policy Evidence-Building Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2024-08-19 Susan M. Paddock, Carolina Franco, F. Jay Breidt, Brenda Betancourt
Health policy evidence-building requires data sources such as health care claims, electronic health records, probability and nonprobability survey data, epidemiological surveillance databases, administrative data, and more, all of which have strengths and limitations for a given policy analysis. Data integration techniques leverage the relative strengths of input sources to obtain a blended source
-
The Role of the Bayes Factor in the Evaluation of Evidence Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2024-04-24 Colin Aitken, Franco Taroni, Silvia Bozza
The use of the Bayes factor as a metric for the assessment of the probative value of forensic scientific evidence is largely supported by recommended standards in different disciplines. The application of Bayesian networks enables the consideration of problems of increasing complexity. The lack of a widespread consensus concerning key aspects of evidence evaluation and interpretation, such as the adequacy
-
Convergence Diagnostics for Entity Resolution Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2024-04-24 Serge Aleshin-Guendel, Rebecca C. Steorts
Entity resolution is the process of merging and removing duplicate records from multiple data sources, often in the absence of unique identifiers. Bayesian models for entity resolution allow one to include a priori information, quantify uncertainty in important applications, and directly estimate a partition of the records. Markov chain Monte Carlo (MCMC) sampling is the primary computational method
-
Manifold Learning: What, How, and Why Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-29 Marina Meilă, Hanyu Zhang
Manifold learning (ML), also known as nonlinear dimension reduction, is a set of methods to find the low-dimensional structure of data. Dimension reduction for large, high-dimensional data is not merely a way to reduce the data; the new representations and descriptors obtained by ML reveal the geometric shape of high-dimensional point clouds and allow one to visualize, denoise, and interpret them.
-
Maps: A Statistical View Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-29 Lance A. Waller
Maps provide a data framework for the statistical analysis of georeferenced data observations. Since the middle of the twentieth century, the field of spatial statistics has evolved to address key inferential questions relating to spatially defined data, yet many central statistical properties do not translate to spatially indexed and spatially correlated data, and the development of statistical inference
-
Communication of Statistics and Evidence in Times of Crisis Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-29 Claudia R. Schneider, John R. Kerr, Sarah Dryhurst, John A.D. Aston
This review provides an overview of concepts relating to the communication of statistical and empirical evidence in times of crisis, with a special focus on COVID-19. In it, we consider topics relating to both the communication of numbers, such as the role of format, context, comparisons, and visualization, and the communication of evidence more broadly, such as evidence quality, the influence of changes
-
Recent Advances in Text Analysis Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-29 Zheng Tracy Ke, Pengsheng Ji, Jiashun Jin, Wanshan Li
Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze the Multi-Attribute
-
Statistical Brain Network Analysis Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-28 Sean L. Simpson, Heather M. Shappell, Mohsen Bahrami
The recent fusion of network science and neuroscience has catalyzed a paradigm shift in how we study the brain and led to the field of brain network analysis. Brain network analyses hold great potential in helping us understand normal and abnormal brain function by providing profound clinical insight into links between system-level properties and health and behavioral outcomes. Nonetheless, methods
-
Relational Event Modeling Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-28 Federica Bianchi, Edoardo Filippi-Mazzola, Alessandro Lomi, Ernst C. Wit
Advances in information technology have increased the availability of time-stamped relational data, such as those produced by email exchanges or interaction through social media. Whereas the associated information flows could be aggregated into cross-sectional panels, the temporal ordering of the events frequently contains information that requires new models for the analysis of continuous-time interactions
-
Competing Risks: Concepts, Methods, and Software Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-22 Ronald B. Geskus
The role of competing risks in the analysis of time-to-event data is increasingly acknowledged. Software is readily available. However, confusion remains regarding the proper analysis: When and how do I need to take the presence of competing risks into account? Which quantities are relevant for my research question? How can they be estimated and what assumptions do I need to make? The main quantities
-
Distributed Computing and Inference for Big Data Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-17 Ling Zhou, Ziyang Gong, Pengcheng Xiang
Data are distributed across different sites due to computing facility limitations or data privacy considerations. Conventional centralized methods—those in which all datasets are stored and processed in a central computing facility—are not applicable in practice. Therefore, it has become necessary to develop distributed learning approaches that have good inference or predictive accuracy while remaining
-
Causal Inference in the Social Sciences Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-17 Guido W. Imbens
Knowledge of causal effects is of great importance to decision makers in a wide variety of settings. In many cases, however, these causal effects are not known to the decision makers and need to be estimated from data. This fundamental problem has been known and studied for many years in many disciplines. In the past thirty years, however, the amount of empirical as well as methodological research
-
Interpretable Machine Learning for Discovery: Statistical Challenges and Opportunities Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-17 Genevera I. Allen, Luqin Gan, Lili Zheng
New technologies have led to vast troves of large and complex data sets across many scientific domains and industries. People routinely use machine learning techniques not only to process, visualize, and make predictions from these big data, but also to make data-driven discoveries. These discoveries are often made using interpretable machine learning, or machine learning models and techniques that
-
Geometric Methods for Cosmological Data on the Sphere Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-11-06 Javier Carrón Duque, Domenico Marinucci
This review is devoted to recent developments in the statistical analysis of spherical data, strongly motivated by applications in cosmology. We start from a brief discussion of cosmological questions and motivations, arguing that most cosmological observables are spherical random fields. Then, we introduce some mathematical background on spherical random fields, including spectral representations
-
Stochastic Models of Rainfall Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-10-31 Paul J. Northrop
Rainfall is the main input to most hydrological systems. To assess flood risk for a catchment area, hydrologists use models that require long series of subdaily, perhaps even subhourly, rainfall data, ideally from locations that cover the area. If historical data are not sufficient for this purpose, an alternative is to simulate synthetic data from a suitably calibrated model. We review stochastic
-
Shape-Constrained Statistical Inference Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-10-13 Lutz Dümbgen
Statistical models defined by shape constraints are a valuable alternative to parametric models or nonparametric models defined in terms of quantitative smoothness constraints. While the latter two classes of models are typically difficult to justify a priori, many applications involve natural shape constraints, for instance, monotonicity of a density or regression function. We review some of the history
-
Analysis of Microbiome Data Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-10-13 Christine B. Peterson, Satabdi Saha, Kim-Anh Do
The microbiome represents a hidden world of tiny organisms populating not only our surroundings but also our own bodies. By enabling comprehensive profiling of these invisible creatures, modern genomic sequencing tools have given us an unprecedented ability to characterize these populations and uncover their outsize impact on our environment and health. Statistical analysis of microbiome data is critical
-
Distributional Regression for Data Analysis Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-10-13 Nadja Klein
The flexible modeling of an entire distribution as a function of covariates, known as distributional regression, has seen growing interest over the past decades in both the statistics and machine learning literature. This review outlines selected state-of-the-art statistical approaches to distributional regression, complemented with alternatives from machine learning. Topics covered include the similarities
-
Role of Statistics in Detecting Misinformation: A Review of the State of the Art, Open Issues, and Future Research Directions Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-10-13 Zois Boukouvalas, Allison Shafer
With the evolution of social media, cyberspace has become the default medium for social media users to communicate, especially during high-impact events such as pandemics, natural disasters, terrorist attacks, and periods of political unrest. However, during such events, misinformation can spread rapidly on social media, affecting decision-making and creating social unrest. Identifying and curtailing
-
An Update on Measurement Error Modeling Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-10-13 Mushan Li, Yanyuan Ma
The issues caused by measurement errors have been recognized for almost 90 years, and research in this area has flourished since the 1980s. We review some of the classical methods in both density estimation and regression problems with measurement errors. In both problems, we consider when the original error-free model is parametric, nonparametric, and semiparametric, in combination with different
-
Making Sense of Censored Covariates: Statistical Methods for Studies of Huntington's Disease Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-09-08 Sarah C. Lotspeich, Marissa C. Ashner, Jesus E. Vazquez, Brian D. Richardson, Kyle F. Grosser, Benjamin E. Bodek, Tanya P. Garcia
The landscape of survival analysis is constantly being revolutionized to answer biomedical challenges, most recently the statistical challenge of censored covariates rather than outcomes. There are many promising strategies to tackle censored covariates, including weighting, imputation, maximum likelihood, and Bayesian methods. Still, this is a relatively fresh area of research, different from the
-
Variable Importance Without Impossible Data Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-08-25 Masayoshi Mase, Art B. Owen, Benjamin B. Seiler
The most popular methods for measuring importance of the variables in a black-box prediction algorithm make use of synthetic inputs that combine predictor variables from multiple observations. These inputs can be unlikely, physically impossible, or even logically impossible. As a result, the predictions for such cases can be based on data very unlike any the black box was trained on. We think that
-
Bayesian Inference for Misspecified Generative Models Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-08-24 David J. Nott, Christopher Drovandi, David T. Frazier
Bayesian inference is a powerful tool for combining information in complex settings, a task of increasing importance in modern applications. However, Bayesian inference with a flawed model can produce unreliable conclusions. This review discusses approaches to performing Bayesian inference when the model is misspecified, where, by misspecified, we mean that the analyst is unwilling to act as if the
-
Inverse Problems for Physics-Based Process Models Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-08-16 Derek Bingham, Troy Butler, Don Estep
We describe and compare two formulations of inverse problems for a physics-based process model in the context of uncertainty and random variability: the Bayesian inverse problem and the stochastic inverse problem. We describe the foundations of the two problems in order to create a context for interpreting the applicability and solutions of inverse problems important for scientific and engineering
-
Graph-Based Change-Point Analysis Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-03-09 Hao Chen, Lynna Chu
Recent technological advances allow for the collection of massive data in the study of complex phenomena over time and/or space in various fields. Many of these data involve sequences of high-dimensional or non-Euclidean measurements, where change-point analysis is a crucial early step in understanding the data. Segmentation, or offline change-point analysis, divides data into homogeneous temporal
-
Surrogate Endpoints in Clinical Trials Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-03-09 Michael R. Elliott
Surrogate markers are often used in clinical trials settings when obtaining a final outcome to evaluate the effectiveness of a treatment requires a long wait, is expensive to obtain, or both. Formal definitions of surrogate marker quality resulting from a large variety of estimation approaches have been proposed over the years. I review this work, with a particular focus on approaches that use the
-
High-Dimensional Data Bootstrap Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-03-09 Victor Chernozhukov, Denis Chetverikov, Kengo Kato, Yuta Koike
This article reviews recent progress in high-dimensional bootstrap. We first review high-dimensional central limit theorems for distributions of sample mean vectors over the rectangles, bootstrap consistency results in high dimensions, and key techniques used to establish those results. We then review selected applications of high-dimensional bootstrap: construction of simultaneous confidence sets
-
Second-Generation Functional Data Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-03-09 Salil Koner, Ana-Maria Staicu
Modern studies from a variety of fields record multiple functional observations according to either multivariate, longitudinal, spatial, or time series designs. We refer to such data as second-generation functional data because their analysis—unlike typical functional data analysis, which assumes independence of the functions—accounts for the complex dependence between the functional observations and
-
A Brief Tour of Deep Learning from a Statistical Perspective Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-03-09 Eric Nalisnick, Padhraic Smyth, Dustin Tran
We expose the statistical foundations of deep learning with the goal of facilitating conversation between the deep learning and statistics communities. We highlight core themes at the intersection; summarize key neural models, such as feedforward neural networks, sequential neural networks, and neural latent variable models; and link these ideas to their roots in probability and statistics. We also
-
Statistical Deep Learning for Spatial and Spatiotemporal Data Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2023-03-09 Christopher K. Wikle, Andrew Zammit-Mangion
Deep neural network models have become ubiquitous in recent years and have been applied to nearly all areas of science, engineering, and industry. These models are particularly useful for data that have strong dependencies in space (e.g., images) and time (e.g., sequences). Indeed, deep models have also been extensively used by the statistical community to model spatial and spatiotemporal data through
-
Confidentiality Protection in the 2020 US Census of Population and Housing Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-29 John M. Abowd, Michael B. Hawes
In an era where external data and computational capabilities far exceed statistical agencies’ own resources and capabilities, they face the renewed challenge of protecting the confidentiality of underlying microdata when publishing statistics in very granular form and ensuring that these granular data are used for statistical purposes only. Conventional statistical disclosure limitation methods are
-
Statistical Methods for Exoplanet Detection with Radial Velocities Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-22 Nathan C. Hara, Eric B. Ford
Exoplanets can be detected with various observational techniques. Among them, radial velocity (RV) has the key advantages of revealing the architecture of planetary systems and measuring planetary mass and orbital eccentricities. RV observations are poised to play a key role in the detection and characterization of Earth twins. However, the detection of such small planets is not yet possible due to
-
Statistical Machine Learning for Quantitative Finance Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-22 M. Ludkovski
We survey the active interface of statistical learning methods and quantitative finance models. Our focus is on the use of statistical surrogates, also known as functional approximators, for learning input–output relationships relevant for financial tasks. Given the disparate terminology used among statisticians and financial mathematicians, we begin by reviewing the main ingredients of surrogate construction
-
Approximate Methods for Bayesian Computation Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-22 Radu V. Craiu, Evgeny Levi
Rich data generating mechanisms are ubiquitous in this age of information and require complex statistical models to draw meaningful inference. While Bayesian analysis has seen enormous development in the last 30 years, benefitting from the impetus given by the successful application of Markov chain Monte Carlo (MCMC) sampling, the combination of big data and complex models conspire to produce significant
-
Fifty Years of the Cox Model Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-21 John D. Kalbfleisch, Douglas E. Schaubel
The Cox model is now 50 years old. The seminal paper of Sir David Cox has had an immeasurable impact on the analysis of censored survival data, with applications in many different disciplines. This work has also stimulated much additional research in diverse areas and led to important theoretical and practical advances. These include semiparametric models, nonparametric efficiency, and partial likelihood
-
Statistical Data Privacy: A Song of Privacy and Utility Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-19 Aleksandra Slavković, Jeremy Seeman
To quantify trade-offs between increasing demand for open data sharing and concerns about sensitive information disclosure, statistical data privacy (SDP) methodology analyzes data release mechanisms that sanitize outputs based on confidential data. Two dominant frameworks exist: statistical disclosure control (SDC) and the more recent differential privacy (DP). Despite framing differences, both SDC
-
Innovation Diffusion Processes: Concepts, Models, and Predictions Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-19 Mariangela Guidolin, Piero Manfredi
Innovation diffusion processes have attracted considerable research attention for their interdisciplinary character, which combines theories and concepts from disciplines such as mathematics, physics, statistics, social sciences, marketing, economics, and technological forecasting. The formal representation of innovation diffusion processes historically used epidemic models borrowed from biology, departing
-
Simulation-Based Bayesian Analysis Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-19 Martyn Plummer
I consider the development of Markov chain Monte Carlo (MCMC) methods, from late-1980s Gibbs sampling to present-day gradient-based methods and piecewise-deterministic Markov processes. In parallel, I show how these ideas have been implemented in successive generations of statistical software for Bayesian inference. These software packages have been instrumental in popularizing applied Bayesian modeling
-
The Role of Statistics in Promoting Data Reusability and Research Transparency Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-19 Sarah M. Nusser
The value of research data has grown as the emphasis on research transparency and data-intensive research has increased. Data sharing is now required by funders and publishers and is becoming a disciplinary expectation in many fields. However, practices promoting data reusability and research transparency are poorly understood, making it difficult for statisticians and other researchers to reframe
-
Models for Integer Data Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-19 Dimitris Karlis, Naushad Mamode Khan
Over the past few years, interest has increased in models defined on positive and negative integers. Several application areas lead to data that are differences between positive integers. Some important examples are price changes measured discretely in financial applications, pre- and posttreatment measurements of discrete outcomes in clinical trials, the difference in the number of goals in sports
-
Statistical Applications to Cognitive Diagnostic Testing Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-09 Susu Zhang, Jingchen Liu, Zhiliang Ying
Diagnostic classification tests are designed to assess examinees’ discrete mastery status on a set of skills or attributes. Such tests have gained increasing attention in educational and psychological measurement. We review diagnostic classification models and their applications to testing and learning, discuss their statistical and machine learning connections and related challenges, and introduce
-
Sustainable Statistical Capacity-Building for Africa: The Biostatistics Case Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-05 Tarylee Reddy, Rebecca N. Nsubuga, Tobias Chirwa, Ziv Shkedy, Ann Mwangi, Ayele Tadesse Awoke, Luc Duchateau, Paul Janssen
Several major global challenges, including climate change and water scarcity, warrant a scientific approach to generating solutions. Developing high quality and robust capacity in (bio)statistics is key to ensuring sound scientific solutions to these challenges, so collaboration between academic and research institutes should be high on university agendas. To strengthen capacity in the developing world
-
Player Tracking Data in Sports Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-01 Stephanie A. Kovalchik
There has been rapid growth in the collection of player tracking data in recent years. These data, providing spatiotemporal locations of players and ball at high resolution, have spurred methodological developments in a range of sports. There have been impacts in the development of player performance measurement (e.g., distance traveled) and in the attribution of value to specific plays (e.g., expected
-
Model Diagnostics and Forecast Evaluation for Quantiles Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-01 Tilmann Gneiting, Daniel Wolffram, Johannes Resin, Kristof Kraus, Johannes Bracher, Timo Dimitriadis, Veit Hagenmeyer, Alexander I. Jordan, Sebastian Lerch, Kaleb Phipps, Melanie Schienle
Model diagnostics and forecast evaluation are closely related tasks, with the former concerning in-sample goodness (or lack) of fit and the latter addressing predictive performance out-of-sample. We review the ubiquitous setting in which forecasts are cast in the form of quantiles or quantile-bounded prediction intervals. We distinguish unconditional calibration, which corresponds to classical coverage
-
Generative Models: An Interdisciplinary Perspective Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-01 Kris Sankaran, Susan P. Holmes
By linking conceptual theories with observed data, generative models can support reasoning in complex situations. They have come to play a central role both within and beyond statistics, providing the basis for power analysis in molecular biology, theory building in particle physics, and resource allocation in epidemiology, for example. We introduce the probabilistic and computational concepts underlying
-
Shared Frailty Methods for Complex Survival Data: A Review of Recent Advances Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-11-01 Malka Gorfine, David M. Zucker
Dependent survival data arise in many contexts. One context is clustered survival data, where survival data are collected on clusters such as families or medical centers. Dependent survival data also arise when multiple survival times are recorded for each individual. Frailty models are one common approach to handle such data. In frailty models, the dependence is expressed in terms of a random effect
-
Model-Based Clustering Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-10-21 Isobel Claire Gormley, Thomas Brendan Murphy, Adrian E. Raftery
Clustering is the task of automatically gathering observations into homogeneous groups, where the number of groups is unknown. Through its basis in a statistical modeling framework, model-based clustering provides a principled and reproducible approach to clustering. In contrast to heuristic approaches, model-based clustering allows for robust approaches to parameter estimation and objective inference
-
Six Statistical Senses Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-10-21 Radu V. Craiu, Ruobin Gong, Xiao-Li Meng
This article proposes a set of categories, each one representing a particular distillation of important statistical ideas. Each category is labeled a “sense” because we think of these as essential in helping every statistical mind connect in constructive and insightful ways with statistical theory, methodologies, and computation, toward the ultimate goal of building statistical phronesis. The illustration
-
A Review of Generalizability and Transportability Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-10-19 Irina Degtiar, Sherri Rose
When assessing causal effects, determining the target population to which the results are intended to generalize is a critical decision. Randomized and observational studies each have strengths and limitations for estimating causal effects in a target population. Estimates from randomized data may have internal validity but are often not representative of the target population. Observational data may
-
Fair Risk Algorithms Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-10-07 Richard A. Berk, Arun Kumar Kuchibhotla, Eric Tchetgen Tchetgen
Machine learning algorithms are becoming ubiquitous in modern life. When used to help inform human decision making, they have been criticized by some for insufficient accuracy, an absence of transparency, and unfairness. Many of these concerns can be legitimate, although they are less convincing when compared with the uneven quality of human decisions. There is now a large literature in statistics
-
Three-Decision Methods: A Sensible Formulation of Significance Tests—and Much Else Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-10-06 Kenneth M. Rice, Chloe A. Krakauer
For real-valued parameters, significance tests can be motivated as three-decision methods, in which we either assert the sign of the parameter above or below a specified null value, or say nothing either way. Tukey viewed this as a “sensible formulation” of tests, unlike the widely taught null hypothesis significance testing (NHST) system that is today's default. We review the three-decision framework
-
High-Dimensional Survival Analysis: Methods and Applications Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-10-06 Stephen Salerno, Yi Li
In the era of precision medicine, time-to-event outcomes such as time to death or progression are routinely collected, along with high-throughput covariates. These high-dimensional data defy classical survival regression models, which are either infeasible to fit or likely to incur low predictability due to overfitting. To overcome this, recent emphasis has been placed on developing novel approaches
-
Data Integration in Bayesian Phylogenetics Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-09-28 Gabriel W. Hassler, Andrew F. Magee, Zhenyu Zhang, Guy Baele, Philippe Lemey, Xiang Ji, Mathieu Fourment, Marc A. Suchard
Researchers studying the evolution of viral pathogens and other organisms increasingly encounter and use large and complex data sets from multiple different sources. Statistical research in Bayesian phylogenetics has risen to this challenge. Researchers use phylogenetics not only to reconstruct the evolutionary history of a group of organisms, but also to understand the processes that guide its evolution
-
Markov Chain Monte Carlo in Practice Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-03-07 Galin L. Jones, Qian Qin
Markov chain Monte Carlo (MCMC) is an essential set of tools for estimating features of probability distributions commonly encountered in modern applications. For MCMC simulation to produce reliable outcomes, it needs to generate observations representative of the target distribution, and it must be long enough so that the errors of Monte Carlo estimates are small. We review methods for assessing the
-
Postprocessing of MCMC Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-03-07 Leah F. South, Marina Riabiz, Onur Teymur, Chris J. Oates
Markov chain Monte Carlo is the engine of modern Bayesian statistics, being used to approximate the posterior and derived quantities of interest. Despite this, the issue of how the output from a Markov chain is postprocessed and reported is often overlooked. Convergence diagnostics can be used to control bias via burn-in removal, but these do not account for (common) situations where a limited computational
-
Post-Selection Inference Annu. Rev. Stat. Appl. (IF 7.4) Pub Date : 2022-03-07 Arun K. Kuchibhotla, John E. Kolassa, Todd A. Kuffner
We discuss inference after data exploration, with a particular focus on inference after model or variable selection. We review three popular approaches to this problem: sample splitting, simultaneous inference, and conditional selective inference. We explain how each approach works and highlight its advantages and disadvantages. We also provide an illustration of these post-selection inference approaches