
-
Graph-Based Change-Point Analysis Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2023-03-09 Hao Chen, Lynna Chu
Recent technological advances allow for the collection of massive data in the study of complex phenomena over time and/or space in various fields. Many of these data involve sequences of high-dimensional or non-Euclidean measurements, where change-point analysis is a crucial early step in understanding the data. Segmentation, or offline change-point analysis, divides data into homogeneous temporal
-
Surrogate Endpoints in Clinical Trials Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2023-03-09 Michael R. Elliott
Surrogate markers are often used in clinical trials settings when obtaining a final outcome to evaluate the effectiveness of a treatment requires a long wait, is expensive to obtain, or both. Formal definitions of surrogate marker quality resulting from a large variety of estimation approaches have been proposed over the years. I review this work, with a particular focus on approaches that use the
-
High-Dimensional Data Bootstrap Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2023-03-09 Victor Chernozhukov, Denis Chetverikov, Kengo Kato, Yuta Koike
This article reviews recent progress in high-dimensional bootstrap. We first review high-dimensional central limit theorems for distributions of sample mean vectors over the rectangles, bootstrap consistency results in high dimensions, and key techniques used to establish those results. We then review selected applications of high-dimensional bootstrap: construction of simultaneous confidence sets
-
Second-Generation Functional Data Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2023-03-09 Salil Koner, Ana-Maria Staicu
Modern studies from a variety of fields record multiple functional observations according to either multivariate, longitudinal, spatial, or time series designs. We refer to such data as second-generation functional data because their analysis—unlike typical functional data analysis, which assumes independence of the functions—accounts for the complex dependence between the functional observations and
-
A Brief Tour of Deep Learning from a Statistical Perspective Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2023-03-09 Eric Nalisnick, Padhraic Smyth, Dustin Tran
We expose the statistical foundations of deep learning with the goal of facilitating conversation between the deep learning and statistics communities. We highlight core themes at the intersection; summarize key neural models, such as feedforward neural networks, sequential neural networks, and neural latent variable models; and link these ideas to their roots in probability and statistics. We also
-
Statistical Deep Learning for Spatial and Spatiotemporal Data Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2023-03-09 Christopher K. Wikle, Andrew Zammit-Mangion
Deep neural network models have become ubiquitous in recent years and have been applied to nearly all areas of science, engineering, and industry. These models are particularly useful for data that have strong dependencies in space (e.g., images) and time (e.g., sequences). Indeed, deep models have also been extensively used by the statistical community to model spatial and spatiotemporal data through
-
Confidentiality Protection in the 2020 US Census of Population and Housing Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-29 John M. Abowd, Michael B. Hawes
In an era where external data and computational capabilities far exceed statistical agencies’ own resources and capabilities, they face the renewed challenge of protecting the confidentiality of underlying microdata when publishing statistics in very granular form and ensuring that these granular data are used for statistical purposes only. Conventional statistical disclosure limitation methods are
-
Statistical Methods for Exoplanet Detection with Radial Velocities Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-22 Nathan C. Hara, Eric B. Ford
Exoplanets can be detected with various observational techniques. Among them, radial velocity (RV) has the key advantages of revealing the architecture of planetary systems and measuring planetary mass and orbital eccentricities. RV observations are poised to play a key role in the detection and characterization of Earth twins. However, the detection of such small planets is not yet possible due to
-
Statistical Machine Learning for Quantitative Finance Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-22 M. Ludkovski
We survey the active interface of statistical learning methods and quantitative finance models. Our focus is on the use of statistical surrogates, also known as functional approximators, for learning input–output relationships relevant for financial tasks. Given the disparate terminology used among statisticians and financial mathematicians, we begin by reviewing the main ingredients of surrogate construction
-
Approximate Methods for Bayesian Computation Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-22 Radu V. Craiu, Evgeny Levi
Rich data generating mechanisms are ubiquitous in this age of information and require complex statistical models to draw meaningful inference. While Bayesian analysis has seen enormous development in the last 30 years, benefitting from the impetus given by the successful application of Markov chain Monte Carlo (MCMC) sampling, the combination of big data and complex models conspire to produce significant
-
Fifty Years of the Cox Model Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-21 John D. Kalbfleisch, Douglas E. Schaubel
The Cox model is now 50 years old. The seminal paper of Sir David Cox has had an immeasurable impact on the analysis of censored survival data, with applications in many different disciplines. This work has also stimulated much additional research in diverse areas and led to important theoretical and practical advances. These include semiparametric models, nonparametric efficiency, and partial likelihood
-
Statistical Data Privacy: A Song of Privacy and Utility Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-19 Aleksandra Slavković, Jeremy Seeman
To quantify trade-offs between increasing demand for open data sharing and concerns about sensitive information disclosure, statistical data privacy (SDP) methodology analyzes data release mechanisms that sanitize outputs based on confidential data. Two dominant frameworks exist: statistical disclosure control (SDC) and the more recent differential privacy (DP). Despite framing differences, both SDC
-
Innovation Diffusion Processes: Concepts, Models, and Predictions Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-19 Mariangela Guidolin, Piero Manfredi
Innovation diffusion processes have attracted considerable research attention for their interdisciplinary character, which combines theories and concepts from disciplines such as mathematics, physics, statistics, social sciences, marketing, economics, and technological forecasting. The formal representation of innovation diffusion processes historically used epidemic models borrowed from biology, departing
-
Simulation-Based Bayesian Analysis Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-19 Martyn Plummer
I consider the development of Markov chain Monte Carlo (MCMC) methods, from late-1980s Gibbs sampling to present-day gradient-based methods and piecewise-deterministic Markov processes. In parallel, I show how these ideas have been implemented in successive generations of statistical software for Bayesian inference. These software packages have been instrumental in popularizing applied Bayesian modeling
-
The Role of Statistics in Promoting Data Reusability and Research Transparency Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-19 Sarah M. Nusser
The value of research data has grown as the emphasis on research transparency and data-intensive research has increased. Data sharing is now required by funders and publishers and is becoming a disciplinary expectation in many fields. However, practices promoting data reusability and research transparency are poorly understood, making it difficult for statisticians and other researchers to reframe
-
Models for Integer Data Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-19 Dimitris Karlis, Naushad Mamode Khan
Over the past few years, interest has increased in models defined on positive and negative integers. Several application areas lead to data that are differences between positive integers. Some important examples are price changes measured discretely in financial applications, pre- and posttreatment measurements of discrete outcomes in clinical trials, the difference in the number of goals in sports
-
Statistical Applications to Cognitive Diagnostic Testing Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-09 Susu Zhang, Jingchen Liu, Zhiliang Ying
Diagnostic classification tests are designed to assess examinees’ discrete mastery status on a set of skills or attributes. Such tests have gained increasing attention in educational and psychological measurement. We review diagnostic classification models and their applications to testing and learning, discuss their statistical and machine learning connections and related challenges, and introduce
-
Sustainable Statistical Capacity-Building for Africa: The Biostatistics Case Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-05 Tarylee Reddy, Rebecca N. Nsubuga, Tobias Chirwa, Ziv Shkedy, Ann Mwangi, Ayele Tadesse Awoke, Luc Duchateau, Paul Janssen
Several major global challenges, including climate change and water scarcity, warrant a scientific approach to generating solutions. Developing high quality and robust capacity in (bio)statistics is key to ensuring sound scientific solutions to these challenges, so collaboration between academic and research institutes should be high on university agendas. To strengthen capacity in the developing world
-
Player Tracking Data in Sports Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-01 Stephanie A. Kovalchik
There has been rapid growth in the collection of player tracking data in recent years. These data, providing spatiotemporal locations of players and ball at high resolution, have spurred methodological developments in a range of sports. There have been impacts in the development of player performance measurement (e.g., distance traveled) and in the attribution of value to specific plays (e.g., expected
-
Model Diagnostics and Forecast Evaluation for Quantiles Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-01 Tilmann Gneiting, Daniel Wolffram, Johannes Resin, Kristof Kraus, Johannes Bracher, Timo Dimitriadis, Veit Hagenmeyer, Alexander I. Jordan, Sebastian Lerch, Kaleb Phipps, Melanie Schienle
Model diagnostics and forecast evaluation are closely related tasks, with the former concerning in-sample goodness (or lack) of fit and the latter addressing predictive performance out-of-sample. We review the ubiquitous setting in which forecasts are cast in the form of quantiles or quantile-bounded prediction intervals. We distinguish unconditional calibration, which corresponds to classical coverage
-
Generative Models: An Interdisciplinary Perspective Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-01 Kris Sankaran, Susan P. Holmes
By linking conceptual theories with observed data, generative models can support reasoning in complex situations. They have come to play a central role both within and beyond statistics, providing the basis for power analysis in molecular biology, theory building in particle physics, and resource allocation in epidemiology, for example. We introduce the probabilistic and computational concepts underlying
-
Shared Frailty Methods for Complex Survival Data: A Review of Recent Advances Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-11-01 Malka Gorfine, David M. Zucker
Dependent survival data arise in many contexts. One context is clustered survival data, where survival data are collected on clusters such as families or medical centers. Dependent survival data also arise when multiple survival times are recorded for each individual. Frailty models are one common approach to handle such data. In frailty models, the dependence is expressed in terms of a random effect
-
Model-Based Clustering Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-10-21 Isobel Claire Gormley, Thomas Brendan Murphy, Adrian E. Raftery
Clustering is the task of automatically gathering observations into homogeneous groups, where the number of groups is unknown. Through its basis in a statistical modeling framework, model-based clustering provides a principled and reproducible approach to clustering. In contrast to heuristic approaches, model-based clustering allows for robust approaches to parameter estimation and objective inference
-
Six Statistical Senses Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-10-21 Radu V. Craiu, Ruobin Gong, Xiao-Li Meng
This article proposes a set of categories, each one representing a particular distillation of important statistical ideas. Each category is labeled a “sense” because we think of these as essential in helping every statistical mind connect in constructive and insightful ways with statistical theory, methodologies, and computation, toward the ultimate goal of building statistical phronesis. The illustration
-
A Review of Generalizability and Transportability Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-10-19 Irina Degtiar, Sherri Rose
When assessing causal effects, determining the target population to which the results are intended to generalize is a critical decision. Randomized and observational studies each have strengths and limitations for estimating causal effects in a target population. Estimates from randomized data may have internal validity but are often not representative of the target population. Observational data may
-
Fair Risk Algorithms Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-10-07 Richard A. Berk, Arun Kumar Kuchibhotla, Eric Tchetgen Tchetgen
Machine learning algorithms are becoming ubiquitous in modern life. When used to help inform human decision making, they have been criticized by some for insufficient accuracy, an absence of transparency, and unfairness. Many of these concerns can be legitimate, although they are less convincing when compared with the uneven quality of human decisions. There is now a large literature in statistics
-
Three-Decision Methods: A Sensible Formulation of Significance Tests—and Much Else Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-10-06 Kenneth M. Rice, Chloe A. Krakauer
For real-valued parameters, significance tests can be motivated as three-decision methods, in which we either assert the sign of the parameter above or below a specified null value, or say nothing either way. Tukey viewed this as a “sensible formulation” of tests, unlike the widely taught null hypothesis significance testing (NHST) system that is today's default. We review the three-decision framework
-
High-Dimensional Survival Analysis: Methods and Applications Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-10-06 Stephen Salerno, Yi Li
In the era of precision medicine, time-to-event outcomes such as time to death or progression are routinely collected, along with high-throughput covariates. These high-dimensional data defy classical survival regression models, which are either infeasible to fit or likely to incur low predictability due to overfitting. To overcome this, recent emphasis has been placed on developing novel approaches
-
Data Integration in Bayesian Phylogenetics Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-09-28 Gabriel W. Hassler, Andrew F. Magee, Zhenyu Zhang, Guy Baele, Philippe Lemey, Xiang Ji, Mathieu Fourment, Marc A. Suchard
Researchers studying the evolution of viral pathogens and other organisms increasingly encounter and use large and complex data sets from multiple different sources. Statistical research in Bayesian phylogenetics has risen to this challenge. Researchers use phylogenetics not only to reconstruct the evolutionary history of a group of organisms, but also to understand the processes that guide its evolution
-
Markov Chain Monte Carlo in Practice Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Galin L. Jones, Qian Qin
Markov chain Monte Carlo (MCMC) is an essential set of tools for estimating features of probability distributions commonly encountered in modern applications. For MCMC simulation to produce reliable outcomes, it needs to generate observations representative of the target distribution, and it must be long enough so that the errors of Monte Carlo estimates are small. We review methods for assessing the
-
Postprocessing of MCMC Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Leah F. South, Marina Riabiz, Onur Teymur, Chris J. Oates
Markov chain Monte Carlo is the engine of modern Bayesian statistics, being used to approximate the posterior and derived quantities of interest. Despite this, the issue of how the output from a Markov chain is postprocessed and reported is often overlooked. Convergence diagnostics can be used to control bias via burn-in removal, but these do not account for (common) situations where a limited computational
-
Post-Selection Inference Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Arun K. Kuchibhotla, John E. Kolassa, Todd A. Kuffner
We discuss inference after data exploration, with a particular focus on inference after model or variable selection. We review three popular approaches to this problem: sample splitting, simultaneous inference, and conditional selective inference. We explain how each approach works and highlight its advantages and disadvantages. We also provide an illustration of these post-selection inference approaches
-
Quantum Computing in a Statistical Context Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Yazhen Wang, Hongzhi Liu
Quantum computing is widely considered a frontier of interdisciplinary research and involves fields ranging from computer science to physics and from chemistry to engineering. On the one hand, the stochastic essence of quantum physics results in the random nature of quantum computing; thus, there is an important role for statistics to play in the development of quantum computing. On the other hand
-
Vine Copula Based Modeling Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Claudia Czado, Thomas Nagler
With the availability of massive multivariate data comes a need to develop flexible multivariate distribution classes. The copula approach allows marginal models to be constructed for each variable separately and joined with a dependence structure characterized by a copula. The class of multivariate copulas was limited for a long time to elliptical (including the Gaussian and t-copula) and Archimedean
-
Discrete Latent Variable Models Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Francesco Bartolucci, Silvia Pandolfi, Fulvia Pennoni
We review the discrete latent variable approach, which is very popular in statistics and related fields. It allows us to formulate interpretable and flexible models that can be used to analyze complex datasets in the presence of articulated dependence structures among variables. Specific models including discrete latent variables are illustrated, such as finite mixture, latent class, hidden Markov
-
Measure Transportation and Statistical Decision Theory Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Marc Hallin
Unlike the real line, the real space, in dimension d ≥ 2, is not canonically ordered. As a consequence, extending to a multivariate context fundamental univariate statistical tools such as quantiles, signs, and ranks is anything but obvious. Tentative definitions have been proposed in the literature but do not enjoy the basic properties (e.g., distribution-freeness of ranks, their independence with
-
Basis-Function Models in Spatial Statistics Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Noel Cressie, Matthew Sainsbury-Dale, Andrew Zammit-Mangion
Spatial statistics is concerned with the analysis of data that have spatial locations associated with them, and those locations are used to model statistical dependence between the data. The spatial data are treated as a single realization from a probability model that encodes the dependence through both fixed effects and random effects, where randomness is manifest in the underlying spatial process
-
A Variational View on Statistical Multiscale Estimation Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Markus Haltmeier, Housen Li, Axel Munk
We present a unifying view on various statistical estimation techniques including penalization, variational, and thresholding methods. These estimators are analyzed in the context of statistical linear inverse problems including nonparametric and change point regression, and high-dimensional linear models as examples. Our approach reveals many seemingly unrelated estimation schemes as special instances
-
Score-Driven Time Series Models Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Andrew C. Harvey
The construction of score-driven filters for nonlinear time series models is described, and they are shown to apply over a wide range of disciplines. Their theoretical and practical advantages over other methods are highlighted. Topics covered include robust time series modeling, conditional heteroscedasticity, count data, dynamic correlation and association, censoring, circular data, and switching
-
Granger Causality: A Review and Recent Advances Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Ali Shojaie, Emily B. Fox
Introduced more than a half-century ago, Granger causality has become a popular tool for analyzing time series data in many application domains, from economics and finance to genomics and neuroscience. Despite this popularity, the validity of this framework for inferring causal relationships among time series has remained the topic of continuous debate. Moreover, while the original definition was general
-
Effects of Causes and Causes of Effects Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 A. Philip Dawid, Monica Musio
We describe and contrast two distinct problem areas for statistical causality: studying the likely effects of an intervention (effects of causes) and studying whether there is a causal link between the observed exposure and outcome in an individual case (causes of effects). For each of these, we introduce and compare various formal frameworks that have been proposed for that purpose, including the
-
Causality and the Cox Regression Model Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Torben Martinussen
This article surveys results concerning the interpretation of the Cox hazard ratio in connection to causality in a randomized study with a time-to-event response. The Cox model is assumed to be correctly specified, and we investigate whether the typical end product of such an analysis, the estimated hazard ratio, has a causal interpretation as a hazard ratio. It has been pointed out that this is not
-
Framing Causal Questions in Life Course Epidemiology Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Bianca L. De Stavola, Moritz Herle, Andrew Pickles
We describe the principles of counterfactual thinking in providing more precise definitions of causal effects and some of the implications of this work for the way in which causal questions in life course research are framed and evidence evaluated. Terminology is explained and examples of common life course analyses are discussed that focus on the timing of exposures, the mediation of their effects
-
Current Advances in Neural Networks Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Víctor Gallego, David Ríos Insua
This article reviews current advances and developments in neural networks. This requires recalling some of the earlier work in the field. We emphasize Bayesian approaches and their benefits compared to more standard maximum likelihood treatments. Several representative experiments using varied modern neural architectures are presented.
-
Methods Based on Semiparametric Theory for Analysis in the Presence of Missing Data Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Marie Davidian
A statistical model is a class of probability distributions assumed to contain the true distribution generating the data. In parametric models, the distributions are indexed by a finite-dimensional parameter characterizing the scientific question of interest. Semiparametric models describe the distributions in terms of a finite-dimensional parameter and an infinite-dimensional component, offering more
-
Risk Measures: Robustness, Elicitability, and Backtesting Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Xue Dong He, Steven Kou, Xianhua Peng
Risk measures are used not only for financial institutions’ internal risk management but also for external regulation (e.g., in the Basel Accord for calculating the regulatory capital requirements for financial institutions). Though fundamental in risk management, how to select a good risk measure is a controversial issue. We review the literature on risk measures, particularly on issues such as subadditivity
-
Recent Challenges in Actuarial Science Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Paul Embrechts, Mario V. Wüthrich
For centuries, mathematicians and, later, statisticians, have found natural research and employment opportunities in the realm of insurance. By definition, insurance offers financial cover against unforeseen events that involve an important component of randomness, and consequently, probability theory and mathematical statistics enter insurance modeling in a fundamental way. In recent years, a data
-
Value of Information Analysis in Models to Inform Health Policy Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Christopher H. Jackson, Gianluca Baio, Anna Heath, Mark Strong, Nicky J. Welton, Edward C.F. Wilson
Value of information (VoI) is a decision-theoretic approach to estimating the expected benefits from collecting further information of different kinds, in scientific problems based on combining one or more sources of data. VoI methods can assess the sensitivity of models to different sources of uncertainty and help to set priorities for further data collection. They have been widely applied in healthcare
-
Sibling Comparison Studies Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Arvid Sjölander, Thomas Frisell, Sara Öberg
Unmeasured confounding is one of the main sources of bias in observational studies. A popular way to reduce confounding bias is to use sibling comparisons, which implicitly adjust for several factors in the early environment or upbringing without requiring them to be measured or known. In this article we provide a broad exposition of the statistical analysis methods for sibling comparison studies.
-
A Practical Guide to Family Studies with Lifetime Data Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Thomas H. Scheike, Klaus Kähler Holst
Familial aggregation refers to the fact that a particular disease may be overrepresented in some families due to genetic or environmental factors. When studying such phenomena, it is clear that one important aspect is the age of onset of the disease in question, and in addition, the data will typically be right-censored. Therefore, one must apply lifetime data methods to quantify such dependence and
-
Is There a Cap on Longevity? A Statistical Review Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Léo R. Belzile, Anthony C. Davison, Jutta Gampe, Holger Rootzén, Dmitrii Zholud
There is sustained and widespread interest in understanding the limit, if there is any, to the human life span. Apart from its intrinsic and biological interest, changes in survival in old age have implications for the sustainability of social security systems. A central question is whether the endpoint of the underlying lifetime distribution is finite. Recent analyses of data on the oldest human lifetimes
-
Perspective on Data Science Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2022-03-07 Roger D. Peng, Hilary S. Parker
The field of data science currently enjoys a broad definition that includes a wide array of activities which borrow from many other established fields of study. Having such a vague characterization of a field in the early stages might be natural, but over time maintaining such a broad definition becomes unwieldy and impedes progress. In particular, the teaching of data science is hampered by the seeming
-
Twenty-First-Century Statistical and Computational Challenges in Astrophysics Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2021-03-08 Eric D. Feigelson, Rafael S. de Souza, Emille E.O. Ishida, Gutti Jogesh Babu
Modern astronomy has been rapidly increasing our ability to see deeper into the Universe, acquiring enormous samples of cosmic populations. Gaining astrophysical insights from these data sets requires a wide range of sophisticated statistical and machine learning methods. Long-standing problems in cosmology include characterization of galaxy clustering and estimation of galaxy distances from photometric
-
Statistical Connectomics Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2021-03-08 Jaewon Chung, Eric Bridgeford, Jesús Arroyo, Benjamin D. Pedigo, Ali Saad-Eldin, Vivek Gopalakrishnan, Liang Xiang, Carey E. Priebe, Joshua T. Vogelstein
The data science of networks is a rapidly developing field with myriad applications. In neuroscience, the brain is commonly modeled as a connectome, a network of nodes connected by edges. While there have been thousands of papers on connectomics, the statistics of networks remains limited and poorly understood. Here, we provide an overview from the perspective of statistical network science of the
-
Statistical Applications in Educational Measurement Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2021-03-08 Hua-Hua Chang, Chun Wang, Susu Zhang
Educational measurement assigns numbers to individuals based on observed data to represent individuals’ educational properties such as abilities, aptitudes, achievements, progress, and performance. The current review introduces a selection of statistical applications to educational measurement, ranging from classical statistical theory (e.g., Pearson correlation and the Mantel–Haenszel test) to more
-
Quantile Regression for Survival Data Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2021-03-08 Limin Peng
Quantile regression offers a useful alternative strategy for analyzing survival data. Compared with traditional survival analysis methods, quantile regression allows for comprehensive and flexible evaluations of covariate effects on a survival outcome of interest while providing simple physical interpretations on the time scale. Moreover, many quantile regression methods enjoy easy and stable computation
-
Adaptive Enrichment Designs in Clinical Trials Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2021-03-08 Peter F. Thall
Adaptive enrichment designs for clinical trials may include rules that use interim data to identify treatment-sensitive patient subgroups, select or compare treatments, or change entry criteria. A common setting is a trial to compare a new biologically targeted agent to standard therapy. An enrichment design's structure depends on its goals, how it accounts for patient heterogeneity and treatment effects
-
Flexible Models for Complex Data with Applications Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2021-03-08 Christophe Ley, Slađana Babić, Domien Craens
Probability distributions are the building blocks of statistical modeling and inference. It is therefore of the utmost importance to know which distribution to use in what circumstances, as wrong choices will inevitably entail a biased analysis. In this article, we focus on circumstances involving complex data and describe the most popular flexible models for these settings. We focus on the following
-
Tensors in Statistics Annu. Rev. Stat. Appl. (IF 7.917) Pub Date : 2021-03-08 Xuan Bi, Xiwei Tang, Yubai Yuan, Yanqing Zhang, Annie Qu
This article provides an overview of tensors, their properties, and their applications in statistics. Tensors, also known as multidimensional arrays, are generalizations of matrices to higher orders and are useful data representation architectures. We first review basic tensor concepts and decompositions, and then we elaborate traditional and recent applications of tensors in the fields of recommender