-
Inference of partial correlations of a multivariate Gaussian time series Biometrika (IF 2.7) Pub Date : 2024-02-26 A S DiLernia, M Fiecas, L Zhang
We derive an asymptotic joint distribution and novel covariance estimator for the partial correlations of a multivariate Gaussian time series given mild regularity conditions. Using our derived asymptotic distribution, we develop a Wald confidence interval and testing procedure for inference of individual partial correlations for time series data. Through simulation we demonstrate that our proposed
-
Network-adjusted covariates for community detection Biometrika (IF 2.7) Pub Date : 2024-02-24 Y Hu, W Wang
Community detection is a crucial task in network analysis that can be significantly improved by incorporating subject-level information, i.e., covariates. Existing methods have shown the effectiveness of using covariates on the low-degree nodes, but rarely discuss the case where communities have significantly different density levels, i.e., multiscale networks. In this paper, we introduce a novel method
-
Selective conformal inference with false coverage-statement rate control Biometrika (IF 2.7) Pub Date : 2024-02-19 Yajie Bao, Yuyang Huo, Haojie Ren, Changliang Zou
Conformal inference is a popular tool for constructing prediction intervals. We consider here the scenario of post-selection/selective conformal inference, that is, prediction intervals are reported only for individuals selected from unlabelled test data. To account for multiplicity, we develop a general split conformal framework to construct selective prediction intervals with the false coverage-statement
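For orientation, the abstract above builds on split conformal prediction. The following is a minimal sketch of the plain, non-selective split conformal interval only; it is not the authors' selective procedure and makes no attempt to control the false coverage-statement rate. The function names and the choice of absolute residuals as the conformity score are illustrative assumptions.

```python
import math


def split_conformal_halfwidth(cal_residuals, alpha=0.1):
    """Half-width of a split conformal interval from calibration residuals.

    cal_residuals: residuals y - prediction on a held-out calibration set
                   (assumed exchangeable with the test point).
    alpha:         target miscoverage level, so nominal coverage is 1 - alpha.
    """
    n = len(cal_residuals)
    if n == 0:
        raise ValueError("need at least one calibration residual")
    # Finite-sample rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return math.inf
    scores = sorted(abs(r) for r in cal_residuals)
    return scores[k - 1]


def conformal_interval(prediction, halfwidth):
    """Symmetric interval around a point prediction (hypothetical helper)."""
    return (prediction - halfwidth, prediction + halfwidth)
```

With ten calibration residuals and alpha = 0.1, the rank is ceil(11 × 0.9) = 10, so the half-width is the largest absolute residual; with only five residuals the same alpha yields an unbounded interval.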
-
The Promises of Parallel Outcomes Biometrika (IF 2.7) Pub Date : 2024-02-17 Ying Zhou, Dingke Tang, Dehan Kong, Linbo Wang
A key challenge in causal inference from observational studies is the identification and estimation of causal effects in the presence of unmeasured confounding. In this paper, we introduce a novel approach for causal inference that leverages information in multiple outcomes to deal with unmeasured confounding. An important assumption in our approach is conditional independence among multiple outcomes
-
Doubly robust estimation under covariate-induced dependent left truncation Biometrika (IF 2.7) Pub Date : 2024-02-11 Yuyao Wang, Andrew Ying, Ronghui Xu
Summary In prevalent cohort studies with follow-up, the time-to-event outcome is subject to left truncation leading to selection bias. For estimation of the distribution of time-to-event, conventional methods adjusting for left truncation tend to rely on the quasi-independence assumption that the truncation time and the event time are independent on the observed region. This assumption is violated
-
Explicit solutions for the asymptotically-optimal bandwidth in cross-validation Biometrika (IF 2.7) Pub Date : 2024-02-08 Karim M Abadir, Michel Lubrano
Summary We show that least squares cross-validation methods share a common structure which has an explicit asymptotic solution, when the chosen kernel is asymptotically separable in bandwidth and data. For density estimation with a multivariate Student t(ν) kernel, the cross-validation criterion becomes asymptotically equivalent to a polynomial of only three terms. Our bandwidth formulae are simple
-
Regression analysis of group-tested current status data Biometrika (IF 2.7) Pub Date : 2024-02-08 Shuwei Li, Tao Hu, Lianming Wang, Christopher S McMahan, Joshua M Tebbs
Summary Group testing is an effective way to reduce the time and cost associated with conducting large-scale screening for infectious diseases. Benefits are realized through testing pools formed by combining specimens, such as blood or urine, from different individuals. In some studies, individuals are assessed only once and a time-to-event endpoint is recorded, for example, the time until infection
-
On the failure of the bootstrap for Chatterjee's rank correlation Biometrika (IF 2.7) Pub Date : 2024-02-04 Zhexiao Lin, Fang Han
Summary While researchers commonly use the bootstrap to quantify the uncertainty of an estimator, it has been noticed that the standard bootstrap, in general, does not work for Chatterjee's rank correlation. In this paper, we provide proof of this issue under an additional independence assumption, and complement our theory with simulation evidence for general settings. Chatterjee's rank correlation
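As background for the entry above, here is a minimal sketch of Chatterjee's rank correlation itself in its standard no-ties form, ξₙ = 1 − 3 Σ|rᵢ₊₁ − rᵢ| / (n² − 1), where rᵢ is the rank of the response after sorting the pairs by the covariate. It illustrates the statistic only, not the paper's bootstrap analysis; the function name is hypothetical and ties are rejected rather than handled.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation for paired samples without ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length >= 2")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("this sketch assumes no ties in x or y")
    # Reorder the responses by increasing covariate value.
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    # Rank of each response among all responses (1 = smallest).
    rank_of = {value: k + 1 for k, value in enumerate(sorted(y))}
    r = [rank_of[value] for value in y_sorted]
    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1 - 3 * total / (n * n - 1)
```

For a strictly monotone relationship the statistic equals 1 − 3/(n + 1), which is 0.5 at n = 5 and approaches 1 as n grows.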
-
Asymptotically constant risk estimator of the time-average variance constant Biometrika (IF 2.7) Pub Date : 2024-02-03 K W Chan, C Y Yau
Summary Estimation of the time-average variance constant is important for statistical analyses involving dependent data. This problem is difficult as it relies on a bandwidth parameter. Specifically, the optimal choices of the bandwidths of all existing estimators depend on the estimand itself and another unknown parameter which is very difficult to estimate. Thus, optimal variance estimation is unachievable
-
A note on minimax robustness of designs against correlated or heteroscedastic responses Biometrika (IF 2.7) Pub Date : 2024-01-20 D P Wiens
Summary We present a result according to which certain functions of covariance matrices are maximized at scalar multiples of the identity matrix. This is used to show that experimental designs that are optimal under an assumption of independent, homoscedastic responses can be minimax robust, in broad classes of alternate covariance structures. In particular it can justify the common practice of disregarding
-
Efficient nonparametric estimation of Toeplitz covariance matrices Biometrika (IF 2.7) Pub Date : 2024-01-17 K Klockmann, T Krivobokova
A new efficient nonparametric estimator for Toeplitz covariance matrices is proposed. This estimator is based on a data transformation that translates the problem of Toeplitz covariance matrix estimation to the problem of mean estimation in an approximate Gaussian regression. The resulting Toeplitz covariance matrix estimator is positive definite by construction, fully data-driven and computationally
-
On Selecting and Conditioning in Multiple Testing and Selective Inference Biometrika (IF 2.7) Pub Date : 2023-12-22 Jelle J Goeman, Aldo Solari
We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven collection of hypotheses is chosen from some large universe of hypotheses. Subsequently, inference takes place within this data-driven collection, conditioned on the information that was used for the selection. Examples of such methods include
-
Central limit theorems for local network statistics Biometrika (IF 2.7) Pub Date : 2023-12-22 P A Maugis
Summary Subgraph counts, in particular the number of occurrences of small shapes such as triangles, characterize properties of random networks. As a result, they have seen wide use as network summary statistics. Subgraphs are typically counted globally, making existing approaches unable to describe vertex-specific characteristics. In contrast, rooted subgraphs focus on vertex neighbourhoods, and are
-
Conformalized survival analysis with adaptive cutoffs Biometrika (IF 2.7) Pub Date : 2023-12-01 Yu Gui, Rohan Hore, Zhimei Ren, Rina Foygel Barber
Summary This paper introduces an assumption-lean method that constructs valid and efficient lower predictive bounds (LPBs) for survival times with censored data. We build on recent work by Candès et al. (2021), whose approach first subsets the data to discard any data points with early censoring times, and then uses a reweighting technique (namely, weighted conformal inference (Tibshirani et al., 2019))
-
Phylogenetic Association Analysis with Conditional Rank Correlation Biometrika (IF 2.7) Pub Date : 2023-12-01 Shulei Wang, Bo Yuan, T Tony Cai, Hongzhe Li
Summary Phylogenetic association analysis plays a crucial role in investigating the correlation between microbial compositions and specific outcomes of interest in microbiome studies. However, existing methods for testing such associations have limitations related to the assumption of a linear association in high-dimensional settings and the handling of confounding effects. Therefore, there is a need
-
Familial inference: Tests for hypotheses on a family of centres Biometrika (IF 2.7) Pub Date : 2023-11-28 Ryan Thompson, Catherine S Forbes, Steven N MacEachern, Mario Peruggia
Statistical hypotheses are translations of scientific hypotheses into statements about one or more distributions, often concerning their centre. Tests that assess statistical hypotheses of centre implicitly assume a specific centre, e.g., the mean or median. Yet, scientific hypotheses do not always specify a particular centre. This ambiguity leaves the possibility for a gap between scientific theory
-
Maximum Likelihood Estimation for Semiparametric Regression Models with Interval-Censored Multistate Data Biometrika (IF 2.7) Pub Date : 2023-11-24 Yu Gu, Donglin Zeng, Gerardo Heiss, D Y Lin
Summary Interval-censored multistate data arise in many studies of chronic diseases, where the health status of a subject can be characterized by a finite number of disease states and the transition between any two states is only known to occur over a broad time interval. We relate potentially time-dependent covariates to multistate processes through semiparametric proportional intensity models with
-
On varimax asymptotics in network models and spectral methods for dimensionality reduction Biometrika (IF 2.7) Pub Date : 2023-11-20 J Cape
Summary Varimax factor rotations, while popular among practitioners in psychology and statistics since being introduced by H. Kaiser, have historically been viewed with skepticism and suspicion by some theoreticians and mathematical statisticians. Now, work by K. Rohe and M. Zeng provides new, fundamental insight: varimax rotations provably perform statistical estimation in certain classes of latent
-
Second term improvement to generalised linear mixed model asymptotics Biometrika (IF 2.7) Pub Date : 2023-11-16 Luca Maestrini, Aishwarya Bhaskaran, Matt P Wand
Summary A recent article on generalised linear mixed model asymptotics, Jiang et al. (2022), derived the rates of convergence for the asymptotic variances of maximum likelihood estimators. If m denotes the number of groups and n is the average within-group sample size then the asymptotic variances have orders $m^{-1}$ and $(mn)^{-1}$, depending on the parameter. We extend this theory to provide explicit forms
-
Discussion of 'Statistical inference for streamed longitudinal data'. Biometrika (IF 2.7) Pub Date : 2023-11-15 Yang Ning, Jingyi Duan
-
Projective Independence Tests in High Dimensions: the Curses and the Cures Biometrika (IF 2.7) Pub Date : 2023-11-15 Yaowu Zhang, Liping Zhu
Summary Testing independence between high dimensional random vectors is fundamentally different from testing independence between univariate random variables. Take the projection correlation as an example. It suffers from at least three issues. First, it has a high computational complexity of $O\{n^3(p+q)\}$, where n, p and q are the respective sample size and dimensions of the random vectors. This limits
-
Generalized kernel two-sample tests Biometrika (IF 2.7) Pub Date : 2023-11-14 Hoseung Song, Hao Chen
Summary Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do not work well for some scenarios when the dimension of the data is moderate to high due to the curse of dimensionality. We propose a new test statistic
-
Testing Serial Independence of Object-Valued Time Series Biometrika (IF 2.7) Pub Date : 2023-11-12 Feiyu Jiang, Hanjia Gao, Xiaofeng Shao
Summary We propose a novel method for testing serial independence of object-valued time series in metric spaces, which is more general than Euclidean or Hilbert spaces. The proposed method is fully nonparametric, free of tuning parameters and can capture all nonlinear pairwise dependence. The key concept used in this paper is the distance covariance in metric spaces, which is extended to auto-distance
-
On inference in high-dimensional logistic regression models with separated data Biometrika (IF 2.7) Pub Date : 2023-11-07 R M Lewis, H S Battey
Direct use of the likelihood function typically produces severely biased estimates when the dimension of the parameter vector is large relative to the effective sample size. With linearly separable data generated from a logistic regression model, the loglikelihood function asymptotes and the maximum likelihood estimator does not exist. We show that an exact analysis for each regression coefficient
-
Likelihood-based Inference under Non-Convex Boundary Constraints Biometrika (IF 2.7) Pub Date : 2023-10-19 J Y Wang, Z S Ye, Y Chen
Summary Likelihood-based inference under nonconvex constraints on model parameters has become increasingly common in biomedical research. In this paper, we establish large-sample properties of the maximum likelihood estimator when the true parameter value lies at the boundary of a nonconvex parameter space. We further derive the asymptotic distribution of the likelihood ratio test statistic under nonconvex
-
Nonparametric priors with full-range borrowing of information Biometrika (IF 2.7) Pub Date : 2023-10-19 F Ascolani, B Franzolini, A Lijoi, I Prünster
Summary Modelling of the dependence structure across heterogeneous data is crucial for Bayesian inference since it directly impacts the borrowing of information. Despite the extensive advances over the last two decades, most available proposals only allow for nonnegative correlations. We derive a new class of dependent nonparametric priors that can induce correlations of any sign, thus introducing
-
On geometric convergence for MALA under simple conditions Biometrika (IF 2.7) Pub Date : 2023-10-03 Alain Oliviero-Durmus, Éric Moulines
Summary While the Metropolis Adjusted Langevin Algorithm (MALA) is a popular and widely used Markov chain Monte Carlo method, very few papers derive conditions that ensure its convergence. In particular, to the authors' knowledge, assumptions that are both easy to verify and guarantee geometric convergence, are still missing. In this work, we establish V-uniformly geometric convergence for MALA under
-
Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning Biometrika (IF 2.7) Pub Date : 2023-09-26 Nathan Kallus, Masatoshi Uehara
We study the efficient off-policy evaluation of natural stochastic policies, which are defined in terms of deviations from the unknown behaviour policy. This is a departure from the literature on off-policy evaluation that largely considers the evaluation of explicitly specified policies. Crucially, offline reinforcement learning with natural stochastic policies can help alleviate issues of weak overlap
-
Selective machine learning of doubly robust functionals Biometrika (IF 2.7) Pub Date : 2023-09-25 Y Cui, E Tchetgen Tchetgen
While model selection is a well-studied topic in parametric and nonparametric regression or density estimation, selection of possibly high-dimensional nuisance parameters in semiparametric problems is far less developed. In this paper, we propose a selective machine learning framework for making inferences about a finite-dimensional functional defined on a semiparametric model, when the latter admits
-
An eigenvector-assisted estimation framework for signal-plus-noise matrix models Biometrika (IF 2.7) Pub Date : 2023-09-19 Fangzheng Xie, Dingbo Wu
Summary In this paper, we develop an eigenvector-assisted estimation framework for a collection of signal-plus-noise matrix models arising in high-dimensional statistics and many applications. The framework is built upon a novel asymptotically unbiased estimating equation using the leading eigenvectors of the data matrix. However, the estimator obtained by directly solving the estimating equation could
-
E-values as unnormalized weights in multiple testing Biometrika (IF 2.7) Pub Date : 2023-09-15 Nikolaos Ignatiadis, Ruodu Wang, Aaditya Ramdas
Summary We study how to combine p-values and e-values, and design multiple testing procedures where both p-values and e-values are available for every hypothesis. Our results provide a new perspective on multiple testing with data-driven weights: while standard weighted multiple testing methods require the weights to deterministically add up to the number of hypotheses being tested, we show that this
-
Retrospective causal inference with multiple effect variables Biometrika (IF 2.7) Pub Date : 2023-09-12 Wei Li, Zitong Lu, Jinzhu Jia, Min Xie, Zhi Geng
Summary As highlighted in Dawid (2000) and Pearl & Mackenzie (2018), deducing the causes of given effects is a more challenging problem than evaluating the effects of causes in causal inference. Lu et al. (2023) proposed an approach for deducing causes of a single effect variable based on posterior causal effects. In many applications, there are multiple effect variables, and thus they can be used
-
Estimation of prediction error in time series Biometrika (IF 2.7) Pub Date : 2023-09-09 Alexander Aue, Prabir Burman
Summary The accurate estimation of prediction errors in time series is an important problem. It immediately affects the accuracy of prediction intervals but also the quality of a number of widely used time series model selection criteria such as AIC and others. Except for simple cases, however, it is difficult or even infeasible to obtain exact analytical expressions for one-step and multi-step predictions
-
Order-based Structure Learning without Score Equivalence Biometrika (IF 2.7) Pub Date : 2023-09-05 Hyunwoong Chang, James Cai, Quan Zhou
Summary We propose an empirical Bayes formulation of the structure learning problem, where the prior specification assumes that all node variables have the same error variance, an assumption known to ensure the identifiability of the underlying causal directed acyclic graph (DAG). To facilitate efficient posterior computation, we approximate the posterior probability of each ordering by that of a best
-
More Efficient Exact Group Invariance Testing: using a Representative Subgroup Biometrika (IF 2.7) Pub Date : 2023-09-01 N W Koning, J Hemerik
We consider testing invariance of a distribution under an algebraic group of transformations, such as permutations or sign-flips. As such groups are typically huge, tests based on the full group are often computationally infeasible. Hence, it is standard practice to use a random subset of transformations. We improve upon this by replacing the random subset with a strategically chosen, fixed subgroup
-
Deep Kronecker Network Biometrika (IF 2.7) Pub Date : 2023-08-30 Long Feng, Guang Yang
Summary We develop a novel framework named Deep Kronecker Network for the analysis of medical imaging data, including magnetic resonance imaging (MRI), functional MRI, computed tomography, and more. Medical imaging data differs from general images in two main aspects: i) the sample size is often considerably smaller, and ii) the interpretation of the model is usually more crucial than predicting the
-
Kernel interpolation generalizes poorly Biometrika (IF 2.7) Pub Date : 2023-08-04 Yicheng Li, Haobo Zhang, Qian Lin
One of the most interesting problems in the recent renaissance of the studies in kernel regression might be whether kernel interpolation can generalize well, since it may help us understand the ‘benign overfitting phenomenon’ reported in the literature on deep networks. In this paper, under mild conditions, we show that for any ε > 0, the generalization error of kernel interpolation is lower bounded
-
τ-censored weighted Benjamini-Hochberg procedures under independence Biometrika (IF 2.7) Pub Date : 2023-08-02 Haibing Zhao, Huijuan Zhou
In the field of multiple hypothesis testing, auxiliary information can be leveraged to enhance the efficiency of test procedures. A common way to make use of auxiliary information is by weighting p-values. However, when the weights are learned from data, controlling the finite-sample false discovery rate becomes challenging, and most existing weighted procedures only guarantee false discovery rate
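The idea of weighting p-values mentioned above can be illustrated with the classical weighted Benjamini-Hochberg procedure for fixed, prespecified weights. This is a minimal sketch under that assumption: it is not the τ-censored procedure of the paper and it does not address weights learned from data. The function name and the requirement that the weights sum to the number of hypotheses are illustrative choices taken from the standard fixed-weight setting.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """Indices rejected by weighted Benjamini-Hochberg with fixed weights.

    Each p-value is divided by its weight, and the usual BH step-up rule
    is applied to the reweighted values. Weights are assumed nonnegative
    and fixed in advance, summing to the number of hypotheses.
    """
    m = len(pvalues)
    if m == 0 or len(weights) != m:
        raise ValueError("pvalues and weights must be non-empty, same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - m) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to m")
    # A zero weight means the hypothesis can never be rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: q[i])
    # Step-up rule: largest rank k with q_(k) <= alpha * k / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])
```

With unit weights this reduces to ordinary Benjamini-Hochberg; up-weighting a hypothesis lowers its reweighted p-value and can move it into the rejection set at the expense of down-weighted ones.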
-
Covariate-Adjusted Log-Rank Test: Guaranteed Efficiency Gain and Universal Applicability Biometrika (IF 2.7) Pub Date : 2023-07-27 Ting Ye, Jun Shao, Yanyao Yi
Summary Nonparametric covariate adjustment is considered for log-rank type tests of treatment effect with right-censored time-to-event data from clinical trials applying covariate-adaptive randomization. Our proposed covariate-adjusted log-rank test has a simple explicit formula and a guaranteed efficiency gain over the unadjusted test. We also show that our proposed test achieves universal applicability
-
Online Inference with Debiased Stochastic Gradient Descent Biometrika (IF 2.7) Pub Date : 2023-07-27 Ruijian Han, Lan Luo, Yuanyuan Lin, Jian Huang
Summary We propose a debiased stochastic gradient descent algorithm for online statistical inference with high-dimensional data. Our approach combines the debiasing technique developed in high-dimensional statistics with the stochastic gradient descent algorithm. It can be used for efficiently constructing confidence intervals in an online fashion. Our proposed algorithm has several appealing aspects:
-
An anomaly arising in the analysis of processes with more than one source of variability Biometrika (IF 2.7) Pub Date : 2023-07-18 H S Battey, Peter McCullagh
Summary It is frequently observed in practice that the Wald statistic gives a poor assessment of the statistical significance of a variance component. This paper provides detailed analytic insight into the phenomenon by way of two simple models, which point to an atypical geometry as the source of the aberration. The latter can in principle be checked numerically to cover situations of arbitrary complexity
-
Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves Biometrika (IF 2.7) Pub Date : 2023-07-05 R Singh, L Xu, A Gretton
Summary We propose estimators based on kernel ridge regression for nonparametric causal functions such as dose, heterogeneous and incremental response curves. The treatment and covariates may be discrete or continuous in general spaces. Due to a decomposition property specific to the reproducing kernel Hilbert space, our estimators have simple closed form solutions. We prove uniform consistency with
-
A cross-validation-based statistical theory for point processes Biometrika (IF 2.7) Pub Date : 2023-06-27 Ottmar Cronie, Mehdi Moradi, Christophe A N Biscio
Motivated by cross-validation’s general ability to reduce overfitting and mean square error, we develop a cross-validation-based statistical theory for general point processes. It is based on the combination of two novel concepts for general point processes: cross-validation and prediction errors. Our cross-validation approach uses thinning to split a point process/pattern into pairs of training and
-
Robust Sample Weighting to Facilitate Individualized Treatment Rule Learning for a Target Population Biometrika (IF 2.7) Pub Date : 2023-06-19 Rui Chen, Jared D Huling, Guanhua Chen, Menggang Yu
Summary Learning individualized treatment rules is an important topic in precision medicine. Current literature mainly focuses on deriving individualized treatment rules from a single source population. We consider the observational data setting when the source population differs from a target population of interest. Compared with causal generalization for the average treatment effect which is a scalar
-
A mark-specific quantile regression model Biometrika (IF 2.7) Pub Date : 2023-06-18 Lianqiang Qu, Liuquan Sun, Yanqing Sun
Summary Quantile regression has become a widely used tool for analysing competing risks data. However, quantile regression for competing risks data with a continuous mark is still scarce. The mark variable is an extension of cause-of-failure in a classical competing risks model where cause of failure is replaced by a continuous mark only observed at uncensored failure times. An example of the continuous
-
Interpolating Discriminant Functions in High-Dimensional Gaussian Latent Mixtures Biometrika (IF 2.7) Pub Date : 2023-06-08 Xin Bing, Marten Wegkamp
Summary This paper considers binary classification of high-dimensional features under a postulated model with a low-dimensional latent Gaussian mixture structure and nonvanishing noise. A generalized least squares estimator is used to estimate the direction of the optimal separating hyperplane. The estimated hyperplane is shown to interpolate on the training data. While the direction vector can be
-
No-harm calibration for generalized Oaxaca–Blinder estimators Biometrika (IF 2.7) Pub Date : 2023-06-08 P L Cohen, C B Fogarty
Summary In randomized experiments, adjusting for observed features when estimating treatment effects has been proposed as a way to improve asymptotic efficiency. However, among parametric methods, only linear regression has been proven to form an estimate of the average treatment effect that is asymptotically no less efficient than the treated-minus-control difference in means regardless of the true
-
One-step TMLE for targeting cause-specific absolute risks and survival curves Biometrika (IF 2.7) Pub Date : 2023-05-25 H C W Rytgaard, M J van der Laan
This paper considers one-step targeted maximum likelihood estimation methodology for multi-dimensional causal parameters in general survival and competing risks settings where event times take place on the positive real line ℝ+ and are subject to right-censoring. We focus on effects of baseline treatment decisions possibly confounded by pre-treatment covariates, but remark that our work generalizes
-
A linear adjustment based approach to posterior drift in transfer learning Biometrika (IF 2.7) Pub Date : 2023-05-11 Subha Maity, Diptavo Dutta, Jonathan Terhorst, Yuekai Sun, Moulinath Banerjee
We present new models and methods for the posterior drift problem where the regression function in the target domain is modelled as a linear adjustment, on an appropriate scale, of that in the source domain, and study the theoretical properties of our proposed estimators in the binary classification problem. The core idea of our model inherits the simplicity and the usefulness of generalized linear
-
Bayesian learning of network structures from interventional experimental data Biometrika (IF 2.7) Pub Date : 2023-05-11 F Castelletti, S Peluso
Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts however, observational measurements are supplemented by interventional data that improve
-
Universal Robust Regression via Maximum Mean Discrepancy Biometrika (IF 2.7) Pub Date : 2023-05-11 P Alquier, M Gerber
Many modern datasets are collected automatically and are thus easily contaminated by outliers. This has led to a renewed interest in robust estimation, including new notions of robustness such as robustness to adversarial contamination of the data. However, most robust estimation methods are designed for a specific model. Notably, many methods were proposed recently to obtain robust estimators in linear
-
Treatment Effect Quantiles in Stratified Randomized Experiments and Matched Observational Studies Biometrika (IF 2.7) Pub Date : 2023-05-08 Yongchang Su, Xinran Li
Summary Evaluating the treatment effect has become an important topic for many applications. However, most existing literature focuses mainly on average treatment effects. When the individual effects are heavy-tailed or have outlier values, not only may the average effect not be appropriate for summarizing treatment effects, but also the conventional inference for it can be sensitive and possibly invalid
-
Characterizing M-estimators Biometrika (IF 2.7) Pub Date : 2023-05-08 Timo Dimitriadis, Tobias Fissler, Johanna Ziegel
Summary We characterize the full classes of M-estimators for semiparametric models of general functionals by formally connecting the theory of consistent loss functions from forecast evaluation with the theory of M-estimation. This novel characterization result allows us to leverage existing results on loss functions known from the literature on forecast evaluation in estimation theory. We exemplify
-
Power and Sample Size Calculations for Rerandomization Biometrika (IF 2.7) Pub Date : 2023-05-03 Zach Branson, Xinran Li, Peng Ding
Summary Power analyses are an important aspect of experimental design, because they help determine how experiments are implemented in practice. It is common to specify a desired level of power and compute the sample size necessary to obtain that power. Such calculations are well-known for completely randomized experiments, but there can be many benefits to using other experimental designs. For example
-
Statistical summaries of unlabelled evolutionary trees Biometrika (IF 2.7) Pub Date : 2023-04-26 Rajanala Samyak, Julia A Palacios
Summary Rooted and ranked phylogenetic trees are mathematical objects that are useful in modelling hierarchical data and evolutionary relationships with applications to many fields such as evolutionary biology and genetic epidemiology. Bayesian phylogenetic inference usually explores the posterior distribution of trees via Markov Chain Monte Carlo methods. However, assessing uncertainty and summarizing
-
Populations of Unlabelled Networks: Graph Space Geometry and Generalized Geodesic Principal Components Biometrika (IF 2.7) Pub Date : 2023-04-04 Anna Calissano, Aasa Feragen, Simone Vantini
Summary Statistical analysis for populations of networks is widely applicable but challenging as networks have strongly non-Euclidean behaviour. Graph space is an exhaustive framework for studying populations of unlabelled networks which are weighted or unweighted, uni- or multi-layered, directed or undirected. Viewing graph space as the quotient of a Euclidean space with respect to a finite group
-
Tailored inference for finite populations: conditional validity and transfer across distributions Biometrika (IF 2.7) Pub Date : 2023-04-02 Ying Jin, Dominik Rothenhäusler
Parameters of sub-populations can be more relevant than those of super-populations. For example, a healthcare provider may be interested in the effect of a treatment plan for a specific subset of their patients; policymakers may be concerned with the impact of a policy in a particular state within a given population. In these cases, the focus is on a specific finite population, as opposed to an infinite
-
Scalable subsampling: computation, aggregation and inference Biometrika (IF 2.7) Pub Date : 2023-03-18 Dimitris N Politis
Subsampling has seen a resurgence in the Big Data era where the standard, full-resample size bootstrap can be infeasible to compute. Nevertheless, even choosing a single random subsample of size b can be computationally challenging with both b and the sample size n being very large. The paper at hand shows how a set of appropriately chosen, non-random subsamples can be used to conduct effective—and
-
Causal inference with misspecified exposure mappings: separating definitions and assumptions Biometrika (IF 2.7) Pub Date : 2023-03-16 F Sävje
Exposure mappings facilitate investigations of complex causal effects when units interact in experiments. Current methods require experimenters to use the same exposure mappings both to define the effect of interest and to impose assumptions on the interference structure. However, the two roles rarely coincide in practice, and experimenters are forced to make the often questionable assumption that
-
√2-Estimation for Smooth Eigenvectors of Matrix-Valued Functions Biometrika (IF 2.7) Pub Date : 2023-03-15 Giovanni Motta, Wei Biao Wu, Mohsen Pourahmadi
Summary Modern statistical methods for multivariate time series rely on the eigendecomposition of matrix-valued functions such as time-varying covariance and spectral density matrices. The curse of indeterminacy or misidentification of smooth eigenvector functions has not received much attention. We resolve this important problem and recover smooth trajectories by examining the distance between the