
-
Graphical Models for Processing Missing Data J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2021-01-14 Karthika Mohan; Judea Pearl
Abstract This paper reviews recent advances in missing data research using graphical models to represent multivariate dependencies. We first examine the limitations of traditional frameworks from three different perspectives: transparency, estimability and testability. We then show how procedures based on graphical models can overcome these limitations and provide meaningful performance guarantees
-
High-Dimensional Spatial Quantile Function-on-Scalar Regression J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2021-01-06 Zhengwu Zhang; Xiao Wang; Linglong Kong; Hongtu Zhu
Abstract This paper develops a novel spatial quantile function-on-scalar regression model, which studies the conditional spatial distribution of a high-dimensional functional response given scalar predictors. With the strength of both quantile regression and copula modeling, we are able to explicitly characterize the conditional distribution of the functional or image response on the whole spatial
-
Discovering Heterogeneous Exposure Effects Using Randomization Inference in Air Pollution Studies J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2021-01-06 Kwonsang Lee; Dylan S. Small; Francesca Dominici
Abstract Several studies have provided strong evidence that long-term exposure to air pollution, even at low levels, increases risk of mortality. As regulatory actions are becoming prohibitively expensive, robust evidence to guide the development of targeted interventions to protect the most vulnerable is needed. In this paper, we introduce a novel statistical method that (i) discovers subgroups whose
-
A semiparametric kernel independence test with application to mutational signatures J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2021-01-06 DongHyuk Lee; Bin Zhu
Abstract Cancers arise owing to somatic mutations, and the characteristic combinations of somatic mutations form mutational signatures. Despite many mutational signatures being identified, mutational processes underlying a number of mutational signatures remain unknown, which hinders the identification of interventions that may reduce somatic mutation burdens and prevent the development of cancer.
-
Estimation of Optimal Individualized Treatment Rules Using a Covariate-Specific Treatment Effect Curve with High-dimensional Covariates J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-22 Wenchuan Guo; Xiao-Hua Zhou; Shujie Ma
Abstract With a large number of baseline covariates, we propose a new semi-parametric modeling strategy for heterogeneous treatment effect estimation and individualized treatment selection, which are two major goals in personalized medicine. We achieve the first goal through estimating a covariate-specific treatment effect (CSTE) curve modeled as an unknown function of a weighted linear combination
-
A Bayesian State-Space Approach to Mapping Directional Brain Networks J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-22 Huazhang Li; Yaotian Wang; Guofen Yan; Yinge Sun; Seiji Tanabe; Chang-Chia Liu; Mark Quigg; Tingting Zhang
Abstract The human brain is a directional network system of brain regions involving directional connectivity. Seizures are a directional network phenomenon as abnormal neuronal activities start from a seizure onset zone (SOZ) and propagate to otherwise healthy regions. To localize the SOZ of an epileptic patient, clinicians use intracranial EEG (iEEG) to record the patient’s intracranial brain activity
-
A Tuning-free Robust and Efficient Approach to High-dimensional Regression J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Lan Wang; Bo Peng; Jelena Bradic; Runze Li; Yunan Wu
Abstract We introduce a novel approach for high-dimensional regression with theoretical guarantees. The new procedure overcomes the challenge of tuning parameter selection of Lasso and possesses several appealing properties. It uses an easily simulated tuning parameter that automatically adapts to both the unknown random error distribution and the correlation structure of the design matrix. It is robust
-
Comment on “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression” J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Po-Ling Loh
(2020). Comment on “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 1715-1716.
-
Discussion of “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression” J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Xiudi Li; Ali Shojaie
(2020). Discussion of “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 1717-1719.
-
Comment on “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression” J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Jianqing Fan; Cong Ma; Kaizheng Wang
(2020). Comment on “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 1720-1725.
-
Rejoinder to “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression” J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Lan Wang; Bo Peng; Jelena Bradic; Runze Li; Yunan Wu
(2020). Rejoinder to “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 1726-1729.
-
Handbook of Approximate Bayesian Computation. J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Jordan J. Franks
(2020). Handbook of Approximate Bayesian Computation. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 2100-2101.
-
Handbook of Mixture Analysis. J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Yen-Chi Chen
(2020). Handbook of Mixture Analysis. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 2101-2102.
-
The Statistical Analysis of Multivariate Failure Time Data: A Marginal Modeling Approach. J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18 Richard J. Cook
(2020). The Statistical Analysis of Multivariate Failure Time Data: A Marginal Modeling Approach. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 2102-2104.
-
Editorial Collaborators J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-18
(2020). Editorial Collaborators. Journal of the American Statistical Association: Vol. 115, No. 532, pp. 2105-2113.
-
Nonparametric tests of the causal null with non-discrete exposures J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-17 Ted Westling
Abstract In many scientific studies, it is of interest to determine whether an exposure has a causal effect on an outcome. In observational studies, this is a challenging task due to the presence of confounding variables that affect both the exposure and the outcome. Many methods have been developed to test for the presence of a causal effect when all such confounding variables are observed and when
-
Semi-parametric multinomial logistic regression for multivariate point pattern data J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-17 Kristian Bjørn Hessellund; Ganggang Xu; Yongtao Guan; Rasmus Waagepetersen
Abstract We propose a new method for analysis of multivariate point pattern data observed in a heterogeneous environment and with complex intensity functions. We suggest semi-parametric models for the intensity functions that depend on an unspecified factor common to all types of points. This is for example well suited for analyzing spatial covariate effects on events such as street crime activities
-
Survival regression models with dependent Bayesian nonparametric priors J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-17 Alan Riva-Palacio; Fabrizio Leisen; Jim Griffin
Abstract We present a novel Bayesian nonparametric model for regression in survival analysis. Our model builds on the classical neutral to the right model of Doksum (1974) and on the Cox proportional hazards model of Kim and Lee (2003). The use of a vector of dependent Bayesian nonparametric priors allows us to efficiently model the hazard as a function of covariates whilst allowing nonproportionality
-
Covariate Information Number for Feature Screening in Ultrahigh-Dimensional Supervised Problems J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-16 Debmalya Nandy; Francesca Chiaromonte; Runze Li
Abstract Contemporary high-throughput experimental and surveying techniques give rise to ultrahigh-dimensional supervised problems with sparse signals; that is, a limited number of observations (n), each with a very large number of covariates ($p \gg n$), only a small share of which is truly associated with the response. In these settings, major concerns on computational burden, algorithmic stability
-
Sensitivity Analysis via the Proportion of Unmeasured Confounding J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-16 Matteo Bonvini; Edward H. Kennedy
Abstract In observational studies, identification of ATEs is generally achieved by assuming that the correct set of confounders has been measured and properly included in the relevant models. Because this assumption is both strong and untestable, a sensitivity analysis should be performed. Common approaches include modeling the bias directly or varying the propensity scores to probe the effects of
-
Statistical Modeling for Spatio-Temporal Data from Stochastic Convection-Diffusion Processes J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-14 Xiao Liu; Kyongmin Yeo; Siyuan Lu
Abstract This paper proposes a physical-statistical modeling approach for spatio-temporal data arising from a class of stochastic convection-diffusion processes. Such processes are widely found in scientific and engineering applications where fundamental physics imposes critical constraints on how data can be modeled and how models should be interpreted. The idea of spectrum decomposition is employed
-
Estimating Malaria Vaccine Efficacy in the Absence of a Gold Standard Case Definition: Mendelian Factorial Design J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-14 Raiden B. Hasegawa; Dylan S. Small
Abstract Accurate estimates of malaria vaccine efficacy require a reliable definition of a malaria case. However, the symptoms of clinical malaria are unspecific, overlapping with other childhood illnesses. Additionally, children in endemic areas tolerate varying levels of parasitemia without symptoms. Together, this makes finding a gold-standard case definition challenging. We present a method to
-
Multicategory Angle-based Learning for Estimating Optimal Dynamic Treatment Regimes with Censored Data J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-14 Fei Xue; Yanqing Zhang; Wenzhuo Zhou; Haoda Fu; Annie Qu
Abstract An optimal dynamic treatment regime (DTR) consists of a sequence of decision rules in maximizing long-term benefits, which is applicable for chronic diseases such as HIV infection or cancer. In this paper, we develop a novel angle-based approach to search the optimal DTR under a multicategory treatment framework for survival data. The proposed method targets to maximize the conditional survival
-
Minimax efficient random experimental design strategies with application to model-robust design for prediction J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-14 Timothy W. Waite; David C. Woods
Abstract In game theory and statistical decision theory, a random (i.e. mixed) decision strategy often outperforms a deterministic strategy in minimax expected loss. As experimental design can be viewed as a game pitting the Statistician against Nature, the use of a random strategy to choose a design will often be beneficial. However, the topic of minimax-efficient random strategies for design selection
-
Biased Encouragements and Heterogeneous Effects in an Instrumental Variable Study of Emergency General Surgical Outcomes J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-14 Colin B. Fogarty; Kwonsang Lee; Rachel R. Kelz; Luke J. Keele
Abstract We investigate the efficacy of surgical versus non-surgical management for two gastrointestinal conditions, colitis and diverticulitis, using observational data. We deploy an instrumental variable design with surgeons’ tendencies to operate as an instrument. Assuming instrument validity, we find that non-surgical alternatives can reduce both hospital length of stay and the risk of complications
-
Optimal design of experiments for implicit models J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-11 Belmiro P.M. Duarte; Anthony C. Atkinson; José F.O. Granjo; Nuno M.C. Oliveira
Abstract Explicit models representing the response variables as functions of the control variables are standard in virtually all scientific fields. For these models there is a vast literature on the optimal design of experiments to provide good estimates of the parameters with the use of minimal resources. Contrarily, the optimal design of experiments for implicit models is more complex and has not
-
Modeling High-Dimensional Time Series: A Factor Model with Dynamically Dependent Factors and Diverging Eigenvalues J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-11 Zhaoxing Gao; Ruey S. Tsay
Abstract This article proposes a new approach to modeling high-dimensional time series by treating a p-dimensional time series as a nonsingular linear transformation of certain common factors and idiosyncratic components. Unlike the approximate factor models, we assume that the factors capture all the non-trivial dynamics of the data, but the cross-sectional dependence may be explained by both the
-
Semiparametric Inference for Non-monotone Missing-Not-at-Random Data: the No Self-Censoring Model J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-11 Daniel Malinsky; Ilya Shpitser; Eric J. Tchetgen Tchetgen
Abstract We study the identification and estimation of statistical functionals of multivariate data missing non-monotonically and not-at-random, taking a semiparametric approach. Specifically, we assume that the missingness mechanism satisfies what has been previously called “no self-censoring” or “itemwise conditionally independent nonresponse,” which roughly corresponds to the assumption that no
-
Topic Modeling on Triage Notes with Semi-orthogonal Non-negative Matrix Factorization J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-11 Yutong Li; Ruoqing Zhu; Annie Qu; Han Ye; Zhankun Sun
Abstract Emergency Department (ED) crowding is a universal health issue that affects the efficiency of hospital management and patient care quality. ED crowding frequently occurs when a request for a ward-bed for a patient is delayed until a doctor makes an admission decision. In this case study, we build a classifier to predict the disposition of patients using manually typed nurse notes collected
-
IFAA: Robust association identification and Inference For Absolute Abundance in microbiome analyses J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-10 Zhigang Li; Lu Tian; A. James O’Malley; Margaret R. Karagas; Anne G. Hoen; Brock C. Christensen; Juliette C. Madan; Quran Wu; Raad Z. Gharaibeh; Christian Jobin; Hongzhe Li
Abstract The target of inference in microbiome analyses is usually relative abundance (RA) because RA in a sample (e.g., stool) can be considered as an approximation of RA in an entire ecosystem (e.g., gut). However, inference on RA suffers from the fact that RA are calculated by dividing absolute abundances (AA) over the common denominator (CD), the summation of all AA (i.e., library size). Because
-
Constrained Functional Regression of National Forest Inventory Data over Time Using Remote Sensing Observations J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-10 Md Kamrul Hasan Khan; Avishek Chakraborty; Giovanni Petris; Barry T. Wilson
Abstract The USDA Forest Service uses satellite imagery, along with a sample of national forest inventory field plots, to monitor and predict changes in forest conditions over time throughout the United States. We specifically focus on a 230,400 hectare region in north-central Wisconsin between 2003-2012. The auxiliary data from the satellite imagery of this region are relatively dense in space
-
Hierarchical Transformed Scale Mixtures for Flexible Modeling of Spatial Extremes on Datasets with Many Locations J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-07 Likun Zhang; Benjamin A. Shaby; Jennifer L. Wadsworth
Abstract Flexible spatial models that allow transitions between tail dependence classes have recently appeared in the literature. However, inference for these models is computationally prohibitive, even in moderate dimensions, due to the necessity of repeatedly evaluating the multivariate Gaussian distribution function. In this work, we attempt to achieve truly high-dimensional inference for extremes
-
Multiscale quantile segmentation J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-07 Laura Jula Vanegas; Merle Behr; Axel Munk
Abstract We introduce a new methodology for analyzing serial data by quantile regression assuming that the underlying quantile function consists of constant segments. The procedure does not rely on any distributional assumption besides serial independence. It is based on a multiscale statistic, which allows one to control the (finite sample) probability for selecting the correct number of segments S at
-
LAWS: A Locally Adaptive Weighting and Screening Approach To Spatial Multiple Testing J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-12-07 T. Tony Cai; Wenguang Sun; Yin Xia
Abstract Exploiting spatial patterns in large-scale multiple testing promises to improve both power and interpretability of false discovery rate (FDR) analyses. This article develops a new class of locally adaptive weighting and screening (LAWS) rules that directly incorporates useful local patterns into inference. The idea involves constructing robust and structure-adaptive weights according to the
-
Efficient estimation of optimal regimes under a no direct effect assumption* J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-30 Lin Liu; Zach Shahn; James M. Robins; Andrea Rotnitzky
Abstract We derive new estimators of an optimal joint testing and treatment regime under the no direct effect assumption that a given laboratory, diagnostic, or screening test has no effect on a patient’s clinical outcomes except through the effect of the test results on the choice of treatment. We model the optimal joint strategy with an optimal structural nested mean model (opt-SNMM). The proposed
-
High-dimensional vector autoregressive time series modeling via tensor decomposition J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-30 Di Wang; Yao Zheng; Heng Lian; Guodong Li
Abstract The classical vector autoregressive model is a fundamental tool for multivariate time series analysis. However, it involves too many parameters when the number of time series and lag order are even moderately large. This paper proposes to rearrange the transition matrices of the model into a tensor form such that the parameter space can be restricted along three directions simultaneously via
-
Mehler’s Formula, Branching Process, and Compositional Kernels of Deep Neural Networks J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-20 Tengyuan Liang; Hai Tran-Bach
Abstract We utilize a connection between compositional kernels and branching processes via Mehler’s formula to study deep neural networks. This new probabilistic insight provides us a novel perspective on the mathematical role of activation functions in compositional neural networks. We study the unscaled and rescaled limits of the compositional kernels and explore the different phases of the limiting
-
On constraining projections of future climate using observations and simulations from multiple climate models J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-18 Philip G. Sansom; David B. Stephenson; Thomas J. Bracegirdle
Abstract Numerical climate models are used to project future climate change due to both anthropogenic and natural causes. Differences between projections from different climate models are a major source of uncertainty about future climate. Emergent relationships shared by multiple climate models have the potential to constrain our uncertainty when combined with historical observations. We combine projections
-
Predicting the Number of Future Events J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-16 Qinglong Tian; Fanqi Meng; Daniel J. Nordman; William Q. Meeker
Abstract This paper describes prediction methods for the number of future events from a population of units associated with an on-going time-to-event process. Examples include the prediction of warranty returns and the prediction of the number of future product failures that could cause serious threats to property or life. Important decisions such as whether a product recall should be mandated are
-
Gaining Outlier Resistance with Progressive Quantiles: Fast Algorithms and Theoretical Studies J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-16 Yiyuan She; Zhifeng Wang; Jiahui Shen
Abstract Outliers widely occur in big-data applications and may severely affect statistical estimation and inference. In this paper, a framework of outlier-resistant estimation is introduced to robustify an arbitrarily given loss function. It has a close connection to the method of trimming and includes explicit outlyingness parameters for all samples, which in turn facilitates computation, theory
-
Functional Sequential Treatment Allocation* J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-16 Anders Bredahl Kock; David Preinerstorfer; Bezirgen Veliyev
Abstract Consider a setting in which a policy maker assigns subjects to treatments, observing each outcome before the next subject arrives. Initially, it is unknown which treatment is best, but the sequential nature of the problem permits learning about the effectiveness of the treatments. While the multi-armed-bandit literature has shed much light on the situation when the policy maker compares the
-
Stochastic gradient Markov chain Monte Carlo J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-11 Christopher Nemeth; Paul Fearnhead
Abstract Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that performing exact inference generally requires all of the data to be processed at each iteration of the algorithm. For large data sets, the computational cost
-
Variational Bayes for high-dimensional linear regression with sparse priors J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-09 Kolyan Ray; Botond Szabó
Abstract We study a mean-field spike and slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression. Under compatibility conditions on the design matrix, oracle inequalities are derived for the mean-field VB approximation, implying that it converges to the sparse truth at the optimal rate and gives optimal prediction of the response vector
-
Coupled generation* J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-02 Ben Dai; Xiaotong Shen; Wing Wong
Abstract Instance generation creates representative examples to interpret a learning model, as in regression and classification. For example, representative sentences of a topic of interest describe the topic specifically for sentence categorization. In such a situation, a large number of unlabeled observations may be available in addition to labeled data, for example, many unclassified text corpora
-
Smaller p-values via indirect information J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-02 Peter Hoff
Abstract This article develops p-values for evaluating means of normal populations that make use of indirect or prior information. A p-value of this type is based on a biased frequentist hypothesis test that has optimal average power with respect to a probability distribution that encodes indirect information about the mean parameter, resulting in a smaller p-value if the indirect information is accurate
-
Regression Analysis of Asynchronous Longitudinal Functional and Scalar Data J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-11-02 Ting Li; Tengfei Li; Zhongyi Zhu; Hongtu Zhu
Abstract Many modern large-scale longitudinal neuroimaging studies, such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study, have collected/are collecting asynchronous scalar and functional variables that are measured at distinct time points. The analyses of temporally asynchronous functional and scalar variables pose major technical challenges to many existing statistical approaches.
-
A Multi-resolution Theory for Approximating Infinite-p-Zero-n: Transitional Inference, Individualized Predictions, and a World Without Bias-Variance Trade-off J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-29 Xinran Li; Xiao-Li Meng
Abstract Transitional inference is an empiricism concept, rooted and practiced in clinical medicine since ancient Greece. Knowledge and experiences gained from treating one entity (e.g., a disease or a group of patients) are applied to treat a related but distinctively different one (e.g., a similar disease or a new patient). This notion of “transition to the similar” renders individualized treatments
-
Markov Neighborhood Regression for High-Dimensional Inference J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-28 Faming Liang; Jingnan Xue; Bochao Jia
Abstract This paper proposes an innovative method for constructing confidence intervals and assessing p-values in statistical inference for high-dimensional linear models. The proposed method has successfully broken the high-dimensional inference problem into a series of low-dimensional inference problems: For each regression coefficient $\beta_i$, the confidence interval and p-value are computed by regressing
-
Random Partition Models for Microclustering Tasks J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-28 Brenda Betancourt; Giacomo Zanella; Rebecca C. Steorts
Abstract Traditional Bayesian random partition models assume that the size of each cluster grows linearly with the number of data points. While this is appealing for some applications, this assumption is not appropriate for other tasks such as entity resolution, modeling of sparse networks, and DNA sequencing tasks. Such applications require models that yield clusters whose sizes grow sublinearly with
-
Inter-Subject Analysis: A Partial Gaussian Graphical Model Approach J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-28 Cong Ma; Junwei Lu; Han Liu
Abstract Different from traditional intra-subject analysis, the goal of Inter-Subject Analysis (ISA) is to explore the dependency structure between different subjects with the intra-subject dependency as nuisance. ISA has important applications in neuroscience to study the functional connectivity between brain regions under natural stimuli. We propose a modeling framework for ISA that is based on Gaussian
-
Nonlinear spectral analysis: A local Gaussian approach J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-26 Lars Arne Jordanger; Dag Tjøstheim
Abstract The spectral distribution $f(\omega)$ of a stationary time series $\{Y_t\}_{t \in \mathbb{Z}}$ can be used to investigate whether or not periodic structures are present in $\{Y_t\}_{t \in \mathbb{Z}}$, but $f(\omega)$ has some limitations due to its dependence on the autocovariances $\gamma(h)$. For example, $f(\omega)$ cannot distinguish white i.i.d. noise from GARCH-type models (whose terms are dependent, but uncorrelated), which
-
Heteroscedasticity-Adjusted Ranking and Thresholding for Large-Scale Multiple Testing J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-26 Luella Fu; Bowen Gang; Gareth M. James; Wenguang Sun
Abstract Standardization has been a widely adopted practice in multiple testing, for it takes into account the variability in sampling and makes the test statistics comparable across different study units. However, despite conventional wisdom to the contrary, we show that there can be a significant loss in information from basing hypothesis tests on standardized statistics rather than the full data
-
Asymptotic Theory of Eigenvectors for Random Matrices with Diverging Spikes* J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-23 Jianqing Fan; Yingying Fan; Xiao Han; Jinchi Lv
Abstract Characterizing the asymptotic distributions of eigenvectors for large random matrices poses important challenges yet can provide useful insights into a range of statistical applications. To this end, in this paper we introduce a general framework of asymptotic theory of eigenvectors (ATE) for large spiked random matrices with diverging spikes and heterogeneous variances, and establish the
-
Robust Post-Matching Inference J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-23 Alberto Abadie; Jann Spiess
Abstract Nearest-neighbor matching is a popular nonparametric tool to create balance between treatment and control groups in observational studies. As a preprocessing step before regression, matching reduces the dependence on parametric modeling assumptions. In current empirical practice, however, the matching step is often ignored in the calculation of standard errors and confidence intervals. In
-
Learning Latent Factors from Diversified Projections and its Applications to Over-Estimated and Weak Factors J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-20 Jianqing Fan; Yuan Liao
Abstract Estimations and applications of factor models often rely on the crucial condition that the number of latent factors is consistently estimated, which in turn also requires that factors be relatively strong, data are stationary and weak serial dependence, and the sample size be fairly large, although in practical applications, one or several of these conditions may fail. In these cases it is
-
Privacy-Preserving Parametric Inference: A Case for Robust Statistics J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-16 Marco Avella-Medina
Abstract Differential privacy is a cryptographically motivated approach to privacy that has become a very active field of research over the last decade in theoretical computer science and machine learning. In this paradigm, one assumes there is a trusted curator who holds the data of individuals in a database and the goal of privacy is to simultaneously protect individual data while allowing the release
-
BAGS: A Bayesian Adaptive Group Sequential Trial Design with Subgroup-Specific Survival Comparisons J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-16 Ruitao Lin; Peter F. Thall; Ying Yuan
Abstract A Bayesian group sequential design is proposed that performs survival comparisons within patient subgroups in randomized trials where treatment–subgroup interactions may be present. A latent subgroup membership variable is assumed to allow the design to adaptively combine homogeneous subgroups, or split heterogeneous subgroups, to improve the procedure’s within-subgroup power. If a baseline
-
Hierarchical community detection by recursive partitioning J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-08 Tianxi Li; Lihua Lei; Sharmodeep Bhattacharyya; Koen Van den Berge; Purnamrita Sarkar; Peter J. Bickel; Elizaveta Levina
Abstract The problem of community detection in networks is usually formulated as finding a single partition of the network into some “correct” number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community
-
Highly Scalable Bayesian Geostatistical Modeling via Meshed Gaussian Processes on Partitioned Domains J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-08 Michele Peruzzi; Sudipto Banerjee; Andrew O. Finley
Abstract We introduce a class of scalable Bayesian hierarchical models for the analysis of massive geostatistical datasets. The underlying idea combines ideas on high-dimensional geostatistics by partitioning the spatial domain and modeling the regions in the partition using a sparsity-inducing directed acyclic graph (DAG). We extend the model over the DAG to a well-defined spatial process, which we
-
Causal bounds for outcome-dependent sampling in observational studies J. Am. Stat. Assoc. (IF 3.989) Pub Date : 2020-10-06 Erin E. Gabriel; Michael C. Sachs; Arvid Sjölander
Abstract Outcome-dependent sampling designs are common in many different scientific fields including epidemiology, ecology, and economics. As with all observational studies, such designs often suffer from unmeasured confounding, which generally precludes the nonparametric identification of causal effects. Nonparametric bounds can provide a way to narrow the range of possible values for a nonidentifiable