
-
Tail Spectral Density Estimation and Its Uncertainty Quantification: Another Look at Tail Dependent Time Series Analysis J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-29 Ting Zhang, Beibei Xu
Abstract We consider the estimation and uncertainty quantification of the tail spectral density, which provide a foundation for tail spectral analysis of tail dependent time series. The tail spectral density has a particular focus on serial dependence in the tail, and can reveal dependence information that is otherwise not discoverable by the traditional spectral analysis. Understanding the convergence
-
Feature Screening with Conditional Rank Utility for Big-data Classification J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-28 Xingxiang Li, Chen Xu
Abstract Feature screening is a commonly-used strategy to eliminate irrelevant features in high-dimensional classification. When one encounters big datasets with both high dimensionality and huge sample size, the conventional screening methods become computationally costly or even infeasible. In this paper, we introduce a novel screening utility, Conditional Rank Utility (CRU), and propose a distributed
-
Test of Significance for High-dimensional Thresholds with Application to Individualized Minimal Clinically Important Difference J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-28 Huijie Feng, Jingyi Duan, Yang Ning, Jiwei Zhao
Abstract This work is motivated by learning the individualized minimal clinically important difference, a vital concept to assess clinical importance in various biomedical studies. We formulate the scientific question into a high-dimensional statistical problem where the parameter of interest lies in an individualized linear threshold. The goal is to develop a hypothesis testing procedure for the significance
-
Assessing disparities in Americans’ exposure to PCBs and PBDEs based on NHANES pooled biomonitoring data J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-28 Yan Liu, Dewei Wang, Li Li, Dingsheng Li
Abstract The National Health and Nutrition Examination Survey (NHANES) has been continuously biomonitoring Americans’ exposure to two families of harmful environmental chemicals: polychlorinated biphenyls (PCBs) and polybrominated diphenyl ethers (PBDEs). However, biomonitoring these chemicals is expensive. To save cost, in 2005, NHANES resorted to pooled biomonitoring; i.e., amalgamating individual
-
Cohesion and Repulsion in Bayesian Distance Clustering J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-24 Abhinav Natarajan, Maria De Iorio, Andreas Heinecke, Emanuel Mayer, Simon Glenn
Abstract Clustering in high-dimensions poses many statistical challenges. While traditional distance-based clustering methods are computationally feasible, they lack probabilistic interpretation and rely on heuristics for estimation of the number of clusters. On the other hand, probabilistic model-based clustering techniques often fail to scale and devising algorithms that are able to effectively explore
-
Bayesian Conditional Transformation Models J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-21 Manuel Carlan, Thomas Kneib, Nadja Klein
Abstract Recent developments in statistical regression methodology shift away from pure mean regression towards distributional regression models. One important strand thereof is that of conditional transformation models (CTMs). CTMs infer the entire conditional distribution directly by applying a transformation function to the response conditionally on a set of covariates towards a simple log-concave
-
Semiparametric proximal causal inference J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-16 Yifan Cui, Hongming Pu, Xu Shi, Wang Miao, Eric Tchetgen Tchetgen
Abstract Skepticism about the assumption of no unmeasured confounding, also known as exchangeability, is often warranted in making causal inferences from observational data; because exchangeability hinges on an investigator’s ability to accurately measure covariates that capture all potential sources of confounding. In practice, the most one can hope for is that covariate measurements are at best proxies
-
Fixed-domain Posterior Contraction Rates for Spatial Gaussian Process Model with Nugget J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-15 Cheng Li, Saifei Sun, Yichen Zhu
Spatial Gaussian process regression models typically contain finite dimensional covariance parameters that need to be estimated from the data. We study the Bayesian estimation of covariance paramet...
-
Variable Selection for High-dimensional Nodal Attributes in Social Networks with Degree Heterogeneity* J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-07 Jia Wang, Xizhen Cai, Xiaoyue Niu, Runze Li
Abstract We consider a class of network models, in which the connection probability depends on ultrahigh-dimensional nodal covariates (homophily) and node-specific popularity (degree heterogeneity). A Bayesian method is proposed to select nodal features in both dense and sparse networks under a mild assumption on popularity parameters. The proposed approach is implemented via Gibbs sampling. To alleviate
-
Doubly robust capture-recapture methods for estimating population size J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-07 Manjari Das, Edward H. Kennedy, Nicholas P. Jewell
Abstract Estimation of population size using incomplete lists has a long history across many biological and social sciences. For example, human rights groups often construct partial lists of victims of armed conflicts, to estimate the total number of victims. Earlier statistical methods for this setup often use parametric assumptions, or rely on suboptimal plug-in-type nonparametric estimators; but
-
Distributed Inference for Spatial Extremes Modeling in High Dimensions J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-06 Emily C. Hector, Brian J. Reich
Abstract Extreme environmental events frequently exhibit spatial and temporal dependence. These data are often modeled using max stable processes (MSPs) that are computationally prohibitive to fit for as few as a dozen observations. Supposed computationally-efficient approaches like the composite likelihood remain computationally burdensome with a few hundred observations. In this paper, we propose
-
On Learning and Testing of Counterfactual Fairness through Data Preprocessing J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-06 Haoyu Chen, Wenbin Lu, Rui Song, Pulak Ghosh
Abstract Machine learning has become more important in real-life decision-making but people are concerned about the ethical problems it may bring when used improperly. Recent work brings the discussion of machine learning fairness into the causal framework and elaborates on the concept of Counterfactual Fairness. In this paper, we develop the Fair Learning through dAta Preprocessing (FLAP) algorithm
-
Adversarial Machine Learning: Bayesian Perspectives J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-02 David Rios Insua, Roi Naveiro, Víctor Gallego, Jason Poulos
Abstract Adversarial Machine Learning (AML) is emerging as a major field aimed at protecting machine learning (ML) systems against security threats: in certain scenarios there may be adversaries that actively manipulate input data to fool learning systems. This creates a new class of security vulnerabilities that ML systems may face, and a new desirable property called adversarial robustness essential
-
Hypotheses Testing from Complex Survey Data Using Bootstrap Weights: A Unified Approach J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-03-02 Jae-kwang Kim, J. N. K. Rao, Zhonglei Wang
Abstract Standard statistical methods without taking proper account of the complexity of a survey design can lead to erroneous inferences when applied to survey data due to unequal selection probabilities, clustering, and other design features. In particular, the type I error rates of hypotheses tests using standard methods can be much larger than the nominal significance level. Methods incorporating
-
Estimation and Inference for High-Dimensional Generalized Linear Models with Knowledge Transfer J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-28 Sai Li, Linjun Zhang, T. Tony Cai, Hongzhe Li
Abstract Transfer learning provides a powerful tool for incorporating data from related studies into a target study of interest. In epidemiology and medical studies, the classification of a target disease could borrow information across other related diseases and populations. In this work, we consider transfer learning for high-dimensional generalized linear models (GLMs). A novel algorithm, TransHDGLM
-
A General M-estimation Theory in Semi-Supervised Framework J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-28 Shanshan Song, Yuanyuan Lin, Yong Zhou
Abstract We study a class of general M-estimators in the semi-supervised setting, wherein the data are typically a combination of a relatively small labeled dataset and large amounts of unlabeled data. A new estimator, which efficiently uses the useful information contained in the unlabeled data, is proposed via a projection technique. We prove consistency and asymptotic normality, and provide an inference
-
Factor Modelling for Clustering High-dimensional Time Series J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-22 Bo Zhang, Guangming Pan, Qiwei Yao, Wang Zhou
Abstract We propose a new unsupervised learning method for clustering a large number of time series based on a latent factor structure. Each cluster is characterized by its own cluster-specific factors in addition to some common factors which impact on all the time series concerned. Our setting also offers the flexibility that some time series may not belong to any clusters. The consistency with explicit
-
Network Inference Using the Hub Model and Variants J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-22 Zhibing He, Yunpeng Zhao, Peter Bickel, Charles Weko, Dan Cheng, Jirui Wang
Abstract Statistical network analysis primarily focuses on inferring the parameters of an observed network. In many applications, especially in the social sciences, the observed data is the groups formed by individual subjects. In these applications, the network is itself a parameter of a statistical model. Zhao and Weko (2019) propose a model-based approach, called the hub model, to infer implicit
-
Causal Inference in Transcriptome-Wide Association Studies with Invalid Instruments and GWAS Summary Data J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-22 Haoran Xue, Xiaotong Shen, Wei Pan
Abstract Transcriptome-wide association studies (TWAS) have recently emerged as a popular tool to discover (putative) causal genes by integrating an outcome GWAS dataset with another gene expression/transcriptome GWAS (called eQTL) dataset. In our motivating and target application, we’d like to identify causal genes for low-density lipoprotein cholesterol (LDL), which is crucial for developing new
-
Instrumental Variable Estimation of Marginal Structural Mean Models for Time-Varying Treatment J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-21 Haben Michael, Yifan Cui, Scott A. Lorch, Eric J. Tchetgen Tchetgen
Abstract Robins (1997b) introduced marginal structural models (MSMs), a general class of counterfactual models for the joint effects of time-varying treatment regimes in complex longitudinal studies subject to time-varying confounding. In his work, identification of MSM parameters is established under a sequential randomization assumption (SRA), which rules out unmeasured confounding of treatment assignment
-
Online Smooth Backfitting for Generalized Additive Models J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-21 Ying Yang, Fang Yao, Peng Zhao
Abstract We propose an online smoothing backfitting method for generalized additive models coupled with local linear estimation. The idea can be extended to general nonlinear optimization problems. The strategy is to use an appropriate-order expansion to approximate the nonlinear equations and store the coefficients as sufficient statistics which can be updated in an online manner by the dynamic candidate
-
Nonlinear causal discovery with confounders* J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-14 Chunlin Li, Xiaotong Shen, Wei Pan
Abstract This article introduces a causal discovery method to learn nonlinear relationships in a directed acyclic graph with correlated Gaussian errors due to confounding. First, we derive model identifiability under the sublinear growth assumption. Then, we propose a novel method, named the Deconfounded Functional Structure Estimation (DeFuSE), consisting of a deconfounding adjustment to remove the
-
Crowdsourcing Utilizing Subgroup Structure of Latent Factor Modeling J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-10 Qi Xu, Yubai Yuan, Junhui Wang, Annie Qu
Abstract Crowdsourcing has emerged as an alternative solution for collecting large scale labels. However, the majority of recruited workers are not domain experts, so their contributed labels could be noisy. In this paper, we propose a two-stage model to predict the true labels for multicategory classification tasks in crowdsourcing. In the first stage, we fit the observed labels with a latent factor
-
Solving Estimating Equations With Copulas J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-08 Thomas Nagler, Thibault Vatter
Abstract Thanks to their ability to capture complex dependence structures, copulas are frequently used to glue random variables into a joint model with arbitrary marginal distributions. More recently, they have been applied to solve statistical learning problems such as regression or classification. Framing such approaches as solutions of estimating equations, we generalize them in a unified framework
-
Intraday Periodic Volatility Curves J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-07 Torben G. Andersen, Tao Su, Viktor Todorov, Zhiyuan Zhang
Abstract The volatility of financial asset returns displays pronounced variation over the trading day. Our goal is nonparametric inference for the average intraday volatility pattern, viewed as a function of time-of-day. The functional inference is based on a long span of high-frequency return data. Our setup allows for general forms of volatility dynamics, including time-variation in the intraday
-
Bayesian Modeling with Spatial Curvature Processes J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-06 Aritra Halder, Sudipto Banerjee, Dipak K. Dey
Abstract Spatial process models are widely used for modeling point-referenced variables arising from diverse scientific domains. Analyzing the resulting random surface provides deeper insights into the nature of latent dependence within the studied response. We develop Bayesian modeling and inference for rapid changes on the response surface to assess directional curvature along a given trajectory
-
A Two-Sample Conditional Distribution Test Using Conformal Prediction and Weighted Rank Sum J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-02-06 Xiaoyu Hu, Jing Lei
Abstract We consider the problem of testing the equality of conditional distributions of a response variable given a vector of covariates between two populations. Such a hypothesis testing problem can be motivated from various machine learning and statistical inference scenarios, including transfer learning and causal predictive inference. We develop a nonparametric test procedure inspired from the
-
Skeleton Clustering: Dimension-Free Density-Aided Clustering J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-30 Zeyu Wei, Yen-Chi Chen
Abstract We introduce a density-aided clustering method called Skeleton Clustering that can detect clusters in multivariate and even high-dimensional data with irregular shapes. To bypass the curse of dimensionality, we propose surrogate density measures that are less dependent on the dimension but have intuitive geometric interpretations. The clustering framework constructs a concise representation
-
Bayesian Robustness: A Nonasymptotic Viewpoint J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-30 Kush Bhatia, Yi-An Ma, Anca D. Dragan, Peter L. Bartlett, Michael I. Jordan
Abstract We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers. We propose Rob-ULA, a robust variant of the Unadjusted Langevin Algorithm (ULA), and provide a finite-sample analysis of its sampling distribution. In particular, we show that after $T=\tilde{O}(d/\varepsilon_{\mathrm{acc}})$ iterations, we can sample from
-
On Semiparametrically Dynamic Functional-coefficient Autoregressive Spatio-Temporal Models with Irregular Location Wide Nonstationarity J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-24 Zudi Lu, Xiaohang Ren, Rongmao Zhang
Abstract Nonlinear dynamic modelling of spatio-temporal data is often a challenge, especially due to irregularly observed locations and location-wide non-stationarity. In this paper we propose a semiparametric family of Dynamic Functional-coefficient Autoregressive Spatio-Temporal (DyFAST) models to address the difficulties. We specify the autoregressive smoothing coefficients depending dynamically
-
Kernel Estimation of Bivariate Time-varying Coefficient Model for Longitudinal Data with Terminal Event J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-18 Yue Wang, Bin Nan, John D. Kalbfleisch
Abstract We propose a nonparametric bivariate time-varying coefficient model for longitudinal measurements with the occurrence of a terminal event that is subject to right censoring. The time-varying coefficients capture the longitudinal trajectories of covariate effects along with both the followup time and the residual lifetime. The proposed model extends the parametric conditional approach given
-
Bayesian Conjugacy in Probit, Tobit, Multinomial Probit and Extensions: A Review and New Results J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-18 Niccolò Anceschi, Augusto Fasano, Daniele Durante, Giacomo Zanella
Abstract A broad class of models that routinely appear in several fields can be expressed as partially or fully discretized Gaussian linear regressions. Besides including classical Gaussian response settings, this class also encompasses probit, multinomial probit and tobit regression, among others, thereby yielding one of the most widely-implemented families of models in routine applications. The relevance
-
Are Latent Factor Regression and Sparse Regression Adequate? J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-18 Jianqing Fan, Zhipeng Lou, Mengxin Yu
Abstract We propose the Factor Augmented (sparse linear) Regression Model (FARM) that not only admits both the latent factor regression and sparse linear regression as special cases but also bridges dimension reduction and sparse regression together. We provide theoretical guarantees for the estimation of our model under the existence of sub-Gaussian and heavy-tailed noises (with bounded (1+ϑ)-th moment
-
Variational Bayes for fast and accurate empirical likelihood inference J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-18 Weichang Yu, Howard D. Bondell
Abstract We develop a fast and accurate approach to approximate posterior distributions in the Bayesian empirical likelihood framework. Bayesian empirical likelihood allows for the use of Bayesian shrinkage without specification of a full likelihood but is notorious for leading to several computational difficulties. By coupling the stochastic variational Bayes procedure with an adjusted empirical likelihood
-
Optimal linear discriminant analysis for high-dimensional functional data J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-06 Kaijie Xue, Jin Yang, Fang Yao
Abstract Most existing methods for functional data classification deal with one or a few processes. In this work we tackle classification of high-dimensional functional data, in which each observation is potentially associated with a large number of functional processes, p, which is comparable to or even much larger than the sample size n. The challenge arises from the complex inter-correlation structures
-
A Scale-free Approach for False Discovery Rate Control in Generalized Linear Models J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-06 Chenguang Dai, Buyu Lin, Xin Xing, Jun S. Liu
Abstract The generalized linear model (GLM) has been widely used in practice to model counts or other types of non-Gaussian data. This paper introduces a framework for feature selection in the GLM that can achieve robust false discovery rate (FDR) control. The main idea is to construct a mirror statistic based on data perturbation to measure the importance of each feature. FDR control is achieved by
-
A Correlated Network Scale-up Model: Finding the Connection Between Subpopulations J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-06 Ian Laga, Le Bao, Xiaoyue Niu
Abstract Aggregated relational data (ARD), formed from “How many X’s do you know?” questions, is a powerful tool for learning important network characteristics with incomplete network data. Compared to traditional survey methods, ARD is attractive as it does not require a sample from the target population and does not ask respondents to self-reveal their own status. This is helpful for studying hard-to-reach
-
Copula based Cox proportional hazards models for dependent censoring J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-06 Negera Wakgari Deresa, Ingrid Van Keilegom
Abstract Most existing copula models for dependent censoring in the literature assume that the parameter defining the copula is known. However, prior knowledge on this dependence parameter is often unavailable. In this paper we propose a novel model under which the copula parameter does not need to be known. The model is based on a parametric copula model for the relation between the survival time
-
Compositional Graphical Lasso Resolves the Impact of Parasitic Infection on Gut Microbial Interaction Networks in a Zebrafish Model J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-06 Chuan Tian, Duo Jiang, Austin Hammer, Thomas Sharpton, Yuan Jiang
Abstract Understanding how microbes interact with each other is key to revealing the underlying role that microorganisms play in the host or environment and to identifying microorganisms as an agent that can potentially alter the host or environment. For example, understanding how the microbial interactions associate with parasitic infection can help resolve potential drug or diagnostic test for parasitic
-
Differential Privacy for Government Agencies—Are We There Yet? J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-05 Jörg Drechsler
Abstract Government agencies typically need to take potential risks of disclosure into account whenever they publish statistics based on their data or give external researchers access to collected data. In this context, the promise of formal privacy guarantees offered by concepts such as differential privacy seems to be the panacea enabling the agencies to quantify and control the privacy loss incurred
-
Data Science Ethics: Concepts, Techniques and Cautionary Tales. J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-05 Sabrina Giordano
Published in Journal of the American Statistical Association (Just accepted, 2023)
-
Higher-order least squares: assessing partial goodness of fit of linear causal models J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-04 Christoph Schultheiss, Peter Bühlmann, Ming Yuan
Abstract We introduce a simple diagnostic test for assessing the overall or partial goodness of fit of a linear causal model with errors being independent of the covariates. In particular, we consider situations where hidden confounding is potentially present. We develop a method and discuss its capability to distinguish between covariates that are confounded with the response by latent variables and
-
The Potential of Factor Analysis for Replication, Generalization, and Integration J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-03 Paul De Boeck, Michael L. DeKay, Menglin Xu
Published in Journal of the American Statistical Association (Vol. 117, No. 540, 2022)
-
Modeling and Learning From Variation and Covariation J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-03 Blakeley B. McShane, Ulf Böckenholt, Karsten T. Hansen
Published in Journal of the American Statistical Association (Vol. 117, No. 540, 2022)
-
Semiparametric Regression with R J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2023-01-03 Zixiao Wang, Yi Feng, Lin Liu
Published in Journal of the American Statistical Association (Vol. 117, No. 540, 2022)
-
Fair Policy Targeting J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-12-15 Davide Viviano, Jelena Bradic
Abstract One of the major concerns of targeting interventions on individuals in social welfare programs is discrimination: individualized treatments may induce disparities across sensitive attributes such as age, gender, or race. This paper addresses the question of the design of fair and efficient treatment allocation rules. We adopt the non-maleficence perspective of “first do no harm”: we select
-
Assessing the Most Vulnerable Subgroup to Type II Diabetes Associated with Statin Usage: Evidence from Electronic Health Record Data J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-12-15 Xinzhou Guo, Waverly Wei, Molei Liu, Tianxi Cai, Chong Wu, Jingshen Wang
Abstract There have been increased concerns that the use of statins, one of the most commonly prescribed drugs for treating coronary artery disease, is potentially associated with the increased risk of new-onset type II diabetes (T2D). Nevertheless, to date, there is no robust evidence supporting as to whether and what kind of populations are indeed vulnerable for developing T2D after taking statins
-
Confidently Comparing Estimates with the c-value J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-12-15 Brian L. Trippe, Sameer K. Deshpande, Tamara Broderick
Modern statistics provides an ever-expanding toolkit for estimating unknown parameters. Consequently, applied statisticians frequently face a difficult decision: retain a parameter estimate from a ...
-
Genetic underpinnings of brain structural connectome for young adults J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-12-07 Yize Zhao, Changgee Chang, Jingwen Zhang, Zhengwu Zhang
Abstract With distinct advantages in power over behavioral phenotypes, brain imaging traits have become emerging endophenotypes to dissect molecular contributions to behaviors and neuropsychiatric illnesses. Among different imaging features, brain structural connectivity (i.e., structural connectome) which summarizes the anatomical connections between different brain regions is one of the most cutting
-
A Random Projection Approach to Hypothesis Tests in High-Dimensional Single-Index Models J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-12-07 Changyu Liu, Xingqiu Zhao, Jian Huang
Abstract In this paper, we consider the problem of hypothesis testing in high-dimensional single-index models. First, we study the feasibility of applying the classical F-test to a single-index model when the dimension of covariate vector and sample size are of the same order, and derive its asymptotic null distribution and asymptotic local power function. For the ultrahigh-dimensional single-index
-
Crime in Philadelphia: Bayesian Clustering with Particle Optimization J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-12-07 Cecilia Balocchi, Sameer K. Deshpande, Edward I. George, Shane T. Jensen
Abstract Accurate estimation of the change in crime over time is a critical first step towards better understanding of public safety in large urban environments. Bayesian hierarchical modeling is a natural way to study spatial variation in urban crime dynamics at the neighborhood level, since it facilitates principled “sharing of information” between spatially adjacent neighborhoods. Typically, however
-
A Flexible Zero-Inflated Poisson-Gamma Model with Application to Microbiome Sequence Count Data J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-30 Roulan Jiang, Xiang Zhan, Tianying Wang
Abstract In microbiome studies, it is of interest to use a sample from a population of microbes, such as the gut microbiota community, to estimate the population proportion of these taxa. However, due to biases introduced in sampling and preprocessing steps, these observed taxa abundances may not reflect true taxa abundance patterns in the ecosystem. Repeated measures, including longitudinal study
-
Guaranteed Functional Tensor Singular Value Decomposition* J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-30 Rungang Han, Pixu Shi, Anru R. Zhang
Abstract This paper introduces the functional tensor singular value decomposition (FTSVD), a novel dimension reduction framework for tensors with one functional mode and several tabular modes. The problem is motivated by high-order longitudinal data analysis. Our model assumes the observed data to be a random realization of an approximate CP low-rank functional tensor measured on a discrete time grid
-
Feature Screening for Interval-Valued Response with Application to Study Association between Posted Salary and Required Skills J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-30 Wei Zhong, Chen Qian, Wanjun Liu, Liping Zhu, Runze Li
Abstract It is important to quantify the differences in returns to skills using the online job advertisements data, which have attracted great interest in both labor economics and statistics fields. In this paper, we study the relationship between the posted salary and the job requirements in online labor markets. There are two challenges to deal with. First, the posted salary is always presented in
-
Adaptive Algorithm for Multi-armed Bandit Problem with High-dimensional Covariates J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-30 Wei Qian, Ching-Kang Ing, Ji Liu
Abstract This paper studies an important sequential decision making problem known as the multi-armed stochastic bandit problem with covariates. Under a linear bandit framework with high-dimensional covariates, we propose a general multi-stage arm allocation algorithm that integrates both arm elimination and randomized assignment strategies. By employing a class of high-dimensional regression methods
-
Simultaneous Decorrelation of Matrix Time Series* J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-29 Yuefeng Han, Rong Chen, Cun-Hui Zhang, Qiwei Yao
Abstract We propose a contemporaneous bilinear transformation for a p × q matrix time series to alleviate the difficulties in modeling and forecasting matrix time series when p and/or q are large. The resulting transformed matrix assumes a block structure consisting of several small matrices, and those small matrix series are uncorrelated across all times. Hence an overall parsimonious model is achieved
-
Partially Linear Additive Regression with a General Hilbertian Response J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-28 Sungho Cho, Jeong Min Jeon, Dongwoo Kim, Kyusang Yu, Byeong U. Park
Abstract In this paper we develop semiparametric regression techniques for fitting partially linear additive models. The methods are for a general Hilbert-space-valued response. They use a powerful technique of additive regression in profiling out the additive nonparametric components of the models, which necessarily involves additive regression of the non-additive effects of covariates. We show that
-
Finite-dimensional Discrete Random Structures and Bayesian Clustering J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-18 Antonio Lijoi, Igor Prünster, Tommaso Rigon
Abstract Discrete random probability measures stand out as effective tools for Bayesian clustering. The investigation in the area has been very lively, with a strong emphasis on nonparametric procedures based on either the Dirichlet process or on more flexible generalizations, such as the normalized random measures with independent increments (nrmi). The literature on finite-dimensional discrete priors
-
Sampling: Design and Analysis, 3rd ed. J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-16 S. Lynne Stokes
Published in Journal of the American Statistical Association (Vol. 117, No. 540, 2022)
-
Proximal Learning for Individualized Treatment Regimes Under Unmeasured Confounding J. Am. Stat. Assoc. (IF 4.369) Pub Date : 2022-11-14 Zhengling Qi, Rui Miao, Xiaoke Zhang
Data-driven individualized decision making has recently received increasing research interest. However, most existing methods rely on the assumption of no unmeasured confounding, which cannot be ensured in practice especially in observational studies. Motivated by the recently proposed proximal causal inference, we develop several proximal learning methods to estimate optimal individualized treatment