Statistical Inference for Online Decision Making via Stochastic Gradient Descent J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-23
Haoyu Chen; Wenbin Lu; Rui Song
Online decision making aims to learn the optimal decision rule by making personalized decisions and updating the decision rule recursively. It has become easier than before with the help of big data, but new challenges also come along. Since the decision rule should be updated once per step, an offline update which uses all the historical data is inefficient in computation and storage. To this end

Bayesian Regression Using a Prior on the Model Fit: The R2D2 Shrinkage Prior J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-23
Yan Dora Zhang; Brian P. Naughton; Howard D. Bondell; Brian J. Reich
Prior distributions for high-dimensional linear regression require specifying a joint distribution for the unobserved regression coefficients, which is inherently difficult. We instead propose a new class of shrinkage priors for linear regression via specifying a prior first on the model fit, in particular, the coefficient of determination, and then distributing through to the coefficients in a novel

Estimating Number of Factors by Adjusted Eigenvalues Thresholding J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-21
Jianqing Fan; Jianhua Guo; Shurong Zheng
Determining the number of common factors is an important and practical topic in high-dimensional factor models. The existing literature is mainly based on the eigenvalues of the covariance matrix. Owing to the incomparability of the eigenvalues of the covariance matrix caused by the heterogeneous scales of the observed variables, it is not easy to find an accurate relationship between these eigenvalues

Balancing Unobserved Covariates with Covariate-Adaptive Randomized Experiments J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-21
Yang Liu; Feifang Hu
Balancing important covariates is often critical in clinical trials and causal inference. Stratified permuted block (STRPB) and covariate-adaptive randomization (CAR) procedures are widely used to balance observed covariates in practice. The balance properties of these procedures with respect to the observed covariates have been well studied. However, it has been questioned whether these methods will

Warp Bridge Sampling: The Next Generation J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-21
Lazhi Wang; David E. Jones; Xiao-Li Meng
Bridge sampling is an effective Monte Carlo method for estimating the ratio of normalizing constants of two probability densities, a routine computational problem in statistics, physics, chemistry, and other fields. The Monte Carlo error of the bridge sampling estimator is determined by the amount of overlap between the two densities. In the case of unimodal densities, Warp-I, II, and III transformations

Simultaneous Detection of Signal Regions Using Quadratic Scan Statistics With Applications to Whole Genome Association Studies J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-14
Zilin Li; Yaowu Liu; Xihong Lin
We consider in this paper detection of signal regions associated with disease outcomes in whole genome association studies. Gene or region-based methods have become increasingly popular in whole genome association analysis as a complementary approach to traditional individual variant analysis. However, these methods test for the association between an outcome and the genetic variants in a prespecified

Low-Rank Covariance Function Estimation for Multidimensional Functional Data J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-09
Jiayi Wang; Raymond K. W. Wong; Xiaoke Zhang
Multidimensional function data arise from many fields nowadays. The covariance function plays an important role in the analysis of such increasingly common data. In this paper, we propose a novel nonparametric covariance function estimation approach under the framework of reproducing kernel Hilbert spaces (RKHS) that can handle both sparse and dense functional data. We extend multilinear rank structures

Bayesian Semiparametric Longitudinal Drift-Diffusion Mixed Models for Tone Learning in Adults J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-08
Giorgio Paulon; Fernando Llanos; Bharath Chandrasekaran; Abhra Sarkar
Understanding how adult humans learn nonnative speech categories such as tone information has shed novel insights into the mechanisms underlying experience-dependent brain plasticity. Scientists have traditionally examined these questions using longitudinal learning experiments under a multicategory decision making paradigm. Drift-diffusion processes are popular in such contexts for their ability

Improved Doubly Robust Estimation in Learning Optimal Individualized Treatment Rules J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-08
Yinghao Pan; Ying-Qi Zhao
Individualized treatment rules (ITRs) recommend treatment according to patient characteristics. There is a growing interest in developing novel and efficient statistical methods in constructing ITRs. We propose an improved doubly robust estimator of the optimal ITRs. The proposed estimator is based on a direct optimization of an augmented inverse-probability weighted estimator of the expected clinical

Using Maximum Entry-Wise Deviation to Test the Goodness of Fit for Stochastic Block Models J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-08
Jianwei Hu; Jingfei Zhang; Hong Qin; Ting Yan; Ji Zhu
The stochastic block model is widely used for detecting community structures in network data. How to test the goodness of fit of the model is one of the fundamental problems and has gained growing interest in recent years. In this article, we propose a novel goodness-of-fit test based on the maximum entry of the centered and rescaled adjacency matrix for the stochastic block model. One noticeable

Large-Scale Datastreams Surveillance via Pattern-Oriented-Sampling J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-08
Haojie Ren; Changliang Zou; Nan Chen; Runze Li
Monitoring large-scale datastreams with limited resources has become increasingly important for real-time detection of abnormal activities in many applications. Despite the availability of large datasets, the challenges associated with designing an efficient change-detection when clustering or spatial pattern exists are not yet well addressed. In this paper, a design-adaptive testing procedure is developed

Stochastic Tree Search for Estimating Optimal Dynamic Treatment Regimes J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-08
Yilun Sun; Lu Wang
A dynamic treatment regime (DTR) is a sequence of decision rules that adapt to the time-varying states of an individual. Black-box learning methods have shown great potential in predicting the optimal treatments; however, the resulting DTRs lack interpretability, which is of paramount importance for medical experts to understand and implement. We present a stochastic tree-based reinforcement learning

Copula Gaussian graphical models for functional data J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-03
Eftychia Solea; Bing Li
We introduce a statistical graphical model for multivariate functional data, which are common in medical applications such as EEG and fMRI. Recently published functional graphical models rely on the multivariate Gaussian process assumption, but we relax it by introducing the Functional Copula Gaussian Graphical Model (FCGGM). This model removes the marginal Gaussian assumption but retains the simplicity

Learning Individualized Treatment Rules for Multiple-Domain Latent Outcomes J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-03
Yuan Chen; Donglin Zeng; Yuanjia Wang
For many mental disorders, latent mental status from multiple-domain psychological or clinical symptoms may perform as a better characterization of the underlying disorder status than a simple summary score of the symptoms, and they may also serve as more reliable and representative features to differentiate treatment responses. Therefore, in order to address the complexity and heterogeneity of treatment

Do School Districts Affect NYC House Prices? Identifying Border Differences Using a Bayesian Nonparametric Approach to Geographic Regression Discontinuity Designs. J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-03
Maxime Rischard; Zach Branson; Luke Miratrix; Luke Bornn
What is the premium on house price for a particular school district? To estimate this in New York City we use a novel implementation of a Geographic Regression Discontinuity Design (GeoRDD) built from Gaussian process regression (kriging) to model spatial structure. With a GeoRDD, we specifically examine price differences along borders between “treatment” and “control” school districts. GeoRDDs extend

Handbook of Graphical Models J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-01
Genevera I. Allen (2020). Handbook of Graphical Models. Journal of the American Statistical Association: Vol. 115, No. 531, pp. 1555-1557.

Statistical Computing With R J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-01
Ling Leng (2020). Statistical Computing With R. Journal of the American Statistical Association: Vol. 115, No. 531, pp. 1557-1558.

Time Series Clustering and Classification J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-09-01
Ming Chen (2020). Time Series Clustering and Classification. Journal of the American Statistical Association: Vol. 115, No. 531, pp. 1558-1558.

Off-Policy Estimation of Long-Term Average Outcomes with Applications to Mobile Health J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-26
Peng Liao; Predrag Klasnja; Susan Murphy
Due to the recent advancements in wearables and sensing technology, health scientists are increasingly developing mobile health (mHealth) interventions. In mHealth interventions, mobile devices are used to deliver treatment to individuals as they go about their daily lives. These treatments are generally designed to impact a near time, proximal outcome such as stress or physical activity. The mHealth

Log-Linear Bayesian Additive Regression Trees for Multinomial Logistic and Count Regression Models J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-26
Jared S. Murray
We introduce Bayesian additive regression trees (BART) for log-linear models including multinomial logistic regression and count regression with zero-inflation and overdispersion. BART has been applied to nonparametric mean regression and binary classification problems in a range of settings. However, existing applications of BART have been mostly limited to models for Gaussian “data”, either observed

Inference on a New Class of Sample Average Treatment Effects J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-25
Jasjeet S. Sekhon; Yotam Shem-Tov
We derive new variance formulas for inference on a general class of estimands of causal average treatment effects in a randomized control trial. We generalize the seminal work of Robins and show that when the researcher’s objective is inference on sample average treatment effect of the treated (SATT), a consistent variance estimator exists. Although this estimand is equal to the sample average treatment

Monte Carlo Approximation of Bayes Factors via Mixing with Surrogate Distributions J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-19
Chenguang Dai; Jun S. Liu
By mixing the target posterior distribution with a surrogate distribution, of which the normalizing constant is tractable, we propose a method for estimating the marginal likelihood using the Wang-Landau algorithm. We show that a faster convergence of the proposed method can be achieved via the momentum acceleration. Two implementation strategies are detailed: (i) facilitating global jumps between

A Semiparametric Approach to Model Effect Modification J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-19
Muxuan Liang; Menggang Yu
One fundamental statistical question for research areas such as precision medicine and health disparity is about discovering effect modification of treatment or exposure by observed covariates. We propose a semiparametric framework for identifying such effect modification. Instead of using the traditional outcome models, we directly posit semiparametric models on contrasts, or expected differences

Auto-G-Computation of Causal Effects on a Network J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-19
Eric J. Tchetgen Tchetgen; Isabel R. Fulcher; Ilya Shpitser
Methods for inferring average causal effects have traditionally relied on two key assumptions: (i) the intervention received by one unit cannot causally influence the outcome of another; and (ii) units can be organized into non-overlapping groups such that outcomes of units in separate groups are independent. In this paper, we develop new statistical methods for causal inference based on a single realization

Cross-Validation for Correlated Data J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-13
Assaf Rabinowicz; Saharon Rosset
K-fold cross-validation (CV) with squared error loss is widely used for evaluating predictive models, especially when strong distributional assumptions cannot be taken. However, CV with squared error loss is not free from distributional assumptions, in particular in cases involving non-i.i.d. data. This paper analyzes CV for correlated data. We present a criterion for suitability of standard CV in

Recurrent Events Analysis With Data Collected at Informative Clinical Visits in Electronic Health Records J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-13
Yifei Sun; Charles E. McCulloch; Kieren A. Marr; Chiung-Yu Huang
Although increasingly used as a data resource for assembling cohorts, electronic health records (EHRs) pose many analytic challenges. In particular, a patient’s health status influences when and what data are recorded, generating sampling bias in the collected data. In this paper, we consider recurrent event analysis using EHR data. Conventional regression methods for event risk analysis usually require

Angle-based hierarchical classification using exact label embedding J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-13
Yiwei Fan; Xiaoling Lu; Yufeng Liu; Junlong Zhao
Hierarchical classification problems are commonly seen in practice. However, most existing methods do not fully utilize the hierarchical information among class labels. In this paper, a novel label embedding approach is proposed, which keeps the hierarchy of labels exactly, and reduces the complexity of the hypothesis space significantly. Based on the newly proposed label embedding approach, a new

Nonparametric maximum likelihood methods for binary response models with random coefficients J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-13
Jiaying Gu; Roger Koenker
The venerable method of maximum likelihood has found numerous recent applications in nonparametric estimation of regression and shape constrained densities. For mixture models the nonparametric maximum likelihood estimator (NPMLE) of Kiefer and Wolfowitz (1956) plays a central role in recent developments of empirical Bayes methods. The NPMLE has also been proposed by Cosslett (1983) as an estimation

Inferring Phenotypic Trait Evolution on Large Trees With Many Incomplete Measurements J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-08-13
Gabriel Hassler; Max R. Tolkoff; William L. Allen; Lam Si Tung Ho; Philippe Lemey; Marc A. Suchard
Comparative biologists are often interested in inferring covariation between multiple biological traits sampled across numerous related taxa. To properly study these relationships, we must control for the shared evolutionary history of the taxa to avoid spurious inference. An additional challenge arises as obtaining a full suite of measurements becomes increasingly difficult with increasing taxa. This

Evaluation of the health impacts of the 1990 Clean Air Act Amendments using causal inference and machine learning J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-31
Rachel C. Nethery; Fabrizia Mealli; Jason D. Sacks; Francesca Dominici
We develop a causal inference approach to estimate the number of adverse health events that were prevented due to changes in exposure to multiple pollutants attributable to a large-scale air quality intervention/regulation, with a focus on the 1990 Clean Air Act Amendments (CAAA). We introduce a causal estimand called the Total Events Avoided (TEA) by the regulation, defined as the difference in the

A two-part framework for estimating individualized treatment rules from semicontinuous outcomes J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-29
Jared Huling; Maureen Smith; Guanhua Chen
Health care payments are an important component of health care utilization and are thus a major focus in health services and health policy applications. However, payment outcomes are semicontinuous in that over a given period of time some patients incur no payments and some patients incur large costs. Individualized treatment rules (ITRs) are a major part of the push for tailoring treatments and interventions

Evaluating proxy influence in assimilated paleoclimate reconstructions – Testing the exchangeability of two ensembles of spatial processes J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-28
Trevor Harris; Bo Li; Nathan J. Steiger; Jason E. Smerdon; Naveen Narisetty; J. Derek Tucker
Climate field reconstructions (CFR) attempt to estimate spatiotemporal fields of climate variables in the past using climate proxies such as tree rings, ice cores, and corals. Data Assimilation (DA) methods are a recent and promising new means of deriving CFRs that optimally fuse climate proxies with climate model output. Despite the growing application of DA-based CFRs, little is understood about

Estimation of Low Rank High Dimensional Multivariate Linear Models for Multiresponse Data J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-28
Changliang Zou; Yuan Ke; Wenyang Zhang
In this paper, we study low rank high dimensional multivariate linear models (LRMLM) for high dimensional multiresponse data. We propose an intuitively appealing estimation approach, and develop an algorithm for implementation purposes. Asymptotic properties are established in order to justify the estimation procedure theoretically. Intensive simulation studies are also conducted to demonstrate performance

A Bottom-up Approach to Testing Hypotheses That Have a Branching Tree Dependence Structure, with Error Rate Control J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-28
Yunxiao Li; Yi-Juan Hu; Glen A. Satten
Modern statistical analyses often involve testing large numbers of hypotheses. In many situations, these hypotheses may have an underlying tree structure that both helps determine the order in which tests should be conducted and imposes a dependency between tests that must be accounted for. Our motivating example comes from testing the association between a trait of interest and groups of microbes

Rare Feature Selection in High Dimensions J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-28
Xiaohan Yan; Jacob Bien
It is common in modern prediction problems for many predictor variables to be counts of rarely occurring events. This leads to design matrices in which many columns are highly sparse. The challenge posed by such “rare features” has received little attention despite its prevalence in diverse areas, ranging from natural language processing (e.g., rare words) to biology (e.g., rare species). We show,

Nonparametric estimation of galaxy cluster emissivity and detection of point sources in astrophysics with two lasso penalties J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-28
Jairo Diaz-Rodriguez; Dominique Eckert; Hatef Monajemi; Stéphane Paltani; Sylvain Sardy
Astrophysicists are interested in recovering the 3D gas emissivity of a galaxy cluster from a 2D telescope image. Blurring and point sources make this inverse problem harder to solve. The conventional approach requires, as a first step, identifying and masking the point sources. Instead we model all astrophysical components in a single Poisson generalized linear model. To enforce sparsity on the parameters

Gene-based Association Testing of Dichotomous Traits with Generalized Linear Mixed Models Using Extended Pedigrees: Applications to Age-related Macular Degeneration J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-28
Yingda Jiang; Chi-Yang Chiu; Qi Yan; Wei Chen; Michael B. Gorin; Yvette P. Conley; M’Hamed Lajmi Lakhal-Chaieb; Richard J. Cook; Christopher I. Amos; Alexander F. Wilson; Joan E. Bailey-Wilson; Francis J. McMahon; Ana I. Vazquez; Ao Yuan; Xiaogang Zhong; Momiao Xiong; Daniel E. Weeks; Ruzong Fan
Genetics plays a role in age-related macular degeneration (AMD), a common cause of blindness in the elderly. There is a need for powerful methods for carrying out region-based association tests between a dichotomous trait like AMD and genetic variants on family data. Here we apply our new generalized functional linear mixed models (GFLMM) developed to test for gene-based association in a set of AMD

Assessing partial association between ordinal variables: quantification, visualization, and hypothesis testing J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-16
Dungang Liu; Shaobo Li; Yan Yu; Irini Moustaki
Partial association refers to the relationship between variables Y_1, Y_2, …, Y_K while adjusting for a set of covariates X = {X_1, …, X_p}. To assess such an association when Y_k’s are recorded on ordinal scales, a classical approach is to use partial correlation between the latent continuous variables. This so-called polychoric correlation is inadequate, as it requires multivariate normality and it only reflects

Learning Optimal Distributionally Robust Individualized Treatment Rules J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-16
Weibin Mo; Zhengling Qi; Yufeng Liu
Recent development in data-driven decision science has seen great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, policy makers seek the best individualized treatment rule (ITR) that maximizes the expected outcome, known as the value function. Many existing methods assume that the training and testing distributions are the same. However

Bias and high-dimensional adjustment in observational studies of peer effects J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-16
Dean Eckles; Eytan Bakshy
Peer effects, in which an individual’s behavior is affected by peers’ behavior, are posited by multiple theories in the social sciences. Randomized field experiments that identify peer effects, however, are often expensive or infeasible, so many studies of peer effects use observational data, which is expected to suffer from confounding. Here we show, in the context of information and media diffusion

Semiparametric fractional imputation using Gaussian mixture models for handling multivariate missing data J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-16
Hejian Sang; Jae Kwang Kim; Danhyang Lee
Item nonresponse is frequently encountered in practice. Ignoring missing data can lose efficiency and lead to misleading inference. Fractional imputation is a frequentist approach of imputation for handling missing data. However, the parametric fractional imputation of Kim (2011) may be subject to bias under model misspecification. In this paper, we propose a novel semiparametric fractional imputation

Bayesian Joint Modeling of Multiple Brain Functional Networks J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-16
Joshua Lukemire; Suprateek Kundu; Giuseppe Pagnoni; Ying Guo
Investigating the similarity and changes in brain networks under different mental conditions has become increasingly important in neuroscience research. A standard separate estimation strategy fails to pool information across networks and hence has reduced estimation accuracy and power to detect between-network differences. Motivated by an fMRI Stroop task experiment that involves multiple related tasks

A semiparametric instrumental variable approach to optimal treatment regimes under endogeneity J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-14
Yifan Cui; Eric Tchetgen Tchetgen
There is a fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a general instrumental variable approach

The Hellinger correlation J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Gery Geenens; Pierre Lafaye de Micheaux
In this paper, the defining properties of any valid measure of the dependence between two continuous random variables are revisited and complemented with two original ones, shown to imply other usual postulates. While other popular choices are proved to violate some of these requirements, a class of dependence measures satisfying all of them is identified. One particular measure, that we call the Hellinger

Likelihood-based Inference for Partially Observed Epidemics on Dynamic Networks J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Fan Bu; Allison E. Aiello; Jason Xu; Alexander Volfovsky
We propose a generative model and an inference scheme for epidemic processes on dynamic, adaptive contact networks. Network evolution is formulated as a link-Markovian process, which is then coupled to an individual-level stochastic SIR model, in order to describe the interplay between the dynamics of the disease spread and the contact network underlying the epidemic. A Markov chain Monte Carlo framework

AdaBoost semiparametric model averaging prediction for multiple categories J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Jialiang Li; Jing Lv; Alan T.K. Wan; Jun Liao
Model average techniques are very useful for model-based prediction. However, most earlier works in this field focused on parametric models and continuous responses. In this paper, we study varying coefficient multinomial logistic models and propose a semiparametric model averaging prediction (SMAP) approach for multicategory outcomes. The proposed procedure does not need any artificial specification

Restricted Spatial Regression Methods: Implications for Inference J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Kori Khan; Catherine A. Calder
The issue of spatial confounding between the spatial random effect and the fixed effects in regression analyses has been identified as a concern in the statistical literature. Multiple authors have offered perspectives and potential solutions. In this paper, for the areal spatial data setting, we show that many of the methods designed to alleviate spatial confounding can be viewed as special cases

Irrational Exuberance: Correcting Bias in Probability Estimates J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Gareth M. James; Peter Radchenko; Bradley Rava
We consider the common setting where one observes probability estimates for a large number of events, such as default risks for numerous bonds. Unfortunately, even with unbiased estimates, selecting events corresponding to the most extreme probabilities can result in systematically underestimating the true level of uncertainty. We develop an empirical Bayes approach “Excess Certainty Adjusted Probabilities”

Distribution-Free Multisample Tests Based on Optimal Matchings with Applications to Single Cell Genomics J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Somabha Mukherjee; Divyansh Agarwal; Nancy R. Zhang; Bhaswar B. Bhattacharya
In this paper we propose a nonparametric graphical test based on optimal matching, for assessing the equality of multiple unknown multivariate probability distributions. Our procedure pools the data from the different classes to create a graph based on the minimum nonbipartite matching, and then utilizes the number of edges connecting data points from different classes to examine the closeness between

More Efficient Policy Learning via Optimal Retargeting J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Nathan Kallus
Policy learning can be used to extract individualized treatment regimes from observational data in healthcare, civics, e-commerce, and beyond. One big hurdle to policy learning is a commonplace lack of overlap in the data for different actions, which can lead to unwieldy policy evaluation and poorly performing learned policies. We study a solution to this problem based on retargeting, that is, changing

Covariate Adaptive False Discovery Rate Control with Applications to Omics-Wide Multiple Testing J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Xianyang Zhang; Jun Chen
Conventional multiple testing procedures often assume hypotheses for different features are exchangeable. However, in many scientific applications, additional covariate information regarding the patterns of signals and nulls is available. In this paper, we introduce an FDR control procedure for large-scale inference problems that can incorporate covariate information. We develop a fast algorithm to

Semiparametric Estimation of the Distribution of Episodically Consumed Foods Measured with Error J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Félix Camirand Lemyre; Raymond J. Carroll; Aurore Delaigle
Dietary data collected from 24-hour dietary recalls are observed with significant measurement errors. In the nonparametric curve estimation literature, much of the effort has been devoted to designing methods that are consistent under contamination by noise, and which have been traditionally applied for analysing those data. However, some foods such as alcohol or fruits are consumed only episodically

Estimating a change point in a sequence of very high-dimensional covariance matrices J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Holger Dette; G. M. Pan; Q. Yang
This paper considers the problem of estimating a change point in the covariance matrix in a sequence of high-dimensional vectors, where the dimension is substantially larger than the sample size. A two-stage approach is proposed to efficiently estimate the location of the change point. The first step consists of a reduction of the dimension to identify elements of the covariance matrices corresponding

Personalized Policy Learning using Longitudinal Mobile Health Data J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-07-09
Xinyu Hu; Min Qian; Bin Cheng; Ying Kuen Cheung
Personalized policy represents a paradigm shift from one-decision-rule-for-all users to an individualized decision rule for each user. Developing personalized policy in mobile health applications imposes challenges. First, for lack of adherence, data from each user are limited. Second, unmeasured contextual factors can potentially impact on decision making. Aiming to optimize immediate rewards, we

Distribution-free consistent independence tests via center-outward ranks and signs J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-06-30
Hongjian Shi; Mathias Drton; Fang Han
This paper investigates the problem of testing independence of two random vectors of general dimensions. For this, we give for the first time a distribution-free consistent test. Our approach combines distance covariance with the center-outward ranks and signs developed in Hallin (2017). In technical terms, the proposed test is consistent and distribution-free in the family of multivariate distributions

Model-free Feature Screening and FDR Control with Knockoff Features J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-06-25
Wanjun Liu; Yuan Ke; Jingyuan Liu; Runze Li
This paper proposes a model-free and data-adaptive feature screening method for ultrahigh dimensional data. The proposed method is based on the projection correlation which measures the dependence between two random vectors. This projection correlation based method does not require specifying a regression model, and applies to data in the presence of heavy tails and multivariate responses. It enjoys

Causal Inference with Interference and Noncompliance in Two-Stage Randomized Experiments J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-06-25
Kosuke Imai; Zhichao Jiang; Anup Malani
In many social science experiments, subjects often interact with each other and as a result one unit’s treatment influences the outcome of another unit. Over the last decade, significant progress has been made towards causal inference in the presence of such interference between units. Researchers have shown that the two-stage randomization of treatment assignment enables the identification of average

Identification and estimation of treatment and interference effects in observational studies on networks J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-06-25
Laura Forastiere; Edoardo M. Airoldi; Fabrizia Mealli
Causal inference on a population of units connected through a network often presents technical challenges, including how to account for interference. In the presence of interference, for instance, potential outcomes of a unit depend on its treatment as well as on the treatments of other units, such as its neighbors in the network. In observational studies, a further complication is that the typical

Minibatch Metropolis-Hastings with Reversible SGLD Proposal J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-06-24
Tung-Yu Wu; Y. X. Rachel Wang; Wing H. Wong
Traditional MCMC algorithms are computationally intensive and do not scale well to large data. In particular, the Metropolis-Hastings (MH) algorithm requires passing over the entire dataset to evaluate the likelihood ratio in each iteration. We propose a general framework for performing MH-MCMC using minibatches of the whole dataset and show that this gives rise to approximately a tempered stationary

On design orthogonality, maximin distance and projection uniformity for computer experiments J. Am. Stat. Assoc. (IF 3.989) Pub Date: 2020-06-24
Yaping Wang; Fasheng Sun; Hongquan Xu
Space-filling designs are widely used in both computer and physical experiments. Column-orthogonality, maximin distance and projection uniformity are three basic and popular space-filling criteria proposed from different perspectives, but their relationships have been rarely investigated. We show that the average squared correlation metric is a function of the pairwise L2-distances between the rows