Influence Diagnostics for High-Dimensional Lasso Regression J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-11
Bala Rajaratnam; Steven Roberts; Doug Sparks; Honglin Yu
The increased availability of high-dimensional data, and appeal of a “sparse” solution has made penalized likelihood methods commonplace. Arguably the most widely utilized of these methods is ℓ1 regularization, popularly known as the lasso. When the lasso is applied to high-dimensional data, observations are relatively few; thus, each observation can potentially have tremendous influence on

Distributed Generalized Cross-Validation for Divide-and-Conquer Kernel Ridge Regression and Its Asymptotic Optimality J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-28
Ganggang Xu; Zuofeng Shang; Guang Cheng
Tuning parameter selection is of critical importance for kernel ridge regression. To date, a data-driven tuning method for divide-and-conquer kernel ridge regression (dKRR) has been lacking in the literature, which limits the applicability of dKRR for large datasets. In this article, by modifying the generalized cross-validation (GCV) score, we propose a distributed generalized cross-validation (dGCV)

Component-Based Regularization of Multivariate Generalized Linear Mixed Models J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-04
Jocelyn Chauvet; Catherine Trottier; Xavier Bry
(2019). Component-Based Regularization of Multivariate Generalized Linear Mixed Models. Journal of Computational and Graphical Statistics: Vol. 28, No. 4, pp. 909-920.

Simultaneous Variable and Covariance Selection With the Multivariate Spike-and-Slab LASSO J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-17
Sameer K. Deshpande; Veronika Ročková; Edward I. George
We propose a Bayesian procedure for simultaneous variable and covariance selection using continuous spike-and-slab priors in multivariate linear regression models where q possibly correlated responses are regressed onto p predictors. Rather than relying on a stochastic search through the high-dimensional model space, we develop an ECM algorithm similar to the EMVS procedure of Ročková and George targeting

The Generalized Ridge Estimator of the Inverse Covariance Matrix J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-06
Wessel N. van Wieringen
The ridge inverse covariance estimator is generalized to allow for entrywise penalization. An efficient algorithm for its evaluation is proposed. Its computational accuracy is benchmarked against implementations of specific cases the generalized ridge inverse covariance estimator encompasses. The proposed estimator shrinks toward a user-specified, nonrandom target matrix and is shown to be

An Expectation Conditional Maximization Approach for Gaussian Graphical Models J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-19
Zehang Richard Li; Tyler H. McCormick
Bayesian graphical models are a useful tool for understanding dependence relationships among many variables, particularly in situations with external prior information. In high-dimensional settings, the space of possible graphs becomes enormous, rendering even state-of-the-art Bayesian stochastic search computationally infeasible. We propose a deterministic alternative to estimate Gaussian and Gaussian

Beyond Prediction: A Framework for Inference With Variational Approximations in Mixture Models J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-26
T. Westling; T. H. McCormick
Variational inference is a popular method for estimating model parameters and conditional distributions in hierarchical and mixed models, which arise frequently in many settings in the health, social, and biological sciences. Variational inference in a frequentist context works by approximating intractable conditional distributions with a tractable family and optimizing the resulting lower bound on

Adaptive Incremental Mixture Markov Chain Monte Carlo J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-07
Florian Maire; Nial Friel; Antonietta Mira; Adrian E. Raftery
We propose adaptive incremental mixture Markov chain Monte Carlo (AIMM), a novel approach to sample from challenging probability distributions defined on a general state-space. While adaptive MCMC methods usually update a parametric proposal kernel with a global rule, AIMM locally adapts a semiparametric kernel. AIMM is based on an independent Metropolis–Hastings proposal distribution which

Incremental Mixture Importance Sampling With Shotgun Optimization J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-28
Biljana Jonoska Stojkova; David A. Campbell
This article proposes a general optimization strategy, which combines results from different optimization or parameter estimation methods to overcome shortcomings of a single method. Shotgun optimization is developed as a framework which employs different optimization strategies, criteria, or conditional targets to enable wider likelihood exploration. The introduced shotgun optimization approach is

Easily Parallelizable and Distributable Class of Algorithms for Structured Sparsity, with Optimal Acceleration J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-28
Seyoon Ko; Donghyeon Yu; Joong-Ho Won
Many statistical learning problems can be posed as minimization of a sum of two convex functions, one typically a composition of nonsmooth and linear functions. Examples include regression under structured sparsity assumptions. Popular algorithms for solving such problems, for example, ADMM, often involve nontrivial optimization subproblems or smoothing approximation. We consider two classes of primal–dual

Damped Anderson Acceleration With Restarts and Monotonicity Control for Accelerating EM and EM-like Algorithms J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-28
Nicholas C. Henderson; Ravi Varadhan
The expectation-maximization (EM) algorithm is a well-known iterative method for computing maximum likelihood estimates in a variety of statistical problems. Despite its numerous advantages, a main drawback of the EM algorithm is its frequently observed slow convergence which often hinders the application of EM algorithms in high-dimensional problems or in other complex settings. To address the need

Projection Pursuit Based on Gaussian Mixtures and Evolutionary Algorithms J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-12
Luca Scrucca; Alessio Serafini
We propose a projection pursuit (PP) algorithm based on Gaussian mixture models (GMMs). The negentropy obtained from a multivariate density estimated by GMMs is adopted as the PP index to be maximized. For a fixed dimension of the projection subspace, the GMM-based density estimation is projected onto that subspace, where an approximation of the negentropy for Gaussian mixtures is computed. Then, genetic

A Metaheuristic Adaptive Cubature Based Algorithm to Find Bayesian Optimal Designs for Nonlinear Models J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-05
Ehsan Masoudi; Heinz Holling; Belmiro P. M. Duarte; Weng Kee Wong
Finding Bayesian optimal designs for nonlinear models is a difficult task because the optimality criterion typically requires us to evaluate complex integrals before we perform a constrained optimization. We propose a hybridized method where we combine an adaptive multidimensional integration algorithm and a metaheuristic algorithm called imperialist competitive algorithm to find Bayesian optimal designs

Simultaneous Registration and Clustering for Multidimensional Functional Data J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-10
Pengcheng Zeng; Jian Qing Shi; Won-Seok Kim
The clustering for functional data with misaligned problems has drawn much attention in the last decade. Most methods do the clustering after those functional data being registered and there has been little research using both functional and scalar variables. In this article, we propose a simultaneous registration and clustering model via two-level models, allowing the use of both types of variables

Flexible and Interpretable Models for Survival Data J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-20
Jiacheng Wu; Daniela Witten
As data sets continue to increase in size, there is growing interest in methods for prediction that are both flexible and interpretable. A flurry of recent work on this topic has focused on additive modeling in the regression setting, and in particular, on the use of data-adaptive nonlinear functions that can be used to flexibly model each covariate’s effect, conditional on the other features in the

Stable Multiple Time Step Simulation/Prediction From Lagged Dynamic Network Regression Models J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-28
Abhirup Mallik; Zack W. Almquist
Changes in computation and automated data collection have greatly increased interest in statistical models of dynamic networks. Many of the models employed for inference on large-scale dynamic networks suffer from limited forward simulation/prediction capabilities. One major problem with many of the forward simulation procedures is a tendency for the model to become degenerate in only a few time steps

Improving Spectral Clustering Using the Asymptotic Value of the Normalized Cut J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-05-20
David P. Hofmeyr
Spectral clustering (SC) is a popular and versatile clustering method based on a relaxation of the normalized graph cut objective. Despite its popularity, selecting the number of clusters and tuning the important scaling parameter remain challenging problems in practical applications of SC. Popular heuristics have been proposed, but corresponding theoretical results are scarce. In this article, we

Variable-Domain Functional Principal Component Analysis J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-10
Jordan T. Johns; Ciprian Crainiceanu; Vadim Zipunnikov; Jonathan Gellar
We introduce a novel method of principal component analysis for data with varying domain lengths for each functional observation. We refer to this technique as variable-domain functional principal component analysis, or vdFPCA. We fit a trivariate smoother using penalized thin plate splines to estimate the covariance as a function of the domain length. Principal components are then calculated through

Fast Generalized Linear Models by Database Sampling and One-Step Polishing J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-19
Thomas Lumley
In this article, I show how to fit a generalized linear model to N observations on p variables stored in a relational database, using one sampling query and one aggregation query, as long as $N^{1/2+\delta}$ observations can be stored in memory, for some $\delta > 0$. The resulting estimator is fully efficient and asymptotically equivalent to the maximum likelihood estimator, and so its variance can be estimated from

Good Plot Symbols by Default J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-07-31
Heman Robinson
Scatterplots require different symbols for different purposes. For presentation, aesthetically pleasing symbols are popular. For analysis, highly discriminable symbols aid pattern detection. This study identifies a default symbol set suitable for both presentation and analysis. This is achieved by using popular symbols with preattentive differences. Supplemental materials for this article are available

Correction J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-06-26
(2019). Correction. Journal of Computational and Graphical Statistics: Vol. 28, No. 4, pp. ii.

Correction J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-09-06
(2019). Correction. Journal of Computational and Graphical Statistics: Vol. 28, No. 4, pp. iiii.

Correction J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-12-23
(2019). Correction. Journal of Computational and Graphical Statistics: Vol. 28, No. 4, pp. 1017-1017.

MM Algorithms For Variance Components Models. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-10-09
Hua Zhou; Liuyi Hu; Jin Zhou; Kenneth Lange
Variance components estimation and mixed model analysis are central themes in statistics with applications in numerous scientific disciplines. Despite the best efforts of generations of statisticians and numerical analysts, maximum likelihood estimation and restricted maximum likelihood estimation of variance component models remain numerically challenging. Building on the minorization-maximization

Efficient algorithms for Bayesian Nearest Neighbor Gaussian Processes. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-09-24
Andrew O Finley; Abhirup Datta; Bruce C Cook; Douglas C Morton; Hans E Andersen; Sudipto Banerjee
We consider alternate formulations of recently proposed hierarchical Nearest Neighbor Gaussian Process (NNGP) models (Datta et al., 2016a) for improved convergence, faster computing time, and more robust and reproducible Bayesian inference. Algorithms are defined that improve CPU memory management and exploit existing high-performance numerical linear algebra libraries. Computational and inferential

Partition Weighted Approach for Estimating the Marginal Posterior Density with Applications. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-07-03
Yu-Bo Wang; Ming-Hui Chen; Lynn Kuo; Paul O Lewis
The computation of marginal posterior density in Bayesian analysis is essential in that it can provide complete information about parameters of interest. Furthermore, the marginal posterior density can be used for computing Bayes factors, posterior model probabilities, and diagnostic measures. The conditional marginal density estimator (CMDE) is theoretically the best for marginal density estimation

Likelihood Inference for Large Scale Stochastic Blockmodels with Covariates based on a Divide-and-Conquer Parallelizable Algorithm with Communication. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-02-27
Sandipan Roy; Yves Atchadé; George Michailidis
We consider a stochastic blockmodel equipped with node covariate information, that is helpful in analyzing social network data. The key objective is to obtain maximum likelihood estimates of the model parameters. For this task, we devise a fast, scalable Monte Carlo EM type algorithm based on case-control approximation of the log-likelihood coupled with a subsampling approach. A key feature of the

A Modified Random Survival Forests Algorithm for High Dimensional Predictors and Self-Reported Outcomes. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-02-16
Hui Xu; Xiangdong Gu; Mahlet G Tadesse; Raji Balasubramanian
We present an ensemble tree-based algorithm for variable selection in high dimensional datasets, in settings where a time-to-event outcome is observed with error. The proposed methods are motivated by self-reported outcomes collected in large-scale epidemiologic studies, such as the Women's Health Initiative. The proposed methods equally apply to imperfect outcomes that arise in other settings such

Fast Bootstrap Confidence Intervals for Continuous Threshold Linear Regression. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-02-13
Youyi Fong
Continuous threshold regression is a common type of nonlinear regression that is attractive to many practitioners for its easy interpretability. More widespread adoption of threshold regression faces two challenges: (i) the computational complexity of fitting threshold regression models and (ii) obtaining correct coverage of confidence intervals under model misspecification. Both challenges result

Rank Conditional Coverage and Confidence Intervals in High-Dimensional Problems. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-02-12
Jean Morrison; Noah Simon
Confidence interval procedures used in low dimensional settings are often inappropriate for high dimensional applications. When many parameters are estimated, marginal confidence intervals associated with the most significant estimates have very low coverage rates: They are too small and centered at biased estimates. The problem of forming confidence intervals in high dimensional settings has previously

Algorithms for Fitting the Constrained Lasso. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2019-01-09
Brian R Gaines; Juhyun Kim; Hua Zhou
We compare alternative computing strategies for solving the constrained lasso problem. As its name suggests, the constrained lasso extends the widely-used lasso to handle linear constraints, which allow the user to incorporate prior information into the model. In addition to quadratic programming, we employ the alternating direction method of multipliers (ADMM) and also derive an efficient solution

Interactive Visualization of Hierarchically Structured Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-11-13
Kris Sankaran; Susan Holmes
We introduce methods for visualization of data structured along trees, especially hierarchically structured collections of time series. To this end, we identify questions that often emerge when working with hierarchical data and provide an R package to simplify their investigation. Our key contribution is the adaptation of the visualization principles of focus-plus-context and linking to the study

Multiresolution Network Models. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-11-05
Bailey K Fosdick; Tyler H McCormick; Thomas Brendan Murphy; Tin Lok James Ng; Ted Westling
Many existing statistical and machine learning tools for social network analysis focus on a single level of analysis. Methods designed for clustering optimize a global partition of the graph, whereas projection-based approaches (e.g., the latent space model in the statistics literature) represent in rich detail the roles of individuals. Many pertinent questions in sociology and economics, however,

Tensor-on-tensor regression. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-10-20
Eric F Lock
We propose a framework for the linear prediction of a multiway array (i.e., a tensor) from another multiway array of arbitrary dimension, using the contracted tensor product. This framework generalizes several existing approaches, including methods to predict a scalar outcome from a tensor, a matrix from a matrix, or a tensor from a scalar. We describe an approach that exploits the multiway structure

Optimal Designs for Multi-Response Nonlinear Regression Models With Several Factors via Semidefinite Programming. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-08-20
Weng Kee Wong; Yue Yin; Julie Zhou
We use semidefinite programming (SDP) to find a variety of optimal designs for multi-response linear models with multiple factors, and for the first time, extend the methodology to find optimal designs for multi-response nonlinear models and generalized linear models with multiple factors. We construct transformations that (i) facilitate improved formulation of the optimal design problems into SDP

Computationally Efficient Estimation for the Generalized Odds Rate Mixture Cure Model with Interval-Censored Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-06-05
Jie Zhou; Jiajia Zhang; Wenbin Lu
For semiparametric survival models with interval censored data and a cure fraction, it is often difficult to derive nonparametric maximum likelihood estimation due to the challenge in maximizing the complex likelihood function. In this paper, we propose a computationally efficient EM algorithm, facilitated by a gamma-Poisson data augmentation, for maximum likelihood estimation in a class of generalized

Additive Function-on-Function Regression. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-05-22
Janet S Kim; Ana-Maria Staicu; Arnab Maity; Raymond J Carroll; David Ruppert
We study additive function-on-function regression where the mean response at a particular time point depends on the time point itself, as well as the entire covariate trajectory. We develop a computationally efficient estimation methodology based on a novel combination of spline bases with an eigenbasis to represent the trivariate kernel function. We discuss prediction of a new response trajectory

Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-05-01
Min Lu; Saad Sadiq; Daniel J Feaster; Hemant Ishwaran
Estimation of individual treatment effect in observational data is complicated due to the challenges of confounding and selection bias. A useful inferential framework to address this is the counterfactual (potential outcomes) model, which takes the hypothetical stance of asking what if an individual had received both treatments. Making use of random forests (RF) within the counterfactual framework

One-Step Generalized Estimating Equations with Large Cluster Sizes. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-02-10
Stuart Lipsitz; Garrett Fitzmaurice; Debajyoti Sinha; Nathanael Hevelone; Jim Hu; Louis L Nguyen
Medical studies increasingly involve a large sample of independent clusters, where the cluster sizes are also large. Our motivating example from the 2010 Nationwide Inpatient Sample (NIS) has 8,001,068 patients and 1049 clusters, with average cluster size of 7627. Consistent parameter estimates can be obtained naively assuming independence, which are inefficient when the intracluster correlation (ICC)

Superheat: An R package for creating beautiful and extendable heatmaps for visualizing complex data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2018-01-01
Rebecca L Barter; Bin Yu
The technological advancements of the modern era have enabled the collection of huge amounts of data in science and beyond. Extracting useful information from such massive datasets is an ongoing challenge as traditional data visualization tools typically do not scale well in high-dimensional settings. An existing visualization technique that is particularly well suited to visualizing large datasets

Penalized nonparametric scalar-on-function regression via principal coordinates. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-12-09
Philip T Reiss; David L Miller; Pei-Shien Wu; Wen-Yu Hua
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined

Identifying Mixtures of Mixtures Using Bayesian Estimation. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-06-20
Gertraud Malsiner-Walli; Sylvia Frühwirth-Schnatter; Bettina Grün
The use of a finite mixture of normal distributions in model-based clustering allows us to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures

Regression Models For Multivariate Count Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-03-30
Yiwen Zhang; Hua Zhou; Jin Zhou; Wei Sun
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of overdispersion and complicated correlation structures

Efficient computation of the joint sample frequency spectra for multiple populations. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-02-28
John A Kamm; Jonathan Terhorst; Yun S Song
A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple

Bayesian Model Assessment in Joint Modeling of Longitudinal and Survival Data with Applications to Cancer Clinical Trials. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-02-28
Danjie Zhang; Ming-Hui Chen; Joseph G Ibrahim; Mark E Boye; Wei Shen
Joint models for longitudinal and survival data are routinely used in clinical trials or other studies to assess a treatment effect while accounting for longitudinal measures such as patient-reported outcomes (PROs). In the Bayesian framework, the deviance information criterion (DIC) and the logarithm of the pseudo marginal likelihood (LPML) are two well-known Bayesian criteria for comparing joint

Fused Lasso Additive Model. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-02-28
Ashley Petersen; Daniela Witten; Noah Simon
We consider the problem of predicting an outcome variable using p covariates that are measured on n independent observations, in a setting in which additive, flexible, and interpretable fits are desired. We propose the fused lasso additive model (FLAM), in which each additive function is estimated to be piecewise constant with a small number of adaptively-chosen knots. FLAM is the solution to a convex

Accelerated Path-following Iterative Shrinkage Thresholding Algorithm with Application to Semiparametric Graph Estimation. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-01-31
Tuo Zhao; Han Liu
We propose an accelerated path-following iterative shrinkage thresholding algorithm (APISTA) for solving high dimensional sparse nonconvex learning problems. The main difference between APISTA and the path-following iterative shrinkage thresholding algorithm (PISTA) is that APISTA exploits an additional coordinate descent subroutine to boost the computational performance. Such a modification, though

Bayesian Variable Selection on Model Spaces Constrained by Heredity Conditions. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-01-14
Daniel Taylor-Rodriguez; Andrew Womack; Nikolay Bliznyuk
This paper investigates Bayesian variable selection when there is a hierarchical dependence structure on the inclusion of predictors in the model. In particular, we study the type of dependence found in polynomial response surfaces of orders two and higher, whose model spaces are required to satisfy weak or strong heredity conditions. These conditions restrict the inclusion of higher-order terms depending

A Robust Model-Free Feature Screening Method for Ultrahigh-Dimensional Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-01-01
Jingnan Xue; Faming Liang
Feature screening plays an important role in dimension reduction for ultrahigh-dimensional data. In this paper, we introduce a new feature screening method and establish its sure independence screening property under the ultrahigh-dimensional setting. The proposed method works based on the nonparanormal transformation and Henze-Zirkler's test; that is, it first transforms the response variable and

Efficient Data Augmentation for Fitting Stochastic Epidemic Models to Prevalence Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-01-01
Jonathan Fintzi; Xiang Cui; Jon Wakefield; Vladimir N Minin
Stochastic epidemic models describe the dynamics of an epidemic as a disease spreads through a population. Typically, only a fraction of cases are observed at a set of discrete times. The absence of complete information about the time evolution of an epidemic gives rise to a complicated latent variable problem in which the state space size of the epidemic grows large as the population size increases

Sequential Co-Sparse Factor Regression. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-01-01
Aditya Mishra; Dipak K Dey; Kun Chen
In multivariate regression models, a sparse singular value decomposition of the regression component matrix is appealing for reducing dimensionality and facilitating interpretation. However, the recovery of such a decomposition remains very challenging, largely due to the simultaneous presence of orthogonality constraints and co-sparsity regularization. By delving into the underlying statistical data

Formal Hypothesis Tests for Additive Structure in Random Forests. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2017-01-01
Lucas Mentch; Giles Hooker
While statistical learning methods have proved powerful tools for predictive modeling, the black-box nature of the models they produce can severely limit their interpretability and the ability to conduct formal inference. However, the natural structure of ensemble learners like bagged trees and random forests has been shown to admit desirable asymptotic properties when base learners are built with

Reinforced Angle-based Multicategory Support Vector Machines. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-11-29
Chong Zhang; Yufeng Liu; Junhui Wang; Hongtu Zhu
The Support Vector Machine (SVM) is a very popular classification tool with many successful applications. It was originally designed for binary problems with desirable theoretical properties. Although there exist various Multicategory SVM (MSVM) extensions in the literature, some challenges remain. In particular, most existing MSVMs make use of k classification functions for a k-class problem, and

Parameter Expanded Algorithms for Bayesian Latent Variable Modeling of Genetic Pleiotropy Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-10-19
Lizhen Xu; Radu V Craiu; Lei Sun; Andrew D Paterson
Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering

Laplace Variational Approximation for Semiparametric Regression in the Presence of Heteroskedastic Errors. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-09-27
Bruce D Bugbee; F Jay Breidt; Mark J van der Woerd
Variational approximations provide fast, deterministic alternatives to Markov Chain Monte Carlo for Bayesian inference on the parameters of complex, hierarchical models. Variational approximations are often limited in practicality in the absence of conjugate posterior distributions. Recent work has focused on the application of variational methods to models with only partial conjugacy, such as in semiparametric

Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-08-16
Leo L Duan; John P Clancy; Rhonda D Szczesniak
We propose a novel "tree-averaging" model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the

Computational Aspects of Optional Pólya Tree. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-05-25
Hui Jiang; John Chong Mu; Kun Yang; Chao Du; Luo Lu; Wing Hung Wong
Optional Pólya tree (OPT) is a flexible nonparametric Bayesian prior for density estimation. Despite its merits, the computation for OPT inference is challenging. In this paper we present time complexity analysis for OPT inference and propose two algorithmic improvements. The first improvement, named limited-lookahead optional Pólya tree (LLOPT), aims at accelerating the computation for OPT inference

Covariance Partition Priors: A Bayesian Approach to Simultaneous Covariance Estimation for Longitudinal Data. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-05-14
J T Gaskins; M J Daniels
The estimation of the covariance matrix is a key concern in the analysis of longitudinal data. When data consists of multiple groups, it is often assumed the covariance matrices are either equal across groups or are completely distinct. We seek methodology to allow borrowing of strength across potentially similar groups to improve estimation. To that end, we introduce a covariance partition prior which

ConvexLAR: An Extension of Least Angle Regression. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-04-27
Wei Xiao; Yichao Wu; Hua Zhou
The least angle regression (LAR) was proposed by Efron, Hastie, Johnstone and Tibshirani (2004) for continuous model selection in linear regression. It is motivated by a geometric argument and tracks a path along which the predictors enter successively and the active predictors always maintain the same absolute correlation (angle) with the residual vector. Although it gains popularity quickly, its

Gene regulation network inference with joint sparse Gaussian graphical models. J. Comput. Graph. Stat. (IF 1.882) Pub Date: 2016-02-10
Hyonho Chun; Xianghua Zhang; Hongyu Zhao
Revealing biological networks is one key objective in systems biology. With microarrays, researchers now routinely measure expression profiles at the genome level under various conditions, and, such data may be utilized to statistically infer gene regulation networks. Gaussian graphical models (GGMs) have proven useful for this purpose by modeling the Markovian dependence among genes. However, a single