A new classification tree method with interaction detection capability
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-27
Ahhyoun Kim, Hyunjoong Kim
A new classification tree algorithm is presented. It has a novel variable selection algorithm that can effectively detect interactions. The algorithm uses a lookahead approach that considers not only the significance at the current node, but also the significance at child nodes to detect the interaction. It is also different from other classification tree methods in that it finds the splitting point

Bayesian model selection for high-dimensional Ising models, with applications to educational data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-27
Jaewoo Park, Ick Hoon Jin, Michael Schweinberger
Doubly-intractable posterior distributions arise in many applications of statistics concerned with discrete and dependent data, including physics, spatial statistics, machine learning, the social sciences, and other fields. A specific example is psychometrics, which has adapted high-dimensional Ising models from machine learning, with a view to studying the interactions among binary item responses

Deep learning for quantile regression under right censoring: DeepQuantreg
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-27
Yichen Jia, Jong-Hyeon Jeong
Neural networks, or deep learning, have recently drawn much attention as a computational prediction approach in statistics as well as in image recognition and natural language processing. Particularly in statistical applications for censored survival data, the loss function used for optimization has been mainly based on the partial likelihood from Cox's model and its variations to utilize existing

Bootstrapping multivariate portmanteau tests for vector autoregressive models with weak assumptions on errors
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-27
Muyi Li, Yanfen Zhang
This article discusses diagnostic checking for vector autoregressive models with uncorrelated but not independent innovations. In this situation, the multivariate portmanteau tests are severely oversized due to the misspecification of critical values obtained from the χ² distribution. To address this issue, a random weighting bootstrap procedure is proposed to approximate the null distribution when

Robust regression with compositional covariates
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-15
Aditya Mishra, Christian L. Müller
Many biological high-throughput datasets, such as targeted amplicon-based and metagenomic sequencing data, are compositional. A common exploratory data analysis task is to infer robust statistical associations between high-dimensional microbial compositions and habitat or host-related covariates. To address this, a general robust statistical regression framework RobRegCC (Robust Regression with Compositional

Optimal designs for order-of-addition experiments
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-15
Yuna Zhao, Dennis K.J. Lin, Min-Qian Liu
Order-of-addition (OofA) designs have received significant attention in recent years. It is of great interest to seek efficient fractional OofA designs, especially when the number of components is large. It has been recognized that constructing efficient fractional OofA designs is challenging. A systematic construction method for a class of efficient fractional OofA designs, called

Efficient estimation in a partially specified nonignorable propensity score model
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-21
Mengyan Li, Yanyuan Ma, Jiwei Zhao
Consider the regression setting where the response variable is subject to missing data and the covariates are fully observed. A nonignorable propensity score model, i.e., the probability that the response is observed conditional on all variables depends on the missing values themselves, is assumed throughout the paper. In such problems, model misspecification and model identifiability are two critical

A more powerful test of equality of high-dimensional two-sample means
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-12
Huaiyu Zhang, Haiyan Wang
A new test is proposed for testing the equality of two sample means in high dimensional data in which the sample sizes may be much less than the dimension. The test is constructed based on a studentized average of squared componentwise t-statistics. Asymptotic normality of the test statistic was derived under H₀. Theoretical properties of the power function were given under local alternatives. The

Inference for partially observed epidemic dynamics guided by Kalman filtering techniques
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-12
Romain Narci, Maud Delattre, Catherine Larédo, Elisabeta Vergu
Despite the recent development of methods dealing with partially observed epidemic dynamics (unobserved model coordinates, discrete and noisy outbreak data), limitations remain in practice, mainly related to the quantity of augmented data and calibration of numerous tuning parameters. In particular, as coordinates of dynamic epidemic models are coupled, the presence of unobserved coordinates leads

Correlation for tree-shaped datasets and its Bayesian estimation
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-07-05
Shanjun Mao, Xiaodan Fan, Jie Hu
Tree-shaped datasets have arisen in various research and industrial fields, such as gene expression data measured on a cell lineage tree and information spreading on tree-shaped paths. A correlation measure between two tree-shaped datasets, i.e., how the values increase or decrease together along corresponding paths of the two trees, is desired; but the tree topology prohibits the use of classical

Graph informed sliced inverse regression
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-29
Eugen Pircalabelu, Andreas Artemiou
A new method is developed for performing sufficient dimension reduction when probabilistic graphical models are being used to estimate parameters. The procedure enriches the domain of application of dimension reduction techniques to settings where (i) p, the number of variables in the model, is much larger than the available sample size n, (ii) p is much larger than the number of slices H the model uses

Projection-averaging-based cumulative covariance and its use in goodness-of-fit testing for single-index models
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-18
Kai Xu, Yeqing Zhou
A projection-averaging-based cumulative divergence to characterize the conditional mean independence is proposed. As a natural extension of Zhou et al. (2020), the new metric has several appealing features. It ranges from zero to one, and equals zero if and only if the conditional mean independence holds. It has an elegant closed-form expression that involves no tuning parameters, making it easy to

Semiparametric least-squares regression with doubly-censored data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-24
Taehwa Choi, Arlene K.H. Kim, Sangbum Choi
Double censoring often occurs in biomedical research, such as HIV/AIDS clinical trials, when an outcome of interest is subject to both left censoring and right censoring. It can also be seen as a mixture of exact and current status data and has long been investigated by several authors for theoretical and practical purposes. In this article, we propose the Buckley-James method for an accelerated failure

Outlier detection in networks with missing links
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-25
Solenne Gaucher, Olga Klopp, Geneviève Robin
Outliers arise in networks for different reasons, such as fraudulent behaviour of malicious users or faults in measurement instruments, and can significantly impair network analyses. In addition, real-life networks are likely to be incompletely observed, with missing links due to individual nonresponse or machine failures. Therefore, identifying outliers in the presence of missing links is a crucial

On efficient exact experimental designs for ordered treatments
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-23
Satya Prakash Singh, Ori Davidov
In a recent paper, Singh and Davidov (2019) derive approximate optimal designs for experiments with ordered treatments. Specifically, maxi–min and intersection–union designs were explored. These designs, which address different types of hypothesis testing problems, provide a substantial improvement over standard designs in terms of power, or equivalently, sample size requirements. In practice however

Equivalence class selection of categorical graphical models
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-18
Federico Castelletti, Stefano Peluso
Learning the structure of dependence relations between variables is a pervasive issue in the statistical literature. A directed acyclic graph (DAG) can represent a set of conditional independencies, but different DAGs may encode the same set of relations and are indistinguishable using observational data. Equivalent DAGs can be collected into classes, each represented by a partially directed graph

Covariate balancing functional propensity score for functional treatments in cross-sectional observational studies
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-11
Xiaoke Zhang, Wu Xue, Qiyue Wang
Functional data analysis, which handles data arising from curves, surfaces, volumes, manifolds and beyond in a variety of scientific fields, has been a rapidly developing area in modern statistics and data science in recent decades. The effect of a functional variable on an outcome is an essential theme in functional data analysis, but a majority of related studies are restricted to correlational effects

Active set algorithms for estimating shape-constrained density ratios
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-10
Lutz Dümbgen, Alexandre Mösching, Christof Strähl
In many instances, imposing a constraint on the shape of a density is a reasonable and flexible assumption. It offers an alternative to parametric models, which can be too rigid, and to other nonparametric methods, which require the choice of tuning parameters. The nonparametric estimation of log-concave or log-convex density ratios is treated by means of active set algorithms in a unified framework

Categorical CVA biplots
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-09
D.T. Rodwell, C.J. van der Merwe, S. Gardner-Lubbe
Techniques to visualise and understand large amounts of data are of paramount importance. In most settings, this data is multivariate, which further stresses the need for effective visualisation techniques. Multivariate visualisation techniques such as canonical variate analysis (CVA) biplots allow for simultaneous lower-dimensional visualisation and data classification by incorporating class-specific

Assessing dynamic effects on a Bayesian matrix-variate dynamic linear model: An application to task-based fMRI data analysis
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-07
Johnatan Cardona Jiménez, Carlos A. de B. Pereira
A modeling procedure for task-based functional magnetic resonance imaging (fMRI) data analysis using a Bayesian matrix-variate dynamic linear model (MVDLM) is presented. With this type of model, less complex than the more traditional temporal-spatial models, it is possible to take into account the temporal and, at least locally, the spatial structures that are usually present in this type of data.

Multimodal Bayesian registration of noisy functions using Hamiltonian Monte Carlo
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-08
J. Derek Tucker, Lyndsay Shand, Kenny Chowdhary
Functional data registration is a necessary processing step for many applications. The observed data can be inherently noisy, often due to measurement error or natural process uncertainty, which most functional alignment methods cannot handle. A pair of functions can also have multiple optimal alignment solutions, which is not addressed in the current literature. In this paper, a flexible Bayesian approach

Two-sample high dimensional mean test based on prepivots
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-02
Santu Ghosh, Deepak Nag Ayyala, Rafael Hellebuyck
Testing equality of mean vectors is a very commonly used criterion when comparing two multivariate random variables. Traditional tests such as Hotelling's T² become either unusable or have low power when the number of variables is greater than the combined sample size. A novel method is proposed using both prepivoting and Edgeworth expansion for testing the equality of two population mean vectors

Feature filter for estimating central mean subspace and its sparse solution
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-26
Pei Wang, Xiangrong Yin, Qingcong Yuan, Richard Kryscio
Sufficient dimension reduction, replacing the original predictors with a few linear combinations while keeping all the regression information, has been widely studied. A key goal is to find the central mean subspace, the intersection of all subspaces that provide such a reduction. To this end, a new sufficient dimension reduction method is proposed, with two estimation procedures, through a novel approach

A stochastic block model approach for the analysis of multilevel networks: An application to the sociology of organizations
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-06-01
Saint-Clair Chabert-Liddell, Pierre Barbillon, Sophie Donnet, Emmanuel Lazega
A multilevel network is defined as the junction of two interaction networks, one level representing the interactions between individuals and the other the interactions between organizations. The levels are linked by an affiliation relationship, each individual belonging to a unique organization. A new Stochastic Block Model is proposed as a unified probabilistic framework tailored for multilevel networks

Assessing the effective sample size for large spatial datasets: A block likelihood approach
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-21
Jonathan Acosta, Alfredo Alegría, Felipe Osorio, Ronny Vallejos
The development of new techniques for sample size reduction has attracted growing interest in recent decades. Recent findings allow us to quantify the amount of duplicated information within a sample of spatial data through the so-called effective sample size (ESS), whose definition arises from the Fisher information that is associated with maximum likelihood estimation. However, in all circumstances

Fast multivariate empirical cumulative distribution function with connection to kernel density estimation
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-13
Nicolas Langrené, Xavier Warin
The problem of computing empirical cumulative distribution functions (ECDF) efficiently on large, multivariate datasets is revisited. Computing an ECDF at one evaluation point requires O(N) operations on a dataset composed of N data points. Therefore, a direct evaluation of ECDFs at N evaluation points requires a quadratic number, O(N²), of operations, which is prohibitive for large-scale problems. Two fast and
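
The quadratic cost mentioned in this abstract can be illustrated with a minimal sketch of the direct (naive) evaluation. This is not the fast method the authors propose; the function name `ecdf_direct` and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def ecdf_direct(data, points):
    """Naive multivariate ECDF: fraction of samples <= each point in every coordinate.

    Hypothetical helper for illustration only. For N samples and M evaluation
    points in d dimensions it performs N * M * d comparisons, so M = N gives
    the quadratic cost described in the abstract.
    """
    data = np.asarray(data, dtype=float)      # shape (N, d)
    points = np.asarray(points, dtype=float)  # shape (M, d)
    # below[j, i] is True when sample i is <= evaluation point j in all coordinates
    below = np.all(data[None, :, :] <= points[:, None, :], axis=2)
    return below.mean(axis=1)

# Toy data: four samples in two dimensions, evaluated at the samples themselves
samples = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
values = ecdf_direct(samples, samples)
print(values)
```

The broadcast builds an M × N × d boolean array, so this sketch is only practical for small inputs, which is exactly the limitation that motivates faster algorithms.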

Fitting jump additive models
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-13
Yicheng Kang, Yueyong Shi, Yuling Jiao, Wendong Li, Dongdong Xiang
Jump regression analysis (JRA) provides a useful tool for estimating discontinuous functional relationships between a response and predictors. Most existing JRA methods consider problems where there are only one or two predictors. It is unclear whether these methods can be directly extended to cases where there are multiple predictors. A jump additive model and a jump-preserving backfitting procedure

Distributed one-step upgraded estimation for non-uniformly and non-randomly distributed data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-29
Feifei Wang, Yingqiu Zhu, Danyang Huang, Haobo Qi, Hansheng Wang
One-shot-type (or divide-and-conquer) estimators have been widely used for distributed statistical analysis. However, their outstanding statistical efficiency hinges on two critical conditions. The first is the uniformity condition, which requires that the sample sizes allocated to different Workers should be as comparable as possible. The second one is the randomness condition, which requires that

Fast and scalable computations for Gaussian hierarchical models with intrinsic conditional autoregressive spatial random effects
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-29
Marco A.R. Ferreira, Erica M. Porter, Christopher T. Franck
Fast algorithms are developed for Bayesian analysis of Gaussian hierarchical models with intrinsic conditional autoregressive (ICAR) spatial random effects. To achieve computational speedups, first a result is proved on the equivalence between the use of an improper CAR prior with centering on the fly and the use of a sum-zero constrained ICAR prior. This equivalence result then provides the key insight

A motif building process for simulating random networks
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-28
Alan M. Polansky, Paramahansa Pramanik
A simple stochastic process is described which provides a useful basis for generating some types of random networks. The process is based on an iterative building block technique that uses a motif profile as a conditional probability model. The conditional iterative form of the algorithm ensures that the calculations required to simulate an observed random network are relatively simple and does not

Bayesian subgroup analysis in regression using mixture models
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-21
Yunju Im, Aixin Tan
Heterogeneity occurs in many regression problems, where members from different latent subgroups respond differently to the covariates of interest (e.g., treatments) even after adjusting for other covariates. A Bayesian model called the mixture of finite mixtures (MFM) can be used to identify these subgroups, a key feature of which is that the number of subgroups is modeled as a random variable and

Gaussian Bayesian network comparisons with graph ordering unknown
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Hongmei Zhang, Xianzheng Huang, Shengtong Han, Faisal I. Rezwan, Wilfried Karmaus, Hasan Arshad, John W. Holloway
A Bayesian approach is proposed that unifies Gaussian Bayesian network constructions and comparisons between two networks (identical or differential) for data with graph ordering unknown. When sampling graph ordering, to escape from local maxima, an adjusted single queue equi-energy algorithm is applied. The conditional posterior probability mass function for network differentiation is derived and

SIMEX estimation in parametric modal regression with measurement error
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Jianhong Shi, Yujing Zhang, Ping Yu, Weixing Song
For a class of parametric modal regression models with measurement error, a simulation extrapolation estimation procedure is proposed for estimating the modal regression coefficients. Large sample properties of the proposed estimation procedure, including consistency and asymptotic normality, are thoroughly investigated. Simulation studies are conducted to evaluate its robustness

Estimation of high dimensional factor model with multiple threshold-type regime shifts
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Jianhong Wu
This paper considers the estimation of a high dimensional factor model with multiple threshold-type regime shifts in factor loadings. Firstly, the number of thresholds is determined by comparing the number of factors in the adjacent subintervals. Secondly, the thresholds are estimated one by one by concentrated least squares, and then the factors and loadings are obtained by the principal component

A mapping-based universal Kriging model for order-of-addition experiments in drug combination studies
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Qian Xiao, Hongquan Xu
In modern pharmaceutical studies, treatments may include several drugs added sequentially, and the drugs' order-of-addition can have significant impacts on their efficacy. In practice, experiments enumerating all possible drug sequences are often not affordable, and appropriate statistical models which can accurately predict all cases using only a small number of experimental trials are required

Fast Bayesian estimation of spatial count data models
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Prateek Bansal, Rico Krueger, Daniel J. Graham
Spatial count data models are used to explain and predict the frequency of phenomena such as traffic accidents in geographically distinct entities such as census tracts or road segments. These models are typically estimated using Bayesian Markov chain Monte Carlo (MCMC) simulation methods, which, however, are computationally expensive and do not scale well to large datasets. Variational Bayes (VB)

Communication-efficient distributed estimator for generalized linear models with a diverging number of covariates
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Ping Zhou, Zhen Yu, Jingyi Ma, Maozai Tian, Ye Fan
Distributed statistical inference has recently attracted immense attention. The asymptotic efficiency of the maximum likelihood estimator (MLE), the one-step MLE, and the aggregated estimating equation estimator is established for generalized linear models under the "large $n$, diverging $p_n$" framework, where the dimension of the covariates $p_n$ grows to infinity at a polynomial rate $o(n^\alpha)$

Support vector subset scan for spatial pattern detection
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Dylan Fitzpatrick, Yun Ni, Daniel B. Neill
Discovery of localized and irregularly shaped anomalous patterns in spatial data provides useful context for operational decisions across many policy domains. The support vector subset scan (SVSS) integrates the penalized fast subset scan with a kernel support vector machine classifier to accurately detect spatial clusters without imposing hard constraints on the shape or size of the pattern

Efficient inference for stochastic differential equation mixed-effects models using correlated particle pseudo-marginal algorithms
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Samuel Wiqvist, Andrew Golightly, Ashleigh T. McLean, Umberto Picchini
We perform fully Bayesian inference for stochastic differential equation mixed-effects models (SDEMEMs) using data at discrete times that may be incomplete and subject to measurement error. SDEMEMs are flexible hierarchical models that are able to account for random variability inherent in the underlying time-dynamics, as well as the variability between experimental units and, optionally, account for

Compromise design for combination experiment of two drugs
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Hengzhen Huang, Xueping Chen
A preclinical experiment on a two-drug combination is a stepping stone to multi-drug combination studies. Experimental designs have been proposed in the literature to test the presence of synergism between the combined drugs. However, a design that is efficient for synergy testing is not necessarily desirable for dose–response modeling and the latter is important for future development on drug

Robustness of cost-effectiveness analyses of cluster randomized trials assuming bivariate normality against skewed cost data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Md Abu Manju, Math J.J.M. Candel, Gerard J.P. van Breukelen
The bivariate normal multilevel model (MLM) provides a flexible modelling framework for cost-effectiveness analyses (CEAs) alongside cluster randomized trials (CRTs) as well as for sample size calculations of these trials. The bivariate MLM assumes a joint normal distribution for effects and costs, both within (individual level) and between (cluster level) clusters. A typical problem in CEAs

Embedding and learning with signatures
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Adeline Fermanian
Sequential and temporal data arise in many fields of research, such as quantitative finance, medicine, or computer vision. The present article is concerned with a novel approach for sequential learning, called the signature method, which is rooted in rough path theory. Its basic principle is to represent multidimensional paths by a graded feature set of their iterated integrals, called the signature. This

A novel method of marginalisation using low discrepancy sequences for integrated nested Laplace approximations
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Paul T. Brown, Chaitanya Joshi, Stephen Joe, Håvard Rue
Recently, it has been shown that approximations to marginal posterior distributions obtained using a low discrepancy sequence (LDS) can outperform standard grid-based methods with respect to both accuracy and computational efficiency. This recent method, which we will refer to as LDS-StM, can also produce good approximations to multimodal posteriors. However, implementation of LDS-StM into integrated

A nonparametric test for comparing conditional ROC curves
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Arís Fanjul-Hevia, Wenceslao González-Manteiga, Juan Carlos Pardo-Fernández
Comparing the accuracy and the behaviour of different diagnostic procedures is one of the main objectives of Receiver Operating Characteristic (ROC) curve analysis. Along with the diagnostic variables it is usual to observe other covariates, but that extra information has hardly ever been considered when comparing curves of this kind. A new nonparametric test is proposed for the

Fast inference for semi-varying coefficient models via local averaging
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Heng Peng, Chuanlong Xie, Jingxin Zhao
Semi-varying coefficient models are widely used in finance, economics, medical science and many other areas. In general, the functional coefficients are estimated by local smoothing methods, e.g. the local linear estimator. The computation cost is therefore severe, because the value of each coefficient function must be estimated pointwise. In this paper, we give an insight into

Semiparametric quantile regression using family of quantile-based asymmetric densities
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Irène Gijbels, Rezaul Karim, Anneleen Verhasselt
Quantile regression is an important tool in data analysis. Linear regression, or more generally, parametric quantile regression often imposes overly restrictive assumptions. Nonparametric regression avoids making distributional assumptions, but might have the disadvantage of not exploiting distributional modelling elements that might be brought in. A semiparametric approach towards estimating

Kendall regression coefficient
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Eckhard Liebscher
A new multivariate extension of Kendall's dependence coefficient tailored for use in regression analysis is introduced. This coefficient is called the Kendall regression coefficient and indicates how well the response variable can be approximated by a strictly increasing function of the regressor (predictor) variables. The properties of this coefficient are examined. In the second part the empirical

Generalized co-sparse factor regression
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-05-01
Aditya Mishra, Dipak K. Dey, Yong Chen, Kun Chen
Multivariate regression techniques are commonly applied to explore the associations between large numbers of outcomes and predictors. In real-world applications, the outcomes are often of mixed types, including continuous measurements, binary indicators, and counts, and the observations may also be incomplete. Building upon the recent advances in mixed-outcome modeling and sparse matrix factorization

Bayesian multivariate latent class profile analysis: Exploring the developmental progression of youth depression and substance use
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-24
Jung Wun Lee, Hwan Chung, Saebom Jeon
Multivariate latent class profile analysis (MLCPA) is a useful tool for exploring the stage-sequential process of multiple latent class variables, but the inference can be challenging due to the high-dimensional latent structure of the model. In this paper, a Bayesian approach via Markov chain Monte Carlo (MCMC) is proposed for MLCPA as an alternative to the maximum-likelihood (ML) method. Compared

Robust communication-efficient distributed composite quantile regression and variable selection for massive data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-22
Kangning Wang, Shaomin Li, Benle Zhang
Statistical analysis of massive data is becoming more and more common. Distributed composite quantile regression (CQR) for massive data is proposed in this paper. Specifically, the global CQR loss function is approximated by a surrogate one on the first machine, which relates to the local data only through their gradients; the estimator is then obtained on the first machine by minimizing the surrogate

Communication-efficient distributed M-estimation with missing data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-16
Jianwei Shi, Guoyou Qin, Huichen Zhu, Zhongyi Zhu
In the big data era, practical applications often encounter incomplete data. Current distributed methods, which ignore missingness, may produce inconsistent estimates. Motivated by this, a distributed algorithm is developed for M-estimation with missing data. The proposed algorithm is communication-efficient, in that only gradient information is transferred to the central machine. The parameters of interest

Combining heterogeneous spatial datasets with process-based spatial fusion models: A unifying framework
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-20
Craig Wang, Reinhard Furrer
In modern spatial statistics, the structure of data has become more heterogeneous. Depending on the types of spatial data, different modeling strategies are used: for example, kriging approaches for geostatistical data, Gaussian Markov random field models for lattice data, or log Gaussian Cox process models for point-pattern data. Despite these different modeling choices, the nature of underlying data-generating

Harmless label noise and informative soft-labels in supervised classification
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-19
Daniel Ahfock, Geoffrey J. McLachlan
Manual labelling of training examples is common practice in supervised learning. When the labelling task is of nontrivial difficulty, the supplied labels may not be equal to the ground-truth labels, and label noise is introduced into the training dataset. If the manual annotation is carried out by multiple experts, the same training example can be given different class assignments by different experts

A Bayesian semiparametric vector Multiplicative Error Model
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-15
Nicola Donelli, Stefano Peluso, Antonietta Mira
Interactions among multiple time series of positive random variables are crucial in diverse financial applications, from spillover effects to volatility interdependence. A popular model in this setting is the vector Multiplicative Error Model (vMEM), which imposes a linear iterative structure on the dynamics of the conditional mean, perturbed by a multiplicative innovation term. A main limitation of vMEM

An ensemble of inverse moment estimators for sufficient dimension reduction
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-08
Qin Wang, Yuan Xue
Sufficient dimension reduction (SDR) is known to be a useful tool in data visualization and information retrieval for high dimensional data. Many well-known SDR approaches investigate the inverse conditional moments of the predictors given the response. Motivated by the idea of the aggregate dimension reduction, we propose an ensemble of inverse moment estimators to explore the central subspace. The

Fast Bayesian inference using Laplace approximations in nonparametric double additive location-scale models with right and interval-censored data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-14
Philippe Lambert
Penalized B-splines are commonly used in additive models to describe smooth changes in a response with quantitative covariates. This is usually done through the conditional mean in the exponential family using generalized additive models with an indirect impact on other conditional moments. Another common strategy is to focus on several low-order conditional moments, leaving the full conditional distribution

Bayes linear analysis for ordinary differential equations
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-03-24
Matthew Jones, Michael Goldstein, David Randell, Philip Jonathan
Differential equation models are used in a wide variety of scientific fields to describe the behaviour of physical systems. Commonly, solutions to given systems of differential equations are not available in closed form; in such situations, the solution to the system is generally approximated numerically. The numerical solution obtained will be systematically different from the (unknown) true solution

A class of Birnbaum–Saunders type kernel density estimators for nonnegative data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-13
Yoshihide Kakizawa
Nonparametric density estimation using a class of deformed skew Birnbaum–Saunders (BS) type kernels is suggested for nonnegative data. A remarkable feature of the new skew BS type kernel density estimators lies in their general formulation via an asymmetry parameter as well as a density generator. Mean integrated squared errors of the proposed estimators are investigated, together with strong consistency and asymptotic

Testing error heterogeneity in censored linear regression
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-03-31
Caiyun Fan, Wenbin Lu, Yong Zhou
In censored linear regression, a key assumption is that the error is independent of predictors. We develop an omnibus test to check error heterogeneity in censored linear regression. Our approach is based on testing the variance component in a working kernel machine regression model. The limiting null distribution of the proposed test statistic is shown to be a weighted sum of independent chi-squared

Generalized accelerated hazards mixture cure models with interval-censored data
Comput. Stat. Data Anal. (IF 1.681) Pub Date: 2021-04-14
Xiaoyu Liu, Liming Xiang
Existing semiparametric mixture cure models with interval-censored data often assume a survival model, such as the Cox proportional hazards model, proportional odds model, accelerated failure time model, or their transformations for the susceptible subjects. There are cases in practice where such conventional assumptions may be inappropriate for modeling survival outcomes of susceptible subjects. We