
-
A class of Birnbaum–Saunders type kernel density estimators for nonnegative data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-13 Yoshihide Kakizawa
Nonparametric density estimation using a class of deformed skew Birnbaum–Saunders (BS) type kernels is suggested for nonnegative data. A remarkable feature of the new skew BS type kernel density estimators lies in their general formulation via an asymmetry parameter as well as a density generator. Mean integrated squared errors of the proposed estimators are investigated, together with strong consistency and asymptotic
-
Testing error heterogeneity in censored linear regression Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-31 Caiyun Fan, Wenbin Lu, Yong Zhou
In censored linear regression, a key assumption is that the error is independent of predictors. We develop an omnibus test to check error heterogeneity in censored linear regression. Our approach is based on testing the variance component in a working kernel machine regression model. The limiting null distribution of the proposed test statistic is shown to be a weighted sum of independent chi-squared
-
Communication-efficient distributed M-estimation with missing data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-16 Jianwei Shi, Guoyou Qin, Huichen Zhu, Zhongyi Zhu
In the big data era, practical applications often encounter incomplete data. Current distributed methods, ignoring missingness, may cause inconsistent estimates. Motivated by that, a distributed algorithm is developed for M-estimation with missing data. The proposed algorithm is communication-efficient, where only gradient information is transferred to the central machine. The parameters of interest
-
A Bayesian semiparametric vector Multiplicative Error Model Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-15 Nicola Donelli, Antonietta Mira, Stefano Peluso
Interactions among multiple time series of positive random variables are crucial in diverse financial applications, from spillover effects to volatility interdependence. A popular model in this setting is the vector Multiplicative Error Model (vMEM) which poses a linear iterative structure on the dynamics of the conditional mean, perturbed by a multiplicative innovation term. A main limitation of vMEM
-
Generalized accelerated hazards mixture cure models with interval-censored data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-14 Xiaoyu Liu, Liming Xiang
Existing semiparametric mixture cure models with interval-censored data often assume a survival model, such as the Cox proportional hazards model, proportional odds model, accelerated failure time model, or their transformations for the susceptible subjects. There are cases in practice that such conventional assumptions may be inappropriate for modeling survival outcomes of susceptible subjects. We
-
Fast Bayesian inference using Laplace approximations in nonparametric double additive location-scale models with right- and interval-censored data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-14 Philippe Lambert
Penalized B-splines are commonly used in additive models to describe smooth changes in a response with quantitative covariates. This is usually done through the conditional mean in the exponential family using generalized additive models with an indirect impact on other conditional moments. Another common strategy is to focus on several low-order conditional moments, leaving the full conditional distribution
-
Copula Particle Filters Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-03 Carlos E. Rodríguez, Stephen G. Walker
A novel analysis of the state space model is presented. It is shown that by modifying the standard recursive update it is possible to apply a copula model to eliminate a particular integral, which is typically performed using importance sampling. With Bayesian models, copulas have recently been shown to provide predictive densities directly, avoiding integrals altogether. As in every particle filter
-
Testing the first-order separability hypothesis for spatio-temporal point patterns Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-07 Mohammad Ghorbani, Nafiseh Vafaei, Jiří Dvořák, Mari Myllymäki
First-order separability of a spatio-temporal point process plays a fundamental role in the analysis of spatio-temporal point pattern data. While it is often a convenient assumption that simplifies the analysis greatly, existing non-separable structures should be accounted for in the model construction. Three different tests are proposed to investigate this hypothesis as a step of preliminary data
-
Parallel integrative learning for large-scale multi-response regression with incomplete outcomes Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-07 Ruipeng Dong, Daoji Li, Zemin Zheng
Multi-task learning is increasingly used to investigate the association structure between multiple responses and a single set of predictor variables in many applications. In the era of big data, the coexistence of incomplete outcomes, large number of responses, and high dimensionality in predictors poses unprecedented challenges in estimation, prediction and computation. In this paper, we propose a
-
A kernel-based measure for conditional mean dependence Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-09 Tingyu Lai, Zhongzhan Zhang, Yafei Wang
A novel metric, called kernel-based conditional mean dependence (KCMD), is proposed to measure and test the departure from conditional mean independence between a response variable Y and a predictor variable X, based on the reproducing kernel embedding and the Hilbert-Schmidt norm of a tensor operator. The KCMD has several appealing merits. It equals zero if and only if the conditional mean of Y given
-
In the pursuit of sparseness: A new rank-preserving penalty for a finite mixture of factor analyzers Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-06 Nam-Hwui Kim, Ryan P. Browne
A finite mixture of factor analyzers is an effective method for achieving parsimony in model-based clustering. Introducing a penalization term for the factor loading can lead to sparse estimates. However, in the pursuit of sparseness, one can end up with rank-deficient solutions regardless of the number of factors assumed. In light of this issue, a new penalty-based method that can fit a finite mixture
-
Robust MAVE through nonconvex penalized regression Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-08 Jing Zhang, Qin Wang, D'Arcy Mays
High dimensionality has been a significant feature in modern statistical modeling. Sufficient dimension reduction (SDR) as an efficient tool aims at reducing the original high dimensional predictors without losing any regression information. Minimum average variance estimation (MAVE) is a popular approach in SDR among others. However, it is not robust to outliers in the response due to the use of least
-
An ensemble of inverse moment estimators for sufficient dimension reduction Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-08 Qin Wang, Yuan Xue
Sufficient dimension reduction (SDR) is known to be a useful tool in data visualization and information retrieval for high dimensional data. Many well-known SDR approaches investigate the inverse conditional moments of the predictors given the response. Motivated by the idea of the aggregate dimension reduction, we propose an ensemble of inverse moment estimators to explore the central subspace. The
-
Composite quantile regression for ultra-high dimensional semiparametric model averaging Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-30 Chaohui Guo, Jing Lv, Jibo Wu
To estimate the joint multivariate regression function, a robust ultra-high dimensional semiparametric model averaging approach is developed. Specifically, a three-stage estimation procedure is proposed. In the first step, the joint multivariate function can be approximated by a weighted average of one-dimensional marginal regression functions which can be estimated robustly by the composite quantile
-
Time stable empirical best predictors under a unit-level model Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-22 María Guadarrama, Domingo Morales, Isabel Molina
Comparability as well as stability over time are highly desirable properties of regularly published statistics, especially when they are related to important issues such as people’s living conditions. For instance, poverty statistics displaying drastic changes from one period to the next for the same area have low credibility. In fact, longitudinal surveys that collect information on the same phenomena
-
Marginal false discovery rate for a penalized transformation survival model Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-04-02 Weijuan Liang, Shuangge Ma, Cunjie Lin
Survival analysis that involves moderate/high dimensional covariates has become common. Most of the existing analyses have been focused on estimation and variable selection, using penalization and other regularization techniques. To draw more definitive conclusions, a handful of studies have also conducted inference. The recently developed mFDR (marginal false discovery rate) technique provides an
-
Bias-corrected Kullback–Leibler distance criterion based model selection with covariables missing at random Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-31 Yuting Wei, Qihua Wang, Xiaogang Duan, Jing Qin
A model selection problem for the conditional probability function of the response variable Y given the covariable vector (X,Z) is considered for the case where X is missing at random. Two novel model selection criteria are suggested. It is shown that the model selection by these two criteria is consistent and that the population parameter estimators, corresponding to the selected model, are
-
Two-sample test in high dimensions through random selection Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-16 Tao Qiu, Wangli Xu, Liping Zhu
Testing the equality for two-sample means with high dimensional distributions is a fundamental problem in statistics. In the past two decades, many efforts have been devoted to comparing the mean vectors of two populations. Many existing tests rely on naive diagonal or trace estimators of the covariance matrix, ignoring the dependence structure between variables. To make more use of the dependence
-
Robust tests for time series comparison based on Laplace periodograms Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-18 Lei Jin
Statistical comparison of time series is useful for the detection of mechanical damage and many other real-world applications. New methods have been proposed to check whether two semi-stationary time series have the same normalized dynamics. The proposed methods differ from traditional methods in that they are based on the Laplace periodogram, which is a robust tool to analyze the serial dependence
-
Latent association graph inference for binary transaction data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-27 David Reynolds, Luis Carvalho
A novel approach to the problem of statistical inference for multivariate binary transaction data is proposed. A fundamental question that arises from this data, often referred to as market basket data, is how the items relate to one another. These relationships are naturally expressed by a graph and transactions can be modelled as samples of cliques from this association graph. A hierarchical model
-
Frequentist delta-variance approximations with mixed-effects models and TMB Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-23 Nan Zheng, Noel Cadigan
Measures of uncertainty are investigated for estimates and predictions using nonlinear mixed-effects models including state–space models in particular. These nonlinear mixed-effects models include fixed parameters and random effects. Maximum likelihood estimation of the parameters and conditional mean predictors of random effects are commonly used to estimate important quantities for a wide spectrum
-
Bayes linear analysis for ordinary differential equations Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-24 Matthew Jones, Michael Goldstein, David Randell, Philip Jonathan
Differential equation models are used in a wide variety of scientific fields to describe the behaviour of physical systems. Commonly, solutions to given systems of differential equations are not available in closed-form; in such situations, the solution to the system is generally approximated numerically. The numerical solution obtained will be systematically different from the (unknown) true solution
-
Robust distributed modal regression for massive data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-18 Kangning Wang, Shaomin Li
Modal regression is a good alternative to mean regression and likelihood-based methods because of its robustness and high efficiency. A robust communication-efficient distributed modal regression for distributed massive data is proposed in this paper. Specifically, the global modal regression objective function is approximated by a surrogate one at the first machine, which relates to the local
-
Ensemble sparse estimation of covariance structure for exploring genetic disease data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-15 Xiaoning Kang, Mingqiu Wang
High-dimensional data often occur nowadays in various areas, such as genetic and microarray data. The covariance matrix is of fundamental importance in analyzing the relationship between multivariate variables. A powerful tool for estimating a covariance matrix is the modified Cholesky decomposition, which allows for unconstrained estimation and guarantees the positive definiteness of the estimate
-
FunCC: A new bi-clustering algorithm for functional data with misalignment Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-22 Marta Galvani, Agostino Torti, Alessandra Menafoglio, Simone Vantini
The problem of bi-clustering functional data, which has recently been addressed in literature, is considered. A definition of ideal functional bi-cluster is given and a novel bi-clustering method, called Functional Cheng and Church (FunCC), is developed. The introduced algorithm searches for non-overlapping and non-exhaustive bi-clusters in a set of functions which are naturally ordered in matrix structure
-
Promote sign consistency in the joint estimation of precision matrices Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-04 Qingzhao Zhang, Shuangge Ma, Yuan Huang
The Gaussian graphical model is a popular tool for inferring the relationships among random variables, where the precision matrix provides a natural interpretation of conditional independence. With high-dimensional data, sparsity of the precision matrix is often assumed, and various regularization methods have been applied for estimation. In several scenarios, it is desirable to conduct the joint estimation
-
Tests for differential Gaussian Bayesian networks based on quadratic inference functions Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-08 Xianzheng Huang, Hongmei Zhang
Hypothesis testing procedures based on quadratic inference functions are proposed to test whether two Gaussian Bayesian networks differ in structure, strength of associations between nodes, or both. Bootstrap procedures are developed to estimate p-values to quantify the statistical significance of the tests. Operating characteristics of these testing procedures are investigated using synthetic
-
Hidden semi-Markov-switching quantile regression for time series Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-05 Antonello Maruotti, Lea Petrella, Luca Sposito
A hidden semi-Markov-switching quantile regression model is introduced as an extension of the hidden Markov-switching one. The proposed model allows for arbitrary sojourn-time distributions in the states of the Markov-switching chain. Parameter estimation is carried out by maximum likelihood using the asymmetric Laplace distribution. As a by-product of the model specification, the
-
Hypothesis testing of varying coefficients for regional quantiles Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-04 Seyoung Park, Eun Ryung Lee
Testing the behavior of varying coefficients (VC) over a range of quantiles is important in the field of regression analysis. This study tests whether coefficient functions in varying quantile regression share common structural information across a certain range of quantile levels, even when linear combinations of covariates are unspecified in the null hypothesis. Our approach allows varying the coefficients
-
Nonparametric density estimation and bandwidth selection with B-spline bases: A novel Galerkin method Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-05 J. Lars Kirkby, Álvaro Leitao, Duy Nguyen
A general and efficient nonparametric density estimation procedure for local bases, including B-splines, is proposed, which employs a novel statistical Galerkin method combined with basis duality theory. To select the bandwidth, an efficient cross-validation procedure is introduced, based on closed-form expressions in terms of the primal and dual B-spline basis. By utilizing a closed-form expression
-
Deep distribution regression Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-22 Rui Li, Brian J. Reich, Howard D. Bondell
Due to their flexibility and predictive performance, machine-learning based regression methods have become an important tool for predictive modeling and forecasting. However, most methods focus on estimating the conditional mean or specific quantiles of the target quantity and do not provide the full conditional distribution, which contains uncertainty information that might be crucial for decision
-
Censored mean variance sure independence screening for ultrahigh dimensional survival data Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-24 Wei Zhong, Jiping Wang, Xiaolin Chen
Feature screening has become an indispensable statistical modeling tool for ultrahigh dimensional data analysis. This article introduces a new model-free marginal feature screening approach for ultrahigh dimensional survival data with right censoring. The new procedure could be used for survival data with both ultrahigh dimensional categorical and continuous covariates. Motivated by Cui et al. (2015)
-
Subgroup causal effect identification and estimation via matching tree Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-23 Yuyang Zhang, Patrick Schnell, Chi Song, Bin Huang, Bo Lu
Inferring causal effect from observational studies is a central topic in many scientific fields, including social science, health and medicine. The statistical methodology for estimating population average causal effect has been well established. However, the methods for identifying and estimating subpopulation causal effects are relatively less developed. Part of the challenge is that the subgroup
-
Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-03-10 Tonglin Zhang, Ge Lin
Generalized k-means can be combined with any similarity or dissimilarity measure for clustering. Using the well known likelihood ratio or F-statistic as the dissimilarity measure, a generalized k-means method is proposed to group generalized linear models (GLMs) for exponential family distributions. Given the number of clusters k, the proposed method is established by the uniform most powerful unbiased
-
A new class of stochastic EM algorithms. Escaping local maxima and handling intractable sampling Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-09 Stéphanie Allassonnière, Juliette Chevallier
The expectation–maximization (EM) algorithm is a powerful computational technique for maximum likelihood estimation in incomplete data models. When the expectation step cannot be performed in closed form, a stochastic approximation of EM (SAEM) can be used. The convergence of the SAEM toward critical points of the observed likelihood has been proved and its numerical efficiency has been demonstrated
-
Tuning-free ridge estimators for high-dimensional generalized linear models Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-28 Shih-Ting Huang, Fang Xie, Johannes Lederer
Ridge estimators regularize the squared Euclidean lengths of parameters. Such estimators are mathematically and computationally attractive but involve tuning parameters that need to be calibrated. It is shown that ridge estimators can be modified such that tuning parameters can be avoided altogether, and the resulting estimator can improve on the prediction accuracies of standard ridge estimators combined
-
Dissimilarity functions for rank-invariant hierarchical clustering of continuous variables Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-13 Sebastian Fuchs, F. Marta L. Di Lascio, Fabrizio Durante
A theoretical framework is presented for a (copula-based) notion of dissimilarity between continuous random vectors and its main properties are studied. The proposed dissimilarity assigns the smallest value to a pair of random vectors that are comonotonic. Various properties of this dissimilarity are studied, with special attention to those that are prone to the hierarchical agglomerative methods,
-
Clusterwise functional linear regression models Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-14 Ting Li, Xinyuan Song, Yingying Zhang, Hongtu Zhu, Zhongyi Zhu
Classical clusterwise linear regression is a useful method for investigating the relationship between scalar predictors and scalar responses with heterogeneous variation of regression patterns for different subgroups of subjects. This paper extends the classical clusterwise linear regression to incorporate multiple functional predictors by representing the functional coefficients in terms of a functional
-
Clustering with the Average Silhouette Width Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-10 Fatima Batool, Christian Hennig
The Average Silhouette Width (ASW) is a popular cluster validation index to estimate the number of clusters. The question whether it also is suitable as a general objective function to be optimized for finding a clustering is addressed. Two algorithms (the standard version OSil and a fast version FOSil) are proposed, and they are compared with existing clustering methods in an extensive simulation
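For readers unfamiliar with the index named in this entry, the following is a minimal sketch of how the Average Silhouette Width of a given clustering can be computed. It is not the OSil or FOSil algorithm from the paper; the function name `average_silhouette_width`, the use of Euclidean distance, and the convention that singleton clusters get width 0 are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def average_silhouette_width(points, labels):
    """Return the mean silhouette width of a labelled point set.

    Illustrative helper (hypothetical name), assuming Euclidean distance.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Assumed convention: a singleton cluster contributes width 0.
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Two well-separated groups on a line: a good and a bad labelling.
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = average_silhouette_width(pts, [0, 0, 1, 1])
bad = average_silhouette_width(pts, [0, 1, 0, 1])
```

Using the index as an objective function, as the entry describes, would mean searching over labellings for the one that maximizes this value; the sketch only evaluates a fixed labelling.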
-
High dimensional regression for regenerative time-series: An application to road traffic modeling Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-11 Mohammed Bouchouia, François Portier
A statistical predictive model in which a high-dimensional time series regenerates at the end of each day is used to model road traffic. Due to the regeneration, prediction is based on a daily modeling using a vector autoregressive model that linearly combines the past observations of the day. Due to the high dimension, the learning algorithm follows from an ℓ1-penalization of the regression coefficients
-
Estimating robot strengths with application to selection of alliance members in FIRST robotics competitions Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-12 Alejandro Lim, Chin-Tsang Chiang, Jen-Chieh Teng
Since the inception of the FIRST Robotics Competition (FRC) and its special playoff system, robotics teams have longed to appropriately quantify the strengths of their designed robots. The FRC includes a playground draft-like phase (alliance selection), arguably the most game-changing part of the competition, in which the top-8 robotics teams in a tournament based on the FRC’s ranking system assess
-
Response adaptive designs for Phase II trials with binary endpoint based on context-dependent information measures Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-30 Ksenia Kasianova, Mark Kelbert, Pavel Mozgunov
In many rare disease Phase II clinical trials, two objectives are of interest to an investigator: maximising the statistical power and maximising the number of patients responding to the treatment. These two objectives are competing, therefore, clinical trial designs offering a balance between them are needed. Recently, it was argued that response-adaptive designs such as families of multi-arm bandit
-
Explicit-duration Hidden Markov Models for quantum state estimation Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-10 Alessandra Luati, Marco Novelli
An explicit-duration Hidden Markov Model with a nonparametric kernel estimator of the state duration distribution is specified. The motivation comes from the physical problem of extracting the maximum information from an open quantum system subject to an external perturbation, which induces a change in the dynamics of the system. A nonparametric kernel estimator for discrete data is introduced, which
-
Robust designs for dose–response studies: Model and labelling robustness Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-05 Douglas P. Wiens
Methods for the construction of dose–response designs are presented that are robust against possible model misspecifications and mislabelled responses. The asymptotic properties are studied, leading to asymptotically minimax designs that minimize the maximum – over neighbourhoods of both types of model inadequacies – value of the mean squared error of the predictions. Both sequential and adaptive approaches
-
Confidence intervals for spatial scan statistic Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-02-12 Ivair R. Silva, Luiz Duczmal, Martin Kulldorff
The spatial scan statistic is a popular statistical tool to detect geographical clusters of diseases. The basic problem of constructing confidence intervals for the relative risk of the most likely cluster has remained an open question. To fill this gap, a Monte Carlo based interval estimator for the relative risk of the primary cluster is derived. The method works for the circular spatial scan statistic
-
Robust variable selection for model-based learning in presence of adulteration Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-26 Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy
The problem of identifying the most discriminating features when performing supervised learning has been extensively investigated. In particular, several methods for variable selection have been proposed in model-based classification. The impact of outliers and wrongly labeled units on the determination of relevant predictors has instead received far less attention, with almost no dedicated methodologies
-
Computation of projection regression depth and its induced median Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-27 Yijun Zuo
Notions of depth in regression have been introduced and studied in the literature. The most famous example is Regression Depth (RD), which is a direct extension of location depth to regression. The projection regression depth (PRD) is the extension of another prevailing location depth, the projection depth, to regression. The computation issues of the RD have been discussed in the literature. The computation
-
Optimal treatment regimes for competing risk data using doubly robust outcome weighted learning with bi-level variable selection Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-14 Yizeng He, Soyoung Kim, Mi-Ok Kim, Wael Saber, Kwang Woo Ahn
The goal of the optimal treatment regime is maximizing treatment benefits via personalized treatment assignments based on the observed patient and treatment characteristics. Parametric regression-based outcome learning approaches require exploring complex interplay between the outcome and treatment assignments adjusting for the patient and treatment covariates, yet correctly specifying such relationships
-
Mixture of linear experts model for censored data: A novel approach with scale-mixture of normal distributions Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-24 Elham Mirfarah, Mehrdad Naderi, Ding-Geng Chen
The mixture of linear experts (MoE) model is one of the widespread statistical frameworks for modeling, classification, and clustering of data. Built on the normality assumption of the error terms for mathematical and computational convenience, the classical MoE model has two challenges: (1) it is sensitive to atypical observations and outliers, and (2) it might produce misleading inferential results for
-
Unsupervised image segmentation with Gaussian Pairwise Markov Fields Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-21 Hugo Gangloff, Jean-Baptiste Courbot, Emmanuel Monfrini, Christophe Collet
Modeling strongly correlated random variables is a critical task in the context of latent variable models. A new probabilistic model, called Gaussian Pairwise Markov Field, is presented to generalize existing Markov Fields latent variables models, and to introduce more correlations between variables. This is done by considering the correlations within Gaussian Markov Random Fields models which are
-
A stochastic block model approach for the analysis of multilevel networks: An application to the sociology of organizations Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-26 Saint-Clair Chabert-Liddell, Pierre Barbillon, Sophie Donnet, Emmanuel Lazega
A multilevel network is defined as the junction of two interaction networks, one level representing the interactions between individuals and the other the interactions between organizations. The levels are linked by an affiliation relationship, each individual belonging to a unique organization. A new Stochastic Block Model is proposed as a unified probabilistic framework tailored for multilevel networks
-
Variable selection in finite mixture of regression models with an unknown number of components Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-26 Kuo-Jung Lee, Martin Feldkircher, Yi-Chi Chen
A Bayesian framework for finite mixture models to deal with model selection and the selection of the number of mixture components simultaneously is presented. For that purpose, a feasible reversible jump Markov Chain Monte Carlo algorithm is proposed to model each component as a sparse regression model. This approach is made robust to outliers by using a prior that induces heavy tails and works well
-
Testing conditional mean through regression model sequence using Yanai’s generalized coefficient of determination Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-19 Masao Ueki
In high-dimensional data analysis such as in genomics, repeated univariate regression for each variable is utilized to screen useful variables. However, signals jointly detectable with other variables may be overlooked. While the saturated model using all variables may not work in high-dimensional data, based on prior knowledge, group-wise analysis for a pre-defined group is often developed, but the
-
Approximate computation of projection depths Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-09 Rainer Dyckerhoff, Pavlo Mozharovskyi, Stanislav Nagy
Data depth is a concept in multivariate statistics that measures the centrality of a point in a given data cloud in R^d. If the depth of a point can be represented as the minimum of the depths with respect to all one-dimensional projections of the data, then the depth satisfies the so-called projection property. Such depths form an important class that includes many of the depths that have been proposed
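The projection property described in this entry suggests a simple approximation: evaluate the univariate depth along finitely many directions and keep the minimum. The sketch below does this for the classical projection depth (median and MAD outlyingness) with plain random search. The function name `approx_projection_depth`, the choice of random search, and the default number of directions are assumptions for illustration, not the algorithms studied in the paper. Because only finitely many directions are tried, the result is an upper bound on the exact depth.

```python
import math
import random
from statistics import median


def approx_projection_depth(x, data, n_directions=500, seed=0):
    """Approximate the projection depth of x with respect to data.

    Illustrative helper (hypothetical name): random search over unit
    directions, with outlyingness |u'x - med(u'X)| / MAD(u'X).
    """
    rng = random.Random(seed)
    dim = len(x)
    worst = 0.0  # largest outlyingness found so far
    for _ in range(n_directions):
        # A normalized Gaussian vector is uniform on the unit sphere.
        raw = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in raw))
        if norm == 0.0:
            continue
        u = [c / norm for c in raw]
        # Project the data cloud and the query point onto direction u.
        proj = [sum(ui * pi for ui, pi in zip(u, p)) for p in data]
        med = median(proj)
        mad = median([abs(v - med) for v in proj])
        if mad == 0.0:
            continue  # degenerate projection, skipped in this sketch
        px = sum(ui * xi for ui, xi in zip(u, x))
        worst = max(worst, abs(px - med) / mad)
    # Depth is a decreasing transform of the worst-case outlyingness.
    return 1.0 / (1.0 + worst)


# A central point should be deeper than a point far from the cloud.
rng = random.Random(1)
cloud = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
d_center = approx_projection_depth((0.0, 0.0), cloud)
d_far = approx_projection_depth((5.0, 5.0), cloud)
```

Increasing `n_directions` tightens the bound at a cost that is linear in the number of directions.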
-
Partition-based feature screening for categorical data via RKHS embeddings Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-14 Jun Lu, Lu Lin, WenWu Wang
This paper proposes a new screening procedure for the ultrahigh dimensional data with a categorical response. By exploiting the group structure among predictors, a new partition-based screening approach is developed via the reproducing kernel Hilbert space (RKHS) embeddings in the maximum mean discrepancy framework. Consequently, the new method is able to identify the influential group of predictors
-
An exchange algorithm for optimal calibration of items in computerized achievement tests Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-18 Mahmood Ul Hassan, Frank Miller
The importance of large scale achievement tests, like national tests in school, eligibility tests for university, or international assessments for evaluation of students, is increasing. Pretesting of questions for the above mentioned tests is done to determine characteristic properties of the questions by adding them to an ordinary achievement test. If computerized tests are used, it has been shown
-
Sum of Kronecker products representation and its Cholesky factorization for spatial covariance matrices from large grids Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-06 Jian Cao, Marc G. Genton, David E. Keyes, George M. Turkiyyah
The sum of Kronecker products (SKP) representation for spatial covariance matrices from gridded observations and a corresponding adaptive-cross-approximation-based framework for building the Kronecker factors are investigated. The time cost for constructing an n-dimensional covariance matrix is O(nk^2) and the total memory footprint is O(nk), where k is the number of Kronecker factors. The memory footprint
-
Normal variance mixtures: Distribution, density and parameter estimation Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-15 Erik Hintz, Marius Hofert, Christiane Lemieux
Efficient algorithms for computing the distribution function, (log-)density function and for estimating the parameters of multivariate normal variance mixtures are introduced. For the evaluation of the distribution function, randomized quasi-Monte Carlo (RQMC) methods are utilized in a way that improves upon existing methods proposed for the special case of normal and t distributions. For evaluating
-
Regression analysis of asynchronous longitudinal data with informative observation processes Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-11 Dayu Sun, Hui Zhao, Jianguo Sun
A great deal of literature has been established for regression analysis of longitudinal data but most of the existing methods assume that covariates can be observed completely or at the same observation times for the response variable, and the observation process is independent of the response variable completely or given covariates. As pointed out by many authors, in practice, one may face the situation
-
Principal component analysis using frequency components of multivariate time series Comput. Stat. Data Anal. (IF 1.186) Pub Date : 2021-01-06 Raanju R. Sundararajan
Dimension reduction techniques for multivariate time series decompose the observed series into a few useful independent/orthogonal univariate components. A spectral domain method is developed for multivariate second-order stationary time series that linearly transforms the observed series into several groups of lower-dimensional multivariate subseries. These multivariate subseries have non-zero spectral