Current journal: Stat
  • Forecasting Hurricane‐Related Power Outages via Locally Optimized Random Forests
    Stat (IF 0.766) Pub Date : 2021-01-16
    Tim Coleman; Mary Frances Dorn; Kim Kaufeld; Lucas Mentch

    Standard supervised learning procedures are validated against a test set that is assumed to have come from the same distribution as the training data. However, in many problems, the test data may have come from a different distribution. We consider the case of having many labeled observations from one distribution, P1, and wanting to make predictions at unlabeled points that come from P2. We combine

    Updated: 2021-01-18
  • A fast algorithm for integrative community detection of multi‐layer networks
    Stat (IF 0.766) Pub Date : 2021-01-15
    Jiangzhou Wang; Jianhua Guo; Binghui Liu

    Multi‐layer networks are often used to represent multiple types of relationships between nodes in network studies. In this paper, we investigate the community detection problem in multi‐layer networks. Specifically, we consider the multi‐layer stochastic block model (MLSBM), which assumes that the community memberships are shared across all network layers, while other model parameters can be different between

    Updated: 2021-01-16
  • Semiparametric Bayes Instrumental Variable Estimation with Many Weak Instruments
    Stat (IF 0.766) Pub Date : 2021-01-15
    Ryo Kato; Takahiro Hoshino

    We develop a new semiparametric Bayes instrumental variables estimation method. We employ the form of the regression function of the first‐stage equation and the disturbances are modelled nonparametrically to achieve better predictive power of the endogenous variables, whereas we use parametric formulation in the second‐stage equation, which is of interest in inference. Our simulation studies show

    Updated: 2021-01-16
  • Detecting changes in mean in the presence of time‐varying autocovariance
    Stat (IF 0.766) Pub Date : 2021-01-15
    Euan T. McGonigle; Rebecca Killick; Matthew A. Nunes

    There has been much attention in recent years to the problem of detecting mean changes in a piecewise constant time series. Often, methods assume that the noise can be taken to be independent, identically distributed (IID), which in practice may not be a reasonable assumption. There is comparatively little work studying the problem of mean changepoint detection in time series with non‐trivial autocovariance

    Updated: 2021-01-16
  • Weighted empirical likelihood for heteroscedastic varying coefficient partially nonlinear models with missing data
    Stat (IF 0.766) Pub Date : 2021-01-15
    Guo‐Liang Fan; Lu‐Lu Wang; Hong‐Xia Xu

    In this article, a weighted empirical likelihood technique for constructing the empirical likelihood confidence regions is applied to study the heteroscedastic varying coefficient partially nonlinear models with missing response data. We first give the estimator of the error variance based on the Nadaraya‐Watson kernel estimation method. Then a weighted empirical log‐likelihood ratio of the unknown

    Updated: 2021-01-16
  • Multilevel Joint Modeling of Hospitalization and Survival in Patients on Dialysis
    Stat (IF 0.766) Pub Date : 2021-01-15
    Esra Kürüm; Danh V. Nguyen; Yihao Li; Connie M. Rhee; Kamyar Kalantar‐Zadeh; Damla Şentürk

    More than 720,000 patients with end‐stage renal disease in the U.S. require life‐sustaining dialysis treatment. In this population of typically older patients with a high morbidity burden, hospitalization is frequent at a rate of about twice per patient‐year. Aside from frequent hospitalizations, which are a major source of death risk, overall mortality in dialysis patients is higher than other comparable

    Updated: 2021-01-16
  • When will gradient methods converge to max‐margin classifier under ReLU models?
    Stat (IF 0.766) Pub Date : 2020-12-31
    Tengyu Xu; Yi Zhou; Kaiyi Ji; Yingbin Liang

    We study the implicit bias of gradient descent methods in solving a binary classification problem over a linearly separable dataset. The classifier is described by a nonlinear ReLU model and the objective function adopts the exponential loss function. We first characterize the landscape of the loss function and show that there can exist spurious asymptotic local minima besides asymptotic global minima

    Updated: 2020-12-31
  • Low Rank Approximation for Smoothing Spline via Eigensystem Truncation
    Stat (IF 0.766) Pub Date : 2020-12-29
    Danqing Xu; Yuedong Wang

    Smoothing splines provide a powerful and flexible means for nonparametric estimation and inference. With a cubic time complexity, fitting smoothing spline models to large data is computationally prohibitive. In this paper, we use the theoretical optimal eigenspace to derive a low rank approximation of the smoothing spline estimates. We develop a method to approximate the eigensystem when it is unknown

    Updated: 2020-12-30
  • MuSP: A Multi‐step Screening Procedure for Sparse Recovery
    Stat (IF 0.766) Pub Date : 2020-12-25
    Yuehan Yang; Ji Zhu; Edward I. George

    We propose a Multi‐step Screening Procedure (MuSP) for the recovery of sparse linear models in high‐dimensional data. This method is based on a repeated small penalty strategy that quickly converges to an estimate within a few iterations. Specifically, in each iteration, an adaptive lasso regression with a small penalty is fit within the reduced feature space obtained from the previous step, rendering

    Updated: 2020-12-26
  • Efficient Split Likelihood‐based Method for Community Detection of Large‐scale Networks
    Stat (IF 0.766) Pub Date : 2020-12-18
    Jiangzhou Wang; Binghui Liu; Jianhua Guo

    The stochastic block model (SBM) is widely employed as a canonical model for network community detection. Recovering community labels under SBM is not a trivial task, since its theoretical optimization problem is NP‐hard. To solve this problem, numerous statistical methods have been developed in the literature, most of which are, however, not applicable to large‐scale networks. To overcome this limitation

    Updated: 2020-12-21
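To make the stochastic block model in the entry above concrete, the following is a minimal sketch of sampling a network from a two-parameter SBM. It is not the split-likelihood method of the paper: the function name `sample_sbm` and the parameters `p_in` (within-community edge probability) and `p_out` (between-community edge probability) are assumptions chosen for illustration.

```python
import random


def sample_sbm(labels, p_in, p_out, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a simple SBM.

    labels[i] is the community of node i. Two nodes are linked with
    probability p_in if they share a community and p_out otherwise.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, no self-loops
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1  # undirected edge
    return adj


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    for row in sample_sbm(labels, p_in=0.9, p_out=0.1):
        print(row)
```

Community detection is the inverse task: given only `adj`, recover `labels` up to a relabeling of the communities.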
  • Visualizing the Food Landscape of Durham, North Carolina
    Stat (IF 0.766) Pub Date : 2020-12-17
    Joseph L. Graves; Gizem Templeton; Lauren Davis; Seong‐Tae Kim

    In partnership with community leaders of Durham, North Carolina, the Duke World Food Policy Center is creating a Durham Food Justice Plan (DFJP) for envisioning an equitable food system. The Food Justice plan serves to incorporate Durham’s local food history in terms of combating historical and present injustices in the food system. We propose creating an integrative, interactive visual for DFJP to

    Updated: 2020-12-17
  • Predicting Lifespan of Drosophila Melanogaster: A Novel Application of Convolutional Neural Networks and Zero‐Inflated Autoregressive Conditional Poisson Model
    Stat (IF 0.766) Pub Date : 2020-12-08
    Yi Zhang; V.A. Samaranayake; Gayla R. Olbricht; Matthew Thimgan

    A model to classify the lifespan of Drosophila, the fruit fly, into short and long‐lived categories based on a sleep characteristic, extracted from activity data, is developed using a two‐stage process. Stage one models the per minute activity counts of each fly using a zero‐inflated Autoregressive Conditional Poisson model. These probabilities are allowed to vary hourly, reflecting the circadian and

    Updated: 2020-12-09
  • Weight Normalized Deep Neural Networks
    Stat (IF 0.766) Pub Date : 2020-12-08
    Yixi Xu; Xiao Wang

    The generalization error is the difference between the expected risk and the empirical risk of a learning algorithm. This generalization error can be upper bounded by the Rademacher complexity of the underlying hypothesis class with high probability. This paper studies the function class of L_{p,q} weight normalized deep neural networks. We present a general framework for norm‐based capacity control and

    Updated: 2020-12-09
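The Rademacher bound mentioned in the entry above can be written out in its standard textbook form. This is a sketch under stated assumptions, not the bound derived in the paper: the loss is assumed to take values in [0, 1], the sample is assumed i.i.d., and the symbols R, \widehat{R}_n, \mathfrak{R}_n, and \delta are notation introduced here.

```latex
% Standard textbook statement, not the paper's result.
% Assumptions: the loss \ell takes values in [0,1]; z_1,\dots,z_n are i.i.d.
\[
  \underbrace{R(f) - \widehat{R}_n(f)}_{\text{generalization error}}
  \;\le\; 2\,\mathfrak{R}_n(\ell \circ \mathcal{F})
  + \sqrt{\frac{\log(1/\delta)}{2n}}
  \quad \text{for all } f \in \mathcal{F},
  \text{ with probability at least } 1-\delta,
\]
\[
  \mathfrak{R}_n(\mathcal{G})
  = \mathbb{E}_{z,\sigma}\Big[\sup_{g \in \mathcal{G}}
    \frac{1}{n}\sum_{i=1}^{n} \sigma_i\, g(z_i)\Big],
  \qquad \sigma_i \text{ i.i.d. uniform on } \{-1,+1\}.
\]
```

Controlling the Rademacher complexity of the hypothesis class therefore controls the generalization error uniformly over the class.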
  • Outcome weighted ψ‐learning for individualized treatment rules
    Stat (IF 0.766) Pub Date : 2020-12-07
    Mingyang Liu; Xiaotong Shen; Wei Pan

    An individualized treatment rule is often employed to maximize a certain patient‐specific clinical outcome based on his/her clinical or genomic characteristics as well as heterogeneous response to treatments. Although developing such a rule is conceptually important to personalized medicine, existing methods such as the partial least squares (Qian and Murphy, 2011) suffer from the difficulty of indirect

    Updated: 2020-12-07
  • On the Estimation Bias in First‐Order Bifurcating Autoregressive Models
    Stat (IF 0.766) Pub Date : 2020-12-07
    Tamer M. Elbayoumi; Sayed A. Mostafa

    In this paper, we study the bias of the least‐squares (LS) estimation for the stationary first‐order bifurcating autoregressive [BAR(1)] model which is commonly used to model binary tree‐structured data that appear in many applications, most famously cell‐lineage applications. We first show that the LS estimator can have large bias for both small and moderate sized samples and that this bias is dependent

    Updated: 2020-12-07
  • Sub‐Weibull distributions: Generalizing sub‐Gaussian and sub‐Exponential properties to heavier tailed distributions
    Stat (IF 0.766) Pub Date : 2020-10-01
    Mariia Vladimirova; Stéphane Girard; Hien Nguyen; Julyan Arbel

    We propose the notion of sub‐Weibull distributions, which are characterized by tails lighter than (or equally light as) the right tail of a Weibull distribution. This novel class generalizes the sub‐Gaussian and sub‐Exponential families to potentially heavier tailed distributions. Sub‐Weibull distributions are parameterized by a positive tail index θ and reduce to sub‐Gaussian distributions for θ =

    Updated: 2020-12-07
  • Better Together: Extending JMP with Open Source Software
    Stat (IF 0.766) Pub Date : 2020-12-03
    Nascif Abousalh‐Neto; Meijian Guan; Ruth Hummel

    JMP is commercial software designed for interactive data analysis and exploration. JMP’s high‐level, visual interface makes it an outstanding tool for teaching best practices, methods, and model building techniques. JMP is also designed for extensibility, with features that allow the embedding of and deployment to open source packages and environments. In this paper, we will explore use cases that

    Updated: 2020-12-03
  • On a new test of fit to the beta distribution
    Stat (IF 0.766) Pub Date : 2020-12-02
    Bruno Ebner; Shawn C. Liebenberg

    We propose a new L²‐type goodness‐of‐fit test for the family of beta distributions based on a conditional moment characterisation. The asymptotic null distribution is identified, and since it depends on the underlying parameters, a parametric bootstrap procedure is proposed. Consistency against all alternatives that satisfy a convergence criterion is shown, and a Monte Carlo simulation study indicates

    Updated: 2020-12-03
  • Bayesian Inference for Polycrystalline Materials
    Stat (IF 0.766) Pub Date : 2020-11-27
    James Matuk; Oksana Chkrebtii; Stephen Niezgoda

    Polycrystalline materials, such as metals, are comprised of heterogeneously oriented crystals. Observed crystal orientations are modelled as a sample from an orientation distribution function (ODF), which determines a variety of material properties and is therefore of great interest to practitioners. Observations consist of quaternions, 4‐dimensional unit vectors reflecting both orientation and rotation

    Updated: 2020-11-27
  • Creating optimal conditions for reproducible data analysis in R with ‘fertile’
    Stat (IF 0.766) Pub Date : 2020-11-26
    Audrey M. Bertin; Benjamin S. Baumer

    The advancement of scientific knowledge increasingly depends on ensuring that data‐driven research is reproducible: that two people with the same data obtain the same results. However, while the necessity of reproducibility is clear, there are significant behavioral and technical challenges that impede its widespread implementation and no clear consensus on standards of what constitutes reproducibility

    Updated: 2020-11-27
  • Data Visualization Case Studies for High‐Dimensional Data Validation
    Stat (IF 0.766) Pub Date : 2020-11-26
    Aaron R. Williams

    Microsimulation and synthetic data are often high‐dimensional, requiring extensive validation and exploration to compare results against certain benchmarks. In both cases, validation is necessary to ensure that the many univariate distributions and multivariate relationships in the new data are similar to the many univariate distributions and multivariate relationships in the underlying data. This

    Updated: 2020-11-27
  • Functional Singular Spectrum Analysis
    Stat (IF 0.766) Pub Date : 2020-11-25
    Hossein Haghbin; Seyed Morteza Najibi; Rahim Mahmoudvand; Jordan Trinka; Mehdi Maadooliat

    In this paper, we develop a new extension of the Singular Spectrum Analysis (SSA) called functional SSA to analyze functional time series. The new methodology is constructed by integrating ideas from functional data analysis and univariate SSA. Specifically, we introduce a trajectory operator in the functional world, which is equivalent to the trajectory matrix in the regular SSA. In the regular SSA

    Updated: 2020-11-27
  • Likelihood‐Based Inference for Generalized Linear Mixed Models: Inference with the R Package glmm
    Stat (IF 0.766) Pub Date : 2020-11-25
    Christina Knudson; Sydney Benson; Charles Geyer; Galin Jones

    The R package glmm enables likelihood‐based inference for generalized linear mixed models with a canonical link. No other publicly‐available software accurately conducts likelihood‐based inference for generalized linear mixed models with crossed random effects. glmm is able to do so by approximating the likelihood function and two derivatives using importance sampling. The importance sampling distribution

    Updated: 2020-11-25
  • Statistical Significance Calculations for Scenarios in Visual Inference
    Stat (IF 0.766) Pub Date : 2020-11-25
    Susan Vanderplas; Christian Röttger; Dianne Cook; Heike Hofmann

    Statistical inference provides the protocols for conducting rigorous science, but data plots provide the opportunity to discover the unexpected. These disparate endeavors are bridged by visual inference, where a lineup protocol can be employed for statistical testing. Human observers are needed to assess the lineups, typically using a crowd‐sourcing service. This paper describes a new approach for

    Updated: 2020-11-25
  • Statistical Inference for Nonparametric Censored Regression
    Stat (IF 0.766) Pub Date : 2020-11-23
    Guangcai Mao; Jing Zhang

    Nonparametric regression is of primary importance in many statistical applications. For data with a censored outcome, how to construct a confidence band for the regression function is a basic issue that has received limited research. We propose a procedure to construct pointwise and simultaneous confidence bands for the regression function based on a debiased estimator, which is proposed by correcting the bias

    Updated: 2020-11-25
  • Hosting a Data Science Hackathon with Limited Resources
    Stat (IF 0.766) Pub Date : 2020-11-23
    Kristin Kuter; Christopher Wedrychowicz

    In this paper we will detail our experiences developing and organizing an annual machine learning competition at Saint Mary’s College. We will detail our process of collecting data for the competition as well as the logistical challenges faced when hosting such an event at a small liberal arts college. We believe that this report will be of interest to colleagues teaching data science at institutions

    Updated: 2020-11-23
  • Modern Multiple Imputation with Functional Data
    Stat (IF 0.766) Pub Date : 2020-11-23
    Aniruddha Rajendra Rao; Matthew Reimherr

    This work considers the problem of fitting functional models with sparsely and irregularly sampled functional data. It overcomes the limitations of the state‐of‐the‐art methods, which face major challenges in the fitting of more complex non‐linear models. Currently, many of these models cannot be consistently estimated unless the number of observed points per curve grows sufficiently quickly with the

    Updated: 2020-11-23
  • Modernizing k‐Nearest Neighbors
    Stat (IF 0.766) Pub Date : 2020-11-23
    Robin Elizabeth Yancey; Bochao Xin; Norm Matloff

    The k‐nearest neighbors (k‐NN) method is one of the oldest statistical/machine learning techniques. It is included in virtually every major package, such as caret, parsnip, mlr3 and scikit‐learn. Yet those packages do not go beyond the basics. With today's high‐speed computation capability, k‐NN can be made much more powerful. Here we present directions in which that can be done:

    Updated: 2020-11-23
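As a reference point for "the basics" mentioned in the k-NN entry above, here is a minimal sketch of plain k-NN classification with Euclidean distance and majority voting. It illustrates only the baseline technique, not the extensions the paper proposes; the function name `knn_predict` and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Distances are Euclidean. Ties in the vote are broken by first
    occurrence among the nearest neighbours. (Hypothetical helper.)
    """
    # Pair each training label with its distance to the query point.
    scored = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # Sort by distance only, so labels never need to be comparable.
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    xs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    ys = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(xs, ys, (0.2, 0.2), k=3))
    print(knn_predict(xs, ys, (5.5, 5.5), k=3))
```

This brute-force version computes every distance for each query, which is the cost that faster hardware makes affordable on larger data.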
  • Closed‐form Expressions for Maximum Mean Discrepancy with Applications to Wasserstein Auto‐Encoders
    Stat (IF 0.766) Pub Date : 2020-11-17
    Raif M. Rustamov

    The Maximum Mean Discrepancy (MMD) has found numerous applications in statistics and machine learning, most recently as a penalty in the Wasserstein Auto‐Encoder (WAE). In this paper we compute closed‐form expressions for estimating the Gaussian kernel based MMD between a given distribution and the standard multivariate normal distribution. This formula reveals a connection to the Baringhaus‐Henze‐Epps‐Pulley

    Updated: 2020-11-17
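For readers unfamiliar with the MMD in the entry above, the following is a minimal sketch of the usual sample-based (biased, V-statistic) estimate of the squared MMD with a Gaussian kernel. It is the generic two-sample estimator, not the closed-form expression derived in the paper; the function names and the `bandwidth` parameter are assumptions for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    MMD^2 = E k(x, x') + E k(y, y') - 2 E k(x, y), with each expectation
    replaced by an average over all pairs, including i == j.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    near = [(0.0,), (1.0,)]
    far = [(5.0,), (6.0,)]
    print(mmd2_biased(near, near))  # identical samples
    print(mmd2_biased(near, far))   # well-separated samples
```

The estimate is zero for identical samples and grows as the two samples separate, which is what makes it usable as a penalty between an encoded distribution and a target distribution.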
  • Nested model averaging on solution path for high‐dimensional linear regression
    Stat (IF 0.766) Pub Date : 2020-09-24
    Yang Feng; Qingfeng Liu

    We study the nested model averaging method on the solution path for a high‐dimensional linear regression problem. In particular, we propose to combine model averaging with regularized estimators (e.g., lasso, elastic net, and Sorted L‐One Penalized Estimation [SLOPE]) on the solution path for high‐dimensional linear regression. In simulation studies, we first conduct a systematic investigation on the

    Updated: 2020-11-13
  • Semi‐supervised logistic learning based on exponential tilt mixture models
    Stat (IF 0.766) Pub Date : 2020-09-04
    Xinwei Zhang; Zhiqiang Tan

    Consider semi‐supervised learning for classification, where both labelled and unlabelled data are available for training. The goal is to exploit both datasets to achieve higher prediction accuracy than just using labelled data alone. We develop a semi‐supervised logistic learning method based on exponential tilt mixture models by extending a statistical equivalence between logistic regression and exponential

    Updated: 2020-11-12
  • Forecasting subnational COVID‐19 mortality using a day‐of‐the‐week adjusted Bayesian hierarchical model
    Stat (IF 0.766) Pub Date : 2020-11-06
    Justin J. Slater; Patrick E. Brown; Jeffrey S. Rosenthal

    As of October 2020, the death toll from the COVID‐19 pandemic has risen over 1.1 million deaths worldwide. Reliable estimates of mortality due to COVID‐19 are important to guide intervention strategies such as lockdowns and social distancing measures. In this paper, we develop a data‐driven model that accurately and consistently estimates COVID‐19 mortality at the regional level early in the epidemic

    Updated: 2020-11-09
  • Gradual Variance Change Point Detection with A Smoothly‐changing Mean Trend
    Stat (IF 0.766) Pub Date : 2020-11-03
    Wanfeng Liang; Libai Xu

    In contrast to the analysis of abrupt changes, methods for detecting gradual change points are less developed. In this paper we are interested in the scenario in which the variance of data may vary gradually while the mean of data changes in a smooth fashion. We propose a penalized weighted least squares approach with an iterative estimation procedure to detect the gradual variance change point with smoothly‐changing

    Updated: 2020-11-03
  • Mann‐Whitney Test for Two‐phase Stratified Sampling
    Stat (IF 0.766) Pub Date : 2020-10-30
    Takumi Saegusa

    We consider the Mann‐Whitney test for two‐phase stratified sampling. In this design, the i.i.d. sample is obtained at the first phase and then stratified based on auxiliary variables. At the second phase, stratified subsamples are obtained without replacement to collect variables of interest. The resultant data form a biased and dependent sample due to stratification and sampling without replacement.

    Updated: 2020-11-02
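For context on the Mann-Whitney entry above, here is a minimal sketch of the classical Mann-Whitney U statistic for two independent i.i.d. samples. It deliberately ignores the two-phase stratified design studied in the paper, where the sample is biased and dependent and this plain version would not be valid as is; the function name `mann_whitney_u` is an assumption for illustration.

```python
def mann_whitney_u(xs, ys):
    """Classical Mann-Whitney U statistic for the first sample.

    Counts the pairs (x, y) with x > y, with each tie counted as 1/2.
    Assumes two independent i.i.d. samples (not the stratified design).
    """
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


if __name__ == "__main__":
    a = [3, 4, 5]
    b = [1, 2]
    # The two U statistics always sum to len(a) * len(b).
    print(mann_whitney_u(a, b), mann_whitney_u(b, a))
```

Dividing U by the number of pairs gives an estimate of P(X > Y) + P(X = Y)/2, the quantity the test is built around.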
  • VtNet: a Neural Network with Variable Importance Assessment
    Stat (IF 0.766) Pub Date : 2020-10-30
    Lixiang Zhang; Lin Lin; Jia Li

    The architectures of many neural networks rely heavily on the underlying grid associated with the variables, for instance, the lattice of pixels in an image. For general biomedical data without a grid structure, the multi‐layer perceptron (MLP) and deep belief network (DBN) are often used. However, in these networks, variables are treated homogeneously in the sense of network structure; and it is difficult

    Updated: 2020-11-02
  • A family of parsimonious mixtures of multivariate Poisson‐lognormal distributions for clustering multivariate count data
    Stat (IF 0.766) Pub Date : 2020-08-25
    Sanjeena Subedi; Ryan P. Browne

    Multivariate count data are commonly encountered through high‐throughput sequencing technologies in bioinformatics, text mining, or sports analytics. Although the Poisson distribution seems a natural fit to these count data, its multivariate extension is computationally expensive. In most cases, mutual independence among the variables is assumed; however, this fails to take into account the correlation

    Updated: 2020-10-30
  • Randomized estimation of functional covariance operator via subsampling
    Stat (IF 0.766) Pub Date : 2020-08-22
    Shiyuan He; Xiaomeng Yan

    Covariance operators are fundamental concepts and modelling tools for many functional data analysis methods, such as functional principal component analysis. However, the empirical (or estimated) covariance operator becomes too costly to compute when the functional dataset gets big. This paper studies a randomized algorithm for covariance operator estimation. The algorithm works by sampling and rescaling

    Updated: 2020-10-30
  • On the non‐asymptotic and sharp lower tail bounds of random variables
    Stat (IF 0.766) Pub Date : 2020-09-12
    Anru R. Zhang; Yuchen Zhou

    The non‐asymptotic tail bounds of random variables play crucial roles in probability, statistics, and machine learning. Despite much success in developing upper bounds on tail probabilities in the literature, lower bounds on tail probabilities are relatively scarce. In this paper, we introduce systematic and user‐friendly schemes for developing non‐asymptotic lower bounds of tail probabilities. In addition

    Updated: 2020-10-30
  • Causal inference in the presence of missing data using a random forest based matching algorithm
    Stat (IF 0.766) Pub Date : 2020-10-23
    Tristan Hillis; Maureen A. Guarcello; Richard A. Levine; Juanjuan Fan

    Observational studies require matching across groups over multiple confounding variables. Across the literature, matching algorithms fail to handle this issue. In this way, missing values are regularly imputed prior to being considered in the matching process. However, imputing is not always practical, forcing us to drop an observation due to the deficiency of the chosen algorithm, decreasing the power

    Updated: 2020-10-26
  • Directional analysis for point patterns on linear networks
    Stat (IF 0.766) Pub Date : 2020-10-15
    Mehdi Moradi; Jorge Mateu; Carles Comas

    Statistical analysis of point processes often assumes that the underlying process is isotropic in the sense that its distribution is invariant under rotation. For point processes on ℝ², some tests based on the K‐ and nearest neighbour orientation functions have been proposed to check such an assumption. However, anisotropy and directional analysis need proper caution when dealing with point processes

    Updated: 2020-10-16
  • Bayesian Group Learning for Shot Selection of Professional Basketball Players
    Stat (IF 0.766) Pub Date : 2020-10-15
    Guanyu Hu; Hou‐Cheng Yang; Yishu Xue

    In this paper, we develop a group learning approach to analyze the underlying heterogeneity structure of shot selection among professional basketball players in the NBA. We propose a mixture of finite mixtures (MFM) model to capture the heterogeneity of shot selection among different players based on Log Gaussian Cox process (LGCP). Our proposed method can simultaneously estimate the number of groups

    Updated: 2020-10-16
  • Self‐Supervised Learning for Outlier Detection
    Stat (IF 0.766) Pub Date : 2020-10-14
    Jan Diers; Christian Pigorsch

    The identification of outliers is mainly based on unannotated data and therefore constitutes an unsupervised problem. The lack of a label leads to numerous challenges that do not occur or only occur to a lesser extent when using annotated data and supervised methods. In this paper, we focus on two of these challenges: the selection of hyperparameters and the selection of informative features. To this

    Updated: 2020-10-15
  • Linear screening for high‐dimensional computer experiments
    Stat (IF 0.766) Pub Date : 2020-10-02
    Chunya Li; Daijun Chen; Shifeng Xiong

    In this paper we propose a linear variable screening method for computer experiments when the number of input variables is larger than the number of runs. This method uses a linear model to model the nonlinear data, and screens the important variables by existing screening methods for linear models. When the underlying simulator is nearly sparse, we prove that the linear screening method is asymptotically

    Updated: 2020-10-02
  • Semi‐supervised joint learning for longitudinal clinical events classification using neural network models
    Stat (IF 0.766) Pub Date : 2020-08-11
    Weijing Tang; Jiaqi Ma; Akbar K. Waljee; Ji Zhu

    The success of deep learning neural network models often relies on the accessibility of a large number of labelled training data. In many health care settings, however, only a small number of accurately labelled data are available while unlabelled data are abundant. Further, input variables such as clinical events in the medical settings are usually of longitudinal nature, which poses additional challenges

    Updated: 2020-10-02
  • Noisy low‐rank matrix completion under general bases
    Stat (IF 0.766) Pub Date : 2020-07-28
    Lei Shi; Changliang Zou

    In this paper, we consider the low‐rank matrix completion problem under general bases, which intends to recover a structured matrix via a linear combination of prespecified bases. Existing works focus primarily on orthonormal bases; however, it is often necessary to adopt nonorthonormal bases in some real applications. Thus, there is a great need to address the feasibility of some popular estimators

    Updated: 2020-10-02
  • Visual Tests for Elliptically Symmetric Distributions
    Stat (IF 0.766) Pub Date : 2020-09-24
    Pritha Guha; Biman Chakraborty

    We propose a visual test of goodness of fit for families of elliptically symmetric distributions based on a test statistic derived from scale‐scale plots. The scale‐scale plots are constructed based on the volume functionals of the central rank regions. The test is motivated through the multivariate normal distributions, and extended to a test of elliptical symmetry. We derive the asymptotic properties

    Updated: 2020-09-25
  • Nonasymptotic support recovery for high dimensional sparse covariance matrices
    Stat (IF 0.766) Pub Date : 2020-09-19
    Adam B. Kashlak; Linglong Kong

    For high dimensional data, the standard empirical estimator for the covariance matrix is very poor, and thus many methods have been proposed to more accurately estimate the covariance structure of high dimensional data. In this article, we consider estimation under the assumption of sparsity, but regularize with respect to the individual false positive rate for incorrectly including a matrix entry

    Updated: 2020-09-20
  • Expectile Regression via Deep Residual Networks
    Stat (IF 0.766) Pub Date : 2020-09-18
    Yiyi Yin; Hui Zou

    Expectile is a generalization of the expected value in probability and statistics. In finance and risk management, the expectile is considered to be an important risk measure due to its connection with gain‐loss ratio and its coherent and elicitable properties. Linear multiple expectile regression was proposed in 1987 for estimating the conditional expectiles of a response given a set of covariates

    Updated: 2020-09-20
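To illustrate how an expectile generalizes the expected value, as stated in the entry above, here is a minimal sketch that computes the sample τ-expectile by asymmetric least squares: observations above the current estimate get weight τ, the rest get weight 1 − τ, and the weighted mean is recomputed until it stops changing. This is the unconditional sample version only, not the deep residual network regression of the paper; the function name `expectile` and its tolerance arguments are assumptions for illustration.

```python
def expectile(ys, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile via iteratively reweighted means.

    The tau-expectile minimizes sum_i |tau - 1{y_i <= e}| * (y_i - e)^2.
    Its first-order condition says e is a weighted mean of the data with
    weight tau above e and (1 - tau) at or below e.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    e = sum(ys) / len(ys)  # the mean is the 0.5-expectile; use it to start
    for _ in range(max_iter):
        weights = [tau if y > e else 1.0 - tau for y in ys]
        new_e = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    for tau in (0.1, 0.5, 0.9):
        print(tau, expectile(data, tau))
```

With τ = 0.5 the weights are equal and the result is the ordinary mean; moving τ toward 1 pulls the expectile toward the upper tail.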
  • Mixed effects envelope models
    Stat (IF 0.766) Pub Date : 2020-09-11
    Yuyang Shi; Linquan Ma; Lan Liu

    When multiple measures are collected repeatedly over time, redundancy typically exists among responses. The envelope method was recently proposed to reduce the dimension of responses without loss of information in regression with multivariate responses. It can gain substantial efficiency over the standard least squares estimator. In this paper, we generalize the envelope method to mixed effects models

    Updated: 2020-09-11
  • Deep learning from a statistical perspective
    Stat (IF 0.766) Pub Date : 2020-08-31
    Yubai Yuan; Yujia Deng; Yanqing Zhang; Annie Qu

    As one of the most rapidly developing artificial intelligence techniques, deep learning has been applied in various machine learning tasks and has received great attention in data science and statistics. Regardless of the complex model structure, deep neural networks can be viewed as a nonlinear and nonparametric generalization of existing statistical models. In this review, we introduce several popular

    Updated: 2020-08-31
  • Cross‐dimple in the cross‐covariance functions of bivariate isotropic random fields on spheres
    Stat (IF 0.766) Pub Date : 2020-08-27
    Alfredo Alegría

    Multivariate random fields make it possible to model multiple spatially indexed variables simultaneously, playing a fundamental role in geophysical, environmental, and climate disciplines. This paper introduces the concept of cross‐dimple for bivariate isotropic random fields on spheres and proposes an approach to build parametric models that possess this attribute. Our findings are based on the spectral representation

    Updated: 2020-08-27
  • Exponential family tensor completion with auxiliary information
    Stat (IF 0.766) Pub Date : 2020-08-24
    Jichen Yang; Nan Zhang

    Tensor completion is among the most important tasks in tensor data analysis, which aims to fill the missing entries of a partially observed tensor. In many real applications, non‐Gaussian data such as binary or count data are frequently collected. Thus, it is inappropriate to assume that observations are normally distributed and formulate tensor completion with least squares based approaches. In this

    Updated: 2020-08-24
  • Deep fiducial inference
    Stat (IF 0.766) Pub Date : 2020-08-16
    Gang Li; Jan Hannig

    Since the mid‐2000s, there has been a resurrection of interest in modern modifications of fiducial inference. To date, the main computational tool to extract a generalized fiducial distribution is Markov chain Monte Carlo (MCMC). We propose an alternative way of computing a generalized fiducial distribution that could be used in complex situations. In particular, to overcome the difficulty when the

    Updated: 2020-08-16
  • Robust inference for nonlinear regression models from the Tsallis score: application to COVID-19 contagion in Italy
    Stat (IF 0.766) Pub Date : 2020-08-12
    Paolo Girardi; Luca Greco; Valentina Mameli; Monica Musio; Walter Racugno; Erlis Ruli; Laura Ventura

    We discuss robust fitting of non‐linear regression models, in both frequentist and Bayesian settings, which can be employed to model and predict the contagion dynamics of the coronavirus disease 2019 (COVID‐19) in Italy. The focus is on the analysis of epidemic data using robust dose–response curves, but the functionality is applicable to arbitrary non‐linear regression models.

    Updated: 2020-08-12
  • A Bayesian non‐parametric approach for automatic clustering with feature weighting
    Stat (IF 0.766) Pub Date : 2020-08-11
    Debolina Paul; Swagatam Das

    Despite being a well‐known problem, feature weighting and feature selection are a major predicament for clustering. Most of the algorithms, which provide weighting or selection of features, require the number of clusters to be known in advance. On the other hand, the existing automatic clustering procedures that can determine the number of clusters are computationally expensive and often do not make

    Updated: 2020-08-11
  • Disjunct support spike‐and‐slab priors for variable selection in regression under quasi‐sparseness
    Stat (IF 0.766) Pub Date : 2020-08-11
    Daniel Andrade; Kenji Fukumizu

    Sparseness of the regression coefficient vector is often a desirable property, because, among other benefits, sparseness improves interpretability. In practice, many true regression coefficients might be negligibly small, but nonzero, which we refer to as quasi‐sparseness. Spike‐and‐slab priors can be tuned to ignore very small regression coefficients and, as a consequence, provide a trade‐off between

    Updated: 2020-08-11
  • Sparse nonparametric regression with regularized tensor product kernel
    Stat (IF 0.766) Pub Date : 2020-08-11
    Hang Yu; Yuanjia Wang; Donglin Zeng

    With growing interest to use black‐box machine learning for complex data with many feature variables, it is critical to obtain a prediction model that only depends on a small set of features to maximize generalizability. Therefore, feature selection remains to be an important and challenging problem in modern applications. Most of the existing methods for feature selection are based on either parametric

    Updated: 2020-08-11
  • Model checking for parametric single‐index quantile models
    Stat (IF 0.766) Pub Date : 2020-08-06
    Liangliang Yuan; Wenhui Liu; Xuemin Zi; Zhaojun Wang

    In this work, we construct a lack‐of‐fit test for testing parametric single‐index quantile regression models. We apply the kernel smoothing technique for the multivariate nonparametric estimation involved in this task. To avoid the “curse of dimensionality” in multivariate nonparametric estimation and to fully utilize the information contained in the model, we employ a sufficient dimension reduction

    Updated: 2020-08-06
  • Small run size design for model identification in 3^m factorial experiments
    Stat (IF 0.766) Pub Date : 2020-08-04
    Fariba Z. Labbaf; Hooshang Talebi

    An active interaction in a main effect plan may cause biased estimation of the parameters in an analysis of variance (ANOVA) model. A fractional factorial design (FFD) with higher order resolution can resolve the alias problem, however, with a considerable number of runs. Alternatively, a search design (SD), the so‐called main effect plus k plan (MEP.k), with much less number of runs than FFD, is able

    Updated: 2020-08-04
  • Mixture modelling of categorical sequences with secondary components
    Stat (IF 0.766) Pub Date : 2020-07-30
    Xuwen Zhu

    In this paper, the forward selected first‐order Markov mixture (FSFOMM) is proposed for modelling heterogeneous categorical sequences with secondary components capable of detecting outlying sequences within each cluster. Such sequences are assumed to have different transition probabilities in certain states. The model provides an attractive and flexible tool for diagnostics of unusual behaviours and

    Updated: 2020-07-30
Contents have been reproduced by permission of the publishers.