-
Beta regression for double‐bounded response with correlated high‐dimensional covariates Stat (IF 1.7) Pub Date : 2024-03-12 Jianxuan Liu
Continuous responses measured on a standard unit interval are ubiquitous in many scientific disciplines. Statistical models built upon a normal error structure do not generally work because they can produce biassed estimates or result in predictions outside either bound. In real‐life applications, data are often high‐dimensional, correlated and consist of a mixture of various data types. Little literature
-
Visualisation and outlier detection for probability density function ensembles Stat (IF 1.7) Pub Date : 2024-03-12 Alexander C. Murph, Justin D. Strait, Kelly R. Moran, Jeffrey D. Hyman, Philip H. Stauffer
Exploratory data analysis (EDA) for functional data—data objects where observations are entire functions—is a difficult problem that has seen significant attention in recent literature. This surge in interest is motivated by the ubiquitous nature of functional data, which are prevalent in applications across fields such as meteorology, biology, medicine and engineering. Empirical probability density
-
Graph‐based mutually exciting point processes for modelling event times in docked bike‐sharing systems Stat (IF 1.7) Pub Date : 2024-03-08 Francesco Sanna Passino, Yining Che, Carlos Cardoso Correia Perello
This paper introduces graph‐based mutually exciting processes (GB‐MEP) to model event times in network point processes, focusing on an application to docked bike‐sharing systems. GB‐MEP incorporates known relationships between nodes in a graph within the intensity function of a node‐based multivariate Hawkes process. This approach reduces the number of parameters to a quantity proportional to the number
-
Optimal designs for crossover model with partial interactions Stat (IF 1.7) Pub Date : 2024-03-07 Futao Zhang, Pierre Druilhet, Xiangshun Kong
This paper studies the universally optimal designs for estimating total effects under crossover models with partial interactions. We provide necessary and sufficient conditions for a symmetric design to be universally optimal, based on which algorithms can be used to derive optimal symmetric designs under any form of the within‐block covariance matrix. To cope with the computational complexity
-
Table inference for combinatorial origin‐destination choices in agent‐based population synthesis Stat (IF 1.7) Pub Date : 2024-03-06 Ioannis Zachos, Theodoros Damoulas, Mark Girolami
A key challenge in agent‐based mobility simulations is the synthesis of individual agent socioeconomic profiles. Such profiles include locations of agent activities, which dictate the quality of the simulated travel patterns. These locations are typically represented in origin‐destination matrices that are sampled using coarse travel surveys. This is because fine‐grained trip profiles are scarce and
-
D‐optimal designs for multi‐response linear models with two groups Stat (IF 1.7) Pub Date : 2024-03-05 Xin Liu, Lei He, Rong‐Xian Yue
In recent years, multi‐response linear models have gained significant popularity in various statistical applications. However, the design aspects of multi‐response linear models with group‐wise considerations have received limited attention in the literature. This paper aims to thoroughly investigate D‐optimal designs for such models. Specifically, we focus on scenarios involving two groups, where the
-
Image registration for zooming: A statistically consistent local feature mapping approach Stat (IF 1.7) Pub Date : 2024-03-05 Sujay Das, Anik Roy, Partha Sarathi Mukherjee
Image registration is a widely used tool for matching two images of the same scene with one another. In the literature, several image registration techniques are available to register rigid‐body and non‐rigid‐body transformations. One such important transformation is zooming. There are very few feature‐based methods that address this particular problem. These methods fail miserably when there are only
-
Asymptotic behaviour of a non‐autonomous multispecies Holling type II model with a complex type of noises Stat (IF 1.7) Pub Date : 2024-03-05 Libai Xu, Xintong Ma, Yanyan Zhao
The deterministic non‐autonomous multispecies Holling type II model and its stochastic version with a simple type of noise have been proposed to infer multispecies community structure. However, these models fail to account for complex types of noises, which may render the model overly simplistic. In this paper, a non‐autonomous multispecies Holling type II model with a complex type of noise has been
-
What matters to graduate students? Experiences at a statistical consulting center from pre‐ to post‐COVID‐19 pandemic Stat (IF 1.7) Pub Date : 2024-03-04 Marianne Huebner, Steven J. Pierce, Andrew J. Dennhardt, Hope Akaeze, Nicole Jess, Wenjuan Ma
The COVID‐19 pandemic led to unprecedented changes in all levels of society, including the statistical consulting field. This paper focuses on the experiences of graduate student consultants and clients at our statistical consulting center (SCC) that operates all year independent of semesters. During the lockdown period, work continued without interruption and was conducted remotely, but there was
-
Machine collaboration Stat (IF 1.7) Pub Date : 2024-03-01 Qingfeng Liu, Yang Feng
We propose a new ensemble framework for supervised learning, called machine collaboration (MaC), using a collection of possibly heterogeneous base learning methods (hereafter, base machines) for prediction tasks. Unlike bagging/stacking (a parallel and independent framework) and boosting (a sequential and top‐down framework), MaC is a type of circular and recursive learning framework. The circular
-
Highly private large‐sample tests for contingency tables Stat (IF 1.7) Pub Date : 2024-02-29 Sungkyu Jung, Seung Woo Kwak
Differential privacy is a foundational concept for safeguarding sensitive individual information when releasing data or statistical analysis results. In this study, we concentrate on the protection of privacy in the context of goodness‐of‐fit (GOF) and independence tests, utilizing perturbed contingency tables that adhere to Gaussian differential privacy within the high‐privacy regime, where the degrees
-
Linear mixed models for complex survey data: Implementing and evaluating pairwise likelihood Stat (IF 1.7) Pub Date : 2024-02-27 Thomas Lumley, Xudong Huang
As complex‐survey data become more widely used in health and social science research, there is increasing interest in fitting a wider range of regression models. We describe an implementation of two‐level linear mixed models in R using the pairwise composite likelihood approach of Rao and co‐workers. We discuss the computational efficiency of pairwise composite likelihood and compare the estimator
-
A note about why deep learning is deep: A discontinuous approximation perspective Stat (IF 1.7) Pub Date : 2024-02-22 Yongxin Li, Haobo Qi, Hansheng Wang
Deep learning has achieved unprecedented success in recent years. This approach essentially uses the composition of nonlinear functions to model the complex relationship between input features and output labels. However, a comprehensive theoretical understanding of why the hierarchical layered structure can exhibit superior expressive power is still lacking. In this paper, we provide an explanation
-
An EWMA sign chart for monitoring processes with fixed and variable sample sizes Stat (IF 1.7) Pub Date : 2024-02-09 Abdul Haq
This study addresses limitations in the nonparametric EWMA sign chart with fixed control limits (FCLs), particularly when facing time-varying sample sizes. The FCLs-based EWMA sign chart has a variable conditional false alarm rate (CFAR), especially at the startup of a process or after recovering from an out-of-control signal. To overcome these limitations, we propose a nonparametric EWMA sign chart
-
Deep learning models to predict primary open-angle glaucoma Stat (IF 1.7) Pub Date : 2024-02-07 Ruiwen Zhou, J. Philip Miller, Mae Gordon, Michael Kass, Mingquan Lin, Yifan Peng, Fuhai Li, Jiarui Feng, Lei Liu
Glaucoma is a major cause of blindness and vision impairment worldwide, and visual field (VF) tests are essential for monitoring the conversion of glaucoma. While previous studies have primarily focused on using VF data at a single time point for glaucoma prediction, there has been limited exploration of longitudinal trajectories. Additionally, many deep learning techniques treat the time-to-glaucoma
-
Estimation of the density for censored and contaminated data Stat (IF 1.7) Pub Date : 2024-02-07 Ingrid Van Keilegom, Elif Kekeç
Consider a situation where one is interested in estimating the density of a survival time that is subject to random right censoring and measurement errors. This happens often in practice, like in public health (pregnancy length), medicine (duration of infection), ecology (duration of forest fire), among others. We assume a classical additive measurement error model with Gaussian noise and unknown error
-
Reproducible research practices: A tool for effective and efficient leadership in collaborative statistics Stat (IF 1.7) Pub Date : 2024-02-11 Camille J. Hochheimer, Grace N. Bosma, Lauren Gunn-Sandell, Mary D. Sammel
With data and code sharing policies more common and version control more widely used in statistics, standards for reproducible research are higher than ever. Reproducible research practices must keep up with the fast pace of research. To do so, we propose combining modern practices of leadership with best practices for reproducible research in collaborative statistics as an effective tool for ensuring
-
A confidence machine for sparse high-order interaction model Stat (IF 1.7) Pub Date : 2024-02-05 Diptesh Das, Eugene Ndiaye, Ichiro Takeuchi
In predictive modelling for high-stake decision-making, predictors must be not only accurate but also reliable. Conformal prediction (CP) is a promising approach for obtaining the coverage of prediction results with fewer theoretical assumptions. To obtain the prediction set by so-called full-CP, we need to refit the predictor for all possible values of prediction results, which is only possible for
-
Softplus negative binomial network autoregression Stat (IF 1.7) Pub Date : 2024-01-18 Xiangyu Guo, Fukang Zhu
Modelling multivariate time series of counts in a parsimonious way is a popular topic. In this paper, we consider an integer-valued network autoregressive model with a non-random neighbourhood structure, which uses negative binomial distribution as the conditional marginal distribution and the softplus function as the link function. The new model generalizes existing ones in the literature and has
-
Ordered probit Bayesian additive regression trees for ordinal data Stat (IF 1.7) Pub Date : 2024-01-17 Jaeyong Lee, Beom Seuk Hwang
Bayesian additive regression trees (BART) is a nonparametric model that is known for its flexibility and strong statistical foundation. To provide a robust and flexible approach for analysing ordinal data, we extend BART into an ordered probit regression framework (OPBART). Further, we propose a semiparametric setting for OPBART (semi-OPBART) to model covariates of interest parametrically and confounding
-
Differentially private outcome-weighted learning for optimal dynamic treatment regime estimation Stat (IF 1.7) Pub Date : 2024-01-17 Dylan Spicker, Erica E. M. Moodie, Susan M. Shortreed
Precision medicine is a framework for developing evidence-based medical recommendations that seeks to determine the optimal sequence of treatments, tailored to all of the relevant, observable patient-level characteristics. Because precision medicine relies on highly sensitive, patient-level data, ensuring the privacy of participants is of great importance. Dynamic treatment regimes (DTRs) provide one
-
Development of network-guided transcriptomic risk score for disease prediction Stat (IF 1.7) Pub Date : 2024-01-16 Xuan Cao, Liangliang Zhang, Kyoungjae Lee
Omics data, routinely collected in various clinical settings, are of a complex and network-structured nature. Recent progress in RNA sequencing (RNA-seq) allows us to explore whole-genome gene expression profiles and to develop predictive models for disease risk. In this study, we propose a novel Bayesian approach to construct an RNA-seq-based risk score leveraging a gene expression network for disease risk
-
Some benefits of standardisation for conditional extremes Stat (IF 1.7) Pub Date : 2024-01-14 Christian Rohrbeck, Jonathan A. Tawn
A key aspect where extreme values methods differ from standard statistical models is through having asymptotic theory to provide a theoretical justification for the nature of the models used for extrapolation. In multivariate extremes, many different asymptotic theories have been proposed, partly as a consequence of the lack of ordering property with vector random variables. One class of multivariate
-
Developing partnerships for academic data science consulting and collaboration units Stat (IF 1.7) Pub Date : 2024-01-11 Marianne Huebner, Laura Bond, Felesia Stukes, Joel Herndon, David J. Edwards, Gina-Maria Pomann
Data science consulting and collaboration units (DSUs) are core infrastructure for research at universities. Activities span data management, study design, data analysis, data visualization, predictive modelling, preparing reports, manuscript writing and advising on statistical methods and may include an experiential or teaching component. Partnerships are needed for a thriving DSU as an active part
-
Equivalence testing for multiple groups Stat (IF 1.7) Pub Date : 2024-01-10 Tony Pourmohamad, Herbert K. H. Lee
Testing for equivalence, rather than testing for a difference, is an important component of some scientific studies. While the focus of the existing literature is on comparing two groups for equivalence, real-world applications arise regularly that require testing across more than two groups. This paper reviews the existing approaches for testing across multiple groups and proposes a novel framework
-
Iterative estimating equations for disease mapping with spatial zero-inflated Poisson data Stat (IF 1.7) Pub Date : 2024-01-10 Pei-Sheng Lin, Jun Zhu, Feng-Chang Lin
Spatial epidemiology often involves the analysis of spatial count data with an unusually high proportion of zero observations. While Bayesian hierarchical models perform very well for zero-inflated data in many situations, a smooth response surface is usually required for the Bayesian methods to converge. However, for infectious disease data with excessive zeros, a Wombling issue with large spatial
-
Significance of modes in the torus by topological data analysis Stat (IF 1.7) Pub Date : 2023-12-17 Changjo Yu, Sungkyu Jung, Jisu Kim
This paper addresses the problem of identifying modes or density bumps in multivariate angular or circular data, which have diverse applications in fields like medicine, biology and physics. We focus on the use of topological data analysis and persistent homology for this task. Specifically, we extend the methods for uncertainty quantification in the context of a torus sample space, where circular
-
Robust nonparametric estimation of average treatment effects: A propensity score-based varying coefficient approach Stat (IF 1.7) Pub Date : 2023-12-12 Zhaoqing Tian, Peng Wu, Zixin Yang, Dingjiao Cai, Qirui Hu
We present a novel nonparametric approach for estimating average treatment effects (ATEs), addressing a fundamental challenge in causal inference research, both in theory and empirical studies. Our method offers an effective solution to mitigate the instability problem caused by propensity scores close to zero or one, which are commonly encountered in (augmented) inverse probability weighting approaches
-
An asymptotically efficient closed-form estimator for the Dirichlet distribution Stat (IF 1.7) Pub Date : 2023-12-13 Jae Ho Chang, Sang Kyu Lee, Hyoung-Moon Kim
The maximum likelihood estimator (MLE) of the Dirichlet distribution is usually obtained by using the Newton–Raphson algorithm. However, in some cases, the computational costs can be burdensome, for example, in real-time processes. Therefore, it is beneficial to develop a closed-form estimator that is as efficient as the MLE for large samples. Here, we suggest asymptotically efficient closed-form estimator
-
A comparative analysis of contractual risks in statistical consulting Stat (IF 1.7) Pub Date : 2023-12-07 David Shilane, Nicole L. Lorenzetti, David K. Kruetter
This study enumerates and compares the risks and rewards of different forms of statistical consulting contracts. We assess three different contract models: project-based fees, hourly fees, and retainer agreements and three different planned durations: project-based, time-based, and evergreen contracts. The requirements of time and effort vary considerably for many aspects of consulting work. The risks
-
Observation-driven exponential smoothing Stat (IF 1.7) Pub Date : 2023-12-07 Dimitris Karlis, Xanthi Pedeli, Cristiano Varin
This article presents an approach to forecasting count time series with a form of exponential smoothing built from observation-driven models. The proposed method is easy to implement and simple to interpret. A variant of the approach is also proposed to handle the impact of outliers on the forecast. The performance of the methodology is studied with simulations and illustrated with an analysis of the
-
Estimation of the ROC curve and the area under it with complex survey data Stat (IF 1.7) Pub Date : 2023-12-04 Amaia Iparragirre, Irantzu Barrio, Inmaculada Arostegui
Logistic regression models are widely applied in daily practice. Hence, it is necessary to ensure they have an adequate predictive performance, which is usually estimated by means of the receiver operating characteristic (ROC) curve and the area under it (area under the curve [AUC]). Traditional estimators of these parameters are thought to be applied to simple random samples but are not appropriate
-
Non-degenerate U-statistics for data missing completely at random with application to testing independence Stat (IF 1.7) Pub Date : 2023-11-27 Danijel Aleksić, Marija Cuparić, Bojana Milošević
Although the era of digitalization has enabled access to large quantities of data, due to their insufficient structuring, some data are often missing, and sometimes, the percentage of missing data is significant compared to the entire sample. On the other hand, most of the statistical methodology is designed for complete data. Here, we explore the asymptotic properties of non-degenerate U-statistics
-
A latent space accumulator model for response time: Applications to cognitive assessment data Stat (IF 1.7) Pub Date : 2023-11-20 Ick Hoon Jin, Jonghyun Yun, Hyunjoo Kim, Minjeong Jeon
Response time has attracted increased interest in educational and psychological assessment, for example, for measuring test takers' processing speed, improving the measurement accuracy of ability and understanding aberrant response behaviour. Most models for response time analysis are based on a parametric assumption about the response time distribution. The Cox proportional hazard model has been utilized
-
Exact confidence intervals for the difference of two proportions based on partially observed binary data Stat (IF 1.7) Pub Date : 2023-11-20 Chongxiu Yu, Weizhen Wang, Zhongzhan Zhang
In a matched pairs experiment, two binary variables are typically observed on all subjects in the experiment. However, when one of the variables is missing on some subjects, we have so-called partially observed binary data, which consist of two parts: a multinomial from the subjects with a pair of observed variables and two independent binomials from the subjects with only one observed variable.
-
Mediation analysis with latent factors using simultaneous group-wise and parameter-wise penalization Stat (IF 1.7) Pub Date : 2023-11-06 Xizhen Cai, Qing Wang, Yeying Zhu
Mediation analysis aims to uncover the underlying mechanism of how an exposure variable affects the outcome of interest through one or more than one mediating variables. In the event that the number of candidate mediators is large, variable selection or dimension reduction techniques are often utilized to reduce the dimension of the initial set of mediators. In this paper, we propose a latent variable
-
On the dissipation of ideal Hamiltonian Monte Carlo sampler Stat (IF 1.7) Pub Date : 2023-10-27 Qijia Jiang
We report on what seems to be an intriguing connection between variable integration time and partial velocity refreshment of Ideal Hamiltonian Monte Carlo samplers, both of which can be used for reducing the dissipative behaviour of the dynamics. More concretely, we show that on quadratic potentials, efficiency can be improved through these means by a $\sqrt{\kappa}$ factor in Wasserstein-2
-
Determining the dimension of weighted inverse regression ensemble Stat (IF 1.7) Pub Date : 2023-10-25 Yinfeng Chen, Lu Li, Zhou Yu
Sliced inverse regression (SIR) has propelled sufficient dimension reduction (SDR) into a mature and versatile field with wide-ranging applications in statistics, including regression diagnostics, data visualisation, image processing and machine learning. However, traditional inverse regression techniques encounter challenges associated with sparsity arising from slicing operations. Weighted inverse
-
Efficient estimation for the proportional hazards model with left-truncated and interval-censored data Stat (IF 1.7) Pub Date : 2023-10-20 Tianyi Lu, Hongxi Li, Shuwei Li, Liuquan Sun
Interval-censored data often arise in prospective studies involving periodical follow-up for monitoring the failure event occurrence. In addition to censoring, left truncation also occurs if only participants who have not experienced the failure event are enrolled in the study, which clearly induces the selection bias and makes the analysis more complicated. This work provides an efficient maximum
-
Grouped rank centrality: Ranking and grouping from pairwise comparisons simultaneously Stat (IF 1.7) Pub Date : 2023-10-20 Xin-Yu Tian, Jian Shi
Interpretation of ranking can be simplified by grouping when the number of ranking items is large. This paper is concerned with the problem of ranking and grouping from pairwise comparisons simultaneously so that items with similar abilities are clustered into the same group. To achieve this, a penalised spectral ranking method, named grouped rank centrality, is designed. In the method, the fused
-
On the use of ordered factors as explanatory variables Stat (IF 1.7) Pub Date : 2023-10-16 Adelchi Azzalini
Consider a regression or some regression-type model for a certain response variable where the linear predictor includes an ordered factor among the explanatory variables. The inclusion of a factor of this type can take place in a few different ways, discussed in the pertaining literature. The present contribution proposes a different way of tackling this problem, by constructing a numeric variable
-
Functional time series forecasting: Functional singular spectrum analysis approaches Stat (IF 1.7) Pub Date : 2023-10-17 Jordan Trinka, Hossein Haghbin, Han Lin Shang, Mehdi Maadooliat
We introduce two novel nonparametric forecasting methods designed for functional time series (FTS), namely, functional singular spectrum analysis (FSSA) recurrent and vector forecasting. Our algorithms rely on extracted signals obtained from the FSSA method and innovative recurrence relations to make predictions. These techniques are model-free, capable of predicting nonstationary FTS and utilize a
-
Persistently trained, diffusion-assisted energy-based models Stat (IF 1.7) Pub Date : 2023-10-08 Xinwei Zhang, Zhiqiang Tan, Zhijian Ou
Maximum likelihood (ML) learning for energy-based models (EBMs) is challenging, partly due to nonconvergence of Markov chain Monte Carlo. Several variations of ML learning have been proposed, but existing methods all fail to achieve both posttraining image generation and proper density estimation. We propose to introduce diffusion data and learn a joint EBM, called diffusion-assisted EBMs, through
-
Gaussian process regression and classification using International Classification of Disease codes as covariates Stat (IF 1.7) Pub Date : 2023-10-07 Sanvesh Srivastava, Zongyi Xu, Yunyi Li, W. Nick Street, Stephanie Gilbertson-White
In electronic health records (EHRs) data analysis, nonparametric regression and classification using International Classification of Disease (ICD) codes as covariates remain understudied. Automated methods have been developed over the years for predicting biomedical responses using EHRs, but relatively less attention has been paid to developing patient similarity measures that use ICD codes and chronic
-
Inference for joint quantile and expected shortfall regression Stat (IF 1.7) Pub Date : 2023-10-03 Xiang Peng, Huixia Judy Wang
Quantiles and expected shortfalls are commonly used risk measures in financial risk management. The two measurements are correlated while having distinguished features. In this project, our primary goal is to develop a stable and practical inference method for the conditional expected shortfall. We consider the joint modelling of conditional quantile and expected shortfall to facilitate the statistical
-
Homogeneity of marginal distributions for a large number of populations Stat (IF 1.7) Pub Date : 2023-10-03 M. V. Alba-Fernández, M. D. Jiménez-Gamero
Assume that a random vector (X, Y) is observed in k populations and independent samples of that random vector are available at each population. Assume that X and Y have the same dimension. Our purpose is to test the equality of the marginal distributions of X and Y in the k populations when k is large compared to the sample sizes. With this aim, we propose and study a test statistic that compares
-
Nonparanormal graph quilting with applications to calcium imaging Stat (IF 1.7) Pub Date : 2023-09-27 Andersen Chang, Lili Zheng, Gautam Dasarathy, Genevera I. Allen
Probabilistic graphical models have become an important unsupervised learning tool for detecting network structures for a variety of problems, including the estimation of functional neuronal connectivity from two-photon calcium imaging data. However, in the context of calcium imaging, technological limitations only allow for partially overlapping layers of neurons in a brain region of interest to be
-
Asymptotic tail properties of Poisson mixture distributions Stat (IF 1.7) Pub Date : 2023-09-26 Samuel Valiquette, Gwladys Toulemonde, Jean Peyhardi, Éric Marchand, Frédéric Mortier
Count data are omnipresent in many applied fields, often with overdispersion. With mixtures of Poisson distributions representing an elegant and appealing modelling strategy, we focus here on how the tail behaviour of the mixing distribution is related to the tail of the resulting Poisson mixture. We define five sets of mixing distributions, and we identify for each case whenever the Poisson mixture
-
Modified trajectory fitting estimators for multi-regime threshold Ornstein–Uhlenbeck processes Stat (IF 1.7) Pub Date : 2023-09-25 Yuecai Han, Dingwen Zhang
The threshold Ornstein–Uhlenbeck process is a stochastic process governed by m Ornstein–Uhlenbeck subprocesses with the ith playing a role whenever the underlying process is in the ith regime. In this paper, we investigate the parameter estimation for threshold Ornstein–Uhlenbeck processes with multiple thresholds. The classical trajectory fitting method does not apply in this context due to the significantly
-
An importance sampling approach for reliable and efficient inference in Bayesian ordinary differential equation models Stat (IF 1.7) Pub Date : 2023-09-18 Juho Timonen, Nikolas Siccha, Ben Bales, Harri Lähdesmäki, Aki Vehtari
Statistical models can involve implicitly defined quantities, such as solutions to nonlinear ordinary differential equations (ODEs), that unavoidably need to be numerically approximated in order to evaluate the model. The approximation error inherently biases statistical inference results, but the amount of this bias is generally unknown and often ignored in Bayesian parameter inference. We propose
-
Semi-parametric generalized linear model for binomial data with varying cluster sizes Stat (IF 1.7) Pub Date : 2023-09-18 Xinran Qi, Aniko Szabo
The semi-parametric generalized linear model (SPGLM) proposed by Rathouz and Gao assumes that the response is from a general exponential family with unspecified reference distribution and can be applied to model the distribution of binomial event-count data with a constant cluster size. We extend SPGLM to model response distributions of binomial data with varying cluster sizes by assuming marginal
-
On pairwise interaction multivariate Pareto models Stat (IF 1.7) Pub Date : 2023-09-10 Michaël Lalancette
The rich class of multivariate Pareto distributions forms the basis of recently introduced extremal graphical models. However, most existing literature on the topic is focused on the popular parametric family of Hüsler–Reiss distributions. It is shown that the Hüsler–Reiss family is in fact the only continuous multivariate Pareto model that exhibits the structure of a pairwise interaction model, justifying
-
New penalty in information criteria for the ARCH sequence with structural changes Stat (IF 1.7) Pub Date : 2023-09-10 Ryoto Ozaki, Yoshiyuki Ninomiya
For change point models and autoregressive conditional heteroscedasticity (ARCH) models, which have long been important especially in econometrics, we develop information criteria that work well even when considering a combination of these models. Since the change point model does not satisfy the conventional statistical asymptotics, a formal Akaike information criterion (AIC) with twice the number
-
A modified partial envelope tensor response regression Stat (IF 1.7) Pub Date : 2023-09-12 Wenxing Guo, Narayanaswamy Balakrishnan, Shanshan Qin
The envelope model is a useful statistical technique that can be applied to multivariate linear regression problems. It aims to remove immaterial information via sufficient dimension reduction techniques while still gaining efficiency and providing accurate parameter estimates. Recently, envelope tensor versions have been developed to extend this technique to tensor data. In this work, a partial tensor
-
Conditional mixture modelling for heavy-tailed and skewed data Stat (IF 1.7) Pub Date : 2023-08-30 Aqi Dong, Volodymyr Melnykov, Yang Wang, Xuwen Zhu
Overparameterization is a serious concern for multivariate mixture models as it can lead to model overfitting and, as a result, mixture order underestimation. Parsimonious modelling is one of the most effective remedies in this context. In Gaussian mixture models, the majority of parameters is associated with covariance matrices and parsimonious models based on factor analysers and spectral decomposition
-
Likelihood-based inference for linear mixed-effects models using the generalized hyperbolic distribution Stat (IF 1.7) Pub Date : 2023-08-17 Victor H. Lachos, Manuel Galea, Camila Zeller, Marcos O. Prates
In this paper, we develop statistical methodology for the analysis of data under nonnormal distributions, in the context of mixed effects models. Although the multivariate normal distribution is useful in many cases, it is not appropriate, for instance, when the data come from skewed and/or heavy-tailed distributions. To analyse data with these characteristics, in this paper, we extend the standard
-
Proposed variable sampling interval maximum EWMA and distance EWMA charts with unknown process parameters Stat (IF 1.7) Pub Date : 2023-08-16 Rehana Parvin, Michael B. C. Khoo, Sajal Saha, Wei Lin Teoh
The variable sampling interval (VSI) exponentially weighted moving average (EWMA) chart which varies the chart's sampling interval according to the value of the current plotting statistic increases the speed of the standard EWMA chart in detecting shifts. Joint monitoring schemes use a single combined statistic for the mean and variance in process monitoring. To simultaneously monitor the mean and
-
Statistical inference and distributed implementation for linear multicategory SVM Stat (IF 1.7) Pub Date : 2023-08-14 Gaoming Sun, Xiaozhou Wang, Yibo Yan, Riquan Zhang
Support vector machine (SVM) is one of the most prevalent classification techniques due to its excellent performance. The standard binary SVM has been well-studied. However, a large number of multicategory classification problems in the real world are equally worth attention. In this paper, focusing on the computationally efficient multicategory angle-based SVM model, we first study the statistical