-
A multivariate Poisson regression model for count data J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-22 J. M. Muñoz-Pichardo; R. Pino-Mejías; J. García-Heras; F. Ruiz-Muñoz; M. Luz González-Regalado
We propose a new technique for the study of multivariate count data. The proposed model is applied to the study of the number of individuals of several fossil species found in a set of geographical observation points. First, we propose a multivariate model based on Poisson distributions, which allows positive and negative correlations between the components. We are extending the log-linear Poisson
-
A new flexible generalized family for constructing many families of distributions J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-20 M. H. Tahir; M. Adnan Hussain; Gauss M. Cordeiro
We propose a new flexible generalized family (NFGF) for constructing many families of distributions. The importance of the NFGF is that any baseline distribution can be chosen and it does not involve any additional parameters. Some useful statistical properties of the NFGF are determined such as a linear representation for the family density, analytical shapes of the density and hazard rate, random
-
How many people participated in candlelight protests? Counting the size of a dynamic crowd J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-13 Seonghun Cho; Johan Lim; Woncheol Jang
The recent controversy about the size of crowds at candlelight protests in Korea raises an interesting question regarding the methods used to estimate crowd size. Protest organizers tend to count all participants in the event from its start to finish, while the police usually report the crowd size at its peak. While several counting methods are available to estimate the size of a crowd at a given time
-
Penalized likelihood approach for the four-parameter kappa distribution J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-12 Nipada Papukdee; Jeong-Soo Park; Piyapatr Busababodhin
The four-parameter kappa distribution (K4D) is a generalized form of some commonly used distributions such as generalized logistic, generalized Pareto, generalized Gumbel, and generalized extreme value (GEV) distributions. Owing to its flexibility, the K4D is widely applied in modeling in several fields such as hydrology and climatic change. For the estimation of the four parameters, the maximum likelihood
-
A novel alpha power transformed exponential distribution with real-life applications J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-12 Muhammad Ijaz; Wali Khan Mashwani; Atilla Göktaş; Yuksel Akay Unvan
It has been observed in the probability theory literature that the existing probability distributions do not provide an adequate fit and fail to model lifetime data with non-monotonic hazard rate shapes. To cover this gap, researchers are working on modifications of these distributions. In this paper, a new family of probability distributions is introduced called New Alpha Power Transformed
-
The optimized CUSUM and EWMA multi-charts for jointly detecting a range of mean and variance change J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-12 Gideon Mensah Engmann; Dong Han
This article considers the problem of jointly monitoring the mean and variance of a process by multi-chart schemes. A multi-chart is a combination of several single charts that detects changes in a process quickly. Asymptotic analyses and simulation studies show that the optimized CUSUM multi-chart performs better than the optimized EWMA multi-chart in jointly detecting mean and variance
-
Adaptive kernel scaling support vector machine with application to a prostate cancer image study J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-08 Xin Liu; Wenqing He
The support vector machine (SVM) is a popularly used classifier in applications such as pattern recognition, texture mining and image retrieval owing to its flexibility and interpretability. However, its performance deteriorates when the response classes are imbalanced. To enhance the performance of the support vector machine classifier in the imbalanced cases we investigate a new two stage method
-
A nonlinear measurement error model and its application to describing the dependency of health outcomes on dietary intake J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-07 B. Curley
Many nutritional studies focus on the relationship between individuals' diets and resulting health outcomes. When examining these relationships, researchers are generally interested in individuals' long-term, average intake of nutrients; however, typically only 1–2 days of data are collected. If analyses are performed without accounting for the error in estimating usual intake, estimates will
-
Streaming constrained binary logistic regression with online standardized data J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-06 Benoît Lalloué; Jean-Marie Monnez; Eliane Albuisson
Online learning is a method for analyzing very large datasets (‘big data’) as well as data streams. In this article, we consider the case of constrained binary logistic regression and show the interest of using processes with an online standardization of the data, in particular to avoid numerical explosions or to allow the use of shrinkage methods. We prove the almost sure convergence of such
-
Restricted calibration and weight trimming approaches for estimation of the population total in business statistics J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-06 Cenker Burak Metin; Sinem Tuğba Şahin Tekin; Yaprak Arzu Özdemir
Some adjustments are made to design weights to reduce the negative effects of non-response and out-of-scope problems. The calibration approach is a weighting process that agrees with the known population values by using auxiliary information. In this study, alternative calibration approaches and weight trimming process that can be used in large data sets with extreme weights and different correlation
-
Threshold single multiplicative neuron artificial neural networks for non-linear time series forecasting J. Appl. Stat. (IF 1.031) Pub Date : 2021-01-06 Asiye Nur Yildirim; Eren Bas; Erol Egrioglu
Single multiplicative neuron artificial neural networks differ in importance from many other artificial neural networks because they do not have a complex architecture problem, too many parameters, or long computation times. In a single multiplicative neuron artificial neural network, it is assumed that there is a single data generation process for the time series. Many time series need an
-
Forecasting drought using neural network approaches with transformed time series data J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-31 O. Ozan Evkaya; Fatma Sevinç Kurnaz
Drought is one of the most important and costliest disasters worldwide. With the accelerated progress of climate change, its frequency of occurrence and negative impacts are rapidly increasing. It is crucial to initiate and sustain an early warning system to monitor and predict the possible impacts of future droughts. Recently, with the rise of data driven models, various case studies are
-
Directional monitoring and diagnosis for covariance matrices J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-30 Hongying Jing; Jian Li; Kaizong Bai
Statistical surveillance for covariance matrices has attracted increasing attention recently. Many approaches have been developed for monitoring general shifts that are arbitrary deviations, as well as sparse shifts occurring in only a few elements. This paper considers directional shifts that occur in only one independent parameter, which is common if the process is relatively stable. A directional
-
Risk analysis in the Brazilian stock market: copula-APARCH modeling for value-at-risk J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-30 Marcela de Marillac Carvalho; Thelma Sáfadi
Risk management of stock portfolios is a fundamental problem for the financial analysis since it indicates the potential losses of an investment at any given time. The objective of this study is to use bivariate static conditional copulas to quantify the dependence structure and to estimate the risk measure Value-at-Risk (VaR). There were selected stocks that have been performing outstandingly on the
-
Sequential asymmetric third order rotatable designs (SATORDs) J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-29 M. Hemavathi; Eldho Varghese; Shashi Shekhar; Seema Jaggi
Rotatable designs that are available for process/product optimization trials are mostly symmetric in nature. In many practical situations, response surface designs (RSDs) with mixed factor (unequal) levels are more suitable, as these designs explore more regions in the design space, but it is hard to get rotatable designs with a given level of asymmetry. When experimenting with unequal factor levels
-
Bayesian hierarchical models for linear networks J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-29 Zainab Al-kaabawi; Yinghui Wei; Rana Moyeed
The purpose of this study is to highlight dangerous motorways by estimating the intensity of accidents and to study its pattern across the UK motorway network. Two methods have been developed to achieve this aim. First, the motorway-specific intensity is estimated by using a homogeneous Poisson process. The heterogeneity across motorways is incorporated using two-level hierarchical models. The data structure
-
The process of transferring negative impulses in capital markets – a wavelet analysis J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-26 Milda Maria Burzala
The empirical research that is presented herein deals with the process of transferring negative impulses in capital markets during the subprime crisis (contagion, comovements, crisis transmission and shocks). A significant and positive contribution of the research conducted is the demonstration of how the wavelet analysis can be used in examining the various responses of the financial markets
-
A new outlier detection method based on convex optimization: application to diagnosis of Parkinson’s disease J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-23 Pakize Taylan; Fatma Yerlikaya-Özkurt; Burcu Bilgiç Uçak; Gerhard-Wilhelm Weber
Neuroscience is a combination of different scientific disciplines which investigate the nervous system to understand its biological basis. Recently, applications to the diagnosis of neurodegenerative diseases like Parkinson’s disease have become very promising by considering different statistical regression models. However, well-known statistical regression models may give misleading
-
Estimation in the partially nonlinear model by continuous optimization J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-23 Fatma Yerlikaya-Özkurt; Pakize Taylan; Müjgan Tez
A useful model for data analysis is the partially nonlinear model, where the response variable is represented as the sum of a nonparametric and a parametric component. In this study, we propose a new procedure for estimating the parameters in partially nonlinear models. Therefore, we consider a penalized profile nonlinear least squares problem where nonparametric components are expressed as a
-
Dynamic Bayesian adjustment of anticipatory covariates in retrospective data: application to the effect of education on divorce risk J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-23 Parfait Munezero; Gebrenegus Ghilagaber
We address a problem in inference from retrospective studies where the value of a variable is measured at the date of the survey but is used as a covariate for events that occurred long before the survey. This causes a problem because the value of the current-date (anticipatory) covariate does not follow the temporal order of events. We propose a dynamic Bayesian approach for modelling jointly
-
Analyzing partially paired data: when can the unpaired portion(s) be safely ignored? J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-23 Qianya Qi; Li Yan; Lili Tian
Partially paired data, either with incompleteness in one or both arms, are common in practice. For testing equality of means of two arms, practitioners often use only the portion of data with complete pairs and perform paired tests. Although such tests (referred to as ‘naive paired tests’) are legitimate, their power might be low as only partial data are utilized. The recently proposed ‘P-value pooling
-
Empirical evaluation of sub-cohort sampling designs for risk prediction modeling J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-21 Myeonggyun Lee; Anne Zeleniuch-Jacquotte; Mengling Liu
Sub-cohort sampling designs, such as nested case-control (NCC) and case-cohort (CC) studies, have been widely used to estimate biomarker-disease associations because of their cost effectiveness. These designs have been well studied and shown to maintain relatively high efficiency compared to full-cohort designs, but their performance in building risk prediction models has been less studied
-
High precision implementation of Steck's recursion method for use in goodness-of-fit tests J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-17 Jiefei Wang; Jeffrey C. Miecznikowski
Classical continuous goodness-of-fit (GOF) testing is employed for examining whether the data come from an assumed parametric model. In many cases, GOF tests assume a uniform null distribution and examine extreme values of the order statistics of the samples. Many of these statistics can be expressed by a function of the order statistics and the p-values amount to a joint probability statement based
-
Bivariate Birnbaum-Saunders accelerated lifetime model: estimation and diagnostic analysis J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-14 Maria Ioneris Oliveira; Michelli Barros; Joelson Campos; Francisco José A. Cysneiros
In this paper, we discuss the bivariate Birnbaum-Saunders accelerated lifetime model, in which we have modeled the dependence structure of bivariate survival data through the use of frailty models. Specifically, we propose the bivariate Birnbaum-Saunders model with the following frailty distributions: gamma, positive stable and logarithmic series. We present a study of inference and diagnostic analysis
-
MAP segmentation in Bayesian hidden Markov models: a case study J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-10 Alexey Koloydenko; Kristi Kuljus; Jüri Lember
We consider the problem of estimating the maximum posterior probability (MAP) state sequence for a finite state and finite emission alphabet hidden Markov model (HMM) in the Bayesian setup, where both emission and transition matrices have Dirichlet priors. We study a training set consisting of thousands of protein alignment pairs. The training data is used to set the prior hyperparameters for Bayesian
-
Analyzing the impacts of socio-economic factors on French departmental elections with CoDa methods J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-09 T. H. A. Nguyen; T. Laurent; C. Thomas-Agnan; A. Ruiz-Gazen
The vote shares by party on a given subdivision of a territory form a vector called composition (mathematically, a vector belonging to a simplex). It is interesting to model these shares and study the impact of the characteristics of the territorial units on the outcome of the elections. In the political economy literature, few regression models are adapted to the case of more than two political
-
Ultrahigh-dimensional sufficient dimension reduction for censored data with measurement error in covariates J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-08 Li-Pang Chen
(2020). Ultrahigh-dimensional sufficient dimension reduction for censored data with measurement error in covariates. Journal of Applied Statistics. Ahead of Print.
-
A review of tests for exponentiality with Monte Carlo comparisons J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-04 Everestus O. Ossai; Mbanefo S. Madukaife; Abimibola V. Oladugba
(2020). A review of tests for exponentiality with Monte Carlo comparisons. Journal of Applied Statistics. Ahead of Print.
-
Editorial J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-04 Jie Chen
(2021). Editorial. Journal of Applied Statistics: Vol. 48, No. 1, pp. 1-3.
-
Optimal B-robust estimators for the parameters of the power Lindley distribution J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-03 Berivan Çakmak; Fatma Zehra Doğru
Parameters of a distribution are generally estimated by using classical methods such as maximum likelihood (ML) and least squares (LS) estimation. However, these classical methods are very sensitive to outliers. This study, therefore, proposes the application of the optimal B-robust (OBR) estimation method, which is resistant to outliers, to estimate the parameters of the power Lindley (PL) distribution
-
A Bayesian approach on the two-piece scale mixtures of normal homoscedastic nonlinear regression models J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-03 Zahra Barkhordar; Mohsen Maleki; Zahra Khodadadi; Darren Wraith; Farajollah Negahdari
In this application note paper, we propose and examine the performance of a Bayesian approach for a homoscedastic nonlinear regression (NLR) model assuming errors with two-piece scale mixtures of normal (TP-SMN) distributions. The TP-SMN is a large family of distributions, covering both symmetrical/asymmetrical distributions as well as light/heavy tailed distributions, and provides an alternative
-
Robust bootstrap prediction intervals for univariate and multivariate autoregressive time series models J. Appl. Stat. (IF 1.031) Pub Date : 2020-12-01 Ufuk Beyaztas; Han Lin Shang
The bootstrap procedure has emerged as a general framework to construct prediction intervals for future observations in autoregressive time series models. Such models with outlying data points are standard in real data applications, especially in the field of econometrics. These outlying data points tend to produce high forecast errors, which reduce the forecasting performances of the existing
-
Variational Bayesian inference for association over phylogenetic trees for microorganisms J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-30 Xiaojuan Hao; Kent M. Eskridge; Dong Wang
With the advance of next generation sequencing technologies, researchers now routinely obtain a collection of microbial sequences with complex phylogenetic relationships. It is often of interest to analyze the association between certain environmental factors and characteristics of the microbial collection. Though methods have been developed to test for association between the microbial composition
-
Optimal designing of two-level skip-lot sampling reinspection plan J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-25 N. Murugeswari; P. Jeyadurga; S. Balamurali
Skip-lot sampling plans are often applied in industry to reduce the cost and effort of inspecting products that have an excellent quality history. Because they reduce the cost of inspection, such plans are economically attractive. In this paper, we develop a sampling plan by incorporating the idea of resampling in the two-level skip-lot sampling plan and
-
An effective deep residual network based class attention layer with bidirectional LSTM for diagnosis and classification of COVID-19 J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-24 Denis A. Pustokhin; Irina V. Pustokhina; Phuoc Nguyen Dinh; Son Van Phan; Gia Nhu Nguyen; Gyanendra Prasad Joshi; Shankar K.
Recently, the COVID-19 pandemic has affected many people's lives globally and has necessitated a massive number of screening tests to detect the existence of the coronavirus. At the same time, the rise of deep learning (DL) concepts helps to effectively develop a COVID-19 diagnosis model to attain maximum detection rate with minimum computation time. This paper presents a new Residual Network
-
Determining the relationship between stock return and financial performance: an analysis on Turkish deposit banks J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-24 M. Esra Atukalp
Banks play a very important role in financial markets due to their intermediary function. The availability of financing to businesses and individuals, the prevalence of branches throughout the country as well as the preference status at the collection point as a result of the habits of savings holders, have made deposit banks more active among other financial institutions. Since the banking system
-
Testing and dating structural changes in copula-based dependence measures J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-23 Florian Stark; Sven Otto
This paper is concerned with testing and dating structural breaks in the dependence structure of multivariate time series. We consider a cumulative sum (CUSUM) type test for constant copula-based dependence measures, such as Spearman's rank correlation and quantile dependencies. The asymptotic null distribution is not known in closed form and critical values are estimated by an i.i.d. bootstrap procedure
-
Structured sparse support vector machine with ordered features J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-18 Kuangnan Fang; Peng Wang; Xiaochen Zhang; Qingzhao Zhang
In the application of high-dimensional data classification, several attempts have been made to achieve variable selection by replacing the ℓ2-penalty with other penalties for the support vector machine (SVM). However, these high-dimensional SVM methods usually do not take into account the special structure among covariates (features). In this article, we consider a classification problem
-
Optimal partitioning for the proportional hazards model J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-18 Usha Govindarajulu; Thaddeus Tarpey
This paper discusses methods for clustering a continuous covariate in a survival analysis model. The advantages of using a categorical covariate defined from discretizing a continuous covariate (via clustering) are (i) enhanced interpretability of the covariate's impact on survival and (ii) relaxing model assumptions that are usually required for survival models, such as the proportional hazards model
-
Stress–strength reliability estimation involving paired observation with ties using bivariate exponentiated half-logistic model J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-18 Thomas Xavier; Joby K. Jose
This paper deals with the problem of maximum likelihood and Bayesian estimation of stress–strength reliability involving paired observation with ties using bivariate exponentiated half-logistic distribution. This problem is of importance because in some real applications the strength of the component is highly dependent on the stress experienced by it. A bivariate extension of exponentiated
-
Quantification of model risk that is caused by model misspecification J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-17 M.B. Seitshiro; H.P. Mashele
In this paper, we suggest a technique to quantify model risk, particularly model misspecification for binary response regression problems found in financial risk management, such as in credit risk modelling. We choose the probability of default model as one instance of many other credit risk models that may be misspecified in a financial institution. By way of illustrating the model misspecification
-
A three-part regression calibration to handle excess zeroes, skewness and heteroscedasticity in adjusting for measurement error in dietary intake data J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-13 George O. Agogo; Alexander K. Muoka
Exposure measurement error (ME) biases exposure-outcome associations. Calibration dietary intake data used in the regression calibration (RC) response to adjust for ME are usually right-skewed, heteroscedastic and with excess zeroes. We proposed three-part RC models to handle these distributional complexities simultaneously, while correcting for ME in fish intake. We applied data from the
-
Nonparametric panel stationarity testing with an application to crude oil production J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-13 María José Presno; Manuel Landajo; Paula Fernandez-Gonzalez
A nonparametric panel stationarity test is proposed which offers the advantage of not requiring prior specification of the trend function for each of the series in the panel. A bootstrap implementation of the test is outlined and its finite sample performance is analyzed via Monte Carlo simulations. An application is also included where the proposed test is used to analyze the stochastic properties
-
Bayesian inference: Weibull Poisson model for censored data using the expectation–maximization algorithm and its application to bladder cancer data J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-12 Anurag Pathak; Manoj Kumar; Sanjay Kumar Singh; Umesh Singh
This article focuses on the parameter estimation of experimental items/units from Weibull Poisson Model under progressive type-II censoring with binomial removals (PT-II CBRs). The expectation–maximization algorithm has been used for maximum likelihood estimators (MLEs). The MLEs and Bayes estimators have been obtained under symmetric and asymmetric loss functions. Performance of competitive estimators
-
On the rank-deficient canonical correlation technique solved by analytic spectral decomposition J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-11 Lukáš Malec
Regularization is a well-known and used statistical approach covering individual points or limit approximations. In this study, the canonical correlation analysis (CCA) process of the paths is discussed with partial least squares (PLS) as the other boundary covering transformation to a symmetric eigenvalue (or singular value) problem dependent on a parameter. Two regularizations of the original criterion
-
Bayesian analyses of an exponential-Poisson and related zero augmented type models J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-11 David P. M. Scollnik
We consider several alternatives to the continuous exponential-Poisson distribution in order to accommodate the occurrence of zeros. Three of these are modifications of the exponential-Poisson model. One of these remains a fully continuous model. The other models we consider are all semi-continuous models, each with a discrete point mass at zero and a continuous density on the positive values. All
-
Statistical modeling of computer malware propagation dynamics in cyberspace J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-10 Zijian Fang; Peng Zhao; Maochao Xu; Shouhuai Xu; Taizhong Hu; Xing Fang
Modeling cyber threats, such as the propagation dynamics of malicious computer software (malware) in cyberspace, is an important research problem because models can deepen our understanding of dynamical cyber threats. In this paper, we study the statistical modeling of the macro-level evolution of dynamical cyber attacks. Specifically, we propose a Bayesian structural time series approach for
-
Robust estimation of models for longitudinal data with dropouts and outliers J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-10 Yuexia Zhang; Guoyou Qin; Zhongyi Zhu; Bo Fu
Missing data and outliers usually arise in longitudinal studies. Ignoring the effects of missing data and outliers will make the classical generalized estimating equation approach invalid. The longitudinal cohort study of rheumatoid arthritis patients was designed to investigate whether the Health Assessment Questionnaire score was associated with baseline covariates and changed with time
-
The role of social capital in environmental protection efforts: evidence from Turkey J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-10 Julide Yildirim; Barış Alpaslan; Erdener Emin Eker
The existing literature has recognized the role and importance of social capital in natural resource management. Several studies provide empirical evidence that higher levels of social capital may positively affect individuals' behavior towards natural resources management. This study is therefore an attempt to investigate the environmental quality impacts of social capital and central government expenditures
-
Migration and students' performance: detecting geographical differences following a curves clustering approach J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-09 Giovanni Boscaino; Gianluca Sottile; Giada Adelfio
Students' migration mobility is the new form of migration: students migrate to improve their skills and become more valued in the job market. The data concern the migration of Italian Bachelor's graduates who enrolled at Master's degree level, moving typically from poor to rich areas. This paper investigates the effect of migration and other possible determinants on Master's degree students' performance. The Clustering
-
Modeling the effects of multiple exposures with unknown group memberships: a Bayesian latent variable approach J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-06 Alexis Zavez; Emeir M. McSorley; Alison J. Yeates; Sally W. Thurston
We propose a Bayesian latent variable model to allow estimation of the covariate-adjusted relationships between an outcome and a small number of latent exposure variables, using data from multiple observed exposures. Each latent variable is assumed to be represented by multiple exposures, where membership of the observed exposures to latent groups is unknown. Our model assumes that one measured
-
A mixture model with Poisson and zero-truncated Poisson components to analyze road traffic accidents in Turkey J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-06 Hande Konşuk Ünlü; Derek S. Young; Ayten Yiğiter; L. Hilal Özcebe
The analysis of traffic accident data is crucial to address numerous concerns, such as understanding contributing factors in an accident's chain-of-events, identifying hotspots, and informing policy decisions about road safety management. The majority of statistical models employed for analyzing traffic accident data are logically count regression models (commonly Poisson regression) since a count
-
Determinants of the heavily right-tailed residential housing price in Tianjin J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-04 Bojuan Barbara Zhao; Ruijuan Su
The housing price in Tianjin, one of the typical monocentric cities of China, exhibits a heavily right-tailed distribution even after the logarithm transformation of the price, which might lead to a biased estimation of the parameters under normal distribution assumption. Therefore, the extended Cox proportional hazards regression model and the generalized concept of relative risk are used to identify
-
Efficient estimators with categorical ranked set samples: estimation procedures for osteoporosis J. Appl. Stat. (IF 1.031) Pub Date : 2020-11-02 Armin Hatefi; Amirhossein Alvandi
Ranked set sampling (RSS) design as a cost-effective sampling is a powerful tool in situations where measuring the variable of interest is costly and time-consuming; however, ranking information about sampling units can be obtained easily through inexpensive and easy to measure characteristics at little or no cost. In this paper, we study RSS data for analysis of an ordinal population. First, we compare
-
Testing exponentiality based on the extropy of record values J. Appl. Stat. (IF 1.031) Pub Date : 2020-10-31 Peihan Xiong; Weiwei Zhuang; Guoxin Qiu
In this paper, we first present a characterization of the exponential distribution based on the extropy of record values and next introduce a goodness-of-fit test for exponentiality. Monte Carlo simulation is used to compute the critical values of our proposed test for different sample sizes and significance levels. To show the advantage of the proposed test, we adopt 58 competitor tests and compute
-
A new alternative estimation method for Liu-type logistic estimator via particle swarm optimization: an application to data of collapse of Turkish commercial banks during the Asian financial crisis J. Appl. Stat. (IF 1.031) Pub Date : 2020-10-31 Nuriye Sancar; Deniz Inan
When multicollinearity is present in the logistic model, important problems may arise in the analysis of the model, such as an unstable maximum likelihood estimator with very high standard errors and false inferences. The Liu-type logistic estimator was proposed as a two-parameter estimator to overcome the multicollinearity problem in the logistic model. In previous studies, the (k
-
A statistical methodology to select covariates in high-dimensional data under dependence. Application to the classification of genetic profiles in oncology J. Appl. Stat. (IF 1.031) Pub Date : 2020-10-27 B. Bastien; T. Boukhobza; H. Dumont; A. Gégout-Petit; A. Muller-Gueudin; C. Thiébaut
We propose a new methodology for selecting and ranking covariates associated with a variable of interest in a context of high-dimensional data under dependence but few observations. The methodology successively intertwines the clustering of covariates, decorrelation of covariates using Factor Latent Analysis, selection using aggregation of adapted methods and finally ranking. A simulation
-
Revisit to functional data analysis of sleeping energy expenditure J. Appl. Stat. (IF 1.031) Pub Date : 2020-10-27 Seungchul Baek; Yewon Kim; Junyong Park; Jong Soo Lee
In this paper, we consider the classification problem of functional data including the sleeping energy expenditure (SEE) data, focusing on functional classification. Many existing classification rules are not effective in distinguishing the two classes of SEE data, because the trajectories of each observation have very different patterns for each class. It is often observed that some aspect
-
CD-vine model for capturing complex dependence J. Appl. Stat. (IF 1.031) Pub Date : 2020-10-26 O. Ozan Evkaya; Ceylan Yozgatlıgil; A. Sevtap Selcuk-Kestel
Copula based finite mixture models allow us to capture the dependence between random variables more flexibly. Although the bivariate case of finite mixture models has been commonly studied, limited effort has been spent on finite mixtures of vines. Instead of using classical mixture models, it is possible to incorporate C-vines into the D-vine model (CD-vine) to understand both the dependence
-
On a generalization of the test of endogeneity in a two stage least squares estimation J. Appl. Stat. (IF 1.031) Pub Date : 2020-10-26 Ayyub Sheikhi; Fatemeh Bahador; Mohammad Arashi
In situations where the predictors are correlated with the error term, we propose a bridge estimator in the two-stage least squares estimation. We apply this estimator to overcome the multicollinearity and sparsity of the explanatory variables when the endogeneity problem is present. The proposed estimator was applied to modify the Durbin-Wu-Hausman (DWH) test of endogeneity in the presence of multicollinearity
Contents have been reproduced by permission of the publishers.