Selection of influential variables in ordinal data with preponderance of zeros
Statistica Neerlandica (IF 1.4), Pub Date: 2020-08-18, DOI: 10.1111/stan.12225
Ujjwal Das, Kalyan Das

Excess zeros in ordinal data are pervasive in areas such as the medical and social sciences. Unfortunately, the analysis of such data has so far received little attention, perhaps because the underlying model that fits such data is not a generalized linear model. Some methodological development and intensive computation are therefore required. The current investigation is concerned with variable selection in such models. When the number of predictors is large and some of them are not useful, the maximum likelihood approach is not the automatic choice: apart from the messy calculations involved, it fails to provide efficient estimates of the underlying parameters. The proposed penalized approach includes the ℓ1 penalty (LASSO) and a mixture of the ℓ1 and ℓ2 penalties (elastic net). We propose a coordinate descent algorithm to fit a wide class of ordinal regression models and to select useful variables appearing in both the ordinal regression component and the logistic-regression-based mixing component. The selection of predictors is examined in detail through a simulation study. The proposed method is illustrated by analyzing the severity of driver injury in road accidents in Michigan's Upper Peninsula.
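
To make the penalized coordinate descent idea concrete, here is a minimal sketch of the elastic net update for a plain least-squares loss. This is a deliberate simplification and not the authors' algorithm: the paper works with a zero-inflated ordinal likelihood, whereas the function below assumes a Gaussian loss so that each coordinate update has a closed form. The names `soft_threshold`, `elastic_net_cd`, `lam`, and `alpha` are illustrative assumptions; `alpha=1` gives the LASSO and `0 < alpha < 1` mixes the ℓ1 and ℓ2 penalties.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrinks z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def elastic_net_cd(X, y, lam, alpha=1.0, n_iter=200, tol=1e-8):
    """Cyclic coordinate descent for the (assumed) objective

        (1 / 2n) * ||y - X b||^2
            + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2).

    Illustrative only: a least-squares stand-in for the penalized
    zero-inflated ordinal likelihood discussed in the abstract.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()        # residual y - X @ beta with beta = 0
    col_sq = (X ** 2).sum(axis=0) / n     # (1/n) * x_j' x_j for each column

    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                  # skip all-zero columns
            # Correlation of column j with the partial residual
            # (the residual with coordinate j's contribution added back).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam * alpha) / (
                col_sq[j] + lam * (1.0 - alpha)
            )
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid -= X[:, j] * delta  # keep the residual in sync
                beta[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:              # stop once a full sweep barely moves
            break
    return beta
```

Calling `elastic_net_cd(X, y, lam=0.05)` on a standardized design returns a coefficient vector in which weak predictors are set exactly to zero, which is how the ℓ1 penalty performs variable selection. In the setting of the abstract, the same kind of update would be applied to coefficients in both the ordinal component and the logistic mixing component, with the quadratic loss replaced by the model's log-likelihood.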

Updated: 2020-08-18