The Higher-Order of Adaptive Lasso and Elastic Net Methods for Classification on High Dimensional Data
Mathematics (IF 2.4), Pub Date: 2021-05-12, DOI: 10.3390/math9101091
Autcha Araveeporn

The lasso and elastic net methods are popular techniques for parameter estimation and variable selection. The adaptive lasso and adaptive elastic net methods apply adaptive weights to the penalty function; the weights are based on the lasso and elastic net estimates and depend on the power order of the estimator. These methods are normally used to estimate parameters in linear regression models, in which both the dependent and independent variables are measured on a continuous scale. In this paper, we compare the lasso and elastic net methods with the higher-order adaptive lasso and adaptive elastic net methods for classification on high-dimensional data. Classification here means modeling a categorical dependent variable as a function of the independent variables through the logistic regression model. The dependent variable is binary, and the independent variables are continuous. Data are high-dimensional when the number of independent variables exceeds the sample size. In the simulation study, the logistic regression model has a binary dependent variable and 20, 30, 40, or 50 independent variables, with sample sizes smaller than the number of independent variables. The independent variables are generated from normal distributions with several variances, and the dependent variable is obtained by computing probabilities from the logit function and transforming them into binary data. As a real-data application, we classify the type of leukemia (the dependent variable) using a subset of gene expression levels (the independent variables). The methods are compared by their average percentage of prediction accuracy. The results show that the higher-order adaptive lasso method performs well under large dispersion, whereas the higher-order adaptive elastic net method performs better under small dispersion.
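
The simulation design and the role of the power order can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the true coefficients, the penalty level `lam`, the Bernoulli draw used to turn probabilities into binary responses, the proximal-gradient solver, the omission of an intercept, and the helper names (`simulate`, `fit_weighted_l1_logistic`, `adaptive_weights`) are all assumptions made for illustration. The adaptive weight is taken as 1/|initial estimate|^gamma, with gamma playing the role of the power order.

```python
import numpy as np

def sigmoid(t):
    # Clip the linear predictor to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))

def simulate(n, p, sigma, beta, rng):
    """Draw X ~ N(0, sigma^2) and a binary y from the logistic model.
    Assumption: y is a Bernoulli draw from the logit probability."""
    X = rng.normal(0.0, sigma, size=(n, p))
    y = rng.binomial(1, sigmoid(X @ beta))
    return X, y

def fit_weighted_l1_logistic(X, y, lam, weights, n_iter=2000):
    """Proximal gradient descent for logistic loss plus
    lam * sum_j weights[j] * |beta_j| (no intercept)."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - y) / n
        z = beta - step * grad
        # Coordinate-wise soft thresholding with per-coefficient weights.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * weights, 0.0)
    return beta

def adaptive_weights(beta_init, gamma, eps=1e-6):
    """Adaptive weights 1 / |beta_init|^gamma; eps guards against zeros."""
    return 1.0 / (np.abs(beta_init) + eps) ** gamma

rng = np.random.default_rng(0)
n, p = 15, 20                      # fewer observations than variables
true_beta = np.zeros(p)
true_beta[:3] = [2.0, -2.0, 1.5]   # hypothetical sparse signal
X, y = simulate(n, p, sigma=1.0, beta=true_beta, rng=rng)

lam = 0.05                         # hypothetical penalty level
beta_lasso = fit_weighted_l1_logistic(X, y, lam, np.ones(p))

for gamma in (1, 2, 3):            # increasing power order of the weights
    w = adaptive_weights(beta_lasso, gamma)
    beta_ada = fit_weighted_l1_logistic(X, y, lam, w)
    acc = np.mean((X @ beta_ada > 0).astype(int) == y)
    print(f"gamma={gamma}: nonzero={np.count_nonzero(beta_ada)}, "
          f"in-sample accuracy={100 * acc:.1f}%")
```

Setting all weights to one recovers the ordinary lasso penalty; an adaptive elastic net variant would add a ridge term to the same objective. The accuracy printed here is in-sample only, whereas a comparison like the one in the abstract would average accuracy over many simulated data sets.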

Updated: 2021-05-12