Variable selection in the Box–Cox power transformation model, Journal of Statistical Planning and Inference



Variable selection in the Box–Cox power transformation model
Journal of Statistical Planning and Inference (IF 0.8), Pub Date: 2021-05-12, DOI: 10.1016/j.jspi.2021.05.003
Baojiang Chen, Jing Qin, Ao Yuan

High-dimensional data are frequently collected across research fields such as genomics, health sciences, economics, and social sciences. Recently, variable selection in the high-dimensional setting has drawn great attention, and many effective methods have been developed to reduce the dimensionality of the data. However, most of these methods apply only to normally or near-normally distributed outcomes in a linear regression model, and few studies focus on variable selection for skewed data. Simulation studies show that failing to apply an appropriate transformation to the outcome can lead to biased inferences (e.g., missing important covariates). In this paper, we develop a variable selection procedure for the Box–Cox power transformation model based on a penalized maximum likelihood estimator, and we derive the consistency, oracle property, and asymptotic distribution of this estimator. Simulation studies demonstrate that the proposed method can yield higher sensitivity than the naive method that applies no transformation. We apply the proposed method to a gene expression study.
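
For readers unfamiliar with the model, the block below sketches the standard Box–Cox transformation, the normal-theory log-likelihood it induces, and a generic penalized version of that likelihood. The abstract does not state which penalty or notation the authors use, so the penalty function p_γ, the tuning parameter γ, and the symbols λ, β, and σ² are illustrative assumptions rather than the paper's own formulation.

```latex
% Box-Cox power transformation of a strictly positive outcome y_i,
% followed by a linear model on the transformed scale.
\[
y_i^{(\lambda)} =
\begin{cases}
  \dfrac{y_i^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
  \log y_i, & \lambda = 0,
\end{cases}
\qquad
y_i^{(\lambda)} = x_i^{\top}\beta + \varepsilon_i,
\quad \varepsilon_i \sim N(0, \sigma^2).
\]

% Log-likelihood of the original outcomes; the last term is the
% log-Jacobian of the transformation y_i -> y_i^{(\lambda)}.
\[
\ell(\lambda, \beta, \sigma^2)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}
      \bigl(y_i^{(\lambda)} - x_i^{\top}\beta\bigr)^2
    + (\lambda - 1)\sum_{i=1}^{n}\log y_i .
\]

% Generic penalized log-likelihood (assumed form): p_gamma is a
% sparsity-inducing penalty such as the lasso p_gamma(t) = gamma * t.
\[
Q(\lambda, \beta, \sigma^2)
  = \ell(\lambda, \beta, \sigma^2)
    - n\sum_{j=1}^{p} p_{\gamma}\bigl(|\beta_j|\bigr).
\]
```

Maximizing Q jointly over λ, β, and σ² would estimate the transformation and shrink some coefficients to exactly zero at the same time. A naive analysis instead fixes λ = 1, which amounts to leaving the outcome untransformed.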


Updated: 2021-05-19
