Forward regression for Cox models with high-dimensional covariates
Journal of Multivariate Analysis (IF 1.4), Pub Date: 2019-09-01, DOI: 10.1016/j.jmva.2019.02.011
Hyokyoung G Hong, Qi Zheng, Yi Li

Forward regression, a classical variable screening method, has been widely used for model building when the number of covariates is relatively low. However, forward regression is seldom used in high-dimensional settings because of the cumbersome computation and unknown theoretical properties. Some recent works have shown that forward regression, coupled with an extended Bayesian information criterion (EBIC)-based stopping rule, can consistently identify all relevant predictors in high-dimensional linear regression settings. However, the results are based on the sum of residual squares from linear models and it is unclear whether forward regression can be applied to more general regression settings, such as Cox proportional hazards models. We introduce a forward variable selection procedure for Cox models. It selects important variables sequentially according to the increment of partial likelihood, with an EBIC stopping rule. To our knowledge, this is the first study that investigates the partial likelihood-based forward regression in high-dimensional survival settings and establishes selection consistency results. We show that, if the dimension of the true model is finite, forward regression can discover all relevant predictors within a finite number of steps and their order of entry is determined by the size of the increment in partial likelihood. As partial likelihood is not a regular density-based likelihood, we develop some new theoretical results on partial likelihood and use these results to establish the desired sure screening properties. The practical utility of the proposed method is examined via extensive simulations and analysis of a subset of the Boston Lung Cancer Survival Cohort study, a hospital-based study for identifying biomarkers related to lung cancer patients' survival.
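The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes no tied event times and a Chen-and-Chen-style EBIC penalty of |S|(log n + 2γ log p), which may differ from the criterion used in the paper, and it stops at the first step where EBIC fails to decrease. The function names `cox_loglik`, `fit_cox`, and `forward_cox`, and the simulated data, are hypothetical.

```python
import numpy as np

def cox_loglik(beta, X, time, event):
    """Cox partial log-likelihood, gradient and Hessian (assumes no tied times)."""
    order = np.argsort(-time)              # descending time: risk set of row i is rows 0..i
    X, d = X[order], event[order].astype(bool)
    eta = X @ beta
    shift = eta.max() if eta.size else 0.0
    w = np.exp(eta - shift)                # shifted for numerical stability
    S0 = np.cumsum(w)
    S1 = np.cumsum(w[:, None] * X, axis=0)
    S2 = np.cumsum(w[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
    xbar = S1[d] / S0[d][:, None]          # risk-set weighted covariate means
    loglik = np.sum(eta[d] - shift - np.log(S0[d]))
    grad = np.sum(X[d] - xbar, axis=0)
    hess = -(np.sum(S2[d] / S0[d][:, None, None], axis=0) - xbar.T @ xbar)
    return loglik, grad, hess

def fit_cox(X, time, event, n_iter=30, tol=1e-8):
    """Maximized partial log-likelihood via Newton-Raphson with step halving."""
    beta = np.zeros(X.shape[1])
    loglik, grad, hess = cox_loglik(beta, X, time, event)
    for _ in range(n_iter if beta.size else 0):
        # tiny ridge term guards against a numerically singular Hessian
        step = np.linalg.solve(-hess + 1e-8 * np.eye(beta.size), grad)
        scale = 1.0
        new = cox_loglik(beta + step, X, time, event)
        while new[0] < loglik and scale > 1e-4:
            scale /= 2
            new = cox_loglik(beta + scale * step, X, time, event)
        gain = new[0] - loglik
        beta, (loglik, grad, hess) = beta + scale * step, new
        if abs(gain) < tol:
            break
    return loglik

def forward_cox(X, time, event, gamma=1.0, max_steps=20):
    """Add the covariate with the largest partial-likelihood gain until EBIC stops improving."""
    n, p = X.shape
    selected = []
    best_ebic = -2 * fit_cox(X[:, :0], time, event)   # null model
    for _ in range(min(max_steps, p)):
        candidates = [j for j in range(p) if j not in selected]
        logliks = [fit_cox(X[:, selected + [j]], time, event) for j in candidates]
        k = int(np.argmax(logliks))
        size = len(selected) + 1
        # assumed EBIC form: -2 * loglik + |S| * (log n + 2 * gamma * log p)
        ebic = -2 * logliks[k] + size * (np.log(n) + 2 * gamma * np.log(p))
        if ebic >= best_ebic:
            break
        selected.append(candidates[k])
        best_ebic = ebic
    return selected

# Simulated example: only covariates 0 and 1 affect the hazard.
rng = np.random.default_rng(0)
n, p = 300, 50
X = rng.standard_normal((n, p))
eta = 1.5 * X[:, 0] - 1.5 * X[:, 1]
t_event = rng.exponential(1.0 / np.exp(eta))   # exponential survival times, hazard exp(eta)
t_cens = rng.exponential(5.0, size=n)          # independent censoring
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens
selected = forward_cox(X, time, event)
print(selected)
```

Each forward step refits one Cox model per remaining candidate, so the cost per step grows linearly in the number of covariates. That is the computational burden the abstract refers to.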

Updated: 2019-09-01