Censored mean variance sure independence screening for ultrahigh dimensional survival data
Computational Statistics & Data Analysis (IF 1.5), Pub Date: 2021-02-24, DOI: 10.1016/j.csda.2021.107206
Wei Zhong, Jiping Wang, Xiaolin Chen

Feature screening has become an indispensable statistical modeling tool for ultrahigh dimensional data analysis. This article introduces a new model-free marginal feature screening approach for ultrahigh dimensional survival data with right censoring. The new procedure can be used for survival data with both categorical and continuous ultrahigh dimensional covariates. Motivated by Cui et al. (2015), a censored mean variance index (cMV) is proposed to measure the dependence between a survival outcome and a categorical covariate. A slice-and-fuse method is then used to adapt the cMV index to continuous covariates. Sure independence screening based on the censored mean variance index (cMV-SIS) is proposed to identify the important covariates for ultrahigh dimensional data with censored survival outcomes. It enjoys many appealing merits inherited from the mean variance index. It is model-free and thus robust to model misspecification. It is also robust to heavy tails and outliers in covariates. Moreover, the sure screening properties are established theoretically for both categorical and continuous covariates under some mild technical conditions. Extensive numerical simulations and a real data example demonstrate the competitive performance of the proposed feature screening method.
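
The abstract does not state the formula for the cMV index, so the following is only a rough sketch of the underlying idea, not the authors' method. It computes an uncensored mean variance index between a continuous outcome and a categorical covariate, taken here as sum_r p_r * mean_j (F_r(y_j) - F(y_j))^2 with empirical distribution functions, and ranks covariates by it. Censoring is ignored entirely, which is an assumption of this sketch; the function names `ecdf_at`, `mv_index`, and `screen` are hypothetical.

```python
from bisect import bisect_right


def ecdf_at(sorted_sample, points):
    """Empirical CDF of sorted_sample evaluated at each value in points."""
    n = len(sorted_sample)
    return [bisect_right(sorted_sample, t) / n for t in points]


def mv_index(y, x_cat):
    """Uncensored mean variance index between outcome y and categorical x_cat.

    Assumption: every outcome is fully observed (no censoring), so plain
    empirical CDFs are used. This is NOT the cMV index from the paper.
    """
    n = len(y)
    # Unconditional empirical CDF evaluated at every observed outcome.
    F = ecdf_at(sorted(y), y)
    mv = 0.0
    for level in set(x_cat):
        # Outcomes belonging to this category, sorted for the conditional CDF.
        y_r = sorted(yi for yi, xi in zip(y, x_cat) if xi == level)
        p_r = len(y_r) / n
        F_r = ecdf_at(y_r, y)
        # Weighted average squared gap between conditional and overall CDF.
        mv += p_r * sum((a - b) ** 2 for a, b in zip(F_r, F)) / n
    return mv


def screen(y, columns, d):
    """Rank covariates (a list of columns) by mv_index and keep the top d."""
    scores = [mv_index(y, col) for col in columns]
    order = sorted(range(len(columns)), key=lambda k: scores[k], reverse=True)
    return order[:d], scores
```

The index is zero when the conditional and unconditional distribution functions coincide and grows as the categories separate the outcome, which is why ranking covariates by it needs no model for the outcome. A right-censored version would have to replace the empirical distribution functions with estimators that account for censoring; the abstract does not say how this is done, so that step is left out here.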




Updated: 2021-03-15