Nonparametric screening and feature selection for ultrahigh‐dimensional Case II interval‐censored failure time data
Biometrical Journal (IF 1.3), Pub Date: 2020-07-16, DOI: 10.1002/bimj.201900154
Qiang Hu, Liang Zhu, Yanyan Liu, Jianguo Sun, Deo Kumar Srivastava, Leslie L Robison

For the analysis of ultrahigh-dimensional data, the first step is often to perform screening and feature selection, which reduces the dimensionality while retaining all the active or relevant variables with high probability. Many methods have been developed for this purpose under various frameworks, but most of them apply only to complete data. In this paper, we consider an incomplete data situation, case II interval-censored failure time data, for which there seems to be no screening procedure. Based on the idea of cumulative residuals, a model-free or nonparametric method is developed and shown to have the sure independent screening property. In particular, the approach is shown to tend to rank the active variables above the inactive ones in terms of their association with the failure time of interest. A simulation study demonstrates the usefulness of the proposed method and, in particular, indicates that it works well with general survival models and is capable of capturing nonlinear covariates with interactions. The approach is also applied to a childhood cancer survivor study that motivated this investigation.
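
To make the screening step concrete, the following is a minimal sketch of generic marginal screening: score each covariate separately, rank the scores, and keep the top d. It is not the authors' procedure. The function names (`screen_top_features`, `midpoint_abs_corr`), the toy data, and the utility itself (absolute correlation with the interval midpoint) are assumptions made for illustration; the paper's cumulative-residual statistic is not given in the abstract and would take the place of `utility`. The default d = floor(n / log n) is a common choice in the screening literature, not a value taken from this paper.

```python
import numpy as np


def screen_top_features(X, utility, d=None):
    """Rank the columns of X by a marginal utility and keep the top d.

    `utility` maps one covariate column to a nonnegative score.
    """
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # common default model size, an assumption here
    d = min(d, p)
    scores = np.array([utility(X[:, j]) for j in range(p)])
    order = np.argsort(-scores)  # indices sorted by descending score
    return order[:d], scores


def midpoint_abs_corr(left, right):
    """Stand-in utility (hypothetical): |corr| with the interval midpoint.

    Only usable when every interval (left, right] is finite; it is a crude
    placeholder, NOT the cumulative-residual statistic of the paper.
    """
    mid = (left + right) / 2.0

    def utility(x):
        return abs(np.corrcoef(x, mid)[0, 1])

    return utility


# Toy data: failure time t is never observed, only an interval containing it.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.normal(size=(n, p))
t = np.exp(0.5 * X[:, 0] - 0.5 * X[:, 1] + 0.2 * rng.normal(size=n))
left = t * rng.uniform(0.5, 1.0, size=n)   # left endpoint, at or below t
right = t * rng.uniform(1.0, 1.5, size=n)  # right endpoint, at or above t

selected, scores = screen_top_features(X, midpoint_abs_corr(left, right))
```

Here `selected` holds the indices of the retained covariates and `scores` the marginal utility of every covariate. Passing the utility as a function keeps the ranking step separate from the choice of statistic, so a model-free statistic suited to interval censoring can be swapped in without changing the rest.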

Updated: 2020-07-16