A Modified Random Survival Forests Algorithm for Non-recurring, Time to Event Outcomes Ascertained Using Imperfect, Self-reports or Laboratory Based Diagnostic Tests
Journal of Computational and Graphical Statistics (IF 1.4), Pub Date: 2018-08-20, DOI: 10.1080/10618600.2018.1474115
Hui Xu 1, Xiangdong Gu 1, Mahlet G Tadesse 2, Raji Balasubramanian 1

ABSTRACT: We present an ensemble tree-based algorithm for variable selection in high-dimensional datasets, in settings where a time-to-event outcome is observed with error. This work is motivated by self-reported outcomes collected in large-scale epidemiologic studies, such as the Women’s Health Initiative. The proposed methods apply equally to imperfect outcomes that arise in other settings, such as data extracted from electronic medical records. To evaluate the performance of the proposed algorithm, we present results from simulation studies that consider both continuous and categorical covariates. We illustrate the approach by using it to discover single nucleotide polymorphisms associated with incident Type 2 diabetes in the Women’s Health Initiative. A freely available R package, icRSF, has been developed to implement the proposed methods. Supplementary material for this article is available online.
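
The abstract does not spell out the error model, so the following is only a minimal sketch of what "a time-to-event outcome observed with error" can look like in simulated data. It assumes that each subject is asked at fixed visit times whether the event has occurred, and that each answer is wrong independently with a fixed sensitivity and specificity. The function name `simulate_self_reports`, the visit schedule, the event rate, and the values 0.8 and 0.9 are all hypothetical choices for illustration. They are not taken from the paper or from the icRSF package.

```python
import random


def simulate_self_reports(event_time, visit_times, sensitivity, specificity, rng):
    """Return one 0/1 self-report per visit for a single subject.

    Assumed error model (illustrative only): at or after the true event
    time a report is positive with probability `sensitivity`; before it,
    a report is (falsely) positive with probability 1 - `specificity`.
    """
    reports = []
    for t in visit_times:
        if t >= event_time:
            p_positive = sensitivity
        else:
            p_positive = 1.0 - specificity
        reports.append(1 if rng.random() < p_positive else 0)
    return reports


rng = random.Random(0)                  # fixed seed for reproducibility
visits = [1, 2, 3, 4, 5]                # hypothetical visit schedule
true_event_time = rng.expovariate(0.3)  # hypothetical exponential event time

# Error-prone reports versus what a perfect test would have recorded.
noisy = simulate_self_reports(true_event_time, visits, 0.8, 0.9, rng)
exact = simulate_self_reports(true_event_time, visits, 1.0, 1.0, rng)
print("true event time:", round(true_event_time, 2))
print("perfect reports:", exact)
print("noisy reports:  ", noisy)
```

With perfect sensitivity and specificity the reports form a clean step from 0 to 1 at the first visit after the event. With imperfect reports that step can be blurred by false positives and false negatives, which is the complication a variable selection method for such outcomes has to account for.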

Updated: 2018-08-20