An Online Updating Approach for Testing the Proportional Hazards Assumption with Streams of Survival Data
Biometrics ( IF 1.4 ) Pub Date : 2019-11-10 , DOI: 10.1111/biom.13137
Yishu Xue 1 , HaiYing Wang 1 , Jun Yan 1 , Elizabeth D Schifano 1

The Cox model, which remains the first choice for analyzing time-to-event data even for large datasets, relies on the proportional hazards (PH) assumption. When survival data arrive sequentially in chunks, a fast and minimally storage-intensive approach to testing the PH assumption is desirable. We propose an online updating approach that updates the standard test statistic as each new block of data becomes available, which greatly lightens the computational burden. Under the null hypothesis of PH, the proposed statistic is shown to have the same asymptotic distribution as the standard version computed on the entire data stream with the data blocks pooled into one dataset. In simulation studies, the test and its variant based on the most recent data blocks maintain their sizes when the PH assumption holds and have substantial power to detect different violations of the PH assumption. We also show in simulation that our approach can be used successfully with "big data" that exceed a single computer's computational resources. The approach is illustrated with a survival analysis of patients with lymphoma from the Surveillance, Epidemiology, and End Results Program. The proposed test promptly identified a deviation from the PH assumption that was not captured by the test based on the entire data.
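The storage pattern behind block-wise updating can be sketched as follows. This is a minimal illustration only, not the statistic proposed in the paper: it assumes each block can be reduced to an additive score vector and variance matrix, and the names `OnlineQuadraticStat` and `block_summary`, as well as the weighted-residual summary itself, are hypothetical. The point it shows is that only the running sums need to be kept, and that accumulating them block by block gives the same quadratic-form statistic as pooling all the data.

```python
import numpy as np


class OnlineQuadraticStat:
    """Accumulate additive block summaries and report T = U' V^{-1} U.

    Hypothetical helper for illustration; not the paper's test statistic.
    """

    def __init__(self, dim):
        self.U = np.zeros(dim)         # running sum of block score vectors
        self.V = np.zeros((dim, dim))  # running sum of block variance matrices
        self.n_blocks = 0

    def update(self, u_k, V_k):
        # Fold one block's summaries into the running totals;
        # the raw block can be discarded afterwards.
        self.U += np.asarray(u_k, dtype=float)
        self.V += np.asarray(V_k, dtype=float)
        self.n_blocks += 1
        return self.statistic()

    def statistic(self):
        # Solve V x = U instead of inverting V explicitly.
        return float(self.U @ np.linalg.solve(self.V, self.U))


def block_summary(residuals, weights):
    """Assumed per-block summary: weighted residual sums and their outer products."""
    contrib = residuals * weights[:, None]  # shape (n_rows, dim)
    return contrib.sum(axis=0), contrib.T @ contrib


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    residuals = rng.normal(size=(300, 2))  # synthetic stand-in for residuals
    weights = rng.uniform(size=300)        # synthetic stand-in for time weights

    stat = OnlineQuadraticStat(dim=2)
    for idx in np.array_split(np.arange(300), 5):
        latest = stat.update(*block_summary(residuals[idx], weights[idx]))

    u, V = block_summary(residuals, weights)
    pooled = float(u @ np.linalg.solve(V, u))
    print(np.isclose(latest, pooled))
```

In a real Cox-model setting the residuals depend on the current coefficient estimate, which itself changes as blocks arrive, so the per-block summaries would have to be built around an updated estimate rather than fixed inputs as assumed here.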

Updated: 2019-11-10