Temporal prediction of future state occupation in a multistate model from high-dimensional baseline covariates via pseudo-value regression, Journal of Statistical Computation and Simulation



Journal of Statistical Computation and Simulation (IF 1.2), Pub Date: 2016-12-20, DOI: 10.1080/00949655.2016.1263992
Sandipan Dutta, Susmita Datta, Somnath Datta


ABSTRACT In many complex diseases such as cancer, a patient undergoes various disease stages before reaching a terminal state (say, disease free or death). This fits a multistate model framework where a prognosis may be equivalent to predicting the state occupation at a future time t. With the advent of high-throughput genomic and proteomic assays, a clinician may intend to use such high-dimensional covariates to better predict state occupation. In this article, we offer a practical solution to this problem by combining a useful technique, called pseudo-value (PV) regression, with a latent factor or a penalized regression method such as partial least squares (PLS) or the least absolute shrinkage and selection operator (LASSO), or their variants. We explore the predictive performance of these combinations in various high-dimensional settings via extensive simulation studies. Overall, this strategy works fairly well provided the models are tuned properly. PLS turns out to be slightly better than LASSO in most of the settings we investigated for the purpose of temporal prediction of future state occupation. We illustrate the utility of these PV-based high-dimensional regression methods using a lung cancer data set where we use the patients’ baseline gene expression values.
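
To make the pseudo-value idea concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the simplest two-state case (alive, dead), where the occupation probability of the initial state at time t equals the survival probability and can be estimated by Kaplan-Meier; the abstract does not say which multistate estimator the paper uses. The jackknife pseudo-value for subject i is n times the full-sample estimate minus (n - 1) times the leave-one-out estimate. The function names `km_survival` and `pseudo_values` are hypothetical.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of P(T > t) from right-censored data.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    """
    surv = 1.0
    # Distinct observed event times up to and including t.
    event_times = sorted({ti for ti, ei in zip(times, events) if ei and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, ei in zip(times, events) if ei and ti == u)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_values(times, events, t):
    """Jackknife pseudo-values: n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = len(times)
    full = km_survival(times, events, t)
    pv = []
    for i in range(n):
        # Leave subject i out and re-estimate.
        t_loo = times[:i] + times[i + 1:]
        e_loo = events[:i] + events[i + 1:]
        pv.append(n * full - (n - 1) * km_survival(t_loo, e_loo, t))
    return pv


if __name__ == "__main__":
    # Toy data (made up for illustration): five subjects, one censored.
    times = [1, 2, 3, 4, 5]
    events = [1, 1, 0, 1, 1]
    print(pseudo_values(times, events, t=3.5))
```

Without censoring, each pseudo-value reduces to the indicator that subject i is still in the state at time t. With censoring, the pseudo-values stand in for those unobserved indicators, and they can then serve as the response in a PLS or LASSO fit on the high-dimensional baseline covariates.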


Updated: 2016-12-20
