Significance Tests for Boosted Location and Scale Models with Linear Base-Learners
International Journal of Biostatistics (IF 1.2), Pub Date: 2019-04-16, DOI: 10.1515/ijb-2018-0110
Tobias Hepp, Matthias Schmid, Andreas Mayr

Generalized additive models for location, scale and shape (GAMLSS) offer very flexible solutions to a wide range of statistical analysis problems, but proper model specification can be challenging. This task can be simplified using regularization techniques such as gradient boosting algorithms. However, the estimates derived from such models are shrunken towards zero, so it is not straightforward to calculate proper confidence intervals or test statistics. In this article, we propose two strategies to obtain p-values for linear effect estimates in Gaussian location and scale models: one based on permutation tests and one based on a parametric bootstrap. These procedures can provide a solution for one of the remaining problems in the application of gradient boosting algorithms for distributional regression in biostatistical data analyses. Results from extensive simulations indicate that in low-dimensional data both suggested approaches hold the type-I error threshold and provide reasonable test power, comparable to the Wald-type test for maximum likelihood inference. In high-dimensional data, where gradient boosting is the only feasible inference approach for this model class, the power decreases but the type-I error remains under control. In addition, we demonstrate the application of both tests in an epidemiological study that analyses the impact of physical exercise on both the average and the stability of the lung function of elderly people in Germany.
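The permutation idea mentioned in the abstract can be illustrated with a minimal sketch. It is a simplified illustration, not the authors' procedure: it boosts only the location (mean) parameter with componentwise least-squares base-learners, whereas the paper treats both location and scale. The function names `l2_boost` and `permutation_pvalue`, the step length `nu`, the fixed number of iterations, and the choice to permute only the covariate under test are all assumptions made for this example.

```python
import numpy as np

def l2_boost(X, y, n_iter=100, nu=0.1):
    """Componentwise L2 boosting with linear base-learners (location only).

    Hypothetical helper for illustration; n_iter and nu are assumed defaults.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)            # centre covariates
    intercept = y.mean()
    resid = y - intercept              # current residuals (negative gradient)
    beta = np.zeros(p)
    ss = (Xc ** 2).sum(axis=0)         # per-covariate sum of squares
    for _ in range(n_iter):
        coefs = Xc.T @ resid / ss      # least-squares slope of each base-learner
        j = np.argmax(coefs ** 2 * ss) # base-learner with largest RSS reduction
        beta[j] += nu * coefs[j]       # shrunken update of the selected effect
        resid -= nu * coefs[j] * Xc[:, j]
    return intercept, beta

def permutation_pvalue(X, y, j, n_perm=200, seed=0, **boost_args):
    """Permutation p-value for the boosted coefficient of covariate j.

    Assumption: the null distribution is built by permuting column j only,
    which ignores any correlation between covariate j and the others.
    """
    rng = np.random.default_rng(seed)
    _, beta = l2_boost(X, y, **boost_args)
    observed = abs(beta[j])
    exceed = 0
    for _ in range(n_perm):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(X[:, j])   # break the link between x_j and y
        _, beta_p = l2_boost(Xp, y, **boost_args)
        exceed += int(abs(beta_p[j]) >= observed)
    return (exceed + 1) / (n_perm + 1)        # add-one correction avoids p = 0

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    y = 1.0 * X[:, 0] + rng.normal(size=200)  # only the first covariate matters
    print(permutation_pvalue(X, y, j=0, n_perm=99))
    print(permutation_pvalue(X, y, j=1, n_perm=99))
```

Because the shrunken estimate is compared with estimates obtained under the same shrinkage, the comparison does not require an unbiased coefficient or a standard error, which is the appeal of resampling-based tests in this setting.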

Updated: 2019-04-16