Polya tree-based nearest neighborhood regression
Statistics and Computing (IF 2.2), Pub Date: 2022-07-30, DOI: 10.1007/s11222-021-10076-w
Haoxin Zhuang, Liqun Diao, Grace Yi

Parametric regression, such as linear regression, plays an important role in statistics. Using a parametric regression model typically requires specifying a regression function of the covariates, the distribution of the response, and the link between the response and the covariates, all of which are at risk of misspecification. In this paper, we introduce a fully nonparametric regression model, a Polya tree (PT)-based nearest neighborhood regression. To approximate the true conditional probability measure of the response given the covariate value, we construct a PT-distributed probability measure of the response in the nearest neighborhood of the covariate value of interest. Our proposed method gives consistent and robust estimators, and has a faster convergence rate than kernel density estimation. We conduct extensive simulation studies and analyze a Combined Cycle Power Plant dataset to compare the performance of our method with that of kernel density estimation, PT density estimation, and the linear dependent tail-free process (LDTFP). The studies suggest that the proposed method is superior to the kernel and PT density estimation methods in estimation accuracy and convergence rate, and to LDTFP in robustness.
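The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea (a Polya tree fitted to the responses whose covariates lie nearest to the point of interest), not the authors' estimator. Every concrete choice here is an assumption made for illustration: a one-dimensional covariate with absolute distance, a fixed neighborhood size `k`, a normal centering distribution fitted to the neighbors, a finite tree of depth `depth`, and Beta parameters `c * j**2` at level `j`. The function names are hypothetical.

```python
import math
import random


def normal_cdf(y, mu, sigma):
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def nearest_neighbors(x0, xs, ys, k):
    """Responses of the k observations whose covariate is closest to x0."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return [ys[i] for i in order[:k]]


def fit_polya_tree(sample, depth=6, c=1.0):
    """Return the posterior predictive density of a finite Polya tree.

    Assumptions (illustrative only): the tree is centred on a normal
    distribution with the sample mean and standard deviation, the partition
    at level j consists of 2**j equal-probability cells under that normal,
    and both Beta parameters at level j equal c * j**2. The sample must
    contain at least two distinct values.
    """
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in sample) / (n - 1))

    def to_unit(y):
        # Map y to [0, 1) through the centering CDF; the cap keeps the
        # cell index in range when the CDF rounds to exactly 1.
        return min(normal_cdf(y, mu, sigma), 1.0 - 1e-12)

    us = [to_unit(v) for v in sample]
    # counts[j - 1][m] = number of observations in cell m at level j
    counts = []
    for j in range(1, depth + 1):
        level = [0] * (2 ** j)
        for u in us:
            level[int(u * 2 ** j)] += 1
        counts.append(level)

    def density(y):
        u = to_unit(y)
        value = normal_pdf(y, mu, sigma)
        parent = n
        for j in range(1, depth + 1):
            child = counts[j - 1][int(u * 2 ** j)]
            alpha = c * j * j
            # Posterior mean of the branch probability, rescaled by 2
            # because each child cell has half its parent's prior mass.
            value *= 2.0 * (alpha + child) / (2.0 * alpha + parent)
            parent = child
        return value

    return density


if __name__ == "__main__":
    random.seed(1)
    xs = [random.random() for _ in range(500)]
    ys = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.3) for x in xs]
    neighbors = nearest_neighbors(0.25, xs, ys, k=40)
    cond_density = fit_polya_tree(neighbors)
    for y in (0.5, 1.0, 1.5):
        print(y, cond_density(y))
```

In this sketch the neighborhood size `k` controls how local the estimate is, while `c` controls how strongly the fitted density is pulled toward the normal centering distribution. How the paper chooses the neighborhood, the partition, and the PT parameters is not stated in the abstract.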




Updated: 2022-07-31