On the optimality of sliced inverse regression in high dimensions, Annals of Statistics



On the optimality of sliced inverse regression in high dimensions
Annals of Statistics (IF 4.5), Pub Date: 2021-01-29, DOI: 10.1214/19-aos1813
Qian Lin, Xinran Li, Dongming Huang, Jun S. Liu

The central subspace of a pair of random variables $(y,\boldsymbol{x})\in \mathbb{R}^{p+1}$ is the minimal subspace $\mathcal{S}$ such that $y\perp\!\!\!\!\!\perp \boldsymbol{x}|P_{\mathcal{S}}\boldsymbol{x}$. In this paper, we consider the minimax rate of estimating the central space under the multiple index model $y=f(\boldsymbol{\beta }_{1}^{\tau }\boldsymbol{x},\boldsymbol{\beta }_{2}^{\tau }\boldsymbol{x},\ldots,\boldsymbol{\beta }_{d}^{\tau }\boldsymbol{x},\epsilon )$ with at most $s$ active predictors, where $\boldsymbol{x}\sim N(0,\boldsymbol{\Sigma })$ for some class of $\boldsymbol{\Sigma }$. We first introduce a large class of models depending on the smallest nonzero eigenvalue $\lambda $ of $\operatorname{var}(\mathbb{E}[\boldsymbol{x}|y])$, over which we show that an aggregated estimator based on the SIR procedure converges at rate $d\wedge ((sd+s\log (ep/s))/(n\lambda ))$. We then show that this rate is optimal in two scenarios, the single index models and the multiple index models with fixed central dimension $d$ and fixed $\lambda $. By assuming a technical conjecture, we can show that this rate is also optimal for multiple index models with bounded dimension of the central space.
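To make the stated rate concrete, the short Python sketch below evaluates $d\wedge ((sd+s\log (ep/s))/(n\lambda ))$ for given problem sizes. It only does the arithmetic of the formula in the abstract, with all constants omitted; it is not the aggregated SIR estimator itself. The function name `sir_rate_bound` and the parameter values in the usage loop are illustrative assumptions, not taken from the paper.

```python
import math


def sir_rate_bound(n, p, s, d, lam):
    """Evaluate min(d, (s*d + s*log(e*p/s)) / (n*lam)).

    This mirrors the rate quoted in the abstract, up to constants.
    n: sample size, p: ambient dimension, s: number of active predictors,
    d: dimension of the central space, lam: smallest nonzero eigenvalue
    of var(E[x | y]). The function name is hypothetical.
    """
    if not (1 <= s <= p):
        raise ValueError("need 1 <= s <= p")
    if n <= 0 or d <= 0 or lam <= 0:
        raise ValueError("n, d and lam must be positive")
    statistical_term = (s * d + s * math.log(math.e * p / s)) / (n * lam)
    # The rate is capped at d, which is the "d wedge ..." part of the formula.
    return min(d, statistical_term)


if __name__ == "__main__":
    # Illustrative values only: watch the bound shrink as n grows.
    for n in (100, 1_000, 10_000, 100_000):
        print(n, sir_rate_bound(n=n, p=5_000, s=20, d=2, lam=0.5))
```

For small $n$ the bound is capped at $d$; once $n\lambda$ is large relative to $sd+s\log (ep/s)$, it decays like $1/n$.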


Updated: 2021-01-29
