Markov Neighborhood Regression for High-Dimensional Inference, Journal of the American Statistical Association



Journal of the American Statistical Association (IF 3.0), Pub Date: 2021-01-19, DOI: 10.1080/01621459.2020.1841646
Faming Liang, Jingnan Xue, Bochao Jia


Abstract

This article proposes an innovative method for constructing confidence intervals and assessing p-values in statistical inference for high-dimensional linear models. The proposed method has successfully broken the high-dimensional inference problem into a series of low-dimensional inference problems: For each regression coefficient $\beta_{i}$, the confidence interval and p-value are computed by regressing on a subset of variables selected according to the conditional independence relations between the corresponding variable $X_{i}$ and other variables. Since the subset of variables forms a Markov neighborhood of $X_{i}$ in the Markov network formed by all the variables $X_{1}, X_{2}, \dots, X_{p}$, the proposed method is coined as Markov neighborhood regression (MNR). The proposed method is tested on high-dimensional linear, logistic, and Cox regression. The numerical results indicate that the proposed method significantly outperforms the existing ones. Based on the MNR, a method of learning causal structures for high-dimensional linear models is proposed and applied to identification of drug sensitive genes and cancer driver genes. The idea of using conditional independence relations for dimension reduction is general and potentially can be extended to other high-dimensional or big data problems as well. Supplementary materials for this article are available online.
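The idea of replacing one high-dimensional fit with many low-dimensional ones can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a Gaussian chain design in which the Markov neighborhood of each variable is known in advance (its two adjacent variables), whereas in practice the neighborhood would have to be estimated and the published procedure may choose the regression subset differently. The names `simulate_chain` and `mnr_coefficient`, the simulation settings, and the normal approximation used for the interval and p-value are all illustrative assumptions.

```python
import math
import numpy as np


def simulate_chain(n, p, rho, rng):
    """Draw n samples of a Gaussian chain X_0 - X_1 - ... - X_{p-1} (AR(1) design)."""
    X = np.empty((n, p))
    X[:, 0] = rng.standard_normal(n)
    for k in range(1, p):
        noise = rng.standard_normal(n)
        X[:, k] = rho * X[:, k - 1] + math.sqrt(1.0 - rho**2) * noise
    return X


def mnr_coefficient(X, y, j, neighborhood, z=1.96):
    """Low-dimensional OLS of y on X_j and its (assumed known) Markov neighborhood.

    Returns the estimate of beta_j, an approximate 95% confidence interval,
    and a two-sided p-value based on a normal approximation (an assumption
    made here for simplicity).
    """
    cols = [j] + [k for k in neighborhood if k != j]
    n = X.shape[0]
    D = np.column_stack([np.ones(n), X[:, cols]])   # intercept, X_j, neighbors
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ coef
    dof = n - D.shape[1]
    sigma2 = resid @ resid / dof                    # residual variance estimate
    cov = sigma2 * np.linalg.inv(D.T @ D)           # OLS covariance matrix
    est = coef[1]                                   # coefficient of X_j
    se = math.sqrt(cov[1, 1])
    p_value = math.erfc(abs(est / se) / math.sqrt(2.0))
    return est, (est - z * se, est + z * se), p_value


# Illustrative setting with more variables than observations.
rng = np.random.default_rng(0)
n, p, rho = 150, 300, 0.5
X = simulate_chain(n, p, rho, rng)
beta = np.zeros(p)
beta[10], beta[50] = 2.0, -1.5                      # two nonzero coefficients
y = X @ beta + rng.standard_normal(n)

# In a chain, the Markov neighborhood of column j is {j - 1, j + 1}.
est, ci, p_value = mnr_coefficient(X, y, 10, [9, 11])
est_null, ci_null, p_null = mnr_coefficient(X, y, 100, [99, 101])
print("beta_10:", est, ci, p_value)
print("beta_100:", est_null, ci_null, p_null)
```

Each call fits only four parameters even though the full model has 300, which is the dimension reduction the abstract describes. Under the Gaussian assumption, variables outside the neighborhood are conditionally independent of the target variable given its neighbors, so omitting them inflates the residual variance but should not bias the coefficient of interest.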


Updated: 2021-01-19
